Double integration extends the concept of a single integral to functions of two variables, $f(x, y)$. Geometrically, it represents the volume under the surface $z = f(x, y)$ over a region $D$ in the $xy$-plane.
Weighted sampling is a sampling technique where each item is
assigned a weight, and the probability of selecting that item is
proportional to its weight.
Higher weight $\Rightarrow$ higher chance of being selected
Lower weight $\Rightarrow$ lower chance (but usually not zero)
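A minimal sketch in Python (the item names and weights are illustrative only), using the standard library's random.choices, which selects items with probability proportional to their weights:

import random

items = ["A", "B", "C"]
weights = [5, 3, 1]   # "A" is five times as likely to be picked as "C"

# Draw 10 items with replacement; each draw picks item i with
# probability weights[i] / sum(weights).
print(random.choices(items, weights=weights, k=10))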
Randomized Algorithms: An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a randomized algorithm. For example, in randomized quicksort we use a random number to pick the next pivot (or we randomly shuffle the array).
Types of Randomized Algorithms
1. Las Vegas: A Las Vegas algorithm uses randomness but guarantees that the solution it produces for the given problem is correct.
Advantage: It always produces a correct answer; randomness affects the running time, not the correctness. The expected running time is finite and is analyzed probabilistically.
A randomized quicksort that chooses its pivot at random is an example of a Las Vegas algorithm: it always sorts the array. The advantage gained from randomization is that no input pattern forces the worst case (unlike a fixed pivot choice such as a corner element, which causes worst-case behaviour on already sorted input).
Disadvantage: The time complexity depends on the random choices made, so it is evaluated as an expected value.
Quick Sort: This method partitions the array in place so that all elements to the left of the pivot are smaller than the pivot, while all elements to the right are greater. The same procedure is then applied recursively to the left and right subarrays.
There are two common partitioning schemes for random pivoting: Lomuto partitioning and Hoare partitioning.
Algorithm for random pivoting using Lomuto Partitioning
partition(arr[], lo, hi)
    pivot = arr[hi]
    i = lo                      // place for swapping
    for j := lo to hi - 1 do
        if arr[j] <= pivot then
            swap arr[i] with arr[j]
            i = i + 1
    swap arr[i] with arr[hi]
    return i

partition_r(arr[], lo, hi)
    r = random number from lo to hi
    swap arr[r] and arr[hi]
    return partition(arr, lo, hi)

quicksort(arr[], lo, hi)
    if lo < hi
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p-1)
        quicksort(arr, p+1, hi)
# Python implementation of QuickSort using Lomuto's partition scheme.
import random

def quicksort(arr, start, stop):
    '''arr: array to be sorted; start/stop: index range (inclusive).'''
    if start < stop:
        # pivotindex is the index where the pivot lies in the array.
        pivotindex = partitionrand(arr, start, stop)
        # The array is now partially sorted around the pivot;
        # sort the left and right halves separately.
        quicksort(arr, start, pivotindex - 1)
        quicksort(arr, pivotindex + 1, stop)

def partitionrand(arr, start, stop):
    '''Generates a random pivot, swaps it with the first element,
    and calls the partition function.'''
    randpivot = random.randrange(start, stop + 1)  # random index in [start, stop]
    arr[start], arr[randpivot] = arr[randpivot], arr[start]
    return partition(arr, start, stop)

def partition(arr, start, stop):
    '''Takes the first element as pivot and places it at its correct
    position: smaller elements end up to its left, greater to its right.'''
    pivot = start
    i = start + 1  # boundary of the "<= pivot" region
    for j in range(start + 1, stop + 1):
        # Shift elements smaller than or equal to the pivot to the left side.
        if arr[j] <= arr[pivot]:
            arr[i], arr[j] = arr[j], arr[i]
            i = i + 1
    arr[pivot], arr[i - 1] = arr[i - 1], arr[pivot]
    return i - 1
Algorithm for random pivoting using Hoare's partition scheme.
partition(arr[], lo, hi)
    pivot = arr[lo]
    i = lo - 1                  // initialize left index
    j = hi + 1                  // initialize right index
    while (true)
        // Find a value on the left side greater than or equal to the pivot
        do
            i = i + 1
        while arr[i] < pivot
        // Find a value on the right side smaller than or equal to the pivot
        do
            j = j - 1
        while arr[j] > pivot
        if i >= j then
            return j
        swap arr[i] with arr[j]
    end while

partition_r(arr[], lo, hi)
    r = random number from lo to hi
    swap arr[r] and arr[lo]
    return partition(arr, lo, hi)

quicksort(arr[], lo, hi)
    if lo < hi
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p)
        quicksort(arr, p+1, hi)
# Python implementation of QuickSort using Hoare's partition scheme.
import random

def quicksort(arr, start, stop):
    '''arr: array to be sorted; start/stop: index range (inclusive).'''
    if start < stop:
        # pivotindex is the index returned by the partition step.
        pivotindex = partitionrand(arr, start, stop)
        # The array is partially sorted around the pivot; with Hoare's
        # scheme the left recursion includes pivotindex.
        quicksort(arr, start, pivotindex)
        quicksort(arr, pivotindex + 1, stop)

def partitionrand(arr, start, stop):
    '''Generates a random pivot, swaps it with the first element,
    and calls the partition function.'''
    randpivot = random.randrange(start, stop + 1)  # random index in [start, stop]
    arr[start], arr[randpivot] = arr[randpivot], arr[start]
    return partition(arr, start, stop)

def partition(arr, start, stop):
    '''Hoare partition: the first element's value is the pivot; returns an
    index j such that every element of arr[start..j] is <= every element
    of arr[j+1..stop].'''
    pivot = arr[start]  # keep the pivot value fixed, as in the pseudocode
    i = start - 1
    j = stop + 1
    while True:
        i = i + 1
        while arr[i] < pivot:   # find a value on the left >= pivot
            i = i + 1
        j = j - 1
        while arr[j] > pivot:   # find a value on the right <= pivot
            j = j - 1
        if i >= j:
            return j
        arr[i], arr[j] = arr[j], arr[i]

# Driver code (the input array is a sample chosen for illustration).
if __name__ == "__main__":
    array = [10, 7, 8, 9, 1, 5]
    quicksort(array, 0, len(array) - 1)
    print(array)
2. Monte Carlo: A Monte Carlo algorithm is a randomized algorithm that may sometimes give a wrong answer.
Running time is typically fixed or bounded.
Output may be incorrect with a small probability.
Monte Carlo algorithms are often used when approximate answers are acceptable or when the error probability can be made tiny. They are widely used for physical simulation and mathematical systems.
Monte Carlo methods are used in places where deterministic algorithms take a lot of time. Monte Carlo integration is the most common application (a minimal sketch is shown after this list). Several Monte Carlo techniques are used for integration, such as:
i) Direct sampling methods, which include stratified sampling, recursive stratified sampling, and importance sampling.
ii) Random walk Monte Carlo algorithms, used to evaluate the integral for a given problem.
iii) Gibbs sampling.
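A minimal sketch of Monte Carlo integration in Python (the integrand and interval are chosen only for illustration): the integral of f over [a, b] is estimated as (b - a) times the average of f at uniformly random points.

import random

def monte_carlo_integrate(f, a, b, n=100_000):
    # Estimate the integral of f over [a, b]: average f at n uniform
    # random points and multiply by the interval length (b - a).
    total = sum(f(random.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# Example: the integral of x^2 over [0, 1] is exactly 1/3.
print(monte_carlo_integrate(lambda x: x * x, 0.0, 1.0))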
Applications of Randomized Algorithms
Randomized algorithms have huge applications in Cryptography.
Load Balancing.
Number-Theoretic Applications: Primality Testing.
Data Structures: Hashing, Sorting, Searching, Order Statistics and Computational Geometry.
Algebraic identities: Polynomial and matrix identity verification. Interactive proof systems.
Mathematical programming: Faster algorithms for linear programming, rounding linear program solutions to integer program solutions.
Counting and enumeration: Matrix permanent, counting combinatorial structures.
Parallel and distributed computing: Deadlock avoidance, distributed consensus.
Probabilistic
existence proofs: Show that a combinatorial object arises with non-zero
probability among objects drawn from a suitable probability space.
Derandomization: First devise a randomized algorithm, then argue that it can be derandomized to yield a deterministic algorithm.
Consider a function of two variables
$$
z = f(x, y)
$$
This function represents a surface in three-dimensional space.
Each point on the surface corresponds to a particular value of $x$ and $y$.
Small Changes in Variables
Suppose the independent variables $x$ and $y$ change by small amounts
$dx$ and $dy$. These small changes cause a change in $z$, denoted by $dz$.
The movement along the $x$-direction by $dx$ and along the $y$-direction
by $dy$ together produce a total change in the value of $z$.
Total Change in z
The total change in $z$ is the combined effect of the change due to $x$
and the change due to $y$.
Mathematically, it is given by:
$$
dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy
$$
When a dependent variable depends on more than one independent variable,
its change depends on the changes in all those variables.
In such cases, we use the concept of Total Derivative.
If a function is given by
$$
z = f(x, y)
$$
where $x$ and $y$ are independent variables, then a small change in $z$
due to small changes in $x$ and $y$ is called the total differential.
The total differential of $z$ is denoted by $dz$ and is defined as:
$$
dz = \frac{\partial z}{\partial x}\,dx
+ \frac{\partial z}{\partial y}\,dy
$$
Total derivatives are widely used in error analysis, approximation,
thermodynamics, and engineering applications.
If $z = x^2y + y^2x$, find the total differential $dz$ and hence find the
approximate change in $z$ when $x$ changes from 1 to 1.02 and $y$ changes
from 2 to 2.01.
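Solution:
$$
dz = (2xy + y^2)\,dx + (x^2 + 2xy)\,dy
$$
With $x = 1$, $y = 2$, $dx = 0.02$, $dy = 0.01$:
$$
dz \approx (2\cdot 1\cdot 2 + 2^2)(0.02) + (1^2 + 2\cdot 1\cdot 2)(0.01)
   = 8(0.02) + 5(0.01) = 0.21
$$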
If $z = x^2 + y^2$, find the total differential and hence obtain the
approximate error in $z$ when the errors in $x$ and $y$ are $\delta x$
and $\delta y$ respectively.
Solution:
$$
dz = 2x\,dx + 2y\,dy
$$
Approximate error:
$$
\delta z = 2x\,\delta x + 2y\,\delta y
$$
Question 10
Find the total derivative of $z = x^2y^2$ and hence find the rate of
change of $z$ with respect to $t$, given $x = \sin t$ and $y = \cos t$.
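Solution:
$$
dz = 2xy^2\,dx + 2x^2y\,dy
\quad\Rightarrow\quad
\frac{dz}{dt} = 2xy^2\,\frac{dx}{dt} + 2x^2y\,\frac{dy}{dt}
$$
With $x = \sin t$ and $y = \cos t$, so $\dfrac{dx}{dt} = \cos t$ and $\dfrac{dy}{dt} = -\sin t$:
$$
\frac{dz}{dt} = 2\sin t\cos^3 t - 2\sin^3 t\cos t
             = 2\sin t\cos t\,(\cos^2 t - \sin^2 t)
             = \sin 2t\,\cos 2t
             = \tfrac{1}{2}\sin 4t
$$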
In many practical situations, a quantity depends on more than one
independent variable. For example, the volume of a gas may depend on
both pressure and temperature, or the area of a surface may depend on
two spatial coordinates. Such functions are called
functions of several variables.
If a function depends on two independent variables $x$ and $y$, it is
written as
$$
z = f(x, y)
$$
Partial differentiation is the process of finding the rate of
change of a function with respect to one variable while keeping the
other variable(s) constant.
The partial derivative of $z$ with respect to $x$ is denoted by
$$
\frac{\partial z}{\partial x}
$$
and is obtained by differentiating $z$ with respect to $x$, treating
$y$ as a constant.
Similarly, the partial derivative of $z$ with respect to $y$ is denoted by
$$
\frac{\partial z}{\partial y}
$$
and is obtained by differentiating $z$ with respect to $y$, treating
$x$ as a constant.
For example, if
$$
z = x^2y + xy^2
$$
then
$$
\frac{\partial z}{\partial x} = 2xy + y^2,
\quad
\frac{\partial z}{\partial y} = x^2 + 2xy
$$
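These partial derivatives can be verified symbolically; a minimal sketch, assuming the SymPy library is available:

import sympy as sp

x, y = sp.symbols('x y')
z = x**2 * y + x * y**2

# Differentiate with respect to one variable, treating the other as constant.
print(sp.diff(z, x))   # 2*x*y + y**2
print(sp.diff(z, y))   # x**2 + 2*x*y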
Partial differentiation plays an important role in mathematics,
physics, engineering, economics, and other applied sciences. It is
widely used in topics such as total derivatives, maxima and minima,
Euler’s theorem, and differential equations.
Matrix multiplication is a fundamental operation in scientific computing,
machine learning, data mining, and numerical linear algebra. Given two matrices
\( A \in \mathbb{R}^{n \times d} \) and
\( B \in \mathbb{R}^{d \times m} \), the exact computation of
\( AB \) requires \( O(ndm) \) arithmetic operations, which becomes
computationally expensive when the inner dimension \( d \) is large.
Approximate Matrix Multiplication (AMM) addresses this challenge by computing
an approximation \( \widetilde{C} \approx AB \) that is significantly faster
to obtain while maintaining provable accuracy guarantees. The key idea is to
reduce the dimensionality of the problem using randomized techniques, thereby
lowering computational cost and memory usage.
Random Sampling of Columns and Rows (Sketching)
One of the simplest and most intuitive approaches to AMM is
random sampling of columns of \( A \) and the corresponding rows of \( B \).
Since matrix multiplication can be written as a sum of outer products,
\[
AB = \sum_{i=1}^{d} A_{(:,i)} B_{(i,:)}
\]
we can approximate this sum by randomly selecting only a small subset of
indices \( S \subset \{1,2,\dots,d\} \) with \( |S| = k \ll d \).
The approximation is then given by:
\[
\widetilde{C} = \frac{d}{k} \sum_{i \in S} A_{(:,i)} B_{(i,:)}
\]
The scaling factor \( \frac{d}{k} \) ensures that the estimator is
unbiased, meaning:
\[
\mathbb{E}[\widetilde{C}] = AB
\]
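To see why (for uniform sampling without replacement, each index \( i \) belongs to \( S \) with probability \( k/d \)), linearity of expectation gives:
\[
\mathbb{E}[\widetilde{C}]
= \frac{d}{k} \sum_{i=1}^{d} \Pr[i \in S]\; A_{(:,i)} B_{(i,:)}
= \frac{d}{k} \sum_{i=1}^{d} \frac{k}{d}\, A_{(:,i)} B_{(i,:)}
= AB
\]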
Why Sketching Works
Each column–row pair \( A_{(:,i)} B_{(i,:)} \) contributes differently to
the final product. Random sampling exploits the fact that many contributions
are often small or redundant, especially in large, structured, or noisy data.
By sampling only a few representative pairs, we can closely approximate the
full sum with much less computation.
Accuracy and Trade-offs
The accuracy of the approximation depends on the number of samples \( k \).
As \( k \) increases, the approximation converges to the exact product.
With appropriate sampling strategies (such as importance sampling based on
column norms), strong error bounds can be obtained in Frobenius or spectral norm.
Random sampling of columns and rows is widely used as a building block in:
Large-scale machine learning algorithms
Dimensionality reduction and feature compression
Recommender systems
Streaming and distributed computation
In the following examples, we illustrate how this technique works through
simple numerical demonstrations.
Approximate Matrix Multiplication using Random Sampling of Columns/Rows
Given matrices \( A \in \mathbb{R}^{n \times d} \) and
\( B \in \mathbb{R}^{d \times m} \), approximate:
\[
AB \approx \frac{d}{k} \sum_{i \in S} A_{(:,i)} B_{(i,:)}
\]
For instance, with \( d = 4 \) and \( k = 2 \) sampled indices \( S = \{1, 4\} \), the scaling factor is \( d/k = 2 \), so
\[
AB \approx 2\left(A_{(:,1)}B_{(1,:)} + A_{(:,4)}B_{(4,:)}\right)
\]
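A minimal NumPy sketch of this estimator (uniform sampling without replacement; the matrix sizes and random seed are chosen only for illustration):

import numpy as np

rng = np.random.default_rng(0)

def approx_matmul(A, B, k):
    # A: (n, d), B: (d, m). Sample k of the d column/row index pairs
    # uniformly without replacement and rescale by d/k (unbiased estimate).
    d = A.shape[1]
    S = rng.choice(d, size=k, replace=False)
    return (d / k) * A[:, S] @ B[S, :]

A = rng.standard_normal((100, 500))
B = rng.standard_normal((500, 80))

exact = A @ B
approx = approx_matmul(A, B, k=100)
print(np.linalg.norm(exact - approx) / np.linalg.norm(exact))  # relative error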
Error Bounds and Intuition
The approximation quality in Random Sampling–based Approximate Matrix
Multiplication depends on how many column–row pairs are sampled.
Let \( \widetilde{C} \) denote the approximate product and \( C = AB \)
the exact product.
With uniform random sampling, the expected squared error measured in the Frobenius norm satisfies a bound of the form
\[
\mathbb{E}\!\left[\|AB - \widetilde{C}\|_F^2\right]
\le \frac{d}{k} \sum_{i=1}^{d} \|A_{(:,i)}\|_2^2 \, \|B_{(i,:)}\|_2^2
\]
This inequality shows that the error decreases as the number of samples
\( k \) increases. In particular, doubling the number of sampled columns
roughly halves the expected squared error.
Intuition Behind the Error Bound
Matrix multiplication can be viewed as summing many outer products.
Random sampling replaces this full sum with an estimate based on only
a few terms. The approximation error arises from ignoring some
column–row pairs. If most of these pairs contribute little to the final
product, then the error remains small.
Thus, matrices with redundant, correlated, or low-energy columns are
especially well-suited for approximate multiplication.
Importance Sampling
Uniform sampling treats all column–row pairs equally. However, in practice,
some columns of \( A \) and rows of \( B \) contribute much more to
\( AB \) than others. Importance sampling improves accuracy by
sampling indices with higher probability when they have larger impact.
Importance sampling significantly reduces variance and provides tighter
error bounds, especially when column norms vary widely.
Key idea: sample what matters most.
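A minimal NumPy sketch of norm-based importance sampling (the probability of picking index \( i \) is proportional to \( \|A_{(:,i)}\|\,\|B_{(i,:)}\| \), and each sampled term is rescaled by \( 1/(k\,p_i) \) to keep the estimator unbiased; this follows the standard column-norm sampling idea, not any specific source implementation):

import numpy as np

rng = np.random.default_rng(0)

def importance_matmul(A, B, k):
    # Sampling probabilities proportional to the product of column/row norms.
    p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = p / p.sum()
    # Sample k indices i.i.d. according to p (with replacement).
    S = rng.choice(A.shape[1], size=k, replace=True, p=p)
    # Rescale each sampled outer product by 1 / (k * p_i) for unbiasedness.
    return (A[:, S] / (k * p[S])) @ B[S, :]

A = rng.standard_normal((100, 500))
B = rng.standard_normal((500, 80))
exact = A @ B
print(np.linalg.norm(exact - importance_matmul(A, B, k=100)) / np.linalg.norm(exact))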
Real-World Machine Learning Applications
Recommendation Systems:
User–item matrices are often large and sparse. Approximate matrix
multiplication accelerates similarity computations and matrix factorization.
Deep Learning:
Large fully connected layers and attention mechanisms can be approximated
using sketching techniques to reduce training and inference cost.
Natural Language Processing:
Word embeddings and document–term matrices benefit from approximate
multiplication during similarity search and latent semantic analysis.
Kernel Methods:
AMM is used to approximate large kernel matrices in support vector
machines and Gaussian processes.
Large-Scale Linear Regression:
Sketching reduces the cost of computing \( X^T X \) and \( X^T y \)
in high-dimensional data.
Conclusion:
Random sampling of columns and rows significantly reduces computation while
providing an unbiased approximation. Increasing the sample size improves accuracy.