Unit 1
Big Data Matrix
What is a Big Data Matrix?
A Big Data Matrix is a very large data structure arranged in rows and columns, where:
- Rows represent entities (users, sensors, documents)
- Columns represent features or attributes
- Data volume is too large to store or process on a single machine with traditional tools
Characteristics of Big Data Matrix
- Large scale (millions or billions of rows)
- High dimensionality
- Sparse data (many zero values)
- Stored and processed in distributed systems
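To make the sparsity characteristic concrete, here is a minimal sketch in plain Python. The small matrix and the helper names `to_sparse` and `density` are made up for illustration. Only the nonzero entries are stored, keyed by their (row, column) position.

```python
def to_sparse(dense):
    """Keep only nonzero entries as {(row, col): value}."""
    return {
        (i, j): value
        for i, row in enumerate(dense)
        for j, value in enumerate(row)
        if value != 0
    }


def density(sparse, n_rows, n_cols):
    """Fraction of cells that hold a nonzero value."""
    return len(sparse) / (n_rows * n_cols)


# Hypothetical 4 x 5 matrix in which most cells are zero.
dense = [
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]

sparse = to_sparse(dense)
print(sparse)                 # {(0, 2): 3, (2, 0): 1, (2, 4): 2}
print(density(sparse, 4, 5))  # 0.15
```

Here 3 values are stored instead of 20 cells. At the scale of millions of rows, that difference is what makes storage practical.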
Why Big Data Matrices are Important
- Used in machine learning and artificial intelligence
- Help in data analytics and decision-making
- Support large-scale scientific research
- Enable real-time data processing
Types of Data Matrix
- Numerical Data Matrix
- Binary Data Matrix
- Categorical Data Matrix
- Sparse Data Matrix
- Distributed Data Matrix
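The idea behind a distributed data matrix can be sketched on a single machine: split the rows into blocks, compute a partial result per block, then combine the partial results. This is only an illustration in plain Python. The helpers `split_rows`, `column_sums`, and `combine`, the sample matrix, and the block size are all hypothetical; a real system would place each block on a different worker.

```python
def split_rows(matrix, block_size):
    """Split a matrix into consecutive row blocks."""
    return [matrix[i:i + block_size] for i in range(0, len(matrix), block_size)]


def column_sums(block):
    """Column sums of one block (the work a single worker would do)."""
    return [sum(column) for column in zip(*block)]


def combine(partials):
    """Add the per-block column sums into the global column sums."""
    return [sum(values) for values in zip(*partials)]


# Hypothetical 5 x 3 matrix, split into blocks of 2 rows.
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
    [13, 14, 15],
]

blocks = split_rows(matrix, block_size=2)
partials = [column_sums(block) for block in blocks]
total = combine(partials)
print(len(blocks))  # 3
print(total)        # [35, 40, 45]
```

Column sums work well here because each block can be processed independently and the partial results simply add up.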
1. Numerical Data Matrix
Definition
A Numerical Data Matrix is a matrix in which all elements are numerical values.
$$ X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix} $$
- Rows represent observations
- Columns represent variables
- $x_{ij} \in \mathbb{R}$
Example 1: Student Marks Matrix
$$ M = \begin{bmatrix} 78 & 85 & 90 \\ 88 & 76 & 84 \\ 92 & 89 & 95 \end{bmatrix} $$
Average Marks of Student i:
$$ \mu_i = \frac{1}{n}\sum_{j=1}^{n} M_{ij} $$
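The row-average formula can be checked directly on the matrix $M$ above. This is a small sketch in plain Python lists; it assumes each row is one student and each column one subject, which the example implies but does not state.

```python
M = [
    [78, 85, 90],
    [88, 76, 84],
    [92, 89, 95],
]

# mu_i = (1/n) * sum over j of M[i][j], where n is the number of columns.
row_means = [sum(row) / len(row) for row in M]

for i, mu in enumerate(row_means, start=1):
    print(f"Student {i}: {mu:.2f}")
# Student 1: 84.33
# Student 2: 82.67
# Student 3: 92.00
```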
Example 2: Weather Observation Matrix
$$ W = \begin{bmatrix} 30.5 & 78 & 12 \\ 29.8 & 82 & 5 \\ 31.2 & 75 & 0 \end{bmatrix} $$
Mean Temperature:
$$ \bar{T} = \frac{1}{m}\sum_{i=1}^{m} W_{i1} $$
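As a quick check of the column-mean formula, the sketch below averages the first column of $W$. It assumes, as the formula's index $W_{i1}$ suggests, that column 1 holds temperature; the source does not say what the other columns hold.

```python
W = [
    [30.5, 78, 12],
    [29.8, 82, 5],
    [31.2, 75, 0],
]

# Assumption: column 1 (index 0 in Python) holds the temperature readings.
temperatures = [row[0] for row in W]
mean_temperature = sum(temperatures) / len(temperatures)
print(round(mean_temperature, 2))  # 30.5
```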
Example 3: Sales Data Matrix
$$ S = \begin{bmatrix} 1200 & 1350 & 1100 \\ 900 & 1000 & 980 \\ 1500 & 1600 & 1700 \end{bmatrix} $$
Total Sales:
$$ \text{Total} = \sum_{i=1}^{m}\sum_{j=1}^{n} S_{ij} $$
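The double sum is simply "add every cell". A minimal plain-Python sketch on the matrix $S$ above:

```python
S = [
    [1200, 1350, 1100],
    [900, 1000, 980],
    [1500, 1600, 1700],
]

# The inner sum runs over columns j, the outer sum over rows i.
row_totals = [sum(row) for row in S]
total = sum(row_totals)
print(row_totals)  # [3650, 2880, 4800]
print(total)       # 11330
```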
Example 4: Sensor Data Matrix
$$ R = \begin{bmatrix} 22.4 & 101.3 \\ 22.6 & 101.1 \\ 22.5 & 101.2 \end{bmatrix} $$
Variance of Temperature (where $\mu$ is the mean of column 1):
$$ \sigma^2 = \frac{1}{m}\sum_{i=1}^{m}(R_{i1}-\mu)^2 $$
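The variance formula divides by $m$, so it is the population variance. The sketch below applies it to the first column of $R$, taking $\mu$ as the mean of that column. The variable names are illustrative.

```python
R = [
    [22.4, 101.3],
    [22.6, 101.1],
    [22.5, 101.2],
]

# Column 1 (index 0 in Python) is the temperature column.
temperatures = [row[0] for row in R]
m = len(temperatures)

mu = sum(temperatures) / m
# Population variance: mean of the squared deviations from mu.
variance = sum((t - mu) ** 2 for t in temperatures) / m

print(f"{mu:.2f}")        # 22.50
print(f"{variance:.4f}")  # 0.0067
```

The standard library function `statistics.pvariance` computes the same quantity.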
Example 5: Image Pixel Matrix
$$ I = \begin{bmatrix} 0 & 128 & 255 \\ 64 & 192 & 128 \\ 255 & 128 & 0 \end{bmatrix} $$
Normalization:
$$ I' = \frac{I}{255} $$
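Dividing by 255 maps 8-bit pixel values into the range 0 to 1. A minimal sketch in plain Python, with the matrix $I$ stored under the illustrative name `image`:

```python
image = [
    [0, 128, 255],
    [64, 192, 128],
    [255, 128, 0],
]

# Element-wise division: every pixel is scaled by the maximum value 255.
normalized = [[pixel / 255 for pixel in row] for row in image]

for row in normalized:
    print([round(value, 3) for value in row])
# [0.0, 0.502, 1.0]
# [0.251, 0.753, 0.502]
# [1.0, 0.502, 0.0]
```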
Big Data Matrix – Real World Examples
1. User–Item Matrix
- Rows: Users
- Columns: Products
- Used in recommendation systems (e.g. Amazon, Netflix, Flipkart)
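A user-item matrix is usually built from individual interaction records. The sketch below assumes hypothetical (user, product, rating) triples and uses 0 to mean "no rating". The names and data are invented for illustration and do not reflect any particular company's system.

```python
# Hypothetical (user, product, rating) records; not from any real dataset.
ratings = [
    ("alice", "book", 5),
    ("alice", "lamp", 3),
    ("bob", "lamp", 4),
    ("carol", "phone", 2),
]

users = sorted({user for user, _, _ in ratings})
products = sorted({product for _, product, _ in ratings})
user_index = {user: i for i, user in enumerate(users)}
product_index = {product: j for j, product in enumerate(products)}

# Rows are users, columns are products; 0 marks "no rating".
matrix = [[0] * len(products) for _ in users]
for user, product, rating in ratings:
    matrix[user_index[user]][product_index[product]] = rating

print(products)  # ['book', 'lamp', 'phone']
for user, row in zip(users, matrix):
    print(user, row)
# alice [5, 3, 0]
# bob [0, 4, 0]
# carol [0, 0, 2]
```

Most users interact with only a tiny fraction of the products, which is why real user-item matrices are very sparse.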
2. Social Network Matrix
- Rows and Columns: Users
- Values: 1 (connected), 0 (not connected)
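The 1/0 encoding above is an adjacency matrix. A minimal sketch, assuming a hypothetical list of four users and three mutual connections:

```python
# Hypothetical users and connections, for illustration only.
users = ["A", "B", "C", "D"]
friendships = [("A", "B"), ("A", "C"), ("C", "D")]

index = {user: i for i, user in enumerate(users)}
n = len(users)

adjacency = [[0] * n for _ in range(n)]
for u, v in friendships:
    # Assumption: connections are mutual, so the matrix is symmetric.
    adjacency[index[u]][index[v]] = 1
    adjacency[index[v]][index[u]] = 1

for row in adjacency:
    print(row)
# [0, 1, 1, 0]
# [1, 0, 0, 0]
# [1, 0, 0, 1]
# [0, 0, 1, 0]

# The sum of a row is the number of connections of that user.
degrees = {user: sum(adjacency[index[user]]) for user in users}
print(degrees)  # {'A': 2, 'B': 1, 'C': 2, 'D': 1}
```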
3. Sensor Data Matrix
- Rows: Time intervals
- Columns: Sensors
4. Document–Term Matrix (its transpose is the Term–Document Matrix)
- Rows: Documents
- Columns: Words (terms)
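With documents as rows and words as columns, each cell can hold how often a word occurs in a document. The sketch below assumes three made-up documents and a simple whitespace tokenizer. Counts are one common choice of cell value, not the only one.

```python
from collections import Counter

# Hypothetical mini-corpus, for illustration only.
documents = [
    "big data matrix",
    "sparse data matrix",
    "big sparse data",
]

tokenized = [doc.split() for doc in documents]
vocabulary = sorted({word for tokens in tokenized for word in tokens})

rows = []
for tokens in tokenized:
    counts = Counter(tokens)  # a missing word counts as 0
    rows.append([counts[word] for word in vocabulary])

print(vocabulary)  # ['big', 'data', 'matrix', 'sparse']
for row in rows:
    print(row)
# [1, 1, 1, 0]
# [0, 1, 1, 1]
# [1, 1, 0, 1]
```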
5. Healthcare Data Matrix
- Rows: Patients
- Columns: Medical attributes
Summary
- Big Data Matrices represent massive real-world datasets
- They are typically sparse, high-dimensional, and stored across distributed systems
- Widely used in AI, analytics, healthcare, and IoT