Unit 1

Big Data Matrix


What is a Big Data Matrix?

A Big Data Matrix is a very large data structure arranged in rows and columns, where:

  • Rows represent entities (users, sensors, documents)
  • Columns represent features or attributes
  • Data volume is too large to store or process on a single machine with traditional tools

Characteristics of Big Data Matrix

  • Large scale (millions or billions of rows)
  • High dimensionality
  • Sparse data (many zero values)
  • Stored and processed in distributed systems
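
To illustrate why sparsity matters for storage, the following is a minimal sketch in plain Python that keeps only the non-zero entries of a matrix, keyed by (row, column). The small matrix and the helper names `to_sparse` and `density` are hypothetical and chosen only for illustration; production systems typically rely on dedicated sparse formats and distributed storage instead.

```python
# Hypothetical 4 x 5 matrix with mostly zero entries (illustrative data only).
dense = [
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]


def to_sparse(matrix):
    """Keep only the non-zero entries as {(row, col): value}."""
    return {
        (i, j): value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    }


def density(matrix):
    """Return the fraction of entries that are non-zero."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value != 0)
    return nonzero / total


sparse = to_sparse(dense)
print(sparse)          # {(0, 2): 3, (2, 0): 1, (2, 4): 2}
print(density(dense))  # 0.15
```

Only 3 of the 20 entries are stored here. The saving grows as the matrix gets larger and sparser.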

Why Big Data Matrices are Important

  • Used in Machine Learning and Artificial Intelligence
  • Helps in data analytics and decision making
  • Supports large-scale scientific research
  • Enables real-time data processing

Types of Data Matrix

  1. Numerical Data Matrix
  2. Binary Data Matrix
  3. Categorical Data Matrix
  4. Sparse Data Matrix
  5. Distributed Data Matrix

1. Numerical Data Matrix

Definition

A Numerical Data Matrix is a matrix in which all elements are numerical values.

$$ X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix} $$

  • Rows represent observations
  • Columns represent variables
  • $x_{ij} \in \mathbb{R}$

Example 1: Student Marks Matrix

$$ M = \begin{bmatrix} 78 & 85 & 90 \\ 88 & 76 & 84 \\ 92 & 89 & 95 \end{bmatrix} $$

Average Marks of Student $i$ (where $n$ is the number of subjects):

$$ \mu_i = \frac{1}{n}\sum_{j=1}^{n} M_{ij} $$
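
As a quick check of the row-average formula, the sketch below applies it to the marks matrix $M$ from this example. It assumes each row is one student and each column one subject; the variable name `row_means` is introduced here only for illustration.

```python
# Marks matrix M from Example 1: one row per student, one column per subject.
M = [
    [78, 85, 90],
    [88, 76, 84],
    [92, 89, 95],
]

# mu_i = (1/n) * sum over j of M[i][j], where n is the number of columns.
row_means = [sum(row) / len(row) for row in M]

for i, mu in enumerate(row_means, start=1):
    print(f"Student {i}: {mu:.2f}")
# Student 1: 84.33
# Student 2: 82.67
# Student 3: 92.00
```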


Example 2: Weather Observation Matrix

$$ W = \begin{bmatrix} 30.5 & 78 & 12 \\ 29.8 & 82 & 5 \\ 31.2 & 75 & 0 \end{bmatrix} $$

Mean Temperature:

$$ \bar{T} = \frac{1}{m}\sum_{i=1}^{m} W_{i1} $$
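
The column mean above can be sketched in plain Python as follows. Following the formula, column 1 of $W$ (index 0 in Python) is taken to hold the temperature readings; the names `temperatures` and `mean_temperature` are illustrative.

```python
# Weather matrix W from Example 2: one row per observation.
W = [
    [30.5, 78, 12],
    [29.8, 82, 5],
    [31.2, 75, 0],
]

# Column 1 of W (index 0) is the temperature column used in the formula.
temperatures = [row[0] for row in W]

# T_bar = (1/m) * sum over i of W[i][0], where m is the number of rows.
mean_temperature = sum(temperatures) / len(temperatures)

print(round(mean_temperature, 2))  # 30.5
```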


Example 3: Sales Data Matrix

$$ S = \begin{bmatrix} 1200 & 1350 & 1100 \\ 900 & 1000 & 980 \\ 1500 & 1600 & 1700 \end{bmatrix} $$

Total Sales:

$$ \text{Total} = \sum_{i=1}^{m}\sum_{j=1}^{n} S_{ij} $$
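
The double sum can be checked directly on the matrix $S$ above. This is a minimal sketch; `row_totals` and `column_totals` are extra illustrative quantities, not part of the original formula.

```python
# Sales matrix S from Example 3.
S = [
    [1200, 1350, 1100],
    [900, 1000, 980],
    [1500, 1600, 1700],
]

# Total = sum over all rows i and columns j of S[i][j].
total = sum(sum(row) for row in S)

# Per-row and per-column totals; zip(*S) iterates over the columns of S.
row_totals = [sum(row) for row in S]
column_totals = [sum(column) for column in zip(*S)]

print(total)          # 11330
print(row_totals)     # [3650, 2880, 4800]
print(column_totals)  # [3600, 3950, 3780]
```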


Example 4: Sensor Data Matrix

$$ R = \begin{bmatrix} 22.4 & 101.3 \\ 22.6 & 101.1 \\ 22.5 & 101.2 \end{bmatrix} $$

Variance of Temperature (population variance, where $\mu$ is the mean of the temperature column $R_{i1}$):

$$ \sigma^2 = \frac{1}{m}\sum_{i=1}^{m}(R_{i1}-\mu)^2 $$
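
A short sketch of this calculation on the matrix $R$ above, taking column 1 (index 0) as the temperature readings. It divides by $m$, matching the formula, and cross-checks the result against `statistics.pvariance` from the Python standard library; the variable names are illustrative.

```python
from statistics import pvariance

# Sensor matrix R from Example 4: one row per reading.
R = [
    [22.4, 101.3],
    [22.6, 101.1],
    [22.5, 101.2],
]

# Column 1 of R (index 0) is the temperature column used in the formula.
temps = [row[0] for row in R]

# mu is the mean of the temperature column.
mu = sum(temps) / len(temps)

# sigma^2 = (1/m) * sum over i of (R[i][0] - mu)^2  (population variance).
variance = sum((t - mu) ** 2 for t in temps) / len(temps)

print(round(mu, 2))        # 22.5
print(round(variance, 6))  # 0.006667

# Cross-check against the standard-library population variance.
assert abs(variance - pvariance(temps)) < 1e-9
```

Dividing by $m - 1$ instead would give the sample variance, which is a different estimator from the one defined here.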


Example 5: Image Pixel Matrix

$$ I = \begin{bmatrix} 0 & 128 & 255 \\ 64 & 192 & 128 \\ 255 & 128 & 0 \end{bmatrix} $$

Normalization (scales each pixel value from the range 0 to 255 into the range 0 to 1):

$$ I' = \frac{I}{255} $$
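
The element-wise division can be sketched as below, assuming 8-bit pixel values so that 255 is the maximum. The names `image` and `normalized` stand in for $I$ and $I'$.

```python
# Pixel matrix I from Example 5 (assumed 8-bit grayscale values, 0 to 255).
image = [
    [0, 128, 255],
    [64, 192, 128],
    [255, 128, 0],
]

# I' = I / 255, applied to every element.
normalized = [[pixel / 255 for pixel in row] for row in image]

for row in normalized:
    print([round(value, 3) for value in row])
# [0.0, 0.502, 1.0]
# [0.251, 0.753, 0.502]
# [1.0, 0.502, 0.0]
```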


Big Data Matrix – Real-World Examples

1. User–Item Matrix

  • Rows: Users
  • Columns: Products
  • Used in Amazon, Netflix, Flipkart

2. Social Network Matrix

  • Rows and Columns: Users
  • Values: 1 (connected), 0 (not connected)
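
A small sketch of how such a matrix could be built from a list of connections. The user names and connections are hypothetical, and the sketch assumes connections are mutual, so the matrix is symmetric.

```python
# Hypothetical users and mutual connections (illustrative data only).
users = ["A", "B", "C", "D"]
connections = [("A", "B"), ("A", "C"), ("C", "D")]

# Map each user to a row/column index.
index = {user: k for k, user in enumerate(users)}

# Start with an all-zero n x n matrix (0 = not connected).
n = len(users)
adjacency = [[0] * n for _ in range(n)]

# Mark each connection in both directions (1 = connected).
for u, v in connections:
    i, j = index[u], index[v]
    adjacency[i][j] = 1
    adjacency[j][i] = 1

for row in adjacency:
    print(row)
# [0, 1, 1, 0]
# [1, 0, 0, 0]
# [1, 0, 0, 1]
# [0, 0, 1, 0]
```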

3. Sensor Data Matrix

  • Rows: Time intervals
  • Columns: Sensors

4. Document–Term Matrix

  • Rows: Documents
  • Columns: Words (terms)
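
A matrix with documents as rows and words as columns can be built by counting word occurrences, as in the minimal sketch below. The three short documents are hypothetical, and raw counts are just one possible choice of value; splitting on whitespace is a simplification of real tokenization.

```python
# Hypothetical mini-corpus (illustrative data only).
documents = [
    "big data matrix",
    "sparse data matrix",
    "big sparse data",
]

# Vocabulary: every distinct word, sorted so the column order is fixed.
vocabulary = sorted({word for doc in documents for word in doc.split()})

# One row per document, one column per word; each value is a word count.
counts = [
    [doc.split().count(word) for word in vocabulary]
    for doc in documents
]

print(vocabulary)  # ['big', 'data', 'matrix', 'sparse']
for row in counts:
    print(row)
# [1, 1, 1, 0]
# [0, 1, 1, 1]
# [1, 1, 0, 1]
```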

5. Healthcare Data Matrix

  • Rows: Patients
  • Columns: Medical attributes

Summary

  • Big Data Matrices represent massive real-world datasets
  • They are often sparse and high-dimensional, and are stored and processed in distributed systems
  • Widely used in AI, analytics, healthcare, and IoT
