Demystifying ML Math: From Vectors to Eigenvalues

Data as Structure: Vectors, Matrices, and Tensors

In practice, mathematical objects are ways of organizing data for computation. A vector is an ordered list of numbers that represents a single data point in space. A matrix stacks such vectors into a two-dimensional grid, and a tensor generalizes both to any number of dimensions. The norm of a vector measures its magnitude (or length). The dot product measures how strongly two vectors align; dividing it by the product of their norms gives cosine similarity, a standard similarity score in recommendation systems and semantic search. Projections map a vector onto a direction or subspace, which is the core operation in dimensionality reduction techniques.
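The operations above can be sketched in a few lines. This is a minimal illustration in plain Python, kept dependency-free on purpose; in practice a library such as NumPy would normally be used. The helper names (`dot`, `norm`, `cosine_similarity`, `project`) and the sample vectors are assumptions made for this example.

```python
import math

def dot(u, v):
    """Dot product: sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean norm (length) of a vector."""
    return math.sqrt(dot(v, v))

def cosine_similarity(u, v):
    """Dot product normalized by both lengths, so the result lies in [-1, 1]."""
    return dot(u, v) / (norm(u) * norm(v))

def project(u, v):
    """Projection of u onto the direction of v."""
    scale = dot(u, v) / dot(v, v)
    return [scale * x for x in v]

a = [3.0, 4.0]
b = [4.0, 3.0]
x_axis = [1.0, 0.0]

print(norm(a))                  # 5.0
print(dot(a, b))                # 24.0
print(cosine_similarity(a, b))  # close to 0.96
print(project(a, x_axis))       # [3.0, 0.0]
```

The raw dot product grows with vector length, which is why similarity search usually normalizes it. Projecting `a` onto the x-axis keeps only its first coordinate, a tiny instance of reducing two dimensions to one.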

Measuring Change: Gradients and Optimization

Machine learning models "learn" by minimizing error, which requires calculating how the loss changes as the model's parameters change. A derivative measures the rate of change with respect to a single variable, while the gradient collects the partial derivatives for all variables and points in the direction of steepest ascent; gradient descent therefore steps in the opposite direction. The Jacobian describes how a vector-valued function changes, and the Hessian (a matrix of second-order derivatives) describes the curvature of the loss function. Together these tools let optimization algorithms navigate the loss landscape to find the parameters that minimize error.
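A small sketch may make this concrete. It assumes a toy quadratic loss chosen for illustration (not taken from any real model), computes its gradient analytically, checks that gradient with finite differences, and then runs plain gradient descent. The function names, learning rate, and step count are all hypothetical choices.

```python
def loss(w):
    """Toy quadratic loss with its minimum at w = (3, -1)."""
    return (w[0] - 3.0) ** 2 + 2.0 * (w[1] + 1.0) ** 2

def gradient(w):
    """Analytic gradient: the vector of partial derivatives of the loss."""
    return [2.0 * (w[0] - 3.0), 4.0 * (w[1] + 1.0)]

def numerical_gradient(f, w, eps=1e-6):
    """Central-difference estimate, handy for checking an analytic gradient."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        down = list(w)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad

def gradient_descent(w, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient (the direction of steepest descent)."""
    for _ in range(steps):
        g = gradient(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w

w_start = [0.0, 0.0]
w_final = gradient_descent(w_start)

print(gradient(w_start))                  # [-6.0, 4.0]
print(numerical_gradient(loss, w_start))  # approximately the same values
print(w_final, loss(w_final))             # close to [3, -1], loss near 0
```

For this loss the Hessian is the constant diagonal matrix with entries 2 and 4, so the curvature is steeper along the second parameter. That is why a learning rate of 0.1 converges here, while a rate above 0.5 would overshoot along that axis and diverge.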

Decision Making Under Uncertainty: Probability and Eigen-Decomposition

Learning is ultimately about making decisions when data is incomplete. Probability distributions quantify the likelihood of outcomes. Expectation describes the central tendency of a variable, variance describes its spread, and covariance describes how two variables vary together. When analyzing the structure of data itself, eigenvectors and eigenvalues decompose a square matrix into the directions it only stretches and the amount of stretch along each. Applied to a dataset's covariance matrix, the eigenvectors are the "principal directions" of the data and the eigenvalues are the variance along them. This is the mathematical foundation of Principal Component Analysis (PCA) and of the spectral methods used in some deep learning architectures.
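As a rough illustration of how eigenvectors expose a dataset's principal direction, the sketch below builds the sample covariance matrix of a few made-up 2D points and finds its leading eigenvector with power iteration. Power iteration is swapped in here only because it is short and dependency-free; library PCA implementations typically use a full eigendecomposition or SVD instead. The data and all function names are assumptions for this example.

```python
import math

def covariance_matrix(points):
    """Sample covariance matrix (n - 1 denominator) of 2D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    cyy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    cxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return [[cxx, cxy], [cxy, cyy]]

def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def power_iteration(m, iterations=100):
    """Approximate the largest eigenvalue and its unit eigenvector."""
    v = [1.0, 0.0]
    for _ in range(iterations):
        w = mat_vec(m, v)
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    # Rayleigh quotient: v . (m v) for a unit vector v
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigenvalue, v

# Made-up points scattered close to the line y = x
points = [(2.0, 1.9), (0.0, 0.1), (-2.0, -2.1), (1.0, 1.2), (-1.0, -1.1)]

cov = covariance_matrix(points)
eigenvalue, direction = power_iteration(cov)
explained = eigenvalue / (cov[0][0] + cov[1][1])

print(direction)  # roughly the diagonal direction
print(explained)  # share of total variance along that direction
```

Because the points lie almost on a line, nearly all of the variance falls along a single eigenvector. PCA would keep that direction and drop the other with little loss of information.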

