
Singular Value Decomposition (SVD) and Principal Component Analysis (PCA)
Explore the concepts of Singular Value Decomposition and Principal Component Analysis, including definitions, applications, and algorithms. Learn how SVD is utilized in PCA to analyze high-dimensional data efficiently. Discover the motivation behind PCA and its importance in reducing computational costs and extracting meaningful information from data.
Presentation Transcript
Singular Value Decomposition
SVD - Definition

Definition: A matrix $A \in \mathbb{R}^{m \times n}$ can be written as $A = U \Sigma V^T$, where $U \in \mathbb{R}^{m \times m}$, $\Sigma \in \mathbb{R}^{m \times n}$, and $V \in \mathbb{R}^{n \times n}$, such that $U, V$ are orthogonal matrices and $\Sigma$ is a rectangular diagonal matrix.

Note: The decomposition of $A$ can be thought of as a reduction of the matrix into three transformations: an initial rotation $V^T$, a scaling $\Sigma$, and a final rotation $U$.

Uses:
1. Finding pseudoinverses
2. Low-rank matrix approximation
3. Whitening (more later!)
4. PCA (more later!)
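To make the definition concrete, here is a minimal NumPy sketch (mine, not part of the original slides) that computes the decomposition and checks the properties above, including the pseudoinverse use case:

```python
# A minimal sketch: computing the SVD with NumPy and checking the
# orthogonality / reconstruction properties stated in the definition.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # example 5x3 matrix

U, s, Vt = np.linalg.svd(A)              # full SVD: U is 5x5, Vt is 3x3
Sigma = np.zeros(A.shape)                # rectangular diagonal matrix
np.fill_diagonal(Sigma, s)

assert np.allclose(U @ Sigma @ Vt, A)    # A = U Sigma V^T
assert np.allclose(U.T @ U, np.eye(5))   # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(3)) # V is orthogonal

# One of the listed uses: the pseudoinverse A+ = V Sigma+ U^T
A_pinv = Vt.T @ np.linalg.pinv(Sigma) @ U.T
assert np.allclose(A_pinv, np.linalg.pinv(A))
```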
SVD - Intuition

Low-Rank Approximation: Say we want a rank-$k$ representation of a matrix $A \in \mathbb{R}^{m \times n}$. All we have to do is write $A \approx U_k \Sigma_k V_k^T$, where $U_k \in \mathbb{R}^{m \times k}$, $\Sigma_k \in \mathbb{R}^{k \times k}$, and $V_k \in \mathbb{R}^{n \times k}$, such that $U_k, V_k$ have orthonormal columns and $\Sigma_k$ is the diagonal matrix of the $k$ largest singular values.

In other words: say $A \in \mathbb{R}^{m \times n}$ is a matrix of $m$ examples and $n$ features, so each example (i.e., row) $a_i$ is a vector of dimension $n$. To "generate" this low-rank approximation, we take a linear combination of $k$ vectors, each of dimension $n$. For example, let $A \approx CF$, where $C \in \mathbb{R}^{m \times k}$ and $F \in \mathbb{R}^{k \times n}$. Each row of $F$ is a "factor" and each row of $C$ holds "coefficients". So $a_1 = c_1 F$, i.e., row $a_1$ is a linear combination of the rows of $F$, where the coefficients of the linear combination are the first row of $C$.
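A hedged sketch of this idea (the variable names are mine, not from the slides): rank-$k$ approximation of $A$ by truncating its SVD to the top $k$ singular values, plus the factor/coefficient view described above.

```python
# Rank-k approximation of A by keeping only the top k singular values.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 20))       # 100 examples, 20 features
k = 5

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation

# Factor view from the slide: A ~ C F with C = U_k Sigma_k, F = V_k^T
C = U[:, :k] * s[:k]                     # m x k "coefficients"
F = Vt[:k, :]                            # k x n "factors"
assert np.allclose(C @ F, A_k)

# By the Eckart-Young theorem, no rank-k matrix is closer to A
# in Frobenius norm than A_k.
err = np.linalg.norm(A - A_k)
print(f"rank-{k} reconstruction error: {err:.3f}")
```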
PCA - Motivation

Problem: The dimension of our data is too high! Can lead to:
1. High computational costs
2. Persistence of useless information
3. Low morale
PCA - Algorithm

Algorithm:
1. Demean the data (whiten?). Decide whether to standardize: is the importance of features independent of the variance of the features? Output $X$.
2. Compute the covariance matrix $C = X^T X$.
3. Calculate the eigenvectors and eigenvalues of $C$. How do we calculate these? SVD! (See the sketch below.)
4. Sort the eigenvalues and order the columns of the eigenvector matrix $W$ accordingly to create $W_k$.
5. Calculate the final embedding $Y = X W_k$.

Source: A One-Stop Shop for Principal Component Analysis | by Matt Brems | Towards Data Science
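A minimal sketch of the algorithm above (the function and variable names are mine, not from the slides), computing PCA via the SVD of the demeaned data matrix:

```python
# PCA via SVD on the demeaned data matrix.
import numpy as np

def pca(X, k):
    """Project the n x d data matrix X onto its top-k principal components."""
    X = X - X.mean(axis=0)                    # step 1: demean each feature
    # Optionally standardize: X /= X.std(axis=0)

    # Steps 2-3: instead of forming C = X^T X and eigendecomposing it,
    # take the SVD of X; the right singular vectors V are C's eigenvectors.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Step 4: np.linalg.svd returns singular values sorted in descending
    # order, so the top-k eigenvectors are the first k rows of Vt.
    W_k = Vt[:k].T                            # d x k eigenvector matrix

    return X @ W_k                            # step 5: n x k embedding Y = X W_k

# Usage: embed 200 samples of 10-dimensional data into 2 dimensions.
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 10))
Y = pca(X, k=2)
print(Y.shape)                                # (200, 2)
```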
PCA & SVD

Why use SVD? Take $X$ to be our data matrix. For PCA we need the eigenvalues of $X^T X$. Recall from part (1a) that $X^T X = V \Sigma^2 V^T$ when $X = U \Sigma V^T$ (since $X^T X = V \Sigma^T U^T U \Sigma V^T$ and $U^T U = I$). This is an eigendecomposition, so $\Sigma^2$ is the eigenvalue matrix. So we then have that each eigenvalue of $X^T X$ is just the square of a singular value of $X$! In essence, the PCA problem boils down to SVD!
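A quick numeric check of this claim (my own sketch, not from the slides): the eigenvalues of $X^T X$ equal the squared singular values of $X$.

```python
# Verify: eigenvalues of X^T X == squared singular values of X.
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((50, 4))
X = X - X.mean(axis=0)                        # demean, as in PCA

s = np.linalg.svd(X, compute_uv=False)        # singular values, descending
eigvals = np.linalg.eigvalsh(X.T @ X)[::-1]   # eigenvalues, descending

assert np.allclose(eigvals, s**2)
```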
PCA - Intuition

Our principal components form an orthonormal basis for the dataset $X$. We can interpret them in two equivalent ways:

Maximizing Variance: The $i$th principal component is the vector in our orthonormal basis that captures the most variance while remaining orthogonal to the first $i-1$ components.

Minimizing Reconstruction Error: The top $k$ principal components are the set of $k$ vectors that, when used to project the data to a lower dimension, minimize the reconstruction error of the dataset.
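A sketch illustrating both views numerically (mine, not from the slides): the variance captured along each component, and the reconstruction error left after keeping only the top $k$ components.

```python
# Two views of PCA: variance captured per component, and reconstruction
# error from dropping the trailing components.
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((500, 6)) @ np.diag([5, 3, 2, 1, 0.5, 0.1])
X = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Maximizing variance: the variance along component i is s_i^2 / n.
var_per_component = s**2 / X.shape[0]
print("variance captured:", np.round(var_per_component, 2))

# Minimizing reconstruction error: projecting onto the top-k components
# leaves a squared error equal to the sum of the discarded s_i^2.
k = 3
X_k = X @ Vt[:k].T @ Vt[:k]                   # project and reconstruct
err = np.linalg.norm(X - X_k) ** 2
assert np.isclose(err, np.sum(s[k:] ** 2))
```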