Explanation for 1.6 (d) - Covariance Matrix Eigendecomposition
Based on the derivation in Problem 1.6(d), here is a detailed breakdown of how we get from the eigenvalues to the full matrix decomposition $\Sigma = V \Lambda V^T$.
1. From Vectors to Matrices
The step that most commonly causes confusion is moving from the single-vector equation to the full matrix equation.
We start with the definition of an eigenvector $v_i$ and eigenvalue $\lambda_i$:

$$\Sigma v_i = \lambda_i v_i$$

If we have $d$ eigenvectors ($v_1, \dots, v_d$), we can write them all side by side and express every one of these equations at once using matrix multiplication.
Left-hand side ($\Sigma V$):

When you multiply the matrix $\Sigma$ by a matrix $V$ whose columns are $v_1, \dots, v_d$, the result is simply $\Sigma$ applied to each column individually:

$$\Sigma V = \Sigma \begin{bmatrix} | & & | \\ v_1 & \dots & v_d \\ | & & | \end{bmatrix} = \begin{bmatrix} | & & | \\ \Sigma v_1 & \dots & \Sigma v_d \\ | & & | \end{bmatrix}$$

Since $\Sigma v_i = \lambda_i v_i$, we can replace each column:

$$= \begin{bmatrix} | & & | \\ \lambda_1 v_1 & \dots & \lambda_d v_d \\ | & & | \end{bmatrix}$$
Right-hand side ($V \Lambda$):

Now look at $V \Lambda$. If you multiply a matrix $V$ on the right by a diagonal matrix $\Lambda$, it scales each column of $V$ by the corresponding diagonal element:

$$\begin{bmatrix} | & & | \\ v_1 & \dots & v_d \\ | & & | \end{bmatrix} \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_d \end{bmatrix} = \begin{bmatrix} | & & | \\ \lambda_1 v_1 & \dots & \lambda_d v_d \\ | & & | \end{bmatrix}$$
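To see this column-scaling fact concretely, here is a tiny NumPy check. The numbers are made up purely for illustration and are not part of the problem:

```python
# Right-multiplying by a diagonal matrix scales each column of V by the
# corresponding diagonal entry (illustrative values only).
import numpy as np

V = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Lambda = np.diag([10.0, 0.5])

print(V @ Lambda)
# -> [[10., 1.], [30., 2.]]: first column scaled by 10, second by 0.5
```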
Conclusion:
Since the columns match exactly, we have proven:
$$\Sigma V = V \Lambda$$
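To make the matrix equation concrete, here is a minimal NumPy sketch. The data, the variable names, and the use of np.linalg.eigh are my own illustration rather than anything specified by the problem; it builds a sample covariance matrix and checks $\Sigma V = V \Lambda$ numerically, column by column:

```python
# Minimal sketch: eigendecomposition of a sample covariance matrix,
# checking Sigma @ V == V @ Lambda and Sigma v_i == lambda_i v_i.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))           # 500 samples, d = 3 features (made-up data)
Sigma = np.cov(X, rowvar=False)         # d x d covariance matrix

lam, V = np.linalg.eigh(Sigma)          # eigenvalues and eigenvectors (as columns)
Lambda = np.diag(lam)

print(np.allclose(Sigma @ V, V @ Lambda))                    # True
print(all(np.allclose(Sigma @ V[:, i], lam[i] * V[:, i])
          for i in range(V.shape[1])))                       # True for every column
```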
2. Why $V$ is Orthogonal ($V^T = V^{-1}$)
The problem states that $\Sigma$ is a covariance matrix.
Covariance matrices are always symmetric ($\Sigma = \Sigma^T$).
A key theorem in linear algebra (the Spectral Theorem) states that real symmetric matrices always have a full set of orthogonal eigenvectors.
This means the dot product of any two different eigenvectors is 0, and we normalize them so their length is 1:
$$v_i^T v_j = 0 \ \text{(if } i \neq j\text{)}, \qquad v_i^T v_i = 1$$

In matrix form, calculating $V^T V$:

$$V^T V = \begin{bmatrix} - \, v_1^T \, - \\ \vdots \\ - \, v_d^T \, - \end{bmatrix} \begin{bmatrix} | & & | \\ v_1 & \dots & v_d \\ | & & | \end{bmatrix} = \begin{bmatrix} 1 & 0 & \dots \\ 0 & 1 & \dots \\ \vdots & \vdots & \ddots \end{bmatrix} = I$$

Since $V^T V = I$, by definition $V^T$ is the inverse of $V$.
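As a quick numerical sanity check (again an illustrative sketch with made-up data, not the assignment's code), the eigenvector matrix returned by np.linalg.eigh for a symmetric matrix is orthogonal:

```python
# Check that V^T V = I, V V^T = I, and therefore V^T = V^{-1}.
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.cov(rng.normal(size=(500, 3)), rowvar=False)  # symmetric by construction
_, V = np.linalg.eigh(Sigma)

print(np.allclose(V.T @ V, np.eye(3)))       # V^T V = I
print(np.allclose(V @ V.T, np.eye(3)))       # V V^T = I (V is square)
print(np.allclose(V.T, np.linalg.inv(V)))    # so V^T is the inverse of V
```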
3. Solving for $\Sigma$
Now we just rearrange the algebra:
Start with: $\Sigma V = V \Lambda$
Multiply both sides by $V^T$ from the right:

$$\Sigma V V^T = V \Lambda V^T$$
Since $V$ is square and $V^T V = I$, we also have $V V^T = I$, so the $V V^T$ on the left cancels out:
$$\Sigma = V \Lambda V^T$$
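Putting the pieces together, the decomposition is easy to verify numerically. The sketch below (illustrative setup, same made-up data as above) reconstructs $\Sigma$ from $V$ and $\Lambda$:

```python
# Reconstruct Sigma as V Lambda V^T from its eigenvalues and eigenvectors.
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.cov(rng.normal(size=(500, 3)), rowvar=False)
lam, V = np.linalg.eigh(Sigma)

Sigma_rebuilt = V @ np.diag(lam) @ V.T
print(np.allclose(Sigma, Sigma_rebuilt))   # True
```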
Geometric Interpretation
This equation tells us that any covariance matrix can be thought of as:
Rotating the space to align with the data's axes ($V^T$).
Stretching along those axes based on the variance ($\Lambda$).
Rotating back to the original orientation ($V$).
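To see this rotate-stretch-rotate reading in code, here is a short sketch (illustrative data and names, as before) that applies the three steps to a vector and compares the result with multiplying by $\Sigma$ directly:

```python
# Applying Sigma to a vector x == rotate into the eigenbasis (V^T),
# stretch each coordinate by its eigenvalue, rotate back (V).
import numpy as np

rng = np.random.default_rng(1)
Sigma = np.cov(rng.normal(size=(500, 3)), rowvar=False)
lam, V = np.linalg.eigh(Sigma)

x = rng.normal(size=3)
step_by_step = V @ (lam * (V.T @ x))          # rotate, stretch, rotate back
print(np.allclose(Sigma @ x, step_by_step))   # True
```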