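- Substitute the ML estimate of $\mu$:
We replace $\mu$ with its ML estimate $\hat{\mu} = \hat{\mu}_{ML} = \frac{1}{N}\sum_{i=1}^{N} x_i$. Let $S = \sum_{i=1}^{N} (x_i - \hat{\mu})(x_i - \hat{\mu})^T$ be the scatter matrix.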
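- Rewrite the log-likelihood using the trace:
The term in the summation is a scalar, so it equals its own trace, and the cyclic property of the trace lets us move the leading factor to the end:
$$(x_i-\mu)^T \Sigma^{-1} (x_i-\mu) = \mathrm{tr}\left((x_i-\mu)^T \Sigma^{-1} (x_i-\mu)\right) = \mathrm{tr}\left(\Sigma^{-1} (x_i-\mu)(x_i-\mu)^T\right)$$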
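Summing over $i$, with $\mu$ set to $\hat{\mu}$:
$$\sum_{i=1}^{N} (x_i-\hat{\mu})^T \Sigma^{-1} (x_i-\hat{\mu}) = \mathrm{tr}\left(\Sigma^{-1} \sum_{i=1}^{N} (x_i-\hat{\mu})(x_i-\hat{\mu})^T\right) = \mathrm{tr}\left(\Sigma^{-1} S\right)$$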
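So, dropping additive constants that do not depend on $\Sigma$, the log-likelihood is:
$$\ell(\Sigma) = -\frac{N}{2}\log|\Sigma| - \frac{1}{2}\mathrm{tr}\left(\Sigma^{-1} S\right) + \text{const}$$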
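The trace rewrite can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and synthetic data; the variable names (`X`, `D`, `Sigma`) and the random test matrix are illustrative choices, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
X = rng.normal(size=(N, d))        # rows are the samples x_i (synthetic data)
mu_hat = X.mean(axis=0)            # ML estimate of the mean
D = X - mu_hat                     # centered samples x_i - mu_hat
S = D.T @ D                        # scatter matrix: sum of outer products

# An arbitrary symmetric positive definite matrix standing in for Sigma.
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)
Sigma_inv = np.linalg.inv(Sigma)

# Left side: sum of the N quadratic forms. Right side: tr(Sigma^{-1} S).
quad_sum = sum(v @ Sigma_inv @ v for v in D)
trace_form = np.trace(Sigma_inv @ S)

print(quad_sum, trace_form, np.isclose(quad_sum, trace_form))
```

The two numbers should agree up to floating-point rounding for any invertible `Sigma`, since the identity relies only on the cyclic property of the trace.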
- Differentiate with respect to $\Sigma$:
Using the provided identities:
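- $\frac{\partial \log|\Sigma|}{\partial \Sigma} = \Sigma^{-T} = \Sigma^{-1}$ (since $\Sigma$ is symmetric).
- $\frac{\partial\, \mathrm{tr}(\Sigma^{-1} S)}{\partial \Sigma} = -\left(\Sigma^{-T} S^T \Sigma^{-T}\right)$. Since $S$ and $\Sigma$ are symmetric, this is $-\Sigma^{-1} S \Sigma^{-1}$.
$$\frac{\partial \ell}{\partial \Sigma} = -\frac{N}{2}\Sigma^{-1} - \frac{1}{2}\left(-\Sigma^{-1} S \Sigma^{-1}\right) = -\frac{N}{2}\Sigma^{-1} + \frac{1}{2}\Sigma^{-1} S \Sigma^{-1}$$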
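As a check on the gradient, one can compare the closed-form expression with central finite differences. This is a sketch under stated assumptions: the helper names `log_lik`, `analytic_grad`, and `numeric_grad` are hypothetical, the data are synthetic, and each entry of $\Sigma$ is perturbed independently, which matches how the matrix identities treat $\Sigma$.

```python
import numpy as np

def log_lik(Sigma, S, N):
    """Log-likelihood in Sigma with additive constants dropped."""
    _, logdet = np.linalg.slogdet(Sigma)
    # np.linalg.solve(Sigma, S) computes Sigma^{-1} S without an explicit inverse.
    return -0.5 * N * logdet - 0.5 * np.trace(np.linalg.solve(Sigma, S))

def analytic_grad(Sigma, S, N):
    """Closed-form gradient, valid for symmetric Sigma and S."""
    Sigma_inv = np.linalg.inv(Sigma)
    return -0.5 * N * Sigma_inv + 0.5 * Sigma_inv @ S @ Sigma_inv

def numeric_grad(Sigma, S, N, eps=1e-5):
    """Central finite differences, one matrix entry at a time."""
    G = np.zeros_like(Sigma)
    for i in range(Sigma.shape[0]):
        for j in range(Sigma.shape[1]):
            E = np.zeros_like(Sigma)
            E[i, j] = eps
            G[i, j] = (log_lik(Sigma + E, S, N) - log_lik(Sigma - E, S, N)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
N, d = 50, 3
X = rng.normal(size=(N, d))            # synthetic samples
D = X - X.mean(axis=0)
S = D.T @ D                            # scatter matrix

A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)        # symmetric positive definite test point

max_abs_diff = np.max(np.abs(numeric_grad(Sigma, S, N) - analytic_grad(Sigma, S, N)))
print(max_abs_diff)
```

The printed difference should be small relative to the gradient entries; a large value would point to a sign or transpose error in the closed form.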
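- Set derivative to zero and solve:
$$\begin{aligned}
-\frac{N}{2}\Sigma^{-1} + \frac{1}{2}\Sigma^{-1} S \Sigma^{-1} &= 0 \\
\Sigma^{-1} S \Sigma^{-1} &= N\Sigma^{-1}
\end{aligned}$$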
Multiply by Σ on the left and right:
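$$\begin{aligned}
\Sigma\left(\Sigma^{-1} S \Sigma^{-1}\right)\Sigma &= \Sigma\left(N\Sigma^{-1}\right)\Sigma \\
S &= N\Sigma \\
\Sigma &= \frac{1}{N} S
\end{aligned}$$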
So,
$$\hat{\Sigma}_{ML} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \hat{\mu})(x_i - \hat{\mu})^T$$
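The final estimator is straightforward to compute. The sketch below assumes NumPy and samples drawn from an illustrative 2D Gaussian (the mean and covariance values are made up for the example). It compares the result with `np.cov` using `bias=True`, which normalizes by $N$ rather than $N-1$, and confirms that the gradient from the derivation vanishes at the estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative data: 500 samples from a 2D Gaussian with made-up parameters.
X = rng.multivariate_normal(mean=[1.0, -2.0],
                            cov=[[2.0, 0.6], [0.6, 1.0]],
                            size=500)
N = X.shape[0]

mu_hat = X.mean(axis=0)            # ML estimate of the mean
D = X - mu_hat                     # centered samples
S = D.T @ D                        # scatter matrix
Sigma_ml = S / N                   # ML estimate of the covariance

# np.cov with bias=True divides by N, so it should match Sigma_ml.
matches_numpy = np.allclose(Sigma_ml, np.cov(X, rowvar=False, bias=True))

# Gradient of the log-likelihood evaluated at Sigma_ml; should be near zero.
Sigma_inv = np.linalg.inv(Sigma_ml)
grad_at_ml = -0.5 * N * Sigma_inv + 0.5 * Sigma_inv @ S @ Sigma_inv

print(Sigma_ml)
print(matches_numpy, np.max(np.abs(grad_at_ml)))
```

Note that the default `np.cov` call divides by $N-1$ and therefore returns the unbiased estimate, which differs from the ML estimate by a factor of $N/(N-1)$.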