Pre-required Knowledge
- Definition of Covariance: For a random vector $x$ with mean $\mu$, the covariance is $\mathbb{E}[(x-\mu)(x-\mu)^T] = \mathbb{E}[xx^T] - \mu\mu^T$.
- Steiner's Translation Theorem (Parallel Axis Theorem) equivalent: $\operatorname{cov}(x) = \mathbb{E}[xx^T] - \mathbb{E}[x]\,\mathbb{E}[x]^T$.
- Kernel Properties:
  - $\int \tilde{k}(u)\,du = 1$.
  - $\int u\,\tilde{k}(u)\,du = 0$ (zero mean).
  - $\int uu^T\,\tilde{k}(u)\,du = H$ (from Eq. 5.7, since the mean is 0).
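The three kernel properties can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes a one-dimensional Gaussian kernel with variance `var` (the scalar analogue of $H$), and both the Gaussian choice and the value 0.4 are hypothetical, not taken from the text. The integrals are approximated with a plain Riemann sum.

```python
import math

def gaussian_kernel(u, var):
    """Zero-mean Gaussian density with variance `var` (assumed kernel choice)."""
    return math.exp(-0.5 * u * u / var) / math.sqrt(2.0 * math.pi * var)

def kernel_moments(var, half_width=12.0, steps=20001):
    """Riemann-sum approximations of the zeroth, first and second kernel moments."""
    du = 2.0 * half_width / (steps - 1)
    mass = first = second = 0.0
    for j in range(steps):
        u = -half_width + j * du
        w = gaussian_kernel(u, var) * du   # k(u) du
        mass += w                          # integral of k(u) du
        first += u * w                     # integral of u k(u) du
        second += u * u * w                # integral of u^2 k(u) du
    return mass, first, second

var = 0.4  # illustrative bandwidth (variance), an assumption
mass, first, second = kernel_moments(var)
print(mass, first, second)  # should be close to 1, 0 and var respectively
```

Any other zero-mean kernel with finite second moment could be substituted for `gaussian_kernel`; only the value of the second moment would change.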
Step-by-Step Proof
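Let $\hat{\mu}$ be the mean of $\hat{p}(x)$ (derived in part (a)). The covariance is defined as:
$$\hat{\Sigma} = \mathbb{E}_{\hat{p}}\big[(x-\hat{\mu})(x-\hat{\mu})^T\big] = \int (x-\hat{\mu})(x-\hat{\mu})^T\,\hat{p}(x)\,dx$$
Alternatively, using the property $\operatorname{cov}(x) = \mathbb{E}[xx^T] - \mathbb{E}[x]\,\mathbb{E}[x]^T$:
$$\hat{\Sigma} = \int xx^T\,\hat{p}(x)\,dx - \hat{\mu}\hat{\mu}^T$$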
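Let's compute the second moment term $\int xx^T\,\hat{p}(x)\,dx$:
- Substitute $\hat{p}(x)$:
  $$\int xx^T \left(\frac{1}{n}\sum_{i=1}^{n} \tilde{k}(x - x_i)\right) dx = \frac{1}{n}\sum_{i=1}^{n} \int xx^T\,\tilde{k}(x - x_i)\,dx$$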
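- Change of Variables:
  For each term of the sum, let $u = x - x_i$, so $x = u + x_i$ and $dx = du$. The integral becomes
  $$\int (u + x_i)(u + x_i)^T\,\tilde{k}(u)\,du$$
  Expand $(u + x_i)(u + x_i)^T = uu^T + u x_i^T + x_i u^T + x_i x_i^T$.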
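- Evaluate the integral term by term:
  $$\int \big(uu^T + u x_i^T + x_i u^T + x_i x_i^T\big)\,\tilde{k}(u)\,du$$
  - $\int uu^T\,\tilde{k}(u)\,du = H$ (by definition of the kernel covariance, since the kernel has zero mean).
  - $\int u x_i^T\,\tilde{k}(u)\,du = \left(\int u\,\tilde{k}(u)\,du\right) x_i^T = 0 \cdot x_i^T = 0$.
  - $\int x_i u^T\,\tilde{k}(u)\,du = x_i \left(\int u^T\,\tilde{k}(u)\,du\right) = x_i \cdot 0^T = 0$.
  - $\int x_i x_i^T\,\tilde{k}(u)\,du = x_i x_i^T \int \tilde{k}(u)\,du = x_i x_i^T \cdot 1 = x_i x_i^T$.
  So, $\int xx^T\,\tilde{k}(x - x_i)\,dx = H + x_i x_i^T$.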
- Sum the results:
  $$\mathbb{E}_{\hat{p}}[xx^T] = \frac{1}{n}\sum_{i=1}^{n} \big(H + x_i x_i^T\big) = H + \frac{1}{n}\sum_{i=1}^{n} x_i x_i^T$$
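- Calculate the covariance:
  $$\hat{\Sigma} = \mathbb{E}_{\hat{p}}[xx^T] - \hat{\mu}\hat{\mu}^T$$
  $$\hat{\Sigma} = H + \frac{1}{n}\sum_{i=1}^{n} x_i x_i^T - \hat{\mu}\hat{\mu}^T$$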
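- Rearrange to sample covariance form:
  Recall that the sample covariance (with the $1/n$ normalization) is $S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{\mu})(x_i - \hat{\mu})^T = \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i^T\right) - \hat{\mu}\hat{\mu}^T$. The second equality holds because $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$, which is what a zero-mean kernel gives for the mean of $\hat{p}(x)$.
  Therefore:
  $$\hat{\Sigma} = H + \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i^T - \hat{\mu}\hat{\mu}^T\right)$$
  $$\hat{\Sigma} = H + \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{\mu})(x_i - \hat{\mu})^T \qquad ■$$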
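
The result can also be checked numerically. The following is a minimal sketch under stated assumptions, not part of the proof: it assumes a two-dimensional Gaussian kernel with covariance `H`, and the data points, the entries of `H`, and the helper names `kde_covariance` and `h_plus_sample_covariance` are all hypothetical choices made for illustration. It integrates the kernel density estimate on a grid and compares its covariance with $H$ plus the $1/n$ sample covariance.

```python
import math

def gaussian_kernel(u0, u1, H):
    """Zero-mean bivariate Gaussian density with covariance H = [[a, b], [b, c]]."""
    (a, b), (_, c) = H
    det = a * c - b * b
    # Quadratic form u^T H^{-1} u, written out for the 2x2 case.
    quad = (c * u0 * u0 - 2.0 * b * u0 * u1 + a * u1 * u1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def kde_covariance(data, H, lo=-8.0, hi=8.0, steps=321):
    """Covariance of the KDE, approximated by a Riemann sum on a square grid."""
    step = (hi - lo) / (steps - 1)
    n = len(data)
    m0 = m1 = s00 = s01 = s11 = 0.0
    for i in range(steps):
        x0 = lo + i * step
        for j in range(steps):
            x1 = lo + j * step
            p_hat = sum(gaussian_kernel(x0 - d0, x1 - d1, H) for d0, d1 in data) / n
            w = p_hat * step * step        # p_hat(x) dx
            m0 += x0 * w                   # first moments
            m1 += x1 * w
            s00 += x0 * x0 * w             # second moments
            s01 += x0 * x1 * w
            s11 += x1 * x1 * w
    # cov = E[x x^T] - mean mean^T (the grid is wide enough that the mass is ~1)
    return [[s00 - m0 * m0, s01 - m0 * m1],
            [s01 - m0 * m1, s11 - m1 * m1]]

def h_plus_sample_covariance(data, H):
    """Closed form from the proof: H + (1/n) sum (x_i - mu)(x_i - mu)^T."""
    n = len(data)
    mu0 = sum(d0 for d0, _ in data) / n
    mu1 = sum(d1 for _, d1 in data) / n
    s00 = sum((d0 - mu0) ** 2 for d0, _ in data) / n
    s01 = sum((d0 - mu0) * (d1 - mu1) for d0, d1 in data) / n
    s11 = sum((d1 - mu1) ** 2 for _, d1 in data) / n
    return [[H[0][0] + s00, H[0][1] + s01],
            [H[1][0] + s01, H[1][1] + s11]]

# Illustrative inputs (assumptions, not from the text).
data = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0), (-1.0, 2.0)]
H = [[0.5, 0.2], [0.2, 0.3]]

numeric = kde_covariance(data, H)
closed_form = h_plus_sample_covariance(data, H)
print(numeric)
print(closed_form)
```

The two printed matrices should agree up to the discretization error of the grid. Note that the comparison uses the $1/n$ sample covariance, matching the proof, rather than the $1/(n-1)$ version.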