Answer

Prerequisite Knowledge

  1. Definition of Covariance: For a random vector $x$ with mean $\mu$, the covariance is $\mathbb{E}[(x - \mu)(x - \mu)^T] = \mathbb{E}[xx^T] - \mu\mu^T$.
  2. Steiner's Translation Theorem (Parallel Axis Theorem) equivalent: $\text{cov}(x) = \mathbb{E}[x x^T] - \mathbb{E}[x]\mathbb{E}[x]^T$.
  3. Kernel Properties:
    • $\int \tilde{k}(u)\, du = 1$ (normalization).
    • $\int u\, \tilde{k}(u)\, du = 0$ (zero mean).
    • $\int u u^T \tilde{k}(u)\, du = H$ (from Eq. 5.7, since the mean is 0).
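The three kernel properties above can be verified numerically for a concrete choice of kernel. A minimal sketch in one dimension, assuming a Gaussian kernel with (hypothetical) bandwidth $H = h^2 = 0.25$, approximating each integral with a Riemann sum:

```python
import numpy as np

# Gaussian kernel with bandwidth H = h2 (scalar in 1D); the value 0.25 and the
# integration grid are illustrative assumptions, not from the text.
h2 = 0.25
u = np.linspace(-10.0, 10.0, 200_001)
du = u[1] - u[0]
k = np.exp(-u**2 / (2.0 * h2)) / np.sqrt(2.0 * np.pi * h2)

total = (k * du).sum()           # normalization: should be ~1
mean = (u * k * du).sum()        # zero mean:     should be ~0
second = (u**2 * k * du).sum()   # second moment: should be ~h2 = H
print(total, mean, second)
```

The same check extends to the multivariate case, where the second moment becomes the matrix $\int u u^T \tilde{k}(u)\, du = H$.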

Step-by-Step Proof

Let $\hat{\mu}$ be the mean of $\hat{p}(x)$ (derived in part (a)). The covariance is defined as:

$$\hat{\Sigma} = \mathbb{E}_{\hat{p}}[(x - \hat{\mu})(x - \hat{\mu})^T] = \int (x - \hat{\mu})(x - \hat{\mu})^T\, \hat{p}(x)\, dx$$

Alternatively, using the property $\text{cov}(x) = \mathbb{E}[xx^T] - \mathbb{E}[x]\mathbb{E}[x]^T$:

$$\hat{\Sigma} = \int x x^T \hat{p}(x)\, dx - \hat{\mu}\hat{\mu}^T$$

Let's compute the second-moment term $\int x x^T \hat{p}(x)\, dx$:

  1. Substitute $\hat{p}(x)$:

$$\int x x^T \left( \frac{1}{n} \sum_{i=1}^n \tilde{k}(x - x_i) \right) dx = \frac{1}{n} \sum_{i=1}^n \int x x^T \tilde{k}(x - x_i)\, dx$$
  2. Change of variables: Let $u = x - x_i$, so $x = u + x_i$:

$$\int (u + x_i)(u + x_i)^T \tilde{k}(u)\, du$$

    Expand $(u + x_i)(u + x_i)^T = uu^T + ux_i^T + x_i u^T + x_i x_i^T$.

  3. Evaluate the integral term by term:

$$\int (uu^T + ux_i^T + x_i u^T + x_i x_i^T)\, \tilde{k}(u)\, du$$

    • $\int uu^T \tilde{k}(u)\, du = H$ (second moment of the kernel, which has zero mean).
    • $\int u x_i^T \tilde{k}(u)\, du = \left(\int u \tilde{k}(u)\, du\right) x_i^T = 0 \cdot x_i^T = 0$.
    • $\int x_i u^T \tilde{k}(u)\, du = x_i \left(\int u^T \tilde{k}(u)\, du\right) = x_i \cdot 0 = 0$.
    • $\int x_i x_i^T \tilde{k}(u)\, du = x_i x_i^T \int \tilde{k}(u)\, du = x_i x_i^T \cdot 1 = x_i x_i^T$.

    So $\int x x^T \tilde{k}(x - x_i)\, dx = H + x_i x_i^T$.

  4. Sum over the data points:

$$\mathbb{E}_{\hat{p}}[x x^T] = \frac{1}{n} \sum_{i=1}^n \left(H + x_i x_i^T\right) = H + \frac{1}{n} \sum_{i=1}^n x_i x_i^T$$
  5. Calculate the covariance:

$$\hat{\Sigma} = \mathbb{E}_{\hat{p}}[x x^T] - \hat{\mu}\hat{\mu}^T = H + \frac{1}{n} \sum_{i=1}^n x_i x_i^T - \hat{\mu}\hat{\mu}^T$$
  6. Rearrange into sample-covariance form: Recall that the sample covariance is $S = \frac{1}{n}\sum_{i=1}^n (x_i - \hat{\mu})(x_i - \hat{\mu})^T = \frac{1}{n}\sum_{i=1}^n x_i x_i^T - \hat{\mu}\hat{\mu}^T$. Therefore:

$$\hat{\Sigma} = H + \left( \frac{1}{n} \sum_{i=1}^n x_i x_i^T - \hat{\mu}\hat{\mu}^T \right) = H + \frac{1}{n} \sum_{i=1}^n (x_i - \hat{\mu})(x_i - \hat{\mu})^T = H + S \quad \blacksquare$$
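The result that the KDE covariance equals the bandwidth matrix plus the sample covariance can be sanity-checked numerically. A minimal sketch assuming a Gaussian kernel: a Gaussian-kernel KDE is a uniform mixture of $N(x_i, H)$ components, so we can sample from $\hat{p}$ directly and compare the empirical covariance against the closed form $H + S$. The data, bandwidth matrix, and sample sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n points in d = 2 dimensions, and a symmetric positive-definite
# bandwidth matrix H (all values are illustrative, not from the text).
n, d = 400, 2
X = rng.normal(size=(n, d)) @ np.array([[1.0, 0.3], [0.0, 0.5]])
H = np.array([[0.4, 0.1], [0.1, 0.2]])

# Closed form from the proof: Sigma_hat = H + S, with S the (1/n)-normalized
# sample covariance.
mu_hat = X.mean(axis=0)
S = (X - mu_hat).T @ (X - mu_hat) / n
Sigma_closed = H + S

# Monte Carlo check: draw from the KDE by picking a component i uniformly and
# adding N(0, H) kernel noise, then estimate the covariance of the draws.
m = 200_000
idx = rng.integers(0, n, size=m)
samples = X[idx] + rng.multivariate_normal(np.zeros(d), H, size=m)
Sigma_mc = np.cov(samples, rowvar=False, bias=True)  # bias=True -> 1/m normalization

err = np.max(np.abs(Sigma_closed - Sigma_mc))
print(err)  # small Monte Carlo error
```

The maximum entrywise discrepancy shrinks as the number of Monte Carlo draws grows, consistent with $\hat{\Sigma} = H + S$.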