Problem 1.8 Product of Multivariate Gaussian Distributions - Answer

Prerequisite Knowledge

To solve this problem, you need to be familiar with the following concepts:

  1. Multivariate Gaussian Definition: $\mathcal{N}(x|\mu, \Sigma) = \frac{1}{(2\pi)^{d/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$
  2. Completing the Square: As derived in Problem 1.10, a quadratic form $x^T A x - 2x^T b$ can be completed as $(x - A^{-1}b)^T A (x - A^{-1}b) - b^T A^{-1} b$.
  3. Matrix Identities:
    • $(A^{-1} + B^{-1})^{-1} = A(A+B)^{-1}B = B(A+B)^{-1}A$ (useful for determinant manipulation).
    • Identity for quadratic forms: $a^T A^{-1} a + b^T B^{-1} b - (A^{-1}a + B^{-1}b)^T (A^{-1} + B^{-1})^{-1} (A^{-1}a + B^{-1}b) = (a-b)^T (A+B)^{-1} (a-b)$.
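
The completing-the-square formula and the inverse-sum identity above can be spot-checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `random_spd`, the dimension, and the random seed are illustrative choices and not part of the original problem.

```python
import numpy as np

def random_spd(rng, d):
    """Hypothetical helper: a random symmetric positive-definite d x d matrix."""
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)

rng = np.random.default_rng(0)
d = 3
A, B = random_spd(rng, d), random_spd(rng, d)
b = rng.standard_normal(d)
x = rng.standard_normal(d)

# Completing the square: x^T A x - 2 x^T b
#   = (x - A^{-1} b)^T A (x - A^{-1} b) - b^T A^{-1} b
m = np.linalg.solve(A, b)                      # m = A^{-1} b
lhs_square = x @ A @ x - 2 * x @ b
rhs_square = (x - m) @ A @ (x - m) - b @ m

# Inverse-sum identity: (A^{-1} + B^{-1})^{-1} = A (A+B)^{-1} B = B (A+B)^{-1} A
inv = np.linalg.inv
C = inv(inv(A) + inv(B))
form1 = A @ inv(A + B) @ B
form2 = B @ inv(A + B) @ A

print(np.isclose(lhs_square, rhs_square), np.allclose(C, form1), np.allclose(C, form2))
```

A numerical check on one random draw is not a proof, but it catches sign and ordering mistakes quickly before they propagate into the derivation.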

Step-by-Step Derivation

We evaluate the product:

$$P(x) = \mathcal{N}(x|a, A)\,\mathcal{N}(x|b, B)$$

Step 1: Expand the Exponents

Ignoring the normalization constants for a moment, let's look at the exponent term $E$ (where the total exponent is $-\frac{1}{2}E$):

$$E = (x-a)^T A^{-1} (x-a) + (x-b)^T B^{-1} (x-b)$$

Expanding these quadratics (the two cross terms in each bracket merge because $A^{-1}$ and $B^{-1}$ are symmetric):

$$E = (x^T A^{-1} x - 2x^T A^{-1} a + a^T A^{-1} a) + (x^T B^{-1} x - 2x^T B^{-1} b + b^T B^{-1} b)$$

Group terms by powers of $x$:

$$E = x^T (A^{-1} + B^{-1}) x - 2x^T (A^{-1} a + B^{-1} b) + (a^T A^{-1} a + b^T B^{-1} b)$$

Step 2: Define New Parameters $c$ and $C$

We want to match the form of a Gaussian exponent: $(x-c)^T C^{-1} (x-c)$. Looking at the quadratic term in $x$, we identify the precision matrix as the sum of the precisions:

$$C^{-1} = A^{-1} + B^{-1} \implies C = (A^{-1} + B^{-1})^{-1}$$

This matches equation (1.25).

Looking at the linear term $-2x^T (A^{-1} a + B^{-1} b)$, we equate it to $-2x^T C^{-1} c$ from the expansion of $(x-c)^T C^{-1} (x-c)$:

$$C^{-1} c = A^{-1} a + B^{-1} b$$

Multiplying by $C$ from the left:

$$c = C(A^{-1} a + B^{-1} b)$$

This matches equation (1.24).
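
The two parameter formulas translate directly into code. This is a minimal sketch, assuming NumPy; the function name `product_params` and the example inputs are hypothetical and chosen only so the result is easy to verify by hand.

```python
import numpy as np

def product_params(a, A, b, B):
    """Return (c, C) with C = (A^{-1} + B^{-1})^{-1} and c = C (A^{-1} a + B^{-1} b)."""
    A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)
    C = np.linalg.inv(A_inv + B_inv)        # precisions add
    c = C @ (A_inv @ a + B_inv @ b)         # precision-weighted mean
    return c, C

# Illustrative inputs: two unit-covariance Gaussians in 2-D.
a = np.array([0.0, 2.0])
b = np.array([2.0, 0.0])
c, C = product_params(a, np.eye(2), b, np.eye(2))
print(c)   # midpoint of a and b: [1. 1.]
print(C)   # half the identity matrix
```

With equal covariances the mean $c$ is the midpoint of $a$ and $b$ and the covariance halves, which is the multivariate analogue of averaging two equally reliable measurements.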

Step 3: Complete the Square

Using the result from Problem 1.10, we can rewrite the terms involving $x$ in $E$:

$$x^T C^{-1} x - 2x^T (C^{-1}c) = (x-c)^T C^{-1} (x-c) - c^T C^{-1} c$$

Substitute this back into the expression for $E$:

$$E = (x-c)^T C^{-1} (x-c) - c^T C^{-1} c + (a^T A^{-1} a + b^T B^{-1} b)$$

Let $R$ be the residual scalar term:

$$R = a^T A^{-1} a + b^T B^{-1} b - c^T C^{-1} c$$

So the product $P(x)$ can be written as:

$$P(x) = \frac{1}{Z_{\text{norm}}} \exp\left( -\frac{1}{2} (x-c)^T C^{-1} (x-c) \right) \exp\left( -\frac{1}{2} R \right)$$

where $Z_{\text{norm}} = (2\pi)^{d/2}|A|^{1/2} (2\pi)^{d/2}|B|^{1/2} = (2\pi)^{d} |A|^{1/2} |B|^{1/2}$.

Notice that $\exp\left( -\frac{1}{2} (x-c)^T C^{-1} (x-c) \right)$ is the unnormalized kernel of $\mathcal{N}(x|c, C)$.
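
As a sanity check on the completed square, the original exponent $E$ and its rewritten form $(x-c)^T C^{-1}(x-c) + R$ can be compared at a few points. This is a sketch assuming NumPy; the numeric values of $a$, $b$, $A$, $B$ and the function names are arbitrary illustrations.

```python
import numpy as np

# Arbitrary illustrative parameters (A and B are symmetric positive definite).
a, b = np.array([1.0, -1.0]), np.array([0.5, 2.0])
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.3], [-0.3, 1.5]])

A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)
C_inv = A_inv + B_inv
c = np.linalg.solve(C_inv, A_inv @ a + B_inv @ b)
R = a @ A_inv @ a + b @ B_inv @ b - c @ C_inv @ c   # residual scalar

def E_original(x):
    """Sum of the two Gaussian quadratic forms."""
    return (x - a) @ A_inv @ (x - a) + (x - b) @ B_inv @ (x - b)

def E_completed(x):
    """Single quadratic form in x plus the x-independent residual R."""
    return (x - c) @ C_inv @ (x - c) + R

rng = np.random.default_rng(1)
xs = rng.standard_normal((5, 2))
max_gap = max(abs(E_original(x) - E_completed(x)) for x in xs)
print(max_gap)   # should be at the level of floating-point round-off
```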

Step 4: Simplify the Residual Term $R$

We need to show that $R = (a-b)^T (A+B)^{-1} (a-b)$.

Substitute $c = C(A^{-1} a + B^{-1} b)$ into $c^T C^{-1} c$. Since $C$ is symmetric ($C^T = C$) and $C^{-1} C = I$:

$$c^T C^{-1} c = (A^{-1} a + B^{-1} b)^T C^T C^{-1} C (A^{-1} a + B^{-1} b) = (A^{-1} a + B^{-1} b)^T C (A^{-1} a + B^{-1} b)$$

Thus, with $C = (A^{-1} + B^{-1})^{-1}$:

$$R = a^T A^{-1} a + b^T B^{-1} b - (A^{-1} a + B^{-1} b)^T (A^{-1} + B^{-1})^{-1} (A^{-1} a + B^{-1} b)$$

Using the quadratic-form identity listed in the prerequisites (provable via the Woodbury identity or elementary algebra), written here for generic vectors $u$ and $v$ so that it does not clash with the Gaussian variable $x$:

$$u^T A^{-1} u + v^T B^{-1} v - (A^{-1}u + B^{-1}v)^T (A^{-1} + B^{-1})^{-1} (A^{-1}u + B^{-1}v) = (u-v)^T (A+B)^{-1} (u-v)$$

Substituting $u=a$, $v=b$:

$$R = (a-b)^T (A+B)^{-1} (a-b)$$

This is exactly the quadratic form in the exponent of $\mathcal{N}(a|b, A+B)$, so $\exp\left(-\frac{1}{2}R\right)$ is its exponential factor.
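
The simplification of $R$ is the least obvious step, so it is worth checking numerically. The sketch below assumes NumPy and uses arbitrary illustrative values; `R_long` and `R_short` are hypothetical names for the two sides of the identity.

```python
import numpy as np

# Arbitrary illustrative parameters (A and B are symmetric positive definite).
a, b = np.array([1.0, -1.0]), np.array([0.5, 2.0])
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.3], [-0.3, 1.5]])

A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)
C = np.linalg.inv(A_inv + B_inv)
v = A_inv @ a + B_inv @ b                     # v = C^{-1} c

# Residual as it falls out of completing the square.
R_long = a @ A_inv @ a + b @ B_inv @ b - v @ C @ v

# Residual after applying the quadratic-form identity.
R_short = (a - b) @ np.linalg.solve(A + B, a - b)

print(np.isclose(R_long, R_short))
```

Because $A + B$ is positive definite, `R_short` is non-negative, which is consistent with $\exp(-\frac{1}{2}R)$ being at most 1.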

Step 5: Determine the Scaling Factor $Z$

We have:

$$\mathcal{N}(x|a, A)\,\mathcal{N}(x|b, B) = \mathcal{N}(x|c, C) \cdot \underbrace{\frac{(2\pi)^{d/2}|C|^{1/2}}{(2\pi)^{d}|A|^{1/2}|B|^{1/2}} \exp\left(-\frac{1}{2}R\right)}_{Z}$$

We have already shown that $\exp\left(-\frac{1}{2}R\right)$ matches the exponential part of $\mathcal{N}(a|b, A+B)$. Now check the determinant prefactor. After cancelling a factor of $(2\pi)^{d/2}$:

$$\text{Prefactor} = \frac{|C|^{1/2}}{(2\pi)^{d/2} |A|^{1/2} |B|^{1/2}}$$

We want this to match the normalization constant of $\mathcal{N}(a|b, A+B)$, which is $\frac{1}{(2\pi)^{d/2} |A+B|^{1/2}}$.

We need to check whether:

$$\frac{|C|^{1/2}}{|A|^{1/2} |B|^{1/2}} = \frac{1}{|A+B|^{1/2}}$$

Squaring both sides:

$$\frac{|C|}{|A||B|} = \frac{1}{|A+B|} \iff |C||A+B| = |A||B|$$

Recall that $C = (A^{-1} + B^{-1})^{-1}$. Using $|XY| = |X||Y|$ and $|X^{-1}| = 1/|X|$:

$$|C| = |(A^{-1} + B^{-1})^{-1}| = \frac{1}{|A^{-1} + B^{-1}|} = \frac{1}{|A^{-1}(A+B)B^{-1}|} = \frac{1}{|A^{-1}| |A+B| |B^{-1}|} = \frac{|A||B|}{|A+B|}$$

Thus, the determinants match.
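
The determinant argument rests on the factorization $A^{-1} + B^{-1} = A^{-1}(A+B)B^{-1}$. A short sketch, assuming NumPy and arbitrary illustrative matrices, checks both the factorization and the resulting determinant relation.

```python
import numpy as np

# Arbitrary illustrative symmetric positive-definite matrices.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.3], [-0.3, 1.5]])

A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)
precision_sum = A_inv + B_inv
factored = A_inv @ (A + B) @ B_inv            # A^{-1} (A + B) B^{-1}

C = np.linalg.inv(precision_sum)
det_C = np.linalg.det(C)
det_ratio = np.linalg.det(A) * np.linalg.det(B) / np.linalg.det(A + B)

print(np.allclose(precision_sum, factored), np.isclose(det_C, det_ratio))
```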

Conclusion

Combining the residual exponent and the determinant factors, we get:

$$Z = \frac{1}{(2\pi)^{d/2}|A+B|^{1/2}} \exp\left( -\frac{1}{2} (a-b)^T (A+B)^{-1} (a-b) \right) = \mathcal{N}(a|b, A+B)$$

So,

$$\mathcal{N}(x|a, A)\,\mathcal{N}(x|b, B) = Z\, \mathcal{N}(x|c, C)$$

Q.E.D.
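
The final identity can be verified end to end by evaluating both sides at a few points. This is a minimal sketch, assuming NumPy; `gaussian_pdf` is a hypothetical helper written directly from the density definition, and the parameter values and test points are arbitrary.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of N(x | mean, cov), computed from the definition."""
    d = mean.shape[0]
    diff = x - mean
    quad = diff @ np.linalg.solve(cov, diff)          # diff^T cov^{-1} diff
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

# Arbitrary illustrative parameters (A and B are symmetric positive definite).
a, b = np.array([1.0, -1.0]), np.array([0.5, 2.0])
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.3], [-0.3, 1.5]])

A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)
C = np.linalg.inv(A_inv + B_inv)
c = C @ (A_inv @ a + B_inv @ b)
Z = gaussian_pdf(a, b, A + B)                         # Z = N(a | b, A + B)

rng = np.random.default_rng(0)
xs = rng.standard_normal((4, 2))
lhs = np.array([gaussian_pdf(x, a, A) * gaussian_pdf(x, b, B) for x in xs])
rhs = np.array([Z * gaussian_pdf(x, c, C) for x in xs])

print(np.allclose(lhs, rhs, rtol=1e-10, atol=0.0))
```

The comparison uses a relative tolerance only, since the density values themselves can be small and an absolute tolerance would make the check too lenient.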