-
Define the Objective Function
The problem asks us to find the parameter vector θ that minimizes the sum of squared errors, denoted J(θ):
$$J(\theta) = \lVert y - \Phi^T \theta \rVert^2$$
-
Expand the Objective Function
We can express the squared L2 norm as an inner product:
$$J(\theta) = (y - \Phi^T\theta)^T (y - \Phi^T\theta) = (y^T - \theta^T\Phi)(y - \Phi^T\theta) = y^T y - y^T\Phi^T\theta - \theta^T\Phi y + \theta^T\Phi\Phi^T\theta$$
-
Simplify the Expression
Note that $y^T\Phi^T\theta$ is a scalar quantity (its dimensions are $1 \times n \cdot n \times D \cdot D \times 1 = 1 \times 1$). The transpose of a scalar is itself, so:
$$(y^T\Phi^T\theta)^T = \theta^T\Phi y$$
Therefore, the two middle terms are equal, and the objective function simplifies to:
$$J(\theta) = y^T y - 2\theta^T\Phi y + \theta^T\Phi\Phi^T\theta$$
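As a quick numerical sanity check, the simplified form above can be compared against the direct squared-norm definition on random data. The shapes here are assumptions for illustration: `Phi` is $D \times n$ and `y` is $n \times 1$.

```python
import numpy as np

# Random problem instance (shapes are assumptions: Phi is D x n, y is n x 1)
rng = np.random.default_rng(0)
D, n = 3, 10
Phi = rng.normal(size=(D, n))
y = rng.normal(size=(n, 1))
theta = rng.normal(size=(D, 1))

# Direct form: J(theta) = ||y - Phi^T theta||^2
J_direct = float(np.sum((y - Phi.T @ theta) ** 2))

# Simplified form: y^T y - 2 theta^T Phi y + theta^T Phi Phi^T theta
J_simplified = float(y.T @ y - 2 * theta.T @ (Phi @ y) + theta.T @ Phi @ Phi.T @ theta)

print(abs(J_direct - J_simplified) < 1e-9)  # True
```

Both forms agree to floating-point precision, as the algebra predicts.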
-
Compute the Derivative with Respect to θ
To find the minimum, we take the gradient of J(θ) with respect to θ and set it to the zero vector. Using standard matrix calculus identities:
- $\nabla_\theta(\theta^T a) = a$ for a constant vector $a$
- $\nabla_\theta(\theta^T A \theta) = (A + A^T)\theta$
For the symmetric matrix $A = \Phi\Phi^T$, the second identity gives $\nabla_\theta(\theta^T \Phi\Phi^T \theta) = 2\Phi\Phi^T\theta$. Thus:
$$\frac{\partial J(\theta)}{\partial \theta} = -2\Phi y + 2\Phi\Phi^T\theta$$
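The analytic gradient can be verified with central finite differences. The random data and shapes below are illustrative assumptions ($\Phi$ is $D \times n$, $y$ is $n \times 1$).

```python
import numpy as np

# Random problem instance (shapes are assumptions: Phi is D x n, y is n x 1)
rng = np.random.default_rng(1)
D, n = 4, 12
Phi = rng.normal(size=(D, n))
y = rng.normal(size=(n, 1))
theta = rng.normal(size=(D, 1))

def J(t):
    """Sum-squared-error objective J(t) = ||y - Phi^T t||^2."""
    r = y - Phi.T @ t
    return float(r.T @ r)

# Analytic gradient: -2 Phi y + 2 Phi Phi^T theta
grad = -2 * Phi @ y + 2 * Phi @ Phi.T @ theta

# Central finite differences, one coordinate at a time
eps = 1e-6
fd = np.zeros_like(theta)
for i in range(D):
    e = np.zeros_like(theta)
    e[i] = eps
    fd[i] = (J(theta + e) - J(theta - e)) / (2 * eps)

print(np.max(np.abs(grad - fd)) < 1e-4)  # True
```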
-
Solve for θ
Set the derivative to zero to find the optimal θ:
$$-2\Phi y + 2\Phi\Phi^T\theta = 0$$
$$\Phi\Phi^T\theta = \Phi y$$
Assuming $\Phi\Phi^T$ is invertible, we multiply both sides on the left by $(\Phi\Phi^T)^{-1}$:
$$\hat{\theta}_{LS} = (\Phi\Phi^T)^{-1}\Phi y$$
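The closed-form solution can be sketched numerically and cross-checked against NumPy's least-squares solver. The data and shapes are assumptions for illustration; `np.linalg.solve` is used instead of forming the explicit inverse, which is the numerically preferable way to apply $(\Phi\Phi^T)^{-1}$.

```python
import numpy as np

# Random problem instance (shapes are assumptions: Phi is D x n, y has length n)
rng = np.random.default_rng(2)
D, n = 3, 50
Phi = rng.normal(size=(D, n))
y = rng.normal(size=(n,))

# Closed form: theta_hat = (Phi Phi^T)^{-1} Phi y, via solve() rather than
# an explicit matrix inverse for numerical stability
theta_hat = np.linalg.solve(Phi @ Phi.T, Phi @ y)

# Reference: minimize ||y - Phi^T theta||^2 directly with NumPy's solver
theta_ref, *_ = np.linalg.lstsq(Phi.T, y, rcond=None)

print(np.allclose(theta_hat, theta_ref))  # True
```

With $n \gg D$ and random Gaussian features, $\Phi\Phi^T$ is invertible with probability one, so both routes recover the same $\hat{\theta}_{LS}$.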