
Answer

Prerequisites

  • Linear Algebra (Block Matrices)
  • Matrix Calculus
  • Quadratic Programming (QP)

Step-by-Step Derivation

  1. Expanding the Objective Function: We are given the optimization problem from (3.63):

     $$J(\theta^+, \theta^-) = \frac{1}{2} \|y - \Phi^T(\theta^+ - \theta^-)\|^2 + \lambda \sum_i (\theta_i^+ + \theta_i^-)$$

     First, let's expand the squared L2-norm term $\frac{1}{2} \|y - \Phi^T(\theta^+ - \theta^-)\|^2$:

     $$\frac{1}{2} \left(y - \Phi^T(\theta^+ - \theta^-)\right)^T \left(y - \Phi^T(\theta^+ - \theta^-)\right) = \frac{1}{2} y^T y - y^T \Phi^T(\theta^+ - \theta^-) + \frac{1}{2} (\theta^+ - \theta^-)^T \Phi \Phi^T (\theta^+ - \theta^-)$$

     Now, we express this in terms of the concatenated vector $\mathbf{x} = \begin{bmatrix} \theta^+ \\ \theta^- \end{bmatrix}$. Note that:

     $$\theta^+ - \theta^- = \begin{bmatrix} \mathbf{I} & -\mathbf{I} \end{bmatrix} \begin{bmatrix} \theta^+ \\ \theta^- \end{bmatrix} = \begin{bmatrix} \mathbf{I} & -\mathbf{I} \end{bmatrix} \mathbf{x}$$

  2. Formulating the Quadratic Term $\mathbf{H}$: Look at the quadratic part of the expansion:

     $$\frac{1}{2} \left( \mathbf{x}^T \begin{bmatrix} \mathbf{I} \\ -\mathbf{I} \end{bmatrix} \right) \Phi \Phi^T \left( \begin{bmatrix} \mathbf{I} & -\mathbf{I} \end{bmatrix} \mathbf{x} \right) = \frac{1}{2} \mathbf{x}^T \left( \begin{bmatrix} \mathbf{I} \\ -\mathbf{I} \end{bmatrix} \Phi \Phi^T \begin{bmatrix} \mathbf{I} & -\mathbf{I} \end{bmatrix} \right) \mathbf{x} = \frac{1}{2} \mathbf{x}^T \begin{bmatrix} \Phi \Phi^T & -\Phi \Phi^T \\ -\Phi \Phi^T & \Phi \Phi^T \end{bmatrix} \mathbf{x}$$

     Thus, we identify the Hessian matrix $\mathbf{H}$:

     $$\mathbf{H} = \begin{bmatrix} \Phi \Phi^T & -\Phi \Phi^T \\ -\Phi \Phi^T & \Phi \Phi^T \end{bmatrix}$$

  3. Formulating the Linear Term $\mathbf{f}$: The linear parts come from the cross term in the squared norm and the L1 penalty term. Cross term:

     $$- y^T \Phi^T(\theta^+ - \theta^-) = -(\Phi y)^T (\theta^+ - \theta^-) = \begin{bmatrix} -\Phi y \\ \Phi y \end{bmatrix}^T \begin{bmatrix} \theta^+ \\ \theta^- \end{bmatrix}$$

     Notice that $- \begin{bmatrix} \Phi y \\ -\Phi y \end{bmatrix}^T \mathbf{x}$ matches this perfectly.

     Regularization term:

     $$\lambda \sum_i (\theta_i^+ + \theta_i^-) = \lambda \mathbf{1}^T \theta^+ + \lambda \mathbf{1}^T \theta^- = \left( \lambda \begin{bmatrix} \mathbf{1} \\ \mathbf{1} \end{bmatrix} \right)^T \begin{bmatrix} \theta^+ \\ \theta^- \end{bmatrix} = \left( \lambda \mathbf{1}_{2D} \right)^T \mathbf{x}$$

     Combining these linear pieces gives $\mathbf{f}^T \mathbf{x}$:

     $$\mathbf{f}^T \mathbf{x} = \left( \lambda \mathbf{1} - \begin{bmatrix} \Phi y \\ -\Phi y \end{bmatrix} \right)^T \mathbf{x}$$

     So we identify exactly:

     $$\mathbf{f} = \lambda \mathbf{1} - \begin{bmatrix} \Phi y \\ -\Phi y \end{bmatrix}$$

  4. Final Check: The objective becomes:

     $$J(\mathbf{x}) = \frac{1}{2} \mathbf{x}^T \mathbf{H} \mathbf{x} + \mathbf{f}^T \mathbf{x} + \frac{1}{2} y^T y$$

     Since $\frac{1}{2} y^T y$ is a constant with respect to $\mathbf{x}$, it can be dropped from the minimization objective. The constraints $\theta^+ \geq 0$ and $\theta^- \geq 0$ compactly become $\mathbf{x} \geq 0$. Hence, the problem is identical to:

     $$\min_{\mathbf{x}} \frac{1}{2} \mathbf{x}^T \mathbf{H} \mathbf{x} + \mathbf{f}^T \mathbf{x} \quad \text{s.t. } \mathbf{x} \geq 0$$

     which matches the required form.
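As a numerical sanity check on the identification of $\mathbf{H}$ and $\mathbf{f}$, here is a small NumPy sketch. The sizes `D`, `N`, the penalty `lam`, and the data `Phi`, `y` are all made up for illustration; the check is that the QP objective (with the $\frac{1}{2} y^T y$ constant restored) agrees with the original $J(\theta^+, \theta^-)$ at an arbitrary nonnegative point:

```python
import numpy as np

# Hypothetical sizes and random data: D parameters, N observations, Phi is D x N.
rng = np.random.default_rng(0)
D, N, lam = 3, 5, 0.7
Phi = rng.standard_normal((D, N))
y = rng.standard_normal(N)

G = Phi @ Phi.T                               # the D x D block Phi Phi^T
H = np.block([[G, -G], [-G, G]])              # Hessian from Step 2
f = lam * np.ones(2 * D) - np.concatenate([Phi @ y, -(Phi @ y)])  # f from Step 3

# Evaluate both objectives at an arbitrary nonnegative (theta+, theta-).
tp, tm = rng.random(D), rng.random(D)
x = np.concatenate([tp, tm])

J_orig = 0.5 * np.sum((y - Phi.T @ (tp - tm)) ** 2) + lam * np.sum(tp + tm)
J_qp = 0.5 * x @ H @ x + f @ x + 0.5 * y @ y  # constant term restored

assert np.isclose(J_orig, J_qp)               # the two forms agree
assert np.allclose(H, H.T)                    # H is symmetric, as a Hessian must be
```

Since $\mathbf{H} = \begin{bmatrix} \mathbf{I} \\ -\mathbf{I} \end{bmatrix} \Phi\Phi^T \begin{bmatrix} \mathbf{I} & -\mathbf{I} \end{bmatrix}$ is a Gram-type product, it is also positive semidefinite, so the QP is convex.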
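To see the reformulation actually solved, here is a sketch using SciPy's bound-constrained `L-BFGS-B` minimizer as a stand-in for a dedicated QP solver (the book's choice of solver is not specified here; `theta_true`, the sizes, and `lam` are made-up data). At the optimum, at most one of $\theta_i^+, \theta_i^-$ is nonzero, so the QP objective plus $\frac{1}{2} y^T y$ matches the original lasso objective at the recovered $\theta = \theta^+ - \theta^-$:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up data; theta_true is sparse so the L1 penalty is doing real work.
rng = np.random.default_rng(1)
D, N, lam = 4, 30, 1.0
Phi = rng.standard_normal((D, N))
theta_true = np.array([2.0, 0.0, -1.5, 0.0])
y = Phi.T @ theta_true + 0.1 * rng.standard_normal(N)

G = Phi @ Phi.T
H = np.block([[G, -G], [-G, G]])
f = lam * np.ones(2 * D) - np.concatenate([Phi @ y, -(Phi @ y)])

# Minimize (1/2) x^T H x + f^T x subject to x >= 0 via simple bounds.
res = minimize(lambda x: 0.5 * x @ H @ x + f @ x,
               x0=np.zeros(2 * D),
               jac=lambda x: H @ x + f,
               bounds=[(0, None)] * (2 * D),
               method="L-BFGS-B")

theta = res.x[:D] - res.x[D:]                 # recover theta = theta+ - theta-

# The QP value (constant restored) equals the lasso objective at theta.
J_qp = 0.5 * res.x @ H @ res.x + f @ res.x + 0.5 * y @ y
J_lasso = 0.5 * np.sum((y - Phi.T @ theta) ** 2) + lam * np.sum(np.abs(theta))
assert res.success and np.all(res.x >= -1e-12)
assert np.isclose(J_qp, J_lasso, atol=1e-6)
```

Any QP solver that handles nonnegativity constraints (e.g. an active-set or interior-point method) works equally well here; `L-BFGS-B` is used only because it ships with SciPy and handles box constraints directly.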