-
Analyzing the Constraints:
Let θi=θi+−θi− with the constraints θi+≥0 and θi−≥0. We want to show that minimizing λ∣θi+−θi−∣ is equivalent to minimizing λ(θi++θi−) under the optimization objective.
-
Absolute Value Representation:
Note that by definition:
∣θi∣=∣θ+_i−θ−_i∣
We know that for any positive numbers, ∣θi+−θi−∣≤θi++θi−, because the right-hand side is the sum of two positive numbers and the left hand side is their absolute difference. The equality holds if and only if θi+=0 or θi−=0 (or both).
-
Behavior at the Optimum:
Suppose at the optimum, there is an index i such that both θi+>0 and θi−>0. Let c=min(θi+,θi−)>0.
We can define a new pair of variables:
(θi+)new=θi+−c
(θ−_i)new=θ−_i−c
Notice two things:
- The value of θi remains unchanged:
(θi+)new−(θi−)new=(θ+_i−c)−(θ−_i−c)=θ+_i−θ−_i=θi
Hence, the squared loss term 21∥y−ΦT(θ+−θ−)∥2 remains exactly the same.
- However, the regularization sum in (3.63) drops strictly:
(θi+)new+(θi−)new=θ+_i−c+θ−_i−c=(θ+_i+θ−_i)−2c<(θ+_i+θ−_i)
Because our objective is to minimize the cost function, any solution where both θi+>0 and θi−>0 simultaneously cannot be the true minimum. The optimizer will always prefer to subtract c from both until at least one of them hits zero to strictly lower the objective value.
-
Conclusion:
Therefore, at the optimal solution, for every i, either θi+=0 or θi−=0.
When at least one is zero, the absolute value becomes:
∣θ+_i−θ−_i∣=θ+_i+θ−_i
Because of this property at the optimum, we can safely replace the non-differentiable term ∣θi+−θi−∣ in (3.62) with the simple linear summation (θi++θi−) in (3.63). This converts the problem into a much easier form without changing the optimal answer.