Question
Problem 4.12 Lagrange multipliers and equality constraints
In the M-step of the EM algorithm for mixture models, we need to calculate an estimate of the component prior probabilities via the optimization problem,
(4.47)
for some . Note that this optimization problem has an equality constraint, which is must sum to 1, since they represent a probability distribution.
One method of solving an optimization problem with equality constraints is to use Lagrange multipliers. Consider the following problem,
(4.48)
where is the objective function and is the constraint function. Let's look at two properties of these functions,
- First, the gradient is orthogonal to the constraint surface, since should be constant along the direction of the constraint surface (otherwise it would not be 0).
- Second, at the optimum, the gradient must also be orthogonal to the constraint surface. Otherwise, we could move along the constraint surface to increase .
This is illustrated in the following figure:
_(Figure description: A contour plot showing concentric circles representing the level curves of an objective function . A straight red line represents the constraint function . The optimal point is where the level curve of represents the highest reachable value while being tangent to the constraint line . An arrow pointing towards the center represents the gradient , which is orthogonal to the constraint line at the optimal point.)_
Hence, and must be parallel or anti-parallel, and by extension,
(4.49)
for some . Define the Lagrangian function,
(4.50)
The optimality condition in (4.49) is obtained by setting . Furthermore, setting yields the equality constraint, . Hence, to solve the constrained optimization problem (4.48), we form the Lagrangian function, and find the stationary point w.r.t. both and , by simultaneously solving
(4.51)
(a) Use Lagrange multipliers to optimize (4.47), and show that the solution is
(4.52)