Skip to main content

Answer

Prerequisites

  • Calculus: Partial differentiation of logarithmic sums.
  • Optimization: The Method of Lagrange Multipliers for constrained optimization problems.
  • Probability Theory: The constraint that categorical probabilities must sum to 1.

Step-by-Step Derivation

Step 1: Define the objective and constraint functions We want to maximize the objective function: f(π)=_j=1KNjlogπjf(\pi) = \sum\_{j=1}^K N_j \log \pi_j

subject to the equality constraint: j=1Kπj=1    g(π)=j=1Kπj1=0\sum*{j=1}^K \pi_j = 1 \implies g(\pi) = \sum*{j=1}^K \pi_j - 1 = 0

(Note: While there is also an inequality constraint πj0\pi_j \ge 0, we can proceed by ignoring it temporarily to find a stationary point, and then verify that our solution satisfies it since Nj0N_j \ge 0.)

Step 2: Form the Lagrangian Using the formulation L(π,λ)=f(π)λg(π)L(\pi, \lambda) = f(\pi) - \lambda g(\pi) (or using addition f(π)+λg(π)f(\pi) + \lambda' g(\pi), the resulting scalar multiplier just changes sign), we construct the Lagrangian function. Let's use subtraction to match common conventions that yield a positive scale, though either is computationally identical:

L(π,λ)=j=1KNjlogπjλ(j=1Kπj1)L(\pi, \lambda) = \sum*{j=1}^K N_j \log \pi_j - \lambda \left(\sum*{j=1}^K \pi_j - 1\right)

Step 3: Find the stationary point with respect to πj\pi_j We take the partial derivative of the Lagrangian with respect to a specific component πj\pi_j and set it to zero:

Lπj=πj(Njlogπjλπj)=Njπjλ=0\frac{\partial L}{\partial \pi_j} = \frac{\partial}{\partial \pi_j} \left( N_j \log \pi_j - \lambda \pi_j \right) = \frac{N_j}{\pi_j} - \lambda = 0

Step 4: Express πj\pi_j in terms of λ\lambda Rearranging the equation to solve for πj\pi_j:

Njπj=λ    πj=Njλ\frac{N_j}{\pi_j} = \lambda \implies \pi_j = \frac{N_j}{\lambda}

Step 5: Enforce the equality constraint to solve for λ\lambda We substitute our expression for πj\pi_j back into the original constraint j=1Kπj=1\sum_{j=1}^K \pi_j = 1:

_j=1KNjλ=1\sum\_{j=1}^K \frac{N_j}{\lambda} = 1

Since λ\lambda is a constant multiplier that does not depend on jj, we can pull it out of the summation:

1λj=1KNj=1    λ=j=1KNj\frac{1}{\lambda} \sum*{j=1}^K N_j = 1 \implies \lambda = \sum*{j=1}^K N_j

Step 6: Substitute λ\lambda back to find the final solution Plugging λ=k=1KNk\lambda = \sum_{k=1}^K N_k (using index kk to avoid confusion with the specific jj) back into our expression for πj\pi_j from Step 4:

πj=Njk=1KNk\pi*j = \frac{N_j}{\sum*{k=1}^K N_k}

Verification: Given that Nj0N_j \ge 0 for all jj, it is clear that πj0\pi_j \ge 0, thus satisfying the bounds constraint. The solution is complete.