Intuition behind the EM Derivation
The Expectation-Maximization algorithm is a powerful tool used when we have "hidden" or "missing" data. In the context of a Mixture of Poissons, the "missing data" is the label for each bomb square: Which component (target type) did this square belong to?
If we knew for sure that square $i$ was a target ($z_i = 1$) and square $j$ was not ($z_j = 0$), we could just separate the data into two piles and compute the average rate ($\lambda_k$) for each pile separately. This is the standard Maximum Likelihood Estimate (MLE).
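To make the complete-data case concrete, here is a minimal Python sketch (with made-up counts and labels, not data from the source) of the per-pile MLE: each group's rate estimate is just the average count within that group.

```python
def mle_known_labels(counts, labels):
    """Complete-data Poisson MLE: for each label k,
    lambda_k = (total bombs in group k) / (number of squares in group k)."""
    groups = sorted(set(labels))
    return {
        k: sum(x for x, z in zip(counts, labels) if z == k)
           / sum(1 for z in labels if z == k)
        for k in groups
    }

# Hypothetical data: squares with counts [5, 4, 1, 0] and known labels.
rates = mle_known_labels([5, 4, 1, 0], [1, 1, 0, 0])
# Group 1 averages (5 + 4) / 2, group 0 averages (1 + 0) / 2.
```

With the labels known, there is nothing iterative here at all; EM only becomes necessary once the labels are hidden.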
However, we don't know the labels. We only see the count of bombs.
The E-step: Soft Assignment
Since we don't know the labels, we guess probabilities. This is the E-step. We ask: given our current best guess of the rates ($\lambda_1, \lambda_2$), how likely is it that a square with $x$ bombs belongs to group 1 vs. group 2? If a square has 5 bombs, and $\lambda_1$ is much larger than $\lambda_2$, it's super likely to be from group 1. We assign it, say, 99.9% responsibility to group 1. If it has 2 bombs, maybe it's 60% group 1, 40% group 2.
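The soft assignment above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical rates and mixing weights (the source does not give numeric values): the responsibility of component $k$ for a square with count $x$ is its weighted Poisson likelihood, normalized over the components.

```python
import math

def e_step(counts, lams, weights):
    """Responsibilities gamma[i][k]: the probability that square i
    belongs to component k, given the current rates and weights."""
    gammas = []
    for x in counts:
        # Unnormalized posterior for each component: weight_k * Poisson(x; lam_k)
        probs = [w * lam**x * math.exp(-lam) / math.factorial(x)
                 for lam, w in zip(lams, weights)]
        total = sum(probs)
        gammas.append([p / total for p in probs])  # normalize to sum to 1
    return gammas

# Hypothetical rates: group 1 averages 5 bombs, group 2 averages 1.
g = e_step([5, 2, 0], lams=[5.0, 1.0], weights=[0.5, 0.5])
# The square with 5 bombs gets nearly all of its responsibility from group 1;
# the square with 0 bombs leans toward group 2.
```

Note that nothing is decided irrevocably here: every square keeps a fractional membership in both groups, which is exactly what the M-step consumes.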
The M-step: Weighted MLE
Now we act as if these probabilities are the truth. This is the M-step. To update our estimate of $\lambda_1$, we look at all the squares, but we listen more to the ones that we think belong to group 1.
- The update formula $\lambda_k = \frac{\sum_i \gamma_{ik}\, x_i}{\sum_i \gamma_{ik}}$ (where $\gamma_{ik}$ is the responsibility of group $k$ for square $i$) is exactly this "weighted average".
- The numerator $\sum_i \gamma_{ik}\, x_i$ sums up the bomb counts, but each multiplied by how confident we are that the square belongs to group $k$.
- The denominator $\sum_i \gamma_{ik}$ is the "effective total count" of squares in group $k$.
So, the M-step is mathematically identical to the standard Poisson MLE (Total Bombs / Total Squares), but generalized to fractional (probabilistic) memberships.
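The weighted update can be sketched as follows (responsibilities here are whatever the E-step produced; the variable names are illustrative, not from the source). When every responsibility is exactly 0 or 1, the formula collapses back to the hard-label per-pile average.

```python
def m_step(counts, gammas):
    """Weighted Poisson MLE: for each component k,
    lambda_k = (responsibility-weighted bomb total) / (effective square count)."""
    n = len(counts)
    k = len(gammas[0])
    new_lams, new_weights = [], []
    for j in range(k):
        nk = sum(g[j] for g in gammas)                       # effective number of squares
        sk = sum(g[j] * x for g, x in zip(gammas, counts))   # weighted bomb total
        new_lams.append(sk / nk)
        new_weights.append(nk / n)                           # updated mixing weight
    return new_lams, new_weights

# Sanity check: with hard 0/1 responsibilities this is just the
# per-pile average — (5 + 4) / 2 for group 1 and (1 + 0) / 2 for group 2.
lams, weights = m_step([5, 4, 1, 0], [[1, 0], [1, 0], [0, 1], [0, 1]])
```

Alternating this M-step with the E-step above, until the rates stop changing, is the whole EM loop for this model.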