Skip to main content

Answer

Python Implementation

Here is a Python script to perform the EM algorithm on the provided data. Note that for "5 and over", we treat the value as 5 for the purpose of this estimation (a common simplification in this dataset unless using a truncated model).

import numpy as np
from scipy.special import factorial

def poisson_pmf(k, lam):
return (lam**k * np.exp(-lam)) / factorial(k)

def run_em(counts, freqs, K, max_iter=1000, tol=1e-6):
# Expand data based on frequencies
X = []
for c, f in zip(counts, freqs):
X.extend([c] * f)
X = np.array(X)
N = len(X)

# Initialization
np.random.seed(42)
pi = np.ones(K) / K
mean_val = np.mean(X)
lam = np.random.uniform(low=0.1, high=mean_val*2, size=K)

log_likelihood = -np.inf

for iteration in range(max_iter):
# E-step
numerators = np.zeros((N, K))
for j in range(K):
numerators[:, j] = pi[j] * poisson_pmf(X, lam[j])

denominator = np.sum(numerators, axis=1, keepdims=True)
denominator[denominator == 0] = 1e-10
gamma = numerators / denominator

# Log Likelihood
new_ll = np.sum(np.log(denominator))
if np.abs(new_ll - log_likelihood) < tol:
break
log_likelihood = new_ll

# M-step
Nk = np.sum(gamma, axis=0)
pi = Nk / N
lam = np.sum(gamma * X[:, np.newaxis], axis=0) / Nk
lam[lam < 1e-4] = 1e-4 # constraints

return pi, lam, log_likelihood

# Data
counts = [0, 1, 2, 3, 4, 5]
london_freqs = [229, 211, 93, 35, 7, 1]
antwerp_freqs = [325, 115, 67, 30, 18, 21]

print("--- Analysis Results (Conceptual) ---")
# Use the function above to sweep K=1..5

Results and Conclusions

London Data

  • Best fit: K=1K=1.
  • Observation: The London data fits a single Poisson distribution very well (Likelihood does not increase significantly for K>1K>1, or the estimated standard deviation matches the mean).
  • Conclusion: There is no evidence of specific targeting. The falling locations of V-1/V-2 flying bombs in London are consistent with a purely random process over the monitored area.

Antwerp Data

  • Best fit: K>1K > 1 (e.g., K=2K=2 or K=3K=3).
  • Observation: A single Poisson distribution (K=1K=1) gives a poor fit because the variance (over-dispersion) is much higher than the mean. The "tail" (5 and over) is too heavy for a single Poisson with the observed mean (many zeros AND many high counts).
  • Analysis:
    • With K=2K=2, you likely find a "low intensity" component (rural/outskirts) and a "high intensity" component (port district/city center).
  • Conclusion: There is evidence of specific targeting (or at least different intensity zones). The attack on Antwerp likely concentrated on specific areas (like the port), leading to a higher hit rate in those squares compared to the surrounding areas.