
Problem 3.8(e) Answer

1. Prior Identification as Beta Distributions

First, let's map these priors to the standard Beta distribution $\text{Beta}(\alpha, \beta) \propto \pi^{\alpha-1}(1-\pi)^{\beta-1}$.

  • Prior $p_1(\pi) = 2\pi$:

    • Proportional to $\pi^1 (1-\pi)^0$.
    • $\alpha - 1 = 1 \implies \alpha = 2$
    • $\beta - 1 = 0 \implies \beta = 1$
    • This is $\text{Beta}(2, 1)$.
  • Prior $p_0(\pi) = 2(1-\pi)$:

    • Proportional to $\pi^0 (1-\pi)^1$.
    • $\alpha - 1 = 0 \implies \alpha = 1$
    • $\beta - 1 = 1 \implies \beta = 2$
    • This is $\text{Beta}(1, 2)$.
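As a sanity check on these two identifications, here is a minimal sketch (assuming NumPy and SciPy are available) comparing the given prior densities with the corresponding Beta pdfs on a grid:

```python
import numpy as np
from scipy.stats import beta

# Grid of pi values strictly inside (0, 1)
pi = np.linspace(0.01, 0.99, 99)

# p1(pi) = 2*pi should equal the Beta(2, 1) density,
# and p0(pi) = 2*(1 - pi) should equal the Beta(1, 2) density.
assert np.allclose(2 * pi, beta.pdf(pi, a=2, b=1))
assert np.allclose(2 * (1 - pi), beta.pdf(pi, a=1, b=2))
print("Both priors match their Beta densities.")
```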

2. Calculate MAP Estimates

MAP maximizes the posterior, where $\text{Posterior} \propto \text{Likelihood} \times \text{Prior}$. The likelihood is $\pi^s (1-\pi)^{n-s}$, where $s$ is the number of successes in $n$ trials.

Case 1: Prior $p_1$ ($\alpha = 2, \beta = 1$)

  • Posterior $\propto \pi^s (1-\pi)^{n-s} \cdot \pi^1 = \pi^{s+1} (1-\pi)^{n-s}$.
  • This is proportional to $\text{Beta}(s+2, n-s+1)$.
  • We maximize $f(\pi) = \pi^{s+1} (1-\pi)^{n-s}$.
  • Log-posterior: $(s+1)\ln\pi + (n-s)\ln(1-\pi)$.
  • Derivative = 0: $\frac{s+1}{\pi} - \frac{n-s}{1-\pi} = 0$.
  • $(s+1)(1-\pi) = (n-s)\pi$.
  • $s + 1 - s\pi - \pi = n\pi - s\pi$.
  • $s + 1 = (n+1)\pi$.
  • $\hat{\pi}_{MAP,1} = \frac{s+1}{n+1}$.
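As a quick numerical check of this maximizer, the sketch below uses hypothetical counts $s = 3$, $n = 10$ and maximizes the log-posterior on a fine grid:

```python
import numpy as np

s, n = 3, 10  # hypothetical counts: 3 successes in 10 trials
pi = np.linspace(1e-6, 1 - 1e-6, 100_001)

# Log-posterior under p1 (up to an additive constant): (s+1) ln(pi) + (n-s) ln(1-pi)
log_post = (s + 1) * np.log(pi) + (n - s) * np.log(1 - pi)

print(pi[np.argmax(log_post)])  # ~0.3636, the grid maximizer
print((s + 1) / (n + 1))        # 4/11 = 0.3636..., the closed form
```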

Case 2: Prior $p_0$ ($\alpha = 1, \beta = 2$)

  • Posterior $\propto \pi^s (1-\pi)^{n-s} \cdot (1-\pi)^1 = \pi^s (1-\pi)^{n-s+1}$.
  • This is proportional to $\text{Beta}(s+1, n-s+2)$.
  • We maximize $f(\pi) = \pi^s (1-\pi)^{n-s+1}$.
  • Derivative = 0: $\frac{s}{\pi} - \frac{n-s+1}{1-\pi} = 0$.
  • $s(1-\pi) = (n-s+1)\pi$.
  • $s - s\pi = n\pi - s\pi + \pi$.
  • $s = (n+1)\pi$.
  • $\hat{\pi}_{MAP,0} = \frac{s}{n+1}$.
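The same check for the second prior, reusing the hypothetical $s = 3$, $n = 10$, and also applying the general Beta mode formula $\frac{\alpha - 1}{\alpha + \beta - 2}$ to the posterior $\text{Beta}(s+1, n-s+2)$:

```python
import numpy as np

s, n = 3, 10  # same hypothetical counts as above

# Mode of the posterior Beta(s+1, n-s+2) via (alpha-1)/(alpha+beta-2)
a_post, b_post = s + 1, n - s + 2
print((a_post - 1) / (a_post + b_post - 2))  # 3/11 = 0.2727...

# Direct grid maximization of the log-posterior s*ln(pi) + (n-s+1)*ln(1-pi)
pi = np.linspace(1e-6, 1 - 1e-6, 100_001)
log_post = s * np.log(pi) + (n - s + 1) * np.log(1 - pi)
print(pi[np.argmax(log_post)])               # ~0.2727
print(s / (n + 1))                           # matches the closed form
```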

3. Effective Estimates (Posterior Mean)

The question also asks "What is the effective estimate...?". This can refer either to the MAP estimate or to the posterior mean. The second part of the question ("virtual samples") applies to both, but standard Bayesian prediction uses the posterior mean, so let's compute it for completeness.

  • For $p_1$ (Posterior $\text{Beta}(s+2, n-s+1)$):

    • Mean = $\frac{\alpha_{\text{post}}}{\alpha_{\text{post}} + \beta_{\text{post}}} = \frac{s+2}{(s+2) + (n-s+1)} = \frac{s+2}{n+3}$.
  • For $p_0$ (Posterior $\text{Beta}(s+1, n-s+2)$):

    • Mean = $\frac{s+1}{(s+1) + (n-s+2)} = \frac{s+1}{n+3}$.
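Both posterior means can be confirmed with scipy.stats.beta, again using the hypothetical $s = 3$, $n = 10$:

```python
from scipy.stats import beta

s, n = 3, 10  # hypothetical counts

mean_1 = beta.mean(a=s + 2, b=n - s + 1)  # posterior mean under p1
mean_0 = beta.mean(a=s + 1, b=n - s + 2)  # posterior mean under p0

print(mean_1, (s + 2) / (n + 3))  # both 5/13 = 0.3846...
print(mean_0, (s + 1) / (n + 3))  # both 4/13 = 0.3076...
```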

(Note: the intuitive explanation below is framed in terms of the MAP estimates, since the question specifically asks to "Calculate the MAP estimates.")

4. Intuitive Explanation ("Virtual" Samples)

We interpret the hyperparameters $\alpha, \beta$ of the prior as virtual counts added to the actual data. Since $\text{Prior} \propto \pi^{\alpha-1}(1-\pi)^{\beta-1}$, the MAP estimate for a Beta posterior is $\frac{s + (\alpha-1)}{n + (\alpha+\beta-2)}$.
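As an illustrative sketch (hypothetical helper, with the same $s = 3$, $n = 10$ as above), this general formula can be wrapped in a small function and evaluated for both priors:

```python
def beta_map(s, n, alpha, beta):
    """MAP of a Bernoulli parameter given s successes in n trials and a Beta(alpha, beta) prior."""
    return (s + alpha - 1) / (n + alpha + beta - 2)

s, n = 3, 10  # hypothetical counts
print(beta_map(s, n, alpha=2, beta=1))  # (s+1)/(n+1) = 4/11
print(beta_map(s, n, alpha=1, beta=2))  # s/(n+1)     = 3/11
```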

For $p_1$ ($\alpha = 2, \beta = 1$):

  • Virtual Samples: We added 1 sample, which was a success.
  • Total virtual $N' = 1$. Total virtual $S' = 1$.
  • MAP Estimate: $\frac{s+1}{n+1}$.
  • Interpretation: The prior $2\pi$ acts like we've already observed one Head. This biases the result towards 1.

For $p_0$ ($\alpha = 1, \beta = 2$):

  • Virtual Samples: We added 1 sample, which was a failure.
  • Total virtual $N' = 1$. Total virtual $S' = 0$.
  • MAP Estimate: $\frac{s}{n+1}$.
  • Interpretation: The prior $2(1-\pi)$ acts like we've already observed one Tail. This biases the result towards 0.
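To make the virtual-sample reading concrete, a minimal sketch (reusing the hypothetical $s = 3$, $n = 10$) compares the plain MLE with the MLE computed after appending one virtual Head or one virtual Tail:

```python
s, n = 3, 10  # hypothetical counts: 3 heads in 10 flips

mle = s / n                         # MLE on the raw data: 0.3

# Prior p1 = Beta(2, 1): pretend one extra Head was observed
mle_plus_head = (s + 1) / (n + 1)   # 4/11, equals the MAP under p1

# Prior p0 = Beta(1, 2): pretend one extra Tail was observed
mle_plus_tail = s / (n + 1)         # 3/11, equals the MAP under p0

print(mle, mle_plus_head, mle_plus_tail)
```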