Explain
Intuition
The conclusion from part (b) — that the model's variance is exactly the original data's variance plus $h^2$ — reveals a profound reality about kernel methods.
It tells us that Kernel Density Estimation (KDE) strictly inflates the variance of your modeled distribution relative to your raw data. Changing the kernel bandwidth $h$ is equivalent to adjusting the "blur" slider in an image editor.
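This inflation can be checked numerically. The sketch below (variable names and the specific bandwidth are illustrative, not from the original problem) exploits the fact that drawing a sample from a Gaussian KDE is equivalent to picking a data point uniformly at random and adding Gaussian kernel noise, so the model's variance comes out as the data's variance plus $h^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1_000)
h = 0.8  # kernel bandwidth = standard deviation of each Gaussian kernel

# Sampling from a Gaussian KDE: pick a training point uniformly at random,
# then add Gaussian kernel noise with standard deviation h.
idx = rng.integers(0, data.size, size=200_000)
kde_samples = data[idx] + rng.normal(0.0, h, size=idx.size)

data_var = data.var()        # variance of the raw data
kde_var = kde_samples.var()  # variance of the modeled (KDE) distribution
print(f"data var: {data_var:.3f}")
print(f"KDE  var: {kde_var:.3f}  vs  data var + h^2 = {data_var + h**2:.3f}")
```

The gap between the two printed variances is always close to $h^2$, regardless of the data: the blur is added on top of whatever spread the data already had.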
The Bias connection
When designing a statistical estimator, we are constantly fighting a war on two fronts: Variance (instability due to dataset randomness) and Bias (systematic inaccuracy).
- Turn the "Blur" up (Large $h$): We get a smooth, steady, aesthetically pleasing curve. It has low variance (the curve won't drastically change if we sample slightly different data points). However, because we forcibly widened the distribution by adding $h^2$, the curve no longer tightly traces the peaks and dips of the real phenomenon. Because we systematically force the model to be excessively flat, we introduce a massive Bias (an error between our expected smooth curve and the spiky reality).
- Turn the "Blur" down (Small $h$): We faithfully place probability mass exactly where the data is, eliminating the spread. Our systematic Bias drops to near zero. But the resulting curve looks like a chaotic skyline of sharp needles. It has simply memorized the dataset instead of generalizing from it (meaning it has exceptionally high variance).
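The two regimes above can be made concrete by scoring a Gaussian KDE on its own training data versus held-out data from the same distribution (a hedged sketch: the bandwidths, sample sizes, and function name are arbitrary choices for illustration). A tiny $h$ scores the training points far better than fresh points, because each training point sits on top of its own needle; a moderate $h$ scores both about equally:

```python
import numpy as np

def kde_pdf(x, data, h):
    """Gaussian KDE: average of N(data_i, h^2) densities evaluated at x."""
    z = (x[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (data.size * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
train = rng.normal(size=200)     # data the KDE is built from
held_out = rng.normal(size=200)  # fresh draws from the same distribution

gaps = {}
for h in (0.05, 0.3):
    # small epsilon guards log(0) for held-out points far from every kernel
    train_ll = np.log(kde_pdf(train, train, h) + 1e-300).mean()
    test_ll = np.log(kde_pdf(held_out, train, h) + 1e-300).mean()
    gaps[h] = train_ll - test_ll  # memorization gap
    print(f"h={h}: train log-lik {train_ll:.3f}, held-out log-lik {test_ll:.3f}")
```

The train-versus-held-out gap shrinks as $h$ grows: exactly the trade of sharp local detail for global stability described above.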
The extra $h^2 I$ term in our covariance tells us: "This is the penalty you pay for a smooth curve." We trade the accuracy of capturing perfectly sharp local details (Bias) for the global stability of a continuous probability density curve.