Intuitive Explanation
The result tells us something fundamental about smoothing:
Smoothing always adds variance.
Think of the data points as sharp spikes (zero width, zero variance around the point). The sample covariance measures how far these spikes are from the center. When we replace each spike with a kernel of width $h$, we are essentially adding random noise to each data point. It's like adding a random variable $\epsilon_i$, drawn from the kernel independently of the data, to each sample $x_i$.
The variance of the sum of independent variables is the sum of their variances: $\operatorname{Var}(x_i + \epsilon_i) = \operatorname{Var}(x_i) + \operatorname{Var}(\epsilon_i)$.
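The additivity above can be checked numerically. The following is a minimal sketch, assuming one-dimensional data, a Gaussian kernel, and a bandwidth `h` equal to the kernel's standard deviation; the function name `kde_sample_variance` and all parameter values are hypothetical choices for illustration. Drawing from the smoothed distribution amounts to picking a data point at random and adding kernel noise to it.

```python
import numpy as np

def kde_sample_variance(data, h, n_draws=200_000, seed=0):
    """Monte Carlo variance of draws from a Gaussian-kernel smoothed distribution.

    Each draw picks one data point uniformly at random (a "spike")
    and adds independent Gaussian noise with standard deviation h.
    """
    rng = np.random.default_rng(seed)
    centers = rng.choice(data, size=n_draws)   # resample the spikes
    noise = h * rng.standard_normal(n_draws)   # kernel noise (assumed Gaussian)
    return np.var(centers + noise)

# Illustrative data: 500 points from a standard normal (an assumption).
data = np.random.default_rng(42).normal(loc=0.0, scale=1.0, size=500)
h = 0.5

spike_var = np.var(data)                  # variance of the raw spikes
smoothed_var = kde_sample_variance(data, h)
predicted_var = spike_var + h**2          # sum of the two variances

print(f"variance of the spikes:       {spike_var:.4f}")
print(f"variance after smoothing:     {smoothed_var:.4f}")
print(f"spike variance + h^2:         {predicted_var:.4f}")
```

The second and third printed values should agree up to Monte Carlo error, and both should exceed the first by roughly `h**2`.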
Relation to Bias: This inflation of variance is a form of systematic error (bias).
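- True Variance: $\sigma^2$
- Estimated Variance: approximately $\sigma^2 + \sigma_K^2$, where $\sigma_K^2$ is the variance of the kernel at the chosen bandwidth $h$ (it grows as $h^2$)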
The estimator systematically overestimates the spread of the distribution. This is the price we pay for making the distribution smooth. If we want a very smooth curve (large bandwidth $h$), we must accept that our estimated distribution will be much wider than the true distribution, so its variance is biased upward. This illustrates the bias-variance tradeoff in KDE: a larger bandwidth reduces the variance of the density estimator (the curve changes little from one sample to the next) but increases its bias (the curve is oversmoothed and too wide).
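The tradeoff can be illustrated with a small simulation. This is a rough sketch under stated assumptions: the true distribution is a standard normal, the kernel is Gaussian, and the density is evaluated at the single point `x0 = 0`; the helper names `gaussian_kde_at` and `bias_and_variance`, the sample size, and the bandwidth values are all hypothetical. For each bandwidth, many independent samples are drawn, and the bias and variance of the resulting density estimates are measured.

```python
import numpy as np

def gaussian_kde_at(x0, data, h):
    """Gaussian kernel density estimate evaluated at a single point x0."""
    z = (x0 - data) / h
    return np.mean(np.exp(-0.5 * z**2)) / (h * np.sqrt(2.0 * np.pi))

def bias_and_variance(h, n=200, n_repeats=2000, x0=0.0, seed=0):
    """Estimate bias and variance of the KDE at x0 over repeated samples."""
    rng = np.random.default_rng(seed)
    # True density of N(0, 1) at x0 (the assumed ground truth).
    true_density = np.exp(-0.5 * x0**2) / np.sqrt(2.0 * np.pi)
    estimates = np.array([
        gaussian_kde_at(x0, rng.standard_normal(n), h)
        for _ in range(n_repeats)
    ])
    return estimates.mean() - true_density, estimates.var()

results = {}
for h in (0.1, 0.5, 1.5):
    results[h] = bias_and_variance(h)
    bias, var = results[h]
    print(f"h = {h:<4}  bias = {bias:+.5f}  variance = {var:.2e}")
```

With these settings, the small bandwidth should give an estimate that is nearly unbiased but noisy across samples, while the large bandwidth should give a stable estimate that sits well below the true peak because the curve has been flattened and widened.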