Explain

Intuition

Think of Kernel Density Estimation (KDE) as placing a small pile of sand (representing a unit of probability mass) directly on top of every data point we observe. The shape and width of this sand pile are controlled by our chosen kernel function $\tilde{k}(x)$ .

Because our kernel function is perfectly balanced (it has a zero mean, meaning it doesn't lean left or right), the "center of mass" for each individual sand pile rests precisely on the data point $x_i$ where we placed it.

When we calculate the overall mean (or expected value) of our entire estimated distribution $\hat{p}(x)$ , we are essentially trying to find the joint "center of mass" of all these combined sand piles.

Here is the key insight:

Because every single sand pile has exactly the same weight (each contributes $1/n$ of the total probability).
And because each sand pile is perfectly centered on its respective data point ( $x_i$ ).

The overall center of mass perfectly aligns with the simple arithmetic average of all the original data points (which is the sample mean). No matter how wide or narrow you make the individual sand piles (i.e. changing the kernel bandwidth), as long as they remain symmetrically centered, they will not shift the overall average.

Diagram: Kernel Sand Piles

graph TD
    subgraph Data Points
        x1(("x₁"))
        x2(("x₂"))
        x3(("x₃"))
    end

    subgraph Kernel Contributions
        k1["Centered on x₁\nMean: x₁"]
        k2["Centered on x₂\nMean: x₂"]
        k3["Centered on x₃\nMean: x₃"]
    end

    x1 -->|Place Kernel| k1
    x2 -->|Place Kernel| k2
    x3 -->|Place Kernel| k3

    subgraph Overall Distribution p̂(x)
        Average["Overall Mean = (x₁ + x₂ + x₃) / 3"]
    end

    k1 --> Average
    k2 --> Average
    k3 --> Average

Intuition​

Diagram: Kernel Sand Piles​

Intuition

Diagram: Kernel Sand Piles