Question

Problem 6.2 BDR for regression

In this problem, we will consider the Bayes decision rule for regression. Suppose we have a regression problem, where $y \in \mathbb{R}$ is the output, $x \in \mathbb{R}^d$ is the input, and we have already learned the distribution $p(y|x)$, which maps the input $x$ to a distribution over outputs $y$. The goal is to select the optimal output $y$ for a given $x$.

(a) Consider the squared-loss function, $L(g(x), y) = (g(x) - y)^2$. Show that the BDR is to decide the conditional mean of $p(y|x)$, or $g^*(x) = \mathbb{E}[y|x]$. In other words, show that $g^*(x)$ minimizes the conditional risk $R(x) = \int L(g(x), y)\, p(y|x)\, dy$.
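As a numerical sanity check of the claim (not part of the problem statement), one can fix $x$, draw samples from a stand-in conditional distribution $p(y|x)$, and estimate the risk $\mathbb{E}[(g - y)^2]$ over a grid of candidate decisions $g$; the risk-minimizing $g$ should land at the sample mean. The Gaussian choice for $p(y|x)$ below is an arbitrary illustration:

```python
import numpy as np

# Fix x implicitly: draw Monte Carlo samples y ~ p(y|x).
# A Gaussian is used here purely as an example conditional distribution.
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=100_000)

# Estimate the conditional risk R(x) = E[(g - y)^2] on a grid of decisions g.
candidates = np.linspace(0.0, 4.0, 401)
risks = np.array([np.mean((g - y) ** 2) for g in candidates])

# The minimizer of the estimated risk should coincide with the sample mean,
# consistent with g*(x) = E[y|x].
best = candidates[np.argmin(risks)]
print(best, y.mean())
```

This only illustrates the result for one distribution; the problem asks for a proof valid for any $p(y|x)$, which follows by expanding the quadratic inside the integral and setting the derivative with respect to $g(x)$ to zero.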