
Problem 2.8 Least-squares regression and MLE

In this problem we will consider linear regression and the connections between maximum likelihood and least-squares solutions. Consider the polynomial function of $x \in \mathbb{R}$,

$$f(x, \theta) = \sum_{k=0}^K x^k \theta_k = \phi(x)^T \theta \quad \quad (2.7)$$

where we define the feature transformation $\phi(x)$ and the parameter vector $\theta$ (both of dimension $D = K + 1$) as

$$\phi(x) = \left[ 1, x, x^2, \cdots, x^K \right]^T \in \mathbb{R}^D, \quad \theta = \left[ \theta_0, \cdots, \theta_K \right]^T \in \mathbb{R}^D. \quad \quad (2.8)$$

Given an input $x$, instead of observing the actual function value $f(x, \theta)$, we observe a noisy version $y$,

$$y = f(x, \theta) + \epsilon \quad \quad (2.9)$$

where $\epsilon$ is a Gaussian random variable with zero mean and variance $\sigma^2$. Our goal is to obtain the best estimate of the function given i.i.d. samples $\mathcal{D} = \{(x_1, y_1), \dots, (x_n, y_n)\}$.
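To make the data model concrete, here is a minimal NumPy sketch of the feature transformation (2.8) and the noisy observation model (2.9). The specific values of `K`, `n`, `sigma`, and `theta_true` are hypothetical, chosen only for illustration.

```python
import numpy as np

def phi(x, K):
    # Feature transformation phi(x) = [1, x, x^2, ..., x^K]^T  (Eq. 2.8);
    # for an array of inputs this returns one row phi(x_i)^T per sample.
    return np.vander(np.atleast_1d(x), K + 1, increasing=True)

# Hypothetical example values (not from the problem statement)
rng = np.random.default_rng(0)
K, n, sigma = 3, 50, 0.1
theta_true = rng.standard_normal(K + 1)

x = rng.uniform(-1.0, 1.0, n)
# Noisy observations y = f(x, theta) + eps, eps ~ N(0, sigma^2)  (Eq. 2.9)
y = phi(x, K) @ theta_true + sigma * rng.standard_normal(n)
```

Stacking the rows $\phi(x_i)^T$ into a design matrix is the standard way to write the $n$ observations compactly as one matrix-vector product.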

(b) Formulate the problem as one of ML estimation, i.e. write down the likelihood function $p(y \mid x, \theta)$, and compute the ML estimate, i.e. the value of $\theta$ that maximizes $p(y_1, \cdots, y_n \mid x_1, \cdots, x_n, \theta)$. Show that this is equivalent to (a).

Hint: the vector derivatives listed in Problem 2.6 might be helpful.
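As a numerical sanity check for part (b): since the problem asks to show that ML estimation under Gaussian noise is equivalent to least squares, the estimate can be computed as the least-squares solution $\hat{\theta} = (\Phi^T \Phi)^{-1} \Phi^T y$, where $\Phi$ stacks the rows $\phi(x_i)^T$. The sketch below is an illustration of that closed form, not the derivation the problem asks for; the function name `ml_estimate` is ours.

```python
import numpy as np

def ml_estimate(x, y, K):
    # Design matrix Phi with rows phi(x_i)^T = [1, x_i, ..., x_i^K]
    Phi = np.vander(x, K + 1, increasing=True)
    # ML / least-squares solution theta = (Phi^T Phi)^{-1} Phi^T y,
    # computed via lstsq rather than an explicit inverse for stability
    theta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta_hat

# Sanity check on noiseless data (sigma = 0): the estimate should
# recover the generating parameters exactly.
theta_true = np.array([1.0, -2.0, 0.5])
x = np.linspace(-1.0, 1.0, 20)
y = np.vander(x, 3, increasing=True) @ theta_true
assert np.allclose(ml_estimate(x, y, 2), theta_true)
```

With noisy data the recovery is only approximate, and the spread of $\hat{\theta}$ around $\theta$ shrinks as $n$ grows, consistent with the Gaussian likelihood model.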