Question:

Observe that the learner \(g_{\mathscr{T}}\) can be written as a linear combination of the response variable: \(g_{\mathscr{T}}(\boldsymbol{x})=\boldsymbol{x}^{\top} \mathbf{X}^{+} \boldsymbol{Y}\). Prove that for any learner of the form \(\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\), where \(\mathbf{A} \in \mathbb{R}^{p \times n}\) is some matrix that satisfies \(\mathbb{E}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]=g^{*}(\boldsymbol{x})\), we have

\[ \operatorname{Var}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{X}^{+} \boldsymbol{Y}\right] \leqslant \operatorname{Var}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right], \]

with equality achieved for \(\mathbf{A}=\mathbf{X}^{+}\). This is called the Gauss-Markov inequality. Hence, using the Gauss-Markov inequality, deduce that for the unconditional variance:

\[ \operatorname{Var}\left[g_{\mathscr{T}}(\boldsymbol{x})\right] \leqslant \operatorname{Var}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]. \]

Deduce that \(\mathbf{A}=\mathbf{X}^{+}\) also minimizes the expected generalization risk.


Step by Step Answer:
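The book's official solution is not reproduced on this page; what follows is a sketch of the standard Gauss-Markov argument, under the usual linear-model assumptions \(\boldsymbol{Y}=\mathbf{X} \boldsymbol{\beta}+\boldsymbol{\varepsilon}\) with \(\mathbb{E}_{\mathbf{X}}[\boldsymbol{\varepsilon}]=\boldsymbol{0}\) and \(\operatorname{Var}_{\mathbf{X}}[\boldsymbol{\varepsilon}]=\sigma^{2} \mathbf{I}_{n}\) (these assumptions are implicit in the exercise, not stated in the excerpt above).

Step 1 (unbiasedness constraint). Write \(\mathbf{A}=\mathbf{X}^{+}+\mathbf{D}\). Since both learners are unbiased, \(\mathbb{E}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]=\boldsymbol{x}^{\top} \mathbf{A} \mathbf{X} \boldsymbol{\beta}\) and \(\mathbb{E}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{X}^{+} \boldsymbol{Y}\right]=\boldsymbol{x}^{\top} \mathbf{X}^{+} \mathbf{X} \boldsymbol{\beta}\) must agree for every \(\boldsymbol{\beta}\), which forces \(\boldsymbol{x}^{\top} \mathbf{D} \mathbf{X}=\boldsymbol{0}^{\top}\).

Step 2 (conditional variance). Because \(\operatorname{Var}_{\mathbf{X}}[\boldsymbol{Y}]=\sigma^{2} \mathbf{I}_{n}\),

\[ \operatorname{Var}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]=\sigma^{2}\left\|\mathbf{A}^{\top} \boldsymbol{x}\right\|^{2}=\sigma^{2}\left\|\left(\mathbf{X}^{+}\right)^{\top} \boldsymbol{x}\right\|^{2}+\sigma^{2}\left\|\mathbf{D}^{\top} \boldsymbol{x}\right\|^{2}, \]

where the cross term vanishes: from \(\mathbf{X}^{+}=\mathbf{X}^{+} \mathbf{X} \mathbf{X}^{+}\) and the symmetry of \(\mathbf{X} \mathbf{X}^{+}\) we get \(\left(\mathbf{X}^{+}\right)^{\top}=\mathbf{X} \mathbf{X}^{+}\left(\mathbf{X}^{+}\right)^{\top}\), so \(\boldsymbol{x}^{\top} \mathbf{D}\left(\mathbf{X}^{+}\right)^{\top} \boldsymbol{x}=\left(\boldsymbol{x}^{\top} \mathbf{D} \mathbf{X}\right) \mathbf{X}^{+}\left(\mathbf{X}^{+}\right)^{\top} \boldsymbol{x}=0\) by Step 1. Dropping the nonnegative term \(\sigma^{2}\left\|\mathbf{D}^{\top} \boldsymbol{x}\right\|^{2}\) yields the Gauss-Markov inequality, with equality when \(\mathbf{D}^{\top} \boldsymbol{x}=\boldsymbol{0}\), in particular for \(\mathbf{A}=\mathbf{X}^{+}\).

Step 3 (unconditional variance). By the law of total variance, \(\operatorname{Var}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]=\mathbb{E}\left[\operatorname{Var}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]\right]+\operatorname{Var}\left[\mathbb{E}_{\mathbf{X}}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]\right]\), and the second term equals \(\operatorname{Var}\left[g^{*}(\boldsymbol{x})\right]\) for every unbiased \(\mathbf{A}\), so it is common to all learners in this class. Taking expectations of the conditional inequality from Step 2 therefore gives \(\operatorname{Var}\left[g_{\mathscr{T}}(\boldsymbol{x})\right] \leqslant \operatorname{Var}\left[\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\right]\).

Step 4 (expected generalization risk). For the squared-error loss the expected generalization risk decomposes into the irreducible risk, a squared-bias term, and a variance term. All learners of the form \(\boldsymbol{x}^{\top} \mathbf{A} \boldsymbol{Y}\) considered here share the same (zero) bias, so their risks differ only through the variance term, which by Step 3 is minimized at \(\mathbf{A}=\mathbf{X}^{+}\).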
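As an illustration (not part of the book's solution), the inequality can also be checked numerically. The sketch below, in Python with NumPy, fixes a design matrix \(\mathbf{X}\) of full column rank, builds an alternative unbiased matrix \(\mathbf{A}=\mathbf{X}^{+}+\boldsymbol{u} \boldsymbol{v}^{\top}\) with \(\boldsymbol{v}\) orthogonal to the columns of \(\mathbf{X}\) (so \(\mathbf{A} \mathbf{X}=\mathbf{X}^{+} \mathbf{X}\) and unbiasedness is preserved), and compares the two conditional variances; all names and parameter values are illustrative.

import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 50, 3, 1.0
X = rng.standard_normal((n, p))            # fixed design matrix (full column rank a.s.)
beta = rng.standard_normal(p)              # true coefficients
x = rng.standard_normal(p)                 # fixed feature vector

Xplus = np.linalg.pinv(X)                  # Moore-Penrose pseudo-inverse X^+

# Alternative unbiased A = X^+ + u v^T with v orthogonal to col(X):
# then A X = X^+ X, so the conditional mean of x^T A Y is unchanged.
Q, _ = np.linalg.qr(X, mode="complete")
v = Q[:, p]                                # unit vector with v^T X = 0
u = rng.standard_normal(p)
A = Xplus + np.outer(u, v)
assert np.allclose(A @ X, Xplus @ X)       # unbiasedness preserved

# Exact conditional variances: Var[x^T M Y | X] = sigma^2 * ||M^T x||^2
var_pinv = sigma**2 * np.sum((Xplus.T @ x) ** 2)
var_alt = sigma**2 * np.sum((A.T @ x) ** 2)
print(var_pinv, "<=", var_alt)             # Gauss-Markov: pseudo-inverse variance is smaller

# Monte Carlo check with X held fixed
reps = 100_000
Y = X @ beta + sigma * rng.standard_normal((reps, n))
print((Y @ Xplus.T @ x).var(), (Y @ A.T @ x).var())

On a typical run the pseudo-inverse learner's variance is strictly smaller whenever \(\boldsymbol{u}^{\top} \boldsymbol{x} \neq 0\), matching the surplus term \(\sigma^{2}\left\|\mathbf{D}^{\top} \boldsymbol{x}\right\|^{2}\) in Step 2.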

Related Book:

Data Science and Machine Learning: Mathematical and Statistical Methods

ISBN: 9781118710852

1st Edition

Authors: Dirk P. Kroese, Thomas Taimre, Radislav Vaisman, Zdravko Botev
