Question:
Consider the linear model \(\boldsymbol{Y}=\mathbf{X} \boldsymbol{\beta}+\varepsilon\) in (5.8), with \(\mathbf{X}\) being the \(n \times p\) model matrix and \(\boldsymbol{\varepsilon}\) having expectation vector \(\mathbf{0}\) and covariance matrix \(\sigma^{2} \mathbf{I}_{n}\). Suppose that \(\widehat{\boldsymbol{\beta}}_{-i}\) is the least-squares estimate obtained by omitting the \(i\)-th observation, \(Y_{i}\); that is,
\[ \widehat{\boldsymbol{\beta}}_{-i}=\underset{\boldsymbol{\beta}}{\arg \min } \sum_{j \neq i}\left(Y_{j}-\boldsymbol{x}_{j}^{\top} \boldsymbol{\beta}\right)^{2} \]
where \(\boldsymbol{x}_{j}^{\top}\) is the \(j\)-th row of \(\mathbf{X}\). Let \(\widehat{Y}_{-i}=\boldsymbol{x}_{i}^{\top} \widehat{\boldsymbol{\beta}}_{-i}\) be the corresponding fitted value at \(\boldsymbol{x}_{i}\). Also, define \(\boldsymbol{B}_{i}\) as the least-squares estimator of \(\boldsymbol{\beta}\) based on the response data
\[ \boldsymbol{Y}^{(i)}:=\left[Y_{1}, \ldots, Y_{i-1}, \widehat{Y}_{-i}, Y_{i+1}, \ldots, Y_{n}\right]^{\top} \]
(a) Prove that \(\widehat{\boldsymbol{\beta}}_{-i}=\boldsymbol{B}_{i}\); that is, the linear model obtained from fitting all responses except the \(i\)-th is the same as the one obtained from fitting the data \(\boldsymbol{Y}^{(i)}\).
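Before attempting the proof in (a), it can help to see the claim hold numerically. The sketch below (illustrative data and dimensions, not from the book) fits the model with observation \(i\) omitted, then refits on all \(n\) rows after replacing \(Y_i\) with the leave-one-out fitted value, and checks that the two coefficient vectors coincide:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, i = 20, 3, 4                      # illustrative sizes and omitted index
X = rng.standard_normal((n, p))
Y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(n)

# Leave-one-out least-squares estimate: drop row i from X and Y.
mask = np.arange(n) != i
beta_loo, *_ = np.linalg.lstsq(X[mask], Y[mask], rcond=None)

# B_i: full least squares after replacing Y_i with the LOO fitted value x_i' beta_loo.
Y_mod = Y.copy()
Y_mod[i] = X[i] @ beta_loo
B_i, *_ = np.linalg.lstsq(X, Y_mod, rcond=None)

print(np.allclose(beta_loo, B_i))       # the two estimates coincide: True
```

The intuition the check reflects: with \(Y_i\) replaced by \(\widehat{Y}_{-i}\), the estimate \(\widehat{\boldsymbol{\beta}}_{-i}\) attains a zero residual at observation \(i\), so the full-data fit cannot improve on it.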
(b) Use the previous result to verify that
\[ Y_{i}-\widehat{Y}_{-i}=\left(Y_{i}-\widehat{Y}_{i}\right) /\left(1-\mathbf{P}_{i i}\right) \]
where \(\widehat{Y}_{i}=\boldsymbol{x}_{i}^{\top} \widehat{\boldsymbol{\beta}}\) is the \(i\)-th fitted value from the full-data least-squares fit and \(\mathbf{P}=\mathbf{X} \mathbf{X}^{+}\) is the projection matrix onto the columns of \(\mathbf{X}\). Hence, deduce the PRESS formula in Theorem 5.1.
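The identity in (b) can likewise be verified numerically. The sketch below (again with illustrative data) computes every leave-one-out residual \(Y_i - \widehat{Y}_{-i}\) by explicit refitting, and checks that each equals the ordinary residual \(Y_i - \widehat{Y}_i\) inflated by \(1/(1-\mathbf{P}_{ii})\); summing the squares of either side gives the PRESS statistic from a single full fit:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 15, 3
X = rng.standard_normal((n, p))
Y = X @ np.array([0.5, 1.0, -1.0]) + rng.standard_normal(n)

P = X @ np.linalg.pinv(X)               # projection matrix P = X X^+
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_hat = X @ beta_hat                    # fitted values from the full fit

# Leave-one-out residuals Y_i - x_i' beta_{-i}, by explicit refitting.
loo_resid = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    b, *_ = np.linalg.lstsq(X[mask], Y[mask], rcond=None)
    loo_resid[i] = Y[i] - X[i] @ b

# Part (b): LOO residual = ordinary residual / (1 - P_ii).
print(np.allclose(loo_resid, (Y - Y_hat) / (1 - np.diag(P))))   # True

# PRESS = sum of squared LOO residuals, from one full fit only.
press = np.sum(((Y - Y_hat) / (1 - np.diag(P))) ** 2)
print(np.isclose(press, np.sum(loo_resid ** 2)))                # True
```

The practical payoff is the PRESS formula itself: the \(n\) refits in the loop are redundant once \(\widehat{\boldsymbol{\beta}}\) and \(\operatorname{diag}(\mathbf{P})\) are available.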
Source: Dirk P. Kroese, Thomas Taimre, Radislav Vaisman, and Zdravko Botev, *Data Science and Machine Learning: Mathematical and Statistical Methods*, 1st edition, ISBN 9781118710852.