Question:
For a real dataset, we cannot obtain the expected test MSE: it requires knowledge of the true model and the irreducible error, and access to an infinite number of training sets. In simulations, we can get close to the expected test MSE, and this is exactly what we'll do.

Suppose we know that the true population regression line is $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \epsilon$, with $\beta_0 = \beta_1 = \beta_2 = 1$ and $\epsilon \sim N(0, 1)$. Generate $n = 100$ observations $Y_i$ under this model. You can use the following code to generate $X_1$: X1 = seq(0, 5, length.out = 100). Produce a plot of $Y$ against $X_1$ and print it here.

Ideally, to compute the expected test MSE we would have an infinite number of training sets. That's not computationally feasible, so instead let's just simulate 1000 training sets (each with $n = 100$). That means you'll need to simulate $n = 100$ values of $Y$ 1000 times. There is no need to generate new $X_1$ values (think about why). For each of these 1000 training sets, train 5 models of increasing complexity ($M_1$ to $M_5$): $M_1$ is the linear regression model, $M_2$ adds a 2nd-order term, $M_3$ adds a 3rd-order term, and so on up to $M_5$. For each model, store the predicted value of $Y$ when $X_1 = 1$. Report the first 5 predicted values for each model here.

Create a test set of 1000 observations $(x_0, y_0)$. For each test observation, let $x_0 = 1$, and generate $y_0$ using the true regression line with $x_0 = 1$. Report the first 5 values in your test set.

Use the results from above to obtain the expected test MSE for each of the five models when $x_0 = 1$. Report the five expected test MSEs here. Which model has the smallest expected test MSE?

Produce a plot with expected test MSE on the y-axis and model complexity (1 to 5) on the x-axis. Present that plot here. Explain the behavior of your results in the context of the bias-variance tradeoff.
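The steps in the question can be sketched in R roughly as follows. This is a minimal illustration, not a reference solution. It assumes the true model is $Y = 1 + X_1 + X_1^2 + \epsilon$ (the quadratic term is a reading of the problem statement, so treat it as an assumption). It also assumes each test response is paired with one training-set prediction when averaging squared errors. The names `f_true`, `preds`, `y0`, and `test_mse`, the seed, and the use of `poly(..., raw = TRUE)` are choices made for this sketch.

```r
# Minimal sketch; ASSUMES the true model is Y = 1 + X1 + X1^2 + e, e ~ N(0, 1).
set.seed(1)                                # arbitrary seed, for reproducibility

n      <- 100                              # observations per training set
n_sim  <- 1000                             # number of simulated training sets
X1     <- seq(0, 5, length.out = n)        # fixed design, reused for every set
f_true <- function(x) 1 + x + x^2          # assumed true regression function
x0     <- data.frame(X1 = 1)               # point at which we predict

# One training set, for the scatterplot of Y against X1
Y <- f_true(X1) + rnorm(n)
plot(X1, Y, main = "One simulated training set")

# Predictions at X1 = 1: one row per training set, one column per model
preds <- matrix(NA_real_, nrow = n_sim, ncol = 5,
                dimnames = list(NULL, paste0("M", 1:5)))
for (s in seq_len(n_sim)) {
  train <- data.frame(X1 = X1, Y = f_true(X1) + rnorm(n))
  for (d in 1:5) {
    # Polynomial of degree d in X1 (d = 1 is the plain linear model)
    fit <- lm(Y ~ poly(X1, d, raw = TRUE), data = train)
    preds[s, d] <- predict(fit, newdata = x0)
  }
}
print(head(preds, 5))

# Test set: 1000 responses, all generated at x0 = 1
y0 <- f_true(1) + rnorm(n_sim)
print(head(data.frame(x0 = 1, y0 = y0), 5))

# Estimated expected test MSE at x0 = 1: test response i is paired with the
# prediction from training set i, and squared errors are averaged per model
test_mse <- colMeans((y0 - preds)^2)
print(round(test_mse, 4))

# Pieces of the bias-variance decomposition at x0 = 1
bias_sq  <- (colMeans(preds) - f_true(1))^2
variance <- apply(preds, 2, var)
print(round(rbind(bias_sq, variance), 4))

plot(1:5, test_mse, type = "b",
     xlab = "Model complexity (polynomial degree)",
     ylab = "Estimated expected test MSE at x0 = 1")
```

The same $X_1$ grid is reused because the design is treated as fixed, so only the noise changes between training sets. At $x_0$, the expected test MSE decomposes as $\mathrm{Var}(\hat{f}(x_0)) + \mathrm{Bias}(\hat{f}(x_0))^2 + \mathrm{Var}(\epsilon)$, which is why the sketch prints `bias_sq` and `variance` next to `test_mse`. Under the assumed quadratic model, one would expect $M_1$ to carry some bias and $M_3$ to $M_5$ to add variance without reducing bias. The gaps are small next to the irreducible error of 1, though, so the ordering of the estimated MSEs may vary from seed to seed.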
