Question:

In this problem, we are going to use simulated datasets to better understand how the square
of bias, variance, irreducible error, and MSE vary with model flexibility.
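(a) [4pts] Generate a simulated dataset as follows:
    import numpy as np
    import pandas as pd

    def f(x):
        # True regression function
        return x**5 - 2 * x**3

    def get_sim_data(f, sample_size=100, std=0.01):
        # x ~ Uniform(0, 1); y = f(x) + Gaussian noise with standard deviation std
        x = np.random.uniform(0, 1, sample_size)
        y = f(x) + np.random.normal(0, std, sample_size)
        df = pd.DataFrame({'x': x, 'y': y})
        return df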
In this dataset, what is the number of observations n and what is the number of features
p (different powers of x are counted as different features)? Write out the model used
to generate the data in equation form.
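As a quick sanity check on part (a), the generator can be run once and inspected. This is only a sketch: it assumes the true function is the polynomial x^5 - 2x^3, and the seed and the names `n_obs` and `resid_std` are illustrative choices, not part of the problem statement.

```python
import numpy as np
import pandas as pd

def f(x):
    # Assumed true function: x^5 - 2x^3
    return x**5 - 2 * x**3

def get_sim_data(f, sample_size=100, std=0.01):
    x = np.random.uniform(0, 1, sample_size)
    y = f(x) + np.random.normal(0, std, sample_size)
    return pd.DataFrame({'x': x, 'y': y})

np.random.seed(0)  # arbitrary seed, only for reproducibility
df = get_sim_data(f)

n_obs = len(df)  # number of simulated observations
# Spread of y around f(x); should be close to the std argument (0.01)
resid_std = (df['y'] - f(df['x'])).std()
print(n_obs, round(resid_std, 4))
```

The residual spread is an empirical estimate of the noise level, so it will differ slightly from 0.01 on any single draw.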
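(b) [4pts] Fit polynomial functions of degree 0 to 15 using the simulated data in
(a):
\[
\begin{aligned}
f_0(x) &= \beta_0 + \epsilon \\
f_1(x) &= \beta_0 + \beta_1 x + \epsilon \\
f_2(x) &= \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon \\
&\;\;\vdots \\
f_{15}(x) &= \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \cdots + \beta_{15} x^{15} + \epsilon
\end{aligned}
\]
(Hint: You may find
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
useful.)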
(c) [4pts] Predict the response at $x_0 = 0.18$ using the fitted functions in (b).
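One possible way to approach parts (b) and (c) with the two classes named in the hint is sketched below. It assumes the true function is x^5 - 2x^3; the seed, the dictionary names `models` and `preds`, and the choice to put the constant column in the features (rather than using the regression intercept) are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

def f(x):
    # Assumed true function: x^5 - 2x^3
    return x**5 - 2 * x**3

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
x = rng.uniform(0, 1, 100)
y = f(x) + rng.normal(0, 0.01, 100)

models = {}
for degree in range(16):
    # include_bias=True supplies the constant column, so the regression
    # itself is fitted without a separate intercept.
    poly = PolynomialFeatures(degree=degree, include_bias=True)
    X = poly.fit_transform(x.reshape(-1, 1))
    reg = LinearRegression(fit_intercept=False).fit(X, y)
    models[degree] = (poly, reg)

# Predict at x0 = 0.18 with each fitted polynomial.
x0 = np.array([[0.18]])
preds = {d: reg.predict(poly.transform(x0))[0]
         for d, (poly, reg) in models.items()}
```

With this setup the degree-0 fit reduces to the sample mean of `y`, which is a convenient check that the constant-only model is handled correctly.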
(d) [4pts] Repeat (a)-(c) 250 times.
(e) [4pts] Use (d) to calculate the square of bias for the fitted polynomials $\hat{f}_0(x_0), \hat{f}_1(x_0), \cdots, \hat{f}_{15}(x_0)$.
(f) [4pts] Use (d) to calculate the variance for the fitted polynomials $\hat{f}_0(x_0), \hat{f}_1(x_0), \cdots, \hat{f}_{15}(x_0)$.
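Parts (d) through (f) amount to a Monte Carlo loop: simulate, fit, predict, then summarize the 250 predictions per degree. The sketch below is one way to do that under stated assumptions. It assumes the true function is x^5 - 2x^3, and it swaps in plain NumPy least squares on a Vandermonde design matrix in place of the scikit-learn classes from the hint. The function name `simulate_predictions` and the seed are hypothetical.

```python
import numpy as np

def f(x):
    # Assumed true function: x^5 - 2x^3
    return x**5 - 2 * x**3

def simulate_predictions(n_sims=250, max_degree=15, x0=0.18,
                         sample_size=100, std=0.01, seed=0):
    """Return an (n_sims, max_degree + 1) array of predictions at x0."""
    rng = np.random.default_rng(seed)
    preds = np.empty((n_sims, max_degree + 1))
    for s in range(n_sims):
        # Fresh dataset for every repetition.
        x = rng.uniform(0, 1, sample_size)
        y = f(x) + rng.normal(0, std, sample_size)
        for d in range(max_degree + 1):
            # Design matrix with columns 1, x, ..., x^d.
            X = np.vander(x, d + 1, increasing=True)
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            preds[s, d] = np.vander([x0], d + 1, increasing=True)[0] @ beta
    return preds

preds = simulate_predictions()

# Squared bias: (average prediction - true value at x0)^2, per degree.
bias_sq = (preds.mean(axis=0) - f(0.18)) ** 2
# Variance of the predictions across repetitions (population form, ddof=0).
variance = preds.var(axis=0)
```

Both summaries are estimates based on 250 repetitions, so they carry some simulation noise; a different seed will shift the values slightly.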
(g) [4pts] Calculate the irreducible error based on the data-generating process.
(h) [4pts] Calculate the MSE based on (e), (f), and (g).
(i) [6pts] Plot how the square of bias, variance, irreducible error, and MSE vary with the
degree of the polynomial. Explain your findings.
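The quantities in (e) through (i) are tied together by the standard decomposition of the expected squared prediction error at $x_0$. The notation below is assumed for illustration and does not come from the problem statement: $y_0$ is a new response at $x_0$, $d$ is the polynomial degree, and $\sigma = 0.01$ is the default `std` in `get_sim_data`.

```latex
\begin{aligned}
\mathbb{E}\big[(y_0 - \hat{f}_d(x_0))^2\big]
  &= \big(\mathbb{E}[\hat{f}_d(x_0)] - f(x_0)\big)^2
   + \operatorname{Var}\big(\hat{f}_d(x_0)\big)
   + \operatorname{Var}(\epsilon), \\
\operatorname{Var}(\epsilon) &= \sigma^2 = 0.01^2 = 10^{-4}.
\end{aligned}
```

Under this reading, the irreducible error is the same constant for every degree, so in the plot for (i) only the squared bias and the variance change with flexibility.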