Question: PLEASE USE PYTHON
Training error should strictly decrease as the degree of the hypothesis polynomials increases. That is because any high-degree polynomial can "simulate" a lower-degree polynomial by setting its high-order coefficients to zero. Thus nothing is lost, and something might be gained, by increasing the degree.
But the code below shows that in-sample error actually starts to increase on our dataset for polynomials of very high degree. Why do you think this happens?
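The nesting argument is easy to check numerically at moderate degrees. A small sketch of my own (using np.polyfit rather than the helpers in the code below) showing that training MSE, the quantity least squares actually minimizes, never increases with degree:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

mses = []
for deg in range(1, 8):
    coeffs = np.polyfit(x, y, deg)  # least-squares fit of degree `deg`
    mses.append(np.mean((y - np.polyval(coeffs, x)) ** 2))

# Each hypothesis class contains the previous one, so training MSE can only go down.
print(mses)
```

Note that least squares minimizes squared error, so it is the training MSE, not necessarily the MAE used below, that is guaranteed to be non-increasing even in exact arithmetic.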
CODE BELOW:
## Numerical error
import numpy as np
import matplotlib.pyplot as plt

# test_train_split, linear_fit, linear_predict and MAE are assumed to be
# defined earlier in the notebook.

xmin, xmax = 0, 4*np.pi
x = np.linspace(xmin, xmax, 1000)
D = 30        # maximum polynomial degree
N = 100       # points sampled per trial
K = 200       # number of trials

train_vals = np.zeros((K, D))
test_vals = np.zeros((K, D))

for k in range(K):
    shuff = np.random.permutation(len(x))
    x_pts = np.array(sorted(x[shuff][:N]))
    noise = np.random.randn(N)
    y = np.sin(x_pts) + noise/7
    for i, deg in enumerate(range(D)):
        # Design matrix with columns 1, x, ..., x^deg
        # (deg+1 columns, so deg == 0 still gives an intercept column).
        X = np.ones((N, deg + 1))
        for j in range(1, deg + 1):
            X[:, j] = x_pts**j
        X_train, X_test, y_train, y_test = test_train_split(X, y, 0.13)
        w = linear_fit(X_train, y_train)
        g_train = linear_predict(X_train, w)
        g_test = linear_predict(X_test, w)
        train_vals[k][i] = MAE(g_train, y_train)
        test_vals[k][i] = MAE(g_test, y_test)

tr_vals = np.mean(train_vals, axis=0)
te_vals = np.mean(test_vals, axis=0)

plt.plot(range(D), tr_vals)
plt.title("In sample error rises due to numerical error")
plt.xlabel("Polynomial degree")
plt.ylabel("MAE")
# plt.axis([0, D, 0, 2])
plt.show()
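One way to probe the numerical error the plot title suggests is the conditioning of the design matrix. A minimal sketch (my addition, using np.vander over the same x range as above) showing the condition number exploding with degree:

```python
import numpy as np

x_pts = np.linspace(0, 4 * np.pi, 100)
for deg in (5, 15, 25):
    X = np.vander(x_pts, deg + 1, increasing=True)  # columns 1, x, ..., x^deg
    print(deg, np.linalg.cond(X))
```

Once the condition number approaches 1/machine-epsilon (about 1e16 in double precision), the least-squares solve loses essentially all significant digits, so the fitted coefficients can fail to attain the true in-sample minimum.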
