1. Regression with Polynomial Basis Functions, 30 points. This problem extends ordinary least squares regression, which uses the hypothesis class of linear regression functions, to non-linear regression functions modeled using polynomial basis functions. In order to learn nonlinear models using linear regression, we have to explicitly transform the data into a higher-dimensional space. The nonlinear hypothesis class we will consider is the set of degree-$d$ polynomials of the form

$$f(x) = w_0 + w_1 x + w_2 x^2 + \cdots + w_d x^d,$$

or, equivalently, a linear combination of polynomial basis functions:

$$f(x) = \begin{bmatrix} w_0 & w_1 & w_2 & \cdots & w_d \end{bmatrix} \begin{bmatrix} 1 \\ x \\ x^2 \\ \vdots \\ x^d \end{bmatrix}.$$

The monomials $\{1, x, x^2, \ldots, x^d\}$ are called basis functions, and each basis function $x^k$ has a corresponding weight $w_k$ associated with it, for all $k = 0, 1, \ldots, d$. We transform each univariate data point $x_i$ into a multivariate ($(d+1)$-dimensional) data point via $\phi(x_i) = [1, x_i, x_i^2, \ldots, x_i^d]$. When this transformation is applied to every data point, it produces the Vandermonde matrix:

$$\Phi = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^d \\ 1 & x_2 & x_2^2 & \cdots & x_2^d \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^d \end{bmatrix}.$$

a. (10 points) Complete the Python function below that takes univariate data as input and computes a Vandermonde matrix of degree $d$. This transforms one-dimensional data into $(d+1)$-dimensional data in terms of the polynomial basis and allows us to model regression using a degree-$d$ polynomial.

In [ ]:
# x float(n, ): univariate data
# d int: degree of polynomial
def polynomial_transform(x, d):
    #
    # *** Insert your code here ***
    #

b. (10 points) Complete the Python function below that takes a Vandermonde matrix $\Phi$ and the labels $\mathbf{y}$ as input and learns weights via ordinary least squares regression. Specifically, given a Vandermonde matrix $\Phi$, implement the computation of $\mathbf{w} = (\Phi^T \Phi)^{-1} \Phi^T \mathbf{y}$. Remember that in Python, @ performs matrix multiplication, while * performs element-wise multiplication. Alternatively, numpy.dot also performs matrix multiplication.

In [ ]:
# Phi float(n, d+1): transformed data
# y   float(n, ):    labels
def train_model(Phi, y):
    #
    # *** Insert your code here ***
    #

c. (5 points) Complete the Python function below that takes a Vandermonde matrix $\Phi$, corresponding labels $\mathbf{y}$, and a linear regression model $\mathbf{w}$ as input and evaluates the model using mean squared error. That is,

$$e_{\mathrm{MSE}} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \mathbf{w}^T \phi(x_i) \right)^2.$$

In [ ]:
# Phi float(n, d+1): transformed data
# y   float(n, ):    labels
# w   float(d+1, ):  linear regression model
def evaluate_model(Phi, y, w):
    #
    # *** Insert your code here ***
    #
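For part a, a minimal sketch of one possible implementation, not the required solution: assuming NumPy is imported as np, numpy.vander with increasing=True stacks the powers $x^0$ through $x^d$ as columns of an $n \times (d+1)$ matrix.

import numpy as np

def polynomial_transform(x, d):
    # Build the (n, d+1) Vandermonde matrix with columns x^0, x^1, ..., x^d
    return np.vander(x, N=d + 1, increasing=True)

An equivalent hand-rolled version is np.column_stack([x ** k for k in range(d + 1)]).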
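For part b, a sketch of the normal-equations computation under the same NumPy assumption. Rather than forming the inverse explicitly, this version uses np.linalg.solve, which computes the same $\mathbf{w}$ but is numerically more stable; a literal translation with np.linalg.inv would also match the formula.

def train_model(Phi, y):
    # Solve (Phi^T Phi) w = Phi^T y, equivalent to w = (Phi^T Phi)^{-1} Phi^T y
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)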
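For part c, a sketch of the mean-squared-error evaluation, again assuming NumPy as np: Phi @ w produces the vector of predictions $\mathbf{w}^T \phi(x_i)$ for all $n$ points at once.

def evaluate_model(Phi, y, w):
    # Mean squared error: average squared residual over all n points
    return np.mean((y - Phi @ w) ** 2)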
From the plot of $d$ versus validation error below, which choice of $d$ do you expect will generalize best?

In [ ]:
w = {}               # Dictionary to store all the trained models
validationErr = {}   # Validation error of the models
testErr = {}         # Test error of all the models

for d in range(3, 25, 3):                                    # Iterate over polynomial degree
    Phi_trn = polynomial_transform(x_trn, d)                 # Transform training data into d dimensions
    w[d] = train_model(Phi_trn, y_trn)                       # Learn model on training data

    Phi_val = polynomial_transform(x_val, d)                 # Transform validation data into d dimensions
    validationErr[d] = evaluate_model(Phi_val, y_val, w[d])  # Evaluate model on validation data

    Phi_tst = polynomial_transform(x_tst, d)                 # Transform test data into d dimensions
    testErr[d] = evaluate_model(Phi_tst, y_tst, w[d])        # Evaluate model on test data

# Plot all the models
plt.figure()
plt.plot(list(validationErr.keys()), list(validationErr.values()), marker='o', linewidth=3, markersize=12)
plt.plot(list(testErr.keys()), list(testErr.values()), marker='s', linewidth=3, markersize=12)
plt.xlabel('Polynomial degree', fontsize=16)
plt.ylabel('Validation/Test error', fontsize=16)
plt.xticks(list(validationErr.keys()), fontsize=12)
plt.legend(['Validation Error', 'Test Error'], fontsize=16)
plt.axis([2, 25, 15, 60])
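The question above asks you to read the answer off the plot, but as a hypothetical convenience helper (not part of the assignment), the degree with the lowest validation error can also be extracted directly from the dictionary:

# Hypothetical helper: pick the degree minimizing validation error
d_best = min(validationErr, key=validationErr.get)
print('Degree with lowest validation error:', d_best)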