
Regression Models

1. Create a Python file named myregressor.py and import the following packages.

import pickle

import numpy as np

from sklearn import linear_model

import sklearn.metrics as sm

import matplotlib.pyplot as plt

2. Add the following lines. Read them and explain their purpose.

input_file = 'regressor_data.txt'

data = np.loadtxt(input_file, delimiter=',')

X, y = data[:, :-1], data[:, -1]

num_training = int(0.8 * len(X))

num_test = len(X) - num_training

X_train, y_train = X[:num_training], y[:num_training]

X_test, y_test = X[num_training:], y[num_training:]
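To see what the slicing in step 2 does, the sketch below applies the same lines to a small made-up array instead of regressor_data.txt, whose contents are not shown here. The array values and the name demo_data are assumptions for illustration only.

```python
import numpy as np

# Hypothetical stand-in for the contents of regressor_data.txt:
# 10 rows, one feature column followed by one target column.
demo_data = np.column_stack([np.arange(10.0), 2.0 * np.arange(10.0) + 1.0])

# Every column except the last is a feature; the last column is the target.
X, y = demo_data[:, :-1], demo_data[:, -1]

# The first 80% of the rows are used for training, the rest for testing.
num_training = int(0.8 * len(X))
num_test = len(X) - num_training

X_train, y_train = X[:num_training], y[:num_training]
X_test, y_test = X[num_training:], y[num_training:]

print(X.shape, y.shape)        # (10, 1) (10,)
print(num_training, num_test)  # 8 2
```

Because the split is purely positional, the rows are not shuffled. If the input file happens to be sorted, the last 20% of rows may not be representative of the whole data set.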

3. Add the following lines. What is their purpose?

regressor = linear_model.LinearRegression()

regressor.fit(X_train, y_train)

y_test_pred = regressor.predict(X_test)

4. Add the following lines. Run the program. Save the plot diagram to your local computer and insert the diagram below.

plt.scatter(X_test, y_test, color='green')

plt.plot(X_test, y_test_pred, color='black', linewidth=4)

plt.xticks(())

plt.yticks(())

plt.show()

5. Explain what has been drawn in the graph.

6. Modify the above code to display, in the same diagram, scatter plots of (1) the training data set in blue, (2) the testing data set in green, and (3) the predicted data set in red. Show your code and insert the diagram you saved.
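One possible shape for the modification in step 6 is sketched below. It is not the expected answer, only an illustration under stated assumptions: the data is synthetic (regressor_data.txt is not available here), the data has a single feature column, the Agg backend is chosen so the figure can be saved without a display, and the output file name regressor_scatter.png is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, figure is saved rather than shown
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model

# Hypothetical single-feature data standing in for regressor_data.txt.
rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(50, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=1.0, size=50)

# Same positional 80/20 split as in step 2.
num_training = int(0.8 * len(X))
X_train, y_train = X[:num_training], y[:num_training]
X_test, y_test = X[num_training:], y[num_training:]

# Same model as in step 3.
regressor = linear_model.LinearRegression()
regressor.fit(X_train, y_train)
y_test_pred = regressor.predict(X_test)

# Three scatter plots on one set of axes.
fig, ax = plt.subplots()
ax.scatter(X_train[:, 0], y_train, color='blue', label='training data')
ax.scatter(X_test[:, 0], y_test, color='green', label='testing data')
ax.scatter(X_test[:, 0], y_test_pred, color='red', label='predicted data')
ax.legend()
fig.savefig('regressor_scatter.png')  # hypothetical output file name
```

With an interactive backend, the matplotlib.use line can be dropped and plt.show() called instead of, or in addition to, fig.savefig. The red points all lie on the fitted line, since they are the model's predictions for the test inputs.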

7. Add the following lines and run your program. Please show the printout.

print("Linear regressor performance:")

print("Mean absolute error =", round(sm.mean_absolute_error(y_test, y_test_pred), 2))

print("Mean squared error =", round(sm.mean_squared_error(y_test, y_test_pred), 2))

print("Median absolute error =", round(sm.median_absolute_error(y_test, y_test_pred), 2))

print("Explained variance score =", round(sm.explained_variance_score(y_test, y_test_pred), 2))

print("R2 score =", round(sm.r2_score(y_test, y_test_pred), 2))

8. Use their equations to explain what mean_absolute_error and mean_squared_error are.
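As a starting point for step 8, the two metrics can be written out by hand. The sketch below states each equation in a comment and evaluates it with NumPy on four made-up values; the helper names mae and mse and the sample numbers are assumptions, not part of the lab.

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean absolute error: MAE = (1/n) * sum_i |y_i - yhat_i|
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    # Mean squared error: MSE = (1/n) * sum_i (y_i - yhat_i)^2
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print(mae(y_true, y_pred))  # 0.5    (errors: 0.5, 0.5, 0, 1)
print(mse(y_true, y_pred))  # 0.375  (squared errors: 0.25, 0.25, 0, 1)
```

Squaring penalizes large errors more heavily than small ones, so MSE is more sensitive to outliers than MAE.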

9. From the provided document, learn what explained variation and R squared are.
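The provided document for step 9 is not reproduced here, so the sketch below follows the definitions scikit-learn uses for explained_variance_score and r2_score with a single target. The helper names and sample values are assumptions for illustration.

```python
import numpy as np

def explained_variance(y_true, y_pred):
    # Explained variance = 1 - Var(y - yhat) / Var(y)
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(1.0 - np.var(y_true - y_pred) / np.var(y_true))

def r_squared(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot, where
    #   SS_res = sum_i (y_i - yhat_i)^2   (residual sum of squares)
    #   SS_tot = sum_i (y_i - mean(y))^2  (total sum of squares)
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print(round(explained_variance(y_true, y_pred), 3))  # 0.957
print(round(r_squared(y_true, y_pred), 3))           # 0.949
```

The two scores coincide when the residuals have zero mean. Here the predictions are biased (mean residual of -0.25), which is why R squared comes out slightly lower than the explained variance.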

10. Add the following lines and run your program. What is the printout?

output_model_file = 'myregressor.pkl'

with open(output_model_file, 'wb') as f:

    pickle.dump(regressor, f)

with open(output_model_file, 'rb') as f:

    regressor_model = pickle.load(f)

y_test_pred_new = regressor_model.predict(X_test)

print(" New mean absolute error =", round(sm.mean_absolute_error(y_test, y_test_pred_new), 2))

11. Read the lines above and consider what they are intended to do.

12. Based on the previous labs, consider how to use model_selection to split the data into training and testing sets. Answer the following questions: (1) Which library package should be imported? (2) Which function is used to split the data set? (3) Write the code to replace the last four lines of the code in question 2. (4) Run your code and show the printout only.
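For orientation on step 12, the sketch below shows how scikit-learn's train_test_split from sklearn.model_selection can produce an 80/20 split. The synthetic X and y and the random_state value are assumptions chosen for illustration; the previous labs may have used different settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data standing in for the X and y loaded in step 2.
X = np.arange(100.0).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0

# test_size=0.2 keeps 80% of the rows for training;
# random_state (arbitrary value) makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=5)

print(X_train.shape, X_test.shape)  # (80, 1) (20, 1)
print(y_train.shape, y_test.shape)  # (80,) (20,)
```

Unlike the positional slicing in step 2, train_test_split shuffles the rows by default, so the metrics printed in step 7 will generally differ from those of the original split.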
