Question: In Python: machine learning homework for regression I have provided the data set and part of the code. I hope you can help me with

In Python: machine learning homework for regression

I have provided the data set and part of the code. I hope you can help me with Problem 4. Thank you!

Here is the data set and code for previous question, which is relevant with Problem 4.

==============Code Chunk==================

from sklearn.linear_model import LinearRegression import pandas as pd import pylab as plt import seaborn import numpy.random as nprnd import random

%matplotlib inline

# Import data df = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0) df.head()

==============Code Chunk==================

In Python: machine learning homework for regression I have provided the data ==============Code Chunk==================

from sklearn.linear_model import LinearRegression from sklearn.model_selection import train_test_split

# Set y to be the sales in df y = df['sales']

# Set X to be just the features described above in df, also create a new column called interecept which is just 1. X = df.drop(['sales'],1)

# Randomly split data into training and testing - 80% training, 20% testing. X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create linear regression object regr = LinearRegression()

# Train the model using the training sets regr.fit(X_train, y_train)

# The coefficients print('Coefficients: ', regr.coef_)

# The mean square error print("Residual sum of squares: %.2f" % np.mean((regr.predict(X_test) - y_test) ** 2))

# Explained variance score: 1 is perfect prediction print('R^2 Score: %.2f' % regr.score(X_test, y_test)) ==============Code Chunk==================

set and part of the code. I hope you can help me with Problem 4. Thank you! Here is the data set and code

Please provide relevant answers and Python code. (Screenshots of your Jupyter Notebook are okay!!! )

Out [3]: 1 230.1 37.8 2 44.5 39.3 3 17.2 45.9 4 151.5 41.3 5 180.8 10.8 TV radio newspaper sales 69.2 22.1 45.1 10.4 69.3 9.3 58.5 18.5 58.4 12.9 What are the features (variables, covariates, all mean the same thing)? TV: advertising dollars spent on TV for a single product in a given market (in thousands of dollars) Radio: advertising dollars spent on Radio Newspaper: advertising dollars spent on Newspaper . Sales: Number of 1k units sold Goal: Predict the amount of sales in a given market based on the advertising in TV, Radio and Newspaper. [5 points] a) First rescale you features to have mean zero and unit variance using from sklearn import preprocessing X_scaled preprocessing. scale (x) y_scaled y - np. mean (y) [5 points] a) Repeat the regression above, but using ski earn 's Lasso() method with = 0 on the new scaled features. Notice you may obtain a warning. Can you explain what the warning about convergence of the gradient descent method may mean? [10 points] b) Choose a range of ranging from 0 to 1 . Why does the R2 score on test data seem to increase then decrease again? Plot the R2 on test data as you vary a. Note: Because of the randomness, you may need to experiment a bit with replacing 1 by smaller or lower numbers. The goal is to find a concave region. Also, to create a collection of uniformly distributed points from 0 to 1, you can use: import numpy as np alphas np. linspace (0, 1, num_samples) [10 points] c) Choose the best from part b) (you can use alphas [np, argmax (scores) ] ). What is your new score on test data? (it may not change much) What are the coefficients for this model and how do they compare as a ranges from close to 0 to larger values? [10 points] d) Experiment with different values of a in part b) but now using Ridge ? How do the coefficients vary as you vary a in the Ridge method? How do you explain the difference between this observation and the solution in part c) based on the level sets of LP for P = 1 and p 2. [5 points] e) Did we need to rescale in part a)? Why would our solution not make sense otherwise? Out [3]: 1 230.1 37.8 2 44.5 39.3 3 17.2 45.9 4 151.5 41.3 5 180.8 10.8 TV radio newspaper sales 69.2 22.1 45.1 10.4 69.3 9.3 58.5 18.5 58.4 12.9 What are the features (variables, covariates, all mean the same thing)? TV: advertising dollars spent on TV for a single product in a given market (in thousands of dollars) Radio: advertising dollars spent on Radio Newspaper: advertising dollars spent on Newspaper . Sales: Number of 1k units sold Goal: Predict the amount of sales in a given market based on the advertising in TV, Radio and Newspaper. [5 points] a) First rescale you features to have mean zero and unit variance using from sklearn import preprocessing X_scaled preprocessing. scale (x) y_scaled y - np. mean (y) [5 points] a) Repeat the regression above, but using ski earn 's Lasso() method with = 0 on the new scaled features. Notice you may obtain a warning. Can you explain what the warning about convergence of the gradient descent method may mean? [10 points] b) Choose a range of ranging from 0 to 1 . Why does the R2 score on test data seem to increase then decrease again? Plot the R2 on test data as you vary a. Note: Because of the randomness, you may need to experiment a bit with replacing 1 by smaller or lower numbers. The goal is to find a concave region. Also, to create a collection of uniformly distributed points from 0 to 1, you can use: import numpy as np alphas np. linspace (0, 1, num_samples) [10 points] c) Choose the best from part b) (you can use alphas [np, argmax (scores) ] ). What is your new score on test data? (it may not change much) What are the coefficients for this model and how do they compare as a ranges from close to 0 to larger values? [10 points] d) Experiment with different values of a in part b) but now using Ridge ? How do the coefficients vary as you vary a in the Ridge method? How do you explain the difference between this observation and the solution in part c) based on the level sets of LP for P = 1 and p 2. [5 points] e) Did we need to rescale in part a)? Why would our solution not make sense otherwise

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

diabetes.cvs python machine learning Use Logistic Regression to predict the outcome for the attached dataset. The outcome refers to: a patient has diabetes (1) or no diabetes (0) 1. What type of...

Python!!! Python homework for Regression Model I have provided the original data set, and part of the code. I hope that you can help me with Question d_1, d_2, f, g, h. Thank you! In this problem we...

MATHEMATICS FOR MACHINE LEARNING Marc Peter Deisenroth A. Aldo Faisal Cheng Soon Ong Contents Foreword 1 Part I Mathematical Foundations 9 1 Introduction and Motivation 11 1.1 Finding Words for...

Python!!! Python programming homework and Interpretation I have provided the original dataset, and part of the data. I hope that you can help me with Question d_1, d_2, f, g, h, i, j. Thank you! In...

Read the above passage and then answer short questions Summarize and elaborate the research method of this article in concise language Application Research Based on Machine Learning in Network...

Read the above passage and then answer short questionsplease use 1,2,3,4 to write a simple and clear overview of the steps for the research process of this article, a hand-drawn chart is better....

5.4 General Methodology-Related Considerations 5.4.1 Planning an Analytics Project A critical success factor in technical projects, particularly where there is any element of exploration and...

Read the above passage and then answer short questionsWhat is the research tool or platform used in this paper? Application Research Based on Machine Learning in Network Privacy Security Abstracts...

Read the above passage and then answer short questionsWhat can be improved about the research method of this paper, that is, where is the gap? Application Research Based on Machine Learning in...

Read the above passage and then answer short questionsThe research method of this paper can be further upgraded and changed. Could you give a general explanation? Application Research Based on...

Hangin Out Night Club maintains an imprest petty cash fund of $ 100, which is under the control of Sandra Morgan. At March 31, the fund holds $ 9 cash and petty cash tickets for office supplies, $...

What steps should a lender go through in trying to resolve a problem loan situation?

Time left 0 : 5 1 : 0 7 Hide 1 0 True or false? If UIP holds, there is an opportunity to make a risk - free profit using currency arbitrage. True False Finish attempt . . .

after a tax preparer has determined that the taxpayer has a net operating loss, what is the next step for claiming an nol?

4. Develop policies to help employees and the company avoid technical obsolescence.

5. Develop policies to help employees deal with work-life conflicts.

1. Design an effective socialization program for employees.