Question: Python programming homework and interpretation

I have provided the original dataset, and part of the data. I hope that you can help me with Question d_1, d_2, f, g, h, i, j.

Thank you!

In this problem we will explore our first dataset using pandas (for loading and processing our data) and sklearn (for building machine learning models).

==================Code Chunk======================

from sklearn.linear_model import LinearRegression
import pandas as pd
import pylab as plt
import seaborn
import numpy.random as nprnd
import random

%matplotlib inline

df = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0)
df.head()

==================Code Chunk======================

Problem: Predict sales using sklearn

Split data into training and testing subsets.
Train model using LinearRegression() from sklearn.linear_model on training data.
Evaluate using RMSE and R^2 on testing set

====================Code Chunk==========================

from sklearn.linear_model import LinearRegression

# Set y to be the sales in df

y = df['sales']

# Set X to be just the features described above in df, and also create a new column called intercept which is just 1.

X = df.drop(columns=['sales'])
X['intercept'] = 1

# Randomly split data into training and testing - 80% training, 20% testing.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

====================Code Chunk==========================
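For parts d_1 and d_2, one possible sketch is shown here. It uses synthetic data in place of Advertising.csv so that it runs offline; the generated values, the true coefficients, and the helper names (`add_intercept`, `fit_normal_equation`, `rmse`, `r2`) are assumptions made for illustration and are not part of the assignment. The closed-form fit solves the normal equations with an explicit inverse, and the sklearn fit should reproduce the same predictions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Advertising data (assumed values, not the real dataset).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(0, 300, n),   # plays the role of "TV"
    rng.uniform(0, 50, n),    # plays the role of "radio"
    rng.uniform(0, 100, n),   # plays the role of "newspaper"
])
y = 3.0 + 0.045 * X[:, 0] + 0.19 * X[:, 1] + rng.normal(0, 1.5, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

def add_intercept(X):
    """Prepend a column of ones so the first coefficient is the intercept."""
    return np.column_stack([np.ones(len(X)), X])

def fit_normal_equation(X, y):
    """Solve beta = (X^T X)^{-1} X^T y with an explicit inverse (part d_1)."""
    Xb = add_intercept(X)
    return np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Part d_1: closed-form fit on the training data, predictions on the test data.
beta = fit_normal_equation(X_train, y_train)
pred_closed = add_intercept(X_test) @ beta

# Part d_2: the same linear model through sklearn.
clf = LinearRegression().fit(X_train, y_train)
pred_sklearn = clf.predict(X_test)

print("closed form:", beta)
print("sklearn:    ", np.r_[clf.intercept_, clf.coef_])
print("RMSE:", rmse(y_test, pred_sklearn), "R^2:", r2(y_test, pred_sklearn))
```

With the real data, replace the synthetic `X` and `y` with the DataFrame columns. If the design matrix already contains an intercept column of ones, skip `add_intercept` so the column is not duplicated.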

====================Code Chunk==========================

import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(15,15))
ax = fig.add_subplot(111, projection='3d')

x_surf = np.arange(0, 300, 20)
y_surf = np.arange(0, 60, 4)
x_surf, y_surf = np.meshgrid(x_surf, y_surf)

new_x = pd.DataFrame({'TV': x_surf.ravel(), 'radio': y_surf.ravel()})
# define your regr_1 before this line
predict_sales = regr_1.predict(new_x)
ax.plot_surface(x_surf, y_surf, predict_sales.reshape(x_surf.shape), rstride=1, cstride=1, color='None', alpha=0.4)

ax.scatter(X['TV'], X['radio'], y, c='r', marker='o')

ax.set_xlabel('TV')
ax.set_ylabel("Radio")
ax.set_zlabel('sales')

====================Code Chunk========================

Please provide Python code and relevant answers. (Screenshots of your Jupyter Notebook are okay!)

Out [3]:

       TV  radio  newspaper  sales
1   230.1   37.8       69.2   22.1
2    44.5   39.3       45.1   10.4
3    17.2   45.9       69.3    9.3
4   151.5   41.3       58.5   18.5
5   180.8   10.8       58.4   12.9

What are the features (variables, covariates, all mean the same thing)?

TV: advertising dollars spent on TV for a single product in a given market (in thousands of dollars)
Radio: advertising dollars spent on Radio
Newspaper: advertising dollars spent on Newspaper
Sales: Number of 1k units sold

Goal: Predict the amount of sales in a given market based on the advertising in TV, Radio and Newspaper.

[5 points] d_1) Train model on training data, and make predictions on testing data, using our solution from class. It will be useful to use np.linalg.inv.

In [19]: # Code here

[5 points] d_2) Train model on training data, and make predictions on testing data, using sklearn.linear_model.LinearRegression. Make sure your answer matches part d_1).

In [79]: # Code here

[5 points] f) Interpreting the coefficients of your model (clf.coef_), which form of advertising appears to have the largest impact on sales? Which has the least impact?

In [ ]: # Answer here

[10 points] g) Plot the coefficients along with their confidence intervals, recalling that the variances of the coefficients are the diagonal elements of the covariance matrix $\hat{\sigma}^2 (X^T X)^{-1}$, where $\hat{\sigma}^2$ is the estimated variance of the residuals. Ensure you obtain the same results for the variance of the coefficients as when you use

    import scipy, scipy.stats
    result = sm.OLS(y, X).fit()
    result.summary()

In [ ]: # Code here

[10 points] i) (synergistic effects) Try plotting the data in three dimensions along with the hyperplane solution to see where the solution you have stops following the linear trend, and see if you can infer a new variable which will help, which is a product of two of our current variables. More precisely, our previous model has been linear in the three features. See if you can introduce a new product term for some j using your intuition from the previous problems. What is your interpretation of this result?

** Hint: The 3D plotting code chunk provided above can be adapted to make your 3D plot. **

[5 points] j) Does your mixed variable in i) improve performance? Why?
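For part g), the coefficient covariance can be computed directly from the formula in the question. The sketch below is one way to do it on synthetic data (the generated values and coefficient sizes are assumptions, not the real dataset). It estimates the residual variance as RSS / (n - p), where p counts the intercept, and builds approximate 95% intervals with the normal quantile 1.96, which is a simplification.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
# Synthetic stand-ins for TV, radio, and newspaper (assumed values).
features = rng.uniform(0, 100, size=(n, 3))
y = 3.0 + features @ np.array([0.05, 0.2, 0.0]) + rng.normal(0, 1.5, n)

X = np.column_stack([np.ones(n), features])   # intercept column first
p = X.shape[1]

# Ordinary least squares via the normal equations.
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y

# Covariance of the coefficients: sigma_hat^2 * (X^T X)^{-1}.
residuals = y - X @ beta
sigma2_hat = residuals @ residuals / (n - p)  # unbiased residual variance
cov_beta = sigma2_hat * XtX_inv
std_err = np.sqrt(np.diag(cov_beta))

# Approximate 95% confidence intervals (normal quantile, an assumption).
z = 1.96
lower, upper = beta - z * std_err, beta + z * std_err

names = ["intercept", "TV", "radio", "newspaper"]
for name, b, lo, hi in zip(names, beta, lower, upper):
    print(f"{name:>9}: {b:8.4f}  [{lo:8.4f}, {hi:8.4f}]")
```

To plot the result, something like `plt.errorbar(range(p), beta, yerr=z * std_err, fmt='o')` draws each coefficient with its interval. The values in `std_err` should agree with the `bse` attribute of a fitted statsmodels OLS result on the same `X` and `y`; the intervals reported by statsmodels use a t quantile, so they will be slightly wider than the 1.96 approximation.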
