Questions
Step 1: Generating cars dataset
This block of Python code will generate the sample data for you. You will not be generating the data set using the numpy module this week. Instead, the data set will be imported from a CSV file. To make the data unique to you, a random sample of size 30, without replacement, will be drawn from the data in the CSV file. The data set will be saved in a Python dataframe that will be used in later calculations.
Click the block of code below and hit the Run button above.
In[1]:
import pandas as pd
from IPython.display import display, HTML

# read data from mtcars.csv data set.
cars_df_orig = pd.read_csv("https://s3-us-west-2.amazonaws.com/data-analytics.zybooks.com/mtcars.csv")

# randomly pick 30 observations from the data set to make the data set unique to you.
cars_df = cars_df_orig.sample(n=30, replace=False)

# print only the first five observations in the dataset.
print("Cars data frame (showing only the first five observations) ")
display(HTML(cars_df.head().to_html()))

Cars data frame (showing only the first five observations)
    Unnamed: 0            mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  carb
2   Datsun 710           22.8    4  108.0   93  3.85  2.320  18.61   1   1     4     1
15  Lincoln Continental  10.4    8  460.0  215  3.00  5.424  17.82   0   0     3     4
16  Chrysler Imperial    14.7    8  440.0  230  3.23  5.345  17.42   0   0     3     4
21  Dodge Challenger     15.5    8  318.0  150  2.76  3.520  16.87   0   0     3     2
4   Hornet Sportabout    18.7    8  360.0  175  3.15  3.440  17.02   0   0     3     2
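Because the sample is random, every run of Step 1 draws different rows, so the numbers in the later steps change from run to run. The sketch below shows one way to make a draw repeatable, assuming that is what you want: the sample method of a pandas dataframe accepts a random_state seed. The small data frame, the sample size of 3, and the seed value 42 are illustrative stand-ins and are not part of the original script.

```python
import pandas as pd

# Hypothetical stand-in for cars_df_orig (the real script reads mtcars.csv).
demo_df = pd.DataFrame({
    "mpg": [22.8, 10.4, 14.7, 15.5, 18.7],
    "wt": [2.320, 5.424, 5.345, 3.520, 3.440],
})

# Draw 3 rows without replacement; a fixed seed makes the draw repeatable.
sample_a = demo_df.sample(n=3, replace=False, random_state=42)
sample_b = demo_df.sample(n=3, replace=False, random_state=42)

# Without replacement, no row can appear twice in the sample.
print(sample_a.index.is_unique)

# The same seed yields the same rows in the same order.
print(sample_a.equals(sample_b))
```

Leaving random_state out, as the original script does, keeps each student's sample unique.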
Step 2: Scatterplot of miles per gallon against weight
The block of code below will create a scatterplot of the variables "miles per gallon" (coded as mpg in the data set) and "weight" of the car (coded as wt).
Click the block of code below and hit the Run button above.
NOTE: If the plot is not created, click the code section and hit the Run button again.
In[2]:
import matplotlib.pyplot as plt

# create scatterplot of variables mpg against wt.
plt.plot(cars_df["wt"], cars_df["mpg"], 'o', color='red')

# set a title for the plot, x-axis, and y-axis.
plt.title('MPG against Weight')
plt.xlabel('Weight (1000s lbs)')
plt.ylabel('MPG')

# show the plot.
plt.show()

<Figure size 640x480 with 1 Axes>
Step 3: Scatterplot of miles per gallon against horsepower
The block of code below will create a scatterplot of the variables "miles per gallon" (coded as mpg in the data set) and "horsepower" of the car (coded as hp).
Click the block of code below and hit the Run button above.
NOTE: If the plot is not created, click the code section and hit the Run button again.
In[3]:
import matplotlib.pyplot as plt

# create scatterplot of variables mpg against hp.
plt.plot(cars_df["hp"], cars_df["mpg"], 'o', color='blue')

# set a title for the plot, x-axis, and y-axis.
plt.title('MPG against Horsepower')
plt.xlabel('Horsepower')
plt.ylabel('MPG')

# show the plot.
plt.show()
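When judging whether a scatterplot shows a trend, the sign of a fitted straight line's slope is a quick check. The sketch below is an illustration, not part of the original script: it uses numpy's polyfit on only the five observations printed in Step 1, so its slope will not match a fit on your full 30-row sample.

```python
import numpy as np

# Five observations copied from the sample output in Step 1 (wt in 1000s lbs).
wt = np.array([2.320, 5.424, 5.345, 3.520, 3.440])
mpg = np.array([22.8, 10.4, 14.7, 15.5, 18.7])

# Fit a degree-1 polynomial (a straight line); polyfit returns [slope, intercept].
slope, intercept = np.polyfit(wt, mpg, 1)

# A negative slope means mpg tends to fall as weight rises.
trend = "downward" if slope < 0 else "upward"
print(f"slope = {slope:.2f} mpg per 1000 lbs, trend is {trend}")
```

The same check works for horsepower by replacing wt with the hp values.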
Step 4: Correlation matrix for miles per gallon, weight and horsepower
Now you will calculate the correlation coefficient between the variables "miles per gallon" and "weight". You will also calculate the correlation coefficient between the variables "miles per gallon" and "horsepower". The corr method of a dataframe returns the correlation matrix with the correlation coefficients between all variables in the dataframe. You will specify to only return the matrix for the three variables.
Click the block of code below and hit the Run button above.
In[4]:
# create correlation matrix for mpg, wt, and hp.
# The correlation coefficient between mpg and wt is contained in the cell for mpg row and wt column (or wt row and mpg column).
# The correlation coefficient between mpg and hp is contained in the cell for mpg row and hp column (or hp row and mpg column).
mpg_wt_corr = cars_df[['mpg','wt','hp']].corr()
print(mpg_wt_corr)

          mpg        wt        hp
mpg  1.000000 -0.855763 -0.791379
wt  -0.855763  1.000000  0.663295
hp  -0.791379  0.663295  1.000000
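To read a single coefficient out of the matrix rather than scanning the printout, you can index it by row and column label with .loc. The sketch below rebuilds the printed matrix by hand so it runs on its own; in the notebook you would index mpg_wt_corr directly. The describe helper and its cutoffs (0.7 for strong, 0.3 for weak) are an assumed rule of thumb for illustration, so use whatever thresholds your course defines.

```python
import pandas as pd

labels = ["mpg", "wt", "hp"]

# Values copied from the printed matrix above; your own run will differ.
corr = pd.DataFrame(
    [[1.000000, -0.855763, -0.791379],
     [-0.855763, 1.000000, 0.663295],
     [-0.791379, 0.663295, 1.000000]],
    index=labels,
    columns=labels,
)

def describe(r, strong=0.7, weak=0.3):
    """Label direction and strength; the cutoffs are an assumed rule of thumb."""
    direction = "negative" if r < 0 else "positive"
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= weak:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"

# Row label first, column label second; the matrix is symmetric.
r_mpg_wt = corr.loc["mpg", "wt"]
r_mpg_hp = corr.loc["mpg", "hp"]
print("mpg vs wt:", r_mpg_wt, describe(r_mpg_wt))
print("mpg vs hp:", r_mpg_hp, describe(r_mpg_hp))
```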
Step 5: Multiple regression model to predict miles per gallon using weight and horsepower
This block of code produces a multiple regression model with "miles per gallon" as the response variable, and "weight" and "horsepower" as predictor variables. The ols method in the statsmodels.formula.api submodule returns all statistics for this multiple regression model.
Click the block of code below and hit the Run button above.
In[5]:
from statsmodels.formula.api import ols

# create the multiple regression model with mpg as the response variable; weight and horsepower as predictor variables.
model = ols('mpg ~ wt+hp', data=cars_df).fit()
print(model.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                    mpg   R-squared:                       0.822
Model:                            OLS   Adj. R-squared:                  0.809
Method:                 Least Squares   F-statistic:                     62.23
Date:                Thu, 06 Aug 2020   Prob (F-statistic):           7.76e-11
Time:                        19:13:24   Log-Likelihood:                -69.519
No. Observations:                  30   AIC:                             145.0
Df Residuals:                      27   BIC:                             149.2
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     36.8319      1.734     21.243      0.000      33.274      40.390
wt            -3.6736      0.675     -5.441      0.000      -5.059      -2.288
hp            -0.0336      0.009     -3.680      0.001      -0.052      -0.015
==============================================================================
Omnibus:                        6.279   Durbin-Watson:                   1.690
Prob(Omnibus):                  0.043   Jarque-Bera (JB):                4.714
Skew:                           0.928   Prob(JB):                       0.0947
Kurtosis:                       3.571   Cond. No.                         630.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
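The coef column of the summary gives the fitted equation. For the sample output shown here it reads mpg = 36.8319 - 3.6736 * wt - 0.0336 * hp. Your own sample will produce different coefficients. The sketch below turns that equation into a small prediction function. The predict_mpg name and the example car are hypothetical, and in the notebook the fitted model's own predict method could be used instead.

```python
# Coefficients copied from the "coef" column of the sample summary above.
INTERCEPT = 36.8319
COEF_WT = -3.6736   # change in mpg per 1000 lbs of weight, holding hp fixed
COEF_HP = -0.0336   # change in mpg per unit of horsepower, holding wt fixed

def predict_mpg(wt, hp):
    """Predicted miles per gallon for weight wt (1000s lbs) and horsepower hp."""
    return INTERCEPT + COEF_WT * wt + COEF_HP * hp

# Hypothetical car: 3000 lbs (wt = 3.0) and 150 horsepower.
estimate = predict_mpg(3.0, 150)
print(round(estimate, 2))
```

Predictions are most trustworthy for weights and horsepower values inside the range covered by the sampled cars.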
In your initial post, address the following items:
Check to be sure your scatterplots of miles per gallon against horsepower and weight of the car were included in your attachment. Do the plots show any trend? If yes, is the trend what you expected? Why or why not? See Steps 2 and 3 in the Python script.
What are the coefficients of correlation between miles per gallon and horsepower? Between miles per gallon and the weight of the car? What are the directions and strengths of these coefficients? Do the coefficients of correlation indicate a strong correlation, weak correlation, or no correlation between these variables? See Step 4 in the Python script.
Write the multiple regression equation for miles per gallon as the response variable. Use weight and horsepower as predictor variables. See Step 5 in the Python script. How might the car rental company use this model?