For this problem, we will be performing simple linear regression using the following dataset:

Fish.csv

This data file comes from kaggle.com: https://www.kaggle.com/aungpyaeap/fish-market

As stated on the linked page: "This dataset is a record of 7 common different fish species in fish market sales. With this dataset, a predictive model can be performed using machine friendly data and estimate the weight of fish can be predicted."

Response:

  • Weight (in grams)

Features:

  • Length1 (vertical length in cm)
  • Length2 (diagonal length in cm)
  • Length3 (cross length in cm)
  • Height (in cm)
  • Width (diagonal width in cm)

The species name of the fish is also given.

Part A: Read the data from the CSV file of your choosing into a Pandas DataFrame. If you are reading in Fish.csv, I would recommend dropping the species column as it is non-numerical.

Also, make sure to re-order the columns so that the response variable is the last column.
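One possible way to approach Part A is sketched below. It assumes the file's header row is Species, Weight, Length1, Length2, Length3, Height, Width; that column order is an assumption, not something stated above. The helper name load_fish and the three in-memory rows are made up for illustration so that the snippet runs without the real file.

```python
import io

import pandas as pd

# Made-up rows standing in for Fish.csv; header names and order are assumed.
SAMPLE_CSV = """Species,Weight,Length1,Length2,Length3,Height,Width
A,100.0,10.0,11.0,12.0,4.0,2.0
B,200.0,15.0,16.5,18.0,6.0,3.0
C,300.0,20.0,22.0,24.0,8.0,4.0
"""


def load_fish(source):
    """Read the CSV, drop the Species column, and move Weight to the end."""
    df = pd.read_csv(source)
    df = df.drop(columns=["Species"])  # non-numerical column
    feature_cols = [c for c in df.columns if c != "Weight"]
    return df[feature_cols + ["Weight"]]  # response variable last


# With the real file this would be: fish = load_fish("Fish.csv")
fish = load_fish(io.StringIO(SAMPLE_CSV))
print(fish.columns.tolist())
```

Selecting columns by name rather than by position keeps the re-ordering correct even if the file lists its columns in a different order.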

[ ]:

 

Part B: Make separate scatter plots for each feature versus the response. From these plots, we will try and make inferences about which features appear to have a relationship with the response variable. Write a brief summary of what you notice in each plot. Do you notice any trends in the data?
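A minimal sketch of the plotting step follows, using matplotlib with one panel per feature. The function name plot_features_vs_weight is hypothetical, and the DataFrame is synthetic random data standing in for the real measurements, so the shapes in these plots say nothing about the actual fish.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

FEATURES = ["Length1", "Length2", "Length3", "Height", "Width"]


def plot_features_vs_weight(df, features=FEATURES, response="Weight"):
    """Draw one scatter panel per feature against the response."""
    fig, axes = plt.subplots(
        1, len(features), figsize=(4 * len(features), 3.5), squeeze=False
    )
    for ax, name in zip(axes[0], features):
        ax.scatter(df[name], df[response], s=12)
        ax.set_xlabel(name)
        ax.set_ylabel(response)
        ax.set_title(f"{name} vs {response}")
    fig.tight_layout()
    return fig


# Synthetic stand-in data (NOT the Fish data), only to make the sketch runnable.
rng = np.random.default_rng(0)
demo = pd.DataFrame({name: rng.uniform(5, 40, size=30) for name in FEATURES})
demo["Weight"] = 10 * demo["Length3"] + rng.normal(0, 20, size=30)

fig = plot_features_vs_weight(demo)
```

In a notebook the figure is usually rendered when the cell finishes; in a script, a call to plt.show() would be needed.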

[ ]:

 

Part C: Use stats.linregress to fit simple linear regression models to the data. Fit a separate SLR model for each feature.

Further documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html

Once you have fit each model, report the following information about each model:

  • intercept value
  • slope value
  • p-value
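The loop below is one way to collect the three requested quantities for every feature. The helper fit_slr_models is a hypothetical name, and the data are synthetic, generated so that only Length3 is actually related to Weight; the printed numbers therefore do not describe the real dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

FEATURES = ["Length1", "Length2", "Length3", "Height", "Width"]


def fit_slr_models(df, features=FEATURES, response="Weight"):
    """Fit one simple linear regression per feature and tabulate the results."""
    rows = []
    for name in features:
        res = stats.linregress(df[name], df[response])
        rows.append(
            {
                "feature": name,
                "intercept": res.intercept,
                "slope": res.slope,
                "p_value": res.pvalue,
            }
        )
    return pd.DataFrame(rows).set_index("feature")


# Synthetic stand-in data (NOT the Fish data): Weight depends on Length3 only.
rng = np.random.default_rng(0)
demo = pd.DataFrame({name: rng.uniform(5, 40, size=50) for name in FEATURES})
demo["Weight"] = 3.0 * demo["Length3"] + 7.0 + rng.normal(0, 1.0, size=50)

summary = fit_slr_models(demo)
print(summary)
```

Note that stats.linregress takes the explanatory variable first and the response second.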

[ ]:

 

Part D: Use the SLR model from Part C for Length3 versus Weight to estimate the weight of a fish whose measurement for Length3 = 31 cm.
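For a fitted SLR model the estimate is simply intercept + slope × 31. The sketch below shows that calculation on toy numbers that lie exactly on the line y = 2x + 5; they are not fish measurements, and predict_slr is a hypothetical helper name.

```python
from scipy import stats


def predict_slr(fit, x_new):
    """Evaluate the fitted line at x_new."""
    return fit.intercept + fit.slope * x_new


# Toy data lying exactly on y = 2x + 5 (NOT the Fish data).
x = [10.0, 20.0, 30.0, 40.0]
y = [25.0, 45.0, 65.0, 85.0]

fit = stats.linregress(x, y)
estimate = predict_slr(fit, 31.0)  # 2 * 31 + 5 for this toy line
print(estimate)
```

With the real data, fit would be the linregress result for Length3 versus Weight, and the same call would give the estimated weight in grams.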

[ ]:

 

Part E: Looking at all 5 SLR models from Part C, what do you notice about the p-values? What inferences could you make from this information?
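The p-value reported by stats.linregress comes from a hypothesis test whose null hypothesis is that the slope is zero, so a small p-value is evidence of a linear relationship between that feature and the response. A common way to read the values is to compare each one against a chosen significance level. The sketch below assumes a level of 0.05 and uses made-up placeholder p-values, not results from the fish data; flag_significant is a hypothetical helper.

```python
def flag_significant(p_values, alpha=0.05):
    """Return True for each feature whose p-value is below alpha."""
    return {name: p < alpha for name, p in p_values.items()}


# Placeholder p-values for illustration only (NOT fitted results).
placeholder_p_values = {"feature_a": 1e-12, "feature_b": 0.30}

flags = flag_significant(placeholder_p_values)
print(flags)
```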

 

 

Part F: Now, let's fit a multiple linear regression model! We will use statsmodels for this task. Execute the following cell to import the required package. Use sm.OLS(...).fit() to accomplish this. Then use model.params to print the regression coefficients to the screen.

Further documentation: https://www.statsmodels.org/stable/generated/statsmodels.regression.linear_model.OLS.html

Finally, explicitly write out the MLR model using the coefficients that you found so that you have an answer of the form:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$$

[ ]:

import statsmodels.api as sm 

[ ]:

 

 

 

Part G: Based on your MLR model in Part F, use the full model to predict the fish weight when the following features are observed:

  • Length1: 26 cm
  • Length2: 28 cm
  • Length3: 31 cm
  • Height: 9 cm
  • Width: 4 cm
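The prediction is the intercept plus the sum of each coefficient times its observed value. The sketch below applies that formula to the observation listed above. The coefficients are obvious placeholders (intercept 0, every slope 1) and must be replaced with the fitted values; predict_mlr is a hypothetical helper, and the intercept key "const" is assumed to match what sm.add_constant produces.

```python
def predict_mlr(params, observation):
    """Intercept plus the sum of coefficient * observed value."""
    return params["const"] + sum(
        params[name] * value for name, value in observation.items()
    )


# Observation from Part G (all values in cm).
observation = {
    "Length1": 26.0,
    "Length2": 28.0,
    "Length3": 31.0,
    "Height": 9.0,
    "Width": 4.0,
}

# Placeholder coefficients (NOT fitted values); substitute model.params here.
placeholder_params = {
    "const": 0.0,
    "Length1": 1.0,
    "Length2": 1.0,
    "Length3": 1.0,
    "Height": 1.0,
    "Width": 1.0,
}

prediction = predict_mlr(placeholder_params, observation)
print(prediction)
```

Because the helper only indexes by label, a pandas Series such as model.params can be passed in place of the placeholder dictionary.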
