Assignment (Information Technology, Data analytics)
Question
Use the 50_Startups.csv dataset to implement three simple linear regression models in Jupyter Notebook. Each model focuses on a different feature of the dataset:
Model 1: RandDSpend & Profit
Model 2: Administration & Profit
Model 3: MarketingSpend & Profit
You can use the sample code available in the Sample codes and Data section and adapt it to the dataset. For each model, produce two graphs: one showing the distribution of the data, and one showing the distribution of the data with the best-fit line.
| RandDSpend | Administration | MarketingSpend | State | Profit |
| --- | --- | --- | --- | --- |
| 165349.2 | 136897.8 | 471784.1 | New York | 192261.8 |
| 162597.7 | 151377.59 | 443898.53 | California | 191792.1 |
| 153441.51 | 101145.55 | 407934.54 | Florida | 191050.4 |
| 144372.41 | 118671.85 | 383199.62 | New York | 182902 |
| 142107.34 | 91391.77 | 366168.42 | Florida | 166187.9 |
| 131876.9 | 99814.71 | 362861.36 | New York | 156991.1 |
| 134615.46 | 147198.87 | 127716.82 | California | 156122.5 |
| 130298.13 | 145530.06 | 323876.68 | Florida | 155752.6 |
| 120542.52 | 148718.95 | 311613.29 | New York | 152211.8 |
Please help me with the code for this question. I tried to do it, but I get an error after loading the libraries and reading the CSV file. See the error below.
# Model 1: "RandDSpend" & "Profit"
X = data[['R&D Spend']]
y = data['Profit']
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 2
      1 # Model 1: "RandDSpend" & "Profit"
----> 2 X = data[['R&D Spend']]
      3 y = data['Profit']

NameError: name 'data' is not defined
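The NameError says that no variable called `data` exists in the running kernel when that cell executes. Typically the cell that reads the CSV was never run, failed, or stored the result under another name (for example `df`). Note also that the table above shows the header `RandDSpend`, while the failing cell asks for `'R&D Spend'`. Once `data` exists, a name that does not match the file's header would raise a KeyError, so it is worth printing `data.columns` first.

What follows is a minimal sketch, not the course's sample code (which is not shown here). It assumes the file sits next to the notebook and that its column names match the table above. `np.polyfit` with degree 1 is swapped in as the least-squares fit; scikit-learn's `LinearRegression` should give the same line. The names `CSV_PATH`, `SAMPLE_ROWS`, `fit_line`, `plot_model` and `results` are made up for illustration, and the fallback rows are just the nine rows quoted in the question so the sketch can run without the file.

```python
import io
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# The nine rows quoted in the question, used ONLY as a fallback so this
# sketch runs without the file. In the notebook the real CSV is read instead.
SAMPLE_ROWS = """RandDSpend,Administration,MarketingSpend,State,Profit
165349.2,136897.8,471784.1,New York,192261.8
162597.7,151377.59,443898.53,California,191792.1
153441.51,101145.55,407934.54,Florida,191050.4
144372.41,118671.85,383199.62,New York,182902
142107.34,91391.77,366168.42,Florida,166187.9
131876.9,99814.71,362861.36,New York,156991.1
134615.46,147198.87,127716.82,California,156122.5
130298.13,145530.06,323876.68,Florida,155752.6
120542.52,148718.95,311613.29,New York,152211.8
"""

CSV_PATH = "50_Startups.csv"  # assumption: file is in the notebook's folder

# This is the step the NameError points at: `data` has to be created
# (and this cell has to be run) before any later cell uses it.
if os.path.exists(CSV_PATH):
    data = pd.read_csv(CSV_PATH)
else:
    data = pd.read_csv(io.StringIO(SAMPLE_ROWS))

# Check the exact column names; adjust the feature list below if the
# file uses e.g. 'R&D Spend' instead of 'RandDSpend'.
print(data.columns.tolist())


def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def plot_model(df, feature, target="Profit"):
    """Draw the two requested graphs: the data, and the data with its fitted line."""
    x = df[feature].to_numpy(dtype=float)
    y = df[target].to_numpy(dtype=float)
    slope, intercept = fit_line(x, y)

    fig, (ax_data, ax_fit) = plt.subplots(1, 2, figsize=(12, 4))
    ax_data.scatter(x, y)
    ax_data.set_title(f"{feature} vs {target}")

    ax_fit.scatter(x, y)
    xs = np.sort(x)  # sorted so the line is drawn left to right
    ax_fit.plot(xs, slope * xs + intercept, color="red")
    ax_fit.set_title(f"{feature} vs {target} with best-fit line")

    for ax in (ax_data, ax_fit):
        ax.set_xlabel(feature)
        ax.set_ylabel(target)
    plt.show()
    return slope, intercept


# Model 1, Model 2 and Model 3 from the assignment, one feature each.
results = {}
for feature in ["RandDSpend", "Administration", "MarketingSpend"]:
    results[feature] = plot_model(data, feature)
    print(feature, results[feature])
```

In a notebook, run the cells from top to bottom (or use Restart & Run All) so that `data` is defined before the model cells. If the sample code requires scikit-learn, `fit_line` can be replaced by `LinearRegression().fit(df[[feature]], df[target])`, reading the slope from `coef_[0]` and the intercept from `intercept_`.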
