Question 1:

# load the data into a dataframe called housing_df
#MISSING 1 line of code
housing_df = pd.read_csv('BostonHousing.csv')

# display column/variable names
#Create a list called columns with all of the housing_df column names in it
#MISSING 1 line of code

print("Variables in the data are: ")
print(columns)

# review first 5 records in the data
print(" First 5 records in the data are:")
#MISSING 1 line of code

output:

Variables in the data are: 
['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'LSTAT', 'MEDV', 'CAT. MEDV']
 First 5 records in the data are:
      CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD  TAX  PTRATIO  LSTAT  MEDV  CAT. MEDV
0  0.00632  18.0   2.31     0  0.538  6.575  65.2  4.0900    1  296     15.3   4.98  24.0          0
1  0.02731   0.0   7.07     0  0.469  6.421  78.9  4.9671    2  242     17.8   9.14  21.6          0
2  0.02729   0.0   7.07     0  0.469  7.185  61.1  4.9671    2  242     17.8   4.03  34.7          1
3  0.03237   0.0   2.18     0  0.458  6.998  45.8  6.0622    3  222     18.7   2.94  33.4          1
4  0.06905   0.0   2.18     0  0.458  7.147  54.2  6.0622    3  222     18.7   5.33  36.2          1
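
One way the missing lines for Question 1 could be filled in is sketched below. Since BostonHousing.csv is not attached here, the sketch reads the five rows shown in the output from an in-memory string (an assumption made only so the example runs on its own); with the real file, the `pd.read_csv('BostonHousing.csv')` call from the question would be used instead.

```python
import io
import pandas as pd

# Stand-in for BostonHousing.csv (assumption): only the five rows shown in the
# output above. With the real file, call pd.read_csv('BostonHousing.csv').
csv_text = """CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,LSTAT,MEDV,CAT. MEDV
0.00632,18.0,2.31,0,0.538,6.575,65.2,4.0900,1,296,15.3,4.98,24.0,0
0.02731,0.0,7.07,0,0.469,6.421,78.9,4.9671,2,242,17.8,9.14,21.6,0
0.02729,0.0,7.07,0,0.469,7.185,61.1,4.9671,2,242,17.8,4.03,34.7,1
0.03237,0.0,2.18,0,0.458,6.998,45.8,6.0622,3,222,18.7,2.94,33.4,1
0.06905,0.0,2.18,0,0.458,7.147,54.2,6.0622,3,222,18.7,5.33,36.2,1
"""
housing_df = pd.read_csv(io.StringIO(csv_text))

# Missing line: collect the column names into a plain Python list
columns = list(housing_df.columns)

print("Variables in the data are: ")
print(columns)

# Missing line: show the first 5 records (in a notebook, housing_df.head()
# as the last line of the cell displays the same table)
print(" First 5 records in the data are:")
print(housing_df.head())
```

`housing_df.columns.tolist()` would give the same list; `list(...)` is used here only because it is the shortest form.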

Question 2:

# select columns for regression analysis
outcome = 'MEDV'
predictors = ['CRIM', 'CHAS', 'RM']

#Create a dataframe called x containing the predictor columns
#MISSING 1 line of code

#Create a dataframe (technically a series) containing the outcome variable. Call it y
#MISSING 1 line of code
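
A possible way to fill in the two missing lines for Question 2 is sketched here. The small `housing_df` below is a stand-in built from the first five rows printed in Question 1 (an assumption, so the example runs without the CSV file); only the columns needed for this step are included.

```python
import pandas as pd

# Stand-in for housing_df (assumption): the first five rows from the
# Question 1 output, restricted to the columns used in this step.
housing_df = pd.DataFrame({
    'CRIM': [0.00632, 0.02731, 0.02729, 0.03237, 0.06905],
    'CHAS': [0, 0, 0, 0, 0],
    'RM': [6.575, 6.421, 7.185, 6.998, 7.147],
    'MEDV': [24.0, 21.6, 34.7, 33.4, 36.2],
})

# select columns for regression analysis
outcome = 'MEDV'
predictors = ['CRIM', 'CHAS', 'RM']

# Missing line: indexing with a list of labels returns a DataFrame
x = housing_df[predictors]

# Missing line: indexing with a single label returns a Series
y = housing_df[outcome]

print(type(x).__name__, x.shape)
print(type(y).__name__, y.shape)
```

The difference between the two lines is the reason the question says "technically a series": a list of column names keeps the two-dimensional shape, while a single name gives a one-dimensional Series.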

Question 3:

#Create a model called housing_lm and set it to be a LinearRegression() model
#MISSING 1 line of code

# fit the regression model y on x
#MISSING 1 line of code

# print the intercept
#MISSING 1 line of code

#print the list of predictor columns and the coefficients
#MISSING 1 line of code

output:

intercept -28.81068250635914
   Predictor  coefficient
0       CRIM    -0.260724
1       CHAS     3.763037
2         RM     8.278180
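
The four missing lines for Question 3 could look roughly like the sketch below, which uses scikit-learn's `LinearRegression`. The data here is a toy frame generated from a made-up linear rule, not the Boston data (an assumption, so the example is runnable and easy to check); the printed numbers will therefore differ from the output above, which comes from the full BostonHousing.csv.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy stand-in for housing_df (NOT the Boston data): MEDV is computed from a
# made-up rule, MEDV = 1.0 - 0.5*CRIM + 2.0*CHAS + 3.0*RM, with no noise.
rng = np.random.default_rng(0)
toy_df = pd.DataFrame({
    'CRIM': rng.uniform(0, 5, 50),
    'CHAS': rng.integers(0, 2, 50),
    'RM': rng.uniform(4, 8, 50),
})
toy_df['MEDV'] = 1.0 - 0.5 * toy_df['CRIM'] + 2.0 * toy_df['CHAS'] + 3.0 * toy_df['RM']

outcome = 'MEDV'
predictors = ['CRIM', 'CHAS', 'RM']
x = toy_df[predictors]
y = toy_df[outcome]

# Missing line: create the model
housing_lm = LinearRegression()

# Missing line: fit the regression model y on x
housing_lm.fit(x, y)

# Missing line: print the intercept
print('intercept', housing_lm.intercept_)

# Missing line: print predictor names next to their coefficients
print(pd.DataFrame({'Predictor': x.columns, 'coefficient': housing_lm.coef_}))
```

Because the toy outcome is an exact linear function of the predictors, the fitted intercept and coefficients should recover the made-up rule up to floating-point error, which is a quick sanity check that the fit and the coefficient table are wired up correctly.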

Question 4:

new_df = pd.DataFrame(
    [[0.1, 0, 6]],
    columns=['CRIM', 'CHAS', 'RM']
)
new_df

output:

   CRIM  CHAS  RM
0   0.1     0   6

#Run the prediction model that you created using the above created dataframe containing
# the new predictor values. Call the results housing_lm_pred
#MISSING 1 line of code

print('Predicted value for median house price based on the model built using dataset is:', housing_lm_pred)

output:

Predicted value for median house price based on the model built using dataset is: [20.83232392]
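
A sketch of the missing prediction line for Question 4 follows. As an assumption made only to keep the example self-contained, the model is refitted on toy data built from a made-up linear rule rather than on the Boston data, so the printed value will not be 20.83.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy stand-in for the fitted model (NOT the Boston data): the outcome follows
# the made-up rule MEDV = 1.0 - 0.5*CRIM + 2.0*CHAS + 3.0*RM exactly.
rng = np.random.default_rng(0)
x = pd.DataFrame({
    'CRIM': rng.uniform(0, 5, 50),
    'CHAS': rng.integers(0, 2, 50),
    'RM': rng.uniform(4, 8, 50),
})
y = 1.0 - 0.5 * x['CRIM'] + 2.0 * x['CHAS'] + 3.0 * x['RM']
housing_lm = LinearRegression().fit(x, y)

# New observation, with the same column names and order used for fitting
new_df = pd.DataFrame([[0.1, 0, 6]], columns=['CRIM', 'CHAS', 'RM'])

# Missing line: predict returns a NumPy array with one value per row of new_df
housing_lm_pred = housing_lm.predict(new_df)

# By the made-up rule: 1.0 - 0.5*0.1 + 2.0*0 + 3.0*6 = 18.95
print('Predicted value for median house price based on the model built using dataset is:', housing_lm_pred)
```

The value reported in the output above can be checked by hand in the same way from the Question 3 coefficients: -28.8107 + (-0.260724)(0.1) + (3.763037)(0) + (8.278180)(6) ≈ 20.8323.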

Question 5:

# variables in the data
housing_df.columns

output:

Index(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'LSTAT', 'MEDV', 'CAT. MEDV'], dtype='object')

# Create a new dataframe called predictors_df with only numerical predictors
#MISSING 1-5 lines of code (many different ways we have done this before)

predictors_df.columns

output:

Index(['CRIM', 'ZN', 'INDUS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'LSTAT'], dtype='object')
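
Comparing the two column listings, the expected `predictors_df` is `housing_df` without CHAS, MEDV, and CAT. MEDV. One of the many possible ways to get there is sketched below; the one-row `housing_df` is a stand-in taken from the first record shown in Question 1 (an assumption, so the example runs without the CSV file).

```python
import pandas as pd

# Stand-in for housing_df (assumption): the first record from Question 1.
housing_df = pd.DataFrame(
    [[0.00632, 18.0, 2.31, 0, 0.538, 6.575, 65.2, 4.0900, 1, 296, 15.3, 4.98, 24.0, 0]],
    columns=['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD',
             'TAX', 'PTRATIO', 'LSTAT', 'MEDV', 'CAT. MEDV'],
)

# Drop the outcome (MEDV), the 0/1 column derived from it (CAT. MEDV),
# and the 0/1 indicator CHAS, leaving only the numerical predictors.
predictors_df = housing_df.drop(columns=['CHAS', 'MEDV', 'CAT. MEDV'])

print(predictors_df.columns)
```

`drop` returns a new dataframe and leaves `housing_df` unchanged, so the original columns remain available for later steps. Selecting by dtype alone would not work here, because CHAS and CAT. MEDV are stored as integers even though they are categorical in meaning.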
