Question: Problem 2 (Coding), 12 pts
For the following tasks, use the following California House Price Data:
import pandas as pd
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
data_X = pd.DataFrame(housing.data, columns=housing.feature_names)
data_X.head()

data_y = housing.target
data_y
array([...])
Split the dataset into training and test sets in a ...:... ratio (use random_seed = ... and test_size = ...).
You must train the linear regression model using the training data and compute the MSE using the test dataset.
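The split described above could look like the following minimal sketch. The values test_size=0.2 and random_state=42 are placeholders chosen purely for illustration, since the actual ratio and seed are not legible in the problem text. A small synthetic array stands in for the housing data so the snippet runs without a download.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing features and target (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = rng.normal(size=100)

# ASSUMPTION: test_size and random_state are placeholders,
# not the values required by the assignment.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

With the real data, X and y would be replaced by data_X and data_y from the loading code above.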
Apply Multiple Linear Regression (MLR) using the normal least-squares solution. You must not use any direct or built-in package for MLR.
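For reference, the normal least-squares solution mentioned above can be derived as in the sketch below. The notation is an assumption, not taken from the problem text: X is the n x (p+1) design matrix with a leading column of ones for the intercept, y is the target vector, and beta is the coefficient vector.

```latex
\begin{align*}
J(\beta) &= \lVert y - X\beta \rVert^2 = (y - X\beta)^\top (y - X\beta) \\
         &= y^\top y - 2\,\beta^\top X^\top y + \beta^\top X^\top X \beta \\
\nabla_\beta J(\beta) &= -2\,X^\top y + 2\,X^\top X \beta = 0 \\
X^\top X \hat{\beta} &= X^\top y \\
\hat{\beta} &= (X^\top X)^{-1} X^\top y \quad \text{(if } X^\top X \text{ is invertible)}
\end{align*}
```

When X^T X is singular or ill-conditioned, the inverse can be replaced by the Moore-Penrose pseudo-inverse, which yields the minimum-norm least-squares solution.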
(a) (... pts) Check the five assumptions of MLR mentioned in class, using the training dataset, and give a proper interpretation of why each assumption is or is not met.
(b) (... pts) Derive the normal equation for linear regression.
(c) (... pts) Apply standardization to all features so that they share a consistent scale. Use 'fit_transform' for the training data and 'transform' for the test data to prevent data leakage.
(d) (... pts) Find the optimal values of the intercept and coefficients using the normal equation of linear regression on the training data. To avoid a matrix-inversion error, you may use the pseudo-inverse np.linalg.pinv.
(e) (... pts) Find ŷ_predict for each data point of the test set and show the results in a DataFrame with two columns: y_actual and ŷ_predict.
(f) Finally, for the test dataset:
    a. Calculate the coefficient of determination and interpret the result.
    b. Find the MSE (the mean of the squared residual errors).
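The standardization, normal-equation fit, prediction table, and test metrics listed above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not a graded solution: it uses synthetic data instead of the housing set, a simple positional train/test split, and hypothetical column names y_actual and y_hat_predict.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): y is linear in 3 features plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
y = 1.5 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Simple positional hold-out split for illustration (160 train rows, 40 test rows).
X_train, X_test = X[:160], X[160:]
y_train, y_test = y[:160], y[160:]

# Standardize: fit the scaler on the training data only, reuse it on the test data.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# Prepend a column of ones for the intercept, then solve the normal equation
# (X^T X) beta = X^T y using the pseudo-inverse.
A_train = np.column_stack([np.ones(len(X_train_s)), X_train_s])
A_test = np.column_stack([np.ones(len(X_test_s)), X_test_s])
beta = np.linalg.pinv(A_train.T @ A_train) @ A_train.T @ y_train
intercept, coefs = beta[0], beta[1:]

# Predictions on the test set next to the actual values.
y_hat = A_test @ beta
results = pd.DataFrame({"y_actual": y_test, "y_hat_predict": y_hat})

# Coefficient of determination and mean squared error on the test set.
ss_res = np.sum((y_test - y_hat) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
mse = ss_res / len(y_test)

print(results.head())
print(f"R^2 = {r2:.4f}, MSE = {mse:.4f}")
```

Because the features are centered on the training set, the fitted intercept equals the mean of y_train, which is a quick sanity check on the solution. An R^2 close to 1 means the model explains most of the variance in the test targets; on the real housing data the value would be expected to be noticeably lower than on this noise-light synthetic example.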
