Question: Problem Statement: Write Python code to demonstrate feature engineering skills by building a regression model to predict the price of houses in Bengaluru city.
The objective of this assignment is to showcase machine learning skills, particularly feature engineering, by leveraging primary and secondary datasets to develop a regression model for predicting house prices in Bengaluru. This predictive model will help in making purchasing/renting decisions by predicting fair housing prices.
Metric to measure:
The measure of accuracy will be RMSE (root mean square error). The predicted Price for each house in the test dataset will be compared with the actual Price to calculate the RMSE value of the entire prediction. The lower the RMSE value, the better the model.
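As a sketch of the metric described above: RMSE is the square root of the mean squared difference between predicted and actual prices. The function name `rmse` and the sample values are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up prices in lakhs, only to exercise the function.
actual_prices = [50.0, 80.0, 120.0]
predicted_prices = [53.0, 76.0, 120.0]
score = rmse(actual_prices, predicted_prices)
```

Because the errors are squared before averaging, a few large misses raise the score more than many small ones.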
Submission File Format:
Submission Data: A CSV submission file. This file should have exactly two columns: ID and Price.
Submission Code: Python code that builds the model and generates the submission file.
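The two-column layout could be produced along these lines. This is a minimal sketch: the helper name `build_submission`, the ID values, and the predicted prices are hypothetical, and where the real IDs come from (for example, the sample submission file) is an assumption to check against the provided data.

```python
import pandas as pd

def build_submission(ids, predicted_prices):
    """Return a DataFrame with exactly the two required columns, ID and Price."""
    ids = list(ids)
    predicted_prices = list(predicted_prices)
    if len(ids) != len(predicted_prices):
        raise ValueError("one predicted price is needed per ID")
    return pd.DataFrame({"ID": ids, "Price": predicted_prices})

# Hypothetical IDs and predictions, for illustration only.
submission = build_submission([0, 1, 2], [55.2, 71.0, 130.5])

# With no path, to_csv returns the CSV text; pass a file path to write to disk.
csv_text = submission.to_csv(index=False)
```

Passing `index=False` keeps the pandas row index out of the file, so the CSV has only the two expected columns.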
Dataset Description:
A housing price dataset of Bengaluru city is provided as the train dataset, train.csv. Along with the train dataset, two external reference datasets, avgrent.csv and distfromcitycentre.csv, are given for further feature engineering.
The datasets and their features are detailed below.
train.csv:
areatype: The type of area that the 'totalsqft' feature specifies.
availability: The availability date or availability status of the property.
location: The locality of the property in Bengaluru city.
size: The size of the housing property in BHK, bedrooms, etc.
society: The name of the Apartment. This name is encrypted for confidentiality.
totalsqft: The area of the property, measured as the type given in 'areatype'.
bath: Number of bathrooms available in the house.
balcony: Number of balconies the house has.
price: Price of the housing property in lakhs (target feature).
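Several of the features listed above are text and need to be converted to numbers before modelling. The sketch below shows one possible way; the value formats in the toy rows ("2 BHK", "Ready To Move", and so on), the helper name `add_basic_features`, and the derived column names are all assumptions, and the column names follow the spelling used in this description, which may differ in the actual files.

```python
import pandas as pd

# Toy rows; the value formats are assumptions about how the raw columns look.
toy = pd.DataFrame({
    "size": ["2 BHK", "4 Bedroom", "3 BHK"],
    "availability": ["Ready To Move", "19-Dec", "Ready To Move"],
    "totalsqft": ["1056", "2600", "not given"],
    "bath": [2, 5, 3],
})

def add_basic_features(df):
    """Derive numeric features from the text columns (hypothetical helper)."""
    out = df.copy()
    # Leading integer of "2 BHK" / "4 Bedroom" as a numeric room count.
    out["rooms"] = out["size"].str.extract(r"(\d+)", expand=False).astype(float)
    # 1 when the availability text mentions "ready", otherwise 0.
    out["is_ready"] = (
        out["availability"].str.contains("ready", case=False, na=False).astype(int)
    )
    # Area strings that are not plain numbers become NaN instead of raising.
    out["sqft"] = pd.to_numeric(out["totalsqft"], errors="coerce")
    # Simple ratio feature: bathrooms per room.
    out["bath_per_room"] = out["bath"] / out["rooms"]
    return out

features = add_basic_features(toy)
```

Rows where `sqft` ends up as NaN would still need a decision (drop, impute, or parse a richer format) once the real value formats are known.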
The test.csv dataset contains similar information to train.csv but does not disclose the price feature. The price has to be predicted through your model.
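The train-then-predict flow can be sketched with a plain ordinary least squares baseline. This is only an illustration under stated assumptions: the helper names `fit_linear` and `predict_linear` are hypothetical, the toy numbers are constructed to be exactly linear so the result is easy to check, and a library regressor could be swapped in for the real task.

```python
import numpy as np
import pandas as pd

def fit_linear(X, y):
    """Least-squares coefficients for y = intercept + X @ weights."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply fitted coefficients (intercept first) to new rows."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coef

# Toy data built as price = 5 + 0.04 * sqft + 10 * rooms (not real prices).
toy_train = pd.DataFrame({
    "sqft": [1000.0, 1500.0, 2000.0, 1200.0],
    "rooms": [2.0, 3.0, 4.0, 2.0],
    "price": [65.0, 95.0, 125.0, 73.0],
})
# Test rows have the same features but no price column.
toy_test = pd.DataFrame({"sqft": [1800.0], "rooms": [3.0]})

feature_cols = ["sqft", "rooms"]
coef = fit_linear(toy_train[feature_cols].to_numpy(), toy_train["price"].to_numpy())
predicted = predict_linear(coef, toy_test[feature_cols].to_numpy())
```

The same feature columns, built the same way, must be present in both train and test rows before predicting.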
Details of the additional datasets:
avgrent.csv:
location: The locality of the property in Bengaluru city.
avgbhkrent: Average rent of a two-BHK flat in that location.
distfromcitycentre.csv:
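location: The locality of the property in Bengaluru city.
distfromcity: Distance of the location from the city center.

import pandas as pd

# Load the sample submission, the train/test sets, and the two reference datasets.
submissiondf = pd.read_csv("samplesubmission.csv")
traindf = pd.read_csv("train.csv")
testdf = pd.read_csv("test.csv")
dccdf = pd.read_csv("distfromcitycentre.csv")
rentdf = pd.read_csv("avgrent.csv")
submissiondf.info()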
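Since both reference datasets share the location column with the train data, one way to use them is a left join on location. The sketch below uses small stand-in frames with made-up rent and distance values; the helper name `attach_location_features`, the whitespace/case normalisation of the key, and the median fill for unmatched locations are assumptions, not requirements of the assignment.

```python
import pandas as pd

# Stand-in frames with made-up values; the real ones come from the CSV files.
toy_train = pd.DataFrame({
    "location": ["Whitefield", "Hebbal", "Unknown Layout"],
    "price": [80.0, 95.0, 60.0],
})
toy_rent = pd.DataFrame({"location": ["Whitefield", "Hebbal"],
                         "avgbhkrent": [25000.0, 22000.0]})
toy_dist = pd.DataFrame({"location": [" whitefield ", "Hebbal"],
                         "distfromcity": [18.0, 9.0]})

def attach_location_features(houses, rent, dist):
    """Left-join rent and distance onto houses by a normalised location key."""
    def key(series):
        # Assumption: spellings may differ only in case and surrounding spaces.
        return series.astype(str).str.strip().str.lower()

    out = houses.assign(loc_key=key(houses["location"]))
    for ref in (rent, dist):
        ref = ref.assign(loc_key=key(ref["location"])).drop(columns="location")
        # One row per location so the join cannot duplicate house rows.
        ref = ref.drop_duplicates("loc_key")
        out = out.merge(ref, on="loc_key", how="left")
    return out.drop(columns="loc_key")

merged = attach_location_features(toy_train, toy_rent, toy_dist)

# Locations missing from a reference file get that column's median as a fallback.
filled = merged.fillna({
    "avgbhkrent": merged["avgbhkrent"].median(),
    "distfromcity": merged["distfromcity"].median(),
})
```

A left join keeps every house row even when its location is absent from a reference file, which matters for the test set because every ID needs a prediction.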
