Question: Write code in Python to demonstrate feature engineering skills by building a regression model to predict the price of houses in Bengaluru city.

Problem Statement: Write code in Python to demonstrate feature engineering skills by building a regression model to predict the price of houses in Bengaluru city.
The objective of this assignment is to showcase machine learning skills, particularly feature engineering, by leveraging primary and secondary datasets to develop a regression model that predicts house prices in Bengaluru. This predictive model will help users make purchasing/renting decisions by predicting fair housing prices.
Metric to measure:
The measure of accuracy will be RMSE (root mean square error). The predicted Price for each house in the test dataset will be compared with the actual Price to calculate the RMSE value of the entire prediction. The lower the RMSE value, the better the model will be.
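The metric described above can be sketched in a few lines. This is a minimal illustration rather than the official scoring script; the function name `rmse` and the sample prices are assumptions made up for the example.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences of prices."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one value is required")
    # Square each error, average the squares, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Illustrative prices in Lakhs (made-up numbers, not taken from the dataset).
actual_prices = [50.0, 80.0, 120.0]
predicted_prices = [53.0, 76.0, 120.0]
score = rmse(actual_prices, predicted_prices)
```

Because the errors are squared before averaging, a single badly mispredicted expensive property can dominate the score, which is one reason outlier handling matters for this metric.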
Submission File Format:
1. Submission Data: A CSV submission file with exactly two columns, ID and Price.
2. Submission Code: The Python code that builds the model and generates the submission file.
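A submission file in the format above could be produced roughly as follows. The IDs and predicted values here are placeholders invented for illustration; in a real run they would come from the test data and the trained model. The example writes to an in-memory buffer so that it runs without touching the disk.

```python
import io
import pandas as pd

# Hypothetical IDs and predictions; placeholders for real model output.
test_ids = [0, 1, 2]
predicted_price = [62.5, 110.0, 48.75]

# Exactly two columns, named as the submission format requires.
submission = pd.DataFrame({"ID": test_ids, "Price": predicted_price})

# index=False keeps the pandas row index out of the file.
# Pass a path such as "submission.csv" instead of the buffer to write a file.
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Omitting `index=False` would add an unnamed third column, which would break the two-column requirement.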
Dataset Description:
The housing price dataset of Bengaluru city is provided as the train dataset (train.csv). Along with the train dataset, two external reference datasets (avg_rent.csv and dist_from_city_centre.csv) are given for further feature engineering.
The datasets and their features are detailed below.
train.csv:
area_type: The type of area that the 'total_sqft' feature measures.
availability: The availability date or availability status of the property.
location: The locality of the property in Bengaluru city.
size: The size of the housing property in BHK (or Bedrooms etc.,).
society: The name of the Apartment. This name is encrypted for confidentiality.
total_sqft: The area of the property, measured according to 'area_type'.
bath: Number of bathrooms available in the house.
balcony: Number of balconies the house has.
price: Price of the housing property in Lakhs. (target feature)
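Several of the features listed above are likely to arrive as text and need converting to numbers before modelling. The sketch below shows one possible way to do that for `size` and `total_sqft`. It assumes, without having confirmed it against the data, that `size` values look like "2 BHK" or "4 Bedroom" and that `total_sqft` is either a single number or a range such as "1000 - 1200". The helper names `parse_bedrooms` and `parse_sqft` are hypothetical.

```python
import re

def parse_bedrooms(size):
    """Return the leading integer of a size string such as '2 BHK', else None."""
    if not isinstance(size, str):
        return None  # missing values are typically not strings
    match = re.match(r"\s*(\d+)", size)
    return int(match.group(1)) if match else None

def parse_sqft(value):
    """Convert a total_sqft entry to float; average a 'low - high' range."""
    if isinstance(value, (int, float)):
        return float(value)
    if not isinstance(value, str):
        return None
    parts = [p.strip() for p in value.split("-")]
    try:
        numbers = [float(p) for p in parts]
    except ValueError:
        # Entries carrying unit text or other formats need separate handling.
        return None
    if len(numbers) == 1:
        return numbers[0]
    if len(numbers) == 2:
        return sum(numbers) / 2  # midpoint of the assumed range
    return None
```

These helpers could then be applied column-wise, for example `train_df["size"].apply(parse_bedrooms)`, with the result stored in a new column whose name is up to the modeller. Using the midpoint for ranges is a design choice, not something the assignment prescribes.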
The `test.csv` dataset contains similar information to `train.csv` but does not disclose the price feature. The price has to be predicted by your model.
Details of additional datasets:
avg_rent.csv:
location: The locality of the property in Bengaluru city.
avg_2bhk_rent: Average rent of a two-BHK flat in that location.
dist_from_city_centre.csv:
location: The locality of the property in Bengaluru city.
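dist_from_city: Distance of the location from the city center.

import pandas as pd

# Load the sample submission, the train/test data, and the two reference datasets.
submission_df = pd.read_csv("sample_submission.csv")
train_df = pd.read_csv("train.csv")
test_df = pd.read_csv("test.csv")
dcc_df = pd.read_csv("dist_from_city_centre.csv")
rent_df = pd.read_csv("avg_rent.csv")

# Inspect the expected columns and dtypes of the submission file.
submission_df.info()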
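Since both reference datasets share a `location` column with the main data, one natural feature engineering step is to join them onto each property. The sketch below uses tiny stand-in frames with made-up values in place of the loaded CSVs, and it assumes that location names may differ in case or surrounding whitespace between files. The `location_key` column and the `normalise_location` helper are hypothetical names introduced for the example.

```python
import pandas as pd

# Stand-in frames with invented values; the real ones come from the CSV files.
train_df = pd.DataFrame(
    {"location": ["Area A", " area a ", "Area C"], "price": [50.0, 60.0, 70.0]}
)
rent_df = pd.DataFrame(
    {"location": ["Area A", "Area B"], "avg_2bhk_rent": [15000, 20000]}
)
dcc_df = pd.DataFrame(
    {"location": ["Area A", "Area C"], "dist_from_city": [5.0, 12.0]}
)

def normalise_location(series):
    """Trim whitespace and lower-case names so the join keys line up."""
    return series.astype(str).str.strip().str.lower()

for df in (train_df, rent_df, dcc_df):
    df["location_key"] = normalise_location(df["location"])

# Left joins keep every property; locations missing from a reference file get NaN.
# validate="many_to_one" raises if a reference file lists a location twice.
merged = (
    train_df
    .merge(rent_df[["location_key", "avg_2bhk_rent"]],
           on="location_key", how="left", validate="many_to_one")
    .merge(dcc_df[["location_key", "dist_from_city"]],
           on="location_key", how="left", validate="many_to_one")
)
```

The same joins would need to be applied to the test data so that both sets carry identical features. How to fill the resulting missing values (for example with a median) is left as a modelling decision.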