Question: PLEASE ANSWER USING PYTHON ONLY 9 . 3 Predicting Prices of Used Cars ( Regression Trees ) . The file ToyotaCorolla.csv contains the data on
PLEASE ANSWER USING PYTHON ONLY
Predicting Prices of Used Cars Regression Trees The file ToyotaCorolla.csv contains the data on used cars Toyota Corolla on sale during late summer of in the Netherlands. It has records containing details on attributes, including Price, Age, Kilometers, HP and other specifications. The goal is to predict the price of a used Toyota Corolla based on its specifications. The example in Section is a subset of this dataset.
Data Preprocessing. Split the data into training and validation datasets.
a Run a fullgrown regression tree RT with outcome variable Price and pre dictors Age KM FuelType first convert to dummies HP Auto matic, Doors, QuarterlyTax, MfrGuarantee, GuaranteePeriod, Airco, Auto maticairco, CDPlayer, PoweredWindows, SportModel, and TowBar. Set randomstate
i Which appear to be the three or four most important car specifications for predicting the cars price?
ii Comparethepredictionerrorsofthetrainingandvalidationsetsbyexamining their RMS error and by plotting the two boxplots. How does the predictive performance of the validation set compare to the training set? Why does this occur?
iii. Howmightweachievebettervalidationpredictiveperformanceattheexpense of training performance?
iv CreateasmallertreebyusingGridSearchCVwithcvtofindafinetuned tree. Compared to the fullgrown tree, what is the predictive performance on the validation set?
PROBLEMS
CLASSIFICATION AND REGRESSION TREES
b Let us see the effect of turning the price variable into a categorical variable. First, create a new variable that categorizes price into bins. Now repartition the data keeping BinnedPrice instead of Price. Run a classification tree with the same set of input variables as in the RT and with BinnedPrice as the output variable. As in the less deep regression tree, create a smaller tree by using GridSearchCV with cv to find a finetuned tree.
i ComparethesmallertreegeneratedbytheCTwiththesmallertreegenerated by RT Are they different? Look at structure, the top predictors, size of tree, etc. Why?
ii Predicttheprice,usingthesmallerRTandCT,ofausedToyotaCorollawith the specifications listed in Table
TABLE
SPECIFICATIONS FOR A PARTICULAR
TOYOTA COROLLA
iii. Compare the predictions in terms of the predictors that were used, the mag nitude of the difference between the two predictions, and the advantages and disadvantages of the two methods.
Step by Step Solution
There are 3 Steps involved in it
1 Expert Approved Answer
Step: 1 Unlock
Question Has Been Solved by an Expert!
Get step-by-step solutions from verified subject matter experts
Step: 2 Unlock
Step: 3 Unlock
