Question: https://www.mediafire.com/file/dm0lueh7rn26zt2/Toyota.xlsx/file - PLEASE DOWNLOAD THE EXCEL FILE USING THIS LINK TO VIEW THE DATA SET. The dataset Toyota from JMP in the above link contains
https://www.mediafire.com/file/dm0lueh7rn26zt2/Toyota.xlsx/file - PLEASE DOWNLOAD THE EXCEL FILE USING THIS LINK TO VIEW THE DATA SET.
The dataset Toyota from JMP in the above link contains data on used cars on sale during the late summer of 2004 in the Netherlands. It has 1436 records containing details on 38 attributes, including Price, Age, Kilometers, HP, and other specifications.
a. Using excel, create a 60% Training Sample for this data. To do this, create a new column called "Partition" that has random numbers for each row using =RAND().
b. Sort the data (keeping the header rows at the top) by this new random number column. The first 60% of the rows will be your training data.
c. Use these first 60% of rows to make a regression model predicting Price with Age using the Data Analysis Toolpak.
d. What is the regression equation and what is the MSE (Mean Squared Error or Mean Squared Residual) from the ANOVA table?
e. What is the MSE of the regression equation scored on your training data (the first 60% you used to train or build the model) - this should somewhat match part d with some rounding error.
To find this - create 3 new columns of data: Prediction (this will be your regression equation), Error (actual Price - Prediction), and Squared Error (Square your Error).
f. What is the MSE of the regression equation scored on your test (or holdout data - those final 40% of your rows)?
g. What does MSE mean?? Would RMSE help interpretation?
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
