Question: In this problem, you will develop a model to predict whether a given car gets high or low gas mileage based on the Auto data set.

(a) Create a binary variable, mpg01, that contains a 1 if mpg contains a value above
its median, and a 0 if mpg contains a value below its median. You can compute the
median using the median() function. Note you may find it helpful to use the
data.frame() function to create a single data set containing both mpg01 and the
other Auto variables.
(b) Explore the data graphically in order to investigate the association between
mpg01 and the other features. Which of the other features seem most likely to be
useful in predicting mpg01? Scatterplots and boxplots may be useful tools to
answer this question. Describe your findings.
(c) Split the data randomly into a training set (70%) and a test set (30%). Make sure
to use set.seed(1) for reproducible results.
(d) Perform KNN on the training data, with several values of K, in order to predict
mpg01. Use only the variables that seemed most associated with mpg01 in (b).
What test errors do you obtain? Which value of K seems to perform the best on
this data set?
(e) Are the predictors you included in the KNN model on the same scale? Proceed to
scale the training and test data from parts (c)-(d) as we did in the lab. Only use the
predictors that you claimed to be useful in explaining mpg01. Repeat part (d) for
the scaled data.