Homework 4
Use the scikit-learn library for all the models unless another library is specified. Review the examples provided on Blackboard before attempting the homework. For most of the questions below you can modify the code in the examples provided. Please turn in a Jupyter notebook with the answers.
1. This homework is a continuation of HW3. Use the same Auto.csv dataset as in HW3 and the binary variable mpg_high_low you created in HW3.
2. Split the dataset into 75% training and 25% test, and use 10-fold cross-validation for the models below.
3. Fit an SVM model to the training set to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use an RBF kernel. Tune the cost parameter with a grid search over 10 evenly linearly spaced numbers in the range 0.1 to 100, and tune the gamma parameter by searching 10 evenly logarithmically spaced numbers with a start value of -9 and a stop value of 3 (hint: use numpy logspace). Predict mpg_high_low using the test dataset and report the Accuracy, Precision, Recall, Specificity, and F1 measure.
4. Fit a decision tree model to the training set to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Predict the mpg_high_low using the test dataset and report the Accuracy, Precision, Recall, Specificity, and F1 measure.
5. Fit a Random Forest model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use an n_estimators parameter found by searching among the values 50, 100, 200, and 500, and a max_depth parameter found by searching over the values 2, 5, 10, and 15. Predict mpg_high_low using the test dataset.
6. Fit an XGBoost model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use a learning rate found by tuning with a grid search over 10 evenly linearly spaced numbers in the range 0.1 to 1. Report the accuracy, precision, recall, specificity, F1 score, and AUC.
7. Fit a Stacked Classifier model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. The models you need to stack are SVM, decision tree, KNN, and Naive Bayes. Report the accuracy, precision, recall, specificity, and F1 score.
8. Summarize the performance of all the above models by creating a dataframe with 6 columns: Model_Name, Accuracy, Precision, Recall, Specificity, and F1 Score. The dataframe should contain one row for each model you built above, with each column filled in with the appropriate metric. Print out the dataframe. Which model performed best from an accuracy point of view, and which model performed best from a recall point of view? Of all the models you built in HW3 and HW4, which one performed best from an F1 score perspective?
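
The split, tuning, and metric steps above can be sketched as follows for the SVM item. This is only an illustration under stated assumptions: the data is a synthetic stand-in generated with make_classification (not Auto.csv), the helper name metrics_row is hypothetical, and the StandardScaler step is an added choice that the assignment does not require. Specificity has no dedicated scikit-learn scorer, so it is computed here from the confusion matrix as TN / (TN + FP).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def metrics_row(y_true, y_pred):
    """Hypothetical helper: the five requested metrics for binary 0/1 labels."""
    # With labels=[0, 1], ravel() returns tn, fp, fn, tp in that order.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "Accuracy": accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred, zero_division=0),
        "Recall": recall_score(y_true, y_pred, zero_division=0),
        "Specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "F1 Score": f1_score(y_true, y_pred, zero_division=0),
    }


# Assumption: synthetic data standing in for the Auto.csv features
# and the mpg_high_low target created in HW3.
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

# 75% training / 25% test split, stratified on the binary target.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# 10 linearly spaced C values in [0.1, 100] and 10 log-spaced gamma
# values from 10**-9 to 10**3 (logspace takes exponents as start/stop).
param_grid = {
    "svc__C": np.linspace(0.1, 100, 10),
    "svc__gamma": np.logspace(-9, 3, 10),
}

# Assumption: features are standardized before the RBF SVM.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# 10-fold cross-validated grid search on the training set only.
search = GridSearchCV(pipeline, param_grid, cv=10, scoring="accuracy")
search.fit(X_train, y_train)

metrics = metrics_row(y_test, search.predict(X_test))
print(search.best_params_)
print(metrics)
```

The same metrics_row dictionary could be collected once per model (decision tree, Random Forest, XGBoost, stacked classifier) and the list of dictionaries passed to a pandas DataFrame to build the summary table in item 8.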
