Question: Recall the dataset given in Homework 2, which includes 14 features representing the clinical conditions of 500 ICU patients and the target variable death, representing whether the patient died in the ICU or was discharged alive. Using the same dataset, now try regularized logistic regression (both L1 and L2 penalties and different C values), a KNN classifier (different numbers of neighbors you believe to be reasonable), random forests (different numbers of trees and different numbers of features to select at each split, of your selection), and a gradient boosting classifier (different numbers of trees and learning rates of your selection). BE CAREFUL: the best model should be selected using cross-validation, hence you should never evaluate methods using the test set during model selection. Also, be very careful that the standardization is done inside cross-validation so as not to end up with data snooping (recall the pipe approach discussed in class).
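The model-selection step described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions: the homework dataset is not available here, so a synthetic stand-in from `make_classification` is used, and every grid value (C values, neighbor counts, tree counts, learning rates), the 80/20 split, and the 5-fold setting are placeholder choices, not values given in the assignment. The point it shows is that putting `StandardScaler` inside a `Pipeline` makes each cross-validation fold fit the scaler on its own training part only, which avoids data snooping.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the homework data (500 rows, 14 features).
# Replace with the real feature matrix X and the binary target y (death).
X, y = make_classification(n_samples=500, n_features=14, n_informative=6,
                           random_state=0)

# Hold out a test set that is never touched during model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The scaler lives inside the pipeline, so it is refit on each CV training fold.
pipe = Pipeline([("scaler", StandardScaler()), ("clf", LogisticRegression())])

# One sub-grid per model family; all values below are illustrative placeholders.
param_grid = [
    {"clf": [LogisticRegression(solver="liblinear", max_iter=1000)],
     "clf__penalty": ["l1", "l2"],
     "clf__C": [0.01, 0.1, 1, 10]},
    {"clf": [KNeighborsClassifier()],
     "clf__n_neighbors": [3, 7, 15]},
    {"clf": [RandomForestClassifier(random_state=0)],
     "clf__n_estimators": [50, 100],
     "clf__max_features": [2, 4]},
    {"clf": [GradientBoostingClassifier(random_state=0)],
     "clf__n_estimators": [50, 100],
     "clf__learning_rate": [0.05, 0.1]},
]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)  # uses the training data only

best_params = search.best_params_
best_cv_score = search.best_score_
print("Best parameters:", best_params)
print("Best mean CV accuracy:", round(best_cv_score, 3))
```

Swapping the whole `clf` step inside the grid lets a single search compare all four model families with the same folds, so their cross-validated scores are directly comparable.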
Once you decide on the final method and the set of best parameters, refit your model on the standardized training set and evaluate the performance (accuracy) on the standardized test set. Also provide the test confusion matrix, as well as the test ROC AUC score of the best model.
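The final refit and test evaluation might look like the sketch below. It rests on assumptions: the data are again a synthetic stand-in, and the "winning" model shown (L2-regularized logistic regression with `C=1.0`) is a hypothetical placeholder for whatever cross-validation actually selects. The scaler is fit on the training set only and then applied to the test set through the pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the homework data (500 rows, 14 features).
X, y = make_classification(n_samples=500, n_features=14, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# ASSUMPTION: placeholder for the model and parameters chosen by cross-validation.
final_model = Pipeline([
    ("scaler", StandardScaler()),                      # fit on training data only
    ("clf", LogisticRegression(C=1.0, max_iter=1000)),  # default penalty is L2
])
final_model.fit(X_train, y_train)

# The pipeline applies the training-set scaling to the test set before predicting.
y_pred = final_model.predict(X_test)
y_score = final_model.predict_proba(X_test)[:, 1]  # probability of class 1

test_accuracy = accuracy_score(y_test, y_pred)
test_cm = confusion_matrix(y_test, y_pred)   # rows: true class, cols: predicted
test_auc = roc_auc_score(y_test, y_score)    # needs scores, not hard labels

print("Test accuracy:", round(test_accuracy, 3))
print("Test confusion matrix:\n", test_cm)
print("Test ROC AUC:", round(test_auc, 3))
```

Note that ROC AUC is computed from predicted probabilities rather than from the hard class labels; passing `y_pred` instead would collapse the ROC curve to a single threshold.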
