2.1 (35 points) Linear SVM

In this sub-problem, you need to use a linear SVM to perform binary classification.

1) Load the data from arrhythmia.npy and shuffle the data points.
2) Select 80% of the data points as your training and validation set. The remaining 20% is your test set. Strictly speaking, in cross-validation the training and validation set could simply be called the "training set"; to stay consistent with the code, we still call it the "training and validation set" here.
3) Train the SVM classifier using a linear kernel. A linear SVM has a parameter C that adjusts the cost of outliers. Use a grid search to find the best parameter C*. The grid search uses 3-fold cross-validation to obtain the average training accuracy and the average validation accuracy of the linear SVM for each value of C on the training and validation set. The value C = C* that maximizes the average validation accuracy is selected as the best. Here "average" means the average accuracy over the folds in cross-validation, not the average over the different values of C.
   Hint 1: You are allowed to use svm.SVC() and GridSearchCV() in your code.
   Hint 2: You can perform the grid search on the following list of C: C ∈ {10^-6, 10^-5, 10^-4, 10^-3, 10^-2, 10^-1}
4) Draw heatmaps for the results of the grid search and find the best C* for average validation accuracy. Report the heatmaps and the best C*.
5) Use the best C* to train a linear SVM classifier on the training and validation set. Then use the trained classifier to calculate the accuracy on the test set. Report the test accuracy.
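Steps 1) to 5) above can be sketched in Python roughly as follows. This is a minimal sketch, not a reference solution: it assumes that arrhythmia.npy holds a 2-D array whose last column is the binary label (the question does not state the layout), and the helper names shuffle_and_split, grid_search_linear_svm, draw_heatmap and run_pipeline are made up for illustration. It also relies on GridSearchCV refitting the best model on the whole training and validation set, which is its default behaviour (refit=True).

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV

# Candidate values of C from Hint 2.
C_GRID = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1]


def shuffle_and_split(data, train_frac=0.8, seed=0):
    """Shuffle the rows, then split 80/20 into train+validation and test.

    Assumption: the last column is the label, all other columns are features.
    """
    rng = np.random.default_rng(seed)
    data = data[rng.permutation(len(data))]
    n_train = int(train_frac * len(data))
    X, y = data[:, :-1], data[:, -1]
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]


def grid_search_linear_svm(X_train_val, y_train_val, c_grid=C_GRID):
    """Run a 3-fold cross-validated grid search over C for a linear-kernel SVC."""
    search = GridSearchCV(
        svm.SVC(kernel="linear"),
        param_grid={"C": list(c_grid)},
        cv=3,
        scoring="accuracy",
        return_train_score=True,  # needed to get the average training accuracies
    )
    search.fit(X_train_val, y_train_val)
    return search


def draw_heatmap(scores, c_grid, title):
    """Draw a one-column heatmap of accuracies, one row per value of C."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    fig, ax = plt.subplots(figsize=(2.5, 4))
    image = ax.imshow(np.asarray(scores).reshape(-1, 1), aspect="auto")
    ax.set_yticks(range(len(c_grid)))
    ax.set_yticklabels([f"{c:g}" for c in c_grid])
    ax.set_xticks([])
    ax.set_ylabel("C")
    ax.set_title(title)
    fig.colorbar(image, ax=ax)
    return fig


def run_pipeline(data, seed=0):
    """Split the data, grid-search C, and score the refitted model on the test set."""
    X_tv, y_tv, X_test, y_test = shuffle_and_split(data, seed=seed)
    search = grid_search_linear_svm(X_tv, y_tv)
    return {
        "best_C": search.best_params_["C"],
        # Averages over the 3 folds, one entry per value of C.
        "mean_train_acc": search.cv_results_["mean_train_score"],
        "mean_val_acc": search.cv_results_["mean_test_score"],
        # best_estimator_ was refitted with C* on the full train+validation set.
        "test_acc": search.best_estimator_.score(X_test, y_test),
    }


# Possible usage, assuming arrhythmia.npy is in the working directory:
#   data = np.load("arrhythmia.npy")
#   results = run_pipeline(data)
#   draw_heatmap(results["mean_train_acc"], C_GRID, "Average training accuracy")
#   draw_heatmap(results["mean_val_acc"], C_GRID, "Average validation accuracy")
#   print(results["best_C"], results["test_acc"])
```

Note that scikit-learn names the per-fold validation scores mean_test_score in cv_results_; these are the "average validation accuracies" of the question, not the accuracy on the held-out 20% test set.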
