Question: Download the data file CongressVote.arff. Open it with Notepad or WordPad and read the information about the data. Our task is to classify each record (i.e., a House member) as either a Democrat or a Republican based on his/her voting record. Note that this dataset has many missing values, labeled "?".
a. Run the Naive Bayes classifier in Weka on the data, using the default parameters. What is the 10-fold cross-validation error rate? Show the output screen with the error rate and confusion matrix.
b. Run the k-nearest neighbor classifier in Weka on the data, using the default parameters except for k. What is the 10-fold cross-validation error rate when k = 5? Since all attributes are categorical, how can the distance between two records be measured? Illustrate your answer using the following three records (records 27, 28 and 29 of the dataset). Which two of these records are closest to each other, and why?
y,n,y,n,n,n,y,y,y,n,y,n,n,n,y,y,democrat
y,y,y,n,n,n,y,y,y,n,y,n,n,n,y,y,democrat
y,n,n,y,y,n,y,y,y,n,n,y,y,y,n,y,republican
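One common way to measure distance between categorical records is to count the attributes on which they disagree (the overlap, or Hamming, distance). The Python sketch below applies that idea to the three records above. It is an illustration only, not necessarily the exact metric Weka uses internally; the function names are made up for this example, and treating a missing value ("?") as a mismatch is an assumption.

```python
# Records 27, 28 and 29 from the question; the last field is the class label.
records = {
    27: "y,n,y,n,n,n,y,y,y,n,y,n,n,n,y,y,democrat",
    28: "y,y,y,n,n,n,y,y,y,n,y,n,n,n,y,y,democrat",
    29: "y,n,n,y,y,n,y,y,y,n,n,y,y,y,n,y,republican",
}


def votes(line):
    """Return the 16 vote attributes, dropping the trailing class label."""
    return line.split(",")[:-1]


def overlap_distance(a, b):
    """Count the attributes on which two records disagree.

    Assumption: a missing value ('?') on either side counts as a mismatch.
    """
    return sum(1 for x, y in zip(a, b) if x != y or x == "?" or y == "?")


pairs = [(27, 28), (27, 29), (28, 29)]
distances = {
    (i, j): overlap_distance(votes(records[i]), votes(records[j]))
    for i, j in pairs
}

for (i, j), d in distances.items():
    print(f"records {i} and {j}: {d} differing votes out of 16")
```

Under this measure, records 27 and 28 differ on a single vote, while record 29 differs from them on 8 and 9 votes respectively, so 27 and 28 are the closest pair.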
