Question: This exercise is a homework assignment from my Data Mining course. The link given in the question is: https://archive.ics.uci.edu/ml/datasets/spambase
PROBLEM 4
In this problem you are required to apply various classification techniques to a benchmark dataset, spambase.data, from the UCI repository. This dataset contains 57 attributes, where the last one is the class: spam (1) or non-spam (0). For further details you may visit: https://archive.ics.uci.edu/ml/datasets/spambase

Obtain 500 random splits of the dataset into training (80%) and test (20%) sets, and for each split apply all of these classification techniques:
i. Decision trees
ii. KNN
iii. Support Vector Machines
iv. Logistic Regression
v. Naive Bayes

Print a summary table showing the average values of precision, recall, F1 score and accuracy obtained from the 500 tests.
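One way to approach the task is sketched below. It is a minimal sketch, not a reference solution, and it rests on several assumptions that are not part of the assignment: spambase.data has been downloaded to a local path, whatever the column count turns out to be the last column is treated as the 0/1 label, scikit-learn is installed, every classifier uses its default hyperparameters (apart from max_iter=1000 for logistic regression), Naive Bayes is taken to mean GaussianNB, and features are standardized for KNN, SVM and logistic regression. The function names (load_spambase, scores, evaluate, sklearn_models, print_table) are hypothetical helpers introduced only for this illustration.

```python
import csv
import random
from statistics import mean


def load_spambase(path):
    """Read spambase.data (comma-separated, no header); last column is the label."""
    X, y = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            X.append([float(v) for v in row[:-1]])
            y.append(int(float(row[-1])))
    return X, y


def scores(y_true, y_pred):
    """Precision, recall, F1 and accuracy, with spam (1) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Assumption: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def evaluate(model_factories, X, y, n_splits=500, test_fraction=0.2, seed=0):
    """Average the four metrics per model over n_splits random 80/20 splits."""
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(y)))
    history = {name: [] for name in model_factories}
    for _ in range(n_splits):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in test_idx]
        y_test = [y[i] for i in test_idx]
        for name, make in model_factories.items():
            model = make()  # fresh, untrained model for every split
            model.fit(X_train, y_train)
            y_pred = [int(p) for p in model.predict(X_test)]
            history[name].append(scores(y_test, y_pred))
    return {
        name: {k: mean(run[k] for run in runs) for k in runs[0]}
        for name, runs in history.items()
    }


def sklearn_models():
    """The five classifiers, imported lazily so the helpers above need no scikit-learn."""
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    return {
        "Decision tree": lambda: DecisionTreeClassifier(),
        "KNN": lambda: make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "SVM": lambda: make_pipeline(StandardScaler(), SVC()),
        "Logistic regression": lambda: make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Naive Bayes": lambda: GaussianNB(),
    }


def print_table(results):
    """Print one row per model with the averaged metrics."""
    cols = ["precision", "recall", "f1", "accuracy"]
    print(f"{'model':<22}" + "".join(f"{c:>11}" for c in cols))
    for name, avg in results.items():
        print(f"{name:<22}" + "".join(f"{avg[c]:>11.4f}" for c in cols))
```

With the file saved locally, the summary table could be produced with print_table(evaluate(sklearn_models(), *load_spambase("spambase.data"))). Fitting an SVM 500 times on the full dataset can take a while, so it may be worth passing a small n_splits first to check that everything runs before starting the full 500 splits.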
