Question: Exercise 2 (SPAM e-mail representation)

Exercise 2. (SPAM e-mail representation) The Spambase Data Set is a SPAM classification data set that has exactly the 57 input variables that are roughly described in Example 2.9 of the lecture.

a) In Example 2.9, we did not mention the specific words and characters that are used in the features. Use the information from the UCI Repository to give a complete description of these variables, i.e. list all the key words, etc.

b) Search the web for alternative features that can be used to describe (SPAM) e-mails. Pick one example feature set, cite the source, and describe these features.

(4 Points)
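
As orientation for part a), the 57 Spambase inputs fall into three groups according to the UCI documentation: 48 word frequencies, 6 character frequencies, and 3 statistics on runs of capital letters. The following is a minimal sketch of how such features could be computed for one e-mail. The function name `spambase_style_features` and the short lists `WORDS` and `CHARS` are hypothetical placeholders chosen for illustration, not the full UCI lists, and case-insensitive word matching is an assumption.

```python
import re

# Hypothetical, abbreviated lists for illustration only. The full Spambase
# lists (48 words, 6 characters) are given in the UCI Repository.
WORDS = ["free", "money", "george"]
CHARS = ["!", "$"]


def spambase_style_features(text):
    """Return a dict of Spambase-style features for one e-mail body."""
    # A "word" is a maximal run of alphanumeric characters.
    # Assumption: matching is done case-insensitively.
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty input
    n_chars = max(len(text), 1)

    features = {}
    # Word frequency: percentage of words in the e-mail equal to the key word.
    for w in WORDS:
        features[f"word_freq_{w}"] = 100.0 * tokens.count(w) / n_tokens
    # Character frequency: percentage of characters equal to the key character.
    for c in CHARS:
        features[f"char_freq_{c}"] = 100.0 * text.count(c) / n_chars

    # Lengths of uninterrupted sequences of capital letters.
    runs = [len(r) for r in re.findall(r"[A-Z]+", text)]
    features["capital_run_length_average"] = sum(runs) / len(runs) if runs else 0.0
    features["capital_run_length_longest"] = max(runs, default=0)
    features["capital_run_length_total"] = sum(runs)
    return features
```

Calling `spambase_style_features("FREE money!! Get FREE cash")` returns one value per entry in `WORDS` and `CHARS` plus the three capital-run statistics. With the full UCI lists substituted in, the dictionary would have 48 + 6 + 3 = 57 entries.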
