Question Five: [10 marks]
Students in a statistical experiment have been classified into two groups based on the similarity of their letter grades in two subjects: Math and English as follows:
Group A                        Group B
student Id   Math   English    student Id   Math   English
s01          A      C          s03          A      B
s02          C      A          s04          C      B
a. Find the Euclidean distance between a new student with id s20, who scored B in Math and C in English, and each of the 4 students above. Letter grades translate to points as follows: A = 4; B = 3; C = 2. [2 marks]
b. To which student(s) is s20 most similar? [1 mark]
c. In which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 that uses simple majority voting classify s20? And why? [2 marks]
d. In which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 that uses weighted voting classify s20? And why? [2 marks]
e. Suppose we applied the KNN algorithm to each member of a larger training set of students in the two groups (s01-s18) and obtained the results shown below. Based on these results, which value of k would you choose? And why? [3 marks]
                      Correctly classified?
student Id   Group    k=3   k=5   k=7
s01          A        Y     Y     Y
s02          A        Y     Y     Y
s03          A        Y     Y     Y
s04          A        Y     Y     Y
s05          A        N     N     N
s06          A        N     N     N
s07          A        Y     Y     Y
s08          A        Y     Y     Y
s09          A        Y     Y     Y
s10          A        Y     Y     Y
s11          B        N     Y     N
s12          B        Y     Y     Y
s13          B        Y     Y     Y
s14          B        N     Y     Y
s15          B        Y     Y     Y
s16          B        Y     Y     Y
s17          B        Y     Y     Y
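
The arithmetic behind parts a to e can be checked with a short Python sketch. It is only an illustration under stated assumptions, not an official solution: the question does not say how "weighted voting" is weighted, so the sketch assumes each neighbor's vote counts as 1/distance. The names POINTS, TRAINING, distances, knn_vote and correct_counts are hypothetical helpers introduced here, and the Y/N strings are transcribed from the rows listed in the results table.

```python
import math
from collections import defaultdict

# Grade-to-points mapping given in part a.
POINTS = {"A": 4, "B": 3, "C": 2}

# (Math grade, English grade, group) for the four students in the first table.
TRAINING = {
    "s01": ("A", "C", "A"),
    "s02": ("C", "A", "A"),
    "s03": ("A", "B", "B"),
    "s04": ("C", "B", "B"),
}

def to_points(math_grade, english_grade):
    """Convert a pair of letter grades into a numeric (Math, English) point."""
    return (POINTS[math_grade], POINTS[english_grade])

def distances(query):
    """Euclidean distance from the query grades to every training student."""
    q = to_points(*query)
    return {sid: math.dist(q, to_points(m, e))
            for sid, (m, e, _) in TRAINING.items()}

def knn_vote(query, k, weighted=False):
    """Vote totals per group among the k nearest students.

    ASSUMPTION: weighted voting uses weight = 1 / distance. The question
    does not specify the weighting. A distance of 0 would need special
    handling; it does not occur for s20.
    """
    d = distances(query)
    nearest = sorted(d, key=d.get)[:k]
    votes = defaultdict(float)
    for sid in nearest:
        group = TRAINING[sid][2]
        votes[group] += 1.0 / d[sid] if weighted else 1.0
    return dict(votes)

# Part e: one Y/N per listed row, in table order (s01 first), for each k.
RESULTS = {
    3: "YYYYNNYYYYNYYNYYY",
    5: "YYYYNNYYYYYYYYYYY",
    7: "YYYYNNYYYYNYYYYYY",
}

def correct_counts():
    """Number of correctly classified students for each value of k."""
    return {k: column.count("Y") for k, column in RESULTS.items()}

s20 = ("B", "C")
print(distances(s20))                   # parts a and b
print(knn_vote(s20, 3))                 # part c: simple majority
print(knn_vote(s20, 3, weighted=True))  # part d: 1/distance weights (assumed)
print(correct_counts())                 # part e
```

Under these assumptions, s01 is the closest student (distance 1), while the three nearest neighbors are s01 from Group A and s03 and s04 from Group B, so both voting schemes favor Group B. The weighted result depends on the assumed scheme: with 1/distance² weights instead, the two groups would tie at 1.0 each. For part e, the tally counts only the rows actually listed in the table.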
