Question Five: [10 marks]

Students in a statistical experiment have been classified into two groups based on the similarity of their letter grades in two subjects: Math and English as follows:

Group A                        Group B
student Id  Math  English      student Id  Math  English
s01         A     C            s03         A     B
s02         C     A            s04         C     B

a. Find the Euclidean distance between a new student with id s20, who scored B in Math and C in English, and each of the 4 students above. Letter grades translate to points as follows: A = 4; B = 3; C = 2. [2 marks]

b. To which student(s) is s20 most similar? [1 mark]

c. Into which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 and simple majority voting classify s20, and why? [2 marks]

d. Into which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 and weighted voting classify s20, and why? [2 marks]

e. Suppose we applied the KNN algorithm to each member of a larger training set of students in the two groups (s01-s18) and obtained the results shown below. Based on these results, which value of k would you choose, and why? [3 marks]

                       Correctly classified?
student Id  Group      k=3  k=5  k=7
s01         A          Y    Y    Y
s02         A          Y    Y    Y
s03         A          Y    Y    Y
s04         A          Y    Y    Y
s05         A          N    N    N
s06         A          N    N    N
s07         A          Y    Y    Y
s08         A          Y    Y    Y
s09         A          Y    Y    Y
s10         A          Y    Y    Y
s11         B          N    Y    N
s12         B          Y    Y    Y
s13         B          Y    Y    Y
s14         B          N    Y    Y
s15         B          Y    Y    Y
s16         B          Y    Y    Y
s17         B          Y    Y    Y
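
The arithmetic behind parts a to e can be checked with a short Python sketch. It is not an official solution. The helper names (`distances`, `majority_vote`, `weighted_vote`) are made up for illustration, and the weighted vote assumes each neighbour counts with weight 1/distance, which the question does not specify. The `RESULTS` strings are the Y/N columns of the results table, transcribed for the 17 rows actually listed.

```python
import math
from collections import Counter, defaultdict

# Grade-to-points mapping given in part a.
POINTS = {"A": 4, "B": 3, "C": 2}

# (group, Math, English) for the four students in the first table.
STUDENTS = {
    "s01": ("A", "A", "C"),
    "s02": ("A", "C", "A"),
    "s03": ("B", "A", "B"),
    "s04": ("B", "C", "B"),
}


def to_points(math_grade, english_grade):
    """Convert a pair of letter grades to a point vector."""
    return (POINTS[math_grade], POINTS[english_grade])


def distances(query):
    """Return [(student_id, group, distance)] sorted by distance to query."""
    q = to_points(*query)
    rows = [
        (sid, group, math.dist(q, to_points(m, e)))
        for sid, (group, m, e) in STUDENTS.items()
    ]
    return sorted(rows, key=lambda row: row[2])


def majority_vote(neighbours):
    """Simple majority: every neighbour casts one vote for its group."""
    return Counter(group for _, group, _ in neighbours).most_common(1)[0][0]


def weighted_vote(neighbours):
    """Weighted vote. ASSUMPTION: weight = 1 / distance (needs distance > 0)."""
    totals = defaultdict(float)
    for _, group, dist in neighbours:
        totals[group] += 1.0 / dist
    return max(totals, key=totals.get)


# Parts a and b: s20 scored B in Math and C in English.
ranked = distances(("B", "C"))
for sid, group, dist in ranked:
    print(f"{sid} (group {group}): {dist:.3f}")

# Parts c and d: the K = 3 nearest neighbours.
nearest3 = ranked[:3]
print("majority vote:", majority_vote(nearest3))
print("weighted vote:", weighted_vote(nearest3))

# Part e: "Correctly classified?" columns for rows s01..s17, one string per k.
RESULTS = {
    3: "YYYYNNYYYYNYYNYYY",
    5: "YYYYNNYYYYYYYYYYY",
    7: "YYYYNNYYYYNYYYYYY",
}
correct = {k: column.count("Y") for k, column in RESULTS.items()}
best_k = max(correct, key=correct.get)
print("correct per k:", correct, "best k:", best_k)
```

Under these assumptions the distances from s20 are 1 (s01), √2 (s03 and s04) and √5 (s02), so s01 is the closest student. The three nearest neighbours are s01 (Group A), s03 and s04 (both Group B), so majority voting gives Group B by 2 votes to 1, and 1/distance weighting also gives Group B (about 1.41 against 1.0). The weighting scheme matters here: with inverse squared distance the two groups would tie at 1.0 in exact arithmetic. For part e, the listed rows give 13, 15 and 14 correct classifications out of 17 for k = 3, 5 and 7, which favours k = 5.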
