Question Five: [10 marks]

Students in a statistical experiment have been classified into two groups based on the similarity of their letter grades in two subjects: Math and English as follows:

Group A                        Group B
student Id  Math  English      student Id  Math  English
s01         A     C            s03         A     B
s02         C     A            s04         C     B

a. Find the Euclidean distance between a new student, s20, with grades B in Math and C in English, and each of the 4 students above. Letter grades translate to points as follows: A = 4; B = 3; C = 2. [2 marks]

b. To which student(s) is s20 most similar? [1 mark]

c. In which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 that uses simple majority voting classify s20? Why? [2 marks]

d. In which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 that uses weighted voting classify s20? Why? [2 marks]
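
Parts a to d can be checked with a short Python sketch. It is only an illustration under stated assumptions, not the official marking scheme: the names POINTS, STUDENTS, distances and knn_vote are made up for this example, and "weighted voting" is assumed to mean that each neighbour votes with weight 1/distance (a common choice, but the question does not specify it).

```python
import math
from collections import defaultdict

# Grade-to-points mapping given in part a.
POINTS = {"A": 4, "B": 3, "C": 2}

# (Math grade, English grade, group) for the four students in the first table.
STUDENTS = {
    "s01": ("A", "C", "A"),
    "s02": ("C", "A", "A"),
    "s03": ("A", "B", "B"),
    "s04": ("C", "B", "B"),
}


def to_point(math_grade, english_grade):
    """Convert a pair of letter grades into a 2-D point."""
    return (POINTS[math_grade], POINTS[english_grade])


def distances(query):
    """Euclidean distance from the query grades to every known student."""
    q = to_point(*query)
    return {
        sid: math.dist(q, to_point(m, e))
        for sid, (m, e, _group) in STUDENTS.items()
    }


def knn_vote(query, k=3, weighted=False):
    """Return (winning group, vote totals) for the k nearest students.

    Assumption: weighted voting uses weight 1/distance, which requires
    every distance to be non-zero (true for s20 here).
    """
    d = distances(query)
    nearest = sorted(d, key=d.get)[:k]
    votes = defaultdict(float)
    for sid in nearest:
        group = STUDENTS[sid][2]
        votes[group] += 1.0 / d[sid] if weighted else 1.0
    return max(votes, key=votes.get), dict(votes)


s20 = ("B", "C")  # Math B, English C, i.e. the point (3, 2)
print(distances(s20))
print(knn_vote(s20, k=3))
print(knn_vote(s20, k=3, weighted=True))
```

For s20 the distances come out to 1 for s01, √2 ≈ 1.414 for s03 and s04, and √5 ≈ 2.236 for s02, so s01 is the closest student. The three nearest neighbours are s01 (Group A), s03 and s04 (both Group B), and both voting schemes return Group B under the 1/distance weighting assumed here. With inverse squared distance the two groups would tie at a total weight of 1 each in exact arithmetic, so the answer to part d depends on which weighting scheme the course uses.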

e. Suppose we applied the KNN algorithm to each member of a larger training set of students in the two groups (s01-s18) and obtained the results shown below. Based on these results, which value of k would you choose, and why? [3 marks]

                   Correctly classified?
student Id  Group  k=3  k=5  k=7
s01         A      Y    Y    Y
s02         A      Y    Y    Y
s03         A      Y    Y    Y
s04         A      Y    Y    Y
s05         A      N    N    N
s06         A      N    N    N
s07         A      Y    Y    Y
s08         A      Y    Y    Y
s09         A      Y    Y    Y
s10         A      Y    Y    Y
s11         B      N    Y    N
s12         B      Y    Y    Y
s13         B      Y    Y    Y
s14         B      N    Y    Y
s15         B      Y    Y    Y
s16         B      Y    Y    Y
s17         B      Y    Y    Y
s18         B      Y    N    N
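
One way to approach part e is to compare the fraction of correctly classified students for each k. The sketch below tallies the Y/N columns of the table above; the names ROWS, KS and accuracy_by_k are hypothetical and used only for this illustration, and choosing k by raw accuracy on these 18 students is an assumption about what the question intends.

```python
# Rows copied from the table: student id, group, then Y/N for k=3, k=5, k=7.
ROWS = """
s01 A Y Y Y
s02 A Y Y Y
s03 A Y Y Y
s04 A Y Y Y
s05 A N N N
s06 A N N N
s07 A Y Y Y
s08 A Y Y Y
s09 A Y Y Y
s10 A Y Y Y
s11 B N Y N
s12 B Y Y Y
s13 B Y Y Y
s14 B N Y Y
s15 B Y Y Y
s16 B Y Y Y
s17 B Y Y Y
s18 B Y N N
""".strip().splitlines()

KS = (3, 5, 7)


def accuracy_by_k(rows):
    """Fraction of students marked Y (correctly classified) for each k."""
    correct = {k: 0 for k in KS}
    total = 0
    for line in rows:
        _sid, _group, *flags = line.split()
        total += 1
        for k, flag in zip(KS, flags):
            correct[k] += flag == "Y"
    return {k: correct[k] / total for k in KS}


acc = accuracy_by_k(ROWS)
best_k = max(acc, key=acc.get)
print(acc, best_k)
```

On the table above this gives 14/18 correct for k=3, 15/18 for k=5 and 14/18 for k=7, so k=5 has the fewest misclassifications.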
