Question Five: [10 marks]
Students in a statistical experiment have been classified into two groups based on the similarity of their letter grades in two subjects, Math and English, as follows:
| Group A student Id | Math | English | Group B student Id | Math | English |
| s01 | A | C | s03 | A | B |
| s02 | C | A | s04 | C | B |
a. Find the Euclidean distance between a new student, s20, who scored B in Math and C in English, and each of the 4 students above. Letter grades translate to points as follows: A = 4; B = 3; C = 2. [2 marks]
b. To which student(s) is s20 most similar? [1 mark]
c. Into which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 and simple majority voting classify s20, and why? [2 marks]
d. Into which group would a K-nearest-neighbors (KNN) classification algorithm with K = 3 and weighted voting classify s20, and why? [2 marks]
e. Suppose we applied the KNN algorithm to each member of a larger training set of students in the two groups (s01-s18) and obtained the results shown below. Based on these results, which value of k would you choose, and why? [3 marks]
| student Id | Group | Correctly classified? k=3 | Correctly classified? k=5 | Correctly classified? k=7 |
| s01 | A | Y | Y | Y |
| s02 | A | Y | Y | Y |
| s03 | A | Y | Y | Y |
| s04 | A | Y | Y | Y |
| s05 | A | N | N | N |
| s06 | A | N | N | N |
| s07 | A | Y | Y | Y |
| s08 | A | Y | Y | Y |
| s09 | A | Y | Y | Y |
| s10 | A | Y | Y | Y |
| s11 | B | N | Y | N |
| s12 | B | Y | Y | Y |
| s13 | B | Y | Y | Y |
| s14 | B | N | Y | Y |
| s15 | B | Y | Y | Y |
| s16 | B | Y | Y | Y |
| s17 | B | Y | Y | Y |
| s18 | B | Y | N | N |
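The calculations asked for in parts a to e can be checked with a short script. What follows is a minimal sketch, not a model answer: the names `POINTS`, `TRAIN`, `RESULTS`, `distances`, `knn` and `accuracy` are made up for illustration, and the weighted vote assumes inverse-distance weights (1/d), which is one common choice that the question itself does not specify.

```python
import math
from collections import defaultdict

# Grade-to-points mapping given in part a.
POINTS = {"A": 4, "B": 3, "C": 2}

# Student id -> (group, Math grade, English grade), copied from the first table.
TRAIN = {
    "s01": ("A", "A", "C"),
    "s02": ("A", "C", "A"),
    "s03": ("B", "A", "B"),
    "s04": ("B", "C", "B"),
}

# Y/N outcomes for s01..s18, copied column by column from the second table.
RESULTS = {
    3: "YYYYNNYYYY" + "NYYNYYYY",
    5: "YYYYNNYYYY" + "YYYYYYYN",
    7: "YYYYNNYYYY" + "NYYYYYYN",
}


def to_points(math_grade, english_grade):
    """Convert a pair of letter grades to a numeric (Math, English) point."""
    return (POINTS[math_grade], POINTS[english_grade])


def distances(query):
    """Euclidean distance from the query grades to every training student."""
    q = to_points(*query)
    return {sid: math.dist(q, to_points(m, e)) for sid, (_, m, e) in TRAIN.items()}


def knn(query, k, weighted=False):
    """Classify the query with the k nearest students.

    Unweighted: each neighbor casts one vote.
    Weighted: each neighbor casts 1/d votes (an assumed scheme; it also
    assumes no neighbor sits at distance 0).
    """
    d = distances(query)
    nearest = sorted(d, key=d.get)[:k]
    votes = defaultdict(float)
    for sid in nearest:
        votes[TRAIN[sid][0]] += (1.0 / d[sid]) if weighted else 1.0
    return max(votes, key=votes.get), dict(votes)


def accuracy(outcomes):
    """Fraction of students marked Y (correctly classified)."""
    return outcomes.count("Y") / len(outcomes)


s20 = ("B", "C")  # B in Math, C in English
dist = distances(s20)
closest = min(dist, key=dist.get)
majority_group, majority_votes = knn(s20, k=3)
weighted_group, weighted_votes = knn(s20, k=3, weighted=True)
best_k = max(RESULTS, key=lambda k: accuracy(RESULTS[k]))

print("distances:", dist)
print("closest student:", closest)
print("majority vote:", majority_group, majority_votes)
print("weighted vote:", weighted_group, weighted_votes)
print("accuracy per k:", {k: accuracy(v) for k, v in RESULTS.items()})
print("best k:", best_k)
```

Under these assumptions, s01 is the nearest student (distance 1), the three nearest neighbors are s01, s03 and s04, both voting schemes put s20 in Group B, and k = 5 has the highest accuracy (15 of 18 correct, against 14 of 18 for k = 3 and k = 7). A different weighting scheme could change part d: with inverse-squared-distance weights, for example, the two groups would tie exactly.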
