Question: 3 (15). You are given a dataset for a 2-class classification problem. You try three methods on it: minimum distance classifier, perceptron, and k-nearest neighbors.

3 (15). You are given a dataset for a 2-class classification problem. You try three methods on it: minimum distance classifier, perceptron, and k-nearest neighbors. You try each algorithm in different ways (e.g., changing k, changing learning rates, etc.) until you can get no further improvement. The best overall errors you get in each case are:

Minimum Distance Classifier: J = 0.4
Perceptron:
k-Nearest Neighbors: J = 0.05

All errors are from a normalized range between 0 and 1, so J = 0.4 is 40% error. Assume that each algorithm is used in its standard form and with raw data (i.e., no variable k, no scaling of data, etc.).

a) (10 points) Based only on these errors, what can you say about the distribution of the data for the two classes in feature space? Give a bullet list of all the significant things you can think of, and in each case explain why you think that based on the errors you got.

b) (5 points) Assuming a 2-dimensional feature space, draw an approximate picture that illustrates your points about how the data for the two classes is distributed.

Remember, there is no absolute "right" or "wrong" answer here. You are speculating, but your conjectures must be plausible and justifiable. You will get points for plausible conjectures, and lose points for implausible ones and those you cannot explain.
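To make the comparison in the question concrete, the following is a minimal Python sketch of how a minimum distance classifier and k-nearest neighbors can disagree sharply on the same data. Everything in it is an assumption made for illustration: the toy cluster layout (`CENTERS`, `TRAIN_OFFSETS`, `TEST_OFFSETS`), the helper names, and the choice k = 3 are hypothetical and do not come from the question. The perceptron is left out, and the error values this toy produces are not the ones quoted above. It shows one plausible kind of distribution (each class split into several compact, well-separated clusters), not the answer to the problem.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def class_means(points, labels):
    """Return the mean point of each class (the prototypes of the minimum distance classifier)."""
    sums = {}
    for (x, y), c in zip(points, labels):
        sx, sy, n = sums.get(c, (0.0, 0.0, 0))
        sums[c] = (sx + x, sy + y, n + 1)
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}


def min_distance_predict(means, x):
    """Assign x to the class whose mean is closest."""
    return min(means, key=lambda c: dist(means[c], x))


def knn_predict(points, labels, x, k=3):
    """Assign x to the majority class among its k nearest training points."""
    nearest = sorted(range(len(points)), key=lambda i: dist(points[i], x))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def error_rate(predict, points, labels):
    """Fraction of points whose predicted class differs from the true class."""
    wrong = sum(predict(p) != c for p, c in zip(points, labels))
    return wrong / len(points)


def make_clusters(centers, offsets):
    """Place the same small pattern of points around each labeled cluster center."""
    points, labels = [], []
    for (cx, cy), c in centers:
        for dx, dy in offsets:
            points.append((cx + dx, cy + dy))
            labels.append(c)
    return points, labels


# Hypothetical toy layout: each class consists of two tight clusters placed
# on opposite corners, so the two class means end up close together.
CENTERS = [((0, 0), 0), ((4, 4), 0), ((0, 4), 1), ((5, 0), 1)]
TRAIN_OFFSETS = [(0, 0), (0.5, 0.5), (0.5, -0.5), (-0.5, 0.5), (-0.5, -0.5)]
TEST_OFFSETS = [(0.25, 0), (-0.25, 0), (0, 0.25), (0, -0.25)]

train_x, train_y = make_clusters(CENTERS, TRAIN_OFFSETS)
test_x, test_y = make_clusters(CENTERS, TEST_OFFSETS)

means = class_means(train_x, train_y)
mdc_error = error_rate(lambda p: min_distance_predict(means, p), test_x, test_y)
knn_error = error_rate(lambda p: knn_predict(train_x, train_y, p), test_x, test_y)

print(mdc_error, knn_error)  # prints 0.5 0.0
```

In this toy layout the class means are (2, 2) and (2.5, 2), so the minimum distance classifier draws a single straight boundary at x = 2.25 and misclassifies one whole cluster of each class, while k-nearest neighbors only looks at the local neighborhood and classifies every test point correctly. A large gap between the two errors, as in the question, is consistent with this kind of multi-cluster, non-linearly-separable arrangement, though other arrangements could produce similar numbers.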
