Question: Please use python to write this program. Part A. k Nearest Neighbor (kNN) Supervised Learner (40 points) Write a program that performs supervised classification using

Please use python to write this program.

Please use python to write this program. Part A. k Nearest Neighbor

Part A. k Nearest Neighbor (kNN) Supervised Learner (40 points) Write a program that performs supervised classification using the kN N algorithm which assigns the majority label of the k closest stored labeled instances to a new input instance. For this project, you will experiment with two variations of kNN training which either stores all or a subset of the training instances: Store All: Store every training instance. This option makes training trivial, but has higher storage and classfi- cation costs. Store errors; Store the first k points of each class that is input. Thereafter, store an input instance only if it is misclassified by the k - NN algorithm, using only the previously stored instances. This option has relatively higher training complexity, but has lower storage and classficiation costs. Note: the points stored in the learnt model depends on the order in which the input instances are presented. You might, but is not required, to avaerage performance over several random ordering of the input sequence. You are provided one artifiial dataset of 1000 points in a file labeled-examples which contains one input instance per line in the form (class, r, y, name), where class represents the classification of a datapoint of name name and described by the real-valued attributes x and y. Provided a labeled data set, your program should perform N-fold cross validation (N = 5), where you first divide the data set into N equal-sized blocks, and then repeatedly test performance on each of the blocks in turn, while training on the remaining blocks (see Algorithm 1). Report the average training and testing classification accuracy over all the N blocks. The classification accuracy of any instance, either in the training or in the test set, is determined by comparing the provided label with the majority label of the k nearest stored instances. For 2: 3: Algorithm 1 N-fold Cross-Validation on dataset, D 1: procedure CROSS-VALIDATION(N,D) Partition D into N roughly equal splits, Di, i = 1,...,N Acctrain + 0, Acctest+ 0 for itl to N do Dtrain + D Di, Dtest+ Di Mi + trainModel(Dtrain) Acctrain + Acctrain + eval(M, Dtrain) 8: Acctest + Acctest + eval(M , Dtest) Return (Acctrain/N, Acctest/N) 5: 6: 7: call kNN learner to build model Accumulate training set accuracy Accumulate test set accuracy Return average train and test set accuracies 9: this project the trainModel call will invoke one of the kNN variants to store a subset of the training instances. Subsequently, the eval call will use those stored instances to classify the provided set of examples. In addition to the artificial dataset provided, primarily to help develop your code, you should also present results from your learning program on some real world data set from the https://archive.ics.uci.edu/ml/datasets.php. Choose data sets that with only real or integer valued attributes. Part A. k Nearest Neighbor (kNN) Supervised Learner (40 points) Write a program that performs supervised classification using the kN N algorithm which assigns the majority label of the k closest stored labeled instances to a new input instance. For this project, you will experiment with two variations of kNN training which either stores all or a subset of the training instances: Store All: Store every training instance. This option makes training trivial, but has higher storage and classfi- cation costs. Store errors; Store the first k points of each class that is input. Thereafter, store an input instance only if it is misclassified by the k - NN algorithm, using only the previously stored instances. This option has relatively higher training complexity, but has lower storage and classficiation costs. Note: the points stored in the learnt model depends on the order in which the input instances are presented. You might, but is not required, to avaerage performance over several random ordering of the input sequence. You are provided one artifiial dataset of 1000 points in a file labeled-examples which contains one input instance per line in the form (class, r, y, name), where class represents the classification of a datapoint of name name and described by the real-valued attributes x and y. Provided a labeled data set, your program should perform N-fold cross validation (N = 5), where you first divide the data set into N equal-sized blocks, and then repeatedly test performance on each of the blocks in turn, while training on the remaining blocks (see Algorithm 1). Report the average training and testing classification accuracy over all the N blocks. The classification accuracy of any instance, either in the training or in the test set, is determined by comparing the provided label with the majority label of the k nearest stored instances. For 2: 3: Algorithm 1 N-fold Cross-Validation on dataset, D 1: procedure CROSS-VALIDATION(N,D) Partition D into N roughly equal splits, Di, i = 1,...,N Acctrain + 0, Acctest+ 0 for itl to N do Dtrain + D Di, Dtest+ Di Mi + trainModel(Dtrain) Acctrain + Acctrain + eval(M, Dtrain) 8: Acctest + Acctest + eval(M , Dtest) Return (Acctrain/N, Acctest/N) 5: 6: 7: call kNN learner to build model Accumulate training set accuracy Accumulate test set accuracy Return average train and test set accuracies 9: this project the trainModel call will invoke one of the kNN variants to store a subset of the training instances. Subsequently, the eval call will use those stored instances to classify the provided set of examples. In addition to the artificial dataset provided, primarily to help develop your code, you should also present results from your learning program on some real world data set from the https://archive.ics.uci.edu/ml/datasets.php. Choose data sets that with only real or integer valued attributes

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!