Question: You are tasked with building an image classifier for the MNIST dataset of handwritten digits by implementing the k-nearest neighbors (k-NN) algorithm. You will need the following:
- The MNIST dataset, available on multiple servers on the Internet. For example:
- http://yann.lecun.com/exdb/mnist/
- http://www.pymvpa.org/datadb/mnist.html
- The scikit-learn class sklearn.neighbors.KNeighborsClassifier: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html
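The files distributed at the first URL above use the IDX binary format: a 4-byte header (two zero bytes, a type code, and the number of dimensions), followed by one big-endian 32-bit size per dimension, followed by the raw data. The following is a minimal sketch of a reader for the unsigned-byte variant used by MNIST. The function names `parse_idx` and `load_idx` are hypothetical helpers, not part of any library.

```python
import gzip
import struct

import numpy as np


def parse_idx(raw: bytes) -> np.ndarray:
    """Decode an unsigned-byte IDX byte string into a NumPy uint8 array."""
    # Header: two zero bytes, a type code (0x08 = unsigned byte), dimension count.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    header_end = 4 + 4 * ndim
    shape = struct.unpack(">" + "I" * ndim, raw[4:header_end])
    # Everything after the header is the data, one byte per value.
    data = np.frombuffer(raw, dtype=np.uint8, offset=header_end)
    return data.reshape(shape)


def load_idx(path: str) -> np.ndarray:
    """Read an IDX file from disk, transparently handling .gz compression."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        return parse_idx(f.read())
```

Assuming the archive `train-images-idx3-ubyte.gz` has been downloaded to the working directory, `load_idx("train-images-idx3-ubyte.gz")` should return an array of shape (60000, 28, 28); reshaping it to (60000, 784) gives one row of pixel values per image.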
The input to your classifier program is an image containing a digit, 0-9. Your program must correctly identify the digit with an accuracy of at least 95%. Here is the outline of your task, but you will have to do a bit of research on your own (and increasingly so throughout the program) to fill in the details:
Familiarize yourself with the MNIST dataset
Familiarize yourself with the k-NN algorithm and its Python implementation in sklearn
USE PYTHON and implement the k-NN algorithm:
- Import the KNeighborsClassifier class from sklearn.neighbors.
- Be mindful of the train-test split and set the parameters accordingly (justify your choice).
- Identify the variables in the dataset and define the Euclidean distance between an element in the test set and each element in the training set.
- Calculate the distance between the test element and each of its k nearest neighbors.
- Count the occurrence of each digit within the k nearest neighbors and identify the most popular digit.
- Identify the test element as the digit voted as most popular in the set of the k nearest neighbors.
- Classify the test element accordingly (i.e. based on the popular vote).
- Calculate the error.
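The steps above can be sketched as follows. This is one possible outline under stated assumptions, not a reference solution: to stay self-contained it uses scikit-learn's bundled 8x8 digits dataset (`load_digits`) as a stand-in for MNIST, and the helper `knn_predict`, the value k=3, and the 80/20 split are illustrative choices that the task asks you to justify yourself. The manual function follows the distance, count, and vote steps; `KNeighborsClassifier` is then fitted on the same split for comparison.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each row of X_test by majority vote among its k nearest training rows."""
    preds = np.empty(len(X_test), dtype=y_train.dtype)
    for i, x in enumerate(X_test):
        # Euclidean distance from the test element to every training element.
        dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
        # Indices of the k training elements with the smallest distances.
        nearest = np.argsort(dists)[:k]
        # Count the occurrences of each digit (0-9) among the k neighbors.
        votes = np.bincount(y_train[nearest], minlength=10)
        # The most popular digit wins; ties go to the smaller digit.
        preds[i] = votes.argmax()
    return preds


# Stand-in data (assumption): 8x8 digit images, NOT the 28x28 MNIST images.
X, y = load_digits(return_X_y=True)

# Hold out 20% for testing; stratify so every digit is represented in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Manual k-NN: error is the fraction of misclassified test elements.
manual_pred = knn_predict(X_train, y_train, X_test, k=3)
manual_error = np.mean(manual_pred != y_test)

# The same classifier via scikit-learn.
clf = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
clf.fit(X_train, y_train)
sk_error = 1.0 - clf.score(X_test, y_test)

print(f"manual k-NN error:  {manual_error:.4f}")
print(f"sklearn k-NN error: {sk_error:.4f}")
```

For the actual task, X and y would be replaced by the flattened MNIST images and their labels. The per-element loop is easy to follow but slow on 60,000 training images, which is one reason to rely on the scikit-learn implementation for the full dataset.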
