In this assignment, you will apply decision trees to recognize digits from images of MNIST handwritten digits. The dataset is included with sklearn (and also with R, if you prefer to use R). You will load the dataset and evaluate a decision tree classifier using cross-validation. Below, I describe the steps using sklearn (Python). If you want to use R instead of Python, implement the same steps as closely as the R APIs allow.

Step 1: Load the data. Use the following import: from sklearn.datasets import load_digits. Each instance is an 8x8 image (64 pixels/features), and there are 1700+ instances.
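A minimal sketch of Step 1, assuming a standard scikit-learn installation. The variable names (digits, X, y) are illustrative choices, not part of the assignment.

```python
from sklearn.datasets import load_digits

# Load the bundled digits dataset as a Bunch object.
digits = load_digits()

# data holds one flattened 64-pixel row per image; target holds the digit labels 0-9.
X, y = digits.data, digits.target

print("feature matrix shape:", X.shape)        # (n_samples, 64)
print("image array shape:", digits.images.shape)  # (n_samples, 8, 8)
print("classes:", sorted(set(y)))
```

The images attribute keeps the original 8x8 layout, which is handy later when mapping a feature index back to a pixel position.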

Step 2: Perform 5-fold cross-validation. Use from sklearn.tree import DecisionTreeClassifier and from sklearn.model_selection import cross_val_score to perform the cross-validation. Vary the max_depth parameter of DecisionTreeClassifier from 1 to 10. How does the average cross-validation error change across these depths? You can present the results in a table (or plot a graph using import matplotlib.pyplot as plt).
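One way Step 2 could be sketched is shown below. It assumes that "error" means 1 minus the mean fold accuracy (the default score returned by cross_val_score for a classifier), and it fixes random_state=0 only so that repeated runs give the same numbers; neither choice is prescribed by the assignment.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

cv_errors = {}
for depth in range(1, 11):
    # random_state is an assumption added for reproducibility.
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    # cv=5 gives five accuracy scores, one per fold.
    scores = cross_val_score(clf, X, y, cv=5)
    cv_errors[depth] = 1.0 - scores.mean()

# Print the results as a simple table.
print("max_depth  mean CV error")
for depth, err in cv_errors.items():
    print(f"{depth:>9}  {err:.3f}")

# Depth with the lowest mean cross-validation error.
best_depth = min(cv_errors, key=cv_errors.get)
print("best max_depth:", best_depth)
```

The same dictionary can be passed to matplotlib (for example, plotting its keys against its values) if a graph is preferred over a table.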

Step 4: Analyze the feature importances for the best max_depth setting (you can refer to the video and demo code). Which were the 10 most important pixels for the classifier? Were the important pixels meaningful to you?
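The feature-importance analysis might look like the following sketch. The value of best_depth here is a placeholder assumption and should be replaced with whichever depth gave the lowest cross-validation error; the row and column mapping relies on each 8x8 image being flattened row by row into 64 features.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

best_depth = 10  # placeholder: substitute the depth selected by cross-validation
clf = DecisionTreeClassifier(max_depth=best_depth, random_state=0).fit(X, y)

# One importance value per pixel; the values sum to 1.
importances = clf.feature_importances_

# Indices of the 10 largest importances, most important first.
top10 = np.argsort(importances)[::-1][:10]

for idx in top10:
    row, col = divmod(int(idx), 8)  # map flat index back to the 8x8 grid
    print(f"pixel {idx:2d} (row {row}, col {col}): importance {importances[idx]:.3f}")
```

Reshaping importances to (8, 8) and displaying it as an image is another way to judge whether the important pixels fall where digit strokes usually differ.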

Step 5: Suggest some steps you could take that could improve the performance of the classifier.
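As one illustration of a possible improvement (an assumption of this sketch, not a suggestion taken from the assignment), a single tree can be compared against an ensemble of trees under the same 5-fold cross-validation. The settings n_estimators=100, max_depth=10, and random_state=0 are arbitrary choices.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# Baseline: a single decision tree.
tree = DecisionTreeClassifier(max_depth=10, random_state=0)
# Candidate improvement: an ensemble of randomized trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)

tree_acc = cross_val_score(tree, X, y, cv=5).mean()
forest_acc = cross_val_score(forest, X, y, cv=5).mean()

print(f"single tree   mean CV accuracy: {tree_acc:.3f}")
print(f"random forest mean CV accuracy: {forest_acc:.3f}")
```

Other directions worth discussing include tuning additional tree parameters such as min_samples_leaf, or engineering features that are less sensitive to small shifts of the digit within the image.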

You can submit your notebook (or other code file) along with a PDF that presents your analysis. If you don't want to submit a PDF, you can place your analysis within the notebook itself (you can write text in the notebook).
