Question:

Load the handwritten zip code digits data from the ElemStatLearn package. There are two datasets: zip.train and zip.test. Take only the observations with true digits 2, 4, 6 and 8 from the training data. Plot all observations on the first two principal components and color the observations based on their true digits.
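
One way to approach this step is sketched below in R. It is a minimal sketch, not a reference solution. It assumes the ElemStatLearn package (archived on CRAN) is installed, and that zip.train is a matrix whose first column holds the digit and whose remaining 256 columns hold pixel intensities. The helper name digit_pca is hypothetical, and centering without scaling is a choice, not something the question specifies.

```r
# Hypothetical helper: keep the requested digits and run PCA on the pixels.
# Assumes column 1 of `zip` is the true digit and columns 2..257 are pixels.
digit_pca <- function(zip, digits = c(2, 4, 6, 8)) {
  keep   <- zip[, 1] %in% digits
  labels <- factor(zip[keep, 1])
  pixels <- zip[keep, -1]
  # Center the pixels; they share a common scale, so no rescaling (assumption).
  pca <- prcomp(pixels, center = TRUE, scale. = FALSE)
  list(labels = labels, pixels = pixels, pca = pca)
}

# Only runs when the archived package is actually available.
if (requireNamespace("ElemStatLearn", quietly = TRUE)) {
  data(zip.train, package = "ElemStatLearn")
  res <- digit_pca(zip.train)
  # Scatter plot of the first two principal components, colored by true digit.
  plot(res$pca$x[, 1:2], col = as.integer(res$labels), pch = 19, cex = 0.5,
       xlab = "PC1", ylab = "PC2")
  legend("topright", legend = levels(res$labels),
         col = seq_along(levels(res$labels)), pch = 19)
}
```

How well the four colors separate in this plot gives a first impression of how much digit information the leading components carry.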

Take the first 3 principal components from the PCA and treat them as 3 new covariates. Hence, you have a new dataset with 3 variables and the same number of observations as the original data. Now, perform hierarchical clustering again on this new dataset using all three linkage methods. Which one seems to match the true labels the best? You should again show the results needed to support your argument. Is this an improvement over the original hierarchical clustering performed on the 256 pixels? Comment on your findings.
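
The comparison could be set up along the following lines. This is a sketch under stated assumptions: the "three linkage methods" are taken to be complete, single and average linkage, the dendrogram is cut into 4 clusters to match the 4 digits, and agreement with the true labels is summarized by cluster purity. The helpers linkage_table, purity and compare_linkages are hypothetical names, and none of these choices is fixed by the question.

```r
# Hypothetical helper: cluster with one linkage, cut the tree into k groups,
# and cross-tabulate cluster membership against the true digits.
linkage_table <- function(x, labels, method, k = 4) {
  hc <- hclust(dist(x), method = method)
  table(cluster = cutree(hc, k = k), digit = labels)
}

# Purity: share of observations that fall in their cluster's majority digit.
purity <- function(tab) sum(apply(tab, 1, max)) / sum(tab)

# Purity for each linkage method (the choice of methods is an assumption).
compare_linkages <- function(x, labels, k = 4,
                             methods = c("complete", "single", "average")) {
  sapply(methods, function(m) purity(linkage_table(x, labels, m, k)))
}

# Only runs when the archived package is actually available.
if (requireNamespace("ElemStatLearn", quietly = TRUE)) {
  data(zip.train, package = "ElemStatLearn")
  keep   <- zip.train[, 1] %in% c(2, 4, 6, 8)
  labels <- factor(zip.train[keep, 1])
  pixels <- zip.train[keep, -1]
  scores <- prcomp(pixels)$x[, 1:3]            # first 3 principal components
  print(compare_linkages(scores, labels))      # clustering on 3 PCs
  print(compare_linkages(pixels, labels))      # clustering on all 256 pixels
  print(linkage_table(scores, labels, "complete"))
}
```

Purity alone can mislead: a linkage that puts nearly every observation into one cluster still scores the share of the most common digit, so the full contingency tables should be read alongside it.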
