This question compares different classifiers and their performance for multi-class classification on the complete MNIST dataset at http://yann.lecun.com/exdb/mnist/. You can find the data file mnist 10digits.mat in the homework folder. The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Use the number of clusters K = 10. We suggest you "standardize" the features (pixels in this case) by dividing the feature values by 255, thus mapping the range of the features from [0, 255] to [0, 1]. We are going to use the purity score as the performance metric: each cluster is assigned to the class that is most frequent in the cluster, and the accuracy of this assignment is measured as the number of correctly assigned samples divided by the size of the cluster:

purity(i) = (number of samples with the most frequent label in cluster i) / (size of cluster i)

For example, if the most frequent digit in cluster 9 is 8, then

purity(cluster 9) = (number of 8's in cluster 9) / (size of cluster 9)
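The purity definition above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `cluster_purity` and the toy labels are hypothetical and do not come from the assignment; the true labels are assumed to be non-negative integers (digits 0 to 9) and the cluster assignments integers in 0..K-1.

```python
import numpy as np

def cluster_purity(labels, assignments, n_clusters):
    """Return an array whose i-th entry is the purity of cluster i.

    Empty clusters get NaN, since their purity is undefined.
    (Hypothetical helper, not part of the assignment's code.)
    """
    labels = np.asarray(labels)
    assignments = np.asarray(assignments)
    purities = np.full(n_clusters, np.nan)
    for i in range(n_clusters):
        members = labels[assignments == i]   # true labels of samples in cluster i
        if members.size == 0:
            continue
        counts = np.bincount(members)        # frequency of each label in the cluster
        purities[i] = counts.max() / members.size
    return purities

# Toy data (made up): cluster 0 holds three 8's and one 3, cluster 1 holds two 1's.
toy_labels = np.array([8, 8, 8, 3, 1, 1])
toy_assignments = np.array([0, 0, 0, 0, 1, 1])
toy_purity = cluster_purity(toy_labels, toy_assignments, n_clusters=2)
print(toy_purity)
```

In the toy data, cluster 0 has purity 3/4 because three of its four samples share the most frequent label, 8.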

1. Use the squared-ℓ2 norm as a metric for clustering (you may base it on the code you had). Report the purity score for each cluster using Python K-means clustering code.
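One way to approach this part is a plain Lloyd-style K-means with squared-ℓ2 distances, followed by a per-cluster purity computation. The sketch below is not the expert's hidden solution. The names `kmeans_sq_l2` and `purity_per_cluster` are hypothetical, the initialization (random data points as centers) is an assumption, and the data is a small synthetic stand-in because the variable names inside the .mat file are not given in the question. For the real task, the pixel matrix would be divided by 255 in the same way and the function called with k=10.

```python
import numpy as np

def kmeans_sq_l2(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm using squared Euclidean (squared-l2) distance."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Assumption: initialise centers with k distinct random data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    x_sq = (X ** 2).sum(axis=1)[:, None]
    assign = np.full(len(X), -1)
    for _ in range(n_iter):
        # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2, computed for all pairs at once
        d2 = x_sq - 2.0 * (X @ centers.T) + (centers ** 2).sum(axis=1)[None, :]
        new_assign = d2.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break                              # assignments stable: converged
        assign = new_assign
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:               # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return assign, centers

def purity_per_cluster(labels, assign, k):
    """Fraction of each cluster taken up by its most frequent label (NaN if empty)."""
    out = np.full(k, np.nan)
    for j in range(k):
        members = labels[assign == j]
        if members.size:
            out[j] = np.bincount(members).max() / members.size
    return out

# Synthetic stand-in for pixel data: two tight groups of 16-"pixel" images in [0, 255].
rng = np.random.default_rng(1)
dark = rng.normal(30, 5, size=(50, 16)).clip(0, 255)
bright = rng.normal(220, 5, size=(50, 16)).clip(0, 255)
X = np.vstack([dark, bright]) / 255.0          # scale features to [0, 1]
y = np.array([0] * 50 + [1] * 50)

assign, centers = kmeans_sq_l2(X, k=2)
purity = purity_per_cluster(y, assign, k=2)
print(purity)
```

Because K-means depends on its starting centers, the purity values on MNIST can vary from run to run; fixing the seed, as done here, makes a reported result reproducible.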

