Question:

4. This problem will work with iris data (available in base R). Make sure to exclude the Species column before applying clustering algorithms.
(a) Apply K-means clustering with K = 1, 2, ..., 10 via eclust(), with nstart = 50 for each. Provide the code and the plot of the total within-cluster sum of squares (WSS) progression for the resulting clustering solutions.
(b) Run the gap statistic calculations for K = 1, ..., 10, with 50 replicates each. Provide the gap statistic plot. Which K value appears to be optimal?
(c) Using external cluster validation, and comparing the results of your clustering to the actual number of distinct iris species in the data (3), is the answer in part (b) close to it?
(d) Proceed to scale the iris data, and run the gap statistic calculations for K = 1, ..., 10, with 50 replicates each. Which K value is optimal?
(e) Using external cluster validation, and comparing the results of your clustering to the actual number of distinct iris species in the data (3), is the answer in part (d) close to it?
(f) Apply K-means with optimal K selected in part (d). Compare the resulting cluster assignments to the actual species labels. Is there some correspondence?
For example, do any of your clusters contain only elements of a certain species? Or maybe a combination of two species?
Give the answer to this question using RStudio.
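
One possible way to set up parts (a) through (f) in R is sketched below. It is not a verified solution, and it rests on several assumptions: the factoextra and cluster packages are installed; "50 replicates" is read as B = 50 bootstrap samples in clusGap(); the optimal K is picked with the "firstSEmax" rule; and the external validation index is the adjusted Rand index. The helper names pick_k_by_gap(), adjusted_rand() and compare_to_species() are hypothetical, introduced only for this illustration. No results are quoted here; the chosen K and the tables should be read from your own run.

```r
# Sketch only. Assumes the factoextra and cluster packages are installed.
library(factoextra)  # eclust(), fviz_gap_stat()
library(cluster)     # clusGap(), maxSE()

iris_x <- iris[, -5]  # exclude the Species column before clustering
set.seed(123)

# (a) K-means for K = 1..10 via eclust(), nstart = 50; collect total WSS.
k_values <- 1:10
wss <- sapply(k_values, function(k) {
  eclust(iris_x, FUNcluster = "kmeans", k = k,
         nstart = 50, graph = FALSE)$tot.withinss
})
plot(k_values, wss, type = "b",
     xlab = "Number of clusters K",
     ylab = "Total within-cluster sum of squares")

# (b), (d) Hypothetical helper: gap statistic for K = 1..k_max, then pick K.
# Assumption: "50 replicates" means B = 50 bootstrap reference samples.
pick_k_by_gap <- function(x, k_max = 10, n_boot = 50) {
  gap <- clusGap(x, FUNcluster = kmeans, K.max = k_max, B = n_boot,
                 nstart = 50, iter.max = 50)
  print(fviz_gap_stat(gap))  # gap statistic plot
  maxSE(gap$Tab[, "gap"], gap$Tab[, "SE.sim"], method = "firstSEmax")
}

# (c), (e) Hypothetical helper: adjusted Rand index between two labelings,
# computed from their contingency table (1 = identical partitions).
adjusted_rand <- function(a, b) {
  tab <- table(a, b)
  n_pairs <- choose(sum(tab), 2)
  sum_cells <- sum(choose(tab, 2))
  sum_rows <- sum(choose(rowSums(tab), 2))
  sum_cols <- sum(choose(colSums(tab), 2))
  expected <- sum_rows * sum_cols / n_pairs
  (sum_cells - expected) / ((sum_rows + sum_cols) / 2 - expected)
}

# (c), (e), (f) Hypothetical helper: fit K-means with k clusters and compare
# the cluster assignments to the actual species labels.
compare_to_species <- function(x, k) {
  km <- kmeans(x, centers = k, nstart = 50, iter.max = 50)
  list(k = k,
       table = table(cluster = km$cluster, species = iris$Species),
       adjusted_rand = adjusted_rand(km$cluster, iris$Species))
}

# (b), (c): unscaled data
k_raw <- pick_k_by_gap(iris_x)
res_raw <- compare_to_species(iris_x, k_raw)
print(res_raw)

# (d), (e), (f): scaled data
iris_scaled <- scale(iris_x)
k_scaled <- pick_k_by_gap(iris_scaled)
res_scaled <- compare_to_species(iris_scaled, k_scaled)
print(res_scaled)
```

In each printed table, a row with counts in a single species column is a cluster made up of one species only, while a row spread over two columns is a cluster mixing two species. An adjusted Rand index near 1 indicates close agreement with the species labels, and a value near 0 indicates agreement no better than chance. Comparing k_raw and k_scaled with 3 addresses parts (c) and (e).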
