Question 4. This problem works with the iris data (available in base R). Make sure to exclude the Species column before applying clustering algorithms.
(a) Apply K-means clustering with K via eclust for nstarts each. Provide the code and the plot of the total within-cluster sum of squares (WSS) progression for the resulting clustering solutions.
(b) Run the gap statistic calculations for K with replicates each. Provide the gap statistic plot. Which K value appears to be optimal?
(c) Using external cluster validation, compare the results of your clustering to the actual number of distinct iris species in the data. Is the answer in part (b) close to it?
(d) Proceed to scale the iris data, and run the gap statistic calculations for K with replicates each. Which K value is optimal?
(e) Using external cluster validation, compare the results of your clustering to the actual number of distinct iris species in the data. Is the answer in part (d) close to it?
(f) Apply K-means with the optimal K selected in part (d). Compare the resulting cluster assignments to the actual species labels. Is there some correspondence? E.g., do any of your clusters contain only elements of a certain species? Or maybe a combination of two species?
Give the answer to this question using RStudio.
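One possible way to set up parts (a) through (f) in R is sketched below. The range of K, the number of starts, and the number of gap-statistic replicates are not recoverable from the question text above, so `k_values`, `n_start`, and `n_boot` are placeholder assumptions to be replaced with the values from the assignment. The sketch assumes the `factoextra` and `cluster` packages are installed, and it does not state which K comes out as optimal; that has to be read off the plots and output when the code is run.

```r
# Sketch only: k_values, n_start and n_boot are hypothetical placeholders,
# not the values from the original assignment.
library(cluster)     # clusGap(), maxSE()
library(factoextra)  # eclust(), fviz_gap_stat()

iris_num <- iris[, -5]  # drop the Species column

k_values <- 2:6   # assumed range of K
n_start  <- 25    # assumed number of random starts
n_boot   <- 50    # assumed number of gap-statistic replicates

# (a) K-means via eclust for each K, then plot total within-cluster SS
fits <- lapply(k_values, function(k) {
  eclust(iris_num, "kmeans", k = k, nstart = n_start, graph = FALSE)
})
wss <- sapply(fits, function(f) f$tot.withinss)
plot(k_values, wss, type = "b",
     xlab = "K", ylab = "Total within-cluster sum of squares")

# (b) Gap statistic on the unscaled data
set.seed(123)
gap_raw <- clusGap(iris_num, FUNcluster = kmeans, nstart = n_start,
                   K.max = max(k_values), B = n_boot)
fviz_gap_stat(gap_raw)

# (d) Gap statistic on the scaled data
iris_scaled <- scale(iris_num)
set.seed(123)
gap_scaled <- clusGap(iris_scaled, FUNcluster = kmeans, nstart = n_start,
                      K.max = max(k_values), B = n_boot)
fviz_gap_stat(gap_scaled)

# K suggested by the "firstSEmax" rule (one of several possible rules)
k_opt <- maxSE(gap_scaled$Tab[, "gap"], gap_scaled$Tab[, "SE.sim"],
               method = "firstSEmax")

# (c), (e), (f) External validation: cross-tabulate clusters against species
set.seed(123)
km <- kmeans(iris_scaled, centers = k_opt, nstart = n_start)
table(Cluster = km$cluster, Species = iris$Species)
```

The final cross-tabulation is what answers part (f): a row with counts in a single species column indicates a cluster made up of one species, while a row spread over two columns indicates a cluster mixing two species. For parts (c) and (e), the K read from each gap plot can be compared with `nlevels(iris$Species)`.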
