Question:

In the package "MASS", there is a dataset called "iris". The goal of this assignment is to perform a model-based cluster analysis based on the 4 numeric variables (first four columns): Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width.

1. Create a true.id variable based on the Species variable from the data.
2. Form a data set based on the first 4 columns, i.e., without the Species variable.
3. Apply the following clustering algorithms:
   a. Hierarchical Clustering with Single Linkage
   b. Hierarchical Clustering with Complete Linkage
   c. K-means
   d. Model-based clustering from package "mclust"
4. Create a plot using the function clusplot() from package "cluster".
   HINT: clusplot(x, y, lines = 0, color = TRUE, plotchar = FALSE, main = " ") where x is the object with interval variables and y represents the factor variable.
5. Summarize the misclassification rates of each model in a table. Report your best model based on the results.
Step by Step Solution
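
One possible way to work through steps 1 to 5 is sketched below in R. It is not an official solution, and it rests on a few assumptions that are not stated in the assignment: every method is asked for exactly 3 clusters (one per species), Mclust is fixed at G = 3 instead of letting BIC choose the number of components, k-means uses nstart = 25 with an arbitrary seed, and classError() from "mclust" is used to match cluster labels to species before counting errors. Note also that iris is loaded here from R's built-in "datasets" package, which is where it ships; loading "MASS" is not needed for it.

```r
# Assumes the CRAN packages 'mclust' and 'cluster' are installed.
library(mclust)   # Mclust(), classError()
library(cluster)  # clusplot()

data(iris)                            # available in base R via the 'datasets' package

# Step 1: true class labels as integers (levels of Species are coded 1, 2, 3)
true.id <- as.integer(iris$Species)

# Step 2: keep only the four numeric measurements
x <- iris[, 1:4]

# Step 3: fit each method with 3 clusters (assumption: one cluster per species)
d <- dist(x)                          # Euclidean distances for hierarchical clustering
set.seed(1)                           # arbitrary seed so the k-means starts are reproducible
fits <- list(
  single   = cutree(hclust(d, method = "single"),   k = 3),
  complete = cutree(hclust(d, method = "complete"), k = 3),
  kmeans   = kmeans(x, centers = 3, nstart = 25)$cluster,
  mclust   = Mclust(x, G = 3, verbose = FALSE)$classification
)

# Step 4: one clusplot per method, following the hint in the assignment
for (m in names(fits)) {
  clusplot(x, fits[[m]], lines = 0, color = TRUE, plotchar = FALSE, main = m)
}

# Step 5: misclassification rate per method.
# Cluster numbers are arbitrary, so classError() finds the best mapping
# between cluster labels and true classes before computing the error rate.
rates <- sapply(fits, function(cl) classError(cl, true.id)$errorRate)
results <- data.frame(method = names(rates),
                      misclassification = round(rates, 3),
                      row.names = NULL)
print(results)
```

The method with the smallest value in the misclassification column of results is the one to report as the best model. If the assignment intends Mclust to choose the number of components itself, dropping G = 3 lets it select by BIC, which can give a different number of clusters and a different error rate.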
