Question:
R STUDIO:
Use the CerealNoMissing.Csv file to load the data into a data frame named cerealIDF.
Create a new data frame named newcerealIDF that contains the cerealIDF data except columns 1, 2, and 3.
Generate summary statistics for the numerical variables and indicate the shape (skewness) and peakedness (kurtosis) of each variable.
Perform a correlation analysis on all columns in newcerealIDF and place the result into a variable named cerealcorr.
Use cerealcorr to display a correlation plot.
Based on the correlation plot, which 5 variables exhibit the strongest correlation with other variables?
Transform the data in cerealIDF (but without columns 1, 2, and 3) to a 0-1 scale and put the result into a data frame named cerealRescaled.
Run the cluster analysis (use 5 clusters).
Interpret the clusters with respect to the numerical variables used in forming the clusters. Which variables play a significant role in making the clusters different?
Given the discovered patterns, provide an appropriate name for each cluster using any subset of or all of the variables in the dataset (e.g., high sugar, low fat, or high potass).
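The steps above can be sketched in base R roughly as follows. This is a minimal sketch under stated assumptions, not a definitive solution: it assumes that every column left after dropping columns 1 to 3 is numeric, it assumes k-means as the clustering method (the question only says "cluster analysis"), it uses base R's `heatmap()` as a stand-in for a correlation plot, and it ranks variables by mean absolute correlation as one possible reading of "top 5". The helper names `skewness`, `excess_kurtosis`, `rescale01`, and `analyze_cereal` are hypothetical and introduced only for illustration.

```r
# Sketch only: base R, no extra packages. All helper names are hypothetical.

# Moment-based sample skewness (shape): > 0 right-skewed, < 0 left-skewed.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based excess kurtosis (peakedness): 0 for a normal distribution.
excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Min-max rescaling of one numeric vector to the 0-1 range.
rescale01 <- function(x) {
  rng <- range(x)
  if (rng[1] == rng[2]) return(rep(0, length(x)))  # constant column
  (x - rng[1]) / (rng[2] - rng[1])
}

analyze_cereal <- function(cerealIDF, k = 5, seed = 123) {
  # Everything except columns 1, 2, and 3.
  newcerealIDF <- cerealIDF[, -(1:3)]
  # Assumption: the remaining columns are all numeric.
  stopifnot(all(sapply(newcerealIDF, is.numeric)))

  shape <- data.frame(
    skewness        = sapply(newcerealIDF, skewness),
    excess_kurtosis = sapply(newcerealIDF, excess_kurtosis)
  )

  cerealcorr <- cor(newcerealIDF)

  # One possible ranking: mean absolute correlation with the other variables.
  abs_corr <- abs(cerealcorr)
  diag(abs_corr) <- NA
  top5 <- head(names(sort(rowMeans(abs_corr, na.rm = TRUE),
                          decreasing = TRUE)), 5)

  cerealRescaled <- as.data.frame(lapply(newcerealIDF, rescale01))

  # Assumption: k-means with k clusters; the seed makes the run repeatable.
  set.seed(seed)
  fit <- kmeans(cerealRescaled, centers = k, nstart = 25)

  # Variables whose cluster centers differ most separate the clusters most.
  center_spread <- sort(apply(fit$centers, 2, function(v) diff(range(v))),
                        decreasing = TRUE)

  list(newcerealIDF = newcerealIDF, summary = summary(newcerealIDF),
       shape = shape, cerealcorr = cerealcorr, top5 = top5,
       cerealRescaled = cerealRescaled, fit = fit,
       centers = fit$centers, center_spread = center_spread)
}

# Runs only if the data file is in the working directory.
if (file.exists("CerealNoMissing.Csv")) {
  cerealIDF <- read.csv("CerealNoMissing.Csv")
  res <- analyze_cereal(cerealIDF)
  print(res$summary)
  print(res$shape)
  heatmap(res$cerealcorr, symm = TRUE)  # stand-in for a correlation plot
  print(res$top5)
  print(res$centers)        # cluster profiles on the 0-1 scale
  print(res$center_spread)  # larger spread = stronger role in separation
}
```

To name the clusters, compare each row of `res$centers` against the others: a cluster whose center is near 1 on a sugar column and near 0 elsewhere could be labeled "high sugar", for example. The labels depend on the actual data, so none are suggested here.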
