Question: R STUDIO: Use the CerealNoMissing.Csv file to load the data into a data frame named cerealIDF. Create a new data frame named newcerealIDF that contains

R STUDIO:

Use the CerealNoMissing.Csv file to load the data into a data frame named cerealIDF

Create a new data frame named newcerealIDF that contains the cerealIDF data except columns 1, 2, and 3

Generate summary statistics for the numerical variables and indicate the shape and peakedness of each variable

Perform correlation on all columns in newcerealIDF and place result into a variable named cerealcorr

Use cerealcorr to display a correlation plot

Based on the correlation plot, what are the top 5 variables that exhibit correlation with other variables?

Transform the data in cerealIDF (but without columns 1, 2, and 3) to a 0-1 scale and put the result into a data frame named cerealRescaled

Run the cluster analysis (use 5 clusters)

Interpret the clusters with respect to the numerical variables used in forming the clusters. Which variables play a significant role in making the clusters different?

Given the discovered patterns, provide an appropriate name for each cluster using any subset of or all of the variables in the dataset (e.g., high sugar, low fat, or high potass)
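
The tasks above could be approached along the following lines. This is only a minimal sketch under several assumptions, not a definitive solution: it assumes columns 1 to 3 are non-numeric identifiers and every remaining column is numeric, it takes "cluster analysis" to mean k-means (the question does not name a method), and it takes "shape and peakedness" to mean skewness and excess kurtosis. The helpers `skewness`, `excess_kurtosis`, and `rescale01` are hypothetical names introduced here, and the synthetic fallback data (including its column names and values) is made up purely so the sketch runs without the real file.

```r
# Path as written in the question; adjust the case of the extension if needed.
csv_path <- "CerealNoMissing.Csv"

if (file.exists(csv_path)) {
  cerealIDF <- read.csv(csv_path)
} else {
  # ASSUMPTION: synthetic stand-in with made-up columns, used only so the
  # sketch runs when the real file is absent. It is NOT the real dataset.
  set.seed(1)
  n <- 60
  cerealIDF <- data.frame(
    name     = paste0("cereal", seq_len(n)),
    mfr      = sample(c("A", "B", "C"), n, replace = TRUE),
    type     = sample(c("C", "H"), n, replace = TRUE),
    calories = rnorm(n, 105, 20),
    protein  = rnorm(n, 2.5, 1),
    fat      = rexp(n, 1),
    sugars   = runif(n, 0, 15),
    fiber    = rexp(n, 0.5),
    potass   = rnorm(n, 95, 60)
  )
}

# Drop columns 1, 2, and 3 (assumed to be non-numeric identifiers).
newcerealIDF <- cerealIDF[, -(1:3)]

# Summary statistics, plus moment-based skewness (shape) and
# excess kurtosis (peakedness); base R has no built-ins for these two.
skewness <- function(x) { d <- x - mean(x); mean(d^3) / mean(d^2)^1.5 }
excess_kurtosis <- function(x) { d <- x - mean(x); mean(d^4) / mean(d^2)^2 - 3 }
print(summary(newcerealIDF))
shape <- data.frame(
  skewness        = sapply(newcerealIDF, skewness),
  excess_kurtosis = sapply(newcerealIDF, excess_kurtosis)
)
print(round(shape, 2))

# Pairwise Pearson correlations and a correlation plot.
cerealcorr <- cor(newcerealIDF)
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cerealcorr)
} else {
  heatmap(cerealcorr, symm = TRUE)  # base-R fallback if corrplot is missing
}

# One possible ranking: mean absolute correlation with the other variables.
off_diag <- abs(cerealcorr)
diag(off_diag) <- NA
top5 <- head(sort(rowMeans(off_diag, na.rm = TRUE), decreasing = TRUE), 5)
print(top5)

# Min-max rescaling to the 0-1 range (a constant column would divide by zero).
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))
cerealRescaled <- as.data.frame(lapply(newcerealIDF, rescale01))

# ASSUMPTION: k-means with 5 clusters; the seed makes the result repeatable.
set.seed(123)
fit <- kmeans(cerealRescaled, centers = 5, nstart = 25)
print(round(fit$centers, 2))

# Variables whose cluster centers differ most are candidates for the ones
# that separate the clusters and for naming them.
center_range <- apply(fit$centers, 2, function(v) diff(range(v)))
print(sort(center_range, decreasing = TRUE))
```

Reading the rows of `fit$centers` side by side is one way to name the clusters: a cluster whose center is near 1 on a rescaled sugar column and near 0 elsewhere might be labeled "high sugar", for instance. The mean-absolute-correlation ranking is a numeric stand-in for reading the plot by eye and may not match a visual judgment exactly.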
