Question: Use the mlba::Wine data set, available on Canvas, for this exercise. The data consist of chemical measurements of wines from 3 different wineries in the same region. The target variable is Type. Remember to omit the target variable from the dimension-reduction analysis.
(a) Perform some initial exploratory analysis on your dataset. What variables are present in the data set? Give a brief overview of the dataset.
(b) Should the data be normalized before performing principal component analysis? Explain why or why not. If you think it should be normalized, normalize the data.
(c) Provide a matrix showing the correlation coefficient of each predictor with every other predictor. Use a visualization technique to display the correlations so that the reader can see at a glance which correlations are the strongest. Discuss which sets of predictors seem to vary together.
(d) Run PCA on the variables.
(e) Determine the optimal number of components to extract. Explain the criteria you used to select the number of principal components.
(f) *Bonus* (optional): Explore the results of your PCA for however many principal components you think should be used. Which of the original attributes contribute the most to these principal components?
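
One possible way to approach parts (a) through (f) in base R is sketched below. It is only an outline under stated assumptions, not a verified solution: the helper name `run_wine_pca` and its return structure are hypothetical, the mlba package is assumed to be installed, and the only column name taken from the question is `Type`. The sketch standardizes the predictors inside `prcomp()` on the assumption that they are measured on different scales, which part (b) asks you to check, and it leaves the choice of how many components to keep (for example by eigenvalues above 1 or by cumulative variance) to the reader.

```r
# Hypothetical helper: bundles the numeric steps of parts (b)-(e).
run_wine_pca <- function(df, target = "Type") {
  # Omit the target variable and keep only numeric predictors
  predictors <- df[, setdiff(names(df), target), drop = FALSE]
  predictors <- predictors[, sapply(predictors, is.numeric), drop = FALSE]

  # (b) Per-variable variances: very different magnitudes suggest normalizing
  variances <- sapply(predictors, var)

  # (c) Correlation matrix of the predictors
  cor_mat <- cor(predictors)

  # (d) PCA on centered and scaled (standardized) predictors
  pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

  # (e) Eigenvalues and cumulative proportion of variance explained
  eigenvalues <- pca$sdev^2
  cum_var <- cumsum(eigenvalues) / sum(eigenvalues)

  list(variances = variances, cor_mat = cor_mat, pca = pca,
       eigenvalues = eigenvalues, cum_var = cum_var)
}

# Runs only if the mlba package is available (assumption: it provides Wine)
if (requireNamespace("mlba", quietly = TRUE)) {
  wine <- mlba::Wine

  # (a) Initial exploration: variable names, types, summary statistics
  str(wine)
  print(summary(wine))

  res <- run_wine_pca(wine)

  print(round(res$variances, 2))             # (b) compare scales
  print(round(res$cor_mat, 2))               # (c) correlation matrix
  heatmap(res$cor_mat, symm = TRUE)          # (c) visual display of correlations
  screeplot(res$pca, type = "lines")         # (e) scree plot
  print(round(res$eigenvalues, 3))           # (e) eigenvalues
  print(round(res$cum_var, 3))               # (e) cumulative variance explained
  print(round(res$pca$rotation[, 1:3], 2))   # (f) loadings of the first three components
}
```

Printing three components in the last line is an arbitrary illustration; the number to inspect should follow from whatever criterion is chosen in part (e).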
