Question: The dataset Education - Post 12th Standard.csv contains the names of various colleges. This case study is based on various parameters of those institutions, and the task is to perform Principal Component Analysis (PCA) on it.

Data file link: https://drive.google.com/file/d/1qGL40KNeZEU1s0zp59Gcd3bsQXS7z9y7/view?usp=sharing

Data Dictionary file link: https://drive.google.com/file/d/1pRo56LCw2rvCN-5YgFx0EM1cHLvNSnvT/view?usp=sharing

2.1) Perform Exploratory Data Analysis (both univariate and multivariate analysis are required). Document the inferences drawn from this analysis.

2.2) Scale the variables and explain why the chosen scaling function suits this case study.

2.3) Comment on the comparison between covariance and the correlation matrix.

2.4) Check the dataset for outliers before and after scaling. Draw your inferences from this exercise.

2.5) Build the covariance matrix and compute its eigenvalues and eigenvectors.

2.6) Write the explicit form of the first PC (in terms of the eigenvectors).

2.7) Discuss the cumulative values of the eigenvalues. How does it help you to decide on the optimum number of principal components? What do the eigenvectors indicate?

Perform PCA and export the data of the Principal Component scores into a data frame.

2.8) Mention the business implications of using Principal Component Analysis for this case study. [Explain the interpretation of the principal components obtained.]

Note: Please provide the answer as a Jupyter Notebook in Python.
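As a starting point, the mechanics behind parts 2.2 to 2.7 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a worked solution: the column names (feat_a, feat_b, feat_c) and the synthetic values are hypothetical stand-ins for the real CSV, z-score scaling is one reasonable choice among several, and the 90% variance threshold is an illustrative cutoff rather than a rule from the assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data. The real columns come from the CSV and its
# data dictionary; these names and values are invented for illustration.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
df = pd.DataFrame({
    "feat_a": 1000 * base + rng.normal(scale=100, size=200),
    "feat_b": 5 * base + rng.normal(scale=1, size=200),
    "feat_c": rng.normal(scale=20, size=200),
})

# 2.2) z-score scaling: each column gets mean 0 and sample std 1.
scaled = (df - df.mean()) / df.std(ddof=1)

# 2.3) The covariance matrix of z-scored data matches the correlation
# matrix of the raw data.
cov_scaled = np.cov(scaled.to_numpy(), rowvar=False)
corr_raw = df.corr().to_numpy()

# 2.4) Count outliers per column with the 1.5 * IQR rule (one common convention).
def iqr_outlier_counts(frame):
    q1, q3 = frame.quantile(0.25), frame.quantile(0.75)
    iqr = q3 - q1
    mask = (frame < q1 - 1.5 * iqr) | (frame > q3 + 1.5 * iqr)
    return mask.sum()

outliers_before = iqr_outlier_counts(df)
outliers_after = iqr_outlier_counts(scaled)

# 2.5) Eigen-decomposition. eigh is meant for symmetric matrices and returns
# eigenvalues in ascending order, so reorder them to descending.
eigvals, eigvecs = np.linalg.eigh(cov_scaled)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 2.6) The first PC is a weighted sum of the scaled variables, with the
# weights taken from the first eigenvector.
first_pc = " + ".join(
    f"({w:.3f} * {col})" for w, col in zip(eigvecs[:, 0], df.columns)
)

# 2.7) Cumulative share of variance explained by the leading components.
cum_var = np.cumsum(eigvals) / eigvals.sum()
n_components = int(np.searchsorted(cum_var, 0.90) + 1)  # assumed 90% cutoff

# PC scores: project the scaled data onto the eigenvectors.
scores = pd.DataFrame(
    scaled.to_numpy() @ eigvecs,
    columns=[f"PC{i + 1}" for i in range(eigvecs.shape[1])],
    index=df.index,
)
```

With the real data, df would instead be loaded with pd.read_csv and restricted to its numeric columns before scaling. Printing first_pc gives the explicit linear combination asked for in 2.6, and scores.to_csv(...) would export the component scores.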
