5. Principal Component Analysis (15 marks)
For part (a) and part (b) below, we consider an unlabelled dataset D with 40000
datapoints, each datapoint is in $\mathbb{R}^{2023}$.
(a) One application of Principal Component Analysis (PCA) is to plot datapoints
from a high-dimensional space on a 2-D graph.
Describe the main steps of using PCA to plot the datapoints in D on a 2-D
graph. You should describe all pre-processing steps, the outputs from PCA
you need, and the projection step. [7 marks]
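The steps asked for in part (a) can be sketched roughly as follows. This is a minimal NumPy illustration rather than a model answer: the function name `pca_project_2d` and the small synthetic array standing in for D are assumptions made here, and only mean-centring is applied as pre-processing (rescaling each feature to unit variance is a further option when features are on very different scales).

```python
import numpy as np


def pca_project_2d(X):
    """Project the rows of X (n_samples x n_features) onto the top two principal components.

    Hypothetical helper for illustration only.
    """
    # Pre-processing: centre every feature at zero mean.
    mean = X.mean(axis=0)
    Xc = X - mean

    # Scatter matrix of the centred data, shape (n_features, n_features).
    S = Xc.T @ Xc

    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S)

    # Outputs needed from PCA: the eigenvectors of the two largest eigenvalues.
    W = eigvecs[:, ::-1][:, :2]

    # Projection step: 2-D coordinates of each centred datapoint.
    Z = Xc @ W
    return Z, W, S


# Small synthetic stand-in for D (assumed sizes; D itself would be 40000 x 2023).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Z, W, S = pca_project_2d(X)
```

The two columns of `Z` are the coordinates to draw on the 2-D graph, for example with a scatter plot of `Z[:, 0]` against `Z[:, 1]`.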
(b) When performing PCA on D, a key step is to compute the scatter matrix.
What is the dimension of the scatter matrix? [2 marks]
(c) PCA is performed on another data set. The scatter matrix is
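$$
S = \begin{pmatrix}
1.1 & 0.3 & 0 & 0 & 0 \\
0.3 & 1.9 & 0 & 0 & 0 \\
0 & 0 & 0.8 & 0 & 0 \\
0 & 0 & 0 & 0.6 & 0 \\
0 & 0 & 0 & 0 & 0.6
\end{pmatrix}.
$$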
Verify that the vector $(1, 3, 0, 0, 0)^\top$ is an eigenvector of S. What is the corresponding
eigenvalue? [4 marks]
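A numerical check of part (c) might look like the sketch below. It assumes the 5 x 5 entries of S as read from the question, and the variable names are chosen here for illustration. The check multiplies S by the vector and tests whether the result is a scalar multiple of that vector.

```python
import numpy as np

# Scatter matrix from part (c), entries as read from the question.
S = np.array([
    [1.1, 0.3, 0.0, 0.0, 0.0],
    [0.3, 1.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.6],
])
v = np.array([1.0, 3.0, 0.0, 0.0, 0.0])

# S v = (1.1 + 0.9, 0.3 + 5.7, 0, 0, 0) = (2, 6, 0, 0, 0)
Sv = S @ v

# Rayleigh quotient: equals the eigenvalue whenever v is an eigenvector.
lam = (v @ Sv) / (v @ v)

# v is an eigenvector exactly when S v equals lam * v (up to rounding).
is_eigenvector = np.allclose(Sv, lam * v)
```

By hand, S v = (2, 6, 0, 0, 0) = 2 v, so the check succeeds with `lam` approximately 2.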
(d) Write down all the other eigenvectors of S. [2 marks]
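For part (d), one hedged way to cross-check a hand-derived answer is to let NumPy compute the full eigendecomposition. The sketch again assumes the entries of S as read from the question. Because the eigenvalue 0.6 is repeated, the eigenvectors returned for it are only one possible basis of that eigenspace, and every eigenvector is defined only up to scale and sign.

```python
import numpy as np

# Scatter matrix from part (c), entries as read from the question.
S = np.array([
    [1.1, 0.3, 0.0, 0.0, 0.0],
    [0.3, 1.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.6],
])

# eigh is intended for symmetric matrices: eigenvalues are returned in
# ascending order and the unit-length eigenvectors are the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(S)

# Every column w should satisfy S w = lambda w.
residual = S @ eigvecs - eigvecs * eigvals

# Candidate from the 2 x 2 block, orthogonal to (1, 3, 0, 0, 0).
u = np.array([3.0, -1.0, 0.0, 0.0, 0.0])
u_is_eigenvector = np.allclose(S @ u, 1.0 * u)

for lam, w in zip(eigvals, eigvecs.T):
    print(round(float(lam), 6), np.round(w, 4))
```

The block structure of S makes the hand calculation short: the upper-left 2 x 2 block contributes (1, 3, 0, 0, 0) and the orthogonal direction (3, -1, 0, 0, 0), while the three diagonal entries contribute the standard basis vectors e3, e4 and e5.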
