Question: Different dimension reduction techniques can have quite different computational complexity. Beyond the algorithm itself, there is also the question of how exactly it is implemented. These two factors can play a significant role in how long it actually takes to run a given dimension reduction. Furthermore, the nature of the data you are trying to reduce can also matter; this is mostly a question of the dimensionality of the original data.
In this problem, you will take a brief look at the performance characteristics of the following dimension reduction algorithm implementations. To this end, you can use the Python APIs that expose these algorithms (scikit-learn for most of them, and the umap-learn package for UMAP) and call them directly; a minimal import sketch follows the list.
PCA
UMAP
LocallyLinearEmbedding
SpectralEmbedding
Isomap
TSNE
MDS
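As a hedged sketch (assuming scikit-learn and umap-learn are installed), the corresponding classes might be imported like this:

    # Most implementations live in scikit-learn; UMAP comes from the umap-learn package.
    from sklearn.decomposition import PCA
    from sklearn.manifold import LocallyLinearEmbedding, SpectralEmbedding, Isomap, TSNE, MDS
    from umap import UMAP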
As the size of a dataset increases, the runtime of a given dimension reduction algorithm will increase at varying rates. If you ever want to run your algorithm on larger datasets, you will care not just about the comparative runtime on a single small dataset, but also about how the performance scales as you move to larger datasets. You can simulate this by subsampling from MNIST digits via scikit-learn's convenient resample utility and measuring the runtime for varying-sized subsamples.
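A minimal timing sketch, assuming scikit-learn's digits loader as a stand-in for MNIST (the helper name time_algorithm and its parameters are illustrative, not part of the original problem):

    import time
    from sklearn.datasets import load_digits
    from sklearn.utils import resample

    # Small MNIST-style digits dataset; the full MNIST could be fetched instead
    # (e.g. via fetch_openml("mnist_784")), at correspondingly higher cost.
    digits = load_digits()

    def time_algorithm(model, data, size, seed=42):
        """Fit `model` on a random subsample of `size` rows and return elapsed seconds."""
        subsample = resample(data, n_samples=size, replace=False, random_state=seed)
        start = time.perf_counter()
        model.fit_transform(subsample)
        return time.perf_counter() - start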
As an example, you can consider a range of subsample sizes, as in the sketch below.
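The particular values here are only illustrative (the problem does not pin down specific sizes); anything up to the number of available samples works:

    # Hypothetical subsample sizes; load_digits() provides 1,797 samples in total.
    sizes = [100, 200, 400, 800, 1600]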
Finally, you run the algorithms on the different dataset sizes and plot the results, so that you can observe how the runtime varies with data size for each algorithm.
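Reusing the names defined in the sketches above (the imports, digits, sizes, and time_algorithm), one possible way to run the benchmark and plot the scaling curves with matplotlib is:

    import matplotlib.pyplot as plt

    # Map display names to constructors; all of these accept n_components.
    algorithms = {
        "PCA": PCA,
        "UMAP": UMAP,
        "LocallyLinearEmbedding": LocallyLinearEmbedding,
        "SpectralEmbedding": SpectralEmbedding,
        "Isomap": Isomap,
        "TSNE": TSNE,
        "MDS": MDS,
    }

    for name, cls in algorithms.items():
        runtimes = [time_algorithm(cls(n_components=2), digits.data, s) for s in sizes]
        plt.plot(sizes, runtimes, marker="o", label=name)

    plt.xlabel("Subsample size")
    plt.ylabel("Runtime (seconds)")
    plt.title("Dimension reduction runtime vs. dataset size")
    plt.legend()
    plt.show()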