Question: Different dimension reduction techniques can have quite different computational complexity. Beyond the algorithm itself there is also the question of how exactly it is implemented.

These two factors can play a significant role in how long a given dimension reduction actually takes to run. Furthermore, the nature of the data you are trying to reduce also matters -- mostly this involves the dimensionality of the original data.
In this problem, you will take a brief look at the performance characteristics of the following dimension reduction implementations, each of which can be called through a Python API:
PCA()
UMAP()
LocallyLinearEmbedding()
SpectralEmbedding()
Isomap()
TSNE()
MDS()
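All of these expose a common fit_transform interface. A minimal sketch of how they might be collected for benchmarking, assuming scikit-learn is installed and (optionally) the separate umap-learn package for UMAP:

```python
# Instantiate the dimension reduction estimators behind a common
# fit_transform API. All reduce to 2 components so outputs are comparable.
from sklearn.decomposition import PCA
from sklearn.manifold import (
    LocallyLinearEmbedding, SpectralEmbedding, Isomap, TSNE, MDS,
)

algorithms = {
    "PCA": PCA(n_components=2),
    "LLE": LocallyLinearEmbedding(n_components=2),
    "SpectralEmbedding": SpectralEmbedding(n_components=2),
    "Isomap": Isomap(n_components=2),
    "t-SNE": TSNE(n_components=2),
    "MDS": MDS(n_components=2),
}

try:
    from umap import UMAP  # provided by the umap-learn package
    algorithms["UMAP"] = UMAP(n_components=2)
except ImportError:
    pass  # umap-learn not installed; time the scikit-learn estimators only
```

Grouping the estimators in a dict keyed by name makes the later timing loop and plot legend straightforward.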
As the size of a dataset increases, the runtime of a given dimension reduction algorithm grows at a rate that varies from algorithm to algorithm. If you ever want to run an algorithm on larger datasets, you will care not just about its runtime on a single small dataset, but about how that runtime scales as the data grows. You can simulate this by subsampling the MNIST digits (via scikit-learn's convenient resample utility) and measuring the runtime for subsamples of varying size.
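The subsampling step can be sketched as follows, using scikit-learn's bundled 8x8 digits dataset as a stand-in for MNIST (the idea carries over unchanged to the full-resolution set):

```python
# Subsample the digits dataset with sklearn.utils.resample.
from sklearn.datasets import load_digits
from sklearn.utils import resample

digits = load_digits()
print(digits.data.shape)  # (1797, 64)

# Draw a reproducible subsample of 400 rows without replacement.
subsample = resample(digits.data, n_samples=400, replace=False, random_state=0)
print(subsample.shape)  # (400, 64)
```

Passing replace=False and a fixed random_state gives a plain random subsample that is reproducible across runs, which matters when you want timing comparisons to be repeatable.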
As an example, you can consider subsamples of increasing size:

sizes = [100, 200, 400, 800, 1600]
Finally, run each algorithm on each subsample size and plot runtime against sample size, so that the scaling behaviour of the different algorithms can be compared.
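The timing loop and plot might look like the sketch below. For brevity only two of the estimators are timed here; the same loop works unchanged for the full list (the slower algorithms such as MDS and t-SNE simply take longer at the larger sizes). The output filename is an assumption for illustration.

```python
# Time each algorithm on growing subsamples and plot runtime vs. sample size.
import time

import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap
from sklearn.utils import resample

data = load_digits().data
sizes = [100, 200, 400, 800, 1600]
algorithms = {"PCA": PCA(n_components=2), "Isomap": Isomap(n_components=2)}

timings = {name: [] for name in algorithms}
for n in sizes:
    sample = resample(data, n_samples=n, replace=False, random_state=0)
    for name, algo in algorithms.items():
        start = time.perf_counter()
        algo.fit_transform(sample)  # refits from scratch on each subsample
        timings[name].append(time.perf_counter() - start)

for name, runtimes in timings.items():
    plt.plot(sizes, runtimes, marker="o", label=name)
plt.xlabel("Sample size")
plt.ylabel("Runtime (s)")
plt.legend()
plt.savefig("dimred_runtimes.png")  # hypothetical output path
```

Plotting all curves on one set of axes makes the different growth rates visible at a glance; a log-log scale can help when the fastest and slowest algorithms differ by orders of magnitude.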
