Question: PROBLEM 3 MNIST, 20 NG Preprocessing

Your second task is normalization. The type of normalization used depends on the task and dataset. Common types of normalization include:
Shift-and-scale normalization: subtract the minimum, then divide by the new maximum, so that all values lie between 0 and 1
Zero mean, unit variance: subtract the mean, then divide by the standard deviation so that the variance is 1
Term-Frequency (TF) weighting: map each term in a document to its frequency (text only; see the wiki page)
It is up to you to determine the appropriate normalization.
available in Matlab/Java/Python/R.
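The three normalizations listed above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not a prescribed solution: the function names are hypothetical, shift-and-scale is applied over the whole array, zero-mean/unit-variance is applied per feature (column), and TF is taken to mean the relative frequency of each term within one tokenized document.

```python
from collections import Counter

import numpy as np


def shift_and_scale(X):
    """Subtract the minimum, then divide by the new maximum -> values in [0, 1]."""
    X = np.asarray(X, dtype=np.float64)
    shifted = X - X.min()
    new_max = shifted.max()
    # A constant array has new_max == 0; return zeros instead of dividing by zero.
    return shifted / new_max if new_max > 0 else shifted


def zero_mean_unit_variance(X, eps=1e-12):
    """Per-feature standardization: subtract the column mean, divide by the column std."""
    X = np.asarray(X, dtype=np.float64)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Constant columns (e.g. border pixels that are always 0) have std == 0;
    # dividing by 1 instead leaves them at 0 after centering.
    safe_std = np.where(std > eps, std, 1.0)
    return (X - mean) / safe_std


def term_frequency(tokens):
    """Map each term of a tokenized document to its relative frequency (assumed definition)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}
```

For image data such as MNIST, either of the first two functions would be applied to the flattened pixel matrix (one row per image); for text such as 20 NG, `term_frequency` would be applied to each document's token list.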
Distance/similarity options to implement:
Euclidean distance (required, library)
Euclidean distance (required, your own; use batches if you run into memory issues)
edit distance (required for text) -or- cosine similarity (required for vectors)
Jaccard similarity (optional)
Manhattan distance (optional)
Manhattan distance (optional)
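A minimal sketch of the "your own" options from the list above, in Python with NumPy. All function names and the `batch_size` parameter are hypothetical. The Euclidean version uses the expansion ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2 and processes the rows of A in batches so that the intermediate arrays stay small; the Jaccard function treats two empty sets as identical, which is an assumed convention.

```python
import numpy as np


def euclidean_distances_batched(A, B, batch_size=1000):
    """Pairwise Euclidean distances between rows of A (n, d) and rows of B (m, d)."""
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)
    b_sq = np.sum(B * B, axis=1)                       # shape (m,)
    out = np.empty((A.shape[0], B.shape[0]))
    for start in range(0, A.shape[0], batch_size):
        chunk = A[start:start + batch_size]
        a_sq = np.sum(chunk * chunk, axis=1)[:, None]  # shape (batch, 1)
        sq = a_sq - 2.0 * (chunk @ B.T) + b_sq
        # Rounding can push tiny values slightly below zero; clip before the sqrt.
        out[start:start + batch_size] = np.sqrt(np.maximum(sq, 0.0))
    return out


def cosine_similarity(A, B, eps=1e-12):
    """Pairwise cosine similarity between rows of A and rows of B."""
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)
    a_norm = np.maximum(np.linalg.norm(A, axis=1, keepdims=True), eps)
    b_norm = np.maximum(np.linalg.norm(B, axis=1, keepdims=True), eps)
    return (A / a_norm) @ (B / b_norm).T


def edit_distance(s, t):
    """Levenshtein distance by dynamic programming, keeping one row at a time."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def jaccard_similarity(a, b):
    """|a & b| / |a | b| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Note that `euclidean_distances_batched` still allocates the full n-by-m result. If that matrix itself is too large, one option is to reduce each batch immediately (for example, keep only the nearest neighbors per row) instead of storing every distance.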
DATASET:
train-images-idx3-ubyte.gz: training set images (9912422 bytes)
train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)
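The four files above are gzipped IDX files. As a sketch of the parsing step, the hypothetical helper `load_idx` below reads the 4-byte header (two zero bytes, a type code, and the number of dimensions), then the dimension sizes as big-endian 32-bit integers, then the raw bytes. It assumes the unsigned-byte type code 0x08, which is what the MNIST files use, and rejects anything else.

```python
import gzip
import struct

import numpy as np


def load_idx(path):
    """Read a gzipped IDX file of unsigned bytes into a NumPy uint8 array."""
    with gzip.open(path, "rb") as f:
        # Header: two zero bytes, type code (0x08 = unsigned byte), number of dimensions.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError(f"unexpected IDX header in {path}")
        # Each dimension size is a big-endian 32-bit unsigned integer.
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)
```

Assuming the files sit in the working directory, `load_idx("train-images-idx3-ubyte.gz")` should yield an array of shape (60000, 28, 28) and `load_idx("train-labels-idx1-ubyte.gz")` an array of shape (60000,). Flattening each image with `images.reshape(len(images), -1)` gives one 784-value row per image, ready for normalization.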