Question: PROBLEM 3: MNIST, 20 NG Preprocessing
Your second task is normalization. The type of normalization used depends on the task and dataset. Common types of normalization include:
Shift-and-scale normalization: subtract the minimum, then divide by the new maximum. Now all values are between 0 and 1.
Zero mean, unit variance: subtract the mean, then divide by the appropriate value to get variance 1.
Term-Frequency (TF) weighting: map each term in a document to its frequency (text only; see the wiki page).
It is up to you to determine the appropriate normalization.
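The three normalizations listed above can be sketched in Python with NumPy. This is a minimal sketch, not the assignment's reference solution: the function names are hypothetical, the zero-mean/unit-variance version assumes one sample per row and standardizes each column, and the TF version assumes the document is already tokenized and returns raw counts.

```python
from collections import Counter

import numpy as np


def shift_and_scale(x):
    """Map values to [0, 1]: subtract the minimum, divide by the new maximum."""
    x = np.asarray(x, dtype=np.float64)
    shifted = x - x.min()
    new_max = shifted.max()
    # Guard against constant input, where the new maximum is 0.
    return shifted / new_max if new_max > 0 else shifted


def zero_mean_unit_variance(x, eps=1e-12):
    """Standardize each column: subtract its mean, divide by its standard deviation."""
    x = np.asarray(x, dtype=np.float64)
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    # Assumption: constant columns (std == 0) are left at zero
    # instead of dividing by zero.
    return (x - mean) / np.where(std > eps, std, 1.0)


def term_frequency(tokens):
    """Map each term in a tokenized document to its raw count."""
    return dict(Counter(tokens))
```

For image data whose pixel values span 0 to 255, shift_and_scale amounts to dividing by 255. Constant columns, such as border pixels that are always zero, are the reason for the guard in zero_mean_unit_variance.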
available in Matlab/Java/Python/R
Distance/similarity options to implement:
Euclidean distance (required, library)
Euclidean distance (required, your own; use batches if you run into memory issues)
edit distance (required for text) or cosine similarity (required for vectors)
Jaccard similarity (optional)
Manhattan distance (optional)
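A hand-written version of each measure in the list above might look like the following sketch. All function names and the batch_size parameter are hypothetical. The Euclidean version processes the first matrix in row batches so that only one batch of intermediate products is held in memory at a time, and the edit distance shown is the Levenshtein variant, which is an assumption about what the assignment intends.

```python
import numpy as np


def euclidean_distances(a, b, batch_size=1000):
    """Pairwise Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    b_sq = (b ** 2).sum(axis=1)
    out = np.empty((a.shape[0], b.shape[0]))
    for start in range(0, a.shape[0], batch_size):
        chunk = a[start:start + batch_size]
        # ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2
        sq = (chunk ** 2).sum(axis=1)[:, None] - 2.0 * chunk @ b.T + b_sq[None, :]
        # Rounding can push tiny squared distances slightly below zero.
        out[start:start + batch_size] = np.sqrt(np.maximum(sq, 0.0))
    return out


def manhattan_distance(x, y):
    """Sum of absolute coordinate differences."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.abs(x - y).sum())


def cosine_similarity(x, y):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0


def jaccard_similarity(s, t):
    """Size of the intersection over size of the union of two collections."""
    s, t = set(s), set(t)
    union = s | t
    return len(s & t) / len(union) if union else 1.0


def edit_distance(s, t):
    """Levenshtein distance, keeping only two rows of the DP table."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cs != ct)))  # substitution
        prev = curr
    return prev[-1]
```

For the library requirement, one option in Python is scipy.spatial.distance.cdist with metric="euclidean"; comparing its output against euclidean_distances on a small sample with np.allclose is a quick sanity check.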
DATASET:
train-images-idx3-ubyte.gz: training set images
train-labels-idx1-ubyte.gz: training set labels
t10k-images-idx3-ubyte.gz: test set images
t10k-labels-idx1-ubyte.gz: test set labels
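The MNIST files are gzip-compressed IDX files: a 4-byte header (two zero bytes, a type code, and the number of dimensions), followed by one 4-byte big-endian size per dimension, followed by the raw data. A minimal reader, assuming unsigned-byte data as in MNIST and using the hypothetical name read_idx, could look like this:

```python
import gzip
import struct

import numpy as np


def read_idx(path):
    """Read a gzip-compressed IDX file of unsigned bytes into a NumPy array."""
    with gzip.open(path, "rb") as f:
        # Header: two zero bytes, type code (0x08 = unsigned byte), dimension count.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError(f"unexpected IDX header in {path}")
        # Each dimension size is a 4-byte big-endian unsigned integer.
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)
```

Calling read_idx on the training image file should give an array of shape (60000, 28, 28), and on the training label file an array of shape (60000,). Flattening each image to a 784-element row gives the matrix form that normalization and distance computations usually expect.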
