Question:
The first thing is to load the MNIST data into our Machine Learning programs. We will set up digitClassifier.py to expect input as two matrices: (i) the set of training images X and (ii) the set of training labels Y (there will be a corresponding pair of test matrices). Download all 4 files of the MNIST dataset, available at: http://yann.lecun.com/exdb/mnist/
The images in the MNIST data are 28 × 28 pixels, in a proprietary format. For our work we will flatten out each image to a single row of 28×28+1 = 785 elements in the X matrix. The *+1* is the first element and corresponds to the first column of 1's, as for any regression X matrix. There are 60,000 examples in the train-images-idx3-ubyte.gz file and 60,000 labels in the train-labels-idx1-ubyte.gz file. These are gzipped files. Please read the note (in bold under the data sets); you may need to replace the '-' after 'images' with a '.' if you get a file-not-found error.
The code to load the uncompressed files is given below:
# An MNIST loader.
import numpy as np
# import gzip   # uncomment to read the gzipped files directly
import struct

def load_images(filename):
    # Open the (uncompressed) file of images:
    # with gzip.open(filename, 'rb') as fh:   # use instead to unzip & open the gzipped file
    fh = open(filename, 'rb')
    # Read the header information (magic number, image count, rows, columns):
    _ignored, n_images, rows, columns = struct.unpack('>IIII', fh.read(16))
    # Read all the pixels into a NumPy array of bytes:
    all_pixels = np.frombuffer(fh.read(), dtype=np.uint8)
    # Reshape the pixels into a matrix where each line is an image:
    return all_pixels.reshape(n_images, rows * columns)

def stack_ones(X):
    c1 = np.ones(len(X))  # can use np.shape(X)[0]
    return np.column_stack((c1, X))

def load_labels(filename):
    # with gzip.open(filename, 'rb') as fj:   # uncomment to unzip & open the gzipped file
    fj = open(filename, 'rb')
    # Skip the header bytes:
    fj.read(8)
    # Read all the labels into a buffer of bytes:
    all_labels = fj.read()
    # Reshape the labels into a one-column matrix:
    return np.frombuffer(all_labels, dtype=np.uint8).reshape(-1, 1)

def encode_sevens(Y):
    # Convert all 7s to 1, and everything else to 0:
    return (Y == 7).astype(int)

# The following commands load the data into the X/Y constants.
# 60000 images, each 785 elements (1 bias + 28 * 28 pixels)
X_train = stack_ones(load_images("train-images.idx3-ubyte"))  # Note: "-" replaced with "."
# 60000 labels, each with value 1 if the digit is a seven, and 0 otherwise
Y_train = encode_sevens(load_labels("train-labels.idx1-ubyte"))
# 10000 images, each 785 elements, with the same structure as X_train
X_test = stack_ones(load_images("t10k-images.idx3-ubyte"))
# 10000 labels, with the same encoding as Y_train
Y_test = encode_sevens(load_labels("t10k-labels.idx1-ubyte"))
The above file loads all the data into constants X_train, etc. These can be imported into your classification program with import mnist as mn.
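For instance, assuming the loader above is saved as mnist.py, a quick shape check might look like this:

import mnist as mn

print(mn.X_train.shape)  # expect (60000, 785)
print(mn.Y_train.shape)  # expect (60000, 1)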
Write functions (a minimal sketch follows this list):
- def train(X, Y, numIter, learningRate): sets up the vector of βs and finds the βs by calling the gradient function. (Recall: the beta vector is updated in each iteration by -gradient(...)*learningRate.) The function should print the loss for every tenth iteration.
- def classify(X, beta): simply rounds and returns Ŷ (the predicted result of the logistic regression).
- def test(X, Y, beta): computes the percentage of correct classifications from classify(X, beta). (Note: the count of correct results is obtained by summing classify(X, beta) == Y. Recall, Y = 1 for the selected digit and 0 for the rest.)
- def loss(...): computes the log-loss function.
- def gradient(X, Y, beta)
- def sigPredict(X, beta)
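Below is a minimal sketch of these functions, assuming the standard logistic-regression formulation (sigmoid prediction Ŷ = σ(Xβ), the log-loss, and its gradient Xᵀ(Ŷ − Y)/n). Treat it as a starting point rather than the definitive solution, and adjust signatures to your course conventions:

import numpy as np

def sigPredict(X, beta):
    # Sigmoid of the linear combination: Yhat = 1 / (1 + e^(-X.beta))
    # Note: np.exp can overflow for large |X @ beta| with raw pixel
    # values; scaling pixels to [0, 1] avoids this.
    return 1 / (1 + np.exp(-X @ beta))

def loss(X, Y, beta):
    # Average log-loss over all examples
    y_hat = sigPredict(X, beta)
    return -np.mean(Y * np.log(y_hat) + (1 - Y) * np.log(1 - y_hat))

def gradient(X, Y, beta):
    # Gradient of the log-loss with respect to beta
    return X.T @ (sigPredict(X, beta) - Y) / len(X)

def train(X, Y, numIter, learningRate):
    # One beta per column of X (bias column included)
    beta = np.zeros((X.shape[1], 1))
    for i in range(numIter):
        if i % 10 == 0:
            print(f"Iteration {i:4d} => Loss: {loss(X, Y, beta):.8f}")
        beta -= gradient(X, Y, beta) * learningRate
    return beta

def classify(X, beta):
    # Round the sigmoid output to 0 or 1
    return np.round(sigPredict(X, beta))

def test(X, Y, beta):
    # Percentage of predictions that match the labels
    correct = np.sum(classify(X, beta) == Y)
    print(f"Correct: {correct}/{len(Y)} ({100 * correct / len(Y):.2f}%)")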
To experiment, run train for 100 iterations with learningRate = 1.e-5, and then for 1000 iterations with learningRate = 1.e-3.
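A possible driver for these runs, assuming the loader is saved as mnist.py and the sketch above is in digitClassifier.py:

import mnist as mn

beta = train(mn.X_train, mn.Y_train, numIter=100, learningRate=1.e-5)
test(mn.X_test, mn.Y_test, beta)

beta = train(mn.X_train, mn.Y_train, numIter=1000, learningRate=1.e-3)
test(mn.X_test, mn.Y_test, beta)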
Please include a screenshot of the process and output in Python.