Question:

Provide code for the following.
Data
Download the MNIST data and construct the following sets (a data-loading sketch follows below):
training set: one example only (you can pick your favourite digit)
test set: one example per digit from the MNIST test dataset
Map the images to [0,1]^{28×28}.
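Here is a minimal data-construction sketch in Python, assuming TensorFlow/Keras is available; the framework choice, the chosen digit 7, and the names train_x/test_x are illustrative assumptions, not part of the question.

```python
import numpy as np
import tensorflow as tf

# Load MNIST and map pixel values to [0, 1].
(x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.mnist.load_data()
x_tr = x_tr.astype("float32") / 255.0
x_te = x_te.astype("float32") / 255.0

# Training set: a single example of one digit (the digit 7 is an arbitrary choice).
favourite_digit = 7
train_x = x_tr[y_tr == favourite_digit][:1]                   # shape (1, 28, 28)

# Test set: one example per digit 0-9 from the MNIST test split.
test_x = np.stack([x_te[y_te == d][0] for d in range(10)])    # shape (10, 28, 28)
```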
P1- MLP
P1.1- Implement a fully connected neural network h: [0,1]^{28×28} → [0,1]^{28×28} that regresses an image into itself. The architecture should have 7 trainable dense layers: the first 6 layers with 4 neurons and ReLU activation, and an output layer with the necessary number of units and activation (a sketch covering P1.1-P1.3 follows after P1.3).
P1.2- Train the model using SGD on the appropriate loss function for 10^3 epochs on the training data. Plot the training loss over epochs.
P1.3- Plot the predictions over the training set and test set (you should spot a pattern in the predictions, but since there is some randomness associated with using the GPU, we recommend repeating the training 3-5 times to be sure you pick up the right pattern). Which function do you conjecture h(x) has learnt (write it as a formula)?
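One possible sketch covering P1.1-P1.3, not a definitive solution: the sigmoid output activation, the MSE loss, the SGD learning rate of 0.1 and the plotting layout are assumptions, and train_x/test_x come from the data sketch above.

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# P1.1 - fully connected h: [0,1]^{28x28} -> [0,1]^{28x28} with 7 trainable dense layers:
# six Dense(4, relu) layers and a Dense(784) output layer (sigmoid keeps outputs in [0,1]).
inp = tf.keras.Input(shape=(28, 28))
z = tf.keras.layers.Flatten()(inp)
for _ in range(6):
    z = tf.keras.layers.Dense(4, activation="relu")(z)
z = tf.keras.layers.Dense(28 * 28, activation="sigmoid")(z)
out = tf.keras.layers.Reshape((28, 28))(z)
mlp = tf.keras.Model(inp, out)

# P1.2 - SGD on mean squared error for 10^3 epochs (MSE assumed as "the appropriate loss").
mlp.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1), loss="mse")
history = mlp.fit(train_x, train_x, epochs=1000, verbose=0)   # train_x from the data sketch

plt.plot(history.history["loss"])
plt.xlabel("epoch")
plt.ylabel("training loss")
plt.title("P1.2 - MLP training loss")
plt.show()

# P1.3 - predictions on the training example and on the ten test digits.
preds = mlp.predict(np.concatenate([train_x, test_x]), verbose=0)
fig, axes = plt.subplots(1, len(preds), figsize=(2 * len(preds), 2))
for ax, img in zip(axes, preds):
    ax.imshow(img, cmap="gray")
    ax.axis("off")
plt.show()
```

With a single training image and a 4-neuron bottleneck, the network can drive the loss down simply by memorising the target while ignoring its input, so a natural conjecture for P1.3 is the constant map h(x) = x_train for every x; repeating the training a few times, as the question suggests, is the way to check that this is indeed the pattern.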
P2- CNN
P2.1- Implement a CNN g: [0,1]^{28×28} → [0,1]^{28×28} that regresses an image into itself. The architecture should have 2 convolutional layers: the first with 10 filters, kernel size 5×5 and the same output size as the input, and the second a convolutional output layer with the necessary number of filters, kernel size and activation (a sketch covering P2.1-P2.3 follows after P2.3).
P2.2- Train the model using SGD on the appropriate loss function for 10^3 epochs on the training data. Plot the training loss over epochs.
P2.3- Plot the predictions over the training set and test set (you should spot a pattern in the predictions, but since there is some randomness associated with using the GPU, we recommend repeating the training 3-5 times to be sure you pick up the right pattern). Which function do you conjecture g(x) has learnt (write it as a formula)?
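A matching sketch for P2.1-P2.3 under the same assumptions (MSE loss, SGD with learning rate 0.1, sigmoid output); a ReLU activation in the first convolutional layer and a 5×5 kernel in the output layer are further assumptions, since the question leaves them open. It reuses train_x/test_x from the data sketch and adds a channel axis.

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# P2.1 - CNN g: [0,1]^{28x28} -> [0,1]^{28x28}; padding="same" keeps the 28x28 spatial size.
inp = tf.keras.Input(shape=(28, 28, 1))
z = tf.keras.layers.Conv2D(10, kernel_size=5, padding="same", activation="relu")(inp)    # ReLU assumed
out = tf.keras.layers.Conv2D(1, kernel_size=5, padding="same", activation="sigmoid")(z)  # 1 filter -> [0,1]
cnn = tf.keras.Model(inp, out)

# P2.2 - SGD on MSE for 10^3 epochs (same loss assumption as for the MLP).
cnn.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1), loss="mse")
train_xc = train_x[..., None]   # add a channel axis -> (1, 28, 28, 1)
test_xc = test_x[..., None]     # -> (10, 28, 28, 1)
history = cnn.fit(train_xc, train_xc, epochs=1000, verbose=0)

plt.plot(history.history["loss"])
plt.xlabel("epoch")
plt.ylabel("training loss")
plt.title("P2.2 - CNN training loss")
plt.show()

# P2.3 - predictions on the training example and on the ten test digits.
preds = cnn.predict(np.concatenate([train_xc, test_xc]), verbose=0)[..., 0]
fig, axes = plt.subplots(1, len(preds), figsize=(2 * len(preds), 2))
for ax, img in zip(axes, preds):
    ax.imshow(img, cmap="gray")
    ax.axis("off")
plt.show()
```

Because the convolutional layers act locally and share weights across positions, the trained model tends to reproduce unseen digits as well, so a natural conjecture for P2.3 is (approximately) the identity map g(x) = x; as before, repeat the training a few times to confirm the pattern.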
P3- Learning the identity map
P3.1- Consider a multilayer ReLU network h: R^n → R^n such that h(x) = W3 ReLU(W2 ReLU(W1 x + b1) + b2) + b3, with W1 ∈ R^{a×n}, W3 ∈ R^{n×n}, b1 ∈ R^a; b2, b3 ∈ R^n. Find a possible solution for W1, W2, W3, b1, b2, b3 such that h represents the identity function (a construction sketch for P3 follows below).
What if you want h to represent a constant function that always outputs x0?
P3.2- Consider a CNN g: R^{n×n} → R^{n×n} composed of a first hidden convolutional layer with c filters, a d×d kernel (d > 1, odd), identity activation, and a suitable convolutional output layer. Find a possible architecture for g (i.e. specify the complete architecture: c, the values in the filters, padding and stride) such that g represents the identity function.
If instead of the identity activation, we use a ReLU activation, how should the architecture change?
Note: R denotes the real numbers and ^ means "to the power of".
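One possible construction for P3, written as a LaTeX fragment; it is a sketch, not the unique answer, and for P3.1 it assumes the inputs lie in the nonnegative orthant (as the [0,1]-scaled images do), so that each ReLU can be made to act as the identity.

```latex
% P3.1 - identity map, assuming x >= 0 componentwise (take a = n):
\[
  W_1 = W_2 = W_3 = I_n, \qquad b_1 = b_2 = b_3 = 0
  \quad\Longrightarrow\quad
  h(x) = \mathrm{ReLU}\!\big(\mathrm{ReLU}(x)\big) = x .
\]
% Constant map that always outputs x_0: zero every weight and put x_0 in the last bias,
\[
  W_1 = 0,\; W_2 = 0,\; W_3 = 0, \qquad b_1 = b_2 = 0,\; b_3 = x_0
  \quad\Longrightarrow\quad
  h(x) = x_0 \ \text{for every } x .
\]
% P3.2 - identity map with convolutions: c = 1 filter, stride 1, zero ("same") padding,
% identity activation, and a d x d kernel equal to a centred delta,
\[
  K_{ij} =
  \begin{cases}
    1 & \text{if } i = j = \tfrac{d+1}{2}, \\
    0 & \text{otherwise},
  \end{cases}
  \qquad\text{so that}\qquad K * x = x ,
\]
% followed by an output convolution with a single 1x1 filter of value 1 (no bias,
% identity activation), giving g(x) = x.
% With a ReLU hidden activation nothing changes on nonnegative inputs; for general
% real-valued inputs take c = 2 filters, +K and -K, and let the output convolution
% combine the two channels with weights +1 and -1, since ReLU(t) - ReLU(-t) = t.
```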
