Question: You need to use the CIFAR100 data to train two models: 1) a simple convolutional neural network (CNN) model built from scratch and 2) an existing CNN model.

Task I: Implement a simple CNN and train the model.

The basic architecture is shown in Figure 1. However, some parameters have been removed on purpose. Please follow the description below to complete the implementation.

o The model has four convolutional (Conv) layers and three fully connected (FC) layers. A ReLU function follows each layer except the output layer. A max-pooling layer with a kernel size of 2x2 and a stride of 2 is applied after each of the first two Conv layers.

o For the Conv layers, the 1st layer has six 5x5 filters, the 2nd layer has 12 5x5 filters, the 3rd layer has 24 5x5 filters, and the 4th layer has 48 3x3 filters.

o For the FC layers, the 1st layer has 120 neurons, the 2nd layer has 84 neurons, and the last one has 100 neurons.
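The layer description above can be sketched as follows. This is a minimal sketch, not the assignment's reference solution: the text names no framework, so PyTorch is an assumption, and the names `SimpleCNN` and `conv_out` are hypothetical. The padding values are also an assumption, since the text does not give them. Without any padding, a 32x32 CIFAR100 image shrinks to 1x1 after the third Conv layer, which leaves no room for the 3x3 fourth layer, so "same" padding is used here.

```python
import torch
import torch.nn as nn


def conv_out(size: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


class SimpleCNN(nn.Module):
    """Four Conv layers and three FC layers, following the task description.

    ASSUMPTION: padding=2 for the 5x5 layers and padding=1 for the 3x3
    layer, so only the two pooling layers reduce the spatial size.
    """

    def __init__(self, num_classes: int = 100):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, kernel_size=5, padding=2),    # 32x32 -> 32x32
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),        # -> 16x16
            nn.Conv2d(6, 12, kernel_size=5, padding=2),   # -> 16x16
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),        # -> 8x8
            nn.Conv2d(12, 24, kernel_size=5, padding=2),  # -> 8x8
            nn.ReLU(),
            nn.Conv2d(24, 48, kernel_size=3, padding=1),  # -> 8x8
            nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(48 * 8 * 8, 120),  # 48 channels x 8 x 8 under the assumed padding
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),  # output layer: no ReLU
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```

If a different padding scheme is chosen, `conv_out` can be chained layer by layer to recompute the input size of the first FC layer before changing `nn.Linear(48 * 8 * 8, 120)`.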

After implementing the CNN model, train it for 20 epochs to see what kind of performance you can get. Try to make the performance as good as possible. Plot the training/testing loss and F1 score for each epoch and submit the plots later. You should be able to achieve a performance similar to Figure 2 (e.g., ~25% accuracy or 0.25 F1 score). If your performance is much worse than that, something is probably wrong.
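One way to collect the per-epoch loss and F1 score is sketched below. The text does not say which F1 average is meant, so macro-averaged F1 over all classes is an assumption, and `macro_f1` and `run_epoch` are hypothetical helper names. The sketch runs on the CPU and leaves out device handling for brevity.

```python
import torch
import torch.nn as nn


def macro_f1(preds: torch.Tensor, targets: torch.Tensor, num_classes: int) -> float:
    """Unweighted mean of per-class F1 over all `num_classes` classes.

    A class with no true samples and no predictions contributes 0.
    """
    # Confusion matrix: rows are true classes, columns are predicted classes.
    idx = targets * num_classes + preds
    cm = torch.bincount(idx, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes).float()
    tp = cm.diag()
    fp = cm.sum(dim=0) - tp
    fn = cm.sum(dim=1) - tp
    denom = 2 * tp + fp + fn
    f1 = torch.where(denom > 0, 2 * tp / denom.clamp(min=1), torch.zeros_like(tp))
    return f1.mean().item()


def run_epoch(model, loader, criterion, optimizer=None, num_classes=100):
    """One pass over `loader`: trains if an optimizer is given, else evaluates.

    Returns the mean loss per sample and the macro F1 for the pass.
    """
    training = optimizer is not None
    model.train(training)
    total_loss, n_samples = 0.0, 0
    all_preds, all_targets = [], []
    with torch.set_grad_enabled(training):
        for x, y in loader:
            logits = model(x)
            loss = criterion(logits, y)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * y.size(0)
            n_samples += y.size(0)
            all_preds.append(logits.argmax(dim=1))
            all_targets.append(y)
    f1 = macro_f1(torch.cat(all_preds), torch.cat(all_targets), num_classes)
    return total_loss / n_samples, f1
```

Calling `run_epoch` once with the optimizer on the training loader and once without it on the test loader in each of the 20 epochs, and appending the returned values to lists, gives the four curves to plot.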

Task II: Apply an existing CNN model (e.g., AlexNet, ResNet, or DenseNet) to CIFAR100 and try to get the best performance possible. The model needs to be trained for at least 10 epochs. Students may use any techniques to improve the performance, such as data augmentation, pre-training, or different optimizers. After the experiments, think about the major factors behind the performance difference between your Task II model and the Task I model, then discuss them in the report.

Submission: The submission includes the source code and a report.

The source code could be .py file(s) or .ipynb file(s).

The report needs to have two parts, one for each task.

o For each task, the report needs to include the following: 1) the experiments that have been done, 2) the highest performance that has been achieved, and 3) the plot of the training history.

o For Task II, the report also needs to include a discussion of why the performance improved or decreased compared with Task I.
