Question: Suppose you make two CNN architectures, one with 15 layers (A) and another with 35 layers (B), and train them both with identical infrastructure, training scheme, etc. on a classification task. The performance of the models is as follows:
ModelA: Training accuracy validation accuracy
ModelB: Training accuracy validation accuracy
Which of the following could be a possible explanation for these results?
ModelB, being huge, is overfitting and is thus not performing well
ModelB, being huge, is difficult to train because of issues such as vanishing/exploding gradients
ModelB is underfitting, perhaps because larger networks sometimes have low 'learning capacity'
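
To make the vanishing-gradient option more concrete, here is a minimal sketch, assuming a toy scalar chain of sigmoid layers rather than a real CNN. The function name `input_gradient`, the unit weight, and the starting input 0.5 are illustrative assumptions and are not part of the question; only the depths 15 and 35 mirror the two layer counts. It shows how the gradient reaching the input shrinks as the number of layers grows.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(depth, weight=1.0, x=0.5):
    """Gradient of the last activation w.r.t. the input of a scalar
    chain of `depth` sigmoid layers (hypothetical toy model)."""
    # Forward pass: store every activation for the backward pass.
    activations = []
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        activations.append(a)

    # Backward pass: chain rule, one factor per layer.
    # d/dx sigmoid(w * x) = w * s * (1 - s), and s * (1 - s) <= 0.25.
    grad = 1.0
    for a in reversed(activations):
        grad *= weight * a * (1.0 - a)
    return grad


if __name__ == "__main__":
    for depth in (15, 35):
        print(f"depth={depth:2d}  gradient at input={input_gradient(depth):.3e}")
```

Because each sigmoid layer contributes a factor of at most 0.25 when the weight is 1, the gradient at the input of the deeper chain is many orders of magnitude smaller than that of the shallower one. Real CNNs with ReLU activations, normalization, or skip connections behave differently, so this is only a picture of the mechanism, not a model of either network in the question.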
