Question 6 [2 pts]: To reduce the risk of neural network overfitting, one solution is to add a
penalty term for the weight magnitude. For example, add a term to the squared error that
increases with the magnitude of the weight vector. This causes the gradient descent search to
seek weight vectors with small magnitudes, thereby reducing the risk of overfitting. Given a single-layer neural network with $K$ output nodes (i.e., no hidden layer), assume the squared error is defined by

$$E(\vec{w}) = \frac{1}{2}\sum_{d=1}^{D}\sum_{k=1}^{K}\left(t_{kd}-o_{kd}\right)^{2} + \gamma\sum_{k}\sum_{i} w_{ki}^{2}$$

where $D$ denotes the total number of training instances, $t_{kd}$ denotes the desired output of the $d$-th instance from the $k$-th output node, $o_{kd}$ is the actual output of the $d$-th instance observed from the $k$-th output node, and $w_{ki}$ is the $i$-th weight value of the $k$-th output node. Assume each output node uses the sigmoid activation function, $o_{kd}=\sigma(net_{kd})=1/(1+e^{-net_{kd}})$, where $net_{kd}=\sum_{i} w_{ki}\,x_{id}$.
(a) Calculate the partial derivative of $E$ with respect to weight $w_{ki}$. [1 pt]
(b) Derive the weight-update rule for the weight $w_{ki}$ of output node $k$. Hint: use gradient descent. [1 pt]
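A sketch of the derivation, assuming the error function as reconstructed above (the learning rate $\eta$ and the inputs $x_{id}$ are assumed symbols, not taken verbatim from the original):

For part (a), the chain rule with the sigmoid derivative $\partial o_{kd}/\partial net_{kd}=o_{kd}(1-o_{kd})$ and $\partial net_{kd}/\partial w_{ki}=x_{id}$ gives

$$\frac{\partial E}{\partial w_{ki}} = -\sum_{d=1}^{D}\left(t_{kd}-o_{kd}\right)o_{kd}\left(1-o_{kd}\right)x_{id} + 2\gamma\, w_{ki}.$$

For part (b), gradient descent with learning rate $\eta$ then yields

$$w_{ki} \leftarrow w_{ki} - \eta\frac{\partial E}{\partial w_{ki}} = \left(1-2\eta\gamma\right)w_{ki} + \eta\sum_{d=1}^{D}\left(t_{kd}-o_{kd}\right)o_{kd}\left(1-o_{kd}\right)x_{id}.$$

The factor $(1-2\eta\gamma)$ shrinks every weight toward zero at each step, which is exactly the weight-decay effect the question describes.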
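As a quick numerical sanity check of the derivation, here is a minimal Python sketch. It is not part of the original question: the data shapes, $\gamma$, $\eta$, and the random values are illustrative assumptions. It compares the analytic gradient above against a finite-difference estimate and applies one update step.

import numpy as np

# Sanity check for the derived gradient and update rule (a sketch; shapes,
# gamma, eta, and the random data are illustrative assumptions).
rng = np.random.default_rng(0)
D, I, K = 5, 3, 2                 # training instances, inputs, output nodes
X = rng.normal(size=(D, I))       # x_id: input i of instance d
T = rng.uniform(size=(D, K))      # t_kd: desired output of instance d, node k
W = rng.normal(size=(K, I))       # w_ki: weight i of output node k
gamma, eta = 0.1, 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def error(W):
    O = sigmoid(X @ W.T)          # o_kd for every instance and output node
    return 0.5 * np.sum((T - O) ** 2) + gamma * np.sum(W ** 2)

def gradient(W):
    O = sigmoid(X @ W.T)
    delta = -(T - O) * O * (1 - O)        # dE/d(net_kd), shape (D, K)
    return delta.T @ X + 2 * gamma * W    # dE/dw_ki, shape (K, I)

# Finite-difference check of one entry, e.g. w_{0,1}
eps = 1e-6
Wp, Wm = W.copy(), W.copy()
Wp[0, 1] += eps
Wm[0, 1] -= eps
print("analytic:", gradient(W)[0, 1])
print("numeric: ", (error(Wp) - error(Wm)) / (2 * eps))

# One gradient-descent step; equivalent to the (1 - 2*eta*gamma) form above
W_next = W - eta * gradient(W)

The two printed values should agree to several decimal places, which confirms both the sign of the error term and the $2\gamma w_{ki}$ contribution of the penalty in part (a).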