Question: We are using gradient descent to learn the parameters of a simple neural network for binary classification: f(x) = σ(w₁x + w₀), where x, w₀, w₁ ∈ ℝ and σ is the sigmoid function. We are more likely to encounter the problem of vanishing gradients if we initialize the parameters (w₀, w₁) to very large values.

Choice 1 of 2: True
Choice 2 of 2: False
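The statement is True: for large |w₀|, |w₁| the pre-activation z = w₁x + w₀ is large in magnitude, the sigmoid saturates near 0 or 1, and its derivative σ'(z) = σ(z)(1 − σ(z)) is close to zero. Since the chain rule multiplies every parameter gradient by this σ'(z) factor, the updates to w₀ and w₁ become vanishingly small. A minimal sketch in plain Python (the helper names `sigmoid` and `sigmoid_grad` are my own, not from the question):

```python
import math

def sigmoid(z):
    """Sigmoid activation sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Set w1 = w0 = w and evaluate at x = 1, so z = w * 1 + w = 2w.
# As w grows, sigma'(z) collapses toward zero -> vanishing gradients.
for w in [0.1, 1.0, 10.0, 100.0]:
    z = w * 1.0 + w
    print(f"w = {w:6.1f}   sigmoid'(z) = {sigmoid_grad(z):.3e}")
```

Running this shows σ'(z) shrinking by many orders of magnitude as w grows, which is why small (e.g. near-zero) initialization is preferred with sigmoid activations.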
