Question: The activation function used to train the network is given as f(x) = 1 / (1 + e^-(wx + b)) and the loss function used is the squared error loss. With respect to this, using the gradient descent algorithm, show how the values of w and b vary if the initial setup is:
(i) w is chosen as ...ve and b is ...ve
(ii) w is chosen ...ve and b is also chosen ...ve
(iii) w is chosen ...ve and b is chosen ...ve
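A minimal sketch of how w and b move under gradient descent for this sigmoid neuron with squared error loss is given below. The training data, the learning rate eta, the number of epochs, and the concrete initial values (w0, b0) are illustrative assumptions, since the question only fixes the signs of w and b. For a single example (x, y) with f(x) = 1 / (1 + e^-(wx + b)) and L = 1/2 (f(x) - y)^2, the gradients are dL/dw = (f(x) - y) f(x)(1 - f(x)) x and dL/db = (f(x) - y) f(x)(1 - f(x)), and each step applies w := w - eta * dL/dw and b := b - eta * dL/db.

import math

def sigmoid(z):
    # value of f(x) = 1 / (1 + e^-(w*x + b)), with z = w*x + b
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, data):
    # squared error loss: L = 1/2 * sum of (f(x) - y)^2 over the data
    return 0.5 * sum((sigmoid(w * x + b) - y) ** 2 for x, y in data)

def gradients(w, b, data):
    # dL/dw = sum of (f(x) - y) * f(x) * (1 - f(x)) * x
    # dL/db = sum of (f(x) - y) * f(x) * (1 - f(x))
    dw = db = 0.0
    for x, y in data:
        fx = sigmoid(w * x + b)
        dw += (fx - y) * fx * (1.0 - fx) * x
        db += (fx - y) * fx * (1.0 - fx)
    return dw, db

def gradient_descent(w, b, data, eta=1.0, epochs=5):
    # update rules: w := w - eta * dL/dw,  b := b - eta * dL/db
    print(f"start:   w = {w:+.3f}, b = {b:+.3f}, loss = {loss(w, b, data):.4f}")
    for t in range(1, epochs + 1):
        dw, db = gradients(w, b, data)
        w, b = w - eta * dw, b - eta * db
        print(f"epoch {t}: w = {w:+.3f}, b = {b:+.3f}, loss = {loss(w, b, data):.4f}")
    return w, b

# Toy training set (assumed; the question gives no data points).
data = [(0.5, 0.2), (2.5, 0.9)]

# Hypothetical sign combinations for the initial w and b; substitute the
# signs given in parts (i), (ii) and (iii) of the question.
for w0, b0 in [(2.0, -2.0), (-2.0, -2.0), (2.0, 2.0)]:
    gradient_descent(w0, b0, data)
    print()

Running the loop with each sign combination shows the qualitative behaviour the question is after: when the initial w and b put f(x) deep in a saturated (flat) region of the sigmoid, the factor f(x)(1 - f(x)) is close to zero, so both gradients are tiny and w and b barely change for many epochs, whereas a start in the steep region lets both parameters move quickly toward values that reduce the loss.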
