The activation function used to train the network is f(x) = 1 / (1 + e^(-(w.x + b))), and the loss function is the squared error loss. Using the gradient descent algorithm, show how the values of w and b vary for each of the following initial setups:
i) w is negative and b is negative.
ii) w is positive and b is positive.
iii) w is positive and b is negative.
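
One way to explore the three setups numerically is sketched below. The question gives no training data, learning rate, or starting magnitudes, so everything concrete here is an assumption made for illustration: a single scalar training example (x, y) = (1, 1), a learning rate of 1.0, starting values of magnitude 2, and the common convention L = 0.5 * (f - y)^2 for the squared error. With z = w.x + b, the chain rule gives dL/dw = (f - y) * f * (1 - f) * x and dL/db = (f - y) * f * (1 - f), and gradient descent subtracts the learning rate times these gradients from w and b.

```python
import math


def sigmoid(z):
    """Logistic activation f = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Squared error with the conventional 1/2 factor (an assumption)."""
    return 0.5 * (sigmoid(w * x + b) - y) ** 2


def gradient_step(w, b, x, y, lr):
    """One gradient-descent update of (w, b) on a single example (x, y)."""
    f = sigmoid(w * x + b)
    delta = (f - y) * f * (1.0 - f)  # dL/dz, where z = w*x + b
    return w - lr * delta * x, b - lr * delta


def train(w, b, x=1.0, y=1.0, lr=1.0, steps=1000):
    """Repeat the update; x, y, lr and steps are illustrative defaults."""
    for _ in range(steps):
        w, b = gradient_step(w, b, x, y, lr)
    return w, b


# Hypothetical starting points, one per setup in the question.
cases = {
    "i)   w < 0, b < 0": (-2.0, -2.0),
    "ii)  w > 0, b > 0": (2.0, 2.0),
    "iii) w > 0, b < 0": (2.0, -2.0),
}

if __name__ == "__main__":
    for label, (w0, b0) in cases.items():
        w1, b1 = gradient_step(w0, b0, 1.0, 1.0, 1.0)
        wn, bn = train(w0, b0)
        print(f"{label}: first step dw={w1 - w0:.5f}, db={b1 - b0:.5f}; "
              f"after training w={wn:.3f}, b={bn:.3f}")
```

Under these assumptions the target y = 1 lies above every sigmoid output, so w and b increase in all three setups, but at very different rates. In setups i) and ii) the sigmoid starts in a saturated region where f * (1 - f) is small, so the early updates are tiny. In setup iii) the starting point gives z = 0, where f * (1 - f) is at its maximum, so the first updates are the largest. A different choice of x or y would change the direction of the updates, since the sign of each gradient depends on (f - y) and on x.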
