Question: Consider the following training dataset with input X = (x1, x2) and target (desired) output d.
A neuron with two inputs and one output is used for this training dataset. The activation function is linear with zero bias. The sum of squared errors is used as the loss function.
A. If backpropagation is used, what will be the weights w1 and w2 after convergence?
B. What will be the nature of the loss function? What value of the learning rate leads to convergence in the fewest iterations? Show all calculation steps.
C. To achieve convergence in the fewest iterations, would you use batch gradient descent, stochastic gradient descent, or mini-batch gradient descent, and why?
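One way parts A and B could be checked numerically is sketched below. For a single linear neuron with zero bias, y = w1*x1 + w2*x2, and the sum-of-squared-errors loss E(w) = sum over samples of (d - y)^2 is a quadratic bowl in (w1, w2), so gradient descent converges to the least-squares solution whenever the step size is small enough. For fixed-step batch gradient descent on a quadratic, the step with the best worst-case convergence rate is 2 / (lam_min + lam_max), where lam_min and lam_max are the eigenvalues of the Hessian of E. Everything specific here is an assumption: the dataset values are hypothetical (they are not the values from the question), the loss is taken without a 1/2 factor (with it, the Hessian halves and the step doubles), and the function names are illustrative.

```python
import math

# HYPOTHETICAL data for illustration only; not the dataset from the question.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
d = [2.0, 3.0, 5.0]


def gram_entries(X):
    """Entries a, b, c of the 2x2 matrix X^T X = [[a, b], [b, c]]."""
    a = sum(x1 * x1 for x1, _ in X)
    b = sum(x1 * x2 for x1, x2 in X)
    c = sum(x2 * x2 for _, x2 in X)
    return a, b, c


def normal_equation_weights(X, d):
    """Closed-form minimiser of E(w): solve (X^T X) w = X^T d."""
    a, b, c = gram_entries(X)
    p = sum(x1 * t for (x1, _), t in zip(X, d))
    q = sum(x2 * t for (_, x2), t in zip(X, d))
    det = a * c - b * b
    if det == 0:
        raise ValueError("X^T X is singular; the minimiser is not unique")
    return (c * p - b * q) / det, (a * q - b * p) / det


def hessian_eigenvalues(X):
    """Eigenvalues of the Hessian of E, which equals 2 * X^T X."""
    a, b, c = gram_entries(X)
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return 2.0 * (mean - radius), 2.0 * (mean + radius)


def batch_gradient_descent(X, d, eta, steps):
    """Full-batch gradient descent on E(w), starting from w = (0, 0)."""
    w1 = w2 = 0.0
    for _ in range(steps):
        g1 = g2 = 0.0
        for (x1, x2), t in zip(X, d):
            err = t - (w1 * x1 + w2 * x2)
            g1 -= 2.0 * err * x1  # dE/dw1 contribution of this sample
            g2 -= 2.0 * err * x2  # dE/dw2 contribution of this sample
        w1 -= eta * g1
        w2 -= eta * g2
    return w1, w2


w_star = normal_equation_weights(X, d)
lam_min, lam_max = hessian_eigenvalues(X)
eta_best = 2.0 / (lam_min + lam_max)
w_gd = batch_gradient_descent(X, d, eta_best, steps=60)

print("closed-form weights:", w_star)
print("Hessian eigenvalues:", lam_min, lam_max)
print("step size 2/(lam_min+lam_max):", eta_best)
print("gradient-descent weights:", w_gd)
```

The sketch uses full-batch updates because the step-size analysis above applies to the exact gradient. In the special case lam_min = lam_max, the step 1 / lam_max reaches the minimum in a single update.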
