Question:

Assume throughout this exercise that we are using gradient descent to minimize the error as defined in formula (4.2) on p.89 in the textbook:

$$E(\vec{w}) = \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2$$

Recall that the corresponding weight update rule for a sigmoid unit like the one in Figure 4.6 on p.96 in the textbook is:

$$\Delta w_i = \eta \sum_{d \in D} (t_d - o_d) \, o_d \, (1 - o_d) \, x_{i,d}$$

Let us replace the sigmoid function $\sigma$ in Figure 4.6 by the function "tanh". In other words, the output of the unit is now:

$$o = \tanh(\vec{w} \cdot \vec{x})$$

Derive the new weight update rule. Show your work, and indicate clearly in your answer what the weight update rule for the tanh unit looks like.

Hint: $\tanh'(z) = 1 - (\tanh(z))^2$
Step by Step Solution
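The following is a sketch of one way the derivation could go, not the site's own solution. It assumes formula (4.2) is the usual sum-of-squares error over the training set $D$, writes $\eta$ for the learning rate, and introduces $net_d = \sum_i w_i x_{i,d}$ as shorthand for the unit's weighted input on example $d$; these names are assumptions for illustration.

```latex
\begin{align*}
E(\vec{w}) &= \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2,
  \qquad o_d = \tanh(net_d), \qquad net_d = \sum_i w_i x_{i,d} \\
\frac{\partial E}{\partial w_i}
  &= \sum_{d \in D} (t_d - o_d) \, \frac{\partial (t_d - o_d)}{\partial w_i}
   = -\sum_{d \in D} (t_d - o_d) \, \frac{\partial o_d}{\partial net_d} \, \frac{\partial net_d}{\partial w_i} \\
  &= -\sum_{d \in D} (t_d - o_d) \, (1 - o_d^2) \, x_{i,d}
  \qquad \text{(hint: } \tanh'(z) = 1 - \tanh^2(z)\text{)} \\
\Delta w_i &= -\eta \, \frac{\partial E}{\partial w_i}
   = \eta \sum_{d \in D} (t_d - o_d) \, (1 - o_d^2) \, x_{i,d}
\end{align*}
```

Under these assumptions the only change from the sigmoid rule is the derivative factor: $o_d (1 - o_d)$ becomes $(1 - o_d^2)$.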
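A candidate answer can be sanity-checked numerically. The sketch below assumes the error is E = 1/2 * sum over d of (t_d - o_d)^2 and that the tanh rule takes the form delta_w_i = eta * sum over d of (t_d - o_d) * (1 - o_d^2) * x_{i,d}, which is what the hint suggests. It compares that update with a finite-difference estimate of -eta * dE/dw_i. All function names and the random data are hypothetical, chosen only for illustration.

```python
import math
import random


def unit_output(w, x):
    """Output of a single tanh unit: o = tanh(w . x)."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def error(w, X, t):
    """Assumed error: E(w) = 1/2 * sum_d (t_d - o_d)^2."""
    return 0.5 * sum((td - unit_output(w, xd)) ** 2 for xd, td in zip(X, t))


def tanh_weight_update(w, X, t, eta):
    """Candidate batch rule: eta * sum_d (t_d - o_d) * (1 - o_d^2) * x_{i,d}."""
    delta = [0.0] * len(w)
    for xd, td in zip(X, t):
        od = unit_output(w, xd)
        for i, xid in enumerate(xd):
            delta[i] += eta * (td - od) * (1.0 - od * od) * xid
    return delta


def numerical_gradient(w, X, t, eps=1e-6):
    """Central-difference estimate of dE/dw_i for each weight."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad.append((error(w_plus, X, t) - error(w_minus, X, t)) / (2 * eps))
    return grad


# Hypothetical toy data: 20 examples with 3 inputs each, targets in [-1, 1].
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(20)]
t = [random.uniform(-1, 1) for _ in range(20)]
w = [random.gauss(0, 0.5) for _ in range(3)]
eta = 0.05

delta_w = tanh_weight_update(w, X, t, eta)
grad = numerical_gradient(w, X, t)

# If the rule is right, delta_w should equal -eta * grad up to numerical noise.
max_gap = max(abs(dw + eta * g) for dw, g in zip(delta_w, grad))
print("largest mismatch between the rule and -eta * dE/dw:", max_gap)
```

A mismatch close to zero supports the candidate rule. Swapping in the sigmoid factor `od * (1 - od)` while still computing the output with tanh should produce a clearly larger gap.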
