Question: Neural Nets 1. (2 pts) Why should one use bipolar encodings of binary values instead of standard binary encoding? 2. (2 pts) Why is momentum

Neural NetsNeural Nets 1. (2 pts) Why should one use bipolar encodings of

1. (2 pts) Why should one use bipolar encodings of binary values instead of standard binary encoding? 2. (2 pts) Why is momentum often used in combination with stochastic gradient descent (SGD)? 3. (2 pts) Why does SGD have trouble converging when the gradient has small magnitude values? 4. (2 pts) Why does SGD have trouble converging when the gradient has large magnitude values? 5. (2 pts) What can happen if the learning rate is set too high? 6. (2 pts) What can happen if the learning rate is set too low? 7. (2 pts) What is the name of the matrix of second order partial derivatives of the error function? 8. (2 pts) Why is the matrix of second order partial derivatives not commonly used to help train neural networks? 9. (2 pts) Why is L-BFGS not commonly used for optimization of error functions with neural networks? 10. (2 pts) What is the name of one non-linear activation function which helps prevent gradients from approaching zero magnitude in deep neural networks

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!