Question: Please answer only parts (a) and (b).

3. a) Derive the update rule for the weights in the output layer of a neural network using the gradient descent rule. Assume that the sigmoid function is used as the activation function, the quadratic loss as the error function, and that L1 regularisation is applied.

b) Assume the network's error function is E0. How is it modified when L2 regularisation is applied? Describe how this type of regularisation works and how it differs from L1 regularisation.

c) Assume that you wish to train a classifier on a large dataset. How would you estimate its generalisation performance and optimise its parameters? Describe briefly the procedure that you would follow.

d) Compute the classification rate for the given confusion matrix. Do you think the classification rate is a suitable performance measure in this case? Explain your reasoning and the alternatives.

Class 1 . Predicted Class 2 - Class 3 Class 1 - Actual 1000 Class 2 - Actual20 Class 3 - Actual Predicted 100 0 10 Predicted 50 10 0 10

The four parts carry, respectively, 40%, 20%, 20%, 20% of the marks.
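For parts (a) and (b), the update rules can be sketched in code. This is a minimal illustration under stated assumptions, not an official solution: a single training example, output o_k = sigmoid(sum_j w_kj * h_j) where h is the hidden-layer activation vector, data error E0 = 1/2 * sum_k (t_k - o_k)^2, an L1 penalty lam * sum |w| and an L2 penalty (lam / 2) * sum w^2. The names W, h, t, eta and lam are hypothetical and do not come from the question. Under these assumptions the chain rule gives dE0/dw_kj = -(t_k - o_k) * o_k * (1 - o_k) * h_j, so the L1 update is w_kj <- w_kj + eta * (t_k - o_k) * o_k * (1 - o_k) * h_j - eta * lam * sign(w_kj), while the L2 update replaces the last term with - eta * lam * w_kj.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sign(x):
    # Subgradient convention: sign(0) is taken as 0.
    return (x > 0) - (x < 0)


def forward(W, h):
    # One sigmoid output unit per row of W; h holds the hidden activations.
    return [sigmoid(sum(w * x for w, x in zip(row, h))) for row in W]


def loss(W, h, t, lam, penalty):
    # Quadratic data error E0 plus the chosen regularisation term.
    o = forward(W, h)
    e0 = 0.5 * sum((tk - ok) ** 2 for tk, ok in zip(t, o))
    if penalty == "l1":
        reg = lam * sum(abs(w) for row in W for w in row)
    else:  # "l2"
        reg = 0.5 * lam * sum(w * w for row in W for w in row)
    return e0 + reg


def gradient(W, h, t, lam, penalty):
    # dE/dw_kj = (o_k - t_k) * o_k * (1 - o_k) * h_j + penalty gradient.
    o = forward(W, h)
    grad = []
    for row, tk, ok in zip(W, t, o):
        delta = (ok - tk) * ok * (1.0 - ok)
        if penalty == "l1":
            grad.append([delta * x + lam * sign(w) for w, x in zip(row, h)])
        else:  # "l2"
            grad.append([delta * x + lam * w for w, x in zip(row, h)])
    return grad


def update(W, h, t, eta, lam, penalty="l1"):
    # One gradient descent step: w <- w - eta * dE/dw.
    g = gradient(W, h, t, lam, penalty)
    return [[w - eta * gw for w, gw in zip(row, grow)]
            for row, grow in zip(W, g)]
```

The two penalties differ only in the last gradient term. With L1 every non-zero weight is pushed towards zero by the same constant amount eta * lam regardless of its size, which tends to drive small weights exactly to zero and yields sparse solutions. With L2 the shrinkage is proportional to the weight itself, w <- (1 - eta * lam) * w - eta * dE0/dw, which is often called weight decay: large weights are penalised most and weights become small but rarely exactly zero.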
