
Question:

Loss functions. Consider the following two loss functions: (1) mean-squared error $\mathrm{Loss}(T,O) = \frac{1}{2}(T - O)^2$, and (2) cross-entropy $\mathrm{Loss}(T,O) = -T\log O - (1 - T)\log(1 - O)$ for binary classification. Assume the activation function is sigmoid.

a. Show the derivation of the error δ for the output unit in the backpropagation process, and compare the two loss functions (e.g., potential problems each might produce).
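A sketch of the derivation for part (a), using the convention that $\delta$ is the derivative of the loss with respect to the net input of the output unit (some texts use the negative of this; only the sign differs):

```latex
% Both derivations apply the chain rule through the sigmoid,
% whose derivative is  \partial O / \partial net = O(1 - O).

% (1) Mean-squared error:
\frac{\partial \mathrm{Loss}}{\partial O} = -(T - O) = O - T,
\qquad
\delta = \frac{\partial \mathrm{Loss}}{\partial O}
         \cdot \frac{\partial O}{\partial net}
       = (O - T)\, O (1 - O).

% (2) Cross-entropy:
\frac{\partial \mathrm{Loss}}{\partial O}
       = -\frac{T}{O} + \frac{1 - T}{1 - O}
       = \frac{O - T}{O(1 - O)},
\qquad
\delta = \frac{O - T}{O(1 - O)} \cdot O(1 - O)
       = O - T.
```

Comparison: the MSE error carries the extra factor $O(1-O)$, which approaches zero when the sigmoid saturates ($O \to 0$ or $O \to 1$), so learning can stall even when the prediction is badly wrong (the vanishing-gradient problem at the output). With cross-entropy that factor cancels, leaving $\delta = O - T$, which stays proportional to the error.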

b. Now, we wish to generalize the cross-entropy loss to the scenario of multiclass classification. The target output is a one-hot vector of length C (i.e., the number of total classes), and the index of the nonzero element (i.e., 1) represents the class label. The output is also a vector of the same length, $O = [O_1, O_2, \ldots, O_C]$. Show the derivation of the categorical cross-entropy loss and the error δ of the output unit. (Hint: there are two key steps: (1) normalizing the output values by scaling them between 0 and 1, and (2) deriving the cross-entropy loss following the definition for the binary case, where the loss can be represented as $\mathrm{Loss}(T,O) = -\sum_{i=1}^{2} T_i \log(O_i)$.)
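A sketch of the derivation for part (b). One standard normalization consistent with the hint is the softmax, which maps the raw outputs $z_j$ into $(0,1)$ so they sum to 1:

```latex
% Step 1: normalize with softmax:
O_j = \frac{e^{z_j}}{\sum_{k=1}^{C} e^{z_k}}

% Step 2: categorical cross-entropy, generalizing the binary definition:
\mathrm{Loss}(T, O) = -\sum_{i=1}^{C} T_i \log(O_i)

% Error of output unit j, using the softmax Jacobian
%   \partial O_i / \partial z_j = O_i(\mathbb{1}[i=j] - O_j):
\delta_j = \frac{\partial \mathrm{Loss}}{\partial z_j}
         = -\sum_{i=1}^{C} \frac{T_i}{O_i}\, O_i(\mathbb{1}[i=j] - O_j)
         = O_j \sum_{i=1}^{C} T_i \;-\; T_j
         = O_j - T_j,
% since T is one-hot and therefore \sum_i T_i = 1.
```

Note the result has exactly the same form as the binary case: $\delta_j = O_j - T_j$.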


Step by Step Answer:
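As a sanity check on the result $\delta = O - T$ for the sigmoid/cross-entropy pair, the analytic gradient can be compared against a central finite difference. This is an illustrative snippet, not part of the original question; the helper names (`bce`, `delta_numeric`) are invented for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(t, o):
    # Binary cross-entropy for a single output unit:
    # Loss(T, O) = -T log O - (1 - T) log(1 - O)
    return -t * math.log(o) - (1 - t) * math.log(1 - o)

def delta_numeric(t, z, h=1e-6):
    # Central finite difference of the loss w.r.t. the net input z
    return (bce(t, sigmoid(z + h)) - bce(t, sigmoid(z - h))) / (2 * h)

t, z = 1.0, 0.3
o = sigmoid(z)
analytic = o - t            # delta = O - T from the derivation
numeric = delta_numeric(t, z)
print(abs(analytic - numeric) < 1e-6)  # True: the two gradients agree
```

The same check applied to MSE would show the extra $O(1-O)$ factor: its numerical gradient matches $(O - T)\,O(1 - O)$ instead.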

Related Book:

Data Mining Concepts And Techniques

ISBN: 9780128117613

4th Edition

Authors: Jiawei Han, Jian Pei, Hanghang Tong
