Question: (Softmax activation) Now consider the neural network in problem 1. If the activation in the output layer is the softmax activation: for $i = 1, 2$,

$$\mathrm{Softmax}(z_i) = \frac{\exp(z_i)}{\sum_{j=1}^{2} \exp(z_j)},$$

then $\hat{y}_i = \mathrm{Softmax}(z_i^{(4)})$, or $\hat{y} = \mathrm{Softmax}(z^{(4)})$. Consider the cross-entropy loss function for a binary classification:

$$L = -y \ln \hat{y}_1 - (1 - y) \ln \hat{y}_2,$$

where $y$ stands for the true target, which is in $[0, 1]$. Answer the following questions:

- Compute $\partial L / \partial z^{(4)}$.
- Compute $\partial L / \partial W^{(3)}$.

Here is problem 1 that it is referring to:

(Problem 3*) Consider a 4-layer neural network shown below: the first layer is the input layer, suppose there is no bias in any layer's forward pass, and the ReLU function $f(x) = \max\{0, x\}$ is the activation for every layer except the output layer (i.e., the output layer does not have an activation). Answer the following questions.

[Figure: diagram of the 4-layer fully connected network]

- List the size of the weight matrix $W^{(l)}$ associated with the $l$-th layer, $l \in \{1, 2, 3\}$.
- Let $\hat{y} = (\hat{y}_1, \hat{y}_2)^T$ be the output of this neural network with an input of $x$; compute $\partial \hat{y} / \partial W^{(k)}$ for $k = 2, 3$.
- If $L = \|\hat{y} - y\|_2^2$, compute $\partial L / \partial W^{(2)}$.
- If $L = \frac{1}{2N} \sum_{i=1}^{N} \|\hat{y}^{(i)} - y^{(i)}\|_2^2$, where $\hat{y}^{(i)}$ is the output of this neural network with an input of the $i$-th sample $x^{(i)}$ in the dataset, and $y^{(i)}$ is the true target of the $i$-th sample, compute $\partial L / \partial W^{(3)}$.
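For reference, a minimal sketch of the standard softmax-plus-cross-entropy gradient, assuming the forward pass of problem 1 is $z^{(4)} = W^{(3)} a^{(3)}$ with no bias, and writing the target as the vector $t = (y,\, 1-y)^T$. Using $\partial \hat{y}_i / \partial z_j = \hat{y}_i(\delta_{ij} - \hat{y}_j)$ and $\hat{y}_1 + \hat{y}_2 = 1$:

$$\frac{\partial L}{\partial z^{(4)}} = \hat{y} - t = \begin{pmatrix} \hat{y}_1 - y \\ \hat{y}_2 - (1 - y) \end{pmatrix}, \qquad \frac{\partial L}{\partial W^{(3)}} = \frac{\partial L}{\partial z^{(4)}} \, \bigl(a^{(3)}\bigr)^T = (\hat{y} - t)\,\bigl(a^{(3)}\bigr)^T,$$

where $a^{(3)}$ denotes the ReLU output of the third layer.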
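As a sanity check, here is a minimal NumPy sketch (the values below are hypothetical, not taken from the problem) that compares the analytic gradient $\hat{y} - t$ against a central finite-difference estimate of $\partial L / \partial z^{(4)}$:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(z, y):
    # L = -y * ln(yhat_1) - (1 - y) * ln(yhat_2), with yhat = softmax(z).
    yhat = softmax(z)
    return -y * np.log(yhat[0]) - (1.0 - y) * np.log(yhat[1])

# Hypothetical example values (not from the original problem).
rng = np.random.default_rng(0)
z = rng.normal(size=2)   # pre-activation z^(4) of the 2-unit output layer
y = 1.0                  # true target

# Analytic gradient: dL/dz = softmax(z) - t, with target vector t = (y, 1-y)^T.
t = np.array([y, 1.0 - y])
analytic = softmax(z) - t

# Central finite-difference estimate of the same gradient.
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(2):
    dz = np.zeros_like(z)
    dz[i] = eps
    numeric[i] = (cross_entropy(z + dz, y) - cross_entropy(z - dz, y)) / (2 * eps)

print("analytic:", analytic)
print("numeric :", numeric)
assert np.allclose(analytic, numeric, atol=1e-6)
```

The same check extends to $\partial L / \partial W^{(3)}$ by perturbing entries of a weight matrix instead of $z$, once the sizes from problem 1 are fixed.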
