Question: Please answer the question at the bottom: what is the gradient descent update to $w_{1,2}^{[1]}$ with a learning rate of $\alpha$? Please write it in terms of $x^{(i)}$, $y^{(i)}$, $o^{(i)}$, and the weights. The sigmoid function is the activation function for $h_1$, $h_2$, $h_3$, and $o$.
Q2)
Let $X = \{x^{(1)}, \cdots, x^{(m)}\}$ be a dataset of $m$ samples with 2 features, i.e., $x^{(i)} \in \mathbb{R}^2$. The samples are classified into 2 categories with labels $y^{(i)} \in \{0, 1\}$. A scatter plot of the dataset is shown in the following figure:
The examples in class 1 are marked as "x" and examples in class 0 are marked as "o". We want to perform binary classification using a simple neural network with the architecture shown in the following figure:
Denote the two features $x_1$ and $x_2$, the three neurons in the hidden layer $h_1$, $h_2$, and $h_3$, and the output neuron as $o$. Let the weight from $x_i$ to $h_j$ be $w_{i,j}^{[1]}$ for $i \in \{1,2\}$, $j \in \{1,2,3\}$, and the weight from $h_j$ to $o$ be $w_j^{[2]}$. Finally, denote the intercept weight for $h_j$ as $w_{0,j}^{[1]}$, and the intercept weight for $o$ as $w_0^{[2]}$. For the loss function, we'll use the average squared loss instead of the usual negative log-likelihood:
$$\ell = \frac{1}{m}\sum_{i=1}^{m}\left(o^{(i)} - y^{(i)}\right)^2$$
where $o^{(i)}$ is the result of the output neuron for example $i$.
Suppose we use the sigmoid function as the activation function for $h_1$, $h_2$, $h_3$, and $o$. What is the gradient descent update to $w_{1,2}^{[1]}$, assuming we use a learning rate of $\alpha$? Your answer should be written in terms of $x^{(i)}$, $o^{(i)}$, $y^{(i)}$, and the weights.
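The update follows from the chain rule through $o$, $h_2$, and the pre-activation of $h_2$. As a sanity check (a sketch under assumed names and array layouts, not the graded solution), the snippet below implements the forward pass of this 2-3-1 sigmoid network and verifies the chain-rule gradient of the average squared loss with respect to $w_{1,2}^{[1]}$ against a finite-difference estimate:

```python
import numpy as np

# Sketch of the question's network (assumed shapes/names, not from the original):
# W1[i-1, j-1] = w_{i,j}^{[1]}, b1[j-1] = w_{0,j}^{[1]},
# w2[j-1] = w_j^{[2]},         b2      = w_0^{[2]}.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, w2, b2):
    H = sigmoid(X @ W1 + b1)   # hidden activations h_1..h_3, shape (m, 3)
    o = sigmoid(H @ w2 + b2)   # output neuron, shape (m,)
    return H, o

def loss(X, y, W1, b1, w2, b2):
    _, o = forward(X, W1, b1, w2, b2)
    return np.mean((o - y) ** 2)   # average squared loss from the question

def grad_w12(X, y, W1, b1, w2, b2):
    # Chain rule for d(loss)/d(w_{1,2}^{[1]}):
    # (2/m) * sum_i (o - y) * o(1-o) * w_2^{[2]} * h_2(1-h_2) * x_1
    H, o = forward(X, W1, b1, w2, b2)
    h2 = H[:, 1]
    return np.mean(2 * (o - y) * o * (1 - o) * w2[1] * h2 * (1 - h2) * X[:, 0])

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
y = rng.integers(0, 2, size=5).astype(float)
W1 = rng.normal(size=(2, 3)); b1 = rng.normal(size=3)
w2 = rng.normal(size=3);      b2 = rng.normal()

# Central finite difference on the (1,2) entry of W1.
eps = 1e-6
Wp, Wm = W1.copy(), W1.copy()
Wp[0, 1] += eps
Wm[0, 1] -= eps
numeric = (loss(X, y, Wp, b1, w2, b2) - loss(X, y, Wm, b1, w2, b2)) / (2 * eps)
analytic = grad_w12(X, y, W1, b1, w2, b2)
print(abs(numeric - analytic))
```

If the analytic expression is correct, the gradient descent update is then $w_{1,2}^{[1]} \leftarrow w_{1,2}^{[1]} - \alpha \cdot \partial\ell / \partial w_{1,2}^{[1]}$ with the learning rate $\alpha$ from the question.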