6. Neural networks and backpropagation (10 points). Consider the simple two-layer network in the lecture slides. Given \(n\) training data points \((x^{i}, y^{i})\), \(i=1,\ldots,n\), the cost function used to train the neural network is \[\ell(w,\alpha,\beta)=\sum_{i=1}^{n}\left(y^{i}-\sigma\left(w^{T} z^{i}\right)\right)^{2},\] where \(\sigma(x)=1/(1+e^{-x})\) is the sigmoid function,
\(z^{i}\) is a two-dimensional vector such that \(z_{1}^{i}=\sigma(\alpha^{T} x^{i})\) and \(z_{2}^{i}=\sigma(\beta^{T} x^{i})\).
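
For concreteness, here is a minimal NumPy sketch of the forward pass and cost just defined. All function and variable names (`sigmoid`, `forward`, `loss`) are mine rather than from the lecture slides, and `x` is assumed to be a d-dimensional vector with `w` two-dimensional:

```python
import numpy as np

def sigmoid(t):
    """Logistic sigmoid sigma(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def forward(w, alpha, beta, x):
    """Hidden layer z = (sigma(alpha^T x), sigma(beta^T x)), output sigma(w^T z)."""
    z = np.array([sigmoid(alpha @ x), sigmoid(beta @ x)])
    return sigmoid(w @ z), z

def loss(w, alpha, beta, X, y):
    """ell(w, alpha, beta) = sum_i (y^i - sigma(w^T z^i))^2 over n training points."""
    return sum((y_i - forward(w, alpha, beta, x_i)[0]) ** 2 for x_i, y_i in zip(X, y))
```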
(a) (5 points) Show that the gradient is given by \[\frac{\partial \ell(w,\alpha,\beta)}{\partial w}=-\sum_{i=1}^{n}2\left(y^{i}-\sigma\left(u^{i}\right)\right)\sigma\left(u^{i}\right)\left(1-\sigma\left(u^{i}\right)\right) z^{i},\] where \(u^{i}=w^{T} z^{i}\). This is also known as backpropagation.
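
A useful sanity check before deriving the formula is to compare it against a centered finite-difference approximation of \(\partial\ell/\partial w\) on random data. The sketch below is self-contained; the helper names and the synthetic data are my assumptions, not part of the problem:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_w(w, alpha, beta, X, y):
    """Gradient from part (a): -sum_i 2 (y^i - sigma(u^i)) sigma(u^i) (1 - sigma(u^i)) z^i."""
    g = np.zeros_like(w)
    for x_i, y_i in zip(X, y):
        z = np.array([sigmoid(alpha @ x_i), sigmoid(beta @ x_i)])
        s = sigmoid(w @ z)                      # sigma(u^i), with u^i = w^T z^i
        g += -2.0 * (y_i - s) * s * (1.0 - s) * z
    return g

# Centered finite-difference check on random data.
rng = np.random.default_rng(0)
d, n = 3, 5
X, y = rng.normal(size=(n, d)), rng.normal(size=n)
w, alpha, beta = rng.normal(size=2), rng.normal(size=d), rng.normal(size=d)

def ell(w):
    return sum((y_i - sigmoid(w @ np.array([sigmoid(alpha @ x_i), sigmoid(beta @ x_i)]))) ** 2
               for x_i, y_i in zip(X, y))

eps = 1e-6
num = np.array([(ell(w + eps * e) - ell(w - eps * e)) / (2 * eps) for e in np.eye(2)])
print(np.allclose(grad_w(w, alpha, beta, X, y), num, atol=1e-6))  # expect True
```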
(b) (5 points) Also show the gradients of \(\ell(w,\alpha,\beta)\) with respect to \(\alpha\) and \(\beta\), and write down their expressions.
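
The problem leaves the part (b) expressions to the reader. As a hedged sketch rather than the official solution, applying the chain rule one layer deeper than in part (a) yields gradients of the shape encoded below, which the code also checks numerically; all names and the synthetic data are again assumptions of this sketch:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_alpha_beta(w, alpha, beta, X, y):
    """One chain-rule derivation:
    d ell / d alpha = -sum_i 2 (y^i - sigma(u^i)) sigma'(u^i) w_1 sigma'(alpha^T x^i) x^i,
    with sigma'(t) = sigma(t) (1 - sigma(t)); symmetrically for beta with w_2."""
    ga, gb = np.zeros_like(alpha), np.zeros_like(beta)
    for x_i, y_i in zip(X, y):
        z1, z2 = sigmoid(alpha @ x_i), sigmoid(beta @ x_i)
        s = sigmoid(w[0] * z1 + w[1] * z2)          # sigma(u^i)
        common = -2.0 * (y_i - s) * s * (1.0 - s)   # factor shared with part (a)
        ga += common * w[0] * z1 * (1.0 - z1) * x_i
        gb += common * w[1] * z2 * (1.0 - z2) * x_i
    return ga, gb

# Finite-difference check in alpha on random data.
rng = np.random.default_rng(1)
d, n = 3, 5
X, y = rng.normal(size=(n, d)), rng.normal(size=n)
w, alpha, beta = rng.normal(size=2), rng.normal(size=d), rng.normal(size=d)

def ell(a):
    return sum((y_i - sigmoid(w[0] * sigmoid(a @ x_i) + w[1] * sigmoid(beta @ x_i))) ** 2
               for x_i, y_i in zip(X, y))

eps = 1e-6
num = np.array([(ell(alpha + eps * e) - ell(alpha - eps * e)) / (2 * eps) for e in np.eye(d)])
print(np.allclose(grad_alpha_beta(w, alpha, beta, X, y)[0], num, atol=1e-6))  # expect True
```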