Question: The LSTM (without bias units and forget gate) is defined as z(t) = tanh(Ux

The LSTM (without bias units and forget gate) is defined as
$z(t) = \tanh(U x(t) + P h(t-1))$
$i(t) = \sigma(V x(t) + Q h(t-1))$
$c(t) = c(t-1) + z(t) \odot i(t)$
$o(t) = \sigma(W x(t) + R h(t-1))$
$h(t) = \tanh(c(t)) \odot o(t)$
with input vectors $x(t)$, hidden activation vectors $h(t)$, memory cell state vectors $c(t)$, gate activation vectors $z(t), i(t), o(t)$, and weight matrices $P, Q, R, U, V, W$. Let $L(t) = L(y(t), \hat{y}(t))$ denote the loss at time $t$ and let $L = \sum_{t=1}^{T} L(t)$ denote the total loss. We use the denominator-layout convention, i.e., $\partial L / \partial c(t)$ is a column vector. The diag operator turns a vector into a diagonal matrix, i.e., $\operatorname{diag}((1,1)^T) = I \in \mathbb{R}^{2 \times 2}$, and $\odot$ denotes the Hadamard product. Which of the following statements are true?
a. If we choose $z(t) = \sigma(U x(t) + P h(t-1))$, then $E[z(t)] > 0$ and the memory cells will always increase at every time step. This can be a problem for very long sequences but can also be helpful for certain problems.
b. The gradient of the loss with respect to the hiddens is
c. Because of the simple structure of the memory cell, the LSTM architecture fails to be Turing complete.
d. The LSTM architecture has no exploding or vanishing gradients because $\partial h(t) / \partial h(t-1)$ is an orthogonal matrix.
e. The memory cell fulfills $\partial c(t) / \partial c(t-\tau) = I$ (neglecting dependencies via the hiddens) for any $\tau \in \{1, \dots, t-1\}$, where $I$ is the identity matrix. This solves the vanishing gradient problem and is called constant error carousel.
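
The recurrence in the question can be checked numerically. The following is a minimal NumPy sketch, not part of the original question: the function names `lstm_step` and `cell_jacobian`, the dimensions, the random weights, and the choice of the logistic sigmoid for the gates are all assumptions made for illustration. It estimates the Jacobian of c(t) with respect to c(t-1) by central differences while holding h(t-1) fixed (that is, neglecting dependencies via the hiddens), and it tracks the cell state when the cell input z(t) uses a sigmoid instead of tanh, as in statement a.

```python
import numpy as np

def sigmoid(a):
    # Logistic sigmoid, assumed here as the gate nonlinearity.
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h_prev, c_prev, params, cell_input=np.tanh):
    # One step of the LSTM without bias units and forget gate.
    # params is assumed to be ordered as (U, P, V, Q, W, R).
    U, P, V, Q, W, R = params
    z = cell_input(U @ x + P @ h_prev)   # cell input
    i = sigmoid(V @ x + Q @ h_prev)      # input gate
    c = c_prev + z * i                   # memory cell update
    o = sigmoid(W @ x + R @ h_prev)      # output gate
    h = np.tanh(c) * o                   # hidden activation
    return h, c

def cell_jacobian(x, h_prev, c_prev, params, eps=1e-6):
    # Central-difference estimate of dc(t)/dc(t-1) with h(t-1) held fixed.
    # The result is compared with the identity, which is symmetric, so the
    # choice of numerator or denominator layout does not matter here.
    n = c_prev.size
    J = np.zeros((n, n))
    for k in range(n):
        d = np.zeros(n)
        d[k] = eps
        _, c_plus = lstm_step(x, h_prev, c_prev + d, params)
        _, c_minus = lstm_step(x, h_prev, c_prev - d, params)
        J[:, k] = (c_plus - c_minus) / (2 * eps)
    return J

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4  # illustrative sizes, not from the question

# U, V, W act on x (n_hid x n_in); P, Q, R act on h (n_hid x n_hid).
params = tuple(
    rng.normal(size=(n_hid, n_in if k % 2 == 0 else n_hid)) for k in range(6)
)

# Jacobian of the memory cell with respect to the previous memory cell.
J = cell_jacobian(
    rng.normal(size=n_in), rng.normal(size=n_hid), rng.normal(size=n_hid), params
)
print(np.allclose(J, np.eye(n_hid), atol=1e-6))

# Cell state trajectory when the cell input uses a sigmoid instead of tanh.
h, c = np.zeros(n_hid), np.zeros(n_hid)
cs = [c]
for _ in range(10):
    h, c = lstm_step(rng.normal(size=n_in), h, c, params, cell_input=sigmoid)
    cs.append(c)
cs = np.array(cs)
print(np.all(np.diff(cs, axis=0) > 0))
```

Because c(t) = c(t-1) + z(t) ⊙ i(t) and neither z(t) nor i(t) depends on c(t-1) once h(t-1) is held fixed, the estimated Jacobian should match the identity up to rounding. With a sigmoid cell input, both factors of z(t) ⊙ i(t) are strictly positive, so every component of the cell state grows at each step in this sketch.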