Question:

(c) (3 points) Assume we have input $x^{(t)} \in \mathbb{R}^{300}$ and hidden state $h^{(t)} \in \mathbb{R}^{500}$. What are the dimensions of the following parameters for a fully gated GRU:

- The weights $W_z$ for input $x^{(t)}$ in the expression to compute the update gate $z^{(t)}$.
- The weights $U_z$ for hidden state $h^{(t-1)}$ in the expression for the update gate $z^{(t)}$.
- The bias $b_z$ in the expression for the update gate $z^{(t)}$.
- The bias $b_r$ in the expression for the reset gate $r^{(t)}$.
- The weights $W_r$ for input $x^{(t)}$ in the expression for the reset gate $r^{(t)}$.
- The weights $U_r$ for hidden state $h^{(t-1)}$ in the expression for the reset gate $r^{(t)}$.

(d) (4 points) Let's compute the total number of scalar parameters in a GRU-based language modeling network. We will use the above GRU network with the same dimensions as in part (c). Assume that we use the GRU with a decoding head consisting of a 2-layer feedforward network with one layer receiving the recurrent state, and one linear layer outputting to a target vocabulary consisting of 10,000 words. We will compare this GRU against a simple alternative architecture: a fully-connected feedforward network. Assume that the hidden layers of the feed-forward networks all have dimension 200.

- What is the total number of scalar parameters required by the GRU network to process single-word sentences? Include the parameters of the decoding head.
- What is the total number of scalar parameters required by the simple fully-connected network (FCN) to process single-word sentences? Assume there are two hidden layers in this network, in addition to the input and output layers.
- Let's now process sentences of maximum length up to 100 words with both the GRU and the fully-connected network (FCN). For the fully-connected network, assume that the word representations for all the words in the input sentence are concatenated and then fed to the network. What is the total number of scalar parameters required by the GRU, and what is the total number of scalar parameters required by the FCN?
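
The shapes in part (c) and the counts in part (d) come down to bookkeeping, which the rough Python sketch below tries to make explicit. It is one reading of the question, not an official solution. It assumes the column-vector convention (gates computed as $W x^{(t)} + U h^{(t-1)} + b$, so each $W$ is hidden-by-input), a single bias vector per gate, a bias on every feed-forward layer, and no embedding table. It also reads the decoding head as 500 -> 200 -> 10,000 and the FCN as (300 x sentence length) -> 200 -> 200 -> 10,000. The names `gru_shapes`, `linear_params`, `gru_lm_params`, and `fcn_params` are made up for illustration.

```python
from math import prod

# Sizes stated in the question.
INPUT_DIM = 300    # dimension of x^(t)
HIDDEN_DIM = 500   # dimension of h^(t)
FF_DIM = 200       # width of every feed-forward hidden layer
VOCAB = 10_000     # size of the output vocabulary


def gru_shapes(n_in, n_hid):
    """Parameter shapes of a fully gated GRU (assumption: one bias per gate).

    Gates: z = update, r = reset, h = candidate state.
    """
    shapes = {}
    for gate in ("z", "r", "h"):
        shapes[f"W_{gate}"] = (n_hid, n_in)    # multiplies x^(t)
        shapes[f"U_{gate}"] = (n_hid, n_hid)   # multiplies h^(t-1)
        shapes[f"b_{gate}"] = (n_hid,)
    return shapes


def linear_params(n_in, n_out):
    """Scalars in one affine layer y = W x + b (assumption: bias included)."""
    return n_in * n_out + n_out


def gru_lm_params():
    """GRU cell plus the assumed 500 -> 200 -> 10,000 decoding head."""
    cell = sum(prod(shape) for shape in gru_shapes(INPUT_DIM, HIDDEN_DIM).values())
    head = linear_params(HIDDEN_DIM, FF_DIM) + linear_params(FF_DIM, VOCAB)
    return cell + head


def fcn_params(sentence_len):
    """FCN over concatenated word vectors, with two hidden layers of width 200."""
    return (
        linear_params(INPUT_DIM * sentence_len, FF_DIM)
        + linear_params(FF_DIM, FF_DIM)
        + linear_params(FF_DIM, VOCAB)
    )


if __name__ == "__main__":
    for name, shape in gru_shapes(INPUT_DIM, HIDDEN_DIM).items():
        print(name, shape)
    print("GRU + head, any sentence length:", gru_lm_params())
    print("FCN, 1 word:", fcn_params(1))
    print("FCN, 100 words:", fcn_params(100))
```

Under these assumptions, $W_z$ and $W_r$ are 500 x 300, $U_z$ and $U_r$ are 500 x 500, and $b_z$ and $b_r$ have 500 entries. The GRU model comes to 3,311,700 parameters whether the sentence has 1 word or 100, because the same weights are reused at every time step. The FCN grows from 2,110,400 to 8,050,400 because its first layer scales with the concatenated input. Dropping the biases or using a different bias convention changes these totals slightly.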
