Question: 3. Neural sequence models (14 points)

(a) (2 points) Suppose you want to build a French-to-English translation system using an LSTM-based sequence-to-sequence model. What would be fed to the decoder as input at each time step t? Note that there should be two components to the input.

(b) (2 points) Consider the two components of the input to your decoder from part (a). How does each component affect the generalization of the model during inference?

(c) (4 points) Describe two different decoding strategies for generating English translations with your decoder from part (a). For each strategy, explain when you would want to use it and give one possible drawback.

(d) (2 points) You are building a text classifier using a simple single-layer, unidirectional RNN. Your friend recommends that you use a GRU cell instead of an LSTM cell. Under what circumstances might the GRU work better than the LSTM?

(e) (4 points) You notice that the performance of your classifier from part (d) is not very good. Describe two extensions that you can make to your GRU model from part (d) to improve the performance of your classifier. For each extension, explain why it might help.
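
Part (c) refers to decoding strategies without naming them. As a rough illustration only, the sketch below contrasts two commonly used ones, greedy decoding and beam search. Everything in it is an assumption made for the example: `TOY_MODEL` is a hypothetical lookup table standing in for a trained LSTM decoder, and `next_log_probs`, `greedy_decode`, and `beam_search_decode` are illustrative names, not part of any library.

```python
import math

EOS = "</s>"

# Hypothetical toy "decoder": maps the tokens generated so far to a
# next-token distribution. A real LSTM decoder would instead compute this
# from the previous token and its hidden state.
TOY_MODEL = {
    (): {"a": 0.6, "the": 0.4},
    ("a",): {"cat": 0.55, "dog": 0.45},
    ("the",): {"cat": 0.9, "dog": 0.1},
}


def next_log_probs(prefix):
    """Return log-probabilities of the next token given the prefix."""
    # Any prefix not in the table ends the sentence with probability 1.
    probs = TOY_MODEL.get(tuple(prefix), {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def greedy_decode(max_len=5):
    """Pick the single most likely token at every step."""
    prefix, score = [], 0.0
    for _ in range(max_len):
        log_probs = next_log_probs(prefix)
        tok = max(log_probs, key=log_probs.get)
        prefix.append(tok)
        score += log_probs[tok]
        if tok == EOS:
            break
    return prefix, score


def beam_search_decode(beam_width=2, max_len=5):
    """Keep the beam_width best partial hypotheses at every step."""
    beams = [([], 0.0)]
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for prefix, score in beams:
            for tok, lp in next_log_probs(prefix).items():
                candidates.append((prefix + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # Keep the top beam_width; set aside those that have ended.
        beams = []
        for prefix, score in candidates[:beam_width]:
            if prefix[-1] == EOS:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was hit
    return max(finished, key=lambda c: c[1])


if __name__ == "__main__":
    for name, (tokens, score) in [
        ("greedy", greedy_decode()),
        ("beam", beam_search_decode(beam_width=2)),
    ]:
        print(name, " ".join(tokens), round(math.exp(score), 2))
```

With this particular toy table, greedy decoding commits to "a" at the first step and ends with a sequence of probability 0.6 × 0.55 = 0.33, while beam search with width 2 keeps "the" alive and finds a sequence of probability 0.4 × 0.9 = 0.36. The price is that beam search scores more candidates per step, and setting `beam_width=1` reduces it to the greedy procedure.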
