Question: 3. Neural sequence models (14 points)

3. Neural sequence models (14 points)
(a) (2 points) Suppose you want to build a French-to-English translation system using an LSTM-based sequence-to-sequence model. What would be fed to the decoder as input at each time step t? Note that there should be two components to the input.
(b) (2 points) Consider the two components of the input to your decoder from part (a). How does each component affect the generalization of the model during inference?
(c) (4 points) Describe two different decoding strategies for generating English translations with your decoder from part (a). For each strategy, explain when you would want to use it and give one possible drawback of that strategy.
(d) (2 points) You are building a text classifier using a simple single-layer, unidirectional RNN. Your friend recommends that you use a GRU cell instead of an LSTM cell. Under what circumstances might the GRU work better than the LSTM?
(e) (4 points) You notice that the performance of your classifier from part (d) is not very good. Describe two extensions that you can make to your GRU model from part (d) to improve the performance of your classifier. For each extension, explain why it might help.
Step by Step Solution
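For parts (a) and (c), a small sketch may help make the setup concrete. It is not the official solution. It assumes the common reading of part (a), in which the decoder receives the previously generated token together with the previous hidden state at every step, and it compares two widely used decoding strategies, greedy search and beam search. The probability table `NEXT`, the function `decoder_step`, and the integer "state" are all hypothetical stand-ins for a trained LSTM decoder.

```python
import math

# Hypothetical next-token probabilities, standing in for a trained decoder.
NEXT = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"cat": 0.55, "dog": 0.45},
    "the": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def decoder_step(prev_token, state):
    """Toy decoder step: takes the two inputs (previous token, previous state).

    A real LSTM would compute the distribution from both inputs; in this toy
    the state is only a step counter and the table depends on the token alone.
    """
    return NEXT[prev_token], state + 1


def greedy_decode(max_len=10):
    """Pick the single most probable token at every step."""
    token, state, out, logp = "<s>", 0, [], 0.0
    for _ in range(max_len):
        probs, state = decoder_step(token, state)
        token = max(probs, key=probs.get)
        logp += math.log(probs[token])
        if token == "</s>":
            break
        out.append(token)
    return out, logp


def beam_decode(beam_width=2, max_len=10):
    """Keep the beam_width best partial translations at every step."""
    beams = [(0.0, ["<s>"], 0)]  # (log-probability, tokens, decoder state)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, tokens, state in beams:
            probs, new_state = decoder_step(tokens[-1], state)
            for tok, p in probs.items():
                candidates.append((logp + math.log(p), tokens + [tok], new_state))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for cand in candidates[:beam_width]:
            # Hypotheses that emitted the end token leave the active beam.
            (finished if cand[1][-1] == "</s>" else beams).append(cand)
        if not beams:
            break
    best = max(finished or beams, key=lambda c: c[0])
    tokens = [t for t in best[1] if t not in ("<s>", "</s>")]
    return tokens, best[0]
```

In this toy table, greedy search commits to "a" because it is the most probable first token, while a beam of width 2 keeps "the" alive and ends up with the higher-probability sequence. The trade-off is cost: beam search runs the decoder once per live hypothesis at every step, so it is slower and uses more memory than greedy search.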
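For part (d), one commonly cited consideration is model size: a GRU cell has three gate or candidate blocks where an LSTM cell has four, so at the same hidden size it has roughly three quarters of the parameters. Fewer parameters can help when training data or compute is limited. The sketch below counts parameters under the textbook formulation with one weight matrix and one bias vector per block; the function names and the example sizes are illustrative assumptions, and some libraries store additional bias vectors, so their reported counts can differ slightly.

```python
def rnn_cell_params(input_size, hidden_size, num_blocks):
    """Parameters of a gated RNN cell, textbook formulation (assumption).

    Each block has a weight matrix over [input; hidden] and one bias vector.
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return num_blocks * per_block


def lstm_params(input_size, hidden_size):
    # Input, forget, and output gates plus the candidate cell update.
    return rnn_cell_params(input_size, hidden_size, num_blocks=4)


def gru_params(input_size, hidden_size):
    # Update and reset gates plus the candidate hidden state.
    return rnn_cell_params(input_size, hidden_size, num_blocks=3)


# Illustrative sizes only: 100-dimensional embeddings, 128 hidden units.
lstm_count = lstm_params(100, 128)  # 117248
gru_count = gru_params(100, 128)    # 87936
```

The ratio of the two counts is exactly 3/4 under this formulation, whatever the input and hidden sizes are.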
