Question: During the lectures, we discussed language model construction using both RNN and LSTM in detail, and provided the corresponding Python classes SimpleRnnlm and Rnnlm (also BetterRnnlm). The implementations of these two language models are similar, with one notable difference: the Time RNN layer in SimpleRnnlm is replaced by the Time LSTM layer in Rnnlm.
For this exercise, create your own class named Grulm to build a language model using GRU. GRU preserves the gating concept from LSTM while reducing the number of parameters and the computation time. The Python code implementing the GRU layer and the Time GRU layer can be found in common/time_layers.py.
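To make the parameter savings concrete, here is a minimal NumPy sketch of a single GRU forward step (forward pass only, no backprop). It is an illustration of the standard GRU equations, not the book's GRU class from common/time_layers.py; the class name GRUCell and the side-by-side weight packing (Wx of shape (D, 3H), versus 4H for LSTM's four gate groups) are assumptions for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU forward step (illustration only).

    Weights for the update gate (z), reset gate (r), and candidate
    state (h~) are packed side by side, so Wx is (D, 3H) and Wh is
    (H, 3H). An LSTM needs (D, 4H) / (H, 4H) plus a memory cell,
    which is where GRU's parameter savings come from.
    """
    def __init__(self, Wx, Wh, b):
        self.Wx, self.Wh, self.b = Wx, Wh, b

    def forward(self, x, h_prev):
        H = h_prev.shape[1]
        Wxz, Wxr, Wxh = self.Wx[:, :H], self.Wx[:, H:2*H], self.Wx[:, 2*H:]
        Whz, Whr, Whh = self.Wh[:, :H], self.Wh[:, H:2*H], self.Wh[:, 2*H:]
        bz, br, bh = self.b[:H], self.b[H:2*H], self.b[2*H:]

        z = sigmoid(x @ Wxz + h_prev @ Whz + bz)            # update gate
        r = sigmoid(x @ Wxr + h_prev @ Whr + br)            # reset gate
        h_hat = np.tanh(x @ Wxh + (r * h_prev) @ Whh + bh)  # candidate state
        return (1 - z) * h_prev + z * h_hat                 # blend old and new

# Tiny smoke test with made-up shapes.
np.random.seed(0)
D, H, N = 3, 4, 2
cell = GRUCell(np.random.randn(D, 3 * H) * 0.1,
               np.random.randn(H, 3 * H) * 0.1,
               np.zeros(3 * H))
h = cell.forward(np.random.randn(N, D), np.zeros((N, H)))
print(h.shape)  # (2, 4)
```

Note that the GRU's hidden state h is its only recurrent state; there is no separate memory cell c as in LSTM, which matters later when resetting the model before evaluation.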
Train your GRU-based language model on the PTB dataset using all the training data, following a similar approach to how Rnnlm was trained. After training, evaluate the perplexity on the test data. Before evaluating, remember to reset the model's state, i.e. the GRU's hidden state (unlike LSTM, GRU has no memory cell). You can directly use the perplexity evaluation function eval_perplexity() from common/util.py.
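The quantity eval_perplexity() reports can be sketched independently of the book's code: perplexity is the exponential of the average cross-entropy per token. The function below is a self-contained illustration of that formula (the name perplexity_from_log_probs is my own, not the book's API); in the actual exercise you would call model.reset_state() first so evaluation does not start from the hidden state left over from training, then pass the model and test corpus to eval_perplexity().

```python
import numpy as np

def perplexity_from_log_probs(log_probs):
    """Corpus perplexity from per-token natural-log probabilities:
    ppl = exp(-(1/N) * sum_t log p(w_t | w_1 .. w_{t-1})).
    A good model assigns high probability to each actual next word,
    so its log-probs are close to 0 and its perplexity is low."""
    log_probs = np.asarray(log_probs, dtype=np.float64)
    return float(np.exp(-log_probs.mean()))

# Sanity check: a model that is always uniform over a 10,000-word
# vocabulary (PTB's size) has perplexity 10,000 -- the no-information
# baseline any trained model should beat by a wide margin.
uniform = np.log(np.full(50, 1.0 / 10000))
print(perplexity_from_log_probs(uniform))  # ~10000.0
```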
Plot the training perplexity curve and report the test perplexity.
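For the plot, a minimal matplotlib sketch is shown below, assuming matplotlib is available and that you have collected the training perplexities into a list during training (e.g. one value per evaluation interval, the way the book's trainer records them). The numbers here are placeholders, not real results.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder values standing in for the perplexities recorded during
# training (one point per eval interval); these are NOT real results.
ppl_list = [420.0, 260.0, 190.0, 150.0, 128.0, 115.0, 108.0]

x = np.arange(len(ppl_list))
plt.plot(x, ppl_list, label="train")
plt.xlabel("iterations (per eval interval)")
plt.ylabel("perplexity")
plt.title("Grulm training perplexity (placeholder data)")
plt.legend()
plt.savefig("grulm_train_ppl.png")
```

The final test perplexity is then a single number printed after the eval_perplexity() call, reported alongside the curve.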

Step by Step Solution

There are three steps involved: building the Grulm class, training it on the PTB data, and evaluating the perplexity on the test data.

