Question:
So far we’ve concentrated on word embeddings to represent the fundamental units of text. But it is also possible to have a character-level model. Read Andrej Karpathy’s 2015 article The Unreasonable Effectiveness of Recurrent Neural Networks and download his char-rnn code (the torch-rnn reimplementation at github.com/jcjohnson/torch-rnn is easier to use than the original version at github.com/karpathy/char-rnn). Train the model on text of your choice.
a. Show how the randomly generated text improves with more iterations of training.
b. Find any other interesting properties of the model on your text.
c. Replicate some of Karpathy’s findings, such as the fact that some cells learn to count nesting level of parentheses.
d. Compare the character-level RNN model to a character-level n-gram model as described in Yoav Goldberg’s The unreasonable effectiveness of Character-level Language Models (and why RNNs are still cool) at nbviewer.jupyter.org/gist/yoavg/d76121dfde2618422139.
e. What do you see as the differences between RNN and n-gram models?
Step by Step Answer:
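As a starting point for part (d), the character-level n-gram model Goldberg describes can be sketched in a few lines: count how often each character follows every fixed-length history, then generate text by sampling from those counts. The function names (`train_char_ngram`, `generate`) and the `~` padding symbol below are illustrative choices, not taken from Goldberg's notebook.

```python
import random
from collections import Counter, defaultdict

def train_char_ngram(text, order=4):
    """Count next-character frequencies for each length-`order` history."""
    pad = "~" * order  # padding symbol, assumed absent from the training text
    text = pad + text
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        history, ch = text[i:i + order], text[i + order]
        model[history][ch] += 1
    return model

def generate(model, order=4, n=200, seed=0):
    """Sample text character by character from the counted distributions."""
    rng = random.Random(seed)
    history = "~" * order
    out = []
    for _ in range(n):
        counts = model.get(history)
        if not counts:
            break  # unseen history: stop (a smoothed model would back off)
        chars, weights = zip(*counts.items())
        ch = rng.choices(chars, weights=weights)[0]
        out.append(ch)
        history = history[1:] + ch
    return "".join(out)
```

Training this on the same corpus as the char-RNN gives a direct baseline for part (e): the n-gram model conditions only on the last `order` characters, while the RNN's hidden state can, in principle, carry information over arbitrarily long spans (e.g. parenthesis nesting in part (c)).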
Artificial Intelligence A Modern Approach
ISBN: 9780134610993
4th Edition
Authors: Stuart Russell, Peter Norvig