
Q3. Neural Machine Translation: Code [20]
In this question, you will learn to implement the Neural Machine Translation task using a sequence-to-sequence model with attention.
1. Load WMT14\({}^{1}\) data from Hugging Face. Use the de-en (German to English) data configuration. Note that this will need a lot of space (approximately 950 MB) for the data and a few minutes for generating the train-test split (see the loading sketch after this list).
2. The training dataset contains a dictionary with the languages 'de' and 'en' as keys and the respective sentences as values. Create a DataFrame with two columns, 'German' and 'English', holding the respective sentences. Build two separate vocabularies for German and English words, and maintain a dictionary for German that maps each word to its index, along with the reverse index-to-word mapping (as in Q2, step 3). Add <start> and <end> tags at the start and the end of each English sentence and add these tokens to the vocabulary as well (sketched below).
3. With the dictionary created above, convert each German sentence to an array of the indexes of its words. E.g., "I learn NLP" might be converted to [10, 7, 29] if, in the dictionary, 'i' is mapped to index 10, 'learn' to 7, and 'nlp' to 29. Find the maximum sentence length in the German corpus and pad each array with 0 so that every index vector has that same length. For example, if the maximum sequence length in the dataset is 5, 'I learn NLP' will be converted to [10, 7, 29, 0, 0] (see the sketch below).
4. Convert each word in the English vocabulary to a one-hot encoded vector of length equal to the number of words in the vocabulary (including the start and end tags). Perform steps 3 and 4 on the test and validation sets as well (sketched below).
5. Build an Encoder with an Embedding layer, LSTM layer(s), and an attention layer. For the Decoder, use LSTM layer(s) and fully connected layers. Use a Softmax activation for the decoder output (see the model sketch below).
6. Train the model using cross-entropy loss, an optimizer of your choice (Adam is recommended), and 10 epochs. You are free to vary the number of epochs to boost performance (a training-loop sketch follows the list).
7. To generate the translation of a given German sentence from the trained model, convert the sentence to its index vector, feed it to the model, and convert each output to the English word with the highest Softmax probability. Terminate the process once you encounter the <end> token (sketched below).
8. With the help of nltk.translate.bleu_score.sentence_bleu()\({}^{2}\), find the average BLEU score on the test set, which measures the similarity of the machine-translated text to a set of reference translations (see the final sketch below).
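
For step 1, a minimal loading sketch assuming the Hugging Face datasets library (the wmt14 dataset name and de-en configuration follow the Hub's naming; train_data and the other variable names are introduced here for illustration):

```python
from datasets import load_dataset

# Downloads and caches the corpus (~950 MB); the first call
# also takes a few minutes to prepare the splits.
dataset = load_dataset("wmt14", "de-en")

train_data = dataset["train"]
val_data = dataset["validation"]
test_data = dataset["test"]

# Each example is a dict of the form:
# {"translation": {"de": "<German sentence>", "en": "<English sentence>"}}
print(train_data[0])
```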
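For step 2, one way to build the DataFrame and vocabularies, assuming plain whitespace tokenization and lowercasing (build_frame, build_vocab, and the *_word2idx names are hypothetical helpers; index 0 is reserved for padding):

```python
import pandas as pd

def build_frame(split):
    # Flatten {"translation": {"de": ..., "en": ...}} records into two
    # columns, wrapping each English sentence in <start>/<end> tags.
    pairs = [(ex["translation"]["de"],
              "<start> " + ex["translation"]["en"] + " <end>")
             for ex in split]
    return pd.DataFrame(pairs, columns=["German", "English"])

def build_vocab(sentences):
    # word -> index (0 is reserved for padding) and the reverse map.
    words = sorted({w for s in sentences for w in s.lower().split()})
    word2idx = {w: i + 1 for i, w in enumerate(words)}
    idx2word = {i: w for w, i in word2idx.items()}
    return word2idx, idx2word

train_df = build_frame(train_data)
de_word2idx, de_idx2word = build_vocab(train_df["German"])
en_word2idx, en_idx2word = build_vocab(train_df["English"])
```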
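For step 3, index conversion and zero-padding, reusing the names from the previous sketch (a real pipeline would also map unseen words to an <unk> index, omitted here for brevity):

```python
import numpy as np

max_len_de = max(len(s.split()) for s in train_df["German"])

def encode_and_pad(sentence, word2idx, max_len):
    # Map each word to its index, then right-pad with the 0 id.
    ids = [word2idx[w] for w in sentence.lower().split()]
    return ids + [0] * (max_len - len(ids))

X_train = np.array([encode_and_pad(s, de_word2idx, max_len_de)
                    for s in train_df["German"]])
```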
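For step 4, a one-hot helper. Materializing one-hot vectors for the whole corpus is memory-hungry; if you train with an index-based loss such as PyTorch's nn.CrossEntropyLoss, you can keep the targets as indices and treat this encoding as conceptual:

```python
import numpy as np

en_vocab_size = len(en_word2idx) + 1  # +1 for the padding index 0

def one_hot(ids, vocab_size):
    # ids: padded index vector for one sentence ->
    # (sequence_length, vocab_size) one-hot matrix.
    out = np.zeros((len(ids), vocab_size), dtype=np.float32)
    out[np.arange(len(ids)), ids] = 1.0
    return out
```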
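For step 5, one possible PyTorch architecture with dot-product attention (all dimensions are illustrative, not prescribed by the assignment). The decoder returns raw logits because nn.CrossEntropyLoss applies log-softmax internally, so an explicit Softmax is only needed at inference time:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, de_vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(de_vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) -> outputs: (batch, src_len, hid_dim)
        outputs, (h, c) = self.lstm(self.embedding(src))
        return outputs, (h, c)

class Decoder(nn.Module):
    def __init__(self, en_vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(en_vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.fc = nn.Linear(hid_dim, en_vocab_size)

    def forward(self, token, hidden, enc_outputs):
        # token: (batch, 1) — one target word at a time.
        emb = self.embedding(token)                             # (batch, 1, emb)
        # Dot-product attention: score every encoder state
        # against the decoder's previous hidden state.
        query = hidden[0][-1].unsqueeze(1)                      # (batch, 1, hid)
        scores = torch.bmm(query, enc_outputs.transpose(1, 2))  # (batch, 1, src_len)
        weights = F.softmax(scores, dim=-1)
        context = torch.bmm(weights, enc_outputs)               # (batch, 1, hid)
        out, hidden = self.lstm(torch.cat([emb, context], dim=-1), hidden)
        logits = self.fc(out.squeeze(1))                        # (batch, en_vocab)
        return logits, hidden
```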
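For step 6, a teacher-forced training loop; train_loader is an assumed torch.utils.data.DataLoader yielding (source, target) index batches:

```python
import torch
import torch.nn as nn

encoder = Encoder(len(de_word2idx) + 1)
decoder = Decoder(en_vocab_size)
criterion = nn.CrossEntropyLoss(ignore_index=0)  # skip padding positions
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

for epoch in range(10):
    for src, tgt in train_loader:  # src: (batch, src_len), tgt: (batch, tgt_len)
        optimizer.zero_grad()
        enc_out, hidden = encoder(src)
        loss = 0.0
        # Teacher forcing: feed the gold previous word at each step
        # and predict the next one.
        for t in range(tgt.size(1) - 1):
            logits, hidden = decoder(tgt[:, t:t + 1], hidden, enc_out)
            loss = loss + criterion(logits, tgt[:, t + 1])
        loss.backward()
        optimizer.step()
```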
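For step 7, greedy decoding with the helpers defined above (translate is a hypothetical name; max_steps caps the output in case the <end> token never appears):

```python
import torch

def translate(sentence, max_steps=50):
    # Encode the padded German index vector (batch of one).
    src = torch.tensor([encode_and_pad(sentence, de_word2idx, max_len_de)])
    enc_out, hidden = encoder(src)
    token = torch.tensor([[en_word2idx["<start>"]]])
    words = []
    for _ in range(max_steps):
        logits, hidden = decoder(token, hidden, enc_out)
        idx = logits.argmax(dim=-1).item()  # highest-probability word
        if en_idx2word.get(idx) == "<end>":
            break
        words.append(en_idx2word.get(idx, ""))
        token = torch.tensor([[idx]])
    return " ".join(words)
```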
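For step 8, averaging sentence-level BLEU over the test set (test_df is an assumed DataFrame built as in step 2; a smoothing function avoids zero scores on short hypotheses):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1
scores = []
for _, row in test_df.iterrows():
    # sentence_bleu expects a list of reference token lists;
    # strip the <start>/<end> tags from the gold sentence.
    reference = [row["English"].lower().split()[1:-1]]
    candidate = translate(row["German"]).split()
    scores.append(sentence_bleu(reference, candidate,
                                smoothing_function=smooth))

print("Average BLEU:", sum(scores) / len(scores))
```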