
Q3. Neural Machine Translation: Code [20]
In this question, you will learn to implement the Neural Machine Translation task using a sequence-to-sequence model with attention.
1. Load WMT14\({}^{1}\) data from Hugging Face. Use the de-en (German to English) data configuration. Note that this will need a lot of space (approximately 950 MB) for the data and a few minutes for generating the train-test split (see the loading sketch after this list).
2. The training dataset contains a dictionary with the languages 'de' and 'en' as keys and the respective sentences as values. Create a DataFrame with two columns, 'German' and 'English', holding the respective sentences. Build two separate vocabularies for German and English words, and maintain a dictionary for German that maps each word to its index, along with the reverse index-to-word mapping (as in Q2, step 3). Add <start> and <end> tags at the start and the end of each English sentence and add these tokens to the vocabulary as well (sketched below).
3. With the dictionary created above, convert each German sentence to an array of the indexes of its words. E.g., "I learn NLP" might be converted to [10, 7, 29] if, in the dictionary, 'i' is mapped to index 10, 'learn' to 7, and 'nlp' to 29. Find the maximum sentence length in the German corpus and pad each array with 0 so that every index vector has that same length. For example, if the maximum sequence length in the dataset is 5, 'I learn NLP' will be converted to [10, 7, 29, 0, 0] (see the sketch below).
4. Convert each word in the English vocabulary to a one-hot encoded vector of length equal to the number of words in the vocabulary (including the start and end tags). Perform steps 3 and 4 on the test and validation sets as well (sketched below).
5. Build an Encoder with an Embedding layer, LSTM layer(s), and an attention layer. For the Decoder, use LSTM layer(s) and fully connected layers. Use a Softmax activation for the decoder output (see the model sketch below).
6. Train the model using cross-entropy loss, an optimizer of your choice (Adam is recommended), and 10 epochs. You are free to vary the number of epochs to boost performance (a training-loop sketch follows the list).
7. To generate the translation of a given German sentence from the trained model, convert the sentence to its index vector, feed it to the model, and convert each output to the English word with the highest Softmax probability. Terminate the process once you encounter the <end> token (sketched below).
8. With the help of nltk.translate.bleu_score.sentence_bleu()\({}^{2}\), find the average BLEU score on the test set, which measures the similarity of the machine-translated text to a set of reference translations (see the final sketch below).
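
For step 1, a minimal loading sketch assuming the Hugging Face datasets library (the wmt14 dataset name and de-en configuration follow the Hub's naming; train_data and the other variable names are introduced here for illustration):

```python
from datasets import load_dataset

# Downloads and caches the corpus (~950 MB); the first call
# also takes a few minutes to prepare the splits.
dataset = load_dataset("wmt14", "de-en")

train_data = dataset["train"]
val_data = dataset["validation"]
test_data = dataset["test"]

# Each example is a dict of the form:
# {"translation": {"de": "<German sentence>", "en": "<English sentence>"}}
print(train_data[0])
```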
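For step 2, one way to build the DataFrame and vocabularies, assuming plain whitespace tokenization and lowercasing (build_frame, build_vocab, and the *_word2idx names are hypothetical helpers; index 0 is reserved for padding):

```python
import pandas as pd

def build_frame(split):
    # Flatten {"translation": {"de": ..., "en": ...}} records into two
    # columns, wrapping each English sentence in <start>/<end> tags.
    pairs = [(ex["translation"]["de"],
              "<start> " + ex["translation"]["en"] + " <end>")
             for ex in split]
    return pd.DataFrame(pairs, columns=["German", "English"])

def build_vocab(sentences):
    # word -> index (0 is reserved for padding) and the reverse map.
    words = sorted({w for s in sentences for w in s.lower().split()})
    word2idx = {w: i + 1 for i, w in enumerate(words)}
    idx2word = {i: w for w, i in word2idx.items()}
    return word2idx, idx2word

train_df = build_frame(train_data)
de_word2idx, de_idx2word = build_vocab(train_df["German"])
en_word2idx, en_idx2word = build_vocab(train_df["English"])
```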
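For step 3, index conversion and zero-padding, reusing the names from the previous sketch (a real pipeline would also map unseen words to an <unk> index, omitted here for brevity):

```python
import numpy as np

max_len_de = max(len(s.split()) for s in train_df["German"])

def encode_and_pad(sentence, word2idx, max_len):
    # Map each word to its index, then right-pad with the 0 id.
    ids = [word2idx[w] for w in sentence.lower().split()]
    return ids + [0] * (max_len - len(ids))

X_train = np.array([encode_and_pad(s, de_word2idx, max_len_de)
                    for s in train_df["German"]])
```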
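For step 4, a one-hot helper. Materializing one-hot vectors for the whole corpus is memory-hungry; if you train with an index-based loss such as PyTorch's nn.CrossEntropyLoss, you can keep the targets as indices and treat this encoding as conceptual:

```python
import numpy as np

en_vocab_size = len(en_word2idx) + 1  # +1 for the padding index 0

def one_hot(ids, vocab_size):
    # ids: padded index vector for one sentence ->
    # (sequence_length, vocab_size) one-hot matrix.
    out = np.zeros((len(ids), vocab_size), dtype=np.float32)
    out[np.arange(len(ids)), ids] = 1.0
    return out
```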
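For step 5, one possible PyTorch architecture with dot-product attention (all dimensions are illustrative, not prescribed by the assignment). The decoder returns raw logits because nn.CrossEntropyLoss applies log-softmax internally, so an explicit Softmax is only needed at inference time:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, de_vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(de_vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) -> outputs: (batch, src_len, hid_dim)
        outputs, (h, c) = self.lstm(self.embedding(src))
        return outputs, (h, c)

class Decoder(nn.Module):
    def __init__(self, en_vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(en_vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.fc = nn.Linear(hid_dim, en_vocab_size)

    def forward(self, token, hidden, enc_outputs):
        # token: (batch, 1) — one target word at a time.
        emb = self.embedding(token)                             # (batch, 1, emb)
        # Dot-product attention: score every encoder state
        # against the decoder's previous hidden state.
        query = hidden[0][-1].unsqueeze(1)                      # (batch, 1, hid)
        scores = torch.bmm(query, enc_outputs.transpose(1, 2))  # (batch, 1, src_len)
        weights = F.softmax(scores, dim=-1)
        context = torch.bmm(weights, enc_outputs)               # (batch, 1, hid)
        out, hidden = self.lstm(torch.cat([emb, context], dim=-1), hidden)
        logits = self.fc(out.squeeze(1))                        # (batch, en_vocab)
        return logits, hidden
```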
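For step 6, a teacher-forced training loop; train_loader is an assumed torch.utils.data.DataLoader yielding (source, target) index batches:

```python
import torch
import torch.nn as nn

encoder = Encoder(len(de_word2idx) + 1)
decoder = Decoder(en_vocab_size)
criterion = nn.CrossEntropyLoss(ignore_index=0)  # skip padding positions
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

for epoch in range(10):
    for src, tgt in train_loader:  # src: (batch, src_len), tgt: (batch, tgt_len)
        optimizer.zero_grad()
        enc_out, hidden = encoder(src)
        loss = 0.0
        # Teacher forcing: feed the gold previous word at each step
        # and predict the next one.
        for t in range(tgt.size(1) - 1):
            logits, hidden = decoder(tgt[:, t:t + 1], hidden, enc_out)
            loss = loss + criterion(logits, tgt[:, t + 1])
        loss.backward()
        optimizer.step()
```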
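For step 7, greedy decoding with the helpers defined above (translate is a hypothetical name; max_steps caps the output in case the <end> token never appears):

```python
import torch

def translate(sentence, max_steps=50):
    # Encode the padded German index vector (batch of one).
    src = torch.tensor([encode_and_pad(sentence, de_word2idx, max_len_de)])
    enc_out, hidden = encoder(src)
    token = torch.tensor([[en_word2idx["<start>"]]])
    words = []
    for _ in range(max_steps):
        logits, hidden = decoder(token, hidden, enc_out)
        idx = logits.argmax(dim=-1).item()  # highest-probability word
        if en_idx2word.get(idx) == "<end>":
            break
        words.append(en_idx2word.get(idx, ""))
        token = torch.tensor([[idx]])
    return " ".join(words)
```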
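For step 8, averaging sentence-level BLEU over the test set (test_df is an assumed DataFrame built as in step 2; a smoothing function avoids zero scores on short hypotheses):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1
scores = []
for _, row in test_df.iterrows():
    # sentence_bleu expects a list of reference token lists;
    # strip the <start>/<end> tags from the gold sentence.
    reference = [row["English"].lower().split()[1:-1]]
    candidate = translate(row["German"]).split()
    scores.append(sentence_bleu(reference, candidate,
                                smoothing_function=smooth))

print("Average BLEU:", sum(scores) / len(scores))
```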