Problem 1: Transformer Questions (60 points)
The Transformer model comes from the paper "Attention is all you need" [1],
and it has achieved a lot of success in the Natural Language Processing (NLP)
domain. It shows several advantages over previous RNN models, such as parallel
processing and better modeling of long-range context dependencies. BERT and GPT are
two famous extensions of the Transformer. You are going to answer several
questions about your understanding of the Transformer, BERT, and GPT.
The core idea of the self-attention operation is scaled dot-product attention.
The features are first transformed into three different matrices, $Q$, $K$, and
$V$, and the self-attention is calculated as follows:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$
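As a rough illustration of the operation described above, softmax(Q K^T / sqrt(d_k)) V, here is a minimal pure-Python sketch. The function names, the list-of-lists matrix layout, and the toy inputs are assumptions made for this example; they are not part of the problem statement, and a real implementation would normally use a tensor library.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q is n x d_k, K is m x d_k, V is m x d_v (lists of lists).

    Returns an n x d_v list of lists.
    """
    d_k = len(K[0])
    d_v = len(V[0])
    scale = 1.0 / math.sqrt(d_k)
    output = []
    for q in Q:
        # Scaled dot product of this query with every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        # Attention weights over the keys; they sum to 1.
        weights = softmax(scores)
        # Weighted sum of the value rows.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_v)]
        )
    return output

# Hypothetical toy inputs: two queries, two keys, two values.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = scaled_dot_product_attention(Q, K, V)

# A zero query scores every key equally, so the output is the mean of V.
uniform = scaled_dot_product_attention([[0.0, 0.0]], K, V)
```

Each output row is a convex combination of the rows of V, with weights given by how strongly the corresponding query matches each key.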
Q points: In the above self-attention operation, why do we need to
incorporate the scale factor $\frac{1}{\sqrt{d_k}}$ into the calculation?
Q points: When we train the Transformer on word sequences, we usually
need to add a positional embedding for each word. Why is this
necessary?
Q points: In the Transformer framework, there are two types of attention
modules: self-attention and encoder-decoder attention. What is
the difference between these two modules in terms of functionality and technical
implementation?
Q points: There are also other types of attention calculations, such as
additive attention. Additive attention computes the compatibility function
using a feed-forward network with a single hidden layer. In the Transformer
model, why did the authors choose to use scaled dot-product attention instead of
additive attention, and what are the main advantages?
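To make the two compatibility functions in this question concrete, the sketch below scores a single query-key pair both ways. It assumes the common additive form v^T tanh(W_q q + W_k k); the parameter names `W_q`, `W_k`, `v` and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def additive_score(q, k, W_q, W_k, v):
    """Additive compatibility: v^T tanh(W_q q + W_k k).

    W_q and W_k map q and k into a shared hidden layer (one row per
    hidden unit); v projects the hidden layer to a scalar score.
    """
    hidden = []
    for wq_row, wk_row in zip(W_q, W_k):
        pre = sum(w * x for w, x in zip(wq_row, q))
        pre += sum(w * x for w, x in zip(wk_row, k))
        hidden.append(math.tanh(pre))
    return sum(vi * hi for vi, hi in zip(v, hidden))

def scaled_dot_score(q, k):
    """Scaled dot-product compatibility: (q . k) / sqrt(d_k)."""
    return sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(k))

# Hypothetical toy parameters: 2-dimensional inputs, 3 hidden units.
W_q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_k = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
v = [1.0, 1.0, 1.0]

q = [1.0, 0.0]
k = [0.0, 1.0]
add = additive_score(q, k, W_q, W_k, v)
dot = scaled_dot_score(q, k)
```

Note that the additive score depends on learned parameters (`W_q`, `W_k`, `v`), while the scaled dot-product score is computed from q and k alone.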
Q points: The BERT and GPT models are pretrained on large-scale
datasets in a self-supervised way. Please describe their pretraining tasks
and discuss why they are useful.
Q points: In the BERT model design, there are two special tokens, [CLS]
and [SEP]. What is the purpose of these two special tokens,
and how are they used during training and evaluation?
