Question: Why do we use masking in self-attention?

A. To run the attention mechanism in blocks.
B. To eliminate the need for recurrent connections and convolutions.
C. To run the attention mechanism several times in parallel.
D. To stop the model from looking at information we don't want it to look at.
Step-by-Step Solution

There are 3 steps involved in it.

Step 1: In self-attention, every token can in principle attend to every other token in the sequence. Masking sets the attention scores of disallowed positions to negative infinity before the softmax, so those positions receive zero attention weight.

Step 2: Rule out the distractors. Option A describes block- or chunk-wise attention, option B describes the motivation for the attention mechanism itself, and option C describes multi-head attention; none of these is what masking does.

Step 3: Masking is therefore used to stop the model from looking at information we don't want it to look at, such as future tokens in a decoder (causal masking) or padding tokens.

Answer: D.
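To make Step 1 concrete, here is a minimal NumPy sketch of causal masking in scaled dot-product self-attention. The function name, shapes, and the single-head setup are illustrative assumptions for this answer, not code from any particular library.

```python
# Minimal sketch of single-head causal self-attention (illustrative, not a library API).
import numpy as np

def causal_self_attention(Q, K, V):
    """Scaled dot-product attention with a causal mask.

    Q, K, V: arrays of shape (seq_len, d_k). Future positions are
    masked so each token attends only to itself and earlier tokens.
    """
    seq_len, d_k = Q.shape
    # Raw attention scores, scaled by sqrt(d_k).
    scores = Q @ K.T / np.sqrt(d_k)
    # Causal mask: True above the diagonal marks "future" positions.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    # Setting masked scores to -inf makes their softmax weight exactly 0.
    scores = np.where(future, -np.inf, scores)
    # Row-wise softmax (shifted by the row max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Tiny demo: with 4 tokens, row i of the weight matrix is zero for all j > i.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = causal_self_attention(x, x, x)
print(out.shape)  # (4, 8)
```

The same mechanism covers padding masks: instead of masking positions above the diagonal, you would set the scores of padding columns to negative infinity before the softmax.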
