4. This question is on softmax. Softmax is briefly described in Jurafsky and Martin, chapter 7, section 7.5.1. (This whole chapter is a good brief introduction to feedforward neural networks.)
(a) Calculate the probabilities for the following sets of inputs to softmax (the inputs to softmax are generally called logits): [-2, -1, 0, 1], [3, 3, 3, 3], [0.1, 0.2, 0.0, 0.4]. (A numerical check is sketched in the first code block after part (d).)
(b) If the logits are [y1, y2, y3], are the softmax probabilities the same for [y1 + c, y2 + c, y3 + c] for any value c? Give an argument justifying your answer.
(c) What would the softmax output be for logits that are [3.3, 3.3, -inf, -inf], where -inf is the Python expression for minus infinity? (Hint: you don't need to do any calculation! Remark: -inf is actually used as an input to softmax in the GPT attention mechanism.)
(d) Suppose the predicted probabilities of three possible outcomes (A, B, and C) are [0.05, 0.8, 0.1] respectively. What is the log-loss (cross-entropy loss) if the true outcome (i.e. the label in the training set) is A? What is it if the true label is B? And if it is C? (See the second sketch below.)
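
The following is a minimal Python sketch (assuming NumPy; the function name and structure are my own, not from the assignment) for checking parts (a)-(c). It uses the standard definition softmax(y)_i = exp(y_i) / sum_j exp(y_j), and the max-shift it applies for numerical stability is exactly the shift-invariance that part (b) asks you to justify.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.asarray(logits, dtype=float)
    z = z - np.max(z)      # subtracting a constant c leaves the output unchanged; see part (b)
    e = np.exp(z)
    return e / e.sum()

# Part (a): the three sets of logits from the question.
for logits in ([-2, -1, 0, 1], [3, 3, 3, 3], [0.1, 0.2, 0.0, 0.4]):
    print(logits, "->", np.round(softmax(logits), 4))
# Note that [3, 3, 3, 3] gives the uniform distribution [0.25, 0.25, 0.25, 0.25].

# Part (b): adding a constant c to every logit cancels in the ratio, since
# exp(y_i + c) / sum_j exp(y_j + c) = exp(c) exp(y_i) / (exp(c) sum_j exp(y_j)).
c = 5.0
print(np.allclose(softmax([1.0, 2.0, 3.0]),
                  softmax([1.0 + c, 2.0 + c, 3.0 + c])))   # True

# Part (c): exp(-inf) = 0, so -inf entries get probability 0 and the
# remaining (equal) logits split the probability mass evenly.
print(softmax([3.3, 3.3, -np.inf, -np.inf]))   # [0.5, 0.5, 0.0, 0.0]
```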
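For part (d), recall that for a single example whose true label is assigned predicted probability p, the cross-entropy (log) loss reduces to -log(p). A short sketch, using the natural log (the usual convention, and the one in Jurafsky and Martin); the helper name and dict layout are mine:

```python
import math

# Predicted probabilities from the question, keyed by outcome.
probs = {"A": 0.05, "B": 0.8, "C": 0.1}

def log_loss(true_label):
    # Loss is -log p(true label): 0 only when p = 1, large when p is small.
    return -math.log(probs[true_label])

for label in ("A", "B", "C"):
    print(f"true label {label}: loss = {log_loss(label):.4f}")
# -ln(0.05) ~ 2.996, -ln(0.8) ~ 0.223, -ln(0.1) ~ 2.303
```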
