Question: Consider the image captioning problem. Given a picture as the input, your algorithm should output a sentence describing the picture.
(a) Describe a neural network structure that can do this. (If you use well-known CNN or RNN structures, you don't need to go into the details. Just say how to use them: what is the input, and what is the output?)
(b) Describe the training process. (What is the training data? What is the objective?)
(c) Describe the test process. (In the testing phase, given a picture as the input, how does your algorithm output a sentence describing the picture?)
(d) Suppose we want a parameter that controls the diversity of the output sentence. (At one extreme, the output is almost a deterministic sentence; at the other extreme, the output is almost a random sequence.) This can be done by adding a temperature parameter to the softmax function. How would you do it? (There may be more than one reasonable answer.)
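For part (d), one common reading of "adding a temperature parameter to the softmax" is to divide the decoder's logits by a temperature T before normalizing, so that p_i = exp(z_i / T) / sum_j exp(z_j / T). The sketch below illustrates that reading only; it is not presented as the expected answer. The function names `softmax_with_temperature` and `sample_token` and the example logits are assumptions made for illustration, and the logits stand in for whatever scores a captioning decoder would produce over its vocabulary at one time step.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax probabilities of `logits` after dividing by `temperature`.

    Hypothetical helper for illustration: small temperatures sharpen the
    distribution toward the largest logit, large temperatures flatten it
    toward uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Sample one vocabulary index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Assumed example logits for a three-word vocabulary at one decoding step.
example_logits = [2.0, 1.0, 0.1]
near_deterministic = softmax_with_temperature(example_logits, temperature=0.01)
near_uniform = softmax_with_temperature(example_logits, temperature=100.0)
```

With a very small T, nearly all probability mass falls on the highest-scoring word, so sampling behaves almost like always picking the most likely word. With a very large T, the distribution approaches uniform and sampled sequences become close to random. Setting T = 1 recovers the ordinary softmax.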
