Question in computer science

Question No. 02

This is a subjective question, hence you have to write your answer in the text field given below.

Consider a combined LeNet-5 and single-layer RNN based visual captioning system that is trained to generate the sequence of lowercase characters corresponding to the ten digits, 0 to 9. For example, an input image of '7' will generate the output character sequence s, e, v, e, n. An input image that is not of a digit will generate the output n, o, n, e. Assume a one-hot representation is used for both input and output.

(a) What is the minimum number of input nodes and the minimum number of output nodes required in the RNN?

(b) Assuming a linear combination of the output of the last convolution layer (after subsampling and unrolling) is used to initialize the RNN hidden layer, how many trainable parameters will be needed, excluding the CNN convolution parameters? Assume 50 hidden nodes are used in the RNN. Show all steps clearly. No attention is used.

(c) Over how many time steps does the loss function have to be evaluated during training? [1]
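The counting the question asks for can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the expert answer from the page: it assumes one extra start symbol on the input side and one extra end symbol on the output side, a flattened feature vector of 16 x 5 x 5 = 400 values (the classic LeNet-5 output after the second subsampling stage), a vanilla (Elman) RNN, and bias terms on every linear map. The names `WORDS` and `count_params` are hypothetical helpers introduced for illustration.

```python
# Target words: the ten digit names plus "none" for non-digit images (from the question).
WORDS = ["zero", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "none"]

# Distinct lowercase letters that ever appear in a target sequence.
letters = sorted(set("".join(WORDS)))
num_letters = len(letters)

# Assumption: one <start> symbol is added to the inputs and one <end> symbol to the outputs.
num_inputs = num_letters + 1
num_outputs = num_letters + 1


def count_params(cnn_features, hidden, n_in, n_out):
    """Count trainable parameters outside the CNN convolutions.

    Assumes a vanilla RNN with biases and a biased linear map from the
    flattened CNN features to the initial hidden state.
    """
    init = cnn_features * hidden + hidden   # CNN features -> initial hidden state
    w_xh = n_in * hidden                    # input -> hidden
    w_hh = hidden * hidden                  # hidden -> hidden
    b_h = hidden                            # hidden bias
    w_hy = hidden * n_out                   # hidden -> output
    b_y = n_out                             # output bias
    return init + w_xh + w_hh + b_h + w_hy + b_y


# Assumption: 16 feature maps of size 5x5 after the last subsampling stage of LeNet-5.
cnn_features = 16 * 5 * 5
total_params = count_params(cnn_features, hidden=50,
                            n_in=num_inputs, n_out=num_outputs)

# Assumption: the loss covers every letter of the longest word plus one <end> symbol.
max_steps = max(len(w) for w in WORDS) + 1

print(num_letters, num_inputs, num_outputs, total_params, max_steps)
```

Under these assumptions the script reports 15 distinct letters, 16 input and 16 output nodes, 24216 trainable parameters, and 6 time steps. If the course convention drops the biases, omits the start or end symbol, or takes the 120-unit C5 layer as the feature vector, the same function can be re-run with different arguments.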


