Question: (d) Suppose max pooling is applied on an 8×8 image with a 2×2 filter and a stride of 2 pixels. What will be the number of parameters in this layer? (1 mark)
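As a minimal sketch of the setup in this question, the snippet below applies 2×2 max pooling with stride 2 to an 8×8 grid in plain Python. The helper name `max_pool2d` and the sample pixel values are assumptions made for illustration. It shows that the operation only takes the maximum over each window, so it holds no weights or biases to learn.

```python
def max_pool2d(image, size=2, stride=2):
    """Max pooling over a 2-D list of numbers (hypothetical helper)."""
    rows, cols = len(image), len(image[0])
    out_rows = (rows - size) // stride + 1
    out_cols = (cols - size) // stride + 1
    pooled = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Collect the values covered by the window at this position.
            window = [
                image[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            ]
            # The only computation is a max: nothing here is trainable.
            row.append(max(window))
        pooled.append(row)
    return pooled


# Illustrative 8x8 input whose entries are simply 0..63 (assumed values).
image = [[r * 8 + c for c in range(8)] for r in range(8)]
pooled = max_pool2d(image)
print(len(pooled), len(pooled[0]))  # 4 4
```

The 8×8 input shrinks to 4×4, and the function body contains no stored parameters, only the fixed window size and stride.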

( ) Consider the following plot of the number of stochastic gradient descent (SGD) iterations required to reach a given loss, as a function of the batch size: [plot not available]

For small batch sizes, the number of iterations required to reach the target loss decreases as the batch size increases. Why is that? (2 marks)

( ) Write down the number of parameters in each field. Assume the convolution filter is of shape 3×3×64. What would be the values in the fields II, III, and V? (3 marks)

( ) You are given a black box optimizer which produces the loss curve shown in Figure A. You see a big red button on the optimizer and decide to push it. After doing this, you notice the loss curve shown in Figure B. You press the button one more time and finally notice the loss curve shown in Figure C.

[Figures A, B, and C: loss curves, not available]

The red button modifies a single hyperparameter. Which hyperparameter is most likely to be modified by pressing the button? (1 mark)

Also, of experiments 1, 2 and 3, which corresponds to the largest magnitude of the hyperparameter? (1 mark)

Lastly, the loss curve for experiment 3 seems to be the most desirable. Despite this, give two reasons why you would choose the hyperparameter in experiment 2 for training your model. (2 marks)

Neural networks.

( ) Let us say you have a training set $s$ containing $m$ pairs $(x_i, y_i)$, where vector $x$ is to be assigned to one of $K$ classes in a supervised setting and the labels $y_i$ are the vectors in $\{0, 1\}^K$ containing a single 1 representing the target class, i.e., if there are 5 classes and some $x_i$ should be assigned to class 2 then $y_i = (0, 1, 0, 0, 0)$. To do this, it is proposed that you use $K$ neural networks. The $i$th network has parameters $w_i$ and computes the function $h(w_i, x)$. You may make no further assumptions regarding the function $h$.

You aim to treat the output of the $i$th network as an estimate of the probability $P(\text{class } i \mid x, w)$ that $x$ should be in the $i$th class, where $w$ collects together all the $K$ vectors $w_1, \dots, w_K$. It is proposed that to do this you should modify the setup described to compute

$$P(\text{class } i \mid x, w) = \text{prob}(i, x) = \frac{\exp(h(w_i, x))}{\sum_{j=1}^{K} \exp(h(w_j, x))}$$

Explain why this modification is required, and how it achieves the stated aim. (4 marks)
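The normalization proposed in this question can be sketched in a few lines of Python. This is an illustration only: the function name `softmax` and the raw output values are assumptions, standing in for the $K$ unconstrained outputs $h(w_i, x)$. The sketch shows that exponentiating makes every value positive and dividing by the sum makes the values add up to 1, which raw network outputs are not guaranteed to do.

```python
import math


def softmax(scores):
    """Map K unconstrained scores to K values that are positive and sum to 1."""
    # Subtracting the maximum leaves the result unchanged (the common factor
    # cancels in the ratio) and avoids overflow in exp for large scores.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs h(w_i, x) of K = 5 networks; note that some are
# negative and they do not sum to 1, so they cannot be read as probabilities.
raw_outputs = [2.0, -1.0, 0.5, 0.0, -3.0]
probs = softmax(raw_outputs)
print(probs)
print(sum(probs))
```

Because the exponential is increasing, the network with the largest raw output also receives the largest estimated probability.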

( ) Suppose a convolution layer takes a 32×32×3 input volume, and applies ten 5×5 filters with stride 1 pixel and padding 2 pixels. What will be the size of the output volume? (2 marks)
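One way to check a question like this is with the standard output-size relation for a convolution along one spatial dimension, (W - F + 2P) / S + 1, where W is the input width, F the filter width, P the padding and S the stride. The sketch below applies it to the numbers in the question. The function name `conv_output_size` is an assumption for illustration, and the output depth is taken to equal the number of filters.

```python
def conv_output_size(input_size, filter_size, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# Values taken from the question: 32x32x3 input, ten 5x5 filters,
# stride 1, padding 2.
side = conv_output_size(input_size=32, filter_size=5, stride=1, padding=2)
num_filters = 10  # each filter produces one output channel
output_volume = (side, side, num_filters)
print(output_volume)  # (32, 32, 10)
```

With padding 2 and a 5×5 filter at stride 1 the spatial size is preserved, so only the depth changes, from 3 input channels to 10 output channels.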

( ) Given the graphs of testing and training error, do you think an evident problem here is overfitting? Yes or no, please justify your answer! (4 marks)

