Question:
In object recognition, translating an image by a few pixels in some direction should not affect the category recognized. Suppose that we consider images with an object in the foreground on top of a uniform background. Suppose also that the objects of interest are always at least 10 pixels away from the borders of the image. Are the following neural networks invariant to translations of at most 10 pixels in some direction?
Here the translation is applied only to the foreground object, while the background stays fixed. If your answer is yes, show that the neural network will necessarily produce the same output for two images in which the foreground object is translated by at most 10 pixels. If your answer is no, provide a counterexample by describing a situation in which the output of the neural network differs for two images where the foreground object is translated by at most 10 pixels.
(a) Neural network with one hidden layer consisting of convolutions (5 x 5 patches with a stride of 1 in each direction) and a softmax output layer.
(b) Neural network with two hidden layers consisting of convolutions (5 x 5 patches with a stride of 1 in each direction) followed by max pooling (4 x 4 patches with a stride of 4 in each direction) and a softmax output layer.
Step by Step Solution
To determine whether the given neural networks are invariant to translations of at most 10 pixels in some direction, let's analyze each case separately, focusing on the structure and operations performed i...
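A numerical check can make the two cases concrete. The following is a minimal NumPy sketch, not the solution from the original answer: the 64 x 64 image, the 8 x 8 square object, the single averaging filter, the three output classes, and the random softmax weights are all assumptions chosen for illustration. It builds two images whose foreground object differs by a 10-pixel horizontal shift and compares the outputs of networks shaped like (a) and (b).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv5x5(x, kernel):
    """Valid 5 x 5 convolution (cross-correlation), stride 1, followed by ReLU."""
    windows = sliding_window_view(x, kernel.shape)  # shape (H-4, W-4, 5, 5)
    return np.maximum(np.einsum("ijkl,kl->ij", windows, kernel), 0.0)

def maxpool4x4(x):
    """Max pooling over non-overlapping 4 x 4 patches (stride 4), dropping leftovers."""
    h, w = (x.shape[0] // 4) * 4, (x.shape[1] // 4) * 4
    return x[:h, :w].reshape(h // 4, 4, w // 4, 4).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def make_image(col, size=64, obj=8, row=20):
    """Uniform background (0) with a square object (1), far from the borders."""
    img = np.zeros((size, size))
    img[row:row + obj, col:col + obj] = 1.0
    return img

# Assumed filter: a single 5 x 5 averaging kernel shared by all conv layers.
KERNEL = np.full((5, 5), 1.0 / 25.0)

def features_a(img):
    """Network (a): one convolutional hidden layer, flattened."""
    return conv5x5(img, KERNEL).ravel()

def features_b(img):
    """Network (b): two hidden layers, each convolution + max pooling, flattened."""
    h = maxpool4x4(conv5x5(img, KERNEL))
    return maxpool4x4(conv5x5(h, KERNEL)).ravel()

# Assumed softmax output layers with fixed random weights (3 hypothetical classes).
rng = np.random.default_rng(0)
W_a = rng.normal(scale=0.1, size=(3, 60 * 60))
W_b = rng.normal(size=(3, 4))

img1 = make_image(col=20)
img2 = make_image(col=30)  # same object, translated 10 pixels to the right

out_a1, out_a2 = softmax(W_a @ features_a(img1)), softmax(W_a @ features_a(img2))
out_b1, out_b2 = softmax(W_b @ features_b(img1)), softmax(W_b @ features_b(img2))

print("(a) outputs:", out_a1, out_a2)
print("(b) features:", features_b(img1), features_b(img2))
print("(b) outputs:", out_b1, out_b2)
```

In this setup the feature map of network (a) for the second image is the first one shifted by 10 columns: convolution is translation-equivariant rather than invariant, and the softmax layer weights each position differently, so the outputs differ. For network (b), a 10-pixel shift crosses the boundaries of the 4 x 4 pooling patches, so even the pooled features change. A single such pair of images is enough to serve as a counterexample for each architecture under these assumed weights.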
