![]() |
VOOZH | about |
Let us begin this article with a basic question - "Why padding and strided convolutions are required?" Assume we have an image with dimensions of n x n. If it is convoluted with an f x f filter, then the dimensions of the image obtained are .
Example:
Consider a 6 x 6 image as shown in figure below. It is to be convoluted with a 3 x 3 filter. The convolution is done using element wise multiplication.
👁 ImageFigure 1: Image obtained after convolution of 6x6 image with a 3x3 filter and s=0
Figure 2: 6 x 6 filter
Figure 3: 3 x 3 filter
Figure 4: Element wise multiplication
But there are two downsides of this convolution:
In order to avoid it, padding is required. Also, sometimes it happens that we have a very large input image is to be convoluted with an f x f filter which may be computationally very expensive. In this situation, strides are used . That is why padding and strides are one of the most basic building blocks of Convolutional Neural Networks
Dimensions of output image:
Lets have an n x n image to be convoluted with an f x f filter. Assume a padding border of p pixels and a stride s, then the dimensions of the output image obtained are:
The stride amount should be selected such that comparatively lesser computations are required and the information loss should be minimum.
👁 ImageFigure 5: Image obtained after convolution of 7x7 image with a 3x3 filter and a stride of 2