What is Batch Normalization in CNN?

Last Updated : 3 Jul, 2025

Batch normalization is a technique used to improve the training of deep neural networks by stabilizing the learning process. It addresses the issue of internal covariate shift where the distribution of each layer's inputs changes during training as the parameters of the previous layers change..

Batch Normalization in CNN addresses several challenges encountered during training.

Addressing Internal Covariate Shift: Internal covariate shift occurs when the distribution of network activations changes as parameters are updated during training. It addresses this by normalizing the activations in each layer. This stabilizes training and speeds up convergence.
Improving Gradient Flow: It contributes to stabilizing the gradient flow during backpropagation by reducing reliance on gradients on parameter scales. As a result, training more stable enabling effective training of deeper networks without facing issues like vanishing or exploding gradients.
Regularization Effect: During training it introduces noise to the network activations. This noise helps to handle overfitting by adding randomness to the system.

How Does Batch Normalization Work in CNN?

Batch normalization works in convolutional neural networks (CNNs) by normalizing the activations of each layer across mini-batch during training. The working is discussed below:

1. Normalization within Mini-Batch

In a CNN, each layer receives inputs from multiple channels (feature maps) and processes them through convolutional filters. Batch Normalization operates on each feature map separately, normalizing the activations across the mini-batch.

During training, batch normalization (BN) regularizes the activations of each layer by subtracting the mean and dividing by the standard deviation of each mini-batch.

Mean Calculation:
Variance Calculation:
Normalization:

2. Scaling and Shifting

After normalization, it adjusts the normalized activations using learned scaling and shifting parameters. These parameters enable the network to instantaneously scale and shift the activations thereby maintaining the network's ability to represent complex patterns in the data.

Scaling:
Shifting:

3. Learnable Parameters

The parameters and are learned during training through backpropagation. This allows the network to adjust the normalization and ensure that the activations are in the appropriate range for learning.

4. Applying Batch Normalization

It is typically applied after the convolutional and activation layers in a CNN before passing the outputs to the next layer. It can also be applied before or after the activation function, depending on the network architecture.

5. Training and Inference

During training, Batch Normalization calculates the mean and variance of each mini-batch. During testing, it uses the averaged mean and variance that are calculated during training to normalize the activations. This ensures consistent normalization between training and testing.

Applying Batch Normalization in CNN model using TensorFlow

For applying batch normalization layers after the convolutional layers and before the activation functions, we use tensorflow's'tf.keras.layers.BatchNormalization()'.

1. Importing Required Libraries

2. Creating Sequential Model

First Convolutional Block

Conv2D(32): Extracts low-level features (edges, textures).
BatchNormalization(): Stabilizes and speeds up training.
MaxPooling2D(): Reduces spatial size.

Second Convolutional Block

Conv2D(64): Learns deeper patterns.
BatchNormalization(): Normalizes activations.
MaxPooling2D(): Further reduces size.

Dense Layers(Classifier)

Flatten(): Converts 3D feature map to 1D.
Dense(64): Learns high-level combinations.
Dense(10, softmax): Outputs probabilities for 10 classes.

Applying Batch Normalization in 1D CNN model using PyTorch

In PyTorch, we can easily apply batch normalization in a CNN model. For applying BN in 1D Convolutional Neural Network model, we use 'nn.BatchNorm1d()'.

1. Importing Libraries

2. Defining the Model

In this step, we structure our model:

Conv1d(3, 16): First convolution layer that transforms 3 input channels to 16 feature maps using 3-sized filters.
BatchNorm1d(16): Normalizes the 16 output channels to improve training stability.
Conv1d(16, 32): Second convolutional layer, increasing feature channels to 32.
BatchNorm1d(32): Normalizes the output from the second conv layer.
Linear(32 * 28, 10): Fully connected layer that maps the flattened feature map to 10 output classes.

3. Forward Pass

Prediction step and input flow:

self.conv1 -> bn1 -> ReLU: Applies the first convolution and activates.
self.conv2 -> bn2 -> ReLU: Applies the second convolution and activates.
view(-1, 32 * 28): Flattens the 3D tensor into 2D for the dense layer.
self.fc: Final layer that outputs a vector of size 10 (class scores).

4. Initialize Model

This code defines and creates a simple 1D Convolutional Neural Network (CNN1D) in PyTorch for classification tasks.

Applying Batch Normalization in 2D CNN model using PyTorch

For applying Batch Normalization in 2D Convolutional Neural Network model, we use 'nn.BatchNorm2d()'.

1. Importing Libraries

2. Structuring Model

In this step, we define the model:

Conv2d(3, 16, 3, 1, 1): Applies 16 filters to a 3-channel image using 3×3 kernels. Padding keeps the image size unchanged.
BatchNorm2d(16): Normalizes the 16 output feature maps from conv1.
Conv2d(16, 32, 3, 1, 1): Applies 32 filters, again preserving spatial dimensions.
BatchNorm2d(32): Normalizes the output of conv2.
Linear(32*28*28, 10): Fully connected layer that flattens the feature map and outputs scores for 10 classes.

3. Forward Pass

Prediction step and input flow:

Conv1 -> BN -> ReLU: First feature extraction block.
Conv2 -> BN -> ReLU: Second feature extraction block.
view(-1, 32*28*28): Flattens the 3D output to 1D for the dense layer.
fc: Maps the extracted features to 10 output classes.

4. Model Initialization

This code defines a 2D Convolutional Neural Network (CNN) in PyTorch for image classification into 10 classes.

In conclusion, batch normalization stands as a technique used in enhancing the training and performance of convolutional neural networks (CNNs).

For more detailed explanation regarding the implementation, refer to

Batch Normalization Implementation in PyTorch
Applying Batch Normalization in Keras using BatchNormalization Class

Comment

Article Tags:

Explore

Introduction to Computer Vision

Image Processing & Transformation

Feature Extraction and Description

Deep Learning for Computer Vision

Object Detection and Recognition

Image Segmentation

3D Reconstruction

Courses

URL: https://www.geeksforgeeks.org/computer-vision/what-is-batch-normalization-in-cnn/