Residual Networks (ResNet) - Deep Learning

Last Updated : 12 May, 2026

Residual Networks (ResNet) is a deep learning architecture designed to enable efficient training of very deep neural networks. It introduces skip (shortcut) connections, which allow the model to learn residual mappings instead of direct transformations.

Helps prevent vanishing gradient problems in very deep models
Allows information to flow directly across layers using skip connections
Enables building networks with hundreds or even thousands of layers

[Figure: Residual Block]

Challenges in Deep Neural Networks

Deep Neural Networks are powerful models, but training them becomes difficult as network depth increases. Two major issues are:

1. Vanishing/Exploding Gradient Problem: As the number of layers increases, gradients can become extremely small (vanishing) or very large (exploding) during backpropagation, making training unstable.

2. Degradation Problem: Increasing network depth does not always improve performance and can even degrade it.

Performance Plateau: Accuracy saturates once the network passes a certain depth
Accuracy Degradation: Beyond that depth, both training and validation error increase, so the drop is not caused by overfitting

Key Features

Residual Connections: Enable very deep networks by allowing gradients to flow through identity shortcuts, reducing the vanishing gradient problem.
Identity Mapping: Simplifies training by learning residual functions instead of full mappings.
Depth: Supports extremely deep architectures for improved image recognition performance.
Fewer Parameters: Achieves high accuracy with fewer parameters than comparable VGG-style networks, which improves computational efficiency.

The following graph compares training and test errors of 20-layer and 56-layer plain networks (without residual connections) on CIFAR-10, highlighting the limitations of deeper networks.

Training error: The 56-layer network ends with a higher training error than the 20-layer network, so the extra depth makes optimization harder rather than easier
Test error: The deeper network also has higher test error (degradation problem), and because its training error is higher too, overfitting does not explain the gap

[Figure: Comparison of 20-layer vs 56-layer architecture]

ResNet-34

ResNet-34 is a deep residual network built on a 34-layer plain network inspired by VGG-19, with shortcut connections forming 16 residual blocks. The architecture is organized into stages as follows:

First stage: 3 residual blocks, each with 2 convolution layers of 64 filters and identity skip connections
Second stage: 4 residual blocks, each with 2 convolution layers of 128 filters; uses 1×1 projection or padding for dimension matching
Third stage: 6 residual blocks, each with 2 convolution layers of 256 filters
Fourth stage: 3 residual blocks, each with 2 convolution layers of 512 filters
Output layer: Feature maps are passed through Global Average Pooling followed by a fully connected layer with softmax for classification
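The stage layout above can be checked with a little arithmetic. The sketch below is only an illustration: the names `STAGES` and `count_weighted_layers` are made up for this example, and it assumes the count of 34 includes the initial convolution before the first stage and the final fully connected layer, as in the original architecture.

```python
# Stage layout of ResNet-34 as listed above: (residual blocks, filters per conv layer).
STAGES = [(3, 64), (4, 128), (6, 256), (3, 512)]
CONVS_PER_BLOCK = 2  # each basic residual block holds two convolution layers


def count_weighted_layers(stages, convs_per_block=CONVS_PER_BLOCK):
    """Count weighted layers: the initial conv + all block convs + the final FC layer."""
    block_convs = sum(blocks * convs_per_block for blocks, _ in stages)
    return 1 + block_convs + 1


total_blocks = sum(blocks for blocks, _ in STAGES)
total_layers = count_weighted_layers(STAGES)
print(total_blocks, total_layers)
```

The 16 blocks contribute 32 convolution layers, and the two extra layers bring the total to 34.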

[Figure: ResNet34]

Working

Conventional networks try to learn the full mapping $H(x)$ directly. ResNet instead learns a residual function $F(x) = H(x) - x$ and combines it with the input via a skip connection:

$$H(x) = F(x) + x$$

where:

$x$: input to the block
$H(x)$: desired mapping
$F(x)$: residual function to be learned

Learning the simpler residual makes optimization easier.

1. Residual Block: A residual block is the core unit of ResNet and consists of

One or more convolutional layers
A skip connection that bypasses these layers
Addition of input to the convolution output

This design ensures smooth flow of information and gradients across layers.
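To make the block structure concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions: small dense (matrix) layers stand in for the convolutions, batch normalization is left out, and the helper names `relu`, `linear` and `residual_block` are hypothetical.

```python
def relu(values):
    """Element-wise ReLU on a list of numbers."""
    return [max(0.0, v) for v in values]


def linear(weights, values):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F(x) = w2 @ relu(w1 @ x)."""
    residual = linear(w2, relu(linear(w1, x)))
    return relu([r + xi for r, xi in zip(residual, x)])


x = [1.0, 2.0, 3.0]
zero = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
half = [[0.5 if i == j else 0.0 for j in range(3)] for i in range(3)]

identity_out = residual_block(x, zero, zero)   # F(x) = 0, so the block passes x through
scaled_out = residual_block(x, identity, half)  # F(x) = 0.5 * x is added on top of x
```

With all weights at zero the block reduces to the identity for non-negative inputs, which is why adding more blocks need not make the network worse.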

[Figure: Residual Block]

2. Skip (Shortcut) Connection

Bypasses one or more layers
Adds the input directly to the output
Mitigates vanishing gradients by giving them a direct path backward
Keeps parameter updates in early layers from shrinking toward zero
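The effect on gradients can be illustrated with a scalar toy model. For a block $y = x + F(x)$ the local derivative is $1 + F'(x)$, so the skip connection adds 1 to every factor in the backpropagated product. The function name `chain_gradient` and the value 0.1 below are hypothetical choices for illustration, not measurements from a real network.

```python
def chain_gradient(local_grads, use_skip):
    """Product of per-layer derivatives; a skip connection adds 1 to each factor."""
    total = 1.0
    for g in local_grads:
        total *= (1.0 + g) if use_skip else g
    return total


local_grads = [0.1] * 20   # hypothetical small derivative of F at each of 20 layers
plain = chain_gradient(local_grads, use_skip=False)     # shrinks toward zero
residual = chain_gradient(local_grads, use_skip=True)   # stays well above zero
print(plain, residual)
```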

3. Handling Dimension Mismatch: When input and output dimensions differ

Zero Padding: Adds extra zeros to the input to match output dimensions
Linear Projection: Uses a learnable 1x1 convolution to match input and output dimensions for the skip connection.
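Both options can be sketched for the channel vector at a single spatial position. This is a simplified illustration: the spatial stride that usually accompanies a dimension change is ignored, and the function names and the weight values are hypothetical.

```python
def zero_pad_shortcut(x, out_channels):
    """Zero padding: append zeros so the channel vector has out_channels entries."""
    return x + [0.0] * (out_channels - len(x))


def projection_shortcut(x, weights):
    """Linear projection: a 1x1 convolution at one pixel is a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


x = [1.0, 2.0]                       # 2 input channels at one spatial position
padded = zero_pad_shortcut(x, 4)     # 4 output channels, no new parameters

# Hypothetical learned 4x2 weight matrix for the projection
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
projected = projection_shortcut(x, w)
```

Zero padding adds no parameters, while the projection introduces one weight per input-output channel pair.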

4. Stacking Residual Blocks: Multiple residual blocks can be stacked to create deep architectures. This allows networks to go very deep without suffering from degradation.

5. Global Average Pooling (GAP): Before the final fully connected layer, ResNet uses GAP, which

Converts each feature map to a single value by averaging
Reduces the parameter count, which lowers the risk of overfitting
Produces a compact feature representation
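A minimal sketch of GAP on nested lists follows. The function name `global_average_pool` is hypothetical, and the 8x8x64 feature map with 10 classes used in the parameter comparison is an assumed example size, not a figure taken from this article.

```python
def global_average_pool(feature_maps):
    """Average each H x W feature map down to a single number."""
    pooled = []
    for fmap in feature_maps:
        values = [v for row in fmap for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


maps = [
    [[1.0, 2.0], [3.0, 4.0]],   # channel 0
    [[0.0, 0.0], [0.0, 8.0]],   # channel 1
]
pooled = global_average_pool(maps)   # one value per channel

# Weights in the final dense layer for an assumed 8x8x64 feature map and 10 classes
dense_after_flatten = 8 * 8 * 64 * 10
dense_after_gap = 64 * 10
```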

Implementation

We will implement ResNet (v1 and v2) for CIFAR-10 and cover data preprocessing, model creation, training and plotting graphs step by step.

Step 1: Importing Libraries

Import the required libraries:

tensorflow for building and training the model
keras for defining model layers and structure
numpy for numerical operations
os for managing files and directories

Step 2: Setting Hyperparameters

Set batch_size, epochs, num_classes and data_augmentation
Choose ResNet version and number of residual blocks
Compute depth based on CIFAR ResNet rules
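The depth computation can be sketched as below. This assumes the "CIFAR ResNet rules" are the commonly used formulas 6n + 2 for v1 (two-layer blocks) and 9n + 2 for v2 (three-layer bottleneck blocks), where n is the number of residual blocks per stage. The function name `resnet_depth` is hypothetical.

```python
def resnet_depth(n, version):
    """Depth of a CIFAR-style ResNet with n residual blocks per stage."""
    if version == 1:
        # 3 stages x n blocks x 2 conv layers, plus the first conv and the classifier
        return 6 * n + 2
    if version == 2:
        # 3 stages x n blocks x 3 conv layers (bottleneck), plus 2
        return 9 * n + 2
    raise ValueError("version must be 1 or 2")


n = 3
depth_v1 = resnet_depth(n, 1)
depth_v2 = resnet_depth(n, 2)
print(f"ResNet{depth_v1}v1, ResNet{depth_v2}v2")
```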

Step 3: Loading and Preprocessing CIFAR-10 Data

Load CIFAR-10 dataset using Keras.
Normalize pixel values to range [0, 1].
Optionally subtract the dataset mean for zero-centered input.
Convert labels to one hot vectors.

Output:

[Figure: Load Dataset]
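The preprocessing operations in this step can be illustrated on toy data without loading CIFAR-10. The sketch below uses nested lists in place of image arrays, and the helper names `normalize`, `subtract_mean` and `one_hot` are hypothetical; in practice these would be array operations in numpy or Keras utilities.

```python
def normalize(pixels):
    """Scale 8-bit pixel values into [0, 1]."""
    return [p / 255.0 for p in pixels]


def subtract_mean(images):
    """Subtract the per-position mean computed across all images."""
    count = len(images)
    mean = [sum(img[i] for img in images) / count for i in range(len(images[0]))]
    return [[v - m for v, m in zip(img, mean)] for img in images]


def one_hot(label, num_classes):
    """Return a vector with 1.0 at index `label` and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


raw = [[0, 255], [255, 255]]                 # two toy "images" of two pixels each
scaled = [normalize(img) for img in raw]     # values now lie in [0, 1]
centered = subtract_mean(scaled)             # each pixel position now has zero mean
encoded = one_hot(3, 10)                     # class 3 out of 10 classes
```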

Step 4: Defining Learning Rate

Define learning rate for our model.
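One common way to define the learning rate is a step schedule that lowers it at fixed epochs. The sketch below is an assumption about what such a schedule might look like: the function name `lr_schedule`, the base rate and the epoch milestones are illustrative values, not settings confirmed by this article.

```python
def lr_schedule(epoch, base_lr=1e-3):
    """Step schedule: shrink the learning rate at fixed epoch milestones."""
    lr = base_lr
    if epoch > 180:
        lr *= 0.5e-3
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    return lr


for epoch in (0, 100, 130, 170, 200):
    print(epoch, lr_schedule(epoch))
```

A function with this shape, taking the epoch index and returning a rate, is what a per-epoch scheduler callback would typically call.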

Step 5: Defining a ResNet Layer Function

Defines a single convolutional layer, optionally followed by BatchNorm and ReLU.
The conv_first flag applies the convolution before BatchNorm and ReLU; when it is off, the convolution comes after them
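The ordering logic of such a layer function can be sketched without any framework by returning the sequence of operations it would apply. The function name `resnet_layer_ops` and its parameters are hypothetical, and the assumed behavior when `conv_first` is off (BatchNorm and activation before the convolution) follows the usual pre-activation convention.

```python
def resnet_layer_ops(conv_first=True, batch_normalization=True, activation="relu"):
    """Return the ordered operation names for one ResNet layer."""
    post = []
    if batch_normalization:
        post.append("batch_norm")
    if activation is not None:
        post.append(activation)
    # conv_first=True gives conv -> BN -> activation; otherwise BN -> activation -> conv
    return ["conv"] + post if conv_first else post + ["conv"]


v1_style = resnet_layer_ops()                   # convolution first
v2_style = resnet_layer_ops(conv_first=False)   # convolution last
bare_conv = resnet_layer_ops(batch_normalization=False, activation=None)
```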

Step 6: Defining ResNet v1

Uses two-layer residual blocks for each residual unit
Computes the number of residual blocks from the network depth
Uses identity shortcuts when dimensions match and projection shortcuts when feature map dimensions change
Ends with Global Average Pooling and Dense softmax layer
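The block layout of a v1 network can be planned before any layers are built. The sketch below assumes a CIFAR-style design with three stages, 16 base filters that double at each stage, and a depth of the form 6n + 2; the function name `resnet_v1_plan` and the tuple layout are hypothetical.

```python
def resnet_v1_plan(depth, base_filters=16, num_stages=3):
    """List (stage, block, filters, stride, shortcut) for a CIFAR-style ResNet v1."""
    if (depth - 2) % 6 != 0:
        raise ValueError("depth should be 6n + 2 (e.g. 20, 32, 44)")
    blocks_per_stage = (depth - 2) // 6
    plan = []
    filters = base_filters
    for stage in range(num_stages):
        for block in range(blocks_per_stage):
            # The first block of every stage after the first halves the feature map
            # and doubles the filters, so its shortcut needs a projection.
            downsample = stage > 0 and block == 0
            stride = 2 if downsample else 1
            shortcut = "projection" if downsample else "identity"
            plan.append((stage, block, filters, stride, shortcut))
        filters *= 2
    return plan


plan = resnet_v1_plan(20)
for entry in plan:
    print(entry)
```

For depth 20 this gives nine blocks, of which only two need a projection shortcut.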

Step 7: Defining ResNet v2

Uses three-layer bottleneck residual blocks.
Handles identity or projection shortcuts for dimension matching.
Ends with BatchNorm, ReLU, GAP and a Dense softmax layer.
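A quick weight count shows why bottleneck blocks are attractive at large channel widths. The sketch assumes a 1x1 reduce, 3x3, 1x1 restore layout and compares it with two full-width 3x3 convolutions; the widths 256 and 64 and all function names are illustrative assumptions, and biases and BatchNorm parameters are ignored.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_ch * out_ch


def basic_block_params(channels):
    """Two 3x3 convolutions at full width."""
    return 2 * conv_params(3, channels, channels)


def bottleneck_block_params(channels, reduced):
    """1x1 reduce, 3x3 at reduced width, 1x1 restore."""
    return (conv_params(1, channels, reduced)
            + conv_params(3, reduced, reduced)
            + conv_params(1, reduced, channels))


basic = basic_block_params(256)
bottleneck = bottleneck_block_params(256, 64)
print(basic, bottleneck)
```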

Step 8: Compiling the Model

Instantiate v1 or v2 based on version.
Compile with Adam optimizer, categorical_crossentropy and accuracy metric.
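To show what the categorical_crossentropy loss measures, here is a small standalone sketch of softmax followed by cross-entropy. It is not the Keras implementation; the function names mirror the concepts only, and the clipping constant `eps` is an assumed safeguard against taking the log of zero.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probs, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probs))


probs = softmax([2.0, 1.0, 0.1])
loss_right = categorical_crossentropy([1.0, 0.0, 0.0], probs)  # true class has the top score
loss_wrong = categorical_crossentropy([0.0, 0.0, 1.0], probs)  # true class has the lowest score
print(loss_right, loss_wrong)
```

The loss is small when the model puts high probability on the true class and grows as that probability drops.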

Step 9: Setup Callbacks

ModelCheckpoint saves the best model.
LearningRateScheduler adjusts learning rate during training.
ReduceLROnPlateau reduces LR if validation performance plateaus.
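The idea behind reducing the learning rate on a plateau can be sketched as a small class. This is a toy stand-in, not the Keras callback: the class name `PlateauReducer`, its defaults and the sample loss values are all hypothetical.

```python
class PlateauReducer:
    """Toy stand-in for reduce-on-plateau logic; names and defaults are hypothetical."""

    def __init__(self, lr, factor=0.1, patience=2, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the (possibly reduced) LR."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


reducer = PlateauReducer(lr=1e-3)
lr_history = [reducer.step(loss) for loss in [1.0, 0.8, 0.8, 0.9, 0.7]]
print(lr_history)
```

Here the rate drops once, after two consecutive epochs without improvement.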

Step 10: Data Augmentation & Training

Uses ImageDataGenerator for real-time augmentation if enabled.
The history variable stores training metrics for plotting.

Output:

[Figure: Training]
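The kind of real-time augmentation used in Step 10 can be illustrated on a tiny image stored as a list of rows. This sketch does not use ImageDataGenerator; the functions `horizontal_flip`, `shift_right` and `augment`, and the choice of a one-pixel shift, are hypothetical examples of typical transformations.

```python
import random


def horizontal_flip(image):
    """Mirror each row of a 2D image (list of rows)."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift the image right by `pixels` columns, filling the gap with `fill`."""
    return [([fill] * pixels + row)[:len(row)] for row in image]


def augment(image, rng, max_shift=1):
    """Apply a random shift and a random horizontal flip."""
    out = shift_right(image, rng.randint(0, max_shift))
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return out


image = [[1, 2, 3], [4, 5, 6]]
flipped = horizontal_flip(image)
shifted = shift_right(image, 1)
augmented = augment(image, random.Random(0))   # seeded generator for repeatability
```

Because a new random variant is produced each time an image is drawn, the model rarely sees exactly the same input twice.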


ResNet Results on ImageNet and COCO

On the ImageNet dataset, a 152-layer ResNet, much deeper than VGG-19, achieved high accuracy with fewer parameters. An ensemble of ResNet models reached a 3.57% top-5 error on the test set. On the COCO dataset, ResNet showed a 28% relative improvement in object detection performance.

[Figure: Error rate on ResNet architecture]

The results show that shortcut connections effectively address the problems caused by increasing network depth: with ResNet, going from 18 to 34 layers lowers the error rate on the ImageNet validation set, whereas for plain networks the 34-layer model has a higher error than the 18-layer one.

[Figure: Top-1 and top-5 error rate on the ImageNet validation set]
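For readers unfamiliar with the metric, top-k error is the fraction of samples whose true label is not among the k highest-scoring classes. The sketch below uses made-up scores for three classes and a hypothetical function name `top_k_error`; it is unrelated to the actual ImageNet numbers.

```python
def top_k_error(scores_per_sample, labels, k):
    """Fraction of samples whose true label is not among the k highest scores."""
    misses = 0
    for scores, label in zip(scores_per_sample, labels):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


scores = [
    [0.1, 0.7, 0.2],   # true label 1: top-1 hit
    [0.5, 0.3, 0.2],   # true label 1: top-1 miss, top-2 hit
    [0.6, 0.3, 0.1],   # true label 2: miss for k = 1 and k = 2
    [0.2, 0.2, 0.6],   # true label 2: top-1 hit
]
labels = [1, 1, 2, 2]
top1 = top_k_error(scores, labels, 1)
top2 = top_k_error(scores, labels, 2)
print(top1, top2)
```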

Below are the results on the ImageNet test set. ResNet's 3.57% top-5 error rate was the lowest, which earned the architecture first place in the ImageNet classification challenge (ILSVRC) in 2015.

[Figure: Results on the ImageNet test set]

Advantages

Eases training of deep networks by allowing direct gradient flow through skip connections, reducing vanishing gradient problems
Enables very deep architectures (50–152+ layers) with stable training
Improves accuracy through residual learning in tasks like image classification and object detection
Reduces degradation as increasing depth does not increase training error in ResNet
Achieves better performance with fewer parameters compared to traditional deep networks

Challenges

Requires high computational power due to its deep architecture
Needs projection layers to handle dimension mismatch in skip connections
May overfit on small datasets because of large model capacity
Training can become unstable without proper batch normalization
Very deep networks may still face performance degradation in extreme cases


URL: https://www.geeksforgeeks.org/deep-learning/residual-networks-resnet-deep-learning/