URL: https://www.coursera.org/learn/deep-learning-advanced-backbones-and-efficient-gpu-training

Deep Learning: Advanced Backbones and Efficient GPU Training | Coursera



Gain insight into a topic and learn the fundamentals.
Intermediate level

Recommended experience

2 weeks to complete
at 10 hours a week
Flexible schedule
Learn at your own pace

What you'll learn

  • Build and fine-tune ConvNeXt and Vision Transformer models using PyTorch Lightning and the timm library

  • Apply RMSNorm, SwiGLU, and Rotary Position Embeddings (RoPE) in modern transformer architectures

  • Implement mixed precision, gradient accumulation, and DDP/FSDP for efficient multi-GPU training

  • Design, track, and benchmark CNN vs. ViT experiments using TensorBoard, W&B, and PyTorch Profiler

Details to know

Shareable certificate

Add to your LinkedIn profile

Recently updated!

May 2026

Assessments

16 assignments

Taught in English

Build your subject-matter expertise

This course is part of the Advanced Deep Learning Architectures Specialization
When you enroll in this course, you'll also be enrolled in this Specialization.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 4 modules in this course

Master advanced deep learning architectures and efficient training techniques using PyTorch Lightning, timm, ConvNeXt, Vision Transformers, RoPE, SwiGLU, RMSNorm, and Weights & Biases. This course equips you to design, train, and benchmark modern backbones on limited GPU hardware for real-world production use.

Module 1 introduces modern backbone architectures, tracing the evolution from ResNets to ConvNeXt and Vision Transformers, covering patch embeddings, multi-head self-attention, and position encodings. Module 2 dives into training dynamics and stabilization techniques including RMSNorm, SwiGLU activations, and Rotary Position Embeddings (RoPE) for stable, scalable training. Module 3 focuses on efficient training on limited GPUs using mixed precision (FP16/BF16), gradient accumulation, efficient data pipelines, and distributed training with DDP/FSDP in Lightning. Module 4 covers experiment tracking with TensorBoard and W&B, profiling FLOPs and throughput, and a hands-on ViT vs. CNN Showdown project with fine-tuning in timm.

By the end of this course, you will:
  • Build and fine-tune ConvNeXt and Vision Transformer backbones using PyTorch Lightning and timm
  • Apply RMSNorm, SwiGLU, and RoPE to stabilize and scale deep transformer training
  • Implement mixed precision, gradient accumulation, and DDP/FSDP for efficient multi-GPU training
  • Design controlled CNN vs. ViT experiments with W&B tracking and PyTorch profiling

Disclaimer: This is an independent educational resource created by Board Infinity for informational and educational purposes only. This course is not affiliated with, endorsed by, sponsored by, or officially associated with any company, organization, or certification body unless explicitly stated. The content provided is based on industry knowledge and best practices but does not constitute official training material for any specific employer or certification program. All company names, trademarks, service marks, and logos referenced are the property of their respective owners and are used solely for educational identification and comparison purposes.

Explore the evolution of deep learning backbones from classical CNNs to ConvNeXt and Vision Transformers, understanding their mechanics, trade-offs, and industry relevance.

What's included

10 videos • 3 readings • 4 assignments

10 videos • Total 80 minutes
  • Where Advanced Architectures Are Used Today • 11 minutes
  • CNNs vs Transformers: Industry Reality • 8 minutes
  • Skills You Need as a Vision Engineer • 7 minutes
  • Why Classic CNNs Started Failing • 8 minutes
  • What ConvNeXt Fixed in Old CNNs • 8 minutes
  • ResNet vs ConvNeXt - Part 1 • 8 minutes
  • ResNet vs ConvNeXt - Part 2 • 10 minutes
  • How Images Become Tokens • 6 minutes
  • What Attention Really Does • 6 minutes
  • CNN vs ViT: Choosing the Right Backbone • 9 minutes
3 readings • Total 90 minutes
  • Industry Landscape: Modern Backbones & Global Attention • 30 minutes
  • Architectural Transition: From ResNet to ConvNeXt • 30 minutes
  • Inside the ViT Forward Pass: Tokens, Attention & Positional Structure • 30 minutes
4 assignments • Total 150 minutes
  • Modern Backbone Architectures (ConvNeXt & Vision Transformers) • 60 minutes
  • Career Scope in Advanced Architectures • 30 minutes
  • The Evolution Beyond ResNets • 30 minutes
  • Vision Transformers Under the Hood • 30 minutes
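
As a rough illustration of the "How Images Become Tokens" topic listed above, the sketch below counts the tokens a ViT-style patch embedding would produce. It assumes a square image, non-overlapping square patches, and one optional class token. The function name `vit_sequence_length` is hypothetical and is not taken from timm or from the course materials.

```python
def vit_sequence_length(image_size: int, patch_size: int, class_token: bool = True) -> int:
    """Count tokens for a square image cut into non-overlapping square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # One extra token if the model prepends a class token (an assumption here).
    return num_patches + (1 if class_token else 0)


if __name__ == "__main__":
    # 224 / 16 = 14 patches per side, 14 * 14 = 196 patches, plus 1 class token.
    print(vit_sequence_length(224, 16))  # 197
```

Because self-attention cost grows with the square of the sequence length, halving the patch size roughly quadruples the token count, which is one reason patch size is a central trade-off when choosing a backbone.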

Learn modern stabilization and efficiency techniques including RMSNorm, SwiGLU activations, and Rotary Position Embeddings that power state-of-the-art transformers.
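
For readers who want a concrete picture of two of the techniques named here, the following is a minimal plain-Python sketch of RMSNorm and of the SwiGLU gating step. It works on Python lists rather than tensors, leaves out the learned linear projections that a real SwiGLU feed-forward layer would apply, and uses hypothetical function names (`rms_norm`, `silu`, `swiglu_gate`); it is not the course's implementation.

```python
import math


def rms_norm(x, gain=None, eps=1e-6):
    """Divide x by its root mean square; unlike LayerNorm, no mean is subtracted."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    gain = gain if gain is not None else [1.0] * len(x)
    return [g * v / rms for v, g in zip(x, gain)]


def silu(v):
    """SiLU (Swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_gate(gate_proj, value_proj):
    """Elementwise SwiGLU gating: silu(gate) * value.

    gate_proj and value_proj stand in for two separate linear projections
    of the same input; the projections themselves are omitted here.
    """
    return [silu(g) * u for g, u in zip(gate_proj, value_proj)]


if __name__ == "__main__":
    normed = rms_norm([3.0, 4.0])
    print(normed)                                   # mean of squares is close to 1
    print(swiglu_gate([0.0, 2.0], [5.0, 1.0]))      # first entry is gated to 0
```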

What's included

8 videos • 3 readings • 4 assignments

8 videos • Total 66 minutes
  • Why Normalization Is Needed • 6 minutes
  • BatchNorm vs LayerNorm vs RMSNorm • 8 minutes
  • Practical Effects on Training Stability • 7 minutes
  • Why ReLU Is Not Enough Anymore • 9 minutes
  • GELU & SwiGLU Explained Visually • 8 minutes
  • Practical Gains: Stability, Expressiveness, Convergence Speed • 11 minutes
  • Why Position Encoding Matters • 8 minutes
  • RoPE Explained Intuitively • 9 minutes
3 readings • Total 90 minutes
  • Normalization Benchmarks in Modern Architectures • 30 minutes
  • SwiGLU in Production Transformers • 30 minutes
  • RoPE Explained: Sequence Extrapolation & Rotary Geometry • 30 minutes
4 assignments • Total 150 minutes
  • Training Dynamics & Stabilization Techniques • 60 minutes
  • RMSNorm & Normalization Strategies • 30 minutes
  • SwiGLU & Modern Activation Functions • 30 minutes
  • Rotary Position Embeddings (RoPE) • 30 minutes
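
The rotary geometry behind RoPE can be sketched in a few lines. The example below rotates consecutive pairs of a vector by a position-dependent angle and then checks that the query-key dot product depends only on the offset between the two positions. The interleaved pairing, the base of 10000, and the names `rope` and `dot` are assumptions made for this illustration; library implementations may pair dimensions differently.

```python
import math


def rope(x, position, base=10000.0):
    """Rotate each pair (x[2i], x[2i+1]) by the angle position * base**(-2i/d)."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("dimension must be even")
    out = []
    for i in range(d // 2):
        angle = position * base ** (-2.0 * i / d)
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[2 * i], x[2 * i + 1]
        out.extend([a * c - b * s, a * s + b * c])
    return out


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.1]

# Same relative offset (2) at two very different absolute positions.
score_a = dot(rope(q, 5), rope(k, 3))
score_b = dot(rope(q, 105), rope(k, 103))

if __name__ == "__main__":
    print(abs(score_a - score_b) < 1e-9)  # True: only the offset matters
```

Since each pair is only rotated, vector norms are preserved, and the attention score between two tokens becomes a function of their relative position rather than their absolute positions.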

Master practical techniques for training large models on limited hardware including mixed precision, gradient accumulation, and distributed training strategies.
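
To make the FP16 versus BF16 trade-off concrete, the sketch below round-trips Python floats through both formats using only the standard `struct` module. It is a bit-level illustration, not the way mixed precision is enabled in PyTorch or Lightning. Two simplifications are assumptions of this example: `to_bf16` truncates the low 16 bits of the float32 encoding instead of rounding to nearest, and `to_fp16` maps overflow to infinity.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip through IEEE half precision (5 exponent bits, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values outside the FP16 range; treat that as overflow to inf.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Keep the top 16 bits of the float32 encoding (8 exponent bits, 7 mantissa bits).

    Truncation is used for simplicity; real hardware usually rounds to nearest.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    for value in (1.001, 70000.0, 1e-8):
        print(value, to_fp16(value), to_bf16(value))
```

In this sketch FP16 keeps more digits near 1.0 but overflows at 70000 and flushes 1e-8 to zero, while BF16 stays finite and nonzero for both at the cost of coarser precision. That range difference is the usual reason FP16 training is paired with loss scaling.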

What's included

9 videos • 3 readings • 4 assignments

9 videos • Total 79 minutes
  • Why Mixed Precision Matters • 9 minutes
  • FP16 vs BF16: When to Use What • 9 minutes
  • Common Mixed Precision Failures • 8 minutes
  • What Gradient Accumulation Really Does • 10 minutes
  • Effective Batch Size Explained Clearly • 10 minutes
  • Stability Issues with Large Batches • 9 minutes
  • Single GPU vs Multi-GPU: When to Scale • 9 minutes
  • DDP vs FSDP (Decision-Based) • 8 minutes
  • Measuring Speed & Memory Correctly • 8 minutes
3 readings • Total 90 minutes
  • AMP Benchmarks & Failure Patterns • 30 minutes
  • Efficient Data Pipelines for Transformers & ViTs • 30 minutes
  • Distributed Training on Commodity Hardware • 30 minutes
4 assignments • Total 150 minutes
  • Efficient Training on Limited GPUs • 60 minutes
  • Mixed Precision Training (FP16/BF16) • 30 minutes
  • Gradient Accumulation & Large-Batch Simulation • 30 minutes
  • Distributed Training with Lightning (DDP/FSDP) • 30 minutes
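
The idea behind gradient accumulation and effective batch size can be checked with a small plain-Python example. The sketch below uses a one-parameter least-squares model (an assumption chosen only to keep the arithmetic visible) and shows that summing micro-batch gradients, each scaled by 1/steps, reproduces the full-batch gradient. The function names are hypothetical and no framework is involved.

```python
def grad_mse(w, xs, ys):
    """Gradient with respect to w of mean((w * x - y) ** 2) over the samples."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch):
    """Accumulate per-micro-batch gradients, each scaled by 1 / steps."""
    if len(xs) % micro_batch != 0:
        raise ValueError("sample count must be divisible by micro_batch")
    steps = len(xs) // micro_batch
    total = 0.0
    for s in range(steps):
        lo = s * micro_batch
        hi = lo + micro_batch
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


def effective_batch_size(micro_batch, accumulation_steps, num_devices=1):
    """Samples contributing to one optimizer step under data parallelism."""
    return micro_batch * accumulation_steps * num_devices


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
    full = grad_mse(0.5, xs, ys)
    accumulated = accumulated_grad(0.5, xs, ys, micro_batch=2)
    print(abs(full - accumulated) < 1e-9)   # True
    print(effective_batch_size(8, 4, 2))    # 64
```

The equivalence holds here because the loss is a mean over samples and all micro-batches have the same size. Layers that compute batch statistics, such as BatchNorm, still see only the micro-batch, so accumulated training is not identical to true large-batch training in that case.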

Learn to track experiments professionally and apply all course concepts in a hands-on ViT vs CNN Showdown project using fine-tuning with timm and PyTorch Lightning.

What's included

12 videos • 3 readings • 4 assignments

12 videos • Total 101 minutes
  • What to Track • 9 minutes
  • Visualizing Loss Curves, Gradient Norms & Failure Modes • 9 minutes
  • Profiling Memory & FLOPs • 10 minutes
  • How Bad Comparisons Happen • 8 minutes
  • Controlling Variables Properly • 8 minutes
  • Forming Clear Hypotheses • 6 minutes
  • Fine-Tuning ConvNeXt & ViT • 10 minutes
  • Fine-Tuning ConvNeXt & ViT Part 2 • 6 minutes
  • Fine-Tuning ConvNeXt & ViT Part 3 • 9 minutes
  • Applying Mixed Precision & Efficiency Techniques Part 1 • 7 minutes
  • Applying Mixed Precision & Efficiency Techniques • 7 minutes
  • Interpreting Results Like an Engineer • 10 minutes
3 readings • Total 90 minutes
  • Experiment Reproducibility & Performance Debugging • 30 minutes
  • Backbone Design Patterns: Freezing, Unfreezing, Adapters, Head Tuning • 30 minutes
  • Case Study: Fine-Grained Classification with Modern Backbones • 30 minutes
4 assignments • Total 150 minutes
  • Experimentation, Tracking & The ViT vs CNN Showdown Project • 60 minutes
  • Experiment Tracking (TensorBoard & W&B) • 30 minutes
  • Designing a CNN vs ViT Experiment • 30 minutes
  • The Hands-On Project - The ViT vs CNN Showdown • 30 minutes
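
One simple way to approach the "Controlling Variables Properly" topic is to compare run configurations before training and confirm that only the intended variable differs. The sketch below is a hypothetical helper, not part of W&B, TensorBoard, or the course project; the config keys and values are illustrative assumptions.

```python
def differing_keys(config_a: dict, config_b: dict) -> set:
    """Return keys whose values differ or that appear in only one config."""
    missing = object()  # sentinel that never equals a real config value
    keys = set(config_a) | set(config_b)
    return {k for k in keys if config_a.get(k, missing) != config_b.get(k, missing)}


# Hypothetical run configs for a CNN vs ViT comparison.
cnn_run = {
    "backbone": "convnext_tiny",
    "epochs": 10,
    "lr": 1e-4,
    "precision": "bf16",
    "seed": 0,
}
vit_run = {**cnn_run, "backbone": "vit_small_patch16_224"}

changed = differing_keys(cnn_run, vit_run)

if __name__ == "__main__":
    print(changed)  # {'backbone'}
```

If the returned set contains anything besides the backbone, for example a different learning rate or seed, the comparison is no longer controlled and the results are harder to attribute to the architecture.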

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Board Infinity
261 Courses • 428,749 learners

Why people choose Coursera for their career

Felipe M.
Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.
Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.
Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.
"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Frequently asked questions

Yes. You should have working knowledge of PyTorch, CNNs, and standard training loops. Familiarity with transformers is helpful but not mandatory.

You'll work with PyTorch Lightning, the timm library, Weights & Biases, TensorBoard, and the PyTorch Profiler throughout the course.

The course prepares you for roles such as Deep Learning Engineer, Computer Vision Engineer, ML Research Engineer, and AI Infrastructure Engineer.

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you'll find a link to apply on the description page.
