Deep Learning for AI Part 1

Instructor: Xuemin Jin

7 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Some related experience required

2 weeks to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

Details to know

Shareable certificate

There are 7 modules in this course

This is Part 1 of a two-part graduate sequence in deep learning. It establishes the foundations of modern deep learning and the core neural architectures behind today's AI systems. You will build from how neural networks learn—through forward propagation and backpropagation—to convolutional networks for computer vision, recurrent networks for sequence data, and the first generative architectures: variational autoencoders, generative adversarial networks, and Transformers. The course emphasizes both conceptual understanding and hands-on implementation in TensorFlow/Keras and PyTorch. Part 2 continues with advanced generative modeling.

Deep learning has transformed artificial intelligence by enabling models to learn hierarchical representations directly from raw data—dramatically outperforming traditional hand-engineered approaches across vision, language, and scientific domains. You will build the conceptual and practical vocabulary the entire course depends on: how neural networks are constructed, how training proceeds through forward and backward passes, and why deep learning is particularly suited to unstructured, high-dimensional data.
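To make the forward and backward passes concrete, here is a minimal sketch in plain Python rather than TensorFlow or PyTorch. It trains a single linear neuron with gradient descent on a squared-error loss. The toy data, the learning rate, and the function names `forward` and `train` are illustrative assumptions, not material from the course.

```python
# Minimal sketch: one linear neuron trained with full-batch gradient descent.
# The data, learning rate, and step count are illustrative assumptions.

def forward(w, b, x):
    """Forward pass: compute the prediction for one input."""
    return w * x + b

def train(data, lr=0.05, steps=500):
    """Fit w and b by repeating forward pass, backward pass, and update."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, y in data:
            pred = forward(w, b, x)
            # Backward pass: chain rule applied to loss = (pred - y) ** 2.
            dloss_dpred = 2.0 * (pred - y)
            grad_w += dloss_dpred * x  # dpred/dw = x
            grad_b += dloss_dpred      # dpred/db = 1
        # Gradient descent update using the mean gradient over the data.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Toy data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(data)
```

A deep network repeats the same pattern layer by layer: the forward pass stores intermediate values, and the backward pass multiplies local derivatives along the chain.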

What's included

2 videos•15 readings•3 assignments

2 videos•Total 3 minutes

Why Deep Learning? Modern AI Applications•2 minutes
Neural Networks•2 minutes

15 readings•Total 155 minutes

Course Introduction•2 minutes
Syllabus - Deep Learning for AI Part 1•10 minutes
Meet Your Faculty•1 minute
Academic Integrity•1 minute
Deep Learning Overview and Motivation•15 minutes
Real-World Applications Across Vision, Language, and Science•10 minutes
Neurons, Layers, and the Network Structure•5 minutes
Weights, Biases, and Learned Parameters•10 minutes
The Forward Pass: Computing Predictions•1 minute
Loss Functions for Classification and Reconstruction•10 minutes
Backpropagation and the Chain Rule•30 minutes
Optimization Algorithms: SGD, Momentum, AdaGrad, and Adam•30 minutes
Choosing a Framework: TensorFlow vs. PyTorch•10 minutes
Tensor Fundamentals: Scalars Through 3D+ Tensors and Tensor Attributes•10 minutes
Discriminative vs. Generative Models: A Course Preview•10 minutes

3 assignments•Total 90 minutes

Assess Your Learning: Why Deep Learning and Neural Network Architecture•30 minutes
Assess Your Learning: Forward Propagation and Backpropagation•30 minutes
Assess Your Learning: Frameworks, Tensors, and Discriminative vs. Generative Models•30 minutes

Convolutional Neural Networks are the architectural backbone of modern computer vision and a component you will encounter repeatedly throughout this course—inside autoencoders, GANs, and diffusion model U-Nets. You will develop the ability to read, design, and reason about CNN architectures from filter-level convolution operations through landmark designs like VGG and ResNet, and learn how pretrained models can be adapted to new tasks through transfer learning.
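As a rough illustration of the filter-level convolution operation, the sketch below slides a small kernel over a 2D input in plain Python. The function name `conv2d`, the zero-padding choice, and the toy image are assumptions made for this example. Like most deep learning frameworks, it computes cross-correlation (the kernel is not flipped).

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Slide a square kernel over a 2D list and return the feature map."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    # Zero-pad the input on all four sides.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for i in range(h):
        for j in range(w):
            padded[i + padding][j + padding] = image[i][j]
    # Output size per dimension: floor((n + 2p - k) / s) + 1.
    out_h = (h + 2 * padding - k) // stride + 1
    out_w = (w + 2 * padding - k) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(k):
                for dj in range(k):
                    total += padded[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out

# A 4x4 image with a vertical edge and a 2x2 filter that responds to it.
image = [[0, 0, 1, 1]] * 4
kernel = [[-1, 1], [-1, 1]]
feature_map = conv2d(image, kernel)
```

The feature map is largest where the window straddles the edge, which is the sense in which a filter "detects" a pattern. Raising `stride` shrinks the output, and `padding` grows it, following the size formula in the comment.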

What's included

1 video•9 readings•3 assignments

1 video•Total 3 minutes

Batch Normalization, Dropout, and Activation Functions•3 minutes

9 readings•Total 105 minutes

Why Convolutional Neural Networks?•10 minutes
Convolution Mechanics and Filter Visualization•15 minutes
Padding Modes and Stride•5 minutes
Max Pooling and Average Pooling•10 minutes
Strided Convolution and Feature Map Interpretation•5 minutes
Batch Normalization and Internal Covariate Shift•10 minutes
Dropout Ratios and Activation Function Choices•10 minutes
CNN Layer Flow and Architecture Patterns•10 minutes
A Simple CNN Example: MNIST and CIFAR-10•30 minutes

3 assignments•Total 90 minutes

Assess Your Learning: Why CNNs, Convolution, and Pooling•30 minutes
Assess Your Learning: Batch Normalization, Dropout, and CNN Layer Flow•30 minutes
Assess Your Learning: CNN Worked Example•30 minutes

Computer vision is the field that enables machines to perceive and interpret visual information—the domain where deep learning first achieved superhuman performance. You will survey its core tasks, from image classification and object detection to semantic segmentation, then work through the full detection pipeline from the R-CNN family to YOLOv8, gaining enough architectural depth to understand how these systems are extended and fine-tuned for new domains.
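Detection pipelines typically end by pruning overlapping boxes with non-maximum suppression. The sketch below shows the common greedy form in plain Python. It is not YOLOv8's actual implementation; the `(x1, y1, x2, y2)` box format, the 0.5 threshold, and the function names are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping detections and one separate detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of 81/119, so only the higher-scoring of the two survives alongside the separate third box.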

What's included

10 readings•3 assignments

10 readings•Total 132 minutes

What Is Computer Vision? Goals, Scope, and Task Taxonomy•10 minutes
Image and Video Data Types and Applications•30 minutes
R-CNN and the Region Proposal Approach•10 minutes
Fast R-CNN, Faster R-CNN, and Two-Stage Detection•30 minutes
The YOLO Concept and Architecture Overview•10 minutes
YOLOv8: Backbone, FPN Neck, and Detection Head•10 minutes
YOLOv8: Loss Function and Non-Maximum Suppression•10 minutes
Data Preparation and Training Walkthrough•2 minutes
Feature Extraction vs. Fine-Tuning for Vision•10 minutes
The Keras Pretrained Model API for Vision•10 minutes

3 assignments•Total 90 minutes

Assess Your Learning: Computer Vision Tasks and R-CNN•30 minutes
Assess Your Learning: YOLOv8 Architecture and Training•30 minutes
Assess Your Learning: Transfer Learning for Vision•30 minutes

The models you studied in earlier modules treat inputs as fixed-size, spatially arranged structures. Many real-world problems involve sequences where order matters and context accumulates over time: text, speech, time-series data, financial signals. You will learn how RNNs process sequences through a hidden state, how LSTMs and GRUs address the vanishing gradient problem, and why these architectures—and their failure modes—directly motivated the attention mechanism covered in the Transformer module.
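The hidden state update can be sketched in a few lines of plain Python. This assumes the standard vanilla RNN form h_t = tanh(W_xh x_t + W_hh h_{t-1} + b); the weight values and the names `rnn_step` and `run_rnn` are made up for illustration.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One vanilla RNN update: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    h = []
    for i in range(len(h_prev)):
        z = b[i]
        z += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        z += sum(w_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        h.append(math.tanh(z))
    return h

def run_rnn(sequence, w_xh, w_hh, b):
    """Unroll the RNN over a sequence, reusing the same weights at each step."""
    h = [0.0] * len(b)  # initial hidden state
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, b)
    return h

# Hypothetical weights: 1 input feature, 2 hidden units.
w_xh = [[1.0], [-1.0]]
w_hh = [[0.5, 0.0], [0.0, 0.5]]
b = [0.0, 0.0]

# The same inputs in a different order give a different final state.
h_a = run_rnn([[1.0], [0.0]], w_xh, w_hh, b)
h_b = run_rnn([[0.0], [1.0]], w_xh, w_hh, b)
```

Because the earlier input in `h_a` has passed through an extra multiplication by `w_hh` and an extra `tanh`, its influence on the final state is smaller than in `h_b`. Repeated over many steps, that shrinking is the intuition behind the vanishing gradient problem.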

What's included

12 readings•3 assignments

12 readings•Total 64 minutes

Why Recurrent Networks? Sequence Modeling Applications•5 minutes
RNN vs. CNN: Handling Temporal Data•10 minutes
The Hidden State Update and RNN Unrolling•10 minutes
Backprop Through Time and Vanishing/Exploding Gradients•5 minutes
LSTM Architecture: Forget, Input, and Output Gates•10 minutes
The Cell State and Long-Range Memory in LSTMs•2 minutes
GRU Architecture: Reset and Update Gates•1 minute
GRU vs. LSTM: Trade-offs and Selection Criteria•1 minute
The IMDB Dataset and One-Hot Encoding•5 minutes
Word Embeddings and Embedding Layers•5 minutes
Building the LSTM Model for Sentiment Analysis•5 minutes
Training, Evaluation, and Results•5 minutes

3 assignments•Total 90 minutes

Assess Your Learning: Introduction to RNNs and Backprop Through Time•30 minutes
Assess Your Learning: LSTM and GRU•30 minutes
Assess Your Learning: Text Data Handling and LSTM Sentiment Classification•30 minutes

This module marks the course's inflection point: the shift from discriminative models that learn decision boundaries to generative models that learn to synthesize new data. You will survey the full generative landscape—VAEs, GANs, autoregressive models, normalizing flows, diffusion models, and energy-based models—before diving into the autoencoder and its probabilistic extension, the Variational Autoencoder.
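Two pieces of the Variational Autoencoder lend themselves to a short sketch: sampling a latent vector with the reparameterization trick, and the KL term that pulls the latent distribution toward a standard normal. The plain-Python version below assumes a diagonal Gaussian encoder parameterized by a mean and a log-variance; the function names are illustrative, not the course's code.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1).

    The randomness lives in eps, so z stays a differentiable
    function of mu and log_var.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
z = reparameterize([0.5, -1.0], [0.0, 0.0], rng)
kl = kl_to_standard_normal([1.0], [0.0])
```

In a full VAE loss this KL term is added to a reconstruction term; the KL is zero only when the encoder outputs exactly a standard normal.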

What's included

1 video•14 readings•4 assignments

1 video•Total 3 minutes

Transposed Convolution•3 minutes

14 readings•Total 85 minutes

Generative vs. Discriminative Models•2 minutes
Challenges in Generative Modeling and a Toy Generative Model•2 minutes
Representation Learning and Probability Theory Review•2 minutes
Generative Model Taxonomy: VAEs, GANs, Flows, Diffusion, EBMs•5 minutes
Autoencoder Motivation and Architecture Overview•2 minutes
Building the Encoder and Decoder•10 minutes
Transposed Convolution for Decoding•10 minutes
The Probabilistic Extension: From AE to VAE•5 minutes
The Reparameterization Trick•10 minutes
The ELBO Loss Function•1 minute
KL Divergence and Regularizing the Latent Space•2 minutes
VAE vs. Autoencoder: Key Differences•2 minutes
Face Generation and Latent Space Arithmetic•30 minutes
Interpolating and Morphing in Latent Space•2 minutes

4 assignments•Total 120 minutes

Assess Your Learning: Introduction to Generative Modeling and Representation Learning•30 minutes
Assess Your Learning: Autoencoders and Latent Space Exploration•30 minutes
Assess Your Learning: VAE Probabilistic Framework and Loss•30 minutes
Assess Your Learning: VAE vs. Autoencoder and Worked Example•30 minutes

Generative Adversarial Networks take a fundamentally different approach to generative modeling: rather than maximizing a likelihood objective, two networks train in competition. You will work through the full GAN toolkit—from Deep Convolutional GANs and training stabilization techniques to Wasserstein distance, gradient penalty, conditional generation, and cycle-consistent domain translation.
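The competition between the two networks shows up directly in their loss functions. The sketch below assumes the original binary cross-entropy formulation with the commonly used non-saturating generator loss, not the Wasserstein variant. Inputs are discriminator outputs in (0, 1), and the function names are hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy with real samples labeled 1 and fakes labeled 0.

    d_real and d_fake are discriminator probabilities strictly between 0 and 1.
    """
    real_term = sum(-math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when fakes are rated as real."""
    return sum(-math.log(p) for p in d_fake) / len(d_fake)

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

The two objectives pull in opposite directions on the same quantity: the discriminator lowers its loss by pushing its output on generated samples toward 0, while the generator lowers its loss by pushing that output toward 1.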

What's included

10 readings•3 assignments

10 readings•Total 52 minutes

The Adversarial Framework: Generator and Discriminator•5 minutes
GAN Types, Applications, and Ethical Considerations•5 minutes
DCGAN Architecture and Design Principles•10 minutes
DCGAN Training: Fashion MNIST and Lego Bricks Examples•5 minutes
GAN Training Instability and Mode Collapse•5 minutes
Stabilization Techniques: Normalization, Learning Rate, Label Smoothing•2 minutes
Wasserstein Distance and the WGAN Objective•5 minutes
Gradient Penalty and WGAN-GP Training Results•5 minutes
Conditional GAN Architecture and Class Conditioning•5 minutes
CycleGAN and Unpaired Domain Translation•5 minutes

3 assignments•Total 90 minutes

Assess Your Learning: What Are GANs and Deep Convolutional GANs•30 minutes
Assess Your Learning: GAN Training Tips and WGAN-GP•30 minutes
Assess Your Learning: Conditional GANs and CycleGAN•30 minutes

Introduced in "Attention Is All You Need" (Vaswani et al., 2017), the Transformer is arguably the most consequential architectural development in deep learning since the CNN. You will derive the attention mechanism from first principles—Query, Key, Value, scaled dot-product, multi-head attention—assemble the full architecture with positional encoding and causal masking, and see it applied in a GPT-style language model.
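Scaled dot-product attention with an optional causal mask can be sketched in plain Python as follows. This is a single-head version with no learned projections; the function names and the toy vectors are assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values, causal=False):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V, row by row."""
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        scores = [sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        if causal:
            # Position i may only attend to positions j <= i.
            scores = [s if j <= i else -math.inf for j, s in enumerate(scores)]
        weights = softmax(scores)
        outputs.append([sum(w * v[c] for w, v in zip(weights, values))
                        for c in range(len(values[0]))])
    return outputs

# Three positions with 2-dimensional queries, keys, and values.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
masked = attention(queries, keys, values, causal=True)
```

With the causal mask, the first position can only attend to itself, so its output equals the first value vector. Multi-head attention runs several such computations in parallel on different learned projections and concatenates the results.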

What's included

1 video•11 readings•3 assignments

1 video•Total 2 minutes

The Attention Mechanism•2 minutes

11 readings•Total 102 minutes

Why Transformers? Advantages Over RNNs and CNNs•5 minutes
GPT Overview and Key Transformer Applications•2 minutes
What Is Attention? The Concept and Intuition•5 minutes
Attention Continued•5 minutes
Self-Attention and Network Parameters•10 minutes
Multi-Head Attention and Parallel Representation Learning•10 minutes
Positional Encoding: Sinusoidal and Learned•10 minutes
Causal Masking for Autoregressive Generation•10 minutes
Building a GPT-Style Language Model•10 minutes
Training, Generating, and Evaluating Text•30 minutes
Congratulations!•5 minutes

3 assignments•Total 90 minutes

Assess Your Learning: What Is a Transformer and the Attention Mechanism•30 minutes
Assess Your Learning: Multi-Head Attention and Positional Encoding•30 minutes
Assess Your Learning: GPT-Style Application•30 minutes

Instructor

Xuemin Jin

Northeastern University

8 Courses•1,167 learners

Offered by

Northeastern University

URL: https://www.coursera.org/learn/deep-learning-for-ai-part-1