Generative AI and Large Language Models
This course is part of multiple programs.
4,778 already enrolled
12 reviews
Details to know
22 assignments
Build your subject-matter expertise
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills with hands-on projects
- Earn a shareable career certificate
There are 5 modules in this course
Welcome to the world of Generative AI and Large Language Models (LLMs), where technology mirrors human creativity and intelligence. This course is designed to provide you with a comprehensive understanding of generative models, including their evolution, applications, and the underlying architectures that make them possible.
Throughout the modules, you'll explore various generative techniques such as GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), diffusion models, and multimodal AI. You'll also gain hands-on experience with tools like OpenAI's GPT, Hugging Face, Streamlit, and MLflow, ensuring you can deploy and fine-tune models for real-world applications.
Take your first steps into the exciting world of generative AI, where you'll distinguish between various model types including GANs, VAEs, transformers, and diffusion models. You'll explore the evolution of generative technologies and examine their real-world applications while considering important ethical implications that accompany these powerful tools.
What's included
9 videos • 7 readings • 5 assignments • 2 ungraded labs
9 videos • Total 21 minutes
- Welcome to Generative AI • 2 minutes
- Training a Discriminative Model: Logistic Regression on 2D Blobs • 2 minutes
- Fitting and Visualizing a Generative Model • 2 minutes
- From GANs to Autoregressive Models: Hands-On with Generative Basics • 3 minutes
- Diffusion Models in Action: From Noise to Realistic Outputs • 2 minutes
- What Can LLMs Do Today? Real Use Cases Across Providers • 2 minutes
- What Can Vision-Language Models Do? Image + Text in Action • 3 minutes
- Uncovering Bias in LLM Outputs • 2 minutes
- Hallucinations & Misinformation in Action • 2 minutes
7 readings • Total 63 minutes
- Foundations of Generative AI • 8 minutes
- Types and Use Cases of Generative AI • 8 minutes
- Foundations of Generative Modeling: From GANs to VAEs • 10 minutes
- From Autoregressive to Diffusion: How Modern Generative Models Took Over • 10 minutes
- Understanding Large Language Models (LLMs): Capabilities, Providers, and Trends • 10 minutes
- Understanding Vision-Language Models (VLMs): Capabilities, Use Cases, and Trends • 10 minutes
- Responsible AI: Risks and Mitigation Strategies • 7 minutes
5 assignments • Total 90 minutes
- Foundations of Generative AI • 30 minutes
- Knowledge Check - What is Generative AI? • 15 minutes
- Knowledge Check - Generative Model Evolution • 15 minutes
- Knowledge Check - LLMs & VLMs • 15 minutes
- Knowledge Check - Ethical AI Deployment • 15 minutes
2 ungraded labs • Total 105 minutes
- Sample from a Simple Generative Model • 45 minutes
- Sample from a VAE and an Autoregressive Model • 60 minutes
Explore the revolutionary transformer architecture that powers today's most advanced language models. You'll gain hands-on experience with self-attention mechanisms, learn how transformers process and generate text, and experiment with fine-tuning using Hugging Face Transformers. This module bridges theory with practical implementation, equipping you with skills to work directly with cutting-edge LLM technology.
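As a rough illustration of the self-attention mechanism mentioned above, the following is a minimal sketch of scaled dot-product attention in plain Python. It is not taken from the course materials: the function names, the two-dimensional toy vectors, and the choice to use the same vectors as queries, keys, and values (with no learned projection matrices) are all assumptions made to keep the example small.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights): one output vector and one row of
    attention weights per query.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(row, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Hypothetical toy "token embeddings"; a real transformer would first
# project them with learned query, key, and value matrices.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Each row of `weights` shows how strongly one token attends to every token in the sequence, including itself. Multi-head attention runs several such computations in parallel on different learned projections and concatenates the results.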
What's included
7 videos • 6 readings • 4 assignments • 3 ungraded labs
7 videos • Total 16 minutes
- Transformers Made LLMs Possible: Here's Why That Matters • 2 minutes
- The Problem with RNNs and How Transformers Fix It • 4 minutes
- Self-Attention, Multi-Head Attention, and Feedforward Networks • 3 minutes
- Tuning LLM Output with Temperature, Top-k, and Top-p • 2 minutes
- Accessing LLMs Through APIs and UIs • 1 minute
- Prompt Engineering: Small Tweaks, Big Results • 2 minutes
- Fine-Tuning a Transformer with Hugging Face • 2 minutes
6 readings • Total 43 minutes
- From RNNs to Transformers: A New Way to Process Sequences • 8 minutes
- Anatomy of Transformers and Their Architectures • 4 minutes
- Prompt Engineering Essentials: How to Write Better Prompts • 10 minutes
- Calling LLMs via API: How to Get Started Safely and Effectively • 8 minutes
- LLM Fine-Tuning Strategies: From Supervised to Aligned • 7 minutes
- Understanding PEFT and Reinforcement Learning Fine-Tuning • 6 minutes
4 assignments • Total 75 minutes
- Working with Transformers and Fine-Tuning • 30 minutes
- Knowledge Check - Transformer Foundations • 15 minutes
- Knowledge Check - Prompt Engineering & APIs • 15 minutes
- Knowledge Check - LLM Fine-Tuning • 15 minutes
3 ungraded labs • Total 160 minutes
- Experiment with LLM Sampling Parameters • 40 minutes
- Prompt and Compare Across LLMs • 60 minutes
- Perform Lightweight Fine-Tuning with LoRA • 60 minutes
Take your LLM knowledge to the next level with practical applications that power modern AI systems. You'll implement retrieval-augmented generation to enhance responses with external knowledge, use structured output techniques for consistent formatting, and deploy models through APIs. This module tackles both the theory and practice behind modern LLM applications, showing you how to build real-world applications with today's most advanced language models.
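To make the retrieval-augmented generation idea above more concrete, here is a minimal sketch of the retrieve-then-prompt pattern. It is an illustration under stated assumptions, not the course's lab code: word-count vectors stand in for learned embeddings, a sorted list stands in for a FAISS index, and the helper names (`embed`, `retrieve`, `build_prompt`) and sample documents are hypothetical. The resulting prompt would then be sent to an LLM, which is left out here.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Ground the question in retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only this context:\n"
        f"{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base for illustration.
documents = [
    "FAISS is a library for vector similarity search.",
    "Streamlit turns Python scripts into web apps.",
    "Whisper transcribes speech to text.",
]
query = "Which library does vector similarity search?"
prompt = build_prompt(query, documents)
```

In a fuller pipeline, `embed` would call a sentence-embedding model and `retrieve` would query a vector index, but the overall flow of embedding the query, ranking documents, and prepending the best matches to the prompt stays the same.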
What's included
5 videos • 4 readings • 5 assignments • 3 ungraded labs
5 videos • Total 8 minutes
- Retrieving Knowledge: Embeddings and Vector Search with FAISS • 1 minute
- Grounded Generation: Adding Retrieval to an LLM Pipeline • 1 minute
- Prompting LLMs for Structured Output and Function Simulation • 2 minutes
- Deploying an LLM Using MLflow and Streamlit Cloud • 1 minute
- Simulate an AI Agent Using OpenAI Function Calling or Tool Simulation • 1 minute
4 readings • Total 37 minutes
- What is RAG and Why Does It Matter? • 10 minutes
- Designing for Structure: Output Formats and Tool Use • 10 minutes
- LLM Deployment: Options, Challenges, and Best Practices • 10 minutes
- AI Agents 101: Key Concepts and Applications • 7 minutes
5 assignments • Total 90 minutes
- Hands-on Applications of LLMs • 30 minutes
- Knowledge Check - Retrieval-Augmented Generation • 15 minutes
- Knowledge Check - Structured Output & Function Calls • 15 minutes
- Knowledge Check - LLM Deployment • 15 minutes
- Knowledge Check - AI Agents • 15 minutes
3 ungraded labs • Total 165 minutes
- Implement a Simple RAG Pipeline with FAISS and Hugging Face • 60 minutes
- Prompt LLMs for Structured Output + Simulated Function Use • 45 minutes
- Deploy a Text Generation Model with Streamlit + MLflow • 60 minutes
Discover the technology behind today's most impressive image generation systems. You'll learn how diffusion models gradually transform random noise into stunning visuals through an iterative denoising process. Through practical coding exercises, you'll implement your own diffusion model using PyTorch, explore Stable Diffusion for text-to-image generation, and compare diffusion with earlier approaches like GANs and VAEs to understand why diffusion has become the dominant paradigm in visual generation.
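The denoising process described above is trained to undo a fixed forward process that adds noise step by step. The sketch below simulates only that forward (noising) half, using the common closed form x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. It uses plain Python rather than PyTorch, and the linear schedule values (1e-4 to 0.02 over 1,000 steps), the function names, and the four toy "pixel" values are assumptions for illustration rather than the course's settings.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances rising linearly (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta) for all steps up to t."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0 without looping over earlier steps."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in x0]


rng = random.Random(0)  # seeded so repeated runs give the same samples
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

pixels = [0.0, 0.5, 1.0, -1.0]  # hypothetical pixel values scaled to [-1, 1]
slightly_noisy = forward_diffuse(pixels, 10, alpha_bars, rng)
mostly_noise = forward_diffuse(pixels, 999, alpha_bars, rng)
```

Early steps keep almost all of the original signal, while by the final step `alpha_bars[-1]` is close to zero and the sample is nearly pure Gaussian noise. A diffusion model learns to predict and remove that noise, which is what lets generation start from random noise and work backward.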
What's included
4 videos • 4 readings • 4 assignments • 3 ungraded labs
4 videos • Total 10 minutes
- Why Diffusion Has Become the Preferred Approach for High-Quality Image Generation • 3 minutes
- Text-to-Image Generation with Stable Diffusion • 1 minute
- Exploring Latent Space in Diffusion Models • 1 minute
- GANs vs. VAEs vs. Diffusion: What Do the Outputs Say? • 5 minutes
4 readings • Total 27 minutes
- Your First Tiny Diffusion Model: Simulate Diffusion in Pixel Space • 6 minutes
- The Diffusion Process Explained • 7 minutes
- Inside Stable Diffusion: Architecture and Prompt Control • 7 minutes
- Choosing the Right Generative Model: A Comparative Guide • 7 minutes
4 assignments • Total 75 minutes
- Diffusion and Generative Model Comparison • 30 minutes
- Knowledge Check - Diffusion Basics • 15 minutes
- Knowledge Check - Training with Stable Diffusion • 15 minutes
- Knowledge Check - Comparing Models • 15 minutes
3 ungraded labs • Total 165 minutes
- Simulate Forward Diffusion on Images Using PyTorch • 45 minutes
- Generate Custom Images with Stable Diffusion • 60 minutes
- Compare Outputs from GAN, VAE, and Diffusion Models • 60 minutes
Discover how cutting-edge AI models can integrate text, images, and audio to create truly multimodal experiences. You'll investigate vision-language models like CLIP and BLIP that understand relationships between text and images, implement audio-based AI with Whisper for speech recognition, and gain hands-on experience building systems that can process multiple types of data simultaneously. This module prepares you for the increasingly multimodal future of generative AI where models seamlessly combine different kinds of information.
What's included
6 videos • 4 readings • 4 assignments • 1 programming assignment • 3 ungraded labs
6 videos • Total 9 minutes
- From Text to Everything: The Multimodal Revolution • 2 minutes
- How Do Multimodal Models Combine Text, Image, and Audio? • 1 minute
- Generating Captions with BLIP • 1 minute
- Zero-Shot Image Classification with CLIP • 1 minute
- Transcribing Speech to Text with Whisper • 1 minute
- Generating Speech from Text with TTS • 1 minute
4 readings • Total 40 minutes
- Multimodal Generative AI: A Foundation • 10 minutes
- Understanding VLMs: CLIP, BLIP, Gemini • 7 minutes
- Understanding Whisper and Audio-Based Generative AI • 7 minutes
- Designing Your Own Generative AI Assistant: From Learner to Builder • 16 minutes
4 assignments • Total 75 minutes
- Multimodal Generative AI • 30 minutes
- Knowledge Check - Multimodal Foundations • 15 minutes
- Knowledge Check - CLIP, BLIP, and Gemini • 15 minutes
- Knowledge Check - Audio-Based Models • 15 minutes
1 programming assignment • Total 120 minutes
- Capstone Project - Build Your Own Generative AI Assistant with RAG, LLMs, and Multimodal Input • 120 minutes
3 ungraded labs • Total 150 minutes
- Explore Cross-Modal Embeddings with CLIP • 45 minutes
- Image Captioning and Classification with VLMs • 45 minutes
- Transcribe and Generate Audio with Whisper + TTS • 60 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Frequently asked questions
To access the course materials and assignments and to earn a Certificate, you need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may instead offer 'Full Course, No Certificate'. This option lets you see all course materials, submit required assessments, and get a final grade, but you will not be able to purchase a Certificate experience.
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
