Building and Deploying Generative AI Models
This course is part of the Generative AI Fundamentals Specialization
Recommended experience
What you'll learn
Construct and evaluate Transformer-based LLMs from scratch using PyTorch and industry metrics like ROUGE and BLEU.
Engineer Retrieval Augmented Generation (RAG) pipelines using LangChain to integrate current, domain-specific knowledge into models.
Deploy autonomous AI Agents to production environments on Google Cloud Platform (Vertex AI) using professional workflows.
Skills you'll gain
Details to know
December 2025
3 assignments
There are 3 modules in this course
Transition from theoretical concepts to production-ready engineering in this hands-on course, the final part of the "Fundamentals of Generative AI" specialization. Designed for learners ready to move beyond theory, this course focuses entirely on construction: you won't just learn about Large Language Models (LLMs); you will build, refine, and deploy them.
We start at the foundational level, coding different types of Transformer architectures from scratch using PyTorch. Through high-performance training with Automatic Mixed Precision and ROUGE/BLEU evaluation, you will learn the techniques to scale custom components into optimized systems. By using pre-trained models and weighing performance trade-offs, you will gain the insight needed to select the most efficient path for large-scale deployment.
Moving to applied architecture, you will master Retrieval Augmented Generation (RAG) using LangChain, learning to evaluate pipelines and apply advanced techniques such as chunking strategies, reranking and compression, and query transformation. You will also navigate model selection and the critical trade-offs between RAG and fine-tuning.
Finally, you will step into the future of AI by developing autonomous agents. You will bridge the gap between development and production by setting up a professional workflow with Poetry and deploying a Summarizer AI Agent directly to Google Cloud Platform (Vertex AI). By the end of this course, you will have a tangible portfolio of code and a live deployment that demonstrate your ability to engineer robust Generative AI solutions.
In this module, we dive deep into the Transformer architecture, its core mechanics, and different transformer architecture types (encoder-only, decoder-only, encoder-decoder). We gain hands-on experience by building and training a complete suite of PyTorch-based models from scratch. The module concludes with strategic deployment skills, teaching when to build custom models versus leveraging pre-trained models for efficiency and state-of-the-art results.
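The core mechanic shared by all three Transformer types named above is scaled dot-product attention. The following is a minimal sketch of it, written in plain Python rather than PyTorch so that it runs without dependencies; the function names and the toy input are illustrative assumptions and are not taken from the course notebooks.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    queries, keys, values are lists of equal-length float vectors.
    With causal=True, position i may only attend to positions <= i,
    which is the masking used in decoder-only models.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # mask out future positions
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights = softmax(scores)
        # Each output vector is the attention-weighted average of the values.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy self-attention over three 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens, causal=True)
```

With `causal=True` the first token can only attend to itself, so its output equals its own value vector. An encoder-only model would call the same function with `causal=False`, and cross-attention in an encoder-decoder model would take queries from the decoder and keys and values from the encoder.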
What's included
18 videos, 11 readings, 1 assignment
18 videos • Total 113 minutes
- Course Introduction • 4 minutes
- Meet your instructor: Amreen Anbar • 1 minute
- Meet your instructor: Anahita Doosti • 1 minute
- Meet your instructor: Soroush Razavi • 1 minute
- Transformer: Evolution Unveiled • 8 minutes
- Transformer: Types • 8 minutes
- Transformer: The Components • 7 minutes
- Setting The Stage: Environment, Libraries and Data • 8 minutes
- Looking beyond theory: Let's Build a Transformer! • 9 minutes
- Looking beyond theory: Training and Text Generation • 8 minutes
- Building the Complete Encoder-Decoder Summarizer: Encoder, Decoder, and the Cross-Attention Mechanism • 7 minutes
- Building the Complete Encoder-Decoder Summarizer: Teacher Forcing, Loss, and Inference • 7 minutes
- Scaling the Architecture: From Character Tokens to BPE and Massive Data • 8 minutes
- Scaling the Architecture: High-Performance Optimization (AMP) and ROUGE Evaluation • 9 minutes
- Synthesis: Implementation of the Translator Transformer • 9 minutes
- Bypass the Training Wall: Powerful LLM Applications Without Massive Compute • 5 minutes
- A Resource-Efficient Approach: Using Pre-trained Models for Summarization • 6 minutes
- A Resource-Efficient Approach: Using Pre-trained Models for Translation • 8 minutes
11 readings • Total 290 minutes
- The original paper, "Attention Is All You Need" • 20 minutes
- Interactive Transformer Explainer • 30 minutes
- Notebook 1 • 40 minutes
- Notebook 2 • 40 minutes
- Notebook 3 • 40 minutes
- Dataset (cnn_dailymail) • 10 minutes
- Notebook 4 • 40 minutes
- Dataset (wmt14) • 10 minutes
- ROUGE and BLEU Score for NLP Evaluation • 20 minutes
- Notebook 5 • 20 minutes
- Notebook 6 • 20 minutes
1 assignment • Total 30 minutes
- Section 1 Quiz • 30 minutes
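The ROUGE and BLEU metrics listed in this module both rest on counting n-grams shared between a generated text and a reference. The sketch below shows only the unigram core of each: ROUGE-1 recall and the clipped unigram precision that BLEU builds on. It is a simplification under stated assumptions: full BLEU combines precisions for n-grams up to length four with a brevity penalty, ROUGE has further variants such as ROUGE-2 and ROUGE-L, and the function names and whitespace tokenization here are illustrative choices rather than the course's implementation.

```python
from collections import Counter


def _clipped_overlap(candidate_tokens, reference_tokens):
    """Count shared unigrams, clipping each token at its reference count."""
    cand = Counter(candidate_tokens)
    ref = Counter(reference_tokens)
    return sum(min(count, ref[token]) for token, count in cand.items())


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate."""
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    overlap = _clipped_overlap(candidate.lower().split(), ref_tokens)
    return overlap / len(ref_tokens)


def unigram_precision(candidate, reference):
    """Fraction of candidate unigrams found in the reference (BLEU-style clipping)."""
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    overlap = _clipped_overlap(cand_tokens, reference.lower().split())
    return overlap / len(cand_tokens)


reference = "the cat sat on the mat"
good = "the cat is on the mat"
degenerate = "the the the the"

good_recall = rouge1_recall(good, reference)
good_precision = unigram_precision(good, reference)
degenerate_precision = unigram_precision(degenerate, reference)
```

Clipping is what keeps the degenerate candidate from scoring perfectly: "the" occurs twice in the reference, so only two of its four occurrences are credited.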
Module 2 addresses the limitations of static knowledge and hallucinations in Large Language Models (LLMs) by introducing Retrieval Augmented Generation (RAG). Learners will progress from building fundamental pipelines with Ollama and LangChain to implementing production-ready systems by adding rigorous RAG evaluation and utilizing advanced techniques such as custom chunking strategies, vector stores, reranking, and query transformations to optimize context retrieval and response generation. The module concludes with an overview of another adaptation technique called finetuning and a comparison of RAG vs. finetuning.
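The retrieval half of a RAG pipeline can be sketched in a few lines: split text into overlapping chunks, score each stored passage against the query, and place the best matches into the prompt. The sketch below makes several assumptions that differ from the module itself: a bag-of-words cosine similarity stands in for an embedding model and vector store, no LLM is called, and all function names, parameters, and sample passages are hypothetical rather than taken from LangChain or Ollama.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=8, overlap=2):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def bag_of_words(text):
    """Lowercased token counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(passages, key=lambda p: cosine(q, bag_of_words(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, context):
    """Assemble the augmented prompt that would be sent to the LLM."""
    lines = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only the context below.\n\nContext:\n{lines}\n\nQuestion: {query}"


passages = [
    "Poetry manages Python dependencies and virtual environments.",
    "Vertex AI hosts deployed models on Google Cloud.",
    "ROUGE compares generated summaries with references.",
]
query = "Where are deployed models hosted?"
top = retrieve(query, passages, k=1)
prompt = build_prompt(query, top)
```

Reranking and query transformation, as covered later in the module, would slot in around `retrieve`: the former reorders its output with a stronger scorer, the latter rewrites `query` before it is scored.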
What's included
13 videos, 2 readings, 1 assignment
13 videos • Total 85 minutes
- What is RAG? • 6 minutes
- Building a Minimal RAG from Scratch with Ollama (Part 1) • 7 minutes
- Building a Minimal RAG from Scratch with Ollama (Part 2) • 5 minutes
- An Improved RAG Pipeline with LangChain • 7 minutes
- RAG Evaluation and Metrics • 7 minutes
- Implementing RAG Evaluation • 7 minutes
- Document Loaders and Chunking Strategies • 6 minutes
- Vector Stores and Indexing • 6 minutes
- Reranking and Contextual Compression • 7 minutes
- Query Transformation • 7 minutes
- Pick the Right Models for your RAG • 7 minutes
- What is Finetuning? • 5 minutes
- RAG vs. Finetuning: Which one to choose? • 7 minutes
2 readings • Total 140 minutes
- Coding Notebooks • 20 minutes
- Final RAG Results • 120 minutes
1 assignment • Total 30 minutes
- Section 2 Quiz • 30 minutes
Module 3 marks a pivotal transition from passive information retrieval to the dynamic realm of autonomous AI Agents, anchored by the "Understand, Think, Take Action" conceptual framework. Students will critically evaluate development ecosystems before applying these concepts to build a functional Summarizer Agent. The module emphasizes professional engineering standards, guiding learners through a complete lifecycle that includes environment management with Poetry, deployment to the Vertex AI Engine, and the implementation of robust performance monitoring using Google Cloud Platform's logging and tracing tools.
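One way to picture the "Understand, Think, Take Action" framework is as three small functions chained into a loop. The sketch below is a hypothetical reading of that framework in plain Python, not the course's Summarizer Agent and not the ADK API: keyword matching stands in for the LLM that would normally choose a tool, and the tool names, request format, and toy summarizer are all assumptions made for illustration.

```python
import re


def summarize(text, max_sentences=1):
    """Toy extractive summarizer: keep only the first sentence(s)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def word_count(text):
    """Return the number of whitespace-separated words as a string."""
    return str(len(text.split()))


# Hypothetical tool registry the agent can act with.
TOOLS = {"summarize": summarize, "count": word_count}


def understand(request):
    """Understand: split a request of the form '<instruction>: <text>'."""
    instruction, _, payload = request.partition(":")
    return instruction.strip().lower(), payload.strip()


def think(instruction):
    """Think: choose a tool by keyword (a real agent would ask an LLM)."""
    if "summar" in instruction:
        return "summarize"
    if "count" in instruction:
        return "count"
    return None


def act(tool_name, payload):
    """Take action: run the chosen tool, or report that none applies."""
    if tool_name is None:
        return "No suitable tool."
    return TOOLS[tool_name](payload)


def run_agent(request):
    instruction, payload = understand(request)
    return act(think(instruction), payload)


summary = run_agent("Summarize this: RAG adds retrieval. It reduces hallucinations.")
```

Keeping the three stages as separate functions makes each one replaceable on its own, for example swapping the keyword rules in `think` for a model call without touching the tools.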
What's included
15 videos, 1 reading, 1 assignment
15 videos • Total 76 minutes
- What is an Agent? • 7 minutes
- Different Approaches to Building Agents • 6 minutes
- Our Approach in This Course • 5 minutes
- ADK Features and Tools • 5 minutes
- Setting Up the Cloud Environment • 5 minutes
- Setting Up the Local Environment • 4 minutes
- From Basic to Advanced Agents • 6 minutes
- Deployment Pathways for ADK Agents • 6 minutes
- Project Installation: Dependency and Environment Management • 5 minutes
- Agent Structure and Workflow • 6 minutes
- Running The Agent Part 1: Initiating • 5 minutes
- Running The Agent Part 2: Analyzing • 4 minutes
- Deploying Agent to The Cloud • 5 minutes
- Monitoring The Deployment on GCP • 3 minutes
- Wrap Up • 4 minutes
1 reading • Total 30 minutes
- Project Link and Description • 30 minutes
1 assignment • Total 30 minutes
- Section 3 Quiz • 30 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Frequently asked questions
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you'll find a link to apply on the description page.
