MLOps and LLMOps: Deploying and Scaling AI in Production

This course is part of the Managing AI Systems: Development, Deployment, and Governance Specialization

Instructor: Board Infinity

4 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Recommended experience

2 weeks to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

What you'll learn

Configure CI/CD pipelines for ML and LLM systems using GitHub Actions and MLflow
Optimize LLM inference pipelines for reduced latency, token cost, and improved reliability
Build automated evaluation frameworks using LLM-as-a-Judge and quality gates
Instrument production AI systems with tracing, drift detection, and observability dashboards

Details to know

Shareable certificate

Add to your LinkedIn profile

Build your subject-matter expertise

This course is part of the Managing AI Systems: Development, Deployment, and Governance Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 4 modules in this course

This intermediate course equips ML engineers, data scientists, and software engineers with the practical skills needed to design, deploy, and scale production AI systems. You’ll learn how to architect reliable ML and LLM applications, including model serving patterns, feature stores, and retrieval-augmented generation (RAG) components. The course walks through reproducible training and experimentation pipelines with tools like MLflow and Weights & Biases, from experiment tracking and model registration to production deployment.

You will configure CI/CD workflows tailored to ML and LLM systems, covering data, model, and prompt versioning, automated testing, and safe rollback strategies. The course emphasizes security, privacy, and compliance best practices, including access control, secrets management, and safe handling of user and training data. You’ll design scalable serving infrastructure using containers, Kubernetes, and autoscaling, and apply deployment patterns such as canary, blue-green, shadow, and A/B testing to introduce changes safely.

Finally, you’ll build automated evaluation and observability for production AI. This includes automated evaluation pipelines (e.g., LLM-as-a-judge) wired into CI/CD gates, defining and tracking key quality and performance metrics like hallucination rate, latency, throughput, and cost per request, and implementing robust logging, metrics, distributed tracing, and telemetry. You will also detect and monitor data and model drift, bias, and degradation over time using tools such as Arize Phoenix, design alerting strategies, and collaborate with product and reliability teams to establish incident response, runbooks, and continuous improvement processes for AI systems at scale.

Disclaimer: This is an independent educational resource created by Board Infinity for informational and educational purposes only. This course is not affiliated with, endorsed by, sponsored by, or officially associated with any company, organization, or certification body unless explicitly stated. The content provided is based on industry knowledge and best practices but does not constitute official training material for any specific employer or certification program. All company names, trademarks, service marks, and logos referenced are the property of their respective owners and are used solely for educational identification and comparison purposes.

Start by grounding learners in practical, production-ready system design for ML and LLM applications. This module connects architectural patterns—serving topologies, feature stores, and retrieval-augmented generation (RAG)—to reproducible experimentation and compliant design decisions. Expect short instructor videos, readings that map design trade-offs, and hands-on exercises using experiment-tracking tools to make architectures actionable.

What's included

9 videos • 3 readings • 4 assignments • 1 plugin

9 videos•Total 92 minutes

ML/LLM CI/CD Architecture: How It's Different from DevOps•9 minutes
Automating Build → Test → Deploy for ML Pipelines•10 minutes
Integrating Model & Data Validation into CI/CD•9 minutes
Semantic Versioning for Models, Prompts, & Datasets•12 minutes
Model Registries: MLflow, W&B, and Custom Systems•7 minutes
Rollbacks & Lineage Tracking for Experiment Safety•8 minutes
Why ML Environments Drift•18 minutes
Reproducibility with Docker, Conda, Lockfiles, and Hashes•11 minutes
Promoting Environments Across Dev → Staging → Production•9 minutes

3 readings•Total 90 minutes

“CI/CD + CT/CD: Patterns & Anti-patterns in ML Deployment Pipelines”•30 minutes
“Model Registry Design: Governance, Lineage, and Auditability”•30 minutes
“Environment Parity Checklist for ML Systems”•30 minutes

4 assignments•Total 105 minutes

Graded Quiz: Operationalizing AI Pipelines (CI/CD, CT/CD, Versioning)•60 minutes
Practice Quiz: Foundations of CI/CD for ML & LLM Systems•15 minutes
Practice Quiz: Model Versioning & Release Management•15 minutes
Practice Quiz: Environment & Dependency Management•15 minutes

1 plugin•Total 5 minutes

Quick Course Check-In•5 minutes

Move from design to continuous delivery: this module teaches how to build CI/CD pipelines tailored to ML and LLM systems and how to gate changes with automated evaluation. Learners will set up data, model, and prompt versioning, define meaningful metrics (accuracy, hallucination rate, latency, cost), and implement evaluation pipelines—including LLM-as-a-judge methods—that plug into CI/CD gates. Activities include guided configuration examples, scenario-driven readings, and automated practice quizzes.
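As an illustration of how an evaluation report might gate a release, here is a minimal sketch using only the Python standard library. The metric names mirror the ones listed above; the threshold values, the `Threshold` class, and the `evaluate_gate` function are hypothetical and not taken from the course materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """One gate rule: the metric must stay on the right side of `limit`."""
    limit: float
    higher_is_better: bool


# Hypothetical thresholds; real values depend on the product and its baseline.
GATES = {
    "accuracy": Threshold(limit=0.90, higher_is_better=True),
    "hallucination_rate": Threshold(limit=0.05, higher_is_better=False),
    "p95_latency_ms": Threshold(limit=1500.0, higher_is_better=False),
    "cost_per_request_usd": Threshold(limit=0.02, higher_is_better=False),
}


def evaluate_gate(metrics: dict[str, float],
                  gates: dict[str, Threshold] = GATES) -> list[str]:
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for name, rule in gates.items():
        if name not in metrics:
            # A missing metric is treated as a failure rather than silently skipped.
            failures.append(f"{name}: missing from evaluation report")
            continue
        value = metrics[name]
        ok = value >= rule.limit if rule.higher_is_better else value <= rule.limit
        if not ok:
            op = ">=" if rule.higher_is_better else "<="
            failures.append(f"{name}: {value} violates {op} {rule.limit}")
    return failures
```

In a CI job, a wrapper script could call `evaluate_gate` on the metrics produced by the evaluation step and exit with a non-zero status when the returned list is non-empty, which would block the deployment stage.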

What's included

9 videos • 3 readings • 4 assignments

9 videos•Total 78 minutes

Designing Efficient Context Windows•9 minutes
Structured Prompts for Reliability & Determinism•6 minutes
Techniques to Reduce Hallucination via Prompt Engineering•8 minutes
Understanding Latency Budgets & Token Cost Drivers•13 minutes
Batching, Caching, Streaming, Compression•12 minutes
Model Choices: API vs Local Models•11 minutes
Logging Prompt Variants with W&B/MLflow•3 minutes
Tracking Prompt-Response Deltas•9 minutes
Scientific Evaluation of Prompt Variants•8 minutes
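
Two of the topics in this list, token cost drivers and caching, can be sketched in a few lines. The example below is a rough illustration only: the per-1,000-token pricing model, the function `estimate_cost_usd`, and the class `CachedModel` are assumptions made for the example, and the cache only helps when prompts repeat byte for byte.

```python
import hashlib
from typing import Callable


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      usd_per_1k_input: float, usd_per_1k_output: float) -> float:
    """Cost of one request when input and output tokens are priced separately."""
    return ((prompt_tokens / 1000) * usd_per_1k_input
            + (completion_tokens / 1000) * usd_per_1k_output)


class CachedModel:
    """Wraps a model call and reuses responses for byte-identical prompts."""

    def __init__(self, call_model: Callable[[str], str]):
        self._call_model = call_model
        self._cache: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def __call__(self, prompt: str) -> str:
        # Hash the prompt so the cache key has a fixed size.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        response = self._call_model(prompt)
        self._cache[key] = response
        return response
```

Every cache hit avoids both the latency and the token cost of a model call, so the hit and miss counters give a first estimate of how much an exact-match cache saves for a given traffic pattern.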

3 readings•Total 90 minutes

“Prompt Architecture Patterns for Production LLM Systems”•30 minutes
“Token Economics: Understanding Cost Structures of LLM Pipelines”•30 minutes
“Prompt Versioning Framework Example Repository”•30 minutes

4 assignments•Total 105 minutes

“Help me reduce the latency and cost of my LLM pipeline.”•60 minutes
Practice Quiz: Managing Context Windows & Prompt Structure•15 minutes
Practice Quiz: Inference Optimization: Latency & Token Cost•15 minutes
Practice Quiz: Prompt Versioning & Experiment Tracking•15 minutes

This module focuses on the operational mechanics of serving models and LLMs at scale. You will design and implement containerized serving architectures using orchestration (e.g., Kubernetes), autoscaling, and cost-aware inference pipelines; practice deployment patterns such as canary, blue-green, shadow, and A/B testing; and learn prompt and context-window optimization techniques to balance latency, quality, and cost. Practical labs and demonstrations show real-world manifests, autoscaling configs, and inference pipeline tuning.
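
To make the canary pattern mentioned above concrete, the following sketch shows one common way to split traffic deterministically. It is an illustrative assumption, not the course's lab code: the function name `route_request`, the use of a request or user ID as the routing key, and the 10,000-bucket resolution are all hypothetical choices.

```python
import hashlib


def route_request(request_id: str, canary_percent: float) -> str:
    """Return "canary" for roughly `canary_percent` percent of IDs, else "stable".

    Hashing the ID keeps routing sticky: the same ID always lands on the
    same variant for a given percentage.
    """
    if not 0.0 <= canary_percent <= 100.0:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto buckets 0..9999 (0.01% resolution).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "canary" if bucket < canary_percent * 100 else "stable"
```

Raising `canary_percent` in steps while watching error rate and latency for the canary variant is the usual rollout loop; setting it back to 0 acts as an immediate rollback because every ID then routes to the stable variant.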

What's included

9 videos • 3 readings • 4 assignments

9 videos•Total 66 minutes

Constructing Realistic Evaluation Data•9 minutes
Sampling Edge Cases & Failure Modes•8 minutes
Avoiding Bias in Test Data•5 minutes
Designing Evaluator Prompts•8 minutes
Scoring for Consistency, Relevance, Correctness•8 minutes
Limits of Automated Scoring•6 minutes
Evaluation Triggers During Deployment•7 minutes
Quality Gates & Release Thresholds•7 minutes
Reading Evaluation Dashboards for Release Readiness•7 minutes

3 readings•Total 90 minutes

“LLM Evaluation Dataset Blueprint”•30 minutes
“Automated Scoring Frameworks for LLM Evaluation”•30 minutes
“Evaluation Automation Templates Using MLflow/W&B”•30 minutes

4 assignments•Total 75 minutes

Covers: dataset design, automated evaluation, CI/CD integration.•30 minutes
Designing LLM Evaluation Datasets•15 minutes
LLM-as-a-Judge Methodologies•15 minutes
Practice Quiz: Integrating Evaluation into CI/CD Pipelines•15 minutes

Close the loop by instrumenting systems for deep observability and long-term reliability. Learners will add logging, metrics, distributed tracing, and telemetry; use monitoring platforms (e.g., Arize Phoenix) to detect data/model drift, bias, and degradation; and design alerting and runbooks while coordinating incident response with product and reliability teams. The module culminates in a hands-on capstone programming project that integrates architecture, CI/CD, serving, evaluation, and monitoring into a production-ready AI solution.
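
As a rough sketch of what a data drift check can look like, the example below computes the Population Stability Index (PSI) between a reference sample and a current sample of one numeric feature. PSI is swapped in here as one widely used drift statistic; the course description does not say which statistics it teaches, and this is not how Arize Phoenix computes drift. The function name `psi`, the equal-width binning, and the `eps` floor are assumptions made for the example.

```python
import math
from typing import Sequence


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 when all reference values are identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # Floor each proportion at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but these cut-offs are conventions rather than guarantees, so alert thresholds are best calibrated against historical data for each feature.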

What's included

9 videos • 3 readings • 4 assignments

9 videos•Total 53 minutes

Logging Prompts, Responses, and Metadata•8 minutes
Comparing Experiments Across Versions•8 minutes
Tracking Inference Metrics•7 minutes
How Chains and Agents Break•8 minutes
Using Phoenix to Trace Execution Steps•5 minutes
Identifying Hallucination Triggers and Bottlenecks•5 minutes
Data Drift vs Behavioral Drift•5 minutes
Drift Dashboards & Alerting•3 minutes
When to Retrain or Update the Pipeline•4 minutes

3 readings•Total 90 minutes

“Telemetry Best Practices for Production AI”•30 minutes
“Tracing Playbook for Complex AI Systems”•30 minutes
“Drift Detection Techniques for LLM Applications”•30 minutes

4 assignments•Total 105 minutes

“Help me diagnose the failure points in this trace and recommend fixes.”•60 minutes
Practice Quiz: Experiment Tracking & Telemetry (W&B / MLflow)•15 minutes
Practice Quiz: Tracing and Debugging with Arize Phoenix•15 minutes
Practice Quiz: Monitoring Drift & System Health•15 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Board Infinity

261 Courses•428,749 learners

Frequently asked questions

Basic familiarity with Python and ML concepts is recommended. No prior MLOps experience is required — the course builds from foundational CI/CD concepts.

You'll work with MLflow, Weights & Biases (W&B), Arize Phoenix, GitHub Actions, Docker, and various prompt engineering frameworks.

Yes. The course introduces CI/CD and deployment concepts from an ML-first perspective, making it accessible for data scientists.

Absolutely. The skills are directly applicable to ML engineering, AI platform, and data engineering roles in production environments.

To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

URL: https://www.coursera.org/learn/mlops-and-llmops-deploying-and-scaling-ai-in-production