URL: https://www.coursera.org/learn/mlops-and-llmops-deploying-and-scaling-ai-in-production

MLOps and LLMOps: Deploying and Scaling AI in Production | Coursera


Gain insight into a topic and learn the fundamentals.
Intermediate level

Recommended experience

2 weeks to complete
at 10 hours a week
Flexible schedule
Learn at your own pace

What you'll learn

  • Configure CI/CD pipelines for ML and LLM systems using GitHub Actions and MLflow

  • Optimize LLM inference pipelines for reduced latency, token cost, and improved reliability

  • Build automated evaluation frameworks using LLM-as-a-Judge and quality gates

  • Instrument production AI systems with tracing, drift detection, and observability dashboards

Details to know

Shareable certificate

Add to your LinkedIn profile

Recently updated!

April 2026

Assessments

16 assignments

Taught in English

Build your subject-matter expertise

This course is part of the Managing AI Systems: Development, Deployment, and Governance Specialization
When you enroll in this course, you'll also be enrolled in this Specialization.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 4 modules in this course

This intermediate course equips ML engineers, data scientists, and software engineers with the practical skills needed to design, deploy, and scale production AI systems. You’ll learn how to architect reliable ML and LLM applications, including model serving patterns, feature stores, and retrieval-augmented generation (RAG) components. The course walks through reproducible training and experimentation pipelines with tools like MLflow and Weights & Biases, from experiment tracking and model registration to production deployment.

You will configure CI/CD workflows tailored to ML and LLM systems, covering data, model, and prompt versioning, automated testing, and safe rollback strategies. The course emphasizes security, privacy, and compliance best practices, including access control, secrets management, and safe handling of user and training data. You’ll design scalable serving infrastructure using containers, Kubernetes, and autoscaling, and apply deployment patterns such as canary, blue-green, shadow, and A/B testing to introduce changes safely.

Finally, you’ll build automated evaluation and observability for production AI. This includes automated evaluation pipelines (e.g., LLM-as-a-judge) wired into CI/CD gates, defining and tracking key quality and performance metrics like hallucination rate, latency, throughput, and cost per request, and implementing robust logging, metrics, distributed tracing, and telemetry. You will also detect and monitor data and model drift, bias, and degradation over time using tools such as Arize Phoenix, design alerting strategies, and collaborate with product and reliability teams to establish incident response, runbooks, and continuous improvement processes for AI systems at scale.

Disclaimer: This is an independent educational resource created by Board Infinity for informational and educational purposes only. This course is not affiliated with, endorsed by, sponsored by, or officially associated with any company, organization, or certification body unless explicitly stated. The content provided is based on industry knowledge and best practices but does not constitute official training material for any specific employer or certification program. All company names, trademarks, service marks, and logos referenced are the property of their respective owners and are used solely for educational identification and comparison purposes.

Start by grounding learners in practical, production-ready system design for ML and LLM applications. This module connects architectural patterns—serving topologies, feature stores, and retrieval-augmented generation (RAG)—to reproducible experimentation and compliant design decisions. Expect short instructor videos, readings that map design trade-offs, and hands-on exercises using experiment-tracking tools to make architectures actionable.

What's included

9 videos, 3 readings, 4 assignments, 1 plugin

9 videos (total 92 minutes)
  • ML/LLM CI/CD Architecture: How It's Different from DevOps (9 minutes)
  • Automating Build → Test → Deploy for ML Pipelines (10 minutes)
  • Integrating Model & Data Validation into CI/CD (9 minutes)
  • Semantic Versioning for Models, Prompts, & Datasets (12 minutes)
  • Model Registries: MLflow, W&B, and Custom Systems (7 minutes)
  • Rollbacks & Lineage Tracking for Experiment Safety (8 minutes)
  • Why ML Environments Drift (18 minutes)
  • Reproducibility with Docker, Conda, Lockfiles, and Hashes (11 minutes)
  • Promoting Environments Across Dev → Staging → Production (9 minutes)
3 readings (total 90 minutes)
  • “CI/CD + CT/CD: Patterns & Anti-patterns in ML Deployment Pipelines” (30 minutes)
  • “Model Registry Design: Governance, Lineage, and Auditability” (30 minutes)
  • “Environment Parity Checklist for ML Systems” (30 minutes)
4 assignments (total 105 minutes)
  • Graded Quiz: Operationalizing AI Pipelines (CI/CD, CT/CD, Versioning) (60 minutes)
  • Practice Quiz: Foundations of CI/CD for ML & LLM Systems (15 minutes)
  • Practice Quiz: Model Versioning & Release Management (15 minutes)
  • Practice Quiz: Environment & Dependency Management (15 minutes)
1 plugin (total 5 minutes)
  • Quick Course Check-In (5 minutes)

Move from design to continuous delivery: this module teaches how to build CI/CD pipelines tailored to ML and LLM systems and how to gate changes with automated evaluation. Learners will set up data, model, and prompt versioning, define meaningful metrics (accuracy, hallucination rate, latency, cost), and implement evaluation pipelines—including LLM-as-a-judge methods—that plug into CI/CD gates. Activities include guided configuration examples, scenario-driven readings, and automated practice quizzes.
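As a rough illustration of how an evaluation result might gate a CI/CD step, the sketch below compares the four metrics named above against release thresholds. It is a minimal sketch, not the course's implementation: the `Threshold` class, the `evaluate_gate` function, the metric key names, and every numeric limit are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str              # key expected in the metrics dict
    limit: float             # release threshold for that metric
    higher_is_better: bool   # True: value must be >= limit; False: <= limit


# Hypothetical limits for illustration only; real values depend on the product.
THRESHOLDS = [
    Threshold("accuracy", 0.85, True),
    Threshold("hallucination_rate", 0.05, False),
    Threshold("p95_latency_s", 2.0, False),
    Threshold("cost_per_request_usd", 0.01, False),
]


def evaluate_gate(metrics, thresholds=THRESHOLDS):
    """Return a list of failure messages; an empty list means the gate passes."""
    failures = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is None:
            # A missing metric is treated as a failure rather than silently skipped.
            failures.append(f"{t.metric}: missing")
            continue
        ok = value >= t.limit if t.higher_is_better else value <= t.limit
        if not ok:
            failures.append(f"{t.metric}: {value} violates limit {t.limit}")
    return failures
```

In a pipeline, a wrapper script could call `evaluate_gate` on the metrics produced by an evaluation run and exit with a non-zero status when the returned list is non-empty, which is how most CI systems decide to block a release step.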

What's included

9 videos, 3 readings, 4 assignments

9 videos (total 78 minutes)
  • Designing Efficient Context Windows (9 minutes)
  • Structured Prompts for Reliability & Determinism (6 minutes)
  • Techniques to Reduce Hallucination via Prompt Engineering (8 minutes)
  • Understanding Latency Budgets & Token Cost Drivers (13 minutes)
  • Batching, Caching, Streaming, Compression (12 minutes)
  • Model Choices: API vs Local Models (11 minutes)
  • Logging Prompt Variants with W&B/MLflow (3 minutes)
  • Tracking Prompt-Response Deltas (9 minutes)
  • Scientific Evaluation of Prompt Variants (8 minutes)
3 readings (total 90 minutes)
  • “Prompt Architecture Patterns for Production LLM Systems” (30 minutes)
  • “Token Economics: Understanding Cost Structures of LLM Pipelines” (30 minutes)
  • “Prompt Versioning Framework Example Repository” (30 minutes)
4 assignments (total 105 minutes)
  • “Help me reduce the latency and cost of my LLM pipeline.” (60 minutes)
  • Practice Quiz: Managing Context Windows & Prompt Structure (15 minutes)
  • Practice Quiz: Inference Optimization: Latency & Token Cost (15 minutes)
  • Practice Quiz: Prompt Versioning & Experiment Tracking (15 minutes)

This module focuses on the operational mechanics of serving models and LLMs at scale. You will design and implement containerized serving architectures using orchestration (e.g., Kubernetes), autoscaling, and cost-aware inference pipelines; practice deployment patterns such as canary, blue-green, shadow, and A/B testing; and learn prompt and context-window optimization techniques to balance latency, quality, and cost. Practical labs and demonstrations show real-world manifests, autoscaling configs, and inference pipeline tuning.
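To make the canary pattern mentioned above concrete, here is a minimal sketch of deterministic traffic splitting at the application level. It assumes each request carries a stable identifier; the `route` function and the "stable"/"canary" labels are hypothetical, and in practice the split is often handled by a load balancer or service mesh rather than application code.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Send roughly `canary_percent` percent of request IDs to the canary."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Hash the ID so the same request ID always lands in the same bucket,
    # which keeps a given user or session on one version during the rollout.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "canary" if bucket < canary_percent else "stable"
```

Raising `canary_percent` in small steps while watching error rate and latency on the canary is the usual way such a split is used; setting it back to 0 acts as a rollback.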

What's included

9 videos, 3 readings, 4 assignments

9 videos (total 66 minutes)
  • Constructing Realistic Evaluation Data (9 minutes)
  • Sampling Edge Cases & Failure Modes (8 minutes)
  • Avoiding Bias in Test Data (5 minutes)
  • Designing Evaluator Prompts (8 minutes)
  • Scoring for Consistency, Relevance, Correctness (8 minutes)
  • Limits of Automated Scoring (6 minutes)
  • Evaluation Triggers During Deployment (7 minutes)
  • Quality Gates & Release Thresholds (7 minutes)
  • Reading Evaluation Dashboards for Release Readiness (7 minutes)
3 readings (total 90 minutes)
  • “LLM Evaluation Dataset Blueprint” (30 minutes)
  • “Automated Scoring Frameworks for LLM Evaluation” (30 minutes)
  • “Evaluation Automation Templates Using MLflow/W&B” (30 minutes)
4 assignments (total 75 minutes)
  • Covers: dataset design, automated evaluation, CI/CD integration. (30 minutes)
  • Designing LLM Evaluation Datasets (15 minutes)
  • LLM-as-a-Judge Methodologies (15 minutes)
  • Practice Quiz: Integrating Evaluation into CI/CD Pipelines (15 minutes)

Close the loop by instrumenting systems for deep observability and long-term reliability. Learners will add logging, metrics, distributed tracing, and telemetry; use monitoring platforms (e.g., Arize Phoenix) to detect data/model drift, bias, and degradation; and design alerting and runbooks while coordinating incident response with product and reliability teams. The module culminates in a hands-on capstone programming project that integrates architecture, CI/CD, serving, evaluation, and monitoring into a production-ready AI solution.
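One common way to quantify data drift on a single numeric feature is the Population Stability Index (PSI). The sketch below is an illustrative pure-Python version and is not taken from the course or from Arize Phoenix: the `psi` function name, the equal-width binning over the reference range, and the `eps` floor for empty bins are all assumptions made for this example.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` relative to `expected`.

    Bins are equal-width over the range of `expected`; values of `actual`
    outside that range are clamped into the first or last bin.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant reference sample: everything falls in one bin

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each fraction at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A PSI near 0 means the two samples have similar distributions. A frequently cited rule of thumb treats values above roughly 0.2 as a shift worth investigating, but that cutoff is a convention rather than a guarantee, and an alerting threshold should be tuned per feature.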

What's included

9 videos, 3 readings, 4 assignments

9 videos (total 53 minutes)
  • Logging Prompts, Responses, and Metadata (8 minutes)
  • Comparing Experiments Across Versions (8 minutes)
  • Tracking Inference Metrics (7 minutes)
  • How Chains and Agents Break (8 minutes)
  • Using Phoenix to Trace Execution Steps (5 minutes)
  • Identifying Hallucination Triggers and Bottlenecks (5 minutes)
  • Data Drift vs Behavioral Drift (5 minutes)
  • Drift Dashboards & Alerting (3 minutes)
  • When to Retrain or Update the Pipeline (4 minutes)
3 readings (total 90 minutes)
  • “Telemetry Best Practices for Production AI” (30 minutes)
  • “Tracing Playbook for Complex AI Systems” (30 minutes)
  • “Drift Detection Techniques for LLM Applications” (30 minutes)
4 assignments (total 105 minutes)
  • “Help me diagnose the failure points in this trace and recommend fixes.” (60 minutes)
  • Practice Quiz: Experiment Tracking & Telemetry (W&B / MLflow) (15 minutes)
  • Practice Quiz: Tracing and Debugging with Arize Phoenix (15 minutes)
  • Practice Quiz: Monitoring Drift & System Health (15 minutes)

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Board Infinity
261 courses · 428,749 learners

Why people choose Coursera for their career

Felipe M.

Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Frequently asked questions

Basic familiarity with Python and ML concepts is recommended. No prior MLOps experience is required — the course builds from foundational CI/CD concepts.

You'll work with MLflow, Weights & Biases (W&B), Arize Phoenix, GitHub Actions, Docker, and various prompt engineering frameworks.

Yes. The course introduces CI/CD and deployment concepts from an ML-first perspective, making it accessible for data scientists.

Absolutely. The skills are directly applicable to ML engineering, AI platform, and data engineering roles in production environments.

To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade, but you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.
