URL: https://www.coursera.org/learn/building-automated-data-pipelines-with-sparkdbtand-airflow


Building Automated Data Pipelines with Spark, dbt, and Airflow

Gain insight into a topic and learn the fundamentals.
Beginner level

9 hours to complete
Flexible schedule
Learn at your own pace

What you'll learn

  • Build end-to-end data pipelines that automatically ingest from databases, APIs, and streams using Spark, dbt, and Airflow tools.

  • Design data models with historical tracking using SCD Type 2 patterns to preserve complete change history for analytics.

  • Create automated workflows with intelligent retry logic, SLA monitoring, and parameterization for production reliability.

  • Optimize Spark job performance using partitioning and caching strategies to achieve 30%+ runtime improvements.

Details to know

Shareable certificate

Add to your LinkedIn profile

Recently updated!

March 2026

Assessments

19 assignments¹

AI-graded (see disclaimer)
Taught in English

Build your Data Analysis expertise

This course is part of the Open source Data Engineering with Spark, dbt & Airflow Professional Certificate
When you enroll in this course, you'll also be enrolled in this Professional Certificate.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate from Coursera

There are 11 modules in this course

You'll master the art of building production-ready data pipelines that automatically process millions of records. In this hands-on course, you'll design end-to-end workflows that integrate diverse data sources—from databases and APIs to real-time streams—using industry-standard tools like Apache Spark, dbt, and Apache Airflow. You'll learn to create robust data models that preserve historical changes, implement performance optimizations that reduce processing time by 30% or more, and build automated workflows with intelligent retry logic and monitoring alerts.

By the end, you'll have created a complete data pipeline system that demonstrates the technical skills data engineering teams need most. You'll know how to unify fragmented data sources, apply advanced transformation techniques, and ensure your pipelines run reliably at scale. This practical experience directly translates to the challenges you'll face as a data engineer, data analyst, or anyone working with large-scale data systems in modern organizations.

You will learn the foundational concepts and tools needed to create systematic visual documentation of data pipeline architectures.

What's included

3 videos, 2 readings, 1 assignment

3 videos (total 15 minutes)
  • Why Data Flow Visualization Drives Engineering Success (4 minutes)
  • Systematic Approach to Identifying Sources and Destinations (8 minutes)
  • Creating Your First Data Flow Diagram (3 minutes)
2 readings (total 11 minutes)
  • Essential Components of Professional Data Flow Diagrams (6 minutes)
  • Transformation Mapping Principles for Complex Data Pipelines (5 minutes)
1 assignment (total 3 minutes)
  • Data Flow Fundamentals Knowledge Check (3 minutes)

You will apply advanced techniques to create professional-quality data flow diagrams that accurately represent complex enterprise data systems and support stakeholder collaboration.

What's included

2 videos, 2 readings, 3 assignments

2 videos (total 12 minutes)
  • Advanced Diagramming Techniques for Complex Data Systems (9 minutes)
  • Mapping Complex Multi-System Data Pipelines (3 minutes)
2 readings (total 13 minutes)
  • Enterprise Data Flow Best Practices and Industry Standards (7 minutes)
  • Validation and Review Processes for Data Flow Documentation (6 minutes)
3 assignments (total 25 minutes)
  • Comprehensive Data Flow Mastery Assessment (10 minutes)
  • Create Complete Enterprise Data Flow Diagram (12 minutes)
  • Advanced Data Flow Concepts Knowledge Check (3 minutes)

You will establish the foundational understanding and core skills for creating modular data pipeline stages, focusing on the principles of separation of concerns and tool integration fundamentals.
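As a rough illustration of separation of concerns, a pipeline can be expressed as small, single-purpose stages that are composed in order. The sketch below is not taken from the course materials: it uses plain Python rather than Spark, dbt, or Airflow, and the stage names (`drop_incomplete`, `normalize_names`) and the record fields (`id`, `name`) are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Hypothetical cleaning stage: keep only records that have an id."""
    return [r for r in records if r.get("id") is not None]


def normalize_names(records: List[Record]) -> List[Record]:
    """Hypothetical transformation stage: trim and lowercase the name field."""
    return [{**r, "name": str(r["name"]).strip().lower()} for r in records]


def run_pipeline(records: List[Record], stages: List[Stage]) -> List[Record]:
    """Run each stage in order, feeding the output of one into the next."""
    for stage in stages:
        records = stage(records)
    return records
```

Because each stage only takes and returns a list of records, a stage can be tested, replaced, or reordered without touching the others, for example `run_pipeline(raw, [drop_incomplete, normalize_names])`.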

What's included

1 video, 1 reading, 1 assignment

1 video (total 7 minutes)
  • Open Source Tool Ecosystem: Spark, dbt, and Airflow Integration (7 minutes)
1 reading (total 12 minutes)
  • Fundamentals of Modular Data Pipeline Architecture (12 minutes)
1 assignment (total 3 minutes)
  • Modular Pipeline Design Fundamentals Assessment (3 minutes)

You will implement complete end-to-end data pipelines by integrating modular components with industry-standard tools, culminating in a comprehensive assessment of your pipeline development capabilities.

What's included

2 readings, 3 assignments

2 readings (total 20 minutes)
  • End-to-End Pipeline Integration Patterns (12 minutes)
  • Implementing Complete Pipeline Integration with Spark, dbt, and Airflow (8 minutes)
3 assignments (total 38 minutes)
  • Comprehensive Modular Pipeline Development Assessment (15 minutes)
  • End-to-End Pipeline Development Project (20 minutes)
  • Modular Pipeline Integration and Coordination Quiz (3 minutes)

You will establish foundational knowledge of connector architecture and complete your first database connector configuration using Airbyte.

What's included

2 videos, 2 readings, 1 assignment

2 videos (total 10 minutes)
  • Why Data Source Unification Matters for Enterprise Success (4 minutes)
  • Airbyte Connector Fundamentals - Your Integration Foundation (6 minutes)
2 readings (total 17 minutes)
  • Understanding Connector Architecture and Integration Patterns (8 minutes)
  • Professional Guide: Configuring Your First Database Connector Step-by-Step (9 minutes)
1 assignment (total 3 minutes)
  • Connector Configuration Foundation Knowledge Check (3 minutes)

You will implement complete multi-source data integration by configuring streaming and API connectors, applying enterprise security patterns, and demonstrating mastery through comprehensive connector configuration.

What's included

2 videos, 2 readings, 2 assignments

2 videos (total 10 minutes)
  • Enterprise Integration Success Stories - Why Multi-Source Unity Matters (4 minutes)
  • Streaming and API Connector Configuration Mastery (6 minutes)
2 readings (total 15 minutes)
  • Authentication and Security Patterns for Production Connectors (7 minutes)
  • Multi-Source Data Integration: A How-To Guide (8 minutes)
2 assignments (total 18 minutes)
  • Connector Configuration Mastery Assessment (15 minutes)
  • Multi-Source Integration Configuration Check (3 minutes)

You will understand the fundamental concepts of SCD2 logic and begin applying these principles to create data models that preserve historical context in enterprise data warehouses.

What's included

3 videos, 1 reading, 1 assignment

3 videos (total 14 minutes)
  • Why SCD2 Matters in Enterprise Data Warehouses (4 minutes)
  • Understanding SCD2 Core Components and Business Logic (7 minutes)
  • Building Your First SCD2 Table Structure in SQL (4 minutes)
1 reading (total 8 minutes)
  • SCD2 Implementation Patterns and Data Model Design (8 minutes)
1 assignment (total 3 minutes)
  • SCD2 Fundamentals Knowledge Check (3 minutes)

You will implement production-ready SCD2 models using dbt, creating automated historical tracking systems with proper change detection, validity periods, and current status management.
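To make change detection, validity periods, and current status management concrete, here is a minimal sketch of SCD Type 2 logic in plain Python. It is not dbt's snapshot implementation and is not taken from the course materials; the `DimRow` fields and the `apply_scd2` function are hypothetical names, and keys that disappear from the incoming snapshot are simply left open, which a real model may handle differently.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DimRow:
    key: str
    attrs: Tuple                # tracked attribute values
    valid_from: date
    valid_to: Optional[date]    # None means the version is still open
    is_current: bool


def apply_scd2(history: List[DimRow], snapshot: Dict[str, Tuple], as_of: date) -> List[DimRow]:
    """Compare current rows to a {key: attrs} snapshot and return the new history."""
    result: List[DimRow] = []
    seen = set()
    for row in history:
        if not row.is_current:
            result.append(row)          # closed versions are never modified
            continue
        seen.add(row.key)
        new_attrs = snapshot.get(row.key)
        if new_attrs is None or new_attrs == row.attrs:
            result.append(row)          # unchanged, or absent from the snapshot
        else:
            # Change detected: close the old version and open a new one.
            result.append(replace(row, valid_to=as_of, is_current=False))
            result.append(DimRow(row.key, new_attrs, as_of, None, True))
    for key, attrs in snapshot.items():
        if key not in seen:             # brand-new key: open its first version
            result.append(DimRow(key, attrs, as_of, None, True))
    return result
```

Running the function twice with the same snapshot leaves the history unchanged, so re-runs are safe in this sketch. A changed attribute produces exactly one closed row and one new current row for that key.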

What's included

2 videos, 2 readings, 3 assignments

2 videos (total 13 minutes)
  • dbt Snapshots for Automated SCD2 Change Detection (8 minutes)
  • Building Complete dbt SCD2 Model with Validity Periods (5 minutes)
2 readings (total 18 minutes)
  • Why dbt Transforms SCD2 Implementation for Data Teams (8 minutes)
  • dbt SCD2 Implementation Patterns and Production Considerations (10 minutes)
3 assignments (total 36 minutes)
  • SCD2 Implementation Mastery Assessment (15 minutes)
  • Build Production SCD2 Data Model for Product Dimensions (18 minutes)
  • dbt SCD2 Implementation Knowledge Check (3 minutes)

You will understand the foundational concepts and design principles for creating robust data workflows with Apache Airflow.

What's included

3 videos, 1 reading, 1 assignment

3 videos (total 15 minutes)
  • The Cost of Fragile Data Pipelines (2 minutes)
  • Apache Airflow Fundamentals for Production Workflows (6 minutes)
  • Building Your First Production-Ready DAG Structure (7 minutes)
1 reading (total 10 minutes)
  • Design Principles for Robust Data Workflows (10 minutes)
1 assignment (total 3 minutes)
  • Workflow Design Principles Assessment (3 minutes)

You will implement production-grade Airflow workflows with retry mechanisms, SLA monitoring, and parameterization for enterprise-ready data pipeline resilience.
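The idea behind a retry mechanism can be sketched without Airflow: re-run a failing task a limited number of times, waiting longer between attempts. The example below is plain Python for illustration only and is not taken from the course materials; `run_with_retries` and its parameters are hypothetical. Airflow exposes comparable task-level settings such as `retries` and `retry_delay`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task(); on failure, retry up to `retries` times with exponential backoff."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= retries:
                raise                       # retries exhausted: surface the error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without actually waiting. A task that always fails is called `retries + 1` times in total before its exception propagates.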

What's included

2 videos, 1 reading, 2 assignments, 1 ungraded lab

2 videos (total 12 minutes)
  • When Production Workflows Save Business Operations (3 minutes)
  • Implementing Advanced Production Patterns in Airflow (9 minutes)
1 reading (total 10 minutes)
  • Production Implementation Patterns and Best Practices (10 minutes)
2 assignments (total 13 minutes)
  • Production Workflow Mastery Assessment (10 minutes)
  • Production Implementation Patterns Assessment (3 minutes)
1 ungraded lab (total 20 minutes)
  • Building Production-Ready Airflow DAGs with Retry Logic and SLA Monitoring (20 minutes)

You will integrate data engineering skills to build a complete automated data pipeline that processes diverse data sources, applies historical tracking, and orchestrates workflows. This project synthesizes mapping, transformation, integration, modeling, and automation capabilities into a production-ready data system.

What's included

4 readings, 1 assignment

4 readings (total 90 minutes)
  • Why This Project Matters (10 minutes)
  • Project Requirements (10 minutes)
  • Assignment: Data Pipeline Automation System (60 minutes)
  • Solution Key (10 minutes)
1 assignment (total 15 minutes)
  • Graded Quiz: Building Automated Data Pipelines with Spark, dbt, and Airflow (15 minutes)

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Why people choose Coursera for their career

Felipe M.

Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Frequently asked questions

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Financial aid available.

¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.