Process Real-Time Data with Spark Streams
This course is part of the Real-Time, Real Fast: Kafka & Spark for Data Engineers Specialization
Instructor: Caio Avelino
What you'll learn
Explain the execution model of Spark Structured Streaming and build a simple pipeline from a file source to a console sink.
Develop streaming pipelines that integrate with Kafka, apply event-time processing with watermarks, and write reliable outputs to Delta Lake.
Build an end-to-end Spark streaming pipeline that can be deployed in real-world production environments.
Details to know
January 2026
There are 3 modules in this course
Real-time data is everywhere — from fraud detection in financial transactions to personalized recommendations in e-commerce and anomaly detection in IoT devices. Traditional batch processing is too slow for these use cases, and businesses need insights the moment data is generated. This course teaches you how to design, build, and operate reliable streaming pipelines using Apache Spark Structured Streaming and Kafka.
In this course, you’ll start with the fundamentals of Spark’s streaming model, learning how micro-batching, triggers, and checkpoints enable continuous, fault-tolerant processing. You’ll then connect Spark to real-world sources like Kafka, apply event-time processing with watermarks, and deliver results to Delta Lake. Finally, you’ll take pipelines to production by enriching streams with static data, monitoring query health, handling failures, and ensuring scalability.
Learners should have a basic understanding of Python programming and Spark DataFrames, along with familiarity with JSON and SQL.
By the end, you’ll understand how Spark Structured Streaming handles streaming workloads and integrates with various data sources, and you’ll have the skills to confidently implement streaming solutions that power real-time decision-making in modern data-driven organizations.
Learners are introduced to the Spark Structured Streaming model and its core concepts, including micro-batch execution, triggers, checkpoints, output modes and data transformation.
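The ideas named in this module description can be illustrated without a Spark cluster. The following is a minimal plain-Python sketch, not Spark API code: it models micro-batch execution over a list of events, a checkpoint that stores the last processed offset and the running state, and the difference between the "update" and "complete" output modes. The function name `run_micro_batches`, the dictionary-based checkpoint, and the sample events are all assumptions made for illustration.

```python
from collections import Counter


def run_micro_batches(events, batch_size, checkpoint, output_mode="update"):
    """Count events per key in fixed-size micro-batches.

    `checkpoint` is a plain dict (a stand-in for a checkpoint directory)
    holding the next offset to read and the running counts, so a restarted
    run resumes where the previous one stopped.
    """
    counts = Counter(checkpoint.get("state", {}))
    offset = checkpoint.get("offset", 0)
    emitted = []
    while offset < len(events):
        batch = events[offset:offset + batch_size]
        changed = Counter(batch)
        counts.update(changed)
        if output_mode == "complete":
            # Complete mode: emit the whole result table after every batch.
            emitted.append(dict(counts))
        else:
            # Update mode: emit only the keys touched by this batch.
            emitted.append({key: counts[key] for key in changed})
        offset += len(batch)
        # Record progress only after the batch has been fully processed.
        checkpoint["offset"] = offset
        checkpoint["state"] = dict(counts)
    return emitted


events = ["login", "click", "click", "login", "purchase"]
checkpoint = {}
update_out = run_micro_batches(events, 2, checkpoint, "update")
complete_out = run_micro_batches(events, 2, {}, "complete")
print(update_out)
print(complete_out)
```

Calling `run_micro_batches` again with the same `checkpoint` processes nothing, because the stored offset already covers every event; appending new events and calling it once more processes only the new ones. This mirrors, in a simplified way, how a checkpoint lets a restarted streaming query avoid reprocessing old input.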
What's included
4 videos, 3 readings
4 videos•Total 30 minutes
- Welcome to Process Real-Time Data with Spark Streams•3 minutes
- Understanding Spark’s Streaming Model•7 minutes
- Setting Up Spark Streams: Schema and Outputs•13 minutes
- Transformations, JSON parsing and handling malformed events•7 minutes
3 readings•Total 40 minutes
- Welcome to the Course: Course Overview•5 minutes
- Getting Started with Structured Streaming in Apache Spark•10 minutes
- Hands On Learning (HOL): Build Your First Spark Streaming Pipeline•25 minutes
This module focuses on integrating Spark with real-world streaming systems. Learners will consume data from Kafka, transform and parse messages, and write results to sinks such as Delta Lake, ensuring reliability with checkpointing and triggers.
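Event-time windows and watermarks, which this module covers in its videos and readings, are easier to reason about with a small model. The sketch below is plain Python rather than Spark code, and it simplifies the real engine: event times are integer seconds, windows are tumbling, the watermark is recomputed once per micro-batch as the maximum event time seen minus an allowed delay, and an event is dropped when its window already ended at or before the watermark. The function name `windowed_counts` and the sample timestamps are assumptions for illustration.

```python
from collections import defaultdict


def windowed_counts(batches, window_s, delay_s):
    """Count events per tumbling window, dropping events that arrive too late.

    Simplified model: the watermark only advances at the end of each batch.
    """
    counts = defaultdict(int)
    dropped = []
    watermark = float("-inf")
    max_seen = float("-inf")
    for batch in batches:
        for ts in batch:
            max_seen = max(max_seen, ts)
            window_start = (ts // window_s) * window_s
            window_end = window_start + window_s
            if window_end <= watermark:
                # The window is already finalized, so the event is too late.
                dropped.append(ts)
                continue
            counts[window_start] += 1
        # Advance the watermark after the batch, as a micro-batch engine would.
        watermark = max_seen - delay_s
    return dict(counts), dropped


# Each inner list is one micro-batch of event times, in arrival order.
batches = [[1, 4, 12], [3, 14], [26], [11, 27]]
counts, dropped = windowed_counts(batches, window_s=10, delay_s=5)
print(counts)   # counts keyed by window start
print(dropped)  # event times rejected as too late
```

In this run the event at time 3 arrives out of order but is still counted, because the watermark (7) has not yet passed the end of its window (10). The event at time 11 arrives after the watermark has reached 21, past the end of its window (20), so it is dropped.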
What's included
3 videos, 2 readings, 1 peer review
3 videos•Total 36 minutes
- Connecting Spark to Kafka and Writing to Delta•15 minutes
- Handling Event Time with Watermarks and Windows•10 minutes
- Ensuring Reliability with Checkpointing and Triggers•10 minutes
2 readings•Total 30 minutes
- Handling Event-Time and Late Data in Streaming Systems•5 minutes
- HOL: Stream Kafka Events into Delta with Watermarks•25 minutes
1 peer review•Total 25 minutes
- Hands-On-Learning: Stream Kafka Events into Delta with Watermarks•25 minutes
Learners design an end-to-end streaming pipeline that combines ingestion, transformation, enrichment with static datasets, and reliable output.
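Enrichment with a static dataset amounts to joining each incoming batch against reference data that is loaded once. The following plain-Python sketch shows the idea as a left join against an in-memory lookup; it is not Spark code, and the field names (`txn_id`, `merchant_id`, `amount`, `country`), the lookup contents, and the function name `enrich_batch` are hypothetical.

```python
# Static reference data (hypothetical), loaded once and reused for every batch.
MERCHANT_COUNTRY = {"m1": "BR", "m2": "US"}


def enrich_batch(batch, lookup):
    """Left-join each event to the static lookup.

    Events whose merchant_id is missing from the lookup are kept,
    with country set to None, so no streaming data is lost.
    """
    return [
        {**event, "country": lookup.get(event["merchant_id"])}
        for event in batch
    ]


batch = [
    {"txn_id": 1, "merchant_id": "m1", "amount": 20.0},
    {"txn_id": 2, "merchant_id": "m9", "amount": 999.0},
]
enriched = enrich_batch(batch, MERCHANT_COUNTRY)
print(enriched)
```

Keeping unmatched events rather than discarding them is a design choice: in a fraud-style pipeline, a transaction from an unknown merchant may be exactly the record worth inspecting.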
What's included
4 videos, 3 readings, 1 assignment, 1 peer review
4 videos•Total 33 minutes
- Building an End-to-End Streaming Pipeline•10 minutes
- Monitoring and Troubleshooting Your Streams•9 minutes
- Testing and Deploying Streaming Applications•11 minutes
- Course Wrap-up•4 minutes
3 readings•Total 95 minutes
- Monitoring and Debugging Structured Streaming Queries•10 minutes
- HOL: Deploy and Monitor an End-to-End Streaming Application•25 minutes
- Ungraded Project: Real-Time Fraud Streaming Pipeline•60 minutes
1 assignment•Total 20 minutes
- Process Real-Time Data with Spark Streams•20 minutes
1 peer review•Total 25 minutes
- Hands-On-Learning: Deploy and Monitor an End-to-End Streaming Application•25 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Frequently asked questions
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your selected learning program, you’ll find a link to apply on the description page.
¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.
