URL: https://www.coursera.org/learn/data-enginering-capstone-project

Data Engineering Capstone Project | Coursera


Data Engineering Capstone Project

20,471 already enrolled

Gain insight into a topic and learn the fundamentals.
4.7

143 reviews

Advanced level

Recommended experience

2 weeks to complete
at 10 hours a week

What you'll learn

  • Demonstrate proficiency in skills required for an entry-level data engineering role.

  • Design and implement various concepts and components in the data engineering lifecycle such as data repositories.

  • Showcase working knowledge with relational databases, NoSQL data stores, big data engines, data warehouses, and data pipelines.

  • Apply skills in Linux shell scripting, SQL, and Python programming languages to Data Engineering problems.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

14 assignments¹

AI Graded (see disclaimer)
Taught in English
Flexible schedule
Learn at your own pace

Build your Data Management expertise

This course is part of the IBM Data Engineering Professional Certificate
When you enroll in this course, you'll also be enrolled in this Professional Certificate.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate from IBM

There are 7 modules in this course

Showcase your skills in this Data Engineering project! In this course you will apply a variety of data engineering skills and techniques you have learned as part of the previous courses in the IBM Data Engineering Professional Certificate.

You will demonstrate your knowledge of Data Engineering by assuming the role of a Junior Data Engineer who has recently joined an organization and is presented with a real-world use case that requires architecting and implementing a data analytics platform.

In this Capstone project you will complete numerous hands-on labs. You will create and query data repositories using relational and NoSQL databases such as MySQL and MongoDB. You'll also design and populate a data warehouse using PostgreSQL and IBM Db2 and write queries to perform Cube and Rollup operations. You will generate reports from the data in the data warehouse and build a dashboard using Cognos Analytics. You will also show your proficiency in Extract, Transform, and Load (ETL) processes by creating data pipelines for moving data from different repositories. You will perform big data analytics using Apache Spark to make predictions with the help of a machine learning model.

This course is the final course in the IBM Data Engineering Professional Certificate. It is recommended that you complete all the previous courses in this Professional Certificate before starting this course.

In this module, you will design a data platform that uses MySQL as its OLTP database and store the transactional (OLTP) data in it.

What's included

1 video, 2 assignments, 1 app item, 4 plugins

1 video • Total 4 minutes
  • Introduction to Capstone Project • 4 minutes
2 assignments • Total 36 minutes
  • Checklist: OLTP Database • 24 minutes
  • Graded Quiz: OLTP Database • 12 minutes
1 app item • Total 30 minutes
  • Lab: OLTP Database • 30 minutes
4 plugins • Total 45 minutes
  • Reading: Final Project Submission Guidelines and Deliverables • 15 minutes
  • Data Platform Architecture • 10 minutes
  • Assignment Overview: OLTP Database • 15 minutes
  • OLTP Database Requirements and Design • 5 minutes

In this module, you will design a data platform that uses MongoDB as a NoSQL database. You will use MongoDB to store the e-commerce catalog data.

What's included

2 assignments, 1 app item, 1 plugin

2 assignments • Total 25 minutes
  • Checklist: Querying Data in NoSQL Databases • 10 minutes
  • Graded Quiz: Querying Data in NoSQL Databases • 15 minutes
1 app item • Total 30 minutes
  • Hands-on Lab: Querying Data in NoSQL Databases • 30 minutes
1 plugin • Total 15 minutes
  • Assignment Overview: Querying Data in NoSQL Databases • 15 minutes

In this module, you will design and implement a data warehouse and then generate reports from its data.
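
The course description mentions Rollup operations as part of the warehouse reporting work. As a rough illustration only, the sketch below emulates a two-level rollup with UNION ALL on an in-memory SQLite database, because SQLite has no ROLLUP clause; in PostgreSQL the same report would normally come from GROUP BY ROLLUP (country, category). The fact_sales table, its columns, and the sample rows are hypothetical and are not taken from the course labs.

```python
import sqlite3

# Emulated rollup: detail rows, per-country subtotals, then a grand total.
# NULL marks the "all values" level, as a ROLLUP clause would produce.
ROLLUP_EMULATION = """
SELECT country, category, SUM(amount) FROM fact_sales GROUP BY country, category
UNION ALL
SELECT country, NULL, SUM(amount) FROM fact_sales GROUP BY country
UNION ALL
SELECT NULL, NULL, SUM(amount) FROM fact_sales
"""


def make_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny, hypothetical fact table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_sales (country TEXT, category TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [("US", "Books", 10), ("US", "Toys", 5), ("CA", "Books", 7)],
    )
    conn.commit()
    return conn


def rollup_report(conn: sqlite3.Connection) -> list:
    """Return detail rows plus subtotal and grand-total rows."""
    return conn.execute(ROLLUP_EMULATION).fetchall()


if __name__ == "__main__":
    for row in rollup_report(make_demo_warehouse()):
        print(row)
```

A Cube operation would add the remaining grouping combination (per-category totals across all countries) to the same report.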

What's included

3 assignments, 2 app items, 1 plugin

3 assignments • Total 69 minutes
  • Checklist: Data Warehouse Design & Setup • 15 minutes
  • Checklist: Data Warehouse Reporting • 24 minutes
  • Graded Quiz: Build a Data Warehouse • 30 minutes
2 app items • Total 120 minutes
  • Hands-on Lab: Data Warehousing • 60 minutes
  • Hands-on Lab: Data Warehouse Reporting using PostgreSQL • 60 minutes
1 plugin • Total 15 minutes
  • Assignment Overview: Data Warehouse Design and Reporting • 15 minutes

In this module, you will assume the role of a data engineer at an e-commerce company. Your company has finished setting up a data warehouse. Now you are assigned the responsibility to design a reporting dashboard that reflects the key metrics of the business.

What's included

5 readings, 2 assignments, 6 plugins

5 readings • Total 42 minutes
  • (Optional): About this optional lesson with Looker Studio • 2 minutes
  • (Optional): Getting Started with Google Looker Studio • 10 minutes
  • (Optional): Creating Visualizations in Reports using Looker Studio • 10 minutes
  • (Optional): Summary and Highlights • 10 minutes
  • Final Assignment Overview • 10 minutes
2 assignments • Total 27 minutes
  • Checklist: Dashboard Creation • 12 minutes
  • Graded Quiz: Dashboard Creation • 15 minutes
6 plugins • Total 210 minutes
  • Assignment Overview: Data Analytics • 15 minutes
  • (Optional): Hands-on Lab: Getting Started with Google Looker Studio • 60 minutes
  • (Optional): Hands-on Lab: Creating and Configuring Visualizations in Reports with Google Looker Studio • 60 minutes
  • (Optional): Hands-on Lab: Advanced charts in Looker Studio • 15 minutes
  • (Optional): Final Assignment: Dashboard Creation using IBM Cognos Analytics • 30 minutes
  • (Optional): Final Assignment: Dashboard Creation using Google Looker Studio • 30 minutes

In this module, you will perform ETL operations to move transactional data from an OLTP database (MySQL) into a data warehouse (PostgreSQL). You will implement and automate an ETL pipeline in Python that extracts daily incremental records from the production database, transforms them as needed, and loads them into the warehouse. Once the ETL process is established, you will extend it using Apache Airflow, a workflow orchestration tool. You will design DAGs (Directed Acyclic Graphs) that define task dependencies, automate the extraction and transformation of web server logs, and archive processed data for downstream analytics.
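
The incremental extract, transform, and load step described above can be sketched as follows. This is a minimal illustration, not the lab's implementation: two in-memory SQLite databases stand in for the MySQL production database and the PostgreSQL warehouse, the sales table and its columns are hypothetical, and tracking the highest loaded sale_id (a "high-water mark") is only one common way to find new rows.

```python
import sqlite3


def last_loaded_id(warehouse: sqlite3.Connection) -> int:
    """Return the highest sale_id already in the warehouse (0 if empty)."""
    row = warehouse.execute("SELECT COALESCE(MAX(sale_id), 0) FROM sales").fetchone()
    return row[0]


def extract_new_rows(oltp: sqlite3.Connection, after_id: int) -> list:
    """Extract only the rows added since the previous load."""
    return oltp.execute(
        "SELECT sale_id, product_id, price_cents, sold_at FROM sales "
        "WHERE sale_id > ? ORDER BY sale_id",
        (after_id,),
    ).fetchall()


def transform(rows: list) -> list:
    """Convert cents to currency units and keep only the date part."""
    return [(sid, pid, cents / 100.0, ts[:10]) for sid, pid, cents, ts in rows]


def load(warehouse: sqlite3.Connection, rows: list) -> None:
    """Insert the transformed rows in a single transaction."""
    with warehouse:
        warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)


def run_incremental_etl(oltp: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Run one incremental load and return the number of rows moved."""
    new_rows = extract_new_rows(oltp, last_loaded_id(warehouse))
    load(warehouse, transform(new_rows))
    return len(new_rows)


def make_demo_databases():
    """Create hypothetical source and target databases for the sketch."""
    oltp = sqlite3.connect(":memory:")
    oltp.execute(
        "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, "
        "price_cents INTEGER, sold_at TEXT)"
    )
    oltp.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [(1, 10, 1999, "2024-01-01 09:30:00"), (2, 11, 500, "2024-01-01 10:00:00")],
    )
    oltp.commit()
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, "
        "price REAL, sale_date TEXT)"
    )
    return oltp, warehouse
```

Running run_incremental_etl twice in a row moves the existing rows once and then nothing, which is the property a daily scheduled job relies on.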

What's included

3 assignments, 2 app items, 1 plugin

3 assignments • Total 66 minutes
  • Checklist: ETL • 9 minutes
  • Checklist: Data Pipelines using Apache Airflow • 27 minutes
  • Graded Quiz: ETL and Data Pipelines • 30 minutes
2 app items • Total 90 minutes
  • Hands-on Lab: ETL • 60 minutes
  • Hands-on Lab: Data Pipelines using Apache Airflow • 30 minutes
1 plugin • Total 15 minutes
  • Assignment Overview: ETL and Data Pipelines • 15 minutes

In this module, you will use data from a web server to analyse search terms. You will then load a pretrained sales forecasting model and use it to forecast sales for a future year.

What's included

2 assignments, 2 app items, 1 plugin

2 assignments • Total 29 minutes
  • Checklist: Big Data Analytics with Spark • 14 minutes
  • Graded Quiz: Big Data Analytics with Spark • 15 minutes
2 app items • Total 60 minutes
  • Practice Hands On Lab: Saving and loading a SparkML model • 30 minutes
  • Hands-on Lab: SparkML Ops • 30 minutes
1 plugin • Total 15 minutes
  • Assignment Overview: Big Data Analytics with Spark • 15 minutes

In this module, you will make a final submission of all the labs you've completed throughout the course for evaluation. You can choose to have your submission evaluated by an AI tool or through a peer-graded review.

What's included

2 readings, 1 peer review, 1 app item

2 readings • Total 3 minutes
  • Congrats & Next Steps • 2 minutes
  • Thanks from the Course Team • 1 minute
1 peer review • Total 60 minutes
  • Option 2 - Peer Graded: Final Project - Submission and Evaluation • 60 minutes
1 app item • Total 60 minutes
  • Option 1 - AI Graded: Final Project - Submission and Evaluation • 60 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Instructor ratings
4.5 (33 ratings)
IBM
55 Courses • 5,143,420 learners

Why people choose Coursera for their career

Felipe M.

Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Learner reviews

  • 5 stars: 84.61%
  • 4 stars: 9.79%
  • 3 stars: 2.09%
  • 2 stars: 1.39%
  • 1 star: 2.09%

Showing 3 of 143

RV · Reviewed on Mar 9, 2024

The Capstone was a bit of an anticlimax. I was expecting a very challenging Capstone, but found a "follow the instructions" approach which made it seem too simple. I'm not complaining ;-)

RS · Reviewed on Mar 17, 2023

I enjoyed having to go back and revise the other courses in the specialization. I had forgotten how interesting they were.

BB · Reviewed on Aug 13, 2023

Great course to learn the fundamentals to become a very good Data Engineer !

Frequently asked questions

This project requires you to architect a multi-tiered data platform that uses several database paradigms. For transactional data, you will implement a MySQL OLTP database to log live e-commerce operations. For unstructured data storage, you will design a MongoDB NoSQL database to manage product catalogs. Finally, you will construct and populate a data warehouse using PostgreSQL and IBM Db2, writing complex analytics queries for business reporting.

You will gain hands-on experience handling automated data movement across different platform layers. You will build an Extract, Transform, and Load (ETL) pipeline using Python to extract daily incremental records from production systems and load them safely into your data warehouse. Moving beyond basic scripts, you will orchestrate this entire pipeline using Apache Airflow, designing DAGs (Directed Acyclic Graphs) to manage task dependencies, automate the ingestion of web server logs, and clean data for downstream analytics.
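
To make the idea of DAG-managed task dependencies concrete, here is a small sketch that uses Python's standard-library graphlib module in place of Apache Airflow. It only shows how a dependency graph determines execution order; the task names and the PIPELINE mapping are hypothetical, and a real Airflow DAG would declare operators and a schedule instead.

```python
from graphlib import CycleError, TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
PIPELINE = {
    "extract_logs": set(),
    "transform_logs": {"extract_logs"},
    "archive_logs": {"transform_logs"},
}


def execution_order(dependencies: dict) -> list:
    """Return a task order that respects every dependency.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies: dict, tasks: dict) -> list:
    """Call each task function in dependency order and collect the results."""
    return [tasks[name]() for name in execution_order(dependencies)]


if __name__ == "__main__":
    # Placeholder task bodies that just report their own name.
    demo_tasks = {name: (lambda n=name: f"ran {n}") for name in PIPELINE}
    try:
        for line in run_pipeline(PIPELINE, demo_tasks):
            print(line)
    except CycleError as exc:
        print("dependency cycle:", exc)
```

The "acyclic" requirement matters because a cycle would leave no valid starting task, which is why the sorter rejects such graphs.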

Yes. To prepare you for entry-level engineering roles, the capstone integrates big data processing engines and BI platforms. You will use Apache Spark to run large-scale log analysis, extracting and parsing search terms directly from web server data. Furthermore, you will hook your engineered data paths into a pretrained machine learning model to execute sales forecasting and then pipe those results into IBM Cognos Analytics to build interactive, live reporting dashboards reflecting key performance indicators.

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Financial aid available.

¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.