Introduction to Data Engineering on AWS
This course is part of Data Engineering on AWS - The Complete Training Specialization
What you'll learn
Master how to use AWS Glue for ETL development and data transformation.
Learn to efficiently catalog, process, and manage large-scale datasets with AWS Glue.
Explore Amazon Redshift's architecture and optimize query performance for fast data analysis.
Gain hands-on experience with Redshift Spectrum and Serverless to enhance data scalability and flexibility.
There are 5 modules in this course
This course features Coursera Coach!
A smarter way to learn with interactive, real-time conversations that help you test your knowledge, challenge assumptions, and deepen your understanding as you progress through the course.
In this course, you'll gain a comprehensive understanding of data engineering using AWS Glue and Redshift, two critical tools for modern data workflows. You will be equipped with the skills to manage and transform data at scale, from cataloging and processing with AWS Glue to leveraging Redshift for powerful data warehousing and analytics. By diving into hands-on tutorials, you'll learn the core concepts and practical applications necessary to streamline data pipelines and optimize query performance.
As you progress through the course, you will explore a variety of AWS Glue features such as Data Catalogs, ETL development, job bookmarking, and data quality evaluation, empowering you to automate data workflows and manage large datasets effectively. With Amazon Redshift, you will learn how to configure clusters, optimize queries, and even work with Redshift Spectrum and Serverless, improving the scalability and efficiency of your data operations.
This course is ideal for data professionals looking to enhance their cloud-based data engineering skills, especially those who want to integrate AWS Glue and Redshift into their existing systems. It is suitable for learners with a basic understanding of data analytics, but prior knowledge of AWS or data engineering concepts would be beneficial. The course is designed for both beginners and intermediate learners, offering a solid foundation and practical skills that can be applied in real-world data engineering roles.
By the end of the course, you will be able to build and optimize ETL pipelines using AWS Glue, manage data workflows, configure Redshift clusters, optimize query performance, and deploy serverless Redshift for scalable data warehousing solutions.
In this module, we will introduce the concept of data as the new oil and explore its growing importance in the modern digital world. You'll gain a high-level overview of the course and understand the pivotal role data plays in driving innovation and business success.
What's included
1 video, 2 readings
1 video•Total 8 minutes
- Introduction to Specialization•8 minutes
2 readings•Total 20 minutes
- Introduction to the Course 'Introduction to Data Engineering on AWS'•10 minutes
- Full Specialization Resource•10 minutes
In this module, we will introduce your trainer and provide insights into their professional background. You’ll learn what to expect from this course and how their expertise will guide you throughout your learning journey.
What's included
1 video, 1 assignment
1 video•Total 3 minutes
- Know Your Trainer•3 minutes
1 assignment•Total 15 minutes
- Know Your Trainer - Assessment•15 minutes
In this module, we will explore the foundational concepts of data engineering, focusing on how AWS services facilitate modern data analytics. You'll also be introduced to essential terminologies to build a strong understanding of data engineering workflows.
What's included
2 videos, 1 assignment
2 videos•Total 24 minutes
- Data Engineering on AWS•13 minutes
- Basic Terminologies•11 minutes
1 assignment•Total 15 minutes
- Getting Started with Data Analytics - Assessment•15 minutes
In this module, we will dive deep into AWS Glue, exploring its features for data cataloging, ETL processes, and data quality management. You’ll gain hands-on experience in setting up and orchestrating workflows that automate data transformation and processing tasks.
What's included
11 videos, 1 assignment
11 videos•Total 152 minutes
- Glue Data Catalog•32 minutes
- Glue ETL: Part 1•4 minutes
- Glue ETL: Part 2•10 minutes
- Glue ETL: Part 3•22 minutes
- Workflows•12 minutes
- Job Bookmark•5 minutes
- Execution Type•10 minutes
- Data Quality: Part 1•5 minutes
- Data Quality: Part 2•22 minutes
- Glue DataBrew•21 minutes
- Additional Features•9 minutes
1 assignment•Total 15 minutes
- AWS Glue: Catalog and Process Your Data - Assessment•15 minutes
In this module, we will explore Amazon Redshift, focusing on its architecture, cluster management, and querying capabilities. You'll also learn about advanced features like Redshift Spectrum, Serverless Redshift, and materialized views to optimize your data warehousing experience.
What's included
14 videos, 1 reading, 3 assignments
14 videos•Total 154 minutes
- Amazon Redshift•9 minutes
- Architecture•11 minutes
- Creating a Cluster•17 minutes
- Query Editor v2•9 minutes
- Distribution Styles•16 minutes
- Cluster Operations•5 minutes
- Data API•7 minutes
- Redshift Spectrum•21 minutes
- Redshift Serverless: Part 1•13 minutes
- Redshift Serverless: Part 2•8 minutes
- Materialized Views•10 minutes
- WLM and Concurrency•10 minutes
- DataShare•8 minutes
- Additional Information•11 minutes
1 reading•Total 10 minutes
- Conclusion to the Course 'Introduction to Data Engineering on AWS'•10 minutes
3 assignments•Total 90 minutes
- Amazon Redshift: A Data Warehouse in AWS - Assessment•15 minutes
- Full Course Assessment•60 minutes
- Full Course Practice Assessment•15 minutes
Frequently asked questions
Data Engineering is the practice of designing and building systems to collect, store, and analyze data. It is a crucial aspect of modern data analytics because businesses and organizations rely on accurate, clean, and easily accessible data to make informed decisions. As data becomes more valuable than ever, the need for professionals who can manage and optimize data workflows is growing, making this field essential in today’s data-driven world.
This course provides an in-depth exploration of data engineering on AWS, focusing on the use of AWS Glue for ETL (Extract, Transform, Load) processes and Amazon Redshift for data warehousing. You’ll learn how to manage data using AWS Glue's features, create ETL pipelines, and leverage Redshift for high-performance data analytics and warehousing. The course covers core concepts like data quality, serverless architecture, and performance optimization in both tools.
Upon completing this course, you’ll be proficient in building and managing ETL pipelines using AWS Glue and deploying and optimizing data warehouses with Amazon Redshift. You’ll be equipped with the skills to work with complex data workflows, ensure data quality, and utilize serverless features for scalability. Additionally, you’ll understand how to create and manage Redshift clusters and enhance query performance.
