Snowflake - Build and Architect Data Pipelines Using AWS
What you'll learn
Architect and optimize scalable data pipelines with Snowflake and AWS.
Implement ingestion, transformation, and extraction workflows with best practices.
Deploy machine learning pipelines using Snowpark and real-time streaming with Kafka.
Ensure data governance with advanced security and compliance features.
There are 14 modules in this course
This course features Coursera Coach, a smarter way to learn with interactive, real-time conversations that help you test your knowledge, challenge assumptions, and deepen your understanding as you progress through the course.
Learn how to design and build robust data pipelines using Snowflake and AWS in this comprehensive course. You will explore Snowflake's architecture, including its virtual warehouses, billing components, and object hierarchy, to understand how to use the platform effectively. You'll also cover data ingestion, partitioning, clustering, and performance optimization techniques, with hands-on labs to reinforce your learning.
By the end of the course, you will be able to create optimized, cost-effective data pipelines that integrate with AWS services like S3, Lambda, and Glue. Along the way, you will build tasks and queries, use streams for real-time data tracking, and set up user-defined and external functions. As you progress, you will also explore advanced concepts such as Snowpark for data science, streaming with Kafka, and data governance techniques that help your data pipeline meet security and compliance standards.
This course is designed for those seeking hands-on expertise in building scalable data pipelines using Snowflake and AWS. Whether you're an aspiring data engineer or an experienced professional looking to sharpen your skills, this course will give you the tools and knowledge to implement real-world data engineering solutions.
In this module, we will set the stage for the entire course by outlining the roadmap, discussing the prerequisites, and sharing success strategies. These foundational insights will ensure you're well-prepared to navigate and excel in the upcoming material.
What's included
2 videos, 1 reading
2 videos•Total 6 minutes
- Course Roadmap•3 minutes
- Prerequisites and How to Succeed in This Course•3 minutes
1 reading•Total 10 minutes
- Full Course Resources•10 minutes
In this module, we will explore the foundational concepts of data warehousing and its significance within a data ecosystem. We’ll take a closer look at Snowflake’s architecture, object hierarchy, and virtual warehouses. Additionally, you’ll learn about Snowflake’s billing components, tracking consumption, and setting up resource monitors, ensuring you’re equipped to manage resources effectively.
What's included
9 videos, 1 assignment
9 videos•Total 47 minutes
- What Is a Data Warehouse?•3 minutes
- Two Aspects of a Data Ecosystem•2 minutes
- Lab - Set Up Snowflake Trial Account•2 minutes
- Snowflake Architecture•4 minutes
- Snowflake Object Hierarchy•5 minutes
- Snowflake - Virtual Warehouses•10 minutes
- Snowflake - Different Billing Components•10 minutes
- Snowflake - Track Your Consumption•5 minutes
- Snowflake - Resource Monitors•5 minutes
1 assignment•Total 15 minutes
- Introduction to Snowflake and AWS - Assessment•15 minutes
In this module, we will delve into the various table types available in Snowflake, providing a comprehensive introduction to their structures and purposes. You’ll gain hands-on experience through labs focused on creating tables, views, and secure views. We’ll also explore the nuances of views, including materialized and secure views, to enhance your understanding of Snowflake's data presentation capabilities.
What's included
6 videos, 1 assignment
6 videos•Total 30 minutes
- Introduction - Different Tables in Snowflake•4 minutes
- Lab - Create Tables in Snowflake•9 minutes
- Snowflake - Views, Materialized Views and Secure Views•4 minutes
- Lab - Create Views in Snowflake•5 minutes
- Lab - Create Secure Views in Snowflake•5 minutes
- More about Views in Snowflake•3 minutes
1 assignment•Total 15 minutes
- Snowflake - Tables - Assessment•15 minutes
In this module, we will examine Snowflake’s advanced data organization features, focusing on micro-partitions and clustering keys. Through hands-on labs, you’ll learn to select and configure clustering keys, analyze query profiles, and leverage caching mechanisms to enhance performance. Additionally, we’ll explore the benefits of search optimization to further streamline data retrieval and processing efficiency.
What's included
9 videos, 1 assignment
9 videos•Total 72 minutes
- Section Overview•2 minutes
- Introduction to Partitions and Clustering Keys•8 minutes
- Lab - Micro-Partitions and Clustering Keys•18 minutes
- Benefits of Micro-Partitions and Clustering•6 minutes
- Understanding Clustering Depth and Cluster Overlap•9 minutes
- Lab - Selecting Your Clustering Keys•6 minutes
- Lab - Check Query Profile and History•5 minutes
- Lab - Query Processing and Caching•8 minutes
- Search Optimization Feature•11 minutes
1 assignment•Total 15 minutes
- Snowflake - Partitioning, Clustering, and Performance Optimization - Assessment•15 minutes
In this module, we will explore the end-to-end processes for loading and extracting data in Snowflake. You'll learn how to connect Snowflake with AWS S3, ingest structured and semi-structured data, and implement continuous ingestion using Snowpipe. Additionally, we'll cover critical aspects such as billing estimation and key considerations to ensure efficient data operations. Hands-on labs will solidify your understanding of these concepts.
What's included
9 videos, 1 assignment
9 videos•Total 54 minutes
- Section Overview•1 minute
- Data Ingestion - Real-World Use Cases•4 minutes
- Lab - Create an Integration Object to Connect Snowflake with AWS S3•8 minutes
- Lab - Ingest CSV from S3 to Snowflake•8 minutes
- Lab - Ingest JSON from S3 to Snowflake•11 minutes
- Introduction to Continuous Data Ingestion in Snowflake•2 minutes
- Lab - Create and Implement Snowpipe•11 minutes
- Snowpipe - Billing Estimation and Key Considerations for Data Ingestion•3 minutes
- Lab - Extract/Unload Data from Snowflake to S3•6 minutes
1 assignment•Total 15 minutes
- Snowflake - Data Loading/Ingestion and Extraction - Assessment•15 minutes
In this module, we will delve into Snowflake's task management and query scheduling features. You'll learn how to create and manage tasks, build complex task trees for dependent workflows, and monitor their execution. We'll also explore billing insights and query history to ensure efficient and cost-effective operations. Through hands-on labs, you’ll gain practical skills in implementing and optimizing tasks in Snowflake.
What's included
4 videos, 1 assignment
4 videos•Total 15 minutes
- Section Overview•0 minutes
- Introduction to Tasks•4 minutes
- Lab - Create Standalone and Dependent Tree of Tasks•9 minutes
- Lab - Billing and Query History for Tasks•2 minutes
1 assignment•Total 15 minutes
- Snowflake - Tasks and Query Scheduling - Assessment•15 minutes
In this module, we will uncover the power of streams in Snowflake for implementing Change Data Capture (CDC) workflows. You'll learn how to use standard and append-only streams, manage data retention, and handle stream staleness. Through a series of labs and a project, you’ll create and implement end-to-end pipelines that leverage streams to track and process data changes efficiently. This hands-on experience will solidify your understanding of CDC in modern data architectures.
What's included
11 videos, 1 assignment
11 videos•Total 58 minutes
- Section Overview•1 minute
- Introduction to Streams•3 minutes
- Lab - Implement Standard Streams•15 minutes
- Lab - Implement Append-Only Streams•4 minutes
- Lab - Streams in a Transaction•6 minutes
- Streams - Data Retention and Staleness•6 minutes
- Lab - Change Tracking Using "Changes"•6 minutes
- Project Overview•2 minutes
- Lab - Create Streams - Project Solution•7 minutes
- Lab - Create Streams - Continuation•3 minutes
- Lab - End-to-End Pipeline in Action•5 minutes
1 assignment•Total 15 minutes
- Snowflake - Streams and Change Data Capture - Assessment•15 minutes
In this module, we will explore User-Defined Functions (UDFs) in Snowflake, a powerful feature for extending database functionality. You'll learn about different UDF types, including scalar, tabular, and JavaScript-based UDFs, and gain hands-on experience implementing them. Additionally, we'll discuss pushdown in UDFs and its impact, as well as best practices for writing secure UDFs to ensure data privacy and compliance.
What's included
7 videos, 1 assignment
7 videos•Total 36 minutes
- Introduction to User-Defined Functions and UDF Types•3 minutes
- Lab - Write and Implement a Scalar UDF•5 minutes
- Lab - Write Tabular UDF in SQL•4 minutes
- Lab - Implement JavaScript UDFs•5 minutes
- What Is Pushdown in UDF?•3 minutes
- Lab - How Can Pushdown Expose the Underlying Data?•6 minutes
- Lab - Write Secure UDFs•10 minutes
1 assignment•Total 15 minutes
- Snowflake - User-Defined Functions - Assessment•15 minutes
In this module, we will explore the capabilities of external functions in Snowflake for interacting with external systems. You’ll learn how to deploy AWS Lambda functions, create and secure an API Gateway, and integrate these components with Snowflake to build external functions. Through hands-on labs, you will gain practical skills in configuring and deploying these integrations to extend Snowflake’s functionality.
What's included
7 videos, 1 assignment
7 videos•Total 30 minutes
- Section Overview•1 minute
- Introduction to External Functions•2 minutes
- Lab - Deploy AWS Lambda Function•7 minutes
- Create IAM Role•2 minutes
- Lab - Create API Gateway•7 minutes
- Lab - Secure and Deploy API Gateway•5 minutes
- Lab - Create External Function in Snowflake•7 minutes
1 assignment•Total 15 minutes
- Snowflake - External Functions - Assessment•15 minutes
In this module, we will explore how to integrate Snowflake with Python, Spark, and Airflow on AWS to build robust data engineering solutions. You’ll learn how to connect Snowflake with Python locally and on AWS Glue, parameterize scripts, and use Pandas for data manipulation. Additionally, we will dive into PySpark jobs, the pushdown optimization in Spark 3.1, and setting up Airflow for task orchestration. Hands-on labs will provide practical experience in deploying and automating workflows across these tools.
What's included
12 videos, 1 assignment
12 videos•Total 52 minutes
- Section Overview•1 minute
- Lab - Connect Python with Snowflake on Your Local Machine•3 minutes
- Introduction to AWS Glue•2 minutes
- Lab - Deploy and Execute Python Script to AWS Glue•6 minutes
- Lab - Parameterize Your Python Script on AWS Glue•3 minutes
- Lab - Python Pandas with Snowflake on AWS Glue•4 minutes
- What Is Pushdown in Spark 3.1?•4 minutes
- Lab - Deploy a PySpark Script Using AWS Glue•8 minutes
- Lab - Set Up Managed Airflow Cluster on AWS•6 minutes
- Lab - Configure Snowflake Connectivity in Airflow•6 minutes
- Lab - Deploy a PySpark Transformation Job in AWS Glue•5 minutes
- Lab - Set Up Airflow DAG•5 minutes
1 assignment•Total 15 minutes
- Snowflake with Python, Spark, and Airflow on AWS - Assessment•15 minutes
In this module, we will focus on real-time streaming using Kafka and Snowflake. You'll learn to set up Kafka on your local system, configure the Kafka-Snowflake connector, and enable secure connectivity with encryption keys. Through hands-on labs, you'll implement streaming pipelines to ingest real-time data into Snowflake, solidifying your understanding of integrating modern streaming platforms with Snowflake.
What's included
6 videos, 1 assignment
6 videos•Total 32 minutes
- Section Overview•1 minute
- Lab - Download the Necessary JAR Files•5 minutes
- Lab - Set Up Kafka in Your Local System•7 minutes
- Lab - Set Up Kafka Snowflake Connector•5 minutes
- Lab - Set Up Encryption Keys for Kafka-Snowflake Connectivity•5 minutes
- Lab - Streaming Data in Action•8 minutes
1 assignment•Total 15 minutes
- Real-Time Streaming with Kafka and Snowflake - Assessment•15 minutes
In this module, we will explore key features of Snowflake that ensure robust data protection and governance. You'll learn about the Time Travel and Fail-safe mechanisms for data recovery, and implement column-level dynamic data masking to safeguard sensitive information. Additionally, we'll cover row-level security and guide you through hands-on labs to create and apply access policies, ensuring controlled and compliant data access.
What's included
6 videos, 1 assignment
6 videos•Total 23 minutes
- Section Overview•1 minute
- What Are Time Travel and Fail-safe in Snowflake?•2 minutes
- Lab - Time Travel and Data Recovery•6 minutes
- Lab - Column Level Dynamic Data Masking•5 minutes
- What Is Row Level Security?•2 minutes
- Lab - Create and Implement Row Level Access Policy•7 minutes
1 assignment•Total 15 minutes
- Snowflake - Data Protection and Governance - Assessment•15 minutes
In this module, we will dive into Snowpark, Snowflake's powerful framework for building advanced data pipelines and supporting data science use cases. You'll gain hands-on experience with deploying Python UDFs, creating stored procedures for ETL tasks, and preparing data for machine learning. Furthermore, you will build and deploy model training and prediction pipelines using Scikit-Learn, all powered by Snowpark. Additional learning resources and a coupon code for extended exploration will also be provided.
What's included
9 videos, 1 assignment
9 videos•Total 72 minutes
- Introduction - What Is Snowpark?•6 minutes
- Lab - Getting Started with Snowpark•16 minutes
- Overview - UDFs and Stored Procedures•3 minutes
- Lab - Deploy Python UDFs•7 minutes
- Lab - Deploy Stored Procedures for ETL Batch Processing•15 minutes
- Data Science - Use Case Overview and Data Preparation•5 minutes
- Lab - Deploy Model - Training Code for Scikit-Learn Using Stored Procedures•10 minutes
- Lab - Deploy Model Serving/Prediction Serving Pipeline Using UDFs•10 minutes
- More Learning Reference and Coupon Code•1 minute
1 assignment•Total 15 minutes
- Snowpark - For Data Pipelines and Data Science - Assessment•15 minutes
In this module, we will wrap up the course by reflecting on the key topics and skills covered. You’ll receive guidance on next steps, including updates on Snowflake's evolving features and additional learning opportunities. This final section will help you chart a path for continued growth and mastery of Snowflake and its ecosystem.
What's included
1 video, 2 assignments
1 video•Total 1 minute
- More Updates and What's Next•1 minute
2 assignments•Total 75 minutes
- Full Course Assessment•60 minutes
- Full Course Practice Assessment•15 minutes
