
URL: https://www.coursera.org/learn/packt-snowflake-build-and-architect-data-pipelines-using-aws-f8fn8

Snowflake - Build and Architect Data Pipelines Using AWS | Coursera


Gain insight into a topic and learn the fundamentals.
Intermediate level

Recommended experience

1 week to complete
at 10 hours a week
Flexible schedule
Learn at your own pace

What you'll learn

  • Architect and optimize scalable data pipelines with Snowflake and AWS.

  • Implement ingestion, transformation, and extraction workflows with best practices.

  • Deploy machine learning pipelines using Snowpark and real-time streaming with Kafka.

  • Ensure data governance with advanced security and compliance features.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

14 assignments

Taught in English

There are 14 modules in this course

This course features Coursera Coach!

A smarter way to learn with interactive, real-time conversations that help you test your knowledge, challenge assumptions, and deepen your understanding as you progress through the course.

Learn how to design and build robust data pipelines using Snowflake and AWS in this comprehensive course. You will explore Snowflake's architecture, including its virtual warehouses, billing components, and object hierarchy, to understand how to use the platform effectively. You'll also cover data ingestion, partitioning, clustering, and performance optimization techniques, with hands-on labs to reinforce your learning.

By the end of the course, you will be able to create optimized, cost-effective data pipelines that integrate with AWS services such as S3, Lambda, and Glue. Along the way, you will build tasks and queries, use streams to track data changes, and set up user-defined and external functions. You will also explore advanced topics such as Snowpark for data science, streaming with Kafka, and data governance techniques that help your pipelines meet security and compliance standards.

This course is designed for those seeking hands-on expertise in building scalable data pipelines using Snowflake and AWS. Whether you're an aspiring data engineer or an experienced professional looking to sharpen your skills, it provides the tools and knowledge to implement real-world data engineering solutions.

In this module, we will set the stage for the entire course by outlining the roadmap, discussing the prerequisites, and sharing success strategies. These foundational insights will ensure you're well-prepared to navigate and excel in the upcoming material.

What's included

2 videos (total 6 minutes)
  • Course Roadmap (3 minutes)
  • Prerequisites and How to Succeed in This Course (3 minutes)
1 reading (total 10 minutes)
  • Full Course Resources (10 minutes)

In this module, we will explore the foundational concepts of data warehousing and its significance within a data ecosystem. We’ll take a closer look at Snowflake’s architecture, object hierarchy, and virtual warehouses. Additionally, you’ll learn about Snowflake’s billing components, tracking consumption, and setting up resource monitors, ensuring you’re equipped to manage resources effectively.

What's included

9 videos (total 47 minutes)
  • What Is a Data Warehouse? (3 minutes)
  • Two Aspects of a Data Ecosystem (2 minutes)
  • Lab - Set Up Snowflake Trial Account (2 minutes)
  • Snowflake Architecture (4 minutes)
  • Snowflake Object Hierarchy (5 minutes)
  • Snowflake - Virtual Warehouses (10 minutes)
  • Snowflake - Different Billing Components (10 minutes)
  • Snowflake - Track Your Consumption (5 minutes)
  • Snowflake - Resource Monitors (5 minutes)
1 assignment (total 15 minutes)
  • Introduction to Snowflake and AWS - Assessment (15 minutes)
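
To make the billing and resource-monitor topics concrete, here is a minimal sketch of how warehouse compute cost can be estimated. It assumes the commonly documented credit rates for standard warehouses (doubling with each size step) and per-second billing with a 60-second minimum each time a warehouse runs. The function names and the sample quota are hypothetical, and actual rates depend on your account and edition.

```python
# Assumed credits per hour for standard warehouse sizes (illustrative only).
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}

# Assumed minimum billed duration each time a warehouse starts or resumes.
MINIMUM_BILLED_SECONDS = 60


def estimate_credits(size: str, run_seconds: int) -> float:
    """Estimate the credits consumed by one warehouse run of the given length."""
    if run_seconds <= 0:
        return 0.0
    billed_seconds = max(run_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


def reached_threshold(credits_used: float, quota: float, threshold_percent: float) -> bool:
    """Mimic a resource monitor trigger: has usage reached the given percentage of the quota?"""
    return credits_used >= quota * threshold_percent / 100


if __name__ == "__main__":
    print(estimate_credits("M", 1800))       # 30 minutes on a Medium warehouse
    print(reached_threshold(8.0, 10.0, 80))  # usage at 80% of a 10-credit quota
```

A short run is billed as a full minute under these assumptions, which is why frequent suspend and resume cycles on a large warehouse can cost more than the raw query time suggests.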

In this module, we will delve into the various table types available in Snowflake, providing a comprehensive introduction to their structures and purposes. You’ll gain hands-on experience through labs focused on creating tables, views, and secure views. We’ll also explore the nuances of views, including materialized and secure views, to enhance your understanding of Snowflake's data presentation capabilities.

What's included

6 videos (total 30 minutes)
  • Introduction - Different Tables in Snowflake (4 minutes)
  • Lab - Create Tables in Snowflake (9 minutes)
  • Snowflake - Views, Materialized Views and Secure Views (4 minutes)
  • Lab - Create Views in Snowflake (5 minutes)
  • Lab - Create Secure Views in Snowflake (5 minutes)
  • More about Views in Snowflake (3 minutes)
1 assignment (total 15 minutes)
  • Snowflake - Tables - Assessment (15 minutes)

In this module, we will examine Snowflake’s advanced data organization features, focusing on micro-partitions and clustering keys. Through hands-on labs, you’ll learn to select and configure clustering keys, analyze query profiles, and leverage caching mechanisms to enhance performance. Additionally, we’ll explore the benefits of search optimization to further streamline data retrieval and processing efficiency.

What's included

9 videos (total 72 minutes)
  • Section Overview (2 minutes)
  • Introduction to Partitions and Clustering Keys (8 minutes)
  • Lab - Micro-Partitions and Clustering Keys (18 minutes)
  • Benefits of Micro-Partitions and Clustering (6 minutes)
  • Understanding Clustering Depth and Cluster Overlap (9 minutes)
  • Lab - Selecting Your Clustering Keys (6 minutes)
  • Lab - Check Query Profile and History (5 minutes)
  • Lab - Query Processing and Caching (8 minutes)
  • Search Optimization Feature (11 minutes)
1 assignment (total 15 minutes)
  • Snowflake - Partitioning, Clustering, and Performance Optimization - Assessment (15 minutes)
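
As a rough illustration of why clustering and partition overlap matter, the sketch below models micro-partitions as objects that carry only the minimum and maximum value of one column, then counts how many partitions a range filter would need to scan. This is a simplified model, not Snowflake's implementation, and the class, function, and partition names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    name: str
    min_value: int  # smallest value of the filtered column in this partition
    max_value: int  # largest value of the filtered column in this partition


def partitions_to_scan(partitions, low, high):
    """Keep only partitions whose [min, max] range overlaps the filter range [low, high]."""
    return [p for p in partitions if p.max_value >= low and p.min_value <= high]


# Well-clustered data: value ranges barely overlap between partitions.
clustered = [
    MicroPartition("p1", 1, 100),
    MicroPartition("p2", 101, 200),
    MicroPartition("p3", 201, 300),
]

# Poorly clustered data: every partition spans almost the whole value range.
unclustered = [
    MicroPartition("p1", 1, 290),
    MicroPartition("p2", 5, 300),
    MicroPartition("p3", 2, 250),
]

if __name__ == "__main__":
    print([p.name for p in partitions_to_scan(clustered, 120, 150)])
    print([p.name for p in partitions_to_scan(unclustered, 120, 150)])
```

With the same filter, the clustered layout lets two of three partitions be skipped, while the overlapping layout forces all of them to be read.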

In this module, we will explore the end-to-end processes for loading and extracting data in Snowflake. You'll learn how to connect Snowflake with AWS S3, ingest structured and semi-structured data, and implement continuous ingestion using Snowpipe. Additionally, we'll cover critical aspects such as billing estimation and key considerations to ensure efficient data operations. Hands-on labs will solidify your understanding of these concepts.

What's included

9 videos (total 54 minutes)
  • Section Overview (1 minute)
  • Data Ingestion - Real-World Use Cases (4 minutes)
  • Lab - Create an Integration Object to Connect Snowflake with AWS S3 (8 minutes)
  • Lab - Ingest CSV from S3 to Snowflake (8 minutes)
  • Lab - Ingest JSON from S3 to Snowflake (11 minutes)
  • Introduction to Continuous Data Ingestion in Snowflake (2 minutes)
  • Lab - Create and Implement Snowpipe (11 minutes)
  • Snowpipe - Billing Estimation and Key Considerations for Data Ingestion (3 minutes)
  • Lab - Extracting/Unloading Data from Snowflake to S3 (6 minutes)
1 assignment (total 15 minutes)
  • Snowflake - Data Loading/Ingestion and Extraction - Assessment (15 minutes)

In this module, we will delve into Snowflake's task management and query scheduling features. You'll learn how to create and manage tasks, build complex task trees for dependent workflows, and monitor their execution. We'll also explore billing insights and query history to ensure efficient and cost-effective operations. Through hands-on labs, you’ll gain practical skills in implementing and optimizing tasks in Snowflake.

What's included

4 videos (total 15 minutes)
  • Section Overview (0 minutes)
  • Introduction to Tasks (4 minutes)
  • Lab - Create Standalone and Dependent Tree of Tasks (9 minutes)
  • Lab - Billing and Query History for Tasks (2 minutes)
1 assignment (total 15 minutes)
  • Snowflake - Tasks and Query Scheduling - Assessment (15 minutes)
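
The idea of a dependent tree of tasks can be sketched with a plain dependency graph: a root task runs first, and each child task runs only after its predecessors finish. The example below uses Python's standard `graphlib` module to derive a valid run order. The task names are hypothetical, and this models only the ordering, not Snowflake's scheduler.

```python
from graphlib import TopologicalSorter

# Each key is a task; each value is the set of tasks that must finish before it.
# Task names are made up for illustration.
task_tree = {
    "load_raw": set(),                                  # root task, no predecessors
    "clean_orders": {"load_raw"},
    "clean_customers": {"load_raw"},
    "build_report": {"clean_orders", "clean_customers"},
}


def execution_order(tree):
    """Return one valid order in which every task runs after all of its predecessors."""
    return list(TopologicalSorter(tree).static_order())


if __name__ == "__main__":
    print(execution_order(task_tree))
```

The two cleaning tasks have no dependency on each other, so either may come first; only the root-first and report-last positions are fixed.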

In this module, we will uncover the power of streams in Snowflake for implementing Change Data Capture (CDC) workflows. You'll learn how to use standard and append-only streams, manage data retention, and handle stream staleness. Through a series of labs and a project, you’ll create and implement end-to-end pipelines that leverage streams to track and process data changes efficiently. This hands-on experience will solidify your understanding of CDC in modern data architectures.

What's included

11 videos (total 58 minutes)
  • Section Overview (1 minute)
  • Introduction to Streams (3 minutes)
  • Lab - Implement Standard Streams (15 minutes)
  • Lab - Implement Append-Only Streams (4 minutes)
  • Lab - Streams in a Transaction (6 minutes)
  • Streams - Data Retention and Staleness (6 minutes)
  • Lab - Change Tracking Using "Changes" (6 minutes)
  • Project Overview (2 minutes)
  • Lab - Create Streams - Project Solution (7 minutes)
  • Lab - Create Streams - Continuation (3 minutes)
  • Lab - End-to-End Pipeline in Action (5 minutes)
1 assignment (total 15 minutes)
  • Snowflake - Streams and Change Data Capture - Assessment (15 minutes)
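
The difference between standard and append-only streams can be illustrated with a small simulation. It assumes the usual description of the two modes: a standard stream reports the net row-level change since its offset, so a row inserted and then deleted in the same window disappears, while an append-only stream reports every inserted row and ignores deletes. The function names and the event format are hypothetical, updates are left out for brevity, and row IDs are assumed to be unique.

```python
def standard_stream(changes):
    """Return net changes since the offset: an insert followed by a delete cancels out."""
    inserted, deleted = set(), set()
    for action, row_id in changes:
        if action == "INSERT":
            inserted.add(row_id)
        elif action == "DELETE":
            if row_id in inserted:
                inserted.discard(row_id)  # inserted and deleted in the same window
            else:
                deleted.add(row_id)       # row existed before the offset
    return [("INSERT", r) for r in sorted(inserted)] + [("DELETE", r) for r in sorted(deleted)]


def append_only_stream(changes):
    """Return every inserted row since the offset; deletes are not recorded."""
    return [("INSERT", row_id) for action, row_id in changes if action == "INSERT"]


# Table changes made after the stream offset (row 7 existed before the offset).
changes = [("INSERT", 1), ("INSERT", 2), ("DELETE", 2), ("DELETE", 7)]

if __name__ == "__main__":
    print(standard_stream(changes))
    print(append_only_stream(changes))
```

Append-only mode fits pipelines that only need newly arrived rows, since it avoids the bookkeeping needed to compute a net delta.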

In this module, we will explore User-Defined Functions (UDFs) in Snowflake, a powerful feature for extending database functionality. You'll learn about different UDF types, including scalar, tabular, and JavaScript-based UDFs, and gain hands-on experience implementing them. Additionally, we'll discuss pushdown in UDFs and its impact, as well as best practices for writing secure UDFs to ensure data privacy and compliance.

What's included

7 videos (total 36 minutes)
  • Introduction to User-Defined Functions and UDF Types (3 minutes)
  • Lab - Write and Implement a Scalar UDF (5 minutes)
  • Lab - Write Tabular UDF in SQL (4 minutes)
  • Lab - Implement JavaScript UDFs (5 minutes)
  • What Is Pushdown in UDF? (3 minutes)
  • Lab - How Can Pushdown Expose the Underlying Data? (6 minutes)
  • Lab - Write Secure UDFs (10 minutes)
1 assignment (total 15 minutes)
  • Snowflake - User-Defined Functions - Assessment (15 minutes)

In this module, we will explore the capabilities of external functions in Snowflake for interacting with external systems. You’ll learn how to deploy AWS Lambda functions, create and secure API Gateway, and integrate these components with Snowflake to build external functions. Through hands-on labs, you will gain practical skills in configuring and deploying these powerful integrations for extending Snowflake’s functionality.

What's included

7 videos (total 30 minutes)
  • Section Overview (1 minute)
  • Introduction to External Functions (2 minutes)
  • Lab - Deploy AWS Lambda Function (7 minutes)
  • Create IAM Role (2 minutes)
  • Lab - Create API Gateway (7 minutes)
  • Lab - Secure and Deploy API Gateway (5 minutes)
  • Lab - Create External Function in Snowflake (7 minutes)
1 assignment (total 15 minutes)
  • Snowflake - External Functions - Assessment (15 minutes)

In this module, we will explore how to integrate Snowflake with Python, Spark, and Airflow on AWS to build robust data engineering solutions. You’ll learn how to connect Snowflake with Python locally and on AWS Glue, parameterize scripts, and use Pandas for data manipulation. Additionally, we will dive into PySpark jobs, the pushdown optimization in Spark 3.1, and setting up Airflow for task orchestration. Hands-on labs will provide practical experience in deploying and automating workflows across these tools.

What's included

12 videos (total 52 minutes)
  • Section Overview (1 minute)
  • Lab - Connect Python with Snowflake in Your Local Machine (3 minutes)
  • Introduction to AWS Glue (2 minutes)
  • Lab - Deploy and Execute Python Script to AWS Glue (6 minutes)
  • Lab - Parameterize Your Python Script on AWS Glue (3 minutes)
  • Lab - Python Pandas with Snowflake on AWS Glue (4 minutes)
  • What Is Pushdown in Spark 3.1? (4 minutes)
  • Lab - Deploy a PySpark Script Using AWS Glue (8 minutes)
  • Lab - Set Up Managed Airflow Cluster on AWS (6 minutes)
  • Lab - Configure Snowflake Connectivity in Airflow (6 minutes)
  • Lab - Deploy a PySpark Transformation Job in AWS Glue (5 minutes)
  • Lab - Set Up Airflow DAG (5 minutes)
1 assignment (total 15 minutes)
  • Snowflake with Python, Spark, and Airflow on AWS - Assessment (15 minutes)

In this module, we will focus on real-time streaming using Kafka and Snowflake. You'll learn to set up Kafka on your local system, configure the Kafka-Snowflake connector, and enable secure connectivity with encryption keys. Through hands-on labs, you'll implement streaming pipelines to ingest real-time data into Snowflake, solidifying your understanding of integrating modern streaming platforms with Snowflake.

What's included

6 videos (total 32 minutes)
  • Section Overview (1 minute)
  • Lab - Download the Necessary JAR Files (5 minutes)
  • Lab - Set Up Kafka in Your Local System (7 minutes)
  • Lab - Set Up Kafka Snowflake Connector (5 minutes)
  • Lab - Set Up Encryption Keys for Kafka-Snowflake Connectivity (5 minutes)
  • Lab - Streaming Data in Action (8 minutes)
1 assignment (total 15 minutes)
  • Real-Time Streaming with Kafka and Snowflake - Assessment (15 minutes)

In this module, we will explore key features of Snowflake that support data protection and governance. You'll learn about the Time Travel and Fail-safe mechanisms for data recovery, and implement column-level dynamic data masking to safeguard sensitive information. Additionally, we'll cover row-level security and guide you through hands-on labs to create and apply access policies for controlled and compliant data access.

What's included

6 videos (total 23 minutes)
  • Section Overview (1 minute)
  • What Is Time Travel and Fail-safe in Snowflake? (2 minutes)
  • Lab - Time Travel and Data Recovery (6 minutes)
  • Lab - Column Level Dynamic Data Masking (5 minutes)
  • What Is Row Level Security? (2 minutes)
  • Lab - Create and Implement Row Level Access Policy (7 minutes)
1 assignment (total 15 minutes)
  • Snowflake - Data Protection and Governance - Assessment (15 minutes)
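
Column-level dynamic data masking boils down to a rule that decides, at query time, whether the caller's role may see the real value. The sketch below expresses that rule as a plain Python function. It is a conceptual model only: the role name `PII_ADMIN`, the function name, and the mask format are hypothetical, and in Snowflake the rule would be defined as a masking policy attached to a column.

```python
# Roles allowed to see unmasked values (hypothetical role name).
ALLOWED_ROLES = frozenset({"PII_ADMIN"})


def mask_email(value: str, current_role: str) -> str:
    """Return the real email for privileged roles, otherwise hide the local part."""
    if current_role in ALLOWED_ROLES:
        return value
    _local, separator, domain = value.partition("@")
    if not separator:
        return "*****"  # not a well-formed email, so mask everything
    return "*****" + separator + domain


if __name__ == "__main__":
    print(mask_email("ana@example.com", "PII_ADMIN"))
    print(mask_email("ana@example.com", "ANALYST"))
```

Because the decision depends on the role at query time, the stored data stays unchanged and the same table can serve both privileged and restricted users.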

In this module, we will dive into Snowpark, Snowflake's powerful framework for building advanced data pipelines and supporting data science use cases. You'll gain hands-on experience with deploying Python UDFs, creating stored procedures for ETL tasks, and preparing data for machine learning. Furthermore, you will build and deploy model training and prediction pipelines using Scikit-Learn, all powered by Snowpark. Additional learning resources and a coupon code for extended exploration will also be provided.

What's included

9 videos (total 72 minutes)
  • Introduction - What Is Snowpark? (6 minutes)
  • Lab - Getting Started with Snowpark (16 minutes)
  • Overview - UDFs and Stored Procedures (3 minutes)
  • Lab - Deploy Python UDFs (7 minutes)
  • Lab - Deploy Stored Procedures for ETL Batch Processing (15 minutes)
  • Data Science - Use Case Overview and Data Preparation (5 minutes)
  • Lab - Deploy Model - Training Code for Scikit-Learn Using Stored Procedures (10 minutes)
  • Lab - Deploy Model Serving/Prediction Serving Pipeline Using UDFs (10 minutes)
  • More Learning Reference and Coupon Code (1 minute)
1 assignment (total 15 minutes)
  • Snowpark - For Data Pipelines and Data Science - Assessment (15 minutes)

In this module, we will wrap up the course by reflecting on the key topics and skills covered. You’ll receive guidance on next steps, including updates on Snowflake's evolving features and additional learning opportunities. This final section will help you chart a path for continued growth and mastery of Snowflake and its ecosystem.

What's included

1 video (total 1 minute)
  • More Updates and What's Next (1 minute)
2 assignments (total 75 minutes)
  • Full Course Assessment (60 minutes)
  • Full Course Practice Assessment (15 minutes)

Why people choose Coursera for their career

Felipe M.

Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Frequently asked questions

Yes, you can preview the first video and view the syllabus before you enroll. You must purchase the course to access content not included in the preview.

If you decide to enroll in the course before the session start date, you will have access to all of the lecture videos and readings for the course. You’ll be able to submit assignments once the session starts.

Once you enroll and your session begins, you will have access to all videos and other resources, including reading items and the course discussion forum. You’ll be able to view and submit practice assessments, and complete required graded assignments to earn a grade and a Course Certificate.

If you complete the course successfully, your electronic Course Certificate will be added to your Accomplishments page - from there, you can print your Course Certificate or add it to your LinkedIn profile.

This course is currently available only to learners who have paid or received financial aid, when available.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.
