Data Storage and Queries
This course is part of the DeepLearning.AI Data Engineering Professional Certificate
Instructors: Joe Reis
8,320 already enrolled
84 reviews
What you'll learn
Design storage architectures for various use cases, and select appropriate technologies to implement these architectures
Practice common query patterns and identify ways to improve query performance and enhance the value of your data systems
There are 3 modules in this course
In this course, you will learn about the raw ingredients and processes used to physically store data on disk and in memory. You’ll explore the storage systems built on top of these raw ingredients, including object, block, and file storage, as well as databases. You’ll also get a chance to use the Cypher language to query a Neo4j graph database and to perform vector similarity search, a key feature behind generative AI and large language models. You will explore the evolution of data storage abstractions, from data warehouses to data lakes and data lakehouses, while comparing the advantages and drawbacks of each architectural paradigm. With hands-on practice, you will design a simple data lake using AWS Glue and build a data lakehouse using AWS Lake Formation and Apache Iceberg. In the last week of this course, you’ll see how queries work behind the scenes, practice writing more advanced SQL queries, compare query performance in row-oriented vs. column-oriented storage, and perform streaming queries using Apache Flink.
What's included
16 videos, 12 readings, 1 assignment, 1 programming assignment, 1 ungraded lab
16 videos•Total 103 minutes
- Welcome to Course 3•4 minutes
- Course 3 Overview•4 minutes
- Storage Raw Ingredients - Physical Components of Data Storage•8 minutes
- Storage Raw Ingredients - Processes Required for Data Storage•6 minutes
- Cloud Storage Options: Block, Object and File storage•8 minutes
- Storage Tiers - Hot, Warm, & Cold Data•3 minutes
- Distributed Storage Systems•7 minutes
- Lab Walkthrough - Comparing Cloud Storage Options•4 minutes
- How Databases Store Data•5 minutes
- Row vs Column Storage•6 minutes
- Graph Databases•5 minutes
- Vector Databases•5 minutes
- Neo4j and Cypher Query Language (Part 1)•4 minutes
- Neo4j and Cypher Query Language (Part 2)•9 minutes
- [Optional] - Conversation with Juan Sequeda•24 minutes
- Week 1 Summary•2 minutes
12 readings•Total 60 minutes
- Program Syllabus•5 minutes
- [Optional] Compression Algorithms•5 minutes
- [Optional] Database Partitioning/Sharding Methods•5 minutes
- [IMPORTANT] Guidelines before you start the labs in this course•10 minutes
- [Optional] FAQ VS Code Lab Environment•5 minutes
- Join the DeepLearning.AI Forum to ask questions, get support, or share amazing ideas!•2 minutes
- [Optional] The Parquet Format•5 minutes
- [Optional] Wide-Column Databases•5 minutes
- [Optional] ANN Algorithm: Hierarchical Navigable Small World (HNSW)•5 minutes
- [Optional] - Links to Data and Cypher Instructions•2 minutes
- Lecture Notes W1•1 minute
- Week 1 Resources•10 minutes
1 assignment•Total 30 minutes
- Week 1 Quiz•30 minutes
1 programming assignment•Total 120 minutes
- Assignment 1: Graph Databases and Vector Search with Neo4j•120 minutes
1 ungraded lab•Total 120 minutes
- Practice Lab: Comparing Cloud Data Storage Options•120 minutes
What's included
16 videos, 2 readings, 1 assignment, 1 programming assignment, 1 ungraded lab
16 videos•Total 84 minutes
- Week 2 Overview•2 minutes
- [Optional] Conversation with Bill Inmon•12 minutes
- Data Warehouse - Key Architectural Ideas•6 minutes
- Modern Cloud Data Warehouses•4 minutes
- Data Lakes - Key Architectural Ideas•4 minutes
- Next-Generation Data Lakes•5 minutes
- Lab Walkthrough - Simple Data Lake with AWS Glue (Part 1)•5 minutes
- Lab Walkthrough - Simple Data Lake with AWS Glue (Part 2)•6 minutes
- Lab Walkthrough - Simple Data Lake with AWS Glue (Part 3 - Optional)•3 minutes
- The Data Lakehouse Architecture•3 minutes
- Data Lakehouse Implementation•5 minutes
- Lakehouse Architecture on AWS•5 minutes
- Implementing a Lakehouse on AWS•8 minutes
- Lab Walkthrough - Building a Data Lakehouse with AWS Lake Formation and Apache Iceberg (Part 1)•10 minutes
- Lab Walkthrough - Building a Data Lakehouse with AWS Lake Formation and Apache Iceberg (Part 2)•5 minutes
- Week 2 Summary•2 minutes
2 readings•Total 11 minutes
- Lecture Notes W2•1 minute
- Week 2 Resources•10 minutes
1 assignment•Total 30 minutes
- Week 2 Quiz•30 minutes
1 programming assignment•Total 120 minutes
- Assignment 2: Building a Data Lakehouse with AWS Lake Formation and Apache Iceberg•120 minutes
1 ungraded lab•Total 120 minutes
- Practice Lab: Simple Data Lake with AWS Glue•120 minutes
What's included
15 videos, 4 readings, 1 assignment, 1 programming assignment, 2 ungraded labs
15 videos•Total 77 minutes
- Week 3 Overview•3 minutes
- The Life of a Query•5 minutes
- Advanced SQL Queries (Part 1)•6 minutes
- Advanced SQL Queries (Part 2)•7 minutes
- Index Deep Dive•7 minutes
- Retrieving Only the Data You Need•3 minutes
- The Join Statement•8 minutes
- Aggregate Queries•3 minutes
- Amazon Redshift Cloud Data Warehouse•9 minutes
- Lab Walkthrough - Comparing the Query Performance Between Row and Columnar Storage•3 minutes
- Additional Query Strategies•5 minutes
- Queries on Streaming Data•5 minutes
- Deploying an Application with Amazon Managed Service for Apache Flink•6 minutes
- Deploying a Studio Notebook with Amazon Managed Service for Apache Flink•6 minutes
- Course 3 Summary•1 minute
4 readings•Total 19 minutes
- [Optional] - Additional Index Examples•3 minutes
- Lecture Notes W3•1 minute
- Week 3 Resources•10 minutes
- Acknowledgments•5 minutes
1 assignment•Total 30 minutes
- Week 3 Quiz•30 minutes
1 programming assignment•Total 180 minutes
- Assignment 3: Advanced SQL Queries•180 minutes
2 ungraded labs•Total 240 minutes
- Practice Lab 1: Comparing the Query Performance Between Row-Oriented and Column-Oriented Databases•120 minutes
- Practice Lab 2: Streaming Queries with Apache Flink•120 minutes
Learner reviews
- 5 stars: 82.14%
- 4 stars: 9.52%
- 3 stars: 2.38%
- 2 stars: 2.38%
- 1 star: 3.57%
Showing 3 of 84
Reviewed on Apr 24, 2025
This is a really excellent course covering a number of topics that anyone going into data engineering should be familiar with.
Reviewed on Oct 6, 2025
Just excellent all around (from a current practitioner)
Reviewed on Sep 15, 2025
Solid! bit of all but too in depth nor too practice oriented
Frequently asked questions
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
