Process Data from Dirty to Clean
This course is part of the Google Data Analytics Professional Certificate
942,567 already enrolled
18,888 reviews
What you'll learn
Define different types of data integrity and identify risks to data integrity.
Apply basic SQL functions to clean string variables in a database.
Develop basic SQL queries for use on databases.
Describe the process of verifying data cleaning results.
Build your Data Analysis expertise
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills with hands-on projects
- Earn a shareable career certificate from Google
There are 6 modules in this course
This is the fourth course in the Google Data Analytics Certificate. In this course, you’ll continue to build your understanding of data analytics and the concepts and tools that data analysts use in their work. You’ll learn how to check and clean your data using spreadsheets and SQL, as well as how to verify and report your data cleaning results. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.
Learners who complete this certificate program will be equipped to apply for introductory-level jobs as data analysts. No previous experience is necessary.
By the end of this course, learners will:
- Check for data integrity.
- Apply data cleaning techniques using spreadsheets.
- Develop basic SQL queries for use on databases.
- Use basic SQL functions to clean and transform data.
- Verify the results of cleaning data.
- Write an effective data cleaning report.
Data integrity is critical to successful analysis. In this part of the course, you’ll explore methods and steps that analysts take to check their data for integrity. This includes knowing what to do when you don’t have enough data. You’ll also learn about random samples and understand how to avoid sampling bias. All of these methods will also help you ensure your analysis is successful.
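To make the ideas of sample size, margin of error, and random sampling more concrete, here is a minimal Python sketch. It is not taken from the course materials. It assumes the standard normal-approximation formula for a sample proportion, a large population, and a 95% confidence level (z = 1.96). The function names are hypothetical and chosen only for illustration.

```python
import math
import random


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion.

    Assumes the normal approximation and a large population;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def required_sample_size(margin: float, proportion: float = 0.5, z: float = 1.96) -> int:
    """Smallest sample size whose margin of error is at most `margin`."""
    return math.ceil(z ** 2 * proportion * (1 - proportion) / margin ** 2)


def simple_random_sample(records: list, n: int, seed: int = 0) -> list:
    """Draw n records without replacement so each record is equally likely to be chosen.

    A fixed seed makes the draw reproducible; equal selection probability
    is one basic guard against sampling bias.
    """
    return random.Random(seed).sample(records, n)
```

Under these assumptions, required_sample_size(0.05) returns 385, and margin_of_error(400) is about 0.049, which shows how a larger sample narrows the margin of error.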
What's included
8 videos, 10 readings, 6 assignments
8 videos•Total 33 minutes
- Introduction to data integrity•4 minutes
- Why data integrity is important•3 minutes
- Balance objectives with data integrity•3 minutes
- Deal with insufficient data•4 minutes
- The importance of sample size•3 minutes
- Using statistical power•5 minutes
- Determine the best sample size •5 minutes
- Evaluate data reliability•6 minutes
10 readings•Total 68 minutes
- Course 4 overview•8 minutes
- Helpful resources and tips•4 minutes
- More about data integrity and compliance•8 minutes
- Well-aligned objectives and data •8 minutes
- When you find an issue with your data•4 minutes
- Calculate sample size•8 minutes
- When data isn't readily available•8 minutes
- Sample size calculator•8 minutes
- All about margin of error•8 minutes
- Glossary terms from module 1•4 minutes
6 assignments•Total 92 minutes
- Module 1 challenge•40 minutes
- Test your knowledge on data integrity and analytics objectives•8 minutes
- Self-Reflection: Pre-cleaning activities•20 minutes
- Test your knowledge on insufficient data•8 minutes
- Test your knowledge on testing your data•8 minutes
- Test your knowledge on margin of error•8 minutes
Every data analyst wants to analyze clean data. In this part of the course, you’ll learn the difference between clean and dirty data. Then, you’ll practice cleaning data in spreadsheets and other tools.
What's included
10 videos, 10 readings, 6 assignments, 1 plugin
10 videos•Total 66 minutes
- Clean it up!•3 minutes
- Why data cleaning is critical•6 minutes
- Angie: I love cleaning data•1 minute
- Recognize and remedy dirty data•5 minutes
- Data-cleaning tools and techniques•6 minutes
- Clean data from multiple sources•6 minutes
- Data-cleaning features in spreadsheets•8 minutes
- Optimize the data-cleaning process•14 minutes
- Different data perspectives•10 minutes
- Even more data-cleaning techniques•7 minutes
10 readings•Total 72 minutes
- What is dirty data?•8 minutes
- Common data-cleaning pitfalls•8 minutes
- Step-by-Step guide: Data-cleaning features in spreadsheets•8 minutes
- Step-by-Step: Optimize the data-cleaning process •8 minutes
- Workflow automation•8 minutes
- Step-by-Step: Different data perspectives•8 minutes
- Step-by-Step: Even more data-cleaning techniques•8 minutes
- Working with .csv files•4 minutes
- Develop your approach to cleaning data•8 minutes
- Glossary terms from module 2•4 minutes
6 assignments•Total 184 minutes
- Module 2 challenge•40 minutes
- Test your knowledge on data cleaning•8 minutes
- Hands-On Activity: Cleaning data with spreadsheets•60 minutes
- Test your knowledge on the first steps toward clean data•8 minutes
- Hands-On Activity: Clean data with spreadsheet functions•60 minutes
- Test your knowledge on cleaning data in spreadsheets•8 minutes
1 plugin•Total 10 minutes
- Principles of data integrity •10 minutes
Knowing a variety of ways to clean data can make a data analyst’s job much easier. In this part of the course, you’ll use SQL to clean data from databases. In particular, you’ll explore how SQL queries and functions can be used to clean and transform your data before an analysis.
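As a rough illustration of cleaning string data with SQL, the sketch below runs a query through Python's built-in sqlite3 module. The course itself works in BigQuery, so using SQLite here is a substitution made only to keep the example self-contained. The customer table, its columns, and the sample rows are hypothetical. TRIM, UPPER, SUBSTR, LENGTH, and DISTINCT are common SQL cleaning tools, but whether the course uses exactly this combination is an assumption.

```python
import sqlite3


def clean_customers(rows):
    """Load raw (name, country) rows into an in-memory table and return cleaned rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (name TEXT, country TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    cleaned = conn.execute(
        """
        SELECT DISTINCT                                   -- drop exact duplicates after cleaning
            TRIM(name) AS clean_name,                     -- remove leading/trailing spaces
            UPPER(SUBSTR(TRIM(country), 1, 2)) AS country_code  -- keep a 2-letter uppercase code
        FROM customer
        WHERE LENGTH(TRIM(name)) > 0                      -- skip rows whose name is blank
        ORDER BY clean_name
        """
    ).fetchall()
    conn.close()
    return cleaned


# Hypothetical dirty input: stray spaces, inconsistent case, a blank name.
raw_rows = [
    ("  Ada Lovelace ", "us"),
    ("Ada Lovelace", "US "),
    ("Grace Hopper", "usa"),
    ("   ", "US"),
]
```

Calling clean_customers(raw_rows) returns two rows, ("Ada Lovelace", "US") and ("Grace Hopper", "US"): the two spellings of the first customer collapse into one, and the blank name is dropped.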
What's included
9 videos, 7 readings, 5 assignments, 1 plugin
9 videos•Total 49 minutes
- Use SQL to clean data•1 minute
- Sally: For the love of SQL•3 minutes
- Understand SQL capabilities•3 minutes
- Spreadsheets versus SQL•4 minutes
- Widely used SQL queries•6 minutes
- Evan: Having fun with SQL •3 minutes
- Clean string variables using SQL•13 minutes
- Advanced data-cleaning functions, part 1•6 minutes
- Advanced data-cleaning functions, part 2•9 minutes
7 readings•Total 42 minutes
- How a junior data analyst uses SQL•4 minutes
- SQL dialects and their uses•8 minutes
- Review: Set up your BigQuery account•8 minutes
- Review: Get started with BigQuery•8 minutes
- Optional: Upload the customer dataset to BigQuery•4 minutes
- Optional: Upload the store transactions dataset to BigQuery•8 minutes
- Glossary terms from module 3•2 minutes
5 assignments•Total 195 minutes
- Module 3 challenge•45 minutes
- Hands-On Activity: Processing time with SQL•60 minutes
- Hands-On Activity: Clean data using SQL•60 minutes
- Test your knowledge on SQL queries•10 minutes
- Self-Reflection: Challenges with SQL•20 minutes
1 plugin•Total 10 minutes
- Data-cleaning with SQL functions•10 minutes
When you clean data, you make changes to the original dataset. It’s important to verify the changes you make are accurate and to let your teammates know about the changes. In this part of the course, you’ll learn to verify that data is clean and report your data cleaning results. With verified clean data, you’re ready to begin analyzing!
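One way to picture verification and reporting is a small check that compares the data before and after cleaning and then records the change. The Python sketch below is an assumption about what such a check might look like, not the course's own checklist. The function names, the dictionary-based row format, and the changelog wording are all hypothetical.

```python
from datetime import date


def verify_cleaning(original, cleaned, key):
    """Summarize simple checks on cleaned rows (each row is a dict)."""
    keys = [row[key] for row in cleaned]
    return {
        "rows_before": len(original),
        "rows_after": len(cleaned),
        # Number of rows whose key repeats an earlier row's key.
        "duplicate_keys": len(keys) - len(set(keys)),
        # Rows that still contain None or an empty string.
        "rows_with_missing_values": sum(
            1 for row in cleaned if any(v in (None, "") for v in row.values())
        ),
    }


def changelog_entry(description, report, when=None):
    """Format one changelog line so teammates can see what changed and when."""
    when = when or date.today()
    return (
        f"{when.isoformat()}: {description} "
        f"(rows {report['rows_before']} -> {report['rows_after']})"
    )
```

A report with zero duplicate keys and zero rows with missing values suggests the cleaning did what was intended, and the changelog line gives teammates a dated record of the change.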
What's included
6 videos, 5 readings, 4 assignments
6 videos•Total 28 minutes
- Verify and report results•3 minutes
- Confirm data-cleaning meets business expectations•5 minutes
- Verification of data cleaning•8 minutes
- Capture cleaning changes•6 minutes
- Why documentation is important•3 minutes
- Feedback and cleaning•2 minutes
5 readings•Total 26 minutes
- Step-by-Step: Verification of data cleaning•8 minutes
- Data-cleaning verification checklist•4 minutes
- Embrace changelogs•8 minutes
- Advanced functions for speedy data cleaning•4 minutes
- Glossary terms from module 4•2 minutes
4 assignments•Total 76 minutes
- Module 4 challenge•40 minutes
- Test your knowledge on manual data cleaning•8 minutes
- Self-Reflection: Creating a changelog•20 minutes
- Test your knowledge on documenting the cleaning process•8 minutes
Creating an effective resume will help you in your data analytics career. In this part of the course, you’ll learn all about the job application process. Your focus will be on building a resume that highlights your strengths and relevant experience.
What's included
3 videos, 2 readings
3 videos•Total 9 minutes
- Make your resume unique•3 minutes
- Joseph: Black and African American inclusion in the data industry•2 minutes
- Where does your interest lie?•4 minutes
2 readings•Total 12 minutes
- The importance of diversity on a data analytics team•4 minutes
- Add technical skills to your resume•8 minutes
Review the course glossary and prepare for the next course in the Google Data Analytics Certificate program.
What's included
1 video, 3 readings
1 video•Total 1 minute
- Congratulations! Course wrap-up•1 minute
3 readings•Total 12 minutes
- Reflect and connect with peers•4 minutes
- Course 4 glossary•4 minutes
- Coming up next ...•4 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Learner reviews
- 5 stars: 85.14%
- 4 stars: 12.19%
- 3 stars: 1.84%
- 2 stars: 0.42%
- 1 star: 0.39%
Showing 3 of 18,888
Reviewed on Oct 27, 2023
Fun, concise, and on point course walking new folks through (or a great review for not so new folks) the process of identification, basic change management, and reporting for dataset validation
Reviewed on Jul 7, 2025
Great way of teaching, her lectures were outstanding and engaging, understood each and every concept very clearly. Thank you Google and Coursera team for making us to interact with such personality...
Reviewed on Oct 26, 2021
Good content overall. However it will be nice to have the glossary for each week not mixed up with those from previous as it makes it hard to navigate and know which new words need to be learnt
Frequently asked questions
Data is a group of facts that can take many different forms, such as numbers, pictures, words, videos, observations, and more. We use and create data every day, like when we stream a show or song or post on social media.
Data analytics is the collection, transformation, and organization of these facts to draw conclusions, make predictions, and drive informed decision-making.
The amount of data created each day is tremendous. Any time you use your phone, look up something online, stream music, shop with a credit card, post on social media, or use GPS to map a route, you’re creating data. Companies must continually adjust their products, services, tools, and business strategies to meet consumer demand and react to emerging trends. Because of this, data analyst roles are in demand and competitively paid.
Data analysts make sense of data and numbers to help organizations make better business decisions. They prepare, process, analyze, and visualize data, discovering patterns and trends and answering key questions along the way. Their work empowers their wider team to make better business decisions.
You will learn the skill set required for becoming a junior or associate data analyst in the Google Data Analytics Certificate. Data analysts know how to ask the right question; prepare, process, and analyze data for key insights; effectively share their findings with stakeholders; and provide data-driven recommendations for thoughtful action.
You’ll learn these job-ready skills in our certificate program through interactive content (discussion prompts, quizzes, and activities) in under six months, with under 10 hours of flexible study a week. Along the way, you'll work through a curriculum designed with input from top employers and industry leaders, like Tableau, Accenture, and Deloitte. You’ll even have the opportunity to complete a case study that you can share with potential employers to showcase your new skill set.
After you’ve graduated from the program, you’ll have access to career resources and be connected directly with employers hiring for open entry-level roles in data analytics.
