URL: https://www.coursera.org/learn/data-cleaning

Getting and Cleaning Data | Coursera


Getting and Cleaning Data

This course is part of multiple programs.

215,324 already enrolled

Gain insight into a topic and learn the fundamentals.
4.5 (8,078 reviews)

2 weeks to complete at 10 hours a week
Flexible schedule: learn at your own pace
90%: most learners liked this course

What you'll learn

  • Understand common data storage systems

  • Apply data cleaning basics to make data "tidy"

  • Use R for text and date manipulation

  • Obtain usable data from the web, APIs, and databases

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

4 assignments

Taught in English

Build your subject-matter expertise

This course is available as part of
When you enroll in this course, you'll also be asked to select a specific program.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 4 modules in this course

Before you can work with data, you have to get some. This course covers the basic ways to obtain data: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data "tidy". Tidy data dramatically speeds up downstream data analysis tasks. Finally, the course covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. Together, these are the basics needed for collecting, cleaning, and sharing data.
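As a rough illustration of what making data "tidy" can mean, the sketch below reshapes a small "wide" table (one column per visit) into a "long" one (one observation per row). The course itself works in R; Python is used here only for illustration, and the table, the column names, and the `to_tidy` helper are all hypothetical, not course material.

```python
# Hypothetical "wide" table: one row per patient, one column per visit.
wide_rows = [
    {"patient": "A", "visit_1": 5.1, "visit_2": 4.8},
    {"patient": "B", "visit_1": 6.3, "visit_2": 6.0},
]


def to_tidy(rows, id_column, value_name):
    """Reshape wide rows so that each output row holds one observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            tidy.append(
                {id_column: row[id_column], "variable": column, value_name: value}
            )
    return tidy


tidy_rows = to_tidy(wide_rows, "patient", "score")
for tidy_row in tidy_rows:
    print(tidy_row)
```

In the long form, every row has the same three fields, so later filtering or grouping steps no longer need to know how many visit columns the original table had.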

In this first week of the course, we look at finding data and reading different file types.

What's included

9 videos, 4 readings, 1 assignment

9 videos • Total 67 minutes
  • Obtaining Data Motivation • 6 minutes
  • Raw and Processed Data • 7 minutes
  • Components of Tidy Data • 9 minutes
  • Downloading Files • 7 minutes
  • Reading Local Files • 5 minutes
  • Reading Excel Files • 4 minutes
  • Reading XML • 13 minutes
  • Reading JSON • 5 minutes
  • The data.table Package • 11 minutes
4 readings • Total 40 minutes
  • Welcome to Week 1 • 10 minutes
  • Syllabus • 10 minutes
  • Pre-Course Survey • 10 minutes
  • Practical R Exercises in swirl Part 1 • 10 minutes
1 assignment • Total 30 minutes
  • Week 1 Quiz • 30 minutes

Welcome to Week 2 of Getting and Cleaning Data! The primary goal is to introduce you to the most common data storage systems and the appropriate tools to extract data from the web or from databases like MySQL.

What's included

5 videos, 1 assignment

5 videos • Total 41 minutes
  • Reading from MySQL • 15 minutes
  • Reading from HDF5 • 7 minutes
  • Reading from The Web • 7 minutes
  • Reading From APIs • 8 minutes
  • Reading From Other Sources • 5 minutes
1 assignment • Total 30 minutes
  • Week 2 Quiz • 30 minutes

Welcome to Week 3 of Getting and Cleaning Data! This week the lectures will focus on organizing, merging and managing the data you have collected using the lectures from Weeks 1 and 2.

What's included

7 videos, 1 reading, 1 assignment, 3 programming assignments

7 videos • Total 60 minutes
  • Subsetting and Sorting • 7 minutes
  • Summarizing Data • 12 minutes
  • Creating New Variables • 11 minutes
  • Reshaping Data • 9 minutes
  • Managing Data Frames with dplyr - Introduction • 3 minutes
  • Managing Data Frames with dplyr - Basic Tools • 12 minutes
  • Merging Data • 6 minutes
1 reading • Total 10 minutes
  • Practical R Exercises in swirl Part 2 • 10 minutes
1 assignment • Total 30 minutes
  • Week 3 Quiz • 30 minutes
3 programming assignments • Total 540 minutes
  • swirl Lesson 1: Manipulating Data with dplyr • 180 minutes
  • swirl Lesson 2: Grouping and Chaining with dplyr • 180 minutes
  • swirl Lesson 3: Tidying Data with tidyr • 180 minutes

Welcome to Week 4 of Getting and Cleaning Data! This week we finish up with lectures on text and date manipulation in R. In this final week we will also focus on peer grading of Course Projects.

What's included

5 videos, 2 readings, 1 assignment, 1 programming assignment, 1 peer review

5 videos • Total 34 minutes
  • Editing Text Variables • 11 minutes
  • Regular Expressions I • 5 minutes
  • Regular Expressions II • 8 minutes
  • Working with Dates • 6 minutes
  • Data Resources • 4 minutes
2 readings • Total 20 minutes
  • Practical R Exercises in swirl Part 4 • 10 minutes
  • Post-Course Survey • 10 minutes
1 assignment • Total 30 minutes
  • Week 4 Quiz • 30 minutes
1 programming assignment • Total 180 minutes
  • swirl Lesson 1: Dates and Times with lubridate • 180 minutes
1 peer review • Total 60 minutes
  • Getting and Cleaning Data Course Project • 60 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Instructor ratings
4.3 (337 ratings)
Johns Hopkins University
32 Courses • 1,762,220 learners

Why people choose Coursera for their career

Felipe M.

Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Learner reviews

  • 5 stars: 67.41%
  • 4 stars: 23.63%
  • 3 stars: 5.85%
  • 2 stars: 1.64%
  • 1 star: 1.44%

Showing 3 of 8078

WC · Reviewed on Oct 31, 2016

This course is amazing! I have spent the majority of my time in R merely doing analytics. This course taught me the tools needed to go out and grab the data that I need for those analytics.

AC · Reviewed on Jan 1, 2019

It was pretty hard for someone like me who has a weakness in programming but it provided sufficient exposure and tasks for me to learn within my capabilities. I did enjoy its challenges.

HS · Reviewed on May 2, 2020

This course provides an introduction of some important concepts and tools on a very important aspect of data science: cleaning and organizing data before any analysis. A must for any data scientist.

Frequently asked questions

To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you'll find a link to apply on the description page.
