
URL: https://www.coursera.org/learn/tidyverse-importing-data

Importing Data in the Tidyverse | Coursera




Gain insight into a topic and learn the fundamentals.
4.7

51 reviews

Beginner level

Recommended experience

2 weeks to complete
at 10 hours a week
Flexible schedule
Learn at your own pace


What you'll learn

  • Describe different data formats

  • Apply Tidyverse functions to import data into R from external formats

  • Obtain data from a web API

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

5 assignments

Taught in English

Build your subject-matter expertise

This course is part of the Tidyverse Skills for Data Science in R Specialization
When you enroll in this course, you'll also be enrolled in this Specialization.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 6 modules in this course

Getting data into your statistical analysis system can be one of the most challenging parts of any data science project. Data must be imported and harmonized into a coherent format before any insights can be obtained. You will learn how to get data into R from commonly used formats and how to harmonize different kinds of datasets from different sources. If you work in an organization where different departments collect data using different systems and different storage formats, then this course will provide essential tools for bringing those datasets together and making sense of the wealth of information in your organization.

This course introduces the Tidyverse tools for importing data into R so that it can be prepared for analysis, visualization, and modeling. Common data formats are introduced, including delimited files, spreadsheets and relational databases, and techniques for obtaining data from the web are demonstrated, such as web scraping and web APIs. In this specialization we assume familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.

A basic data type in the tidyverse is the tibble. Tibbles store tabular data and are a modern take on the standard R data frame. They have many user-friendly features that are an improvement over standard data frames when doing interactive data analysis. The remainder of this module covers tabular data in spreadsheet formats like Excel, CSV, TSV, and other delimited files.

What's included

15 readings, 1 assignment

15 readings • Total 166 minutes
  • About This Course • 5 minutes
  • Tibbles • 10 minutes
  • Creating a tibble • 20 minutes
  • Subsetting • 10 minutes
  • Spreadsheets • 1 minute
  • Excel files • 30 minutes
  • Google Sheets • 45 minutes
  • CSVs • 10 minutes
  • Downloading CSV files • 5 minutes
  • Reading CSVs into R • 10 minutes
  • TSVs • 2 minutes
  • Reading TSVs Files into R • 5 minutes
  • Delimited Files • 3 minutes
  • Reading Delimited Files into R • 5 minutes
  • Exporting Data from R • 5 minutes
1 assignment • Total 30 minutes
  • Importing and Exporting Data Quiz • 30 minutes

Data can come in non-tabular formats, especially unstructured data or data that otherwise would not fit into a table. JSON and XML are common formats for storing arbitrarily structured data, and this module covers the packages used to read them. In addition, relational databases are common for storing very large collections of tables where you do not need to read in the entire dataset at once. There are many relational database formats; we will cover SQLite, which is compact and simple to use.

What's included

10 readings, 1 assignment

10 readings • Total 132 minutes
  • JSON • 30 minutes
  • XML • 15 minutes
  • Databases • 2 minutes
  • Relational Data • 15 minutes
  • Relational Databases: SQL • 5 minutes
  • Connecting to Databases: RSQLite • 10 minutes
  • Working with Relational Data: dplyr & dbplyr • 5 minutes
  • Mutating Joins • 30 minutes
  • Filtering Joins • 10 minutes
  • How to Connect to a Database Online • 10 minutes
1 assignment • Total 30 minutes
  • JSON, XML, and Databases Quiz • 30 minutes

Reading in data from various Internet sources can be a useful way to build analyses that need to be regularly updated. The rvest and httr packages are useful for connecting to web sites, web APIs and other online sources of data.

What's included

11 readings, 1 assignment

11 readings • Total 105 minutes
  • Web Scraping • 10 minutes
  • rvest Basics • 0 minutes
  • SelectorGadget • 10 minutes
  • Web Scraping Example • 10 minutes
  • A final note: SelectorGadget • 2 minutes
  • API • 5 minutes
  • Getting Data: httr • 5 minutes
  • Example 1: GitHub's API • 30 minutes
  • Example 2: Obtaining a CSV • 20 minutes
  • read_csv() from a URL • 3 minutes
  • API keys • 10 minutes
1 assignment • Total 30 minutes
  • Getting Data from the Internet Quiz • 30 minutes

Working with others in a data science project often involves reading output or data produced using other statistical analysis packages or other software. This module covers packages for reading in these foreign formats, as well as images and data from Google Drive.

What's included

3 readings, 1 assignment

3 readings • Total 65 minutes
  • haven • 15 minutes
  • Images • 30 minutes
  • googledrive • 20 minutes
1 assignment • Total 30 minutes
  • Foreign Formats, Images and googledrive Quiz • 30 minutes

Now we will demonstrate how to import data using our case study examples. When working through the steps of the case studies, you can use either RStudio on your own computer or Coursera lab spaces provided for each case study.

What's included

11 readings, 2 ungraded labs

11 readings • Total 142 minutes
  • Case Study #1: Health Expenditures • 5 minutes
  • Healthcare Coverage Data • 45 minutes
  • Healthcare Spending Data • 30 minutes
  • New Case Study #2: Firearms • 2 minutes
  • Census Data • 5 minutes
  • Counted Data • 5 minutes
  • Suicide Data • 10 minutes
  • Brady Data • 10 minutes
  • Crime Data • 10 minutes
  • Land Area Data • 10 minutes
  • Unemployment Data • 10 minutes
2 ungraded labs • Total 120 minutes
  • Health Expenditures Lab • 60 minutes
  • Firearms Case Study Lab • 60 minutes

This project will give you the opportunity to read in data from multiple sources and conduct some simple operations on those data.

What's included

2 readings, 1 assignment

2 readings • Total 20 minutes
  • Introduction and Background • 10 minutes
  • Datasets • 10 minutes
1 assignment • Total 30 minutes
  • Importing Data into R Project • 30 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Instructor ratings
4.3 (14 ratings)
Johns Hopkins University
5 Courses • 7,093 learners
Johns Hopkins University
5 Courses • 7,093 learners
Johns Hopkins University
37 Courses • 1,689,638 learners



Learner reviews

  • 5 stars: 78%
  • 4 stars: 18%
  • 3 stars: 4%
  • 2 stars: 0%
  • 1 star: 0%


EL · Reviewed on Nov 22, 2022

Excellent. While there were no lectures, and it is possible to simply read the authors' book, having the quizzes makes the difference between just reading and actually learning. Thanks!

FC · Reviewed on Jan 28, 2021

Excellent tutorial for importing data into the tidyverse environment

VM · Reviewed on Mar 27, 2021

Great for beginners. Clearly explained, and easy to follow.

Frequently asked questions

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you'll find a link to apply on the description page.
