URL: https://www.coursera.org/learn/ibm-data-analyst-capstone-project

IBM Data Analyst Capstone Project

89,673 already enrolled

Gain insight into a topic and learn the fundamentals.
Rating: 4.6 (1,345 reviews)

Advanced level

Flexible schedule
3 weeks at 10 hours a week
Learn at your own pace

91% of learners liked this course

What you'll learn

  • Apply techniques to gather and wrangle data from multiple sources.

  • Analyze data to identify patterns, trends, and insights through exploratory techniques.

  • Create visual representations of data using Python libraries to communicate findings effectively.

  • Construct interactive dashboards with BI tools to present and explore data dynamically.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

24 assignments¹

AI Graded (see disclaimer)
Taught in English

Build your Data Analysis expertise

This course is part of the IBM Data Analyst Professional Certificate
When you enroll in this course, you'll also be enrolled in this Professional Certificate.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate from IBM

There are 6 modules in this course

In an increasingly data-centric world, the ability to derive meaningful insights from raw data is essential. The IBM Data Analyst Capstone Project gives you the opportunity to apply the skills and techniques learned throughout the IBM Data Analyst Professional Certificate. Working with actual datasets, you will carry out tasks commonly performed by professional data analysts, such as data collection from multiple sources, data wrangling, exploratory analysis, statistical analysis, data visualization, and creating interactive dashboards. Your final deliverable will include a comprehensive data analysis report, complete with an executive summary, detailed insights, and a conclusion for organizational stakeholders.

Throughout the project, you will demonstrate your proficiency in tools such as Jupyter Notebooks, SQL, Relational Databases (RDBMS), and Business Intelligence (BI) tools like IBM Cognos Analytics. You will also apply Python libraries, including Pandas, NumPy, Scikit-learn, SciPy, Matplotlib, and Seaborn. We recommend completing the previous courses in the Professional Certificate before starting this capstone project, as it integrates all key concepts and techniques into a single, real-world scenario.

In this module, you’ll apply key data collection and analysis techniques using APIs and web scraping. You’ll start by exploring HTTP requests and using APIs to retrieve and paginate job postings across different technologies. Then, you’ll work with a JSON endpoint to collect job data through API requests. Next, you’ll use web scraping techniques to download webpages, extract links and images, and gather data from HTML tables into a CSV file. By the end of this module, you’ll have hands-on experience with real-world data collection methods. You’ll also complete a graded quiz to check your understanding.
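The pagination step described above can be sketched roughly as follows. This is a minimal illustration, not the lab's actual code: `fetch_page`, `collect_all`, and the `FAKE_JOBS` records are hypothetical names and made-up data standing in for a real HTTP endpoint, so the example runs without network access.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for a job postings API (made-up records).
# A real lab would send an HTTP request here instead.
FAKE_JOBS = [{"id": i, "technology": "Python" if i % 2 == 0 else "SQL"} for i in range(7)]


def fetch_page(page: int, page_size: int) -> List[Dict]:
    """Return one page of postings (pages are 1-indexed); an empty list means no more data."""
    start = (page - 1) * page_size
    return FAKE_JOBS[start:start + page_size]


def collect_all(fetch: Callable[[int, int], List[Dict]], page_size: int = 3) -> List[Dict]:
    """Request pages in order until the source returns an empty page."""
    results: List[Dict] = []
    page = 1
    while True:
        batch = fetch(page, page_size)
        if not batch:
            break
        results.extend(batch)
        page += 1
    return results


jobs = collect_all(fetch_page)
# Filter the collected postings by technology, as one might do per technology of interest.
python_jobs = [job for job in jobs if job["technology"] == "Python"]
print(len(jobs), len(python_jobs))
```

Stopping on the first empty page is only one convention. Some APIs instead report a total count or a "next" link, in which case the loop condition would change accordingly.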

What's included

2 videos, 4 readings, 4 assignments, 5 app items

2 videos (total 7 minutes)
  • Course Introduction (2 minutes)
  • Project Overview (5 minutes)
4 readings (total 40 minutes)
  • Prerequisites and Course Syllabus (5 minutes)
  • Emerging Trends in Data Analytics (10 minutes)
  • Project Scenario (10 minutes)
  • About the Dataset (15 minutes)
4 assignments (total 62 minutes)
  • Graded Quiz: Data Collection (30 minutes)
  • Checklist: Collecting Data Using APIs (10 minutes)
  • Checklist: Collecting Data Using Webscraping (8 minutes)
  • Checklist: Exploring Data (14 minutes)
5 app items (total 180 minutes)
  • (Optional) Lab 1: Review Of Accessing APIs (30 minutes)
  • Lab 2: Collecting Data Using APIs (30 minutes)
  • Lab 3: Review Of Web Scraping (30 minutes)
  • Lab 4: Collecting Data Using Web Scraping (60 minutes)
  • Lab 5: Exploring the Dataset (30 minutes)
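
For the web scraping labs listed above, gathering an HTML table into CSV might look something like the sketch below. It uses only Python's standard library so that it runs without extra installs. The `TableParser` class and the sample page are assumptions made for illustration (the values are invented), and the labs themselves may use different tools.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented sample page; a real page would be downloaded first.
SAMPLE_HTML = """
<table>
  <tr><th>Language</th><th>Average Salary</th></tr>
  <tr><td>Python</td><td>100000</td></tr>
  <tr><td>SQL</td><td>90000</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)

# Write to an in-memory buffer; open("file.csv", "w", newline="") would write to disk instead.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
csv_text = buffer.getvalue()
print(csv_text)
```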

In this module, you will apply the data-wrangling techniques needed to clean and prepare datasets for analysis. Throughout the module, you will engage in hands-on activities to identify and handle common data issues, including duplicate entries and missing values. You will remove duplicate records, apply suitable imputation strategies for missing data, and normalize datasets to ensure consistency and accuracy. A graded quiz assesses your understanding and reinforces the concepts covered.
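
As a rough illustration of the three wrangling steps named above (duplicate removal, imputation, normalization), here is a plain-Python sketch. The rows and column names are made up, median imputation and min-max scaling are just one possible choice for each step, and the labs would typically do the same with Pandas.

```python
from statistics import median

# Made-up survey rows; None marks a missing value.
rows = [
    {"respondent": 1, "comp": 50000.0},
    {"respondent": 2, "comp": None},
    {"respondent": 2, "comp": None},  # exact duplicate of the previous row
    {"respondent": 3, "comp": 90000.0},
    {"respondent": 4, "comp": 70000.0},
]

# 1. Remove exact duplicates, keeping the first occurrence.
seen = set()
unique_rows = []
for row in rows:
    key = (row["respondent"], row["comp"])
    if key not in seen:
        seen.add(key)
        unique_rows.append(dict(row))

# 2. Impute missing compensation with the median of the observed values.
observed = [r["comp"] for r in unique_rows if r["comp"] is not None]
fill_value = median(observed)
for r in unique_rows:
    if r["comp"] is None:
        r["comp"] = fill_value

# 3. Min-max normalize compensation to the range [0, 1].
#    (If every value were identical, high - low would be 0 and need a guard.)
low = min(r["comp"] for r in unique_rows)
high = max(r["comp"] for r in unique_rows)
for r in unique_rows:
    r["comp_norm"] = (r["comp"] - low) / (high - low)

print(unique_rows)
```

The median is often preferred over the mean for skewed fields such as compensation, because a few extreme values shift it less.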

What's included

1 reading, 7 assignments, 6 app items

1 reading (total 5 minutes)
  • Assignment Overview (5 minutes)
7 assignments (total 104 minutes)
  • Graded Quiz: Data Wrangling (30 minutes)
  • Checklist: Finding Duplicates (14 minutes)
  • Checklist: Removing Duplicates (10 minutes)
  • Checklist: Finding Missing Values (16 minutes)
  • Checklist: Imputing Missing Values (12 minutes)
  • Checklist: Normalizing Data (10 minutes)
  • Checklist: Data Wrangling (12 minutes)
6 app items (total 240 minutes)
  • Lab 6: Finding Duplicates (30 minutes)
  • Lab 7: Removing Duplicates (30 minutes)
  • Lab 8: Finding Missing Values (30 minutes)
  • Lab 9: Impute Missing Values (60 minutes)
  • Lab 10: Normalizing Data (60 minutes)
  • Lab 11: Data Wrangling (30 minutes)

In this module, you will engage in essential exploratory data analysis (EDA) techniques to uncover meaningful insights from your data set. You will start by identifying the distribution of the data through plotting distribution curves and histograms, which are crucial for understanding how values are spread across different features. Next, you will detect outliers that may skew your analysis and learn how to effectively remove them to ensure data integrity. Additionally, you will explore correlations between various features in the data set, revealing relationships that can inform your overall analysis. Finally, you will create a new DataFrame to organize and present your findings. The module includes a graded quiz to test your knowledge.
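
The outlier and correlation steps described above could be sketched as follows. The 1.5 × IQR rule and Pearson's r are common choices assumed here for illustration, since the description does not say which methods the labs use. The ages and compensation figures are invented, and the helper names `iqr_bounds` and `pearson` are hypothetical.

```python
from math import sqrt
from statistics import mean, quantiles


def iqr_bounds(values):
    """Return (lower, upper) fences using the 1.5 * IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented sample: age in years, compensation in thousands.
ages = [22, 25, 27, 30, 31, 35, 38, 90]
comp = [40, 45, 50, 58, 60, 66, 75, 30]

lower, upper = iqr_bounds(ages)
outliers = [a for a in ages if a < lower or a > upper]

# Keep only the rows whose age falls inside the fences, then correlate.
kept = [(a, c) for a, c in zip(ages, comp) if lower <= a <= upper]
r = pearson([a for a, _ in kept], [c for _, c in kept])
print(outliers, round(r, 3))
```

Whether to drop flagged rows or keep them depends on the analysis. A flagged value may be a data-entry error or a genuine extreme.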

What's included

1 reading, 5 assignments, 4 app items

1 reading (total 2 minutes)
  • Assignment Overview (2 minutes)
5 assignments (total 92 minutes)
  • Graded Quiz: Exploratory Data Analysis (30 minutes)
  • Checklist: Exploratory Data Analysis (22 minutes)
  • Checklist: Analyzing the Data Distribution (14 minutes)
  • Checklist: Handling Outliers (14 minutes)
  • Checklist: Correlation (12 minutes)
4 app items (total 120 minutes)
  • Lab 12: Exploratory Data Analysis (30 minutes)
  • Lab 13: Finding How The Data Is Distributed (30 minutes)
  • Lab 14: Finding Outliers (30 minutes)
  • Lab 15: Finding Correlation (30 minutes)

In this module, you will apply essential data visualization techniques to extract meaningful insights from the Stack Overflow survey data set. You will start by visualizing the distribution of data using histograms and box plots to understand the spread of compensation and age. Next, you will explore relationships between features through scatter plots and bubble plots, followed by examining the composition of data with pie charts and stacked charts. Additionally, you will compare data across categories using line and bar charts. The module includes a graded quiz that assesses your knowledge of these concepts, ensuring you are well prepared for further analysis in your final project.

What's included

1 reading, 6 assignments, 9 app items

1 reading (total 2 minutes)
  • Assignment Overview (2 minutes)
6 assignments (total 78 minutes)
  • Graded Quiz: Data Visualization (30 minutes)
  • Checklist: Data Visualization (16 minutes)
  • Checklist: Visualizing Distribution of Data (8 minutes)
  • Checklist: Visualizing Relationship (8 minutes)
  • Checklist: Visualizing Composition of Data (8 minutes)
  • Checklist: Visualizing Comparison of Data (8 minutes)
9 app items (total 330 minutes)
  • Lab 16: Data Visualization (60 minutes)
  • Lab 17: Histograms (30 minutes)
  • Lab 18: Box Plots (30 minutes)
  • Lab 19: Scatter Plot (30 minutes)
  • Lab 20: Bubble Plots (30 minutes)
  • Lab 21: Pie Charts (30 minutes)
  • Lab 22: Stacked Charts (30 minutes)
  • Lab 23: Line Charts (60 minutes)
  • Lab 24: Bar Charts (30 minutes)

In this module, you will create dashboards from Stack Overflow survey data using either IBM Cognos Analytics or Google Looker Studio. The assignment is divided into Part A: Building a Dashboard with IBM Cognos Analytics and Part B: Building a Dashboard with Google Looker Studio. You will design a dashboard with sections on Current Technology Usage, Future Technology Trends, and Demographics. After completing the assignment, you will be required to submit a link to the Cognos or Looker Studio dashboard you built. The module also includes a checklist that helps you ensure you have completed all necessary tasks before moving on.

What's included

1 reading, 2 assignments, 2 plugins

1 reading (total 10 minutes)
  • Assignment Overview (10 minutes)
2 assignments (total 40 minutes)
  • Graded Quiz: Building a Dashboard (30 minutes)
  • Checklist: Dashboards (10 minutes)
2 plugins (total 60 minutes)
  • Lab 25: Option A - Building A Dashboard With IBM Cognos Analytics (45 minutes)
  • Lab 26: Option B - Building A Dashboard With Google Looker Studio (15 minutes)

In the final module, you will focus on presenting your data findings effectively. You will begin by exploring key elements contributing to a successful data findings report, including structuring your report, using best practices for data visualization, and presenting complex information in an engaging, accessible format. The module also includes labs covering basics in PowerPoint, foundational presentation techniques, and saving your presentation as a PDF to ensure a polished, professional final product. Finally, you will complete and submit a final presentation highlighting insights derived from the Stack Overflow Developer Survey data for evaluation through AI Grading or Peer Review.

What's included

2 videos, 4 readings, 1 peer review, 1 app item, 3 plugins

2 videos (total 8 minutes)
  • Elements Of A Successful Data Findings Report (4 minutes)
  • Best Practices For Presenting Your Findings (3 minutes)
4 readings (total 30 minutes)
  • Structure Of A Report (20 minutes)
  • Final Project Submission Guidelines and Deliverables (5 minutes)
  • Congratulations and Next Steps (3 minutes)
  • Thanks from the Course Team (2 minutes)
1 peer review (total 60 minutes)
  • Option 2: Peer Graded - Final Project Submission and Evaluation (60 minutes)
1 app item (total 60 minutes)
  • Option 1: AI Graded - Final Project: Submission and Evaluation (60 minutes)
3 plugins (total 40 minutes)
  • (Optional) Lab 27: Getting Started With PowerPoint For The Web (20 minutes)
  • (Optional) Lab 28: Basics of PowerPoint (15 minutes)
  • (Optional) Lab 29: Save your PowerPoint Presentation as PDF (5 minutes)

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Instructor ratings
4.6 (394 ratings)
IBM
55 courses, 5,143,420 learners

Why people choose Coursera for their career

Felipe M.

Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Learner reviews

  • 5 stars: 77.71%
  • 4 stars: 14.93%
  • 3 stars: 4.08%
  • 2 stars: 1.18%
  • 1 star: 2.08%

Showing 3 of 1345

TT · Reviewed on Sep 26, 2021

Great. I practiced data visualization on IBM's Cognos Analytics software. It's a great piece of software. I then learned how to make a presentation to present the results of the analysis.

TS · Reviewed on Aug 8, 2021

Course was excellent! Enjoyed the final project and being able to work with authentic data that helps understand IT career trends.

MK · Reviewed on Jul 17, 2022

A good beginner friendly course in data analysis. Using the jupyter notebook was easier than going over to some websites to open the same jupyter notebook.

Frequently asked questions

Data analysis is the process of inspecting, cleaning, transforming, and modeling data to uncover useful information, make informed decisions, and support business strategies. It involves techniques such as statistical analysis, visualization, and reporting to identify trends, patterns, and insights from datasets.

When you subscribe to a course that is part of a Certificate, you’re automatically subscribed to the full Certificate. Visit your learner dashboard to track your progress.

This course is completely online, so there’s no need to show up to a classroom in person. You can access your lectures, readings, and assignments anytime and anywhere through the web or your mobile device.

Each course you complete earns you an IBM course badge that certifies successful completion. Add this credential to your LinkedIn profile, resume, or CV, and share it on social media or in performance reviews.

This capstone utilizes the extensive Stack Overflow Developer Survey dataset to simulate a large-scale corporate analysis. You will gain hands-on experience building dynamic, interactive business intelligence dashboards using industry-standard BI platforms: IBM Cognos Analytics or Google Looker Studio. Your dashboards will be meticulously structured into professional reporting sections tracking Current Technology Usage, Future Technology Trends, and global Developer Demographics, providing a highly visible asset to share with employers.

Rather than giving you clean, pre-packaged data, this capstone requires you to manage the entire data lifecycle. You will write Python code to execute programmatic data collection via REST APIs (including pagination handling) and web scrape HTML tables using libraries like Pandas and NumPy. From there, you will perform essential data wrangling—such as duplicate removal, outlier detection, and data imputation—to prepare data frames for deep statistical analysis using SciPy and Scikit-learn.

The ultimate goal of a data analyst is data storytelling. For your final submission, you will compile your Exploratory Data Analysis (EDA) and visualizations into a comprehensive, boardroom-ready executive report and slide presentation. You will apply presentation best practices to translate complex technical correlations, histograms, and scatter plots into an accessible format for organizational leaders. Your work will undergo rigorous evaluation via peer review or AI grading to certify your readiness for a professional analytics role.

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Financial aid available.

¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.