AI Workflow: Data Analysis and Hypothesis Testing
This course is part of the IBM AI Enterprise Workflow Specialization
Instructors: Mark J Grover
9,963 already enrolled
128 reviews
Skills you'll gain
- Matplotlib
- Statistical Methods
- Data Presentation
- Probability & Statistics
- Statistical Analysis
- Data Preprocessing
- Dashboard Creation
- Statistics
- Data Visualization Software
- Data Science
- Exploratory Data Analysis
- Machine Learning
- Data Analysis
- Statistical Hypothesis Testing
- Statistical Inference
- Probability Distribution
There are 2 modules in this course
This is the second course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order: they are not independent courses, but parts of a workflow in which each course builds on the previous ones.
In this course you will begin your work for a hypothetical streaming media company by doing exploratory data analysis (EDA). Best practices for data visualization, handling missing data, and hypothesis testing are introduced as part of that work. You will learn techniques of estimation with probability distributions and how to extend these estimates to apply null hypothesis significance tests. You will apply what you learn through two hands-on case studies: data visualization, and multiple testing using a simple pipeline.
By the end of this course you should be able to:
1. List several best practices concerning EDA and data visualization
2. Create a simple dashboard in Watson Studio
3. Describe strategies for dealing with missing data
4. Explain the difference between imputation and multiple imputation
5. Employ common distributions to answer questions about event probabilities
6. Explain the investigative role of hypothesis testing in EDA
7. Apply several methods for dealing with multiple testing
Who should take this course?
This course targets existing data science practitioners who have expertise building machine learning models and who want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring data scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses.
What skills should you have?
It is assumed that you have completed Course 1 of the IBM AI Enterprise Workflow specialization and have a solid understanding of the following topics prior to starting this course:
- Fundamental understanding of linear algebra
- Understanding of sampling, probability theory, and probability distributions
- Knowledge of descriptive and inferential statistical concepts
- General understanding of machine learning techniques and best practices
- Practiced understanding of Python and the packages commonly used in data science: NumPy, pandas, matplotlib, scikit-learn
- Familiarity with IBM Watson Studio
- Familiarity with the design thinking process
Exploratory data analysis is mostly about gaining insight through visualization and hypothesis testing. This unit looks at EDA, data visualization, and missing values. No single missing-value strategy is best for every model: a strategy that works well for one model may give worse predictive performance for another.
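To make the point about missing-value strategies concrete, here is a minimal sketch of simple imputation using only the Python standard library. The `impute` helper and the `hours` data are hypothetical illustrations, not material taken from the course, and the course itself may use pandas or scikit-learn instead.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values.

    Hypothetical helper for illustration only.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Made-up weekly streaming hours with two missing entries and one outlier (40.0).
hours = [2.0, 3.0, None, 4.0, 40.0, None, 3.0]

mean_filled = impute(hours, "mean")
median_filled = impute(hours, "median")

# The outlier pulls the mean fill value well above the median fill value.
print("mean fill:", mean_filled[2])
print("median fill:", median_filled[2])
```

With an outlier in the column, the two strategies fill the gaps with very different values, which is one reason the choice of strategy can change downstream model performance.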
What's included
6 videos, 11 readings, 4 assignments, 2 peer reviews, 1 ungraded lab
6 videos•Total 26 minutes
- EDA Overview•4 minutes
- Introduction to Data Visualizations•3 minutes
- Data Visualizations•8 minutes
- Introduction to Missing Values•4 minutes
- Missing Values•4 minutes
- Case Study Introduction•2 minutes
11 readings•Total 37 minutes
- Why is Exploratory Data Analysis Necessary?•3 minutes
- Data Visualization: Through the Eyes of Our Working Example•3 minutes
- Getting Started / Unit Materials•2 minutes
- Data Visualization in Python•3 minutes
- Missing Data: Introduction•2 minutes
- Strategies for Missing Data•3 minutes
- Categories of Missing Data•2 minutes
- Simple Imputation•2 minutes
- Bayesian Imputation•10 minutes
- Case Study: Getting started•2 minutes
- Summary/Review•5 minutes
4 assignments•Total 95 minutes
- Data Analysis Module Quiz•5 minutes
- Check for Understanding: EDA•30 minutes
- Check for Understanding: Data Visualization•30 minutes
- Check for Understanding: Missing Data•30 minutes
2 peer reviews•Total 105 minutes
- Visualization and Imputation•45 minutes
- Build a Deliverable!•60 minutes
1 ungraded lab•Total 60 minutes
- Case Study Answer Key Notebook•60 minutes
Data scientists employ a broad range of statistical tools to analyze data and reach conclusions from data. This unit focuses on the foundational techniques of estimation with probability distributions and extending these estimates to apply null hypothesis significance tests.
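As a rough illustration of moving from a probability distribution to a null hypothesis significance test, the sketch below computes an exact one-sided binomial p-value with the standard library. The churn scenario, the numbers, and the helper names `binom_pmf` and `binom_tail` are assumptions made for this example and do not come from the course materials.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


# Hypothetical scenario: the baseline monthly churn rate is assumed to be 10%.
# In a sample of 20 subscribers, 5 cancel. Is that surprising under the baseline?
n, k, p0 = 20, 5, 0.10

# One-sided p-value: the probability of seeing 5 or more cancellations
# if the null hypothesis (churn rate = 10%) were true.
p_value = binom_tail(k, n, p0)

alpha = 0.05  # conventional significance level, chosen here for illustration
reject = p_value < alpha

print(f"p-value = {p_value:.4f}, reject null at alpha={alpha}: {reject}")
```

Under these made-up numbers the p-value is roughly 0.043, so the null hypothesis would be rejected at the 0.05 level, though only narrowly. A result that close to the threshold is the kind the module's p-value limitations reading is likely to caution about.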
What's included
3 videos, 14 readings, 3 assignments, 1 ungraded lab
3 videos•Total 16 minutes
- Introduction to hypothesis testing•3 minutes
- Hypothesis Testing•10 minutes
- Case Study Introduction•2 minutes
14 readings•Total 181 minutes
- TUTORIAL: IBM Watson Studio dashboard•10 minutes
- Hypothesis Testing: Through the eyes of our Working Example•10 minutes
- Overview•2 minutes
- Statistical Inference•2 minutes
- Business Scenarios and Probability•3 minutes
- Variants on t-tests•2 minutes
- One-way Analysis of Variance (ANOVA)•4 minutes
- p-value Limitations•10 minutes
- Multiple Testing•4 minutes
- Explain Methods for Dealing with Multiple Testing•3 minutes
- Getting Started•3 minutes
- Import the Data•4 minutes
- Data Processing (Includes Assessment)•120 minutes
- Summary/Review•4 minutes
3 assignments•Total 65 minutes
- Data Investigation Module Quiz•5 minutes
- Check for Understanding: Hypothesis Testing•30 minutes
- Check for Understanding: Hypothesis Testing Limitations•30 minutes
1 ungraded lab•Total 60 minutes
- Case Study Answer Key Notebook•60 minutes
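The multiple testing readings in this module concern what happens when many hypotheses are tested at once. The page does not say which correction methods the course covers, so the sketch below shows two widely used ones as an assumption: the Bonferroni correction and the Benjamini-Hochberg procedure. The p-values are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis when its p-value is at most alpha / m (m = number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for controlling the false discovery rate."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis whose rank is at or below the cutoff.
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Made-up p-values from five hypothetical tests, in their original order.
p_values = [0.041, 0.001, 0.20, 0.025, 0.012]

print("Bonferroni:        ", bonferroni(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(p_values))
```

On this input Bonferroni rejects only the smallest p-value, while Benjamini-Hochberg rejects three. That reflects the usual trade-off: Bonferroni controls the chance of any false positive and is more conservative, whereas Benjamini-Hochberg controls the expected share of false positives among the rejections.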
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Learner reviews
- 5 stars: 60.15%
- 4 stars: 19.53%
- 3 stars: 13.28%
- 2 stars: 2.34%
- 1 star: 4.68%
Showing 3 of 128
Reviewed on Apr 2, 2020
More practicality and assignments should be there, which would be more helpful for the learners.
Reviewed on Jul 6, 2020
Very informative, and the labs for hands-on sessions were useful.
Reviewed on Apr 15, 2026
This course provided a clear and structured introduction to data analysis and hypothesis testing within an AI workflow.
Frequently asked questions
This course assumes that you are already familiar with basic data science concepts including probability and statistics, linear algebra, machine learning, and the use of Python and Jupyter. Additionally, you should have already completed the first course in this specialization: AI Workflow: Business Priorities and Data Ingestion.
No. The certification exam is administered by Pearson VUE and must be taken at one of their testing facilities. You may visit their site at https://home.pearsonvue.com/ for more information.
Please visit the Pearson VUE website at https://home.pearsonvue.com/ for the latest information on taking the AI Enterprise Workflow certification test.
¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.
