AI Workflow: Feature Engineering and Bias Detection

This course is part of the IBM AI Enterprise Workflow Specialization

Instructors: Mark J Grover, Ray Lopez, Ph.D.

9,194 already enrolled

2 modules

Gain insight into a topic and learn the fundamentals.

4.4

82 reviews

Advanced level

Designed for those already in the industry

1 week to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

10 assignments

Taught in English

Build your subject-matter expertise

This course is part of the IBM AI Enterprise Workflow Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 2 modules in this course

This is the third course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order as they are not individual independent courses, but part of a workflow where each course builds on the previous ones.

Course 3 introduces you to the next stage of the workflow for our hypothetical media company. In this stage of work you will learn best practices for feature engineering, handling class imbalances, and detecting bias in the data. Class imbalances can seriously affect the validity of your machine learning models, and mitigating bias in data is essential to reducing the risk associated with biased models. These topics are followed by sections on best practices for dimension reduction, outlier detection, and unsupervised learning techniques for finding patterns in your data. The case studies focus on topic modeling and data visualization.

By the end of this course you will be able to:
1. Employ the tools that help address class imbalance issues
2. Explain the ethical considerations regarding bias in data
3. Employ AI Fairness 360 open source libraries to detect bias in models
4. Employ dimension reduction techniques for both the EDA and transformation stages
5. Describe topic modeling techniques in natural language processing
6. Use topic modeling and visualization to explore text data
7. Employ outlier handling best practices in high-dimensional data
8. Employ outlier detection algorithms as a quality assurance tool and a modeling tool
9. Employ unsupervised learning techniques using pipelines as part of the AI workflow
10. Employ basic clustering algorithms

Who should take this course?
This course targets existing data science practitioners who have expertise building machine learning models and who want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring data scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses.

What skills should you have?
It is assumed that you have completed Courses 1 and 2 of the IBM AI Enterprise Workflow specialization and that you have a solid understanding of the following topics prior to starting this course:
Fundamental understanding of linear algebra
Understanding of sampling, probability theory, and probability distributions
Knowledge of descriptive and inferential statistical concepts
General understanding of machine learning techniques and best practices
Practiced understanding of Python and the packages commonly used in data science: NumPy, Pandas, matplotlib, scikit-learn
Familiarity with IBM Watson Studio
Familiarity with the design thinking process
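The course description states that class imbalances can seriously affect the validity of machine learning models. The following is a minimal sketch of why, not material taken from the course: the 95/5 label split, the always-negative "model", and the helper names are all hypothetical, and only the Python standard library is used. It compares plain accuracy with balanced accuracy (the mean of per-class recall) and then applies naive random oversampling, one of several possible sampling techniques.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical data: 95 negative examples and 5 positive examples.
y_true = [0] * 95 + [1] * 5
# A useless "model" that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))           # 0.95
print(balanced_accuracy(y_true, y_pred))  # 0.5

# Rebalance the (hypothetical) one-feature dataset by oversampling.
rows = [[float(i)] for i in range(100)]
X_bal, y_bal = random_oversample(rows, y_true)
print(Counter(y_bal))
```

The always-negative predictor looks strong on accuracy yet never finds a positive example, which balanced accuracy exposes. If oversampling is used in practice, it is usually applied to the training split only, so that duplicated rows do not leak into evaluation data.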

This module will introduce you to skills required for effective feature engineering in today's business enterprises. The skills are presented as a series of best practices representing years of practical experience.

What's included

6 videos, 14 readings, 5 assignments, 1 ungraded lab

6 videos•Total 31 minutes

Data Transformations Overview•3 minutes
Introduction to Class Imbalance•2 minutes
Class Imbalance Deep Dive•9 minutes
Introduction to Dimensionality Reduction•2 minutes
Dimension Reduction•13 minutes
Case Study Intro / Feature Engineering•2 minutes

14 readings•Total 162 minutes

Data Transformation: Through the eyes of our Working Example•3 minutes
Transforms with scikit-learn•3 minutes
Pipelines•3 minutes
Class imbalance: Through the Eyes of our Working Example•3 minutes
Class Imbalance•5 minutes
Sampling Techniques•2 minutes
Models that Naturally Handle Imbalance•2 minutes
Data Bias•2 minutes
Dimensionality Reduction: Through the Eyes of Our Working Example•3 minutes
Why is Dimensionality Reduction Important?•3 minutes
Dimensionality Reduction and Topic models•5 minutes
Topic modeling: Through the Eyes of our Working Example•3 minutes
Getting Started with the Topic Modeling Case Study (hands-on)•120 minutes
Data Transforms and Feature Engineering: Summary/Review•5 minutes

5 assignments•Total 103 minutes

Data Transforms and Feature Engineering: End of Module Quiz•10 minutes
Getting Started: Check for Understanding•30 minutes
Class Imbalance, Data Bias: Check for Understanding•30 minutes
Dimensionality Reduction: Check for Understanding•3 minutes
CASE STUDY - Topic Modeling: Check for Understanding•30 minutes

1 ungraded lab•Total 60 minutes

Case Study Answer Key Notebook•60 minutes

This module continues the discussion of feature engineering skills for practicing data scientists, with a focus on outliers and the use of unsupervised learning techniques for finding patterns.
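As a rough illustration of outlier detection used as a quality assurance check, here is a small sketch of Tukey's interquartile range (IQR) rule. It is an assumption chosen for brevity, not necessarily a method the course teaches: the function name `iqr_outliers`, the multiplier `k=1.5`, and the `daily_streams` values are hypothetical, and only the Python standard library (3.8+) is used.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily stream counts with one suspicious spike.
daily_streams = [12, 14, 13, 15, 14, 13, 16, 15, 14, 120]
print(iqr_outliers(daily_streams))  # [120]
```

A univariate rule like this is only a first pass. In high-dimensional data, points can be unusual in combination without being extreme on any single feature, which is where dedicated outlier detection algorithms come in.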

What's included

5 videos, 11 readings, 5 assignments, 1 ungraded lab

5 videos•Total 16 minutes

Exploring IBM's AI Fairness 360 Toolkit•2 minutes
Introduction to Outliers•3 minutes
Outlier Detection•3 minutes
Introduction to Unsupervised learning•2 minutes
Unsupervised Learning•6 minutes

11 readings•Total 172 minutes

ai360: Through the Eyes of our Working Example•3 minutes
Introduction to 360 (hands-on)•15 minutes
Outlier Detection: Through the Eyes of our Working Example•3 minutes
Outliers•3 minutes
Unsupervised learning: Through the Eyes of our Working Example•3 minutes
An Overview of Unsupervised Learning•2 minutes
Clustering•3 minutes
Clustering Evaluation•3 minutes
Clustering: Through the Eyes of our Working Example•3 minutes
Getting Started with the Clustering Case Study (hands-on)•130 minutes
Pattern Recognition and Data Mining Best Practices: Summary/Review•4 minutes

5 assignments•Total 132 minutes

Pattern Recognition and Data Mining Best Practices: End of Module Quiz•12 minutes
ai360 Tutorial: Check for Understanding•30 minutes
Outlier Detection: Check for Understanding•30 minutes
Unsupervised Learning: Check for Understanding•30 minutes
CASE STUDY - Clustering: Check for Understanding•30 minutes

1 ungraded lab•Total 60 minutes

Case Study Answer Key Notebook•60 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Instructor ratings

4.1 (16 ratings)

Mark J Grover

13 Courses•168,824 learners

Offered by

IBM

Explore more from Machine Learning

AI Workflow: Business Priorities and Data Ingestion (IBM, Course)
AI Workflow: Machine Learning, Visual Recognition and NLP (IBM, Course)
AI Workflow: AI in Production (IBM, Course)
AI Workflow: Enterprise Model Deployment (IBM, Course)

Learner reviews

5 stars: 62.19%
4 stars: 24.39%
3 stars: 8.53%
2 stars: 2.43%
1 star: 2.43%

Reviewed on Jul 5, 2020

Dear Team, Namaste! All the instructors were very helpful, with quick replies to any queries and clear explanations of concepts. Thanks & regards, Neela Mistry

Reviewed on May 3, 2020

It's quite good but the content could be more in-depth as an 'advance' course.

Frequently asked questions

This course assumes that you are already familiar with basic data science concepts including probability and statistics, linear algebra, machine learning, and the use of Python and Jupyter. It is assumed you have completed the first two courses of the specialization: AI Workflow: Business Priorities and Data Ingestion, and AI Workflow: Data Analysis and Hypothesis Testing.

No. The certification exam is administered by Pearson VUE and must be taken at one of their testing facilities. You may visit their site at https://home.pearsonvue.com/ for more information.

Please visit the Pearson VUE web site at https://home.pearsonvue.com/ for the latest information on taking the AI Enterprise Workflow certification test.

It is highly recommended that you have at least a basic working knowledge of design thinking and Watson Studio prior to taking this course. Please visit the IBM Skills Gateway at http://ibm.com/training/badges and "Find a Badge" related to "design thinking" or "Watson Studio". From there you will be directed to courses covering these topics.

No. Most of the exercises may be completed with open source tools running on your personal computer. However, the exercises are designed with an enterprise focus and are intended to be run in an enterprise environment that allows for easier sharing and collaboration. The exercises in the last two modules of the course are heavily focused on deployment and testing of machine learning models and use the IBM Watson tooling found on the IBM Cloud.

Yes. All IBM Cloud Data and AI services are based upon open source technologies.

The exercises in the course may be completed by anyone using the IBM Cloud "Lite" plan, which is free for use.

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

URL: https://www.coursera.org/learn/ibm-ai-workflow-feature-engineering-bias-detection