Applied Unsupervised Learning in Python

University of Michigan

This course is part of the More Applied Data Science with Python Specialization

Instructor: Kevyn Collins-Thompson

4 modules

Advanced level

3 weeks to complete at 10 hours a week

Flexible schedule: learn at your own pace

What you'll learn

Apply unsupervised learning methods, such as dimensionality reduction, manifold learning, and density estimation, to transform and visualize data.
Understand, evaluate, optimize, and correctly apply clustering algorithms using hierarchical, partitioning, and density-based methods.
Use topic modeling to find important themes in text data and use word embeddings to analyze patterns in text data.
Manage missing data using supervised and unsupervised imputation methods, and use semi-supervised learning to work with partially-labeled datasets.

Tools you'll learn

Python Programming

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

21 assignments

Taught in English

This course is part of the More Applied Data Science with Python Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

There are 4 modules in this course

In “Applied Unsupervised Learning in Python,” you will learn how to use algorithms to find interesting structure in datasets. You will practice applying, interpreting, and refining unsupervised machine learning models to solve a diverse set of problems on real-world datasets.

This course will show you how to explore unlabelled data using several techniques: dimensionality reduction and manifold learning for condensing and visualizing high-dimensional data, clustering to reveal interesting groups and outliers, topic modeling for summarizing important themes in text, methods for dealing with missing data, and more. The course also covers best practices associated with different techniques and demonstrates how unsupervised learning can be used to improve supervised prediction.

This is the second course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. We recommend that all learners complete the Applied Data Science with Python specialization before beginning this course.

Welcome to Module 1! In this module, we will learn the basic unsupervised learning methods that focus on transforming data: dimensionality reduction, manifold learning, and density estimation. We will analyze realistic datasets using the scikit-learn library. At the end of this module, our assignment is to apply Principal Component Analysis (PCA) to gain insight into a large real-world dataset. We will also use manifold learning methods such as t-SNE to visualize complex structure, and kernel density estimation to estimate probabilities of conditional events. Let’s begin!
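To give a rough sense of what these transformation methods look like in scikit-learn, here is a minimal sketch. It is not taken from the course materials: the synthetic data, the choice of two components, and the bandwidth value are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KernelDensity

# Synthetic stand-in for a real dataset (an assumption for illustration):
# 200 samples with 5 features whose variance lies mostly along 2 hidden directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# Dimensionality reduction: project the data onto its top two principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
explained = pca.explained_variance_ratio_.sum()

# Density estimation: fit a Gaussian kernel density estimate in the reduced space.
# The bandwidth of 0.5 is an arbitrary choice; in practice it would be tuned.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_2d)
log_density = kde.score_samples(X_2d)  # log of the estimated density at each point

print(f"Projected shape: {X_2d.shape}")
print(f"Variance explained by 2 components: {explained:.3f}")
```

Because the synthetic data is built from two latent directions plus a little noise, two components should capture nearly all of the variance; real datasets rarely behave this cleanly.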

What's included

18 videos, 7 readings, 7 assignments, 1 programming assignment, 1 discussion prompt

18 videos•Total 240 minutes

Welcome to Applied Unsupervised Learning in Python•6 minutes
Dimensionality Reduction: A Brief Introduction•17 minutes
Dimensionality Reduction with Feature Selection: Information Gain•16 minutes
Dimensionality Reduction with Feature Selection: Principal Component Analysis (PCA) Explained•21 minutes
Visualizing PCA Results: Foundations•8 minutes
Visualizing PCA Results: Biplots and Variance Plots•17 minutes
Singular Value Decomposition (SVD)•22 minutes
Applications of SVD in Data Science•14 minutes
Manifold Learning: Multidimensional Scaling (Part 1)•18 minutes
Manifold Learning: Multidimensional Scaling (Part 2)•12 minutes
Manifold Learning: t-Distributed Stochastic Neighbor Embedding (t-SNE)•14 minutes
Manifold Learning: Uniform Manifold Approximation and Projection (UMAP)•7 minutes
Density Estimation Part 1: Probability Density Functions•13 minutes
Density Estimation Part 1: Parametric vs. Non-Parametric Density Estimator•13 minutes
Density Estimation Part 2: Local Density Estimators•11 minutes
Density Estimation Part 2: Kernel Density Estimators•16 minutes
Density Estimation Part 2: Evaluating Density Estimators•5 minutes
Density Estimation Part 3: Local Density Estimators and Gaussian Mixture Models (GMMs)•12 minutes

7 readings•Total 85 minutes

MADSwPy Certificate Roadmap•5 minutes
Course Syllabus•10 minutes
Additional Resources•10 minutes
Help Us Learn About You•5 minutes
Ten Quick Tips for Effective Dimensionality Reduction•45 minutes
Introduction to Module 1 Programming Assignment: An Introduction to Unsupervised Learning•5 minutes
Module 1 Optional Readings & Resources•5 minutes

7 assignments•Total 105 minutes

Time to Practice: Dimensionality Reduction•15 minutes
Time to Practice: Principal Component Analysis (PCA)•15 minutes
Time to Practice: Singular Value Decomposition (SVD)•15 minutes
Time to Practice: Manifold Learning (Multidimensional scaling: Parts 1 & 2)•15 minutes
Time to Practice: Manifold Learning (t-SNE, and UMAP)•15 minutes
Time to Practice: Density Estimation Methods (Part 1)•15 minutes
Time to Practice: Density Estimation Methods (Parts 2 & 3)•15 minutes

1 programming assignment•Total 180 minutes

Create & Submit Module 1 Assignment•180 minutes

1 discussion prompt•Total 15 minutes

Meet Your Fellow Learners•15 minutes

Welcome to Module 2! In this module, we will learn about clustering, another critical and widely used unsupervised learning method. We will learn about the most important families of clustering algorithms, such as hierarchical methods (agglomerative bottom-up, divisive top-down), partitioning methods (k-means, k-medoids), and density-based methods (DBSCAN). We will also learn how to evaluate and optimize cluster quality. At the end of this module, our assignment is to apply a variety of these clustering approaches to realistic datasets using scikit-learn's clustering capabilities. Let’s begin!
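As an illustration of the three families named above, the following sketch runs one representative of each on the same toy data and scores the result with the silhouette coefficient. The dataset, the hyperparameters (`n_clusters=3`, `eps=0.8`, `min_samples=5`), and the choice of silhouette as the quality measure are assumptions for this example, not the course's own assignment setup.

```python
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Toy data: three well-separated 2-D blobs (an assumption for illustration).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)

models = {
    "k-means": KMeans(n_clusters=3, n_init=10, random_state=42),
    "agglomerative (Ward)": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    "DBSCAN": DBSCAN(eps=0.8, min_samples=5),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    kept = labels != -1  # DBSCAN marks noise points with the label -1
    n_clusters = len(set(labels[kept]))
    # The silhouette coefficient is only defined for two or more clusters.
    score = silhouette_score(X[kept], labels[kept]) if n_clusters >= 2 else None
    results[name] = (n_clusters, score)
    print(f"{name}: clusters={n_clusters}, silhouette={score}")
```

Note that k-means and Ward clustering are told how many clusters to find, while DBSCAN infers the number from the density parameters, which is why its cluster count is reported rather than assumed.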

What's included

10 videos, 3 readings, 5 assignments, 1 programming assignment

10 videos•Total 141 minutes

A Brief Introduction to Clustering (Part 1)•18 minutes
A Brief Introduction to Clustering (Part 2)•18 minutes
Hierarchical Clustering Part 1: Introduction•15 minutes
Hierarchical Clustering Part 2: Ward's Method•11 minutes
Hierarchical Clustering Part 2: Dendrograms•13 minutes
Introduction to K-means•12 minutes
Applying K-means in Practice•13 minutes
DBSCAN Clustering•12 minutes
Evaluating Cluster Quality (Part 1)•13 minutes
Evaluating Cluster Quality (Part 2)•18 minutes

3 readings•Total 30 minutes

Cluster Labeling•20 minutes
Introduction to Module 2 Assignment: Clustering•5 minutes
Module 2 Optional Readings & Resources•5 minutes

5 assignments•Total 75 minutes

Time to Practice: Clustering Overview•15 minutes
Time to Practice: Hierarchical Clustering•15 minutes
Time to Practice: K-means Clustering•15 minutes
Time to Practice: DBSCAN Clustering•15 minutes
Time to Practice: Cluster Quality•15 minutes

1 programming assignment•Total 180 minutes

Create & Submit Module 2 Assignment•180 minutes

Welcome to Module 3! In this module, we will learn about estimating latent variables, another important area of unsupervised learning, especially for text-based applications. We will focus first on text representations. Topic modeling is a form of latent variable estimation, which we will learn about via two different methods: Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF). We will also survey word embeddings to learn how to represent words with vectors in semantically useful ways. At the end of this module, our assignment is to solve problems by analyzing topic structure in a large document collection and applying word embeddings to an NLP-related task. Let’s begin!

What's included

8 videos, 2 readings, 5 assignments, 1 programming assignment

8 videos•Total 129 minutes

How to Represent Text as a Vector: A Typical Workflow•21 minutes
Text Processing in SciKit-Learn•13 minutes
Introduction to Topic Modeling•18 minutes
Latent Dirichlet Allocation (LDA)•7 minutes
Using LDA with Scikit-Learn•12 minutes
Non-Negative Matrix Factorization (NMF)•19 minutes
Word Embeddings Technique #1: Word2vec•21 minutes
Word Embeddings Technique #2: GloVe•16 minutes

2 readings•Total 10 minutes

Introduction to Module 3 Assignment: Text Representations, Topic Modeling, and Word Embeddings•5 minutes
Module 3 Optional Readings & Resources•5 minutes

5 assignments•Total 75 minutes

Time to Practice: Representing Text as a Vector•15 minutes
Time to Practice: Text Processing in SciKit-Learn•15 minutes
Time to Practice: Latent Dirichlet Allocation (LDA)•15 minutes
Time to Practice: Non-Negative Matrix Factorization (NMF)•15 minutes
Time to Practice: Word2vec & GloVe•15 minutes

1 programming assignment•Total 180 minutes

Create & Submit Module 3 Assignment•180 minutes

Welcome to Module 4, our last module of the course! We wrap up by learning how unsupervised methods can be integrated with supervised learning methods to improve prediction performance. A key topic in this module is imputation methods for dealing with missing data. We will also look at various special topics, including extensions of unsupervised learning that are used at the cutting edge of today's technology: semi-supervised learning and self-supervised learning. At the end of this module, our assignment is to apply methods and techniques for imputing missing data and semi-supervised learning, with the underlying theme being how unsupervised learning can improve supervised learning. Let’s begin!
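The sketch below shows, under stated assumptions, what imputation and semi-supervised learning can look like in scikit-learn. The Iris dataset, the 10% and 80% masking rates, and the helper `rmse` are choices made for this example only. It uses `LabelSpreading`, a close relative of label propagation, as the semi-supervised model; that swap is also an assumption, not the course's prescribed method.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Part 1: imputation. Hide roughly 10% of the feature values, then fill them in.
X_missing = X.copy()
hidden = rng.random(X.shape) < 0.10
X_missing[hidden] = np.nan

mean_filled = SimpleImputer(strategy="mean").fit_transform(X_missing)
knn_filled = KNNImputer(n_neighbors=5).fit_transform(X_missing)


def rmse(filled):
    """Hypothetical helper: error on the hidden entries only."""
    return float(np.sqrt(np.mean((filled[hidden] - X[hidden]) ** 2)))


print(f"mean imputation RMSE: {rmse(mean_filled):.3f}")
print(f"kNN imputation RMSE:  {rmse(knn_filled):.3f}")

# Part 2: semi-supervised learning. Hide roughly 80% of the labels;
# scikit-learn uses -1 to mark an unlabeled sample.
y_partial = y.copy()
unlabeled = rng.random(len(y)) < 0.80
y_partial[unlabeled] = -1

model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_partial)
accuracy = float((model.transduction_[unlabeled] == y[unlabeled]).mean())
print(f"accuracy on originally unlabeled points: {accuracy:.3f}")
```

Both parts rely on the same idea: structure in the unlabeled or incomplete data (nearby points, correlated features) is used to fill in what is missing before or alongside supervised prediction.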

What's included

7 videos, 3 readings, 4 assignments, 1 programming assignment

7 videos•Total 110 minutes

Applying Unsupervised Learning to Supervised Learning Tasks•19 minutes
Imputation of Missing Data•14 minutes
Imputation with Scikit-Learn•20 minutes
A Brief Introduction to Semi-Supervised Learning•20 minutes
Label propagation with scikit-learn•15 minutes
A Brief Introduction to Self-Supervised Learning•18 minutes
Course Conclusion•4 minutes

3 readings•Total 20 minutes

Introduction to Module 4 Assignment: Applying Methods and Techniques for Data Imputation and Semi-Supervised Learning•5 minutes
Module 4 Optional Readings & Resources•5 minutes
Post-Course Survey•10 minutes

4 assignments•Total 60 minutes

Time to Practice: Applying Unsupervised Learning to Supervised Learning Tasks•15 minutes
Time to Practice: Imputation of Missing Data•15 minutes
Time to Practice: Semi-Supervised Learning•15 minutes
Time to Practice: Self-Supervised Learning•15 minutes

1 programming assignment•Total 180 minutes

Create & Submit Module 4 Assignment•180 minutes

Instructor

Kevyn Collins-Thompson

University of Michigan

4 Courses•331,781 learners

Offered by

University of Michigan

Frequently asked questions

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

URL: https://www.coursera.org/learn/applied-unsupervised-learning-in-python