Applied Unsupervised Learning in Python
This course is part of the More Applied Data Science with Python Specialization
Instructor: Kevyn Collins-Thompson
What you'll learn
Apply unsupervised learning methods, such as dimensionality reduction, manifold learning, and density estimation, to transform and visualize data.
Understand, evaluate, optimize, and correctly apply clustering algorithms using hierarchical, partitioning, and density-based methods.
Use topic modeling to find important themes in text data and use word embeddings to analyze patterns in text data.
Manage missing data using supervised and unsupervised imputation methods, and use semi-supervised learning to work with partially labeled datasets.
There are 4 modules in this course
In “Applied Unsupervised Learning in Python,” you will learn how to use algorithms to find interesting structure in datasets. You will practice applying, interpreting, and refining unsupervised machine learning models to solve a diverse set of problems on real-world datasets.
This course will show you how to explore unlabeled data using several techniques: dimensionality reduction and manifold learning for condensing and visualizing high-dimensional data, clustering to reveal interesting groups and outliers, topic modeling for summarizing important themes in text, methods for dealing with missing data, and more. This course also covers best practices associated with different techniques and demonstrates how unsupervised learning can be used to improve supervised prediction. This is the second course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the Applied Data Science with Python specialization prior to beginning this course.
Welcome to Module 1! In this module, we will learn the basic unsupervised learning methods that focus on transformation of data: dimensionality reduction, manifold learning, and density estimation. We will analyze realistic datasets using the scikit-learn library. At the end of this module, our assignment is to apply Principal Component Analysis (PCA) to gain insight into a large real-world dataset. We will use manifold learning methods such as t-SNE to visualize complex structure, and use kernel density estimation to estimate probabilities of conditional events. Let’s begin!
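As a rough illustration of the dimensionality reduction described above, the following is a minimal sketch of PCA with scikit-learn. It is not taken from the course materials: the Iris dataset bundled with scikit-learn stands in for the course's own data, and the choice of two components is an assumption made for easy plotting.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Iris is only a stand-in dataset (assumption); it has 150 rows and 4 features.
X = load_iris().data

# Standardize first so that each feature contributes on a comparable scale.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the two directions of greatest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Fraction of total variance captured by each retained component.
explained = pca.explained_variance_ratio_
print("Projected shape:", X_2d.shape)
print("Variance explained per component:", np.round(explained, 3))
```

The explained variance ratio is one common way to decide how many components to keep: if the first few components already account for most of the variance, the remaining ones can often be dropped with little loss.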
What's included
18 videos, 7 readings, 7 assignments, 1 programming assignment, 1 discussion prompt
18 videos•Total 240 minutes
- Welcome to Applied Unsupervised Learning in Python•6 minutes
- Dimensionality Reduction: A Brief Introduction•17 minutes
- Dimensionality Reduction with Feature Selection: Information Gain•16 minutes
- Dimensionality Reduction with Feature Selection: Principal Component Analysis (PCA) Explained•21 minutes
- Visualizing PCA Results: Foundations•8 minutes
- Visualizing PCA Results: Biplots and Variance Plots•17 minutes
- Singular Value Decomposition (SVD)•22 minutes
- Applications of SVD in Data Science•14 minutes
- Manifold Learning: Multidimensional Scaling (Part 1)•18 minutes
- Manifold Learning: Multidimensional Scaling (Part 2)•12 minutes
- Manifold Learning: t-Distributed Stochastic Neighbor Embedding (t-SNE)•14 minutes
- Manifold Learning: Uniform Manifold Approximation and Projection (UMAP)•7 minutes
- Density Estimation Part 1: Probability Density Functions•13 minutes
- Density Estimation Part 1: Parametric vs. Non-Parametric Density Estimator•13 minutes
- Density Estimation Part 2: Local Density Estimators•11 minutes
- Density Estimation Part 2: Kernel Density Estimators•16 minutes
- Density Estimation Part 2: Evaluating Density Estimators•5 minutes
- Density Estimation Part 3: Local Density Estimators and Gaussian Mixture Models (GMMs)•12 minutes
7 readings•Total 85 minutes
- MADSwPy Certificate Roadmap•5 minutes
- Course Syllabus•10 minutes
- Additional Resources•10 minutes
- Help Us Learn About You•5 minutes
- Ten Quick Tips for Effective Dimensionality Reduction•45 minutes
- Introduction to Module 1 Programming Assignment: An Introduction to Unsupervised Learning•5 minutes
- Module 1 Optional Readings & Resources•5 minutes
7 assignments•Total 105 minutes
- Time to Practice: Dimensionality Reduction•15 minutes
- Time to Practice: Principal Component Analysis (PCA)•15 minutes
- Time to Practice: Singular Value Decomposition (SVD)•15 minutes
- Time to Practice: Manifold Learning (Multidimensional Scaling: Parts 1 & 2)•15 minutes
- Time to Practice: Manifold Learning (t-SNE and UMAP)•15 minutes
- Time to Practice: Density Estimation Methods (Part 1)•15 minutes
- Time to Practice: Density Estimation Methods (Parts 2 & 3)•15 minutes
1 programming assignment•Total 180 minutes
- Create & Submit Module 1 Assignment•180 minutes
1 discussion prompt•Total 15 minutes
- Meet Your Fellow Learners•15 minutes
Welcome to Module 2! In this module, we will learn about clustering, another critical and widely used unsupervised learning method. We will learn about the most important families of clustering algorithms, such as hierarchical methods (agglomerative bottom-up, divisive top-down), partitioning methods (k-means, k-medoids), and density-based methods (DBSCAN). We will also learn how to evaluate and optimize cluster quality. At the end of this module, our assignment is to apply a variety of these clustering approaches to realistic datasets using scikit-learn's clustering capabilities. Let’s begin!
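To make the contrast between partitioning and density-based methods concrete, here is a minimal sketch using scikit-learn. The synthetic three-blob dataset and the parameter values (`eps`, `min_samples`, `n_clusters`) are illustrative assumptions, not settings from the course.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with three well-separated groups (illustrative assumption).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.6,
    random_state=0,
)

# Partitioning method: k-means needs the number of clusters up front.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Density-based method: DBSCAN infers the number of clusters from density
# and labels points in low-density regions as noise (-1).
dbscan_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)
n_dbscan_clusters = len(set(dbscan_labels) - {-1})

# Silhouette score ranges from -1 to 1; higher values indicate
# tighter, better-separated clusters.
score = silhouette_score(X, kmeans_labels)
print("k-means silhouette:", round(score, 3))
print("DBSCAN clusters found:", n_dbscan_clusters)
```

On data like this both methods should recover the same groups; they tend to diverge on clusters with irregular shapes or uneven density, which is where evaluating cluster quality becomes important.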
What's included
10 videos, 3 readings, 5 assignments, 1 programming assignment
10 videos•Total 141 minutes
- A Brief Introduction to Clustering (Part 1)•18 minutes
- A Brief Introduction to Clustering (Part 2)•18 minutes
- Hierarchical Clustering Part 1: Introduction•15 minutes
- Hierarchical Clustering Part 2: Ward's Method•11 minutes
- Hierarchical Clustering Part 2: Dendrograms•13 minutes
- Introduction to K-means•12 minutes
- Applying K-means in Practice•13 minutes
- DBSCAN Clustering•12 minutes
- Evaluating Cluster Quality (Part 1)•13 minutes
- Evaluating Cluster Quality (Part 2)•18 minutes
3 readings•Total 30 minutes
- Cluster Labeling•20 minutes
- Introduction to Module 2 Assignment: Clustering•5 minutes
- Module 2 Optional Readings & Resources•5 minutes
5 assignments•Total 75 minutes
- Time to Practice: Clustering Overview•15 minutes
- Time to Practice: Hierarchical Clustering•15 minutes
- Time to Practice: K-means Clustering•15 minutes
- Time to Practice: DBSCAN Clustering•15 minutes
- Time to Practice: Cluster Quality•15 minutes
1 programming assignment•Total 180 minutes
- Create & Submit Module 2 Assignment•180 minutes
Welcome to Module 3! In this module, we will learn about estimating latent variables, another important area of unsupervised learning, especially for text-based applications. We will focus first on the topic of text representations. Topic modeling is another form of latent variable estimation, which we will learn about via two different methods: Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF). We will also survey word embeddings to learn how to represent words with vectors in semantically useful ways. At the end of this module, our assignment is to solve problems by analyzing topic structure in a large document collection and applying word embeddings to an NLP-related task. Let’s begin!
What's included
8 videos, 2 readings, 5 assignments, 1 programming assignment
8 videos•Total 129 minutes
- How to Represent Text as a Vector: A Typical Workflow•21 minutes
- Text Processing in scikit-learn•13 minutes
- Introduction to Topic Modeling•18 minutes
- Latent Dirichlet Allocation (LDA)•7 minutes
- Using LDA with scikit-learn•12 minutes
- Non-Negative Matrix Factorization (NMF)•19 minutes
- Word Embeddings Technique #1: Word2vec•21 minutes
- Word Embeddings Technique #2: GloVe•16 minutes
2 readings•Total 10 minutes
- Introduction to Module 3 Assignment: Text Representations, Topic Modeling, and Word Embeddings•5 minutes
- Module 3 Optional Readings & Resources•5 minutes
5 assignments•Total 75 minutes
- Time to Practice: Representing Text as a Vector•15 minutes
- Time to Practice: Text Processing in scikit-learn•15 minutes
- Time to Practice: Latent Dirichlet Allocation (LDA)•15 minutes
- Time to Practice: Non-Negative Matrix Factorization (NMF)•15 minutes
- Time to Practice: Word2vec & GloVe•15 minutes
1 programming assignment•Total 180 minutes
- Create & Submit Module 3 Assignment•180 minutes
Welcome to Module 4, our last module of the course! We wrap up our course by learning how unsupervised methods can be integrated with supervised learning methods to improve prediction performance. A key topic in that direction is imputation methods for dealing with missing data. We will also look at various special topics, including extensions of unsupervised learning that are used at the cutting edge of today's technology: semi-supervised learning and self-supervised learning. At the end of this module, our assignment is to apply methods and techniques for imputing missing data and for semi-supervised learning, with the underlying theme being how unsupervised learning can improve supervised learning. Let’s begin!
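As an illustration of the imputation methods mentioned above, the sketch below compares a simple mean fill with a nearest-neighbor fill in scikit-learn. The small array is invented for the example (its second column is twice the first), and `n_neighbors=2` is an arbitrary assumption.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

# Invented data where column 1 is twice column 0; np.nan marks missing cells.
X = np.array([
    [1.0, 2.0],
    [2.0, np.nan],
    [3.0, 6.0],
    [np.nan, 8.0],
    [5.0, 10.0],
])

# Baseline: replace each missing value with its column mean.
mean_imputed = SimpleImputer(strategy="mean").fit_transform(X)

# Neighbor-based: average the values of the most similar rows,
# where similarity is measured on the features both rows have observed.
knn_imputed = KNNImputer(n_neighbors=2).fit_transform(X)

print("Mean imputation:\n", mean_imputed)
print("KNN imputation:\n", knn_imputed)
```

The mean fill ignores the relationship between the columns, while the neighbor-based fill can exploit it, which is one reason structure-aware imputation may help a downstream supervised model.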
What's included
7 videos, 3 readings, 4 assignments, 1 programming assignment
7 videos•Total 110 minutes
- Applying Unsupervised Learning to Supervised Learning Tasks•19 minutes
- Imputation of Missing Data•14 minutes
- Imputation with scikit-learn•20 minutes
- A Brief Introduction to Semi-Supervised Learning•20 minutes
- Label Propagation with scikit-learn•15 minutes
- A Brief Introduction to Self-Supervised Learning•18 minutes
- Course Conclusion•4 minutes
3 readings•Total 20 minutes
- Introduction to Module 4 Assignment: Applying Methods and Techniques for Data Imputation and Semi-Supervised Learning•5 minutes
- Module 4 Optional Readings & Resources•5 minutes
- Post-Course Survey•10 minutes
4 assignments•Total 60 minutes
- Time to Practice: Applying Unsupervised Learning to Supervised Learning Tasks•15 minutes
- Time to Practice: Imputation of Missing Data•15 minutes
- Time to Practice: Semi-Supervised Learning•15 minutes
- Time to Practice: Self-Supervised Learning•15 minutes
1 programming assignment•Total 180 minutes
- Create & Submit Module 4 Assignment•180 minutes
