Python for Data Science
This course is part of Fractal Data Science Professional Certificate
9,771 already enrolled
148 reviews
What you'll learn
Build pandas pipelines to clean, transform, and aggregate real‑world datasets.
Perform EDA and compute descriptive statistics to summarize data quality and behavior.
Apply hypothesis tests (t‑test/chi‑square) and interpret results for business decisions.
Create publication‑quality charts (bar/line/box/heatmaps) with matplotlib & seaborn.
Skills you'll gain
- Descriptive Statistics
- Statistical Analysis
- Data Cleansing
- Data Preprocessing
- Feature Engineering
- Statistical Methods
- Correlation Analysis
- Plot (Graphics)
- Data Processing
- Data Analysis
- Data Visualization Software
- Matplotlib
- Statistical Hypothesis Testing
- Exploratory Data Analysis
- Data Wrangling
- Data Transformation
- Data Manipulation
- Data Science
Build your Data Analysis expertise
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills with hands-on projects
- Earn a shareable career certificate from Fractal Analytics
There are 5 modules in this course
Master Python for data science with hands‑on projects. Learn pandas, statistics, and visualization to solve real‑world business problems. Build job‑ready skills in data wrangling, exploratory data analysis (EDA), and charting with matplotlib/seaborn—no prior experience required. This beginner‑friendly course guides you through cleaning messy data, applying descriptive and inferential statistics, and preparing datasets for machine learning. You’ll design analyses that answer business questions, communicate insights with compelling visuals, and complete challenging assessments aligned to workplace scenarios.
By the end, you’ll confidently manipulate data in pandas, automate workflows, and build dashboards that stakeholders understand. Start your data‑driven journey and turn raw data into decisions.
In the first module of the Python for Data Science course, learners are introduced to the fundamental concepts of Python programming, starting with an introduction to the language itself. Next, the module covers Jupyter notebooks, a popular interactive environment for data analysis and visualization: how to set them up, how to create, run, and manage code cells, and how to integrate text and visualizations using Markdown. The module also showcases real-life applications of Python to data-related problems through data science projects and case studies in which Python plays a central role, such as data cleaning, data manipulation, statistical analysis, and machine learning. By the end of this module, learners will have a good understanding of Python, be proficient in using Jupyter notebooks for data analysis, and understand how Python is used to address real-world data science challenges.
What's included
12 videos, 6 readings, 2 assignments
12 videos•Total 60 minutes
- Welcome to python for data science•6 minutes
- Expert Talk - A data scientist's experience with Python•4 minutes
- What is python?•4 minutes
- Working with Jupyter notebooks•8 minutes
- Introduction to the problem•4 minutes
- Solution approach - Preparing tables and charts•4 minutes
- Solution approach - Gaining Insights•4 minutes
- Solution Approach - Airline traffic analysis•5 minutes
- Solution summary•4 minutes
- Expert Talk - Why Python is the language of choice for data science professionals•9 minutes
- Introduction to the Problem•4 minutes
- Exploring the Problem•5 minutes
6 readings•Total 60 minutes
- Course syllabus•10 minutes
- Installation guide •10 minutes
- Working effectively with Jupyter notebooks•10 minutes
- Important note!•10 minutes
- The Global Problem Statement•10 minutes
- Tell us what you think!•10 minutes
2 assignments•Total 60 minutes
- Python fundamentals•30 minutes
- Data Analysis •30 minutes
By the end of this module, learners will acquire essential skills in working with various types of data. They will have a solid grasp of Python programming fundamentals, including data structures and libraries. They will be proficient in loading, cleaning, and transforming data, and will possess the ability to perform exploratory data analysis, employing data visualization techniques. They will also gain insights into basic statistical concepts, such as probability, distributions, and hypothesis testing.
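As a rough illustration of the load, clean, aggregate, and merge workflow this module describes, a minimal pandas sketch might look like the following. The column names (`date`, `product`, `category`, `revenue`, `online_revenue`) and the small inline tables are hypothetical stand-ins, not the course's POS or online datasets.

```python
import pandas as pd

# Hypothetical point-of-sale records; the course's actual POS data is not shown here.
pos = pd.DataFrame({
    "date": ["2023-01-05", "2023-01-20", "2023-02-03", "2023-02-11"],
    "product": ["A", "B", "A", "C"],
    "category": ["snacks", "drinks", "snacks", None],
    "revenue": [120.0, 80.0, None, 50.0],
})

# Count missing values per column, then drop incomplete rows.
missing_per_column = pos.isna().sum()
clean = pos.dropna().copy()

# Parse the date strings and total the revenue per calendar month.
clean["date"] = pd.to_datetime(clean["date"])
monthly = clean.groupby(clean["date"].dt.to_period("M"))["revenue"].sum()

# Total and average revenue by category.
by_category = clean.groupby("category")["revenue"].agg(["sum", "mean"])

# Left-merge with a second (hypothetical) table keyed on product.
online = pd.DataFrame({"product": ["A", "B"], "online_revenue": [30.0, 45.0]})
merged = clean.merge(online, on="product", how="left")
```

Dropping rows is only one way to treat missing data; a later module of the course covers imputation as an alternative.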
What's included
32 videos, 4 readings, 6 assignments, 2 programming assignments, 5 ungraded labs
32 videos•Total 174 minutes
- Introduction•1 minute
- Diving into CSV Data•7 minutes
- Data inspection•5 minutes
- Finding missing data in the POS data•7 minutes
- Deleting missing data and saving the cleaned data set•7 minutes
- Lab data and problem•3 minutes
- A note on assessments•1 minute
- Basic data structures - lists and dictionaries•14 minutes
- Basic data structures - series•3 minutes
- Creating a data frame using lists, dictionaries and series•4 minutes
- Slicing with precision•6 minutes
- Changing the indices and saving the new DataFrame•4 minutes
- Navigating data insights•6 minutes
- Selecting data that match certain criteria•4 minutes
- Selecting data that match multiple criteria•4 minutes
- Expert Talk - Understanding your data•6 minutes
- What are the unique products in the POS data set?•6 minutes
- Finding specific values in the data•7 minutes
- How much did we sell per category? •6 minutes
- Finding totals and averages by brand and by category•6 minutes
- Grouping by multiple attributes•6 minutes
- Displaying aggregated data in a pivot table•8 minutes
- Expert talk - How insights and data analysis guide each other•5 minutes
- Working with dates•7 minutes
- How much did we sell each month?•6 minutes
- What is the monthly average of sales?•5 minutes
- Were there specific dates when sales were high?•9 minutes
- What if we have more than one dataset?•6 minutes
- Merging some simple data sets•5 minutes
- Merging POS data with the online data•6 minutes
- Walkthrough - How to approach a graded assignment•4 minutes
- Summary•1 minute
4 readings•Total 60 minutes
- Data cleaning with python•10 minutes
- Resources - Datasets and Jupyter notebooks •10 minutes
- Python statistics fundamentals •10 minutes
- Working with dates •30 minutes
6 assignments•Total 180 minutes
- DataFrame essentials•30 minutes
- DataFrame operations•30 minutes
- Data selection & filtering•30 minutes
- Data manipulation & aggregation•30 minutes
- Date time operations•30 minutes
- Merging & joining dataframes•30 minutes
2 programming assignments•Total 300 minutes
- Graded Assignment•120 minutes
- New Programming Assignment•180 minutes
5 ungraded labs•Total 150 minutes
- Data cleaning & manipulation •30 minutes
- Data slicing & manipulations•30 minutes
- Data aggregations•30 minutes
- Practice Programming Assignment•30 minutes
- Merging the data•30 minutes
By the end of this module, learners will gain a comprehensive understanding of statistical concepts, data exploration techniques, and visualization methods. Learners will develop the skills to identify patterns, outliers, and relationships in data, making informed decisions and formulating hypotheses. Ultimately, they will emerge with the ability to transform raw data into meaningful insights, effectively communicate their findings through data storytelling, and apply EDA across diverse real-world applications.
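To make the statistical ideas in this module concrete, here is a small sketch of descriptive statistics, quantiles, correlation, and a two-sample test statistic. The data and column names (`segment`, `units`, `revenue`) are invented for illustration, and the t statistic is computed by hand with Welch's formula rather than through any particular library routine the course may use.

```python
import numpy as np
import pandas as pd

# Hypothetical revenue figures for two customer segments.
df = pd.DataFrame({
    "segment": ["retail"] * 5 + ["online"] * 5,
    "units": [1, 2, 3, 4, 5, 2, 4, 6, 8, 10],
    "revenue": [10.0, 20.0, 30.0, 40.0, 50.0, 20.0, 40.0, 60.0, 80.0, 100.0],
})

# Central tendency and spread per segment, plus overall quartiles.
summary = df.groupby("segment")["revenue"].agg(["mean", "median", "std"])
quartiles = df["revenue"].quantile([0.25, 0.5, 0.75])

# Pearson correlation between two numeric features.
corr = df["units"].corr(df["revenue"])

# Welch's t statistic for the difference in mean revenue between segments:
# t = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b), using sample variances.
a = df.loc[df["segment"] == "retail", "revenue"]
b = df.loc[df["segment"] == "online", "revenue"]
t_stat = (a.mean() - b.mean()) / np.sqrt(a.var() / len(a) + b.var() / len(b))
```

Turning the statistic into a p-value requires the t distribution, which a statistics library would normally supply; that step is left out of this sketch.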
What's included
34 videos, 1 reading, 5 assignments, 1 programming assignment, 4 ungraded labs
34 videos•Total 206 minutes
- Introduction•0 minutes
- Expert Talk - Why EDA is a superpower•6 minutes
- Finding the average of the data•7 minutes
- Understanding the spread of the data•9 minutes
- Quantiles - how to understand and visualize them•7 minutes
- Exploring variability in the POS data•7 minutes
- What shape is my data? •7 minutes
- Understanding the distributions of features in the POS data•7 minutes
- Understanding Data Distributions•4 minutes
- Some other common shapes of data - Part I•11 minutes
- Some other common shapes of data - Part I•6 minutes
- Some other common shapes of data - Part II•8 minutes
- Some other common shapes of data - Part III•8 minutes
- What chance of revenue falls in a given range•3 minutes
- How are the features related to each other? - Part I•6 minutes
- How are the features related to each other? - Part I•5 minutes
- How are the features related to each other? - Part II•5 minutes
- How are the features related to each other? - Part II•5 minutes
- Visualizing categorical features•7 minutes
- Visualizing proportions•8 minutes
- Expert Talk - Power of visualization & its importance in storytelling •7 minutes
- Using boxplots to compare revenues across segments in the POS data•8 minutes
- Making better visuals - Part III•9 minutes
- Communicating insights better by creating multiple subplots within the same plot•3 minutes
- Comparing the distribution of revenue for each sector by overlaying their KDE plots •8 minutes
- Sampling our data - Part I •5 minutes
- Sampling our data - Part II•5 minutes
- Introduction to hypothesis testing - Part I•6 minutes
- Introduction to hypothesis testing - Part II•4 minutes
- Hypothesis testing using Z - Test - Part I•6 minutes
- Hypothesis testing using Z - Test - Part II•6 minutes
- Hypothesis testing using t - Test•7 minutes
- Hypothesis testing using Chi-square test•7 minutes
- Summary•1 minute
1 reading•Total 10 minutes
- Resources - Datasets and Jupyter notebooks•10 minutes
5 assignments•Total 150 minutes
- Statistics fundamentals•30 minutes
- Data distributions•30 minutes
- Understanding relationships between features•30 minutes
- Practice Quiz•30 minutes
- Practice quiz•30 minutes
1 programming assignment•Total 120 minutes
- Graded Assignment•120 minutes
4 ungraded labs•Total 120 minutes
- Understanding data distributions •30 minutes
- Practice Programming Assignment•30 minutes
- Practice Programming Assignment•30 minutes
- Practice Programming Assignment•30 minutes
By the end of this module, learners will acquire the essential skills to transform raw and often messy data into a structured format suitable for advanced analysis. They will master techniques for handling missing values, identifying and dealing with outliers, encoding categorical variables, scaling and normalizing numerical features, and handling textual or unstructured data. Learners will also be proficient in detecting and addressing data inconsistencies, such as duplicates and errors.
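The preprocessing steps named above can be sketched in a few lines of pandas. This is an illustrative example only: the `brand` and `price` columns are hypothetical, median imputation and 1.5 × IQR capping are just one common choice for each step, and the course's own notebooks may proceed differently.

```python
import pandas as pd

# Hypothetical data with a missing value, a categorical column, and an outlier.
df = pd.DataFrame({
    "brand": ["x", "y", "x", "z", "y"],
    "price": [10.0, 12.0, None, 11.0, 200.0],
})

# Impute the missing price with the median of the observed prices.
df["price"] = df["price"].fillna(df["price"].median())

# Cap outliers at the 1.5 * IQR fences.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
df["price_capped"] = df["price"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

# Min-max scaling to [0, 1] and z-score scaling
# (pandas' std() uses the sample standard deviation by default).
p = df["price_capped"]
df["price_minmax"] = (p - p.min()) / (p.max() - p.min())
df["price_z"] = (p - p.mean()) / p.std()

# One-hot encode the categorical column.
encoded = pd.get_dummies(df, columns=["brand"])
```

Capping keeps every row while limiting the influence of extreme values; dropping or imputing outliers, which the module also covers, trades that off differently.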
What's included
25 videos, 2 readings, 3 assignments, 1 programming assignment, 3 ungraded labs
25 videos•Total 135 minutes
- Introduction•4 minutes
- Expert Talk - Handling missing data•7 minutes
- What to do with missing values?•5 minutes
- Missing values in the POS data•3 minutes
- Missing values within a hierarchy•8 minutes
- Missing values within a hierarchy (contd.)•6 minutes
- What if parts of the hierarchy are also missing?•3 minutes
- Finishing up missing value treatment in the POS data•5 minutes
- Missing values - another simpler example•9 minutes
- Working with categoric features•5 minutes
- Transforming features - binning and discretization•8 minutes
- Transforming features - binning and discretization (contd.)•6 minutes
- Encoding categoric features - one-hot and label encoding•9 minutes
- Encoding features in the POS data•6 minutes
- Finishing up the encoding and saving the encoded data•4 minutes
- What is data normalization and why do we need it?•5 minutes
- Data normalization using min-max scaling•6 minutes
- Data normalization using z-score scaling•4 minutes
- Other types of data transformation•5 minutes
- Applying log transformation to the online data•5 minutes
- Finding outlying data•6 minutes
- Removing outliers by dropping them•5 minutes
- How to deal with outliers - imputation•7 minutes
- How to deal with outliers - capping•4 minutes
- Summary•2 minutes
2 readings•Total 40 minutes
- Resources - Datasets and Jupyter notebooks•10 minutes
- Data pre-processing •30 minutes
3 assignments•Total 90 minutes
- Missing values•30 minutes
- Dealing with categorical data•30 minutes
- Data normalization•30 minutes
1 programming assignment•Total 120 minutes
- Graded Assignment•120 minutes
3 ungraded labs•Total 90 minutes
- Handling missing values•30 minutes
- Handling categorical features•30 minutes
- Data normalization & treating outliers•30 minutes
By the end of this module, learners will develop a profound understanding of how to craft and enhance features to optimize the performance of machine learning models. They will be adept at identifying relevant variables, creating new features through techniques such as one-hot encoding, binning, and polynomial expansion, and extracting valuable information from existing data, like dates or text, using methods like feature extraction and text vectorization. Learners will also grasp the concept of feature scaling and normalization to ensure the consistency and comparability of feature ranges. With these skills, they will possess the ability to shape data effectively, amplifying its predictive power and contributing to the construction of robust, high-performing machine learning pipelines.
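A minimal sketch of the feature engineering and dimensionality reduction ideas mentioned above follows. The records and column names are hypothetical (this is not the obesity data set used in the videos), and PCA is done here with a plain NumPy SVD as an assumption-free stand-in for whatever library the course uses.

```python
import numpy as np
import pandas as pd

# Hypothetical records for illustration only.
df = pd.DataFrame({
    "signup": pd.to_datetime(["2023-01-15", "2023-06-30", "2023-11-02", "2024-03-09"]),
    "age": [22, 35, 47, 63],
    "height_cm": [160.0, 170.0, 180.0, 190.0],
    "weight_kg": [55.0, 68.0, 82.0, 95.0],
})

# Extract date parts as new features.
df["signup_month"] = df["signup"].dt.month
df["signup_dow"] = df["signup"].dt.dayofweek

# Bin a numeric feature into labeled ranges.
df["age_band"] = pd.cut(df["age"], bins=[0, 30, 50, 120], labels=["young", "mid", "senior"])

# Polynomial and ratio features derived from existing columns.
df["age_sq"] = df["age"] ** 2
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

# PCA via SVD on standardized numeric columns.
X = df[["age", "height_cm", "weight_kg"]].to_numpy(dtype=float)
X = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)  # share of variance per component
components_2d = X @ Vt[:2].T           # project onto the first two components
```

Standardizing before PCA matters because the columns are on different scales; without it, the component directions would be dominated by whichever feature has the largest raw variance.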
What's included
11 videos, 2 readings, 1 assignment, 1 programming assignment, 1 ungraded lab
11 videos•Total 53 minutes
- Introduction•1 minute
- Reducing the dimensionality of data sets•6 minutes
- Exploring the features of the obesity data set•7 minutes
- What is Principal Component Analysis (PCA)?•8 minutes
- Applying PCA to the obesity data•5 minutes
- Creating a transformed version of the data through feature engineering•9 minutes
- Expert Talk - Gen AI in Python•5 minutes
- Introduction to Gen AI in Python for Data science•4 minutes
- Some quick data analysis using PandasAI•4 minutes
- Some quick data visualization using PandasAI•3 minutes
- Summary•1 minute
2 readings•Total 40 minutes
- Complete guide to Feature Engineering•30 minutes
- Resources - Datasets and Jupyter notebooks•10 minutes
1 assignment•Total 30 minutes
- Feature engineering & PCA•30 minutes
1 programming assignment•Total 120 minutes
- Graded Assignment•120 minutes
1 ungraded lab•Total 30 minutes
- Dimensionality reduction, PCA•30 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Learner reviews
- 5 stars
64.42%
- 4 stars
14.76%
- 3 stars
2.68%
- 2 stars
2.01%
- 1 star
16.10%
Showing 3 of 148
Reviewed on Feb 18, 2024
Good course. Need more in-depth details with case studies.
Reviewed on Nov 14, 2023
All expert did a comprehending way of giving their knowledge for learning, a great work.
Reviewed on Nov 28, 2023
Its a great course if you want to learn how to apply concepts in solving real business problems
Frequently asked questions
A practical, beginner‑friendly introduction to Python for data science focused on data wrangling, statistics, and visualization—skills employers value and use daily.
Beginners and professionals transitioning into data analysis or business analytics who want hands‑on, job‑ready skills.
Clean and analyze datasets with pandas, run statistical tests, build insightful visualizations, and prepare data for ML—then present findings that drive decisions.
