Applied Information Extraction in Python
This course is part of More Applied Data Science with Python Specialization
Instructor: VG Vinod Vydiswaran
What you'll learn
Develop skills to process and interpret information presented in free-text data.
Identify the major classes of named entity recognition (NER) and implement, with guidance, state-of-the-art machine learning techniques for NER.
Compare, contrast, and select between multiple machine learning and deep learning approaches for NER.
Explore Large Language Models and configure a Transformer-based pipeline to extract entities of interest from a text dataset.
Details to know
14 assignments
There are 4 modules in this course
In "Applied Information Extraction in Python," you will learn how to extract useful information from free-text data, which is a type of string data created when people type. Examples of information found in free text include names of people or organizations, location information such as cities and zip codes, and other elements such as stock prices or clinical diagnoses. Free-text data is found everywhere, from magazine articles to social media posts, and can be complex to analyze.
In this course, you'll use applied machine learning and text-mining techniques to analyze free-text data. You will learn how to identify named entities and tag them with the appropriate entity types, using real-world data from business, politics, and healthcare. You'll develop multiple approaches to recognize and extract named entities and attributes of interest from free-text data, ranging from regular expressions to neural network models. Finally, you'll explore Transformer models and Large Language Models such as ChatGPT to extract information from large datasets.
This is the final course in "More Applied Data Science with Python," a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners first complete the following courses from the Applied Data Science with Python Specialization: Introduction to Data Science in Python, Applied Machine Learning in Python, and Applied Text Mining in Python.
Module 1 introduces information extraction, covering key tasks and approaches for extracting relevant information from text. You will explore pattern-based and list-based methods to identify and extract information from text data, applying these techniques across diverse domains. You will also develop an end-to-end NLP pipeline to extract named entities from free text using terminology resources.
What's included
7 videos, 5 readings, 3 assignments, 1 programming assignment, 1 discussion prompt, 1 ungraded lab
7 videos • Total 42 minutes
- Welcome to Information Extraction • 7 minutes
- What is Information Extraction? • 9 minutes
- Information Extraction in Different Domains • 7 minutes
- Extracting Formatted Information • 7 minutes
- Lookup Based Extraction • 4 minutes
- Demo: Using Regular Expressions & Examining Output • 4 minutes
- Assignment 1 Introduction: Formatting & Normalizing Data with Regular Expressions • 4 minutes
5 readings • Total 60 minutes
- MADSwPy Certificate Roadmap • 10 minutes
- Course Syllabus • 10 minutes
- Introduction to Jupyter Notebook • 10 minutes
- Help Us Learn About You • 10 minutes
- Regular Expressions in Detail • 20 minutes
3 assignments • Total 50 minutes
- Module 1 Assignment • 30 minutes
- Knowledge Check: Introduction to Information Extraction • 10 minutes
- Knowledge Check: Rule-Based Approaches to Information Extraction • 10 minutes
1 programming assignment • Total 180 minutes
- Build an Information Extraction Pipeline for Template/List-Based Fields • 180 minutes
1 discussion prompt • Total 10 minutes
- Meet Other Learners • 10 minutes
1 ungraded lab • Total 60 minutes
- Jupyter Notebook Practice on Basic NLP and Rule-Based Extraction • 60 minutes
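To give a feel for the pattern-based and list-based methods that Module 1 covers, here is a minimal Python sketch. It is not taken from the course materials: the regular expressions, the `CITY_GAZETTEER` term list, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns and term list, chosen only for illustration.
ZIP_RE = re.compile(r"\b\d{5}(?:-\d{4})?\b")   # US ZIP or ZIP+4
PRICE_RE = re.compile(r"\$\d+(?:\.\d{2})?")    # dollar amounts such as $12.50
CITY_GAZETTEER = {"ann arbor", "new york", "detroit"}


def extract_patterns(text):
    """Return (label, matched_text) pairs found by the regular expressions."""
    found = [("ZIP", m.group()) for m in ZIP_RE.finditer(text)]
    found += [("PRICE", m.group()) for m in PRICE_RE.finditer(text)]
    return found


def extract_from_list(text, gazetteer):
    """Return gazetteer terms that occur in the text as whole words."""
    lowered = text.lower()
    hits = []
    for term in sorted(gazetteer):
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append(("CITY", term))
    return hits


sample = "Shares closed at $12.50 in Ann Arbor, MI 48109."
pattern_hits = extract_patterns(sample)
list_hits = extract_from_list(sample, CITY_GAZETTEER)
print(pattern_hits)
print(list_hits)
```

Regular expressions suit fields with a predictable format, while a lookup list suits closed vocabularies. Real pipelines typically also normalize the matched values, which this sketch leaves out.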
In Module 2, you'll dive into the world of named entity recognition (NER). You'll learn to define and identify named entities, and understand how to tackle related tasks by framing them as NER challenges. We'll explore how to use resources like standardized terminology and named-entity gazetteers to enhance NER. You'll also gain hands-on experience by training a machine learning model for sequence classification using an annotated text dataset. Finally, we'll discuss the pros and cons of different Markov models for NER, equipping you with the insights needed for practical applications.
What's included
7 videos, 6 readings, 4 assignments, 1 programming assignment, 1 ungraded lab
7 videos • Total 51 minutes
- What is Named Entity Recognition (NER)? • 5 minutes
- NER as a Sequence Classification Task • 9 minutes
- Fundamentals of Markov Chain Models • 8 minutes
- Hidden Markov Models (HMMs) • 11 minutes
- Conditional Random Fields (CRFs) • 6 minutes
- Demo: CRF Model Training • 8 minutes
- Assignment 2 Introduction: Implementing a CRF Model • 4 minutes
6 readings • Total 60 minutes
- BIO Encoding for Named Entity Labels • 10 minutes
- BILOU Encoding for Named Entity Labels • 10 minutes
- Machine Learning Fundamentals: How Machines Learn to Label Named Entities • 10 minutes
- Markov Chain and Hidden Markov Models • 10 minutes
- Training Hidden Markov Models: How HMMs Learn to Assign Labels • 10 minutes
- The Math Behind HMMs: How Probabilities Power Sequence Labeling • 10 minutes
4 assignments • Total 65 minutes
- Module 2 Assignment • 30 minutes
- Knowledge Check: Named Entities and Named Entity Recognition • 10 minutes
- Knowledge Check: Setting up NER as a Machine Learning Task • 10 minutes
- Knowledge Check: Hidden Markov Models (HMMs) • 15 minutes
1 programming assignment • Total 180 minutes
- Build an Information Extraction Pipeline for CRF Based Extraction • 180 minutes
1 ungraded lab • Total 60 minutes
- Jupyter Notebook Practice on Training CRFs • 60 minutes
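Module 2 treats NER as sequence classification, where each token receives a label such as those in the BIO scheme (B- begins an entity, I- continues it, O is outside any entity). As a rough illustration, and not the course's own code, the sketch below turns per-token BIO tags back into entity spans. The function name `bio_to_spans`, the example sentence, and the `PER`/`LOC` labels are assumptions made for this example.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, entity_text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-")
        continues = tag.startswith("I-") and tag[2:] == current_type
        if continues:
            current_tokens.append(token)
            continue
        # Any other tag closes the span that was open, if there was one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        if starts_new:
            current_type, current_tokens = tag[2:], [token]
        # An "O" tag, or an "I-" tag with no matching open span, is ignored.
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Hypothetical example sentence and labels.
tokens = ["Ada", "Lovelace", "visited", "London", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

A sequence model such as an HMM or CRF would predict the `tags` list. A decoding step like this one is what converts those predictions into the entities a downstream application consumes.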
In Module 3, focused on neural network models, you will explore the differences between training deep learning models and traditional machine learning models. You'll learn how to model and train a neural network-based classifier, as well as formulate text as features for NER model training. We will discuss the pros and cons of deep learning approaches. You'll design a neural network model to identify concepts from free text and apply a trained deep learning model to solve NER tasks.
What's included
5 videos, 4 readings, 4 assignments, 1 programming assignment, 1 ungraded lab
5 videos • Total 26 minutes
- Introduction to Deep Learning • 4 minutes
- Neural Network Models • 7 minutes
- Deep Neural Network Models • 6 minutes
- Demo: Configuring the Bi-Directional LSTM • 4 minutes
- Assignment 3 Introduction: Building an Information Extraction Pipeline with BiLSTMs and CRFs • 4 minutes
4 readings • Total 40 minutes
- Understanding Deep Learning: A Shift From Rules to Representation • 10 minutes
- Activation Functions: How Deep Learning Models Make Decisions • 10 minutes
- Understanding Deep Neural Network Models: How Depth Enables Learning at Scale • 10 minutes
- Building an Information Extraction Pipeline with BiLSTMs and CRFs • 10 minutes
4 assignments • Total 60 minutes
- Module 3 Assignment • 30 minutes
- Knowledge Check: What Is Deep Learning? • 10 minutes
- Knowledge Check: What Are Neural Network Models? • 10 minutes
- Knowledge Check: Deeper Neural Networks • 10 minutes
1 programming assignment • Total 180 minutes
- Build an Information Extraction Pipeline using Deep Neural Networks • 180 minutes
1 ungraded lab • Total 60 minutes
- Jupyter Notebook Practice on Training LSTMs • 60 minutes
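Module 3 mentions formulating text as features for NER model training. One common preparation step for a neural sequence model, shown here only as a sketch under assumed conventions, is to map tokens to integer ids and pad each sentence to a fixed length. The `<PAD>` and `<UNK>` symbols, the function names, and the toy sentences are hypothetical and not drawn from the course.

```python
PAD, UNK = "<PAD>", "<UNK>"


def build_vocab(sentences):
    """Map each distinct lowercased token to an integer id; ids 0 and 1 are reserved."""
    vocab = {PAD: 0, UNK: 1}
    for sentence in sentences:
        for token in sentence:
            vocab.setdefault(token.lower(), len(vocab))
    return vocab


def encode(sentence, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(token.lower(), vocab[UNK]) for token in sentence[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Toy training sentences, invented for illustration.
train = [["Aspirin", "reduces", "fever"], ["Fever", "is", "common"]]
vocab = build_vocab(train)

# "treats" and "quickly" were not seen in training, so they map to the UNK id.
encoded = encode(["Aspirin", "treats", "fever", "quickly"], vocab, max_len=5)
print(encoded)
```

The resulting id sequences are what an embedding layer followed by a recurrent layer would consume. Lowercasing is a simplification that discards capitalization, which can itself be a useful signal for NER.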
In Module 4, you'll dive into the power of deep learning models in diverse fields such as healthcare and sports commentary. You'll learn how to build neural network models that are fine-tuned for specific tasks and discover how to set up a deep neural network for detecting key entities. We'll also introduce you to the world of large language models, showcasing their transformative capabilities and applications in information extraction.
What's included
5 videos, 4 readings, 3 assignments, 1 programming assignment
5 videos • Total 37 minutes
- Language Models (LMs) • 11 minutes
- Large Language Models (LLMs) • 8 minutes
- Transformers • 9 minutes
- Assignment 4 Introduction: Build an Information Extraction Pipeline using Transformers • 4 minutes
- Course Wrap-Up • 4 minutes
4 readings • Total 40 minutes
- Recent Advances in GPTs • 10 minutes
- Building an Information Extraction Pipeline with Transformers and LLMs • 10 minutes
- Continue Your Journey and Earn a Master of Applied Data Science Degree Online • 10 minutes
- Course Post-Survey • 10 minutes
3 assignments • Total 50 minutes
- Module 4 Assignment • 30 minutes
- Knowledge Check: Language Models • 10 minutes
- Knowledge Check: What Are Transformers? • 10 minutes
1 programming assignment • Total 180 minutes
- Build an Information Extraction Pipeline using Transformers • 180 minutes