URL: https://apify.com/parseforge/edx-scraper

edX Scraper | University Courses and Programs · Apify


Pricing

from $19.00 / 1,000 results

Extract edX course catalog data including title, university, instructors, level, duration, price, language, subject, prerequisites, and full description. Track MicroMasters, professional certificates, and degree programs for education analytics, lead generation, and market research.

Developer: ParseForge (Maintained by Community)

🎓 edX Scraper

🚀 Export edX course data in seconds. Search by keyword or subject and collect course titles, partners, prices, durations, enrollment counts, and direct links in one clean dataset.

🕒 Last updated: 2026-05-22 · 📊 14 fields per record · Up to 1,000,000 courses · Global course catalog coverage

The edX Scraper extracts course and program listings from edX's public course catalog using Algolia's search index - the same fast search engine edX uses for its own website. Each record includes the course title, subtitle, offering partner (MIT, Harvard, Microsoft, Google, etc.), subject category, difficulty level, language, duration, price, recent enrollment count, course type, and a direct link to the course page.

edX hosts thousands of courses and professional programs from over 160 partner universities and institutions worldwide, covering subjects from data science and machine learning to business, humanities, and language learning. This scraper gives you structured access to that entire catalog in one run.

Target Audience

| Who | Why |
| --- | --- |
| EdTech researchers | Analyze course catalog composition, pricing, and partner distribution |
| Curriculum developers | Benchmark course structures and durations across providers |
| HR and L&D teams | Discover upskilling programs for specific skills and subjects |
| Competitive intelligence teams | Track which skills edX partners are prioritizing |
| Content creators and educators | Research what already exists before building new courses |

📋 What the edX Scraper does

  • Queries edX's Algolia search index by keyword and optional subject filter
  • Returns courses, programs, executive education, and degree products
  • Captures partner name and logo, pricing, enrollment count, language, and duration
  • Formats duration as a human-readable string (e.g. "6 weeks (5-8 hrs/wk)")
  • Strips HTML from descriptions to return clean plain text
  • Paginates automatically until maxItems is reached or all results are exhausted
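The cleaning and pagination steps listed above can be sketched in a few lines of Python. This is only an illustration, not the Actor's actual code: `strip_html`, `format_duration`, `paginate`, and the `fetch_page` callable are hypothetical names, and the real Algolia response shape is not documented on this page.

```python
import re
from html import unescape
from typing import Callable, Iterator


def strip_html(text: str) -> str:
    """Remove tags, decode entities, and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", unescape(no_tags)).strip()


def format_duration(weeks: int, min_hours: int, max_hours: int) -> str:
    """Combine course length and weekly effort into one readable string."""
    unit = "week" if weeks == 1 else "weeks"
    return f"{weeks} {unit} ({min_hours}-{max_hours} hrs/wk)"


def paginate(fetch_page: Callable[[int], list], max_items: int) -> Iterator[dict]:
    """Yield records page by page until max_items is reached or a page is empty.

    fetch_page is a hypothetical callable: page number in, list of records out.
    """
    collected = 0
    page = 0
    while collected < max_items:
        hits = fetch_page(page)
        if not hits:
            return  # all results exhausted
        for hit in hits:
            if collected >= max_items:
                return  # requested limit reached
            yield hit
            collected += 1
        page += 1
```

Passing the page fetcher in as a callable keeps the loop independent of any particular search client, so it can be exercised with a stub before pointing it at a real index.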

💡 Why it matters: edX's course catalog spans thousands of offerings across 160+ partner institutions - far too many to browse manually. This actor lets researchers, product teams, and L&D professionals access the full catalog in minutes, with structured data ready for analysis or import.

🎬 Full Demo

🚧 Coming soon

⚙️ Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | python | Search keyword (e.g. "machine learning", "data science", "project management") |
| subject | string | (none) | Subject filter (e.g. "Computer Science", "Business & Management") |
| maxItems | integer | 10 | Maximum number of courses to collect (free: 10, paid: up to 1,000,000) |

Example 1 - Search for data science courses:

{
  "query": "data science",
  "maxItems": 100
}

Example 2 - Browse computer science courses:

{
  "query": "",
  "subject": "Computer Science",
  "maxItems": 200
}

⚠️ Good to Know: The subject field must match an edX subject category exactly (e.g. "Computer Science", "Business & Management", "Data Analysis & Statistics"). Free users are limited to 10 courses per run. All course types are included: individual courses, programs, executive education, and degrees.
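For scripted runs, the input above could be assembled and checked in Python before starting the Actor. The sketch below is an assumption-laden illustration: `build_input` and `run_actor` are hypothetical helpers, the Actor ID is taken from the store URL, and `run_actor` relies on the separately installed `apify-client` package, whose return types may differ between versions.

```python
from typing import Optional

ACTOR_ID = "parseforge/edx-scraper"  # taken from the Actor's store URL
MAX_ITEMS_LIMIT = 1_000_000          # paid-plan ceiling stated in the input table


def build_input(query: str = "python",
                subject: Optional[str] = None,
                max_items: int = 10) -> dict:
    """Build the Actor input, rejecting out-of-range maxItems values."""
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    run_input = {"query": query, "maxItems": max_items}
    if subject:
        # Must match an edX subject category exactly, e.g. "Computer Science".
        run_input["subject"] = subject
    return run_input


def run_actor(token: str, run_input: dict) -> list:
    """Start a run and return its dataset items (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # lazy import: build_input works without it

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call such as `run_actor(token, build_input("", subject="Computer Science", max_items=200))` would mirror Example 2.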

📊 Output

| Field | Type | Description |
| --- | --- | --- |
| 🖼️ imageUrl | string | Course cover image URL |
| 📝 title | string | Full course or program name |
| 📄 subtitle | string | Short course description (HTML-stripped) |
| 🏛️ partner | string | Offering institution (e.g. MIT, Harvard, Google) |
| 🖼️ partnerLogo | string | Partner institution logo URL |
| 🗂️ subject | string | Subject category |
| 📊 level | string | Difficulty level (Introductory, Intermediate, Advanced) |
| 🌐 language | string | Language of instruction |
| ⏱️ duration | string | Course length (e.g. "6 weeks (5-8 hrs/wk)") |
| 💰 price | string | Price or "Free" if no cost |
| 👥 enrollmentCount | number | Recent enrollment count |
| 🏷️ courseType | string | Product type (Course, Program, Executive Education, 2U Degree) |
| 🔗 url | string | Direct link to the edX course page |
| 🕒 scrapedAt | string | ISO timestamp of when the record was collected |

Sample record:

{
  "imageUrl": "https://prod-discovery.edx-cdn.org/media/programs/card_images/abc123.jpg",
  "title": "Python for Data Science",
  "subtitle": "Learn Python programming fundamentals and apply them to real-world data analysis problems.",
  "partner": "MIT",
  "partnerLogo": "https://prod-discovery.edx-cdn.org/organization/logos/mit.png",
  "subject": "Computer Science",
  "level": "Introductory",
  "language": "English",
  "duration": "4 weeks (5-7 hrs/wk)",
  "price": "USD 149",
  "enrollmentCount": 48200,
  "courseType": "Course",
  "url": "https://www.edx.org/learn/python/massachusetts-institute-of-technology-python-for-data-science",
  "scrapedAt": "2026-05-22T09:15:00.000Z"
}
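Because price is returned as a string, numeric analysis needs a small parsing step first. The sketch below assumes only the two forms shown on this page ("Free" and a currency code followed by an amount, as in "USD 149"); `parse_price` is a hypothetical helper, and any other format raises `ValueError` rather than being guessed at.

```python
from typing import Optional, Tuple


def parse_price(price: str) -> Tuple[Optional[str], float]:
    """Split a price string into (currency, amount); "Free" maps to (None, 0.0)."""
    text = price.strip()
    if text.lower() == "free":
        return None, 0.0
    # Assumed layout: "<currency code> <amount>", e.g. "USD 149".
    currency, _, amount = text.partition(" ")
    return currency, float(amount.replace(",", ""))
```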

✨ Why choose this Actor

  • Algolia-powered - uses edX's own fast search index, not brittle HTML scraping
  • All product types - courses, programs, executive education, and degree programs in one dataset
  • Human-readable duration - combines weeks and hours/week into a single readable string
  • HTML-stripped descriptions - plain text subtitles, no markup to clean up
  • Partner logo included - institution logos for display in apps or reports
  • Pay-per-item pricing - only pay for the course records you collect

📈 How it compares to alternatives

| Feature | ParseForge edX Scraper | Manual catalog browsing | Coursera scraper |
| --- | --- | --- | --- |
| Full catalog export | Yes | No | Different platform |
| Enrollment counts | Yes | Sometimes | Sometimes |
| Partner logos | Yes | Yes | Varies |
| Duration data | Yes | Yes | Varies |
| Price extraction | Yes | Yes | Yes |
| Free tier | 10 courses | Unlimited (slow) | N/A |

🚀 How to use

  1. Create a free Apify account (includes $5 credit)
  2. Open the edX Scraper actor page and click Try for free
  3. Enter a search keyword (e.g. machine learning, finance, UX design)
  4. Optionally add a subject filter
  5. Set maxItems to the number of courses you want
  6. Click Start - results are typically ready in under 30 seconds
  7. Download as CSV, JSON, Excel, or connect via API

💼 Business use cases

Learning and Development (L&D) Planning

HR and L&D teams can pull all courses in a given subject area, compare partner quality and pricing, and build a curated list of upskilling resources for employees.

EdTech Market Research

EdTech companies and investors can analyze the full edX catalog to understand market saturation, pricing norms, popular topics, and which institutions are most active.

Curriculum Gap Analysis

Educators building new courses can scan existing edX offerings in their subject to identify what already exists and where gaps remain in the curriculum landscape.

Enrollment Trend Analysis

The enrollmentCount field reveals which topics are most in-demand. Researchers and product teams can track which subjects are growing in popularity over time by running periodic scans.
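One way to turn periodic scans into a trend signal is to compare two snapshots record by record. The following is a minimal sketch, assuming each snapshot is a list of output records and that the url field is stable enough to serve as a key; `enrollment_changes` is a hypothetical helper. Since enrollmentCount reflects recent rather than all-time enrollment, the difference shows a shift in recent activity, not new sign-ups.

```python
def enrollment_changes(previous: list, current: list) -> list:
    """Return per-course enrollmentCount changes, largest increase first.

    Only courses present in both snapshots (matched by url) are compared.
    """
    before = {record["url"]: record["enrollmentCount"] for record in previous}
    changes = []
    for record in current:
        url = record["url"]
        if url in before:
            changes.append({
                "title": record["title"],
                "url": url,
                "change": record["enrollmentCount"] - before[url],
            })
    return sorted(changes, key=lambda item: item["change"], reverse=True)
```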

🔌 Automating edX Scraper

  • Make (formerly Integromat) - Schedule monthly catalog scans and update an Airtable course database automatically
  • Zapier - Notify your team when new courses matching your keyword are added to the catalog
  • Google Sheets - Export course lists and share with your L&D team for review and curation
  • REST API - Embed course search into internal learning portals or recommendation tools
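For the REST API route, a request could be prepared with the Python standard library alone. This sketch assumes Apify's synchronous `run-sync-get-dataset-items` endpoint and bearer-token authentication; `build_sync_request` and `fetch_courses` are hypothetical helpers, and synchronous runs are probably best kept to small maxItems values.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_sync_request(token: str, run_input: dict,
                       actor_id: str = "parseforge/edx-scraper") -> urllib.request.Request:
    """Prepare a POST that runs the Actor and returns its dataset items."""
    # In API paths the "/" in an Actor ID is written as "~".
    url = f"{API_BASE}/acts/{actor_id.replace('/', '~')}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_courses(token: str, run_input: dict) -> list:
    """Send the request and decode the JSON array of course records."""
    with urllib.request.urlopen(build_sync_request(token, run_input)) as response:
        return json.load(response)
```

An internal portal could call `fetch_courses(token, {"query": "finance", "maxItems": 10})` and render the returned records directly.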

🌟 Beyond business use cases

Personal Learning Discovery

Students and self-learners can scan the full edX catalog for a topic they want to master, comparing multiple courses from different institutions side by side before enrolling.

Academic Research on Online Education

Researchers studying MOOCs and the online learning landscape can collect systematic data on course offerings, pricing models, and institutional participation.

Journalism and Reporting

Education journalists can track how the edX catalog evolves over time - which subjects are growing, which partners are expanding, and how pricing changes.

Open Data Projects

Developers and data enthusiasts can contribute edX course data to open educational resource directories or build recommendation engines.

🤖 Ask an AI assistant about this scraper

Not sure how to work with your results? Ask an AI:

"I have JSON data from the edX Scraper with fields like title, partner, subject, level, price, and enrollmentCount. How do I identify the 10 most popular free courses and group them by subject?"

The structured output is designed to be immediately usable with spreadsheets, databases, or AI-powered analysis tools.
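The question in the example prompt can also be answered directly in a few lines of Python. This is a rough sketch, assuming the records are already loaded as a list of dictionaries and that free courses carry the literal price "Free"; `top_free_by_subject` is a hypothetical helper.

```python
from collections import defaultdict


def top_free_by_subject(records: list, limit: int = 10) -> dict:
    """Pick the most-enrolled free courses, then group their titles by subject."""
    free = [record for record in records if record.get("price") == "Free"]
    top = sorted(free, key=lambda record: record.get("enrollmentCount", 0),
                 reverse=True)[:limit]
    grouped = defaultdict(list)
    for record in top:
        grouped[record.get("subject", "Unknown")].append(record["title"])
    return dict(grouped)
```

Titles within each subject stay in descending enrollment order because the grouping runs over the already sorted list.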

❓ Frequently Asked Questions

Is this scraper affiliated with edX or 2U? No. This is an independent tool that accesses edX's public Algolia search index - the same index powering edX's own website search.

What course types are included? All edX product types: individual Courses, Programs (MicroMasters, MicroBachelors, Professional Certificate), Executive Education, and 2U Degrees.

What does price: "Free" mean? The course is available to audit for free. Some courses offer a paid verified certificate option but can be taken without charge.

What does enrollmentCount represent? The recent enrollment count as reported by edX's Algolia index. This is not a total all-time enrollment - it reflects recent activity.

Can I filter by language? Not at the input level. The language field in the output lets you filter after export.

How current is the data? Data is fetched live from Algolia at run time. It reflects the edX catalog as it was at that moment.

Can I collect all courses on edX? Yes - leave query empty and set a large maxItems. The catalog has several thousand courses and programs.

What is the subject filter? It maps to edX's subject taxonomy. Examples: "Computer Science", "Data Analysis & Statistics", "Business & Management", "Language", "Engineering".

Does it include course syllabi or module breakdowns? No. The scraper extracts catalog-level metadata. For detailed course content, visit the course URL directly.

How many courses does edX have? edX hosts several thousand courses and programs from 160+ partner institutions.

Does it capture discounts or promotional pricing? The price field reflects the list price from the catalog. Promotional pricing is not captured.

Can I run this on a schedule? Yes. Use Apify's built-in scheduler to run monthly or quarterly catalog snapshots.
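As noted in the language question above, filtering happens after export. A minimal sketch of that post-export step, assuming a JSON export saved locally (the file path and both helper names are hypothetical):

```python
import json


def load_records(path: str) -> list:
    """Read a JSON export (an array of course records) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def filter_by_language(records: list, language: str) -> list:
    """Keep records whose language field matches, ignoring case and padding."""
    wanted = language.strip().lower()
    return [
        record for record in records
        if str(record.get("language", "")).strip().lower() == wanted
    ]
```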

🔌 Integrate with any app

Export your dataset to:

Spreadsheets: Google Sheets, Microsoft Excel, Airtable

Databases: PostgreSQL, MySQL, MongoDB, Supabase

LMS and HR Tools: Workday Learning, Cornerstone, Degreed

Automation: Make, Zapier, n8n, Pipedream

Analytics: Tableau, Power BI, Metabase, Google Looker Studio

🔗 Recommended Actors

| Actor | Description |
| --- | --- |
| Coursera Scraper | Extract course listings and reviews from Coursera |
| Udemy Scraper | Scrape Udemy courses with pricing and ratings |
| LinkedIn Learning Scraper | Collect professional courses from LinkedIn Learning |

💡 Pro Tip: Browse the complete ParseForge collection for 50+ ready-to-use data extractors covering EdTech platforms, job boards, marketplaces, and more.


🆘 Need Help? Open our contact form and we will get back to you within one business day.


⚠️ Disclaimer: This actor is an independent tool not affiliated with, endorsed by, or connected to edX Inc. or 2U Inc. It accesses only publicly available course catalog data. Use responsibly and in accordance with edX's Terms of Service. ParseForge is not responsible for how collected data is used.
