URL: https://apify.com/parseforge/pypi-packages-scraper

PyPI Python Package Scraper · Apify


Pricing

from $12.00 / 1,000 result items

PyPI Packages Scraper

Pull Python package data from PyPI. Returns name, version, summary, description, classifiers, license, author, project URLs (homepage, source, issues, docs), Python version requirement, dependencies, release history, last upload, and total release count. Direct lookup by package name.

Rating

0.0

(0)

Developer

๐Ÿ‘ ParseForge

ParseForge

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

24 days ago

Last modified

๐Ÿ PyPI Python Package Scraper

๐Ÿš€ Pull PyPI packages with version, license, classifiers, dependencies, vulnerabilities, release files (wheel + sdist), funding URL, and 33 fields.

๐Ÿ•’ Last updated: 2026-05-08 ยท ๐Ÿ“Š 33+ fields per record ยท 500K+ PyPI packages ยท version, classifiers, dependencies, security advisories, release files (wheel + sdist sizes + sha256), license, project URLs

The PyPI Python Package Scraper pulls rich package metadata from the Python Package Index. Output includes name, version, summary, description (truncated), license + license expression + license files, author + email, maintainer email, homepage, repository, bug tracker, docs URL, changelog URL, funding URL, classifiers, runtime dependencies, optional extras (provides_extra), Python version requirement, yanked flag, total releases, release files (wheel + sdist with size + SHA-256 + Python version), and security vulnerabilities published by the PyPI Security team.

Direct lookup only - feed a list of package names, get rich records back. The Actor uses the JSON detail endpoint, which is the canonical source for PyPI metadata.
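
As a rough illustration of the direct-lookup flow, the sketch below fetches one package from PyPI's public JSON endpoint (`https://pypi.org/pypi/<name>/json`) and maps a few keys to field names used in this README. The helper names `fetch_package` and `map_record` are hypothetical, and the mapping is an assumption about how such a record could be assembled, not the Actor's actual code.

```python
import json
import urllib.request


def fetch_package(name: str) -> dict:
    """Download the raw JSON document PyPI publishes for one package."""
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def map_record(payload: dict) -> dict:
    """Map a PyPI JSON payload to a flat record (illustrative subset of fields)."""
    info = payload.get("info", {})
    files = payload.get("urls", [])  # files belonging to the latest release
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        "summary": info.get("summary"),
        "license": info.get("license"),
        "licenseExpression": info.get("license_expression"),
        "requiresDist": info.get("requires_dist") or [],
        "pythonRequires": info.get("requires_python"),
        "yanked": bool(info.get("yanked", False)),
        "totalReleases": len(payload.get("releases", {})),
        "releaseFileCount": len(files),
        "releaseFiles": [
            {
                "filename": f.get("filename"),
                "size": f.get("size"),
                "sha256": (f.get("digests") or {}).get("sha256"),
                "packagetype": f.get("packagetype"),
            }
            for f in files
        ],
        "vulnerabilities": payload.get("vulnerabilities", []),
        "pypiUrl": f"https://pypi.org/project/{info.get('name')}/",
    }
```

Calling `map_record(fetch_package("requests"))` would produce one such record. Network errors and unknown package names (HTTP 404) are left unhandled here to keep the sketch short.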

| Target Audience | Primary Use Cases |
| --- | --- |
| Python developers, security teams, SBOM builders, ML researchers, package-discovery tools, OSS analytics | Python supply chain analysis, vulnerability tracking, SBOM generation, dependency-graph extraction, ecosystem health monitoring |

What the PyPI Python Package Scraper does

Five capabilities in a single run:

  • Direct lookup. One package per line, plain names.
  • Vulnerabilities included. Security advisories from the PyPI Security DB.
  • Release files. Wheel + sdist files for the latest version, with sizes and SHA-256.
  • License + license files. Standard license, license expression (PEP 639), license file paths.
  • Project URLs. Homepage, repo, bugs, docs, changelog, funding, all in one map.

Why it matters: clean records and fresh data on every run.


Full Demo

Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


โš™๏ธ Input

InputTypeDefaultBehavior
maxItemsinteger10Records to return. Free plan caps at 10, paid plan up to 1,000,000.
namesstring""Newline-separated package names (one per line).

Example: lookup popular ML packages.

{
  "maxItems": 10,
  "names": "numpy\npandas\nscikit-learn\ntensorflow\ntorch\ntransformers\nopenai\nlangchain"
}

Example: audit production deps.

{
  "maxItems": 20,
  "names": "requests\nflask\nfastapi\ndjango\nuvicorn\ngunicorn"
}
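
For a dependency audit, the `names` string can be derived from an existing requirements file. The sketch below is one possible approach: `names_from_requirements` and `build_input` are hypothetical helpers, and the regex keeps only the leading project name, dropping extras, version specifiers, and comments. Lines with pip options or URLs are skipped rather than parsed.

```python
import re

# Leading project name of a requirement line; extras, specifiers and markers are dropped.
_NAME = re.compile(r"^\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)")


def names_from_requirements(lines: list[str]) -> str:
    """Return newline-separated, de-duplicated package names in first-seen order."""
    seen: list[str] = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        # Skip blanks, pip options such as -r / -e, and URL or VCS requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = _NAME.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return "\n".join(seen)


def build_input(lines: list[str], max_items: int = 10) -> dict:
    """Assemble the Actor input object from requirement lines."""
    return {"maxItems": max_items, "names": names_from_requirements(lines)}
```

Passing the lines of a `requirements.txt` to `build_input` yields an object in the same shape as the examples above.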

Output

Each record contains 33+ fields. Download as CSV, Excel, JSON, or XML.

Schema

| Field | Type | Example |
| --- | --- | --- |
| name | string | "numpy" |
| version | string | "2.3.4" |
| summary | string | "Fundamental package for array computing in Python" |
| license | string | "BSD-3-Clause" |
| licenseExpression | string | "BSD-3-Clause" |
| authorName | string | "Travis E. Oliphant et al." |
| authorEmail | string | "" |
| homepage | string | "https://numpy.org" |
| repositoryUrl | string | "https://github.com/numpy/numpy" |
| fundingUrl | string | "https://numpy.org/about/" |
| classifiers | array | ["Development Status :: 5 - Production/Stable",...] |
| requiresDist | array | ["pytest >= 4.6"] |
| pythonRequires | string | ">=3.10" |
| yanked | boolean | false |
| totalReleases | number | 392 |
| releaseFileCount | number | 28 |
| releaseFiles | array of objects | [{"filename":"numpy-2.3.4-cp313-...whl","size":18724328,"sha256":"...","packagetype":"bdist_wheel"}] |
| vulnerabilities | array of objects | [{"id":"PYSEC-2014-...","summary":"...","fixedIn":["1.6.0"]}] |
| pypiUrl | string | "https://pypi.org/project/numpy/" |


Why choose this Actor

  • Vulnerabilities included. PyPI security advisories with id, summary, fixed versions, link to source.
  • Real release files. Wheel + sdist URLs with size + SHA-256, ready for SBOM.
  • Modern license fields. license_expression (PEP 639) and license_files alongside the legacy license field.
  • Funding URLs. Direct funding / sponsor link if the project provides one.
  • No API key. PyPI is open.

How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
| --- | --- | --- | --- | --- | --- |
| This Actor | $5 free credit | 500K+ packages | Live per run | Lookup | 2 min |
| PyPI direct API | Free | Same | Live | DIY | Code |
| Snyk Python Advisor | $$ | Same | Live | Yes | Account |
| pip search (deprecated) | - | - | - | - | - |

How to use

  1. Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. Open the Actor. Find the PyPI Python Package Scraper on the Apify Store.
  3. Set input. Enter package names and maxItems.
  4. Run it. Click Start.
  5. Download. Grab results in the Dataset tab as CSV, Excel, JSON, or XML.

Total time from signup to dataset: 3-5 minutes. No coding required.


Business use cases

Supply Chain

  • Python SBOM generation
  • License compliance
  • Integrity-hash verification
  • Vulnerability dashboards
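
Integrity-hash verification, for instance, comes down to hashing a downloaded artifact and comparing the digest with the `sha256` value of the matching `releaseFiles` entry. The sketch below assumes the entry shape shown in the schema; `sha256_matches` is a hypothetical helper name.

```python
import hashlib


def sha256_matches(data: bytes, release_file: dict) -> bool:
    """Return True when the bytes hash to the digest recorded for the release file."""
    expected = (release_file.get("sha256") or "").lower()
    if not expected:
        return False  # no digest recorded, so nothing can be verified
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected
```

A missing digest is treated as a failed check here, which is the conservative choice for SBOM or supply-chain tooling.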

ML + Data Science

  • Track new ML library releases
  • Pin reproducible env snapshots
  • Compare libraries
  • Discover new packages

Ecosystem Analytics

  • Top-PyPI rankings
  • License distribution
  • Author / org stats
  • Yank-rate tracking

Education + Research

  • Reproducible PyPI snapshots
  • Course material
  • Hobbyist exploration
  • Library comparison projects

Automating PyPI Python Package Scraper

Control the scraper programmatically:

  • Node.js. Install the apify-client NPM package.
  • Python. Use the apify-client PyPI package.
  • See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval.
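
A minimal Python sketch of a programmatic run follows. It assumes the `apify-client` package is installed, that the Actor ID matches the store URL (`parseforge/pypi-packages-scraper`), and that the API token is available in an `APIFY_TOKEN` environment variable (an assumed name). The helpers `build_run_input` and `run_actor` are hypothetical.

```python
import os


def build_run_input(names: list[str], max_items: int = 10) -> dict:
    """Build the Actor input: package names joined one per line."""
    return {"maxItems": max_items, "names": "\n".join(names)}


def run_actor(names: list[str], max_items: int = 10) -> list[dict]:
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported lazily so build_run_input works without the dependency installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var name
    run = client.actor("parseforge/pypi-packages-scraper").call(
        run_input=build_run_input(names, max_items)
    )
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `run_actor(["requests", "flask"], max_items=2)` would return the records for those two packages, billed according to the Actor's pricing.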


Beyond business use cases

Data like this powers more than commercial workflows.

Research and academia

  • Software-engineering studies
  • OSS health datasets
  • Network analysis on Python deps
  • Reproducible PyPI corpora

Personal and creative

  • Personal package dashboards
  • Curated PyPI lists
  • Side projects with metadata
  • Library-discovery sites

๐Ÿค Non-profit and civic

  • Free SBOM tools
  • OSS security awareness
  • Educational maps
  • Civic tech inventories

Experimentation

  • Train recommenders
  • Prototype security scanners
  • Build vulnerability bots
  • Test license tooling

โ“ Frequently Asked Questions

๐Ÿงฉ How does it work?

Provide a list of PyPI package names. The Actor calls the PyPI JSON detail endpoint for each, then maps the response to a clean record.

๐Ÿ“Š How many fields per record?

33 base, expanding when the project provides funding URL, vulnerabilities, license files, etc.

๐Ÿšจ How current are vulnerabilities?

Pulled live from PyPI's security advisory database (osv.dev backed).

๐Ÿ“ฆ Do you list every release file?

Yes for the latest version: wheel + sdist with filename, URL, size, SHA-256, and Python version target.

โš–๏ธ How is licensing represented?

Both the legacy license field and PEP 639 license_expression + license_files when present.

๐Ÿ Are pre-releases included?

Yes. Yanked status is exposed too.

๐Ÿ†“ Do I need an API key?

No. PyPI is open.

๐Ÿ” Can I schedule runs?

Yes. Schedule weekly to monitor security advisories.
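
With weekly runs, newly published advisories can be found by comparing two datasets. The sketch below is one way to do that, assuming both runs produced records with the `name` and `vulnerabilities` fields from the schema; `new_advisories` is a hypothetical helper.

```python
def new_advisories(previous: list[dict], current: list[dict]) -> dict[str, list[str]]:
    """Map package name to advisory IDs present in the current run but not the previous one."""

    def ids_by_package(records: list[dict]) -> dict[str, set[str]]:
        return {
            r.get("name"): {a.get("id") for a in (r.get("vulnerabilities") or [])}
            for r in records
        }

    before = ids_by_package(previous)
    after = ids_by_package(current)
    result = {}
    for name, ids in after.items():
        added = ids - before.get(name, set())
        if added:
            result[name] = sorted(added)
    return result
```

An empty result means no package gained an advisory between the two runs.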

โš–๏ธ Is this data free to use?

Yes. PyPI metadata is publicly available under the project's licensing.

๐Ÿ’ณ Do I need a paid Apify plan?

No. The free plan covers preview runs (10 records).


Integrate with any app

PyPI Python Package Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications
  • Airbyte - Pipe data into your warehouse
  • GitHub - Trigger runs from commits
  • Google Drive - Export datasets to Sheets

Recommended Actors

Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


โš ๏ธ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the Python Software Foundation, the PyPI maintainers, or any individual package author. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.

You might also like

PyPI (Python) Packages Scraper

gio21/pypi-packages-scraper

Scrape PyPI Python packages by name-search or top-downloads. Returns full metadata: name, version, summary, author, license, downloads, dependencies, project URLs, classifiers. Pay per package returned.

PyPI Package Stats Scraper - Downloads, Versions, Dependencies

seemuapps/pypi-package-stats-scraper

Get download counts, version history, dependencies, license, author, and classifiers for any Python package on PyPI. Bulk-process a list of packages in one run.

PyPI Scraper

crawlerbros/pypi-scraper

Scrape Python package metadata from PyPI: exact-name lookup, newly-added packages, and recently-updated packages. Pulls version, license, classifiers, dependencies, project URLs, and maintainer info.