Company ESG & Sustainability Data Extractor

Pricing

from $10.00 / 1,000 ESG extractions

Extract ESG and sustainability metrics, carbon commitments, and net-zero targets from public company sustainability pages. Structured JSON output for finance, research, and procurement teams.

Developer

Technical Dost Solutions

Maintained by Community

Last modified

3 days ago

What this Actor does

Extract ESG and sustainability metrics, carbon commitments, and net-zero targets from public company sustainability and ESG report web pages that you supply.

It processes user-provided public URLs, reads schema.org Organization JSON-LD for the company name, scans visible page text for ESG keywords grouped by metric category (carbon, energy, water, waste, diversity, governance), pairs those keywords with nearby numeric values and units, and optionally captures net-zero and reduction-target commitment sentences. It normalizes useful fields, deduplicates rows, and saves structured records to the Apify dataset.
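The keyword-and-value pairing described above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: the `KEYWORDS` table, the `VALUE_RE` pattern, the 60-character `window`, and the function name `extract_metrics` are all assumptions made for the example.

```python
import re

# Hypothetical keyword library: category -> metric labels. The Actor's real
# built-in library is not published in this README.
KEYWORDS = {
    "carbon": ["Scope 1 emissions", "Scope 2 emissions"],
    "energy": ["energy consumption"],
    "water": ["water withdrawal"],
}

# A number (with optional thousands separators or decimals), then an optional unit.
VALUE_RE = re.compile(
    r"(?P<value>\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?)\s*"
    r"(?P<unit>%|tCO2e|MWh|m3)?"
)


def extract_metrics(text, window=60):
    """Pair each keyword with the first numeric value found shortly after it."""
    rows, seen = [], set()
    for category, names in KEYWORDS.items():
        for name in names:
            for match in re.finditer(re.escape(name), text, re.IGNORECASE):
                # Only look at a short window of text following the keyword.
                nearby = text[match.end():match.end() + window]
                found = VALUE_RE.search(nearby)
                if not found:
                    continue
                value = float(found.group("value").replace(",", ""))
                key = (name, value, found.group("unit"))
                if key in seen:  # drop duplicate rows
                    continue
                seen.add(key)
                rows.append({
                    "metricCategory": category,
                    "metricName": name,
                    "metricValue": value,
                    "unit": found.group("unit"),
                })
    return rows
```

A proximity heuristic like this can pick up the wrong number when several figures sit close together, which is why text-derived rows carry a lower confidence score than structured ones.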

Why this Actor is useful

Sustainability analysts, investors, and procurement teams pay for this kind of extraction because it converts unstructured ESG narrative reports into clean, comparable datasets. It saves manual reading, creates repeatable monitoring, feeds spreadsheets, dashboards, or scoring models, and turns public ESG pages into API-ready data.

Who this is for

ESG and sustainability analysts
Investment and ESG research teams
Corporate sustainability and procurement teams
Data providers and ESG rating builders
Journalists and NGOs tracking corporate climate claims
B2B teams enriching company sustainability profiles

Common use cases

Build comparable ESG metric datasets across many companies
Track net-zero and carbon-neutral commitments and target years
Monitor reported Scope 1/2/3 emissions over time
Enrich company profiles with sustainability data points
Feed ESG scoring or screening models

Input

startUrls: Public URLs to extract from. Use only pages you are allowed to access without login or bypassing access controls.
keywords: Optional additional ESG or sustainability terms to match on top of the built-in keyword library.
includeCommitments: Capture net-zero, carbon-neutral, and reduction-target sentences as commitment rows with an extracted target year.
maxItems: Maximum number of rows to save.
maxConcurrency: Number of pages processed in parallel. The default is intentionally conservative.
requestTimeoutSecs: Maximum time to spend on a single page.
proxyConfiguration: Optional Apify proxy configuration. Use it only where your review of the source website's rules permits.
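To illustrate what includeCommitments produces, here is a rough sketch of how a commitment sentence and its target year might be captured. The trigger phrases, the sentence splitter, and the rule of taking the latest year in the sentence as the target are assumptions for this example; the Actor's real patterns are not documented here.

```python
import re

# Assumed trigger phrases for commitment sentences.
COMMITMENT_RE = re.compile(
    r"net[- ]zero|carbon[- ]neutral|reduc(?:e|ing|tion)", re.IGNORECASE
)
YEAR_RE = re.compile(r"\b(20\d{2})\b")


def extract_commitments(text):
    """Return one commitment row per sentence that contains a trigger phrase."""
    rows = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if not COMMITMENT_RE.search(sentence):
            continue
        years = [int(y) for y in YEAR_RE.findall(sentence)]
        rows.append({
            "metricCategory": "commitment",
            "commitmentText": sentence.strip(),
            # Assumption: the latest year mentioned is the target, so a
            # baseline year such as 2019 does not win over a 2030 target.
            "targetYear": max(years) if years else None,
            "extractionMethod": "commitment_text",
        })
    return rows
```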

Output

companyName: Company name when exposed in Organization structured data.
sourceUrl: URL of the page the data was extracted from.
metricCategory: Category such as carbon, energy, water, waste, diversity, governance, commitment, or other.
metricName: The matched metric label (for example, Scope 1 emissions).
metricValue: The numeric value found near the metric keyword.
unit: Detected unit such as %, tCO2e, MWh, or similar.
reportingYear: Reporting year detected in the same sentence when available.
targetYear: Target year detected for commitment rows.
commitmentText: The captured net-zero or reduction-target sentence.
framework: Reporting frameworks referenced on the page (GRI, SASB, TCFD, CDP, SDG).
extractedAt: Timestamp when this Actor extracted the row.
extractionMethod: structured_data, text_extraction, or commitment_text.
confidenceScore: Heuristic confidence score (structured 0.9, text-derived 0.6-0.8).
missingFields: Required fields (companyName, metricName, metricValue, reportingYear) not available from the source page.
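The missingFields behavior can be mirrored in downstream code with a short helper. The required field names come from the list above; the helper name `with_missing_fields` is hypothetical.

```python
REQUIRED_FIELDS = ("companyName", "metricName", "metricValue", "reportingYear")


def with_missing_fields(row):
    """Return a copy of the row with missingFields listing absent required fields."""
    # Compare against None rather than truthiness so a legitimate value of 0 counts as present.
    missing = [field for field in REQUIRED_FIELDS if row.get(field) is None]
    return {**row, "missingFields": missing}
```

For example, a commitment row with no numeric value and no reporting year would come back with `["metricValue", "reportingYear"]` in missingFields.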

Sample input

{
  "startUrls": [
    {
      "url": "https://example.com/"
    }
  ],
  "keywords": [],
  "includeCommitments": true,
  "maxItems": 50,
  "maxConcurrency": 3,
  "requestTimeoutSecs": 30
}

Sample output

{
  "companyName": "Example Manufacturing Group",
  "sourceUrl": "https://example.com/",
  "metricCategory": "carbon",
  "metricName": "Scope 1 emissions",
  "metricValue": 125000,
  "unit": "tCO2e",
  "reportingYear": 2024,
  "targetYear": null,
  "commitmentText": null,
  "framework": "GRI",
  "extractedAt": "2026-06-12T00:00:00.000Z",
  "extractionMethod": "structured_data",
  "confidenceScore": 0.9,
  "missingFields": []
}

How to use

Run this Actor on Apify with public URLs, export the dataset as JSON, CSV, Excel, or through the Apify API, then connect the output to Google Sheets, Make, Zapier, a webhook, your CRM, or an internal dashboard. For monitoring, save the input as an Apify task and schedule recurring runs.
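For API access, one possible approach is the Apify API's synchronous run endpoint, called here with only the Python standard library. The Actor ID below is inferred from this page's store URL and should be treated as an assumption, as should the helper names; the input fields mirror the sample input.

```python
import json
import urllib.request

# Assumed Actor ID, derived from the store URL (username~actor-name).
ACTOR_ID = "technicaldost~company-esg-sustainability-extractor"
API_URL = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"


def build_request(start_urls, token, max_items=50):
    """Build a POST request that starts a run and returns its dataset items."""
    run_input = {
        "startUrls": [{"url": url} for url in start_urls],
        "keywords": [],
        "includeCommitments": True,
        "maxItems": max_items,
        "maxConcurrency": 3,
        "requestTimeoutSecs": 30,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(start_urls, token):
    """Send the request and parse the JSON array of extracted rows."""
    with urllib.request.urlopen(build_request(start_urls, token)) as response:
        return json.load(response)

# Example call (requires a valid Apify API token and network access):
# rows = run_actor(["https://example.com/"], token="<YOUR_APIFY_TOKEN>")
```

The synchronous endpoint suits short runs; for larger URL lists, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.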

Pricing

This Actor uses a pay-per-event model: $0.01 per extraction. You pay only for the structured rows the Actor produces, which keeps costs predictable and tied directly to delivered data.
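As a quick worked example of the arithmetic, the sketch below estimates the extraction charge from a row count. It covers only the $0.01 per-extraction event stated above; any other platform charges are outside its scope, and the function name is hypothetical.

```python
from decimal import Decimal

PRICE_PER_ROW = Decimal("0.01")  # USD per extraction, as stated above


def estimate_cost(row_count):
    """Return the extraction charge in USD for a given number of rows."""
    return PRICE_PER_ROW * row_count
```

Because maxItems caps the number of saved rows, `estimate_cost(max_items)` gives an upper bound for a single run: the default sample input of 50 items costs at most $0.50, and 1,000 rows cost $10.00.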

Best practices

Start with a small set of reviewed public ESG and sustainability report URLs.
Prefer the main sustainability or ESG data pages rather than PDF download links.
Add domain-specific terms via keywords when a company uses non-standard metric names.
Keep includeCommitments enabled to capture net-zero and target language.
Keep maxConcurrency low for smaller websites.
Review source website rules before scheduling recurring runs.
Treat text-derived values as candidates for human review before downstream scoring.
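The last practice above can be applied with a small routing step before scoring. This sketch assumes rows shaped like the sample output; the function name and the 0.9 threshold are illustrative choices, not Actor settings.

```python
def split_for_review(rows, threshold=0.9):
    """Separate rows that can be used directly from rows needing human review."""
    accepted, review = [], []
    for row in rows:
        text_derived = row.get("extractionMethod") != "structured_data"
        low_confidence = row.get("confidenceScore", 0) < threshold
        # Anything text-derived or below the threshold goes to a reviewer.
        if text_derived or low_confidence:
            review.append(row)
        else:
            accepted.append(row)
    return accepted, review
```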

Compliance and responsible use

This Actor is for public data only. It must not be used to bypass logins, paywalls, CAPTCHAs, or security systems, collect private data, gather sensitive personal data, or support spam or abuse. You are responsible for following applicable laws and source website rules.

Limitations

Output quality depends on the public ESG content available on the source pages.
Text-derived extraction is heuristic. Numeric values and units are matched near keywords and may need human verification before use in scoring.
The Actor reads HTML pages and does not parse PDF reports.
Some fields may be empty when the source does not publish them, and they are reported in missingFields rather than inferred.
The Actor does not claim support for any specific third-party ESG platform.
Website markup and access policies can change.

Troubleshooting

Empty output usually means the page has no recognizable ESG keywords paired with numeric values.
Invalid URL errors mean one or more input URLs are malformed.
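Slow runs that stem from timeouts or throttling on the source website can usually be improved by lowering maxConcurrency.
Missing fields reflect source-data limitations. The Actor reports them in missingFields and does not infer values.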
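To catch malformed URLs before a run, a pre-check along these lines may help. It is a sketch using the standard library; the Actor's own validation rules are not documented here, so the checks (http or https scheme plus a host) are assumptions.

```python
from urllib.parse import urlparse


def invalid_urls(urls):
    """Return the input URLs that lack an http(s) scheme or a host."""
    bad = []
    for url in urls:
        try:
            parsed = urlparse(url.strip())
        except ValueError:  # raised for some malformed hosts
            bad.append(url)
            continue
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad.append(url)
    return bad
```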

Changelog

v0.2.0: Production-readiness pass with improved positioning, samples, schema descriptions, and responsible-use notes.
v0.1.0: Initial dry-run, factory-generated MVP.

URL: https://apify.com/technicaldost/company-esg-sustainability-extractor
