URL: https://apify.com/leadops_lab/dataset-quality-auditor

Apify Dataset QA Gate for Scraper Workflows · Apify


Pricing

$1.00 / 1,000 qa reports

Apify Dataset QA Gate

Pass, warn, or stop Apify datasets before CRM import, enrichment, client delivery, or webhook automation.

Rating: 0.0 (0)

Developer: jiaxun mao

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 0
  • Last modified: 2 months ago

This Actor is for teams that run scrapers repeatedly and need a quality gate before bad data flows into expensive or visible downstream steps.

Use it after a scraper and before:

  • CRM import
  • enrichment APIs
  • Google Sheets exports
  • client lead-list delivery
  • n8n, Make, Zapier, or Apify webhook workflows

Why use a QA gate?

Deduplication tools clean rows. Enrichment tools add data. This Actor answers the earlier question:

Should this dataset continue through the workflow at all?

It returns:

  • qaStatus: PASS, WARN, or FAIL
  • automationAction: continue, review, or stop
  • failed quality checks with actual vs expected values
  • CRM-ready record count and percentage
  • duplicate count and percentage
  • field coverage for company, domain, email, phone, location, and category
  • sample messy rows for review
  • sample clean rows that can continue downstream
  • recommendations for cleanup, enrichment, or scoring
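To make the percentages in that list concrete, here is a minimal local sketch of how field coverage and duplicate percentage could be computed over a list of records. It is not the Actor's implementation. The function names, the definition of "covered" (a non-empty value after trimming whitespace), and the duplicate rule (case-insensitive exact match on chosen key fields) are all assumptions made for illustration.

```python
from collections import Counter


def field_coverage(records, fields):
    """Percent of records with a non-empty value for each field (assumed definition)."""
    total = len(records)
    coverage = {}
    for field in fields:
        # Treat None, "", and whitespace-only strings as missing.
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        coverage[field] = round(100 * filled / total, 1) if total else 0.0
    return coverage


def duplicate_percent(records, key_fields):
    """Percent of records whose key repeats an earlier record's key (assumed definition)."""
    total = len(records)
    keys = Counter(
        tuple(str(r.get(f) or "").strip().lower() for f in key_fields)
        for r in records
    )
    # Every occurrence beyond the first of a key counts as one duplicate.
    duplicates = sum(count - 1 for count in keys.values())
    return round(100 * duplicates / total, 1) if total else 0.0


# Hypothetical sample rows, not real scraper output.
sample = [
    {"companyName": "Acme", "domain": "acme.com", "email": "hi@acme.com"},
    {"companyName": "Acme", "domain": "ACME.com", "email": ""},
    {"companyName": "Globex", "domain": "globex.com", "email": None},
    {"companyName": "Initech", "domain": "", "email": "info@initech.com"},
]

coverage = field_coverage(sample, ["companyName", "domain", "email"])
dup_pct = duplicate_percent(sample, ["domain"])
print(coverage)
print(dup_pct)
```

The Actor's own rules for what counts as covered, duplicated, or CRM-ready may differ, so treat its report as the authority and use a sketch like this only to sanity-check the numbers.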

Input options

Use either:

  • records: paste raw records as JSON.
  • sourceDatasetId: select an existing Apify Dataset ID.

Example:

{
  "sourceDatasetId": "YOUR_DATASET_ID",
  "requiredFields": ["companyName", "domain", "email", "phone", "location"],
  "passThresholds": {
    "minCrmReadyPercent": 80,
    "maxDuplicatePercent": 10,
    "minRequiredFieldCoveragePercent": 70
  },
  "maxRecords": 10000,
  "sampleSize": 25
}
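When the input is assembled in code rather than pasted by hand, a small builder can catch mistakes before a run is started. The sketch below is hypothetical: `build_qa_input` is not part of any Apify package, it assumes that exactly one of `records` or `sourceDatasetId` should be supplied, and it assumes the thresholds are percentages between 0 and 100. The field names and default values are taken from the example above.

```python
import json


def build_qa_input(source_dataset_id=None, records=None, required_fields=None,
                   min_crm_ready=80, max_duplicate=10, min_coverage=70,
                   max_records=10000, sample_size=25):
    """Build an input payload shaped like the example (hypothetical helper)."""
    # Assumption: the two sources are alternatives, so exactly one must be set.
    if (source_dataset_id is None) == (records is None):
        raise ValueError("Provide exactly one of source_dataset_id or records")

    thresholds = {
        "minCrmReadyPercent": min_crm_ready,
        "maxDuplicatePercent": max_duplicate,
        "minRequiredFieldCoveragePercent": min_coverage,
    }
    # Assumption: thresholds are percentages in the range 0 to 100.
    for name, value in thresholds.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")

    payload = {
        "requiredFields": list(required_fields or []),
        "passThresholds": thresholds,
        "maxRecords": max_records,
        "sampleSize": sample_size,
    }
    if source_dataset_id is not None:
        payload["sourceDatasetId"] = source_dataset_id
    else:
        payload["records"] = list(records)
    return payload


payload = build_qa_input(
    source_dataset_id="YOUR_DATASET_ID",
    required_fields=["companyName", "domain", "email", "phone", "location"],
)
print(json.dumps(payload, indent=2))
```

If you use the official `apify-client` Python package, a dict like this can be passed as the run input, for example `ApifyClient(token).actor("leadops_lab/dataset-quality-auditor").call(run_input=payload)`. The Actor ID there is inferred from the page URL.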

Automation workflow

  1. Run a lead, directory, product, review, or listing scraper.
  2. Send the scraper Dataset ID into this Actor.
  3. If automationAction is continue, send clean rows to CRM, Sheets, or enrichment.
  4. If automationAction is review, route the dataset to manual review.
  5. If automationAction is stop, block the workflow before wasting enrichment credits or importing bad data.
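Steps 3 to 5 amount to a three-way branch on `automationAction`. A minimal sketch of that branch follows, assuming the QA report has already been loaded as a dict with an `automationAction` key. The returned step names (`send_downstream`, `manual_review`, `block`) are hypothetical labels for whatever your workflow tool does next.

```python
def route_dataset(report):
    """Map the report's automationAction to a next step (hypothetical step names)."""
    action = str(report.get("automationAction", "")).strip().lower()
    if action == "continue":
        return "send_downstream"   # CRM, Sheets, or enrichment
    if action == "review":
        return "manual_review"     # a person checks the dataset first
    # "stop", a missing key, or any unexpected value blocks the workflow.
    return "block"


print(route_dataset({"qaStatus": "PASS", "automationAction": "continue"}))
print(route_dataset({"qaStatus": "WARN", "automationAction": "review"}))
print(route_dataset({"qaStatus": "FAIL", "automationAction": "stop"}))
```

Treating anything other than `continue` or `review` as a block is a deliberate fail-closed choice in this sketch: if the report is missing or malformed, the dataset does not reach paid enrichment or a CRM import.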

Lead workflow

For lead lists, run this Actor first. If the dataset passes, run Lead Intelligence Scorer to deduplicate, score, and prioritize the leads.

Recommended chain:

scraper -> QA Gate -> Lead Intelligence Scorer -> CRM/export

Best fit

  • agencies validating client lead-list deliverables
  • operators running scheduled Apify scrapers
  • founders sending scraper output into Sheets or a CRM
  • automation builders who need a simple pass/fail signal

You might also like

Apify Dataset QA Gate

zentrafoundry/apify-dataset-quality-auditor

Score Apify datasets and emit actionable quality issues before downstream use.

Dataset Quality Gate - Schema & Data QA

jy-labs/dataset-quality-gate

Validate Apify Datasets by pasted items, Dataset ID, or Run ID before delivery, automation, or AI/RAG ingestion. Catch schema drift, missing fields, duplicates, and bad URLs/emails/dates.

Job Hunt Automation

scrapyspider/job-hunt-automation

Aggregates and normalizes job listings from multiple Apify job scraper datasets. Deduplicates by URL and outputs clean, structured data ready for CRM import or further processing.

Dataset Result Gate

vittuhy/dataset-result-gate

Conditional pipeline gate. Fails if the previous actor's dataset is empty, succeeds if it has results, stopping unnecessary downstream runs before they start.

๐Ÿ‘ User avatar

Vรญt Tuhรฝ

1
