Prompt Injection Dataset Scanner

Pricing

Pay per event

Scan and sanitize dataset records before they enter LLM, RAG, or agent pipelines.

Rating

0.0

(0)

Developer

Zentra

Maintained by Community

Last modified

39 minutes ago

Who this is for

AI operations, security, governance, platform, and automation teams use this actor when they need focused prompt-injection scan results for dataset records instead of a broad generic scraper or manual checking.

Buyer outcomes

Inspect scanned dataset records before prompt-injection content creates avoidable cost, safety, or trust issues.
Prioritize review with policy decisions, risk levels, budget impact, trace evidence, and recommended actions.
Route blocked, approved, or review-required events into audit logs and operational workflows.

Sources monitored

Inputs

sourceMode: use sample for a safe smoke run, or configured modes for trace/tool-call inputs.
startUrls: optional public actor, policy, MCP manifest, or evidence URLs when the workflow uses URL-backed review.
sourceIds: approved policy, dataset, manifest, or trace source identifiers.
maxItems: bounded number of decisions, findings, or reports to emit.
watchlistTerms: policy names, tools, vendors, domains, or risk terms to prioritize.
webhookUrl: optional destination for review-required decisions or audit reports.
outputMode: use sample records for Store validation or production output for normal runs.
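The input fields above can be assembled into a run payload along the following lines. This is a minimal sketch, not the actor's own schema: the field names come from the list above, but the value types (an integer for maxItems, string lists for sourceIds and watchlistTerms) and the helper name build_input are assumptions made for illustration.

```python
from typing import Any, Optional


def build_input(
    max_items: int,
    source_ids: list[str],
    watchlist_terms: Optional[list[str]] = None,
    webhook_url: Optional[str] = None,
    source_mode: Optional[str] = None,
) -> dict[str, Any]:
    """Build a run input dict (hypothetical helper; value types are assumed)."""
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    # Fields that the sketch always sends.
    payload: dict[str, Any] = {"maxItems": max_items, "sourceIds": list(source_ids)}

    # Optional fields are sent only when the caller sets them.
    optional = {
        "watchlistTerms": watchlist_terms,
        "webhookUrl": webhook_url,
        "sourceMode": source_mode,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

Leaving unset fields out of the payload keeps the actor's own defaults in effect, which is usually the safer choice for a first bounded run.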

How it transforms the input

Input: agent trace, tool-call request, MCP manifest, actor metadata, policy rule, or run evidence.
Transformation: apply policy, risk, budget, permission, side-effect, or audit checks.
Output: allow/block/review decisions, matched policy, risk score, budget impact, reason, and recommended next action.
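The input, transformation, output flow above can be pictured with a toy rule-based scan. The actor's real rule set and thresholds are not documented on this page, so the patterns, scores, and cutoffs below are hypothetical; only the output field names (policyDecision, score, scoreReasons) are taken from the listing.

```python
import re

# Hypothetical patterns and scores, for illustration only.
PATTERNS = [
    (re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I), "instruction-override", 0.9),
    (re.compile(r"reveal (your )?(system )?prompt", re.I), "prompt-exfiltration", 0.7),
]


def scan_record(text: str, block_at: float = 0.8, review_at: float = 0.5) -> dict:
    """Return an allow/block/review decision for one text record."""
    matches = [(name, score) for pattern, name, score in PATTERNS if pattern.search(text)]
    # The record's score is the highest score among matched patterns.
    risk = max((score for _, score in matches), default=0.0)

    if risk >= block_at:
        decision = "block"
    elif risk >= review_at:
        decision = "review"
    else:
        decision = "allow"

    return {
        "policyDecision": decision,
        "score": risk,
        "scoreReasons": [name for name, _ in matches],
    }
```

Pattern matching alone misses paraphrased or encoded injections, so a sketch like this is useful for understanding the record shape rather than as a replacement for the actor's checks.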

Outputs

The actor returns structured AgentOps records for tool-call decisions, policy results, budget/cost signals, prompt-injection review, repair diagnosis, trace evidence, or audit reports.

Family-specific fields to expect:

agentGoal: What the agent or workflow was trying to accomplish.
toolCall: Requested tool name, arguments, and execution context.
policyDecision: Allow, block, review, or escalation decision.
riskLevel: Risk level assigned to the action or workflow.
budgetImpact: Estimated or observed cost impact.
sideEffectRisk: Potential external, write, payment, or account side-effect risk.
recommendedAction: Operational next step for the reviewer or automation.
auditEvidence: Trace, policy, manifest, or run evidence used in the decision.
recordId: Stable record ID for exports, dedupe, and downstream joins.
title: Human-readable record title for review and export.
sourceName: Source identifier used to trace where the record came from.
sourceUrl: Direct source URL for review and audit.
dedupeKey: Stable key used for delta mode and duplicate suppression.
retrievedAt: Timestamp showing when the actor retrieved or generated this record.
score: Normalized field for filtering, routing, or downstream review.
scoreReasons: Buyer-readable explanation for the score or match.
confidence: Normalized field for filtering, routing, or downstream review.
errors: Normalized field for filtering, routing, or downstream review.
runSummary: Run-level summary for counts, filters, charges, and next actions.
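One way to consume these records downstream is to suppress duplicates by dedupeKey and group the rest by policyDecision. The sketch below assumes records arrive as plain dicts and that policyDecision holds lowercase strings such as "allow", "block", or "review"; the exact values the actor emits are not documented here, and dedupe_and_route is a hypothetical helper name.

```python
from collections import defaultdict


def dedupe_and_route(records: list[dict]) -> dict[str, list[dict]]:
    """Drop repeated records, then group the remainder by decision."""
    seen: set = set()
    queues: dict[str, list[dict]] = defaultdict(list)

    for record in records:
        # Prefer dedupeKey; fall back to recordId when it is missing.
        key = record.get("dedupeKey") or record.get("recordId")
        if key is not None:
            if key in seen:
                continue
            seen.add(key)

        # Records without a decision go to review rather than being allowed.
        decision = str(record.get("policyDecision", "review")).lower()
        queues[decision].append(record)

    return dict(queues)
```

Defaulting undecided records to the review queue is a deliberate choice in this sketch: it fails closed instead of silently passing data into an LLM or RAG pipeline.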

Pricing

This actor uses Apify pay-per-event pricing. Current public listing guidance: $29-$49 / 1,000 launch validation records until public data proof is complete. Charges are tied to buyer-visible value events such as record-scanned, record-quarantined, quarantine-report, dataset-processed, record-saved, enriched-record. Small validation runs are supported so you can inspect output before scaling a schedule.

record-scanned: Charge after producing one scanned record. Typical price: $0.002.
record-quarantined: Charge after producing one quarantined risky record. Typical price: $0.020.
quarantine-report: Charge after producing one quarantine report. Typical price: $0.120.
dataset-processed: Base charge when Prompt Injection Dataset Scanner writes a non-empty default dataset. Typical price: $0.011.
first-run-cap: Recommended first run budget cap. Typical price: $2.000. Start with the default small run, inspect the dataset, then raise maxItems or schedule recurring runs.

For each charged event, a run that produces 10 matching records charges only for the matched buyer-value events and remains capped by the run limit.
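As a worked example of the arithmetic, the typical prices listed above can be combined into a rough run estimate. The event counts in the usage note are made up, and whether a quarantined record is also charged as a scanned record is an assumption of this sketch, not something the listing states.

```python
from decimal import Decimal

# Typical per-event prices in USD, copied from the pricing list above.
TYPICAL_PRICES = {
    "record-scanned": Decimal("0.002"),
    "record-quarantined": Decimal("0.020"),
    "quarantine-report": Decimal("0.120"),
    "dataset-processed": Decimal("0.011"),
}

# Recommended first-run budget cap from the pricing list.
FIRST_RUN_CAP = Decimal("2.000")


def estimate_cost(event_counts: dict[str, int]) -> Decimal:
    """Sum count * typical price over the events a run is expected to emit."""
    return sum(
        (TYPICAL_PRICES[event] * count for event, count in event_counts.items()),
        Decimal("0"),
    )


def within_first_run_cap(event_counts: dict[str, int]) -> bool:
    """Check the estimate against the recommended first-run cap."""
    return estimate_cost(event_counts) <= FIRST_RUN_CAP
```

For a hypothetical run with 10 scanned records, 2 quarantined records, 1 quarantine report, and 1 processed dataset, the estimate is 0.020 + 0.040 + 0.120 + 0.011 = $0.191, well under the $2.000 cap. Decimal is used so the totals do not pick up binary floating-point rounding.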

API example

curl -X POST "https://api.apify.com/v2/actors/zentrafoundry~zentra-prompt-injection-quarantine/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"maxItems":10,"sourceIds":["OWASP-LLM01","APIFY-DATASETS","APIFY-MCP"],"includeSourceUrls":true,"includeMatchReasons":true,"outputMode":"buyer-ready-records"}'
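The same request can be prepared from Python with only the standard library. This is a sketch that mirrors the curl call above (same URL, headers, and JSON body); the helper name build_run_request is hypothetical, and the block only builds the request object so it can be inspected before anything is sent.

```python
import json
import urllib.request

ACTOR_ID = "zentrafoundry~zentra-prompt-injection-quarantine"


def build_run_request(run_input: dict, token: str) -> urllib.request.Request:
    """Prepare a POST request that starts an actor run with the given input."""
    url = f"https://api.apify.com/v2/actors/{ACTOR_ID}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request (requires network access and a valid token):
#   import os
#   request = build_run_request({"maxItems": 10}, os.environ["APIFY_TOKEN"])
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```

Keeping request construction separate from sending makes it easy to log or review the exact payload before a paid run starts.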

Recommended first run

{
  "maxItems": 10,
  "sourceIds": [
    "OWASP-LLM01",
    "APIFY-DATASETS",
    "APIFY-MCP"
  ],
  "includeSourceUrls": true,
  "includeMatchReasons": true,
  "outputMode": "buyer-ready-records"
}

Sample output

Sample status: sample_unavailable at https://zentra.nimblique.studio/external/actor-review/samples/zentra-prompt-injection-quarantine.json. No fake sample is published; run a bounded real sample refresh before using examples in promotion.

Recommended public tasks

[
  {
    "name": "Review 10 agent/tool decisions",
    "description": "Low-cost validation run for checking policy, risk, cost, and action fields.",
    "input": {
      "maxItems": 10,
      "sourceIds": [
        "OWASP-LLM01",
        "APIFY-DATASETS",
        "APIFY-MCP"
      ],
      "includeSourceUrls": true,
      "includeMatchReasons": true,
      "outputMode": "buyer-ready-records",
      "actorSlug": "zentra-prompt-injection-quarantine"
    }
  },
  {
    "name": "Daily AgentOps decision review",
    "description": "Recurring review batch for tool-call risk, cost guardrails, and audit evidence.",
    "schedule": "Daily during local business hours",
    "input": {
      "maxItems": 25,
      "sourceIds": [
        "OWASP-LLM01",
        "APIFY-DATASETS",
        "APIFY-MCP"
      ],
      "includeSourceUrls": true,
      "includeMatchReasons": true,
      "outputMode": "buyer-ready-records",
      "actorSlug": "zentra-prompt-injection-quarantine"
    }
  }
]

Use cases

Review the scanner's decisions before high-risk tool calls execute.
Route policy violations, cost guardrails, and prompt-injection findings into audit logs or review queues.
Compare agent runs by risk, confidence, budget impact, and recommended next action.
Create customer-facing evidence for safer AI-agent operations.
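For the comparison use case, records can be ordered so the riskiest, most confident findings are reviewed first. The sketch below assumes riskLevel holds labels such as "low", "medium", "high", or "critical" and that confidence is numeric; neither format is confirmed by this page, and prioritize is a hypothetical helper.

```python
# Assumed ordering of risk labels; unknown labels sort last.
RISK_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def prioritize(records: list[dict]) -> list[dict]:
    """Sort records by risk (highest first), then by confidence (highest first)."""

    def sort_key(record: dict) -> tuple:
        risk = RISK_RANK.get(str(record.get("riskLevel", "")).lower(), -1)
        confidence = float(record.get("confidence") or 0.0)
        # Negate both values so that sorted() places larger ones first.
        return (-risk, -confidence)

    return sorted(records, key=sort_key)
```

Because sorted() is stable, records with equal risk and confidence keep their original order, which helps when comparing runs side by side.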

Trust and compliance

Uses OWASP LLM01, Apify datasets/storage, and the Apify MCP server.
Keeps source URLs and source identifiers in output records for auditability.
Does not require private credentials unless a source is explicitly configured for approved authenticated access.
AgentOps outputs should be logged and reviewed before enforcing high-impact production decisions.

Limitations

Results depend on public-source availability, source uptime, and source update cadence.
Public sources can revise records after publication; rerun scheduled tasks for fresh evidence.
Scores and match reasons are decision-support signals, not legal, financial, procurement, medical, safety, or regulatory advice.
Large production runs can cost more than the default smoke run; start small, inspect output, then scale schedules.

FAQ

Can I run this without URLs? Yes. The default sample mode is designed to succeed without user-supplied URLs, and URL-backed runs can use startUrls when needed.

Can I schedule it? Yes. Use sinceLastRun, watchlistTerms, and optional webhookUrl to turn the actor into a recurring alert or report workflow.

How do I verify value before scaling? Run the recommended first-run input, review the sample output fields, then increase maxItems or schedule recurring runs after the dataset matches your use case.

URL: https://apify.com/zentrafoundry/zentra-prompt-injection-quarantine

Scan Scraped Datasets for Prompt Injection Before RAG or AI · Apify
