AI Research Radar — compliant feed of new AI papers and news

Pricing

from $0.50 / 1,000 results

AI research feed of new ML papers and AI news from HuggingFace, Anthropic, Google, and The Decoder, delivered as structured JSON with robots.txt respected.

Rating

0.0

(0)

Developer

Connor Teskey

Maintained by Community

Last modified

9 days ago

AI Research Radar

New AI papers, lab announcements, and AI news from five permitted sources, delivered as one structured, schedule-ready feed.

Built for AI newsletter writers, research agents, and trend dashboards. Instead of hand-maintaining a scraper per site, you run one actor and get the latest items from HuggingFace papers and blog, the Anthropic and Google AI newsrooms, and The Decoder as uniform JSON records — ready to rank, summarize, alert on, or pipe into a RAG index.

What you get

Field	Meaning
`title`	Paper, post, or article headline
`url`	Canonical link on the source site
`category`	`papers`, `blog`, `labs`, or `news` — set per source
`source`	Source domain, e.g. `huggingface.co`
`fetched_at`	UTC timestamp of the run (ISO 8601)
`extraction`	Extractor version tag (`selector_free_v1`)

Quick start

{
  "sources": [
    {"url": "https://huggingface.co/papers", "category": "papers"},
    {"url": "https://huggingface.co/blog", "category": "blog"},
    {"url": "https://www.anthropic.com/news", "category": "labs"}
  ],
  "maxItemsPerSource": 25
}

This returns up to 75 fresh items (25 per source), typically in under a minute. Omit `sources` entirely to use the full five-source default set, which adds the Google AI blog and The Decoder.
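The Quick start input can also be assembled programmatically. The following is a minimal Python sketch that assumes only the two input fields shown above (`sources` and `maxItemsPerSource`); the helper names `build_input` and `max_items` are illustrative and not part of the actor.

```python
import json

# Size of the default set, per the text above ("full five-source default set").
DEFAULT_SOURCE_COUNT = 5


def build_input(sources, max_items_per_source):
    """Build an input object shaped like the Quick start example.

    `sources` is a list of (url, category) pairs. An empty list leaves the
    `sources` key out, which the text above says selects the default set.
    """
    run_input = {"maxItemsPerSource": max_items_per_source}
    if sources:
        run_input["sources"] = [
            {"url": url, "category": category} for url, category in sources
        ]
    return run_input


def max_items(run_input):
    """Upper bound on items a run can return for this input."""
    if "sources" in run_input:
        source_count = len(run_input["sources"])
    else:
        source_count = DEFAULT_SOURCE_COUNT
    return source_count * run_input["maxItemsPerSource"]


quick_start = build_input(
    [
        ("https://huggingface.co/papers", "papers"),
        ("https://huggingface.co/blog", "blog"),
        ("https://www.anthropic.com/news", "labs"),
    ],
    25,
)
print(json.dumps(quick_start, indent=2))
print(max_items(quick_start))  # 3 sources x 25 items = 75
```

The bound is a ceiling, not a promise: a source that is robots-blocked or empty contributes nothing.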

Output example

{
  "category": "papers",
  "title": "Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution",
  "url": "https://huggingface.co/papers/2606.10917",
  "source": "huggingface.co",
  "fetched_at": "2026-06-10T14:12:08.421337+00:00",
  "extraction": "selector_free_v1"
}
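For downstream code it can help to load each record into a typed structure. This is a sketch, assuming every record carries exactly the six fields listed under What you get; the class name `RadarItem` is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RadarItem:
    """One dataset record, mirroring the six documented fields."""

    category: str
    title: str
    url: str
    source: str
    fetched_at: datetime  # run timestamp, not the publish date
    extraction: str

    @classmethod
    def from_record(cls, record: dict) -> "RadarItem":
        # fetched_at is ISO 8601 with a UTC offset, which
        # datetime.fromisoformat parses into an aware datetime.
        return cls(
            category=record["category"],
            title=record["title"],
            url=record["url"],
            source=record["source"],
            fetched_at=datetime.fromisoformat(record["fetched_at"]),
            extraction=record["extraction"],
        )


record = {
    "category": "papers",
    "title": "Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution",
    "url": "https://huggingface.co/papers/2606.10917",
    "source": "huggingface.co",
    "fetched_at": "2026-06-10T14:12:08.421337+00:00",
    "extraction": "selector_free_v1",
}
item = RadarItem.from_record(record)
print(item.category, item.fetched_at.isoformat())
```

A missing field raises `KeyError`, which surfaces schema changes early instead of passing partial records along.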

Why this one

Selector-free extraction. Titles are pulled by link-text shape and URL structure rather than page-specific CSS selectors, so the site redesigns that break conventional scrapers do not break this one.
Layout drift is flagged, never hidden. A source that suddenly yields zero items is marked `zero_yield_check_layout` in the HEALTH report instead of quietly shrinking your feed.
Papers, labs, and press in one schema. The five default sources cover research papers, official lab announcements, and AI journalism, each record tagged with its category.
Bring your own sources. Pass any list of `{url, category}` pages; the same robots check, retry logic, and extraction apply to every source you add.
Fresh by design. Each run is a live snapshot of the source pages — schedule it hourly or daily and the radar stays current.
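To make the selector-free idea concrete, here is a rough Python sketch of a link-text-shape filter. It is not the actor's code: it assumes the thresholds stated under Honest limits (at least 4 words and 24 characters, same-domain links only), skips the URL-structure checks entirely, and uses hypothetical function names.

```python
from urllib.parse import urljoin, urlparse

MIN_WORDS = 4   # threshold quoted under "Honest limits"
MIN_CHARS = 24  # threshold quoted under "Honest limits"


def is_headline_shaped(text):
    """True when link text looks like a headline rather than navigation."""
    cleaned = " ".join(text.split())  # collapse runs of whitespace
    return len(cleaned.split()) >= MIN_WORDS and len(cleaned) >= MIN_CHARS


def same_domain(page_url, href):
    """True when href, resolved against the page, stays on the page's host."""
    target = urlparse(urljoin(page_url, href))
    return target.netloc == urlparse(page_url).netloc


def pick_candidates(page_url, links):
    """Filter (href, text) pairs down to same-domain, headline-shaped links."""
    seen = set()
    candidates = []
    for href, text in links:
        absolute = urljoin(page_url, href)
        if absolute in seen:
            continue
        if same_domain(page_url, href) and is_headline_shaped(text):
            seen.add(absolute)
            candidates.append({"title": " ".join(text.split()), "url": absolute})
    return candidates


links = [
    ("/login", "Sign in"),
    ("/papers/2606.10917",
     "Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution"),
    ("https://example.org/post", "An off-site article with a long headline"),
]
print(pick_candidates("https://huggingface.co/papers", links))
```

Because nothing here depends on class names or DOM position, a redesign that keeps headlines as same-domain links leaves the filter's output unchanged.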

Compliance and reliability

Topsail actors are built compliance-first and ship with self-healing plumbing:

robots.txt is always respected — fail-closed. If a robots check cannot complete, the source is skipped, never scraped. There is no input to turn this off.
Sources are public listing and newsroom pages — HuggingFace papers and blog, Anthropic news, the Google AI blog, and The Decoder — pages these publishers serve openly to every visitor, with no account, paywall, or personal data involved.
Transient failures retry once with backoff; persistent failures are reported, not hidden.
Every run writes a per-source HEALTH report to the key-value store, so you can see exactly which sources delivered and which were blocked, empty, or erroring.
No PII, no paywalled or login-gated content, no circumvention.
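The fail-closed rule above can be illustrated with Python's standard `urllib.robotparser`. This is a sketch of the policy, not the actor's implementation: the function name, the user agent string `example-radar-bot`, and the convention that `None` means "the robots check could not complete" are all assumptions made for the example.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def allowed_fail_closed(robots_txt: Optional[str], user_agent: str, url: str) -> bool:
    """Return True only when robots.txt was read and permits the URL.

    robots_txt is the file body, or None when it could not be retrieved
    (timeout, server error, and so on). None means skip, never scrape.
    """
    if robots_txt is None:
        return False  # fail closed: an incomplete check counts as a denial
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
agent = "example-radar-bot"  # hypothetical user agent

print(allowed_fail_closed(rules, agent, "https://example.com/news"))
print(allowed_fail_closed(rules, agent, "https://example.com/private/x"))
print(allowed_fail_closed(None, agent, "https://example.com/news"))
```

Keeping the fetch of robots.txt separate from the decision makes the policy easy to test without network access.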

Pricing

Pay per result: $0.50 per 1,000 dataset items — one item is one paper, post, or article. Sources that come back robots-blocked, erroring, or empty add nothing to the dataset and cost nothing — you pay only for delivered records. A typical default run of around 100 items costs about $0.05.
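The arithmetic behind these figures is simple enough to check in a few lines. The sketch below uses only the stated rate of $0.50 per 1,000 items; the hourly-schedule figure is an illustrative extrapolation that assumes 100 delivered items per run over a 30-day month.

```python
PRICE_PER_1000_ITEMS_USD = 0.50  # rate quoted above


def run_cost_usd(delivered_items: int) -> float:
    """Cost of a run, charged only for items that reach the dataset."""
    return delivered_items * PRICE_PER_1000_ITEMS_USD / 1000


# One typical default run of about 100 items.
print(f"{run_cost_usd(100):.2f}")  # 0.05

# Assumed schedule: hourly runs for 30 days at 100 items each.
monthly_items = 24 * 30 * 100
print(f"{run_cost_usd(monthly_items):.2f}")  # 36.00
```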

Honest limits

Titles and canonical links only: no abstracts, authors, publication dates, or article text. `fetched_at` is the run timestamp, not the publish date.
Extraction expects headline-shaped link text (at least 4 words and 24 characters), so very short titles can be missed and an occasional non-article link can slip through.
Only same-domain links are collected from each source page.
Pages that render their listings entirely with JavaScript yield zero items; the run flags them in HEALTH rather than failing.
No cross-run deduplication or diff detection: each run is a full snapshot. Dedupe by `url` downstream if you ingest continuously.
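Downstream deduplication by `url` can be as small as a set of seen links. A minimal Python sketch, assuming records shaped like the output example; the in-memory set stands in for whatever persistent store a real pipeline would use.

```python
def dedupe_by_url(items, seen_urls):
    """Return only items whose url has not been seen; update seen_urls in place."""
    fresh = []
    for item in items:
        url = item["url"]
        if url not in seen_urls:
            seen_urls.add(url)
            fresh.append(item)
    return fresh


seen = set()

# Two consecutive snapshots that overlap on one record.
run_1 = [
    {"title": "First example headline for the feed", "url": "https://example.com/a"},
    {"title": "Second example headline for the feed", "url": "https://example.com/b"},
]
run_2 = [
    {"title": "Second example headline for the feed", "url": "https://example.com/b"},
    {"title": "Third example headline for the feed", "url": "https://example.com/c"},
]

print(len(dedupe_by_url(run_1, seen)))  # 2
print(len(dedupe_by_url(run_2, seen)))  # 1
```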

FAQ

Can I use this as an ML papers API? Yes. Trigger runs on a schedule through the Apify API and read the dataset as JSON or CSV — a lightweight ML papers API without maintaining your own scraper.

How fresh is the AI research feed? Each run is a live snapshot of the source pages at run time. Schedule the actor hourly or daily to keep an always-current AI news feed.

Can I add my own sources? Yes. `sources` accepts any list of `{url, category}` pages. The robots check and selector-free extraction apply to every source you add; blog-style listing pages work best.

Does it return abstracts or full article text? No — titles and canonical links only. Pair it with Topsail's Site to Markdown actor when you need full LLM-ready page content.

What happens when a source site redesigns? Usually nothing: extraction keys on link-text shape and URL structure, not page-specific selectors. If a source still drops to zero items, the run flags it as `zero_yield_check_layout` in the HEALTH report.

More compliant data feeds from Topsail

Site to Markdown — any site to clean LLM-ready markdown
GTA 6 Countdown & Developments Tracker — countdown, confirmed facts, diffed developments, market odds
Commodity Intel — oil, gold, uranium headlines from permitted sources
Crypto News — BTC/ETH/DeFi headlines from major outlets

URL: https://apify.com/topsail/compliant-ai-research-radar
