
URL: https://apify.com/seralifatih/congress-trading-pipeline

🏛️ U.S. Senate Congress Trade Tracker



Pricing

from $1.50 / 1,000 transaction records


Track every U.S. Senate member stock trade automatically. Clean, structured data from official disclosures: senator name, ticker, trade date, amount. Perfect for investors following congress trades & quant researchers. No PDF parsing needed.


Rating: 0.0 (0)

Developer: Fatih İlhan (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 13
  • Monthly active users: 5
  • Last modified: 10 days ago

Congress Trading Pipeline: API

A senator buys $250k in defense stock the week before a major procurement vote. The filing lands quietly on the Senate EFD system.

This pipeline catches it as normalized, deduplicated, queryable JSON within hours of the official disclosure. Public-domain government data, no middleman.

Who uses this

  • Traders following Senate insiders: committee members moving before contract announcements, votes, and regulatory decisions
  • Algo traders who want structured JSON they can pipe directly into a strategy without manual CSV parsing
  • Data engineers who need a clean, deduplicated Senate trading feed with stable record IDs for joins and incremental loads
  • App developers who want a drop-in REST API: run it on Railway or Render, then point your frontend at /api/transactions

Why this instead of existing tools? Senate EFD data is public but awkward to consume. This pipeline normalizes raw filings into a consistent schema with stable IDs, deduplication, and a queryable REST API, so you build on top of the data, not around it.


Prerequisites

  • Node.js 18+
  • No external services, databases, or API keys required for the MVP

Setup

npm install
cp .env.example .env   # edit as needed; all vars have defaults
npm run dev

Server starts on http://localhost:3001.
On first boot the scheduler runs the pipeline immediately, then every 6 hours.

Environment variables

Variable         Default                Description
PORT             3001                   HTTP listen port
DB_PATH          ./data/pipeline.db     SQLite file path (created automatically)
LOG_LEVEL        info                   debug / info / warn / error
NODE_ENV         development            Set to production for JSON-lines log output
CRON_SCHEDULE    0 */6 * * *            node-cron schedule expression
FETCH_DAYS_BACK  90                     Rolling window of PTRs to fetch
CRON_SECRET      (empty)                Shared secret for /api/cron and /api/sync-committees
FRONTEND_ORIGIN  http://localhost:3000  Allowed CORS origin when running standalone
LAST_RUN_PATH    ./data/last_run.json   Persisted last-run stats file

Pipeline architecture

┌──────────┐   ┌──────────┐   ┌─────────────┐   ┌────────┐   ┌────────┐
│  Fetch   │──▶│  Parse   │──▶│  Transform  │──▶│ Dedup  │──▶│ Store  │
│          │   │          │   │             │   │        │   │        │
│ Senate   │   │ JSON     │   │ type        │   │ key:   │   │ SQLite │
│ EFD API  │   │ primary  │   │ amount      │   │ name + │   │ INSERT │
│ GET      │   │          │   │ dates       │   │ date + │   │ OR     │
│ 100/page │   │ HTML     │   │ owner       │   │ asset +│   │ IGNORE │
│          │   │ fallback │   │ ticker      │   │ amount │   │        │
└──────────┘   └──────────┘   └─────────────┘   └────────┘   └────────┘
                                                                 │
                                                                 ▼
                                                          ┌──────────────┐
                                                          │ Express API  │
                                                          │    :3001     │
                                                          └──────────────┘
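The Dedup and Store stages in the diagram can be sketched as follows. This is a minimal illustration in Python, not the pipeline's actual code (which runs on Node.js). Only the key fields (name, date, asset, amount) and the use of SQLite `INSERT OR IGNORE` come from the diagram; the SHA-256 hash, the `|` separator, the lowercase normalization, and the table layout are assumptions.

```python
import hashlib
import sqlite3


def record_id(name: str, date: str, asset: str, amount: str) -> str:
    """Derive a stable ID from the dedup key fields (hashing scheme is assumed)."""
    parts = [p.strip().lower() for p in (name, date, asset, amount)]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def store(conn: sqlite3.Connection, rows: list[dict]) -> int:
    """Insert rows, skipping any whose ID already exists; return the number inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transactions ("
        "id TEXT PRIMARY KEY, name TEXT, date TEXT, asset TEXT, amount TEXT)"
    )
    inserted = 0
    for r in rows:
        rid = record_id(r["name"], r["date"], r["asset"], r["amount"])
        cur = conn.execute(
            "INSERT OR IGNORE INTO transactions VALUES (?, ?, ?, ?, ?)",
            (rid, r["name"], r["date"], r["asset"], r["amount"]),
        )
        inserted += cur.rowcount  # 1 if inserted, 0 if ignored as a duplicate
    conn.commit()
    return inserted
```

Because the ID depends only on the key fields, re-running the pipeline over an overlapping fetch window inserts nothing new for filings already stored, which is what makes incremental loads safe.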

Source endpoint: GET https://efts.senate.gov/LATEST/search-index
Pagination: 100 records/page, loops until hits.total exhausted
Fallback: if JSON parse yields empty asset_name on all rows, re-parses raw HTML
Retry: 3 attempts with exponential backoff + ±25% jitter on all HTTP calls
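The retry policy can be sketched like this. It is an illustration under stated assumptions, not the pipeline's implementation: the 1-second base delay and the doubling factor are assumed, and `with_retry` and `backoff_delay` are hypothetical helper names. Only the 3 attempts and the 25% jitter come from the notes above.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, jitter: float = 0.25) -> float:
    """Delay before the retry that follows failed attempt `attempt` (0-based)."""
    nominal = base * (2 ** attempt)  # 1s, 2s, 4s, ... (base value is assumed)
    return nominal * random.uniform(1 - jitter, 1 + jitter)


def with_retry(call: Callable[[], T], attempts: int = 3,
               sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on any exception, for at most `attempts` total tries."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")
```

The jitter spreads retries out so that several clients hitting the same failing endpoint do not all retry at the same instant. Passing `sleep` as a parameter keeps the helper testable without real waiting.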


API reference

GET /health

$ curl http://localhost:3001/health
{
  "status": "ok",
  "db_count": 847,
  "last_run": "2026-04-29T14:23:00.000Z"
}

GET /api/refresh

Returns the timestamp of the most recently stored record. Called by the frontend on every page mount.

$ curl http://localhost:3001/api/refresh
{"lastUpdated":"2026-04-29T14:23:00.000Z"}

lastUpdated is null if no records exist yet.


POST /api/refresh

Triggers a full pipeline run. Called when the user clicks "Refresh Data" in the frontend.

$ curl -X POST http://localhost:3001/api/refresh
{"ok":true,"signals":14,"lastUpdated":"2026-04-29T14:23:00.000Z"}

On failure:

{"ok":false,"error":"Fetch failed: HTTP 503 Service Unavailable"}

GET /api/cron

Same pipeline run as POST /api/refresh, protected by CRON_SECRET. Called by an external scheduler (Cloudflare Worker, cron job, etc.).

curl -H "x-cron-secret: your-secret" http://localhost:3001/api/cron
# or
curl "http://localhost:3001/api/cron?secret=your-secret"
{
  "ok": true,
  "summary": {
    "ingested": 340,
    "newTrades": 14,
    "signalsGenerated": 14,
    "topScore": null,
    "topScoreTicker": null,
    "runAt": "2026-04-29T14:23:00.000Z"
  }
}

Returns 401 if the secret is missing or wrong.
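The secret check might look roughly like the sketch below. This is a guess at the logic, written in Python for illustration; the real server is an Express app and may behave differently. `cron_status` is a hypothetical name, the constant-time comparison is a suggested practice rather than something the README states, and rejecting every request when `CRON_SECRET` is empty is an assumption.

```python
import hmac
from typing import Mapping


def cron_status(headers: Mapping[str, str], query: Mapping[str, str],
                configured_secret: str) -> int:
    """Return 200 if the request carries the right secret, else 401."""
    # The secret may arrive in the x-cron-secret header or the ?secret= parameter.
    supplied = headers.get("x-cron-secret") or query.get("secret")
    if not configured_secret or not supplied:
        return 401  # assumption: an unset CRON_SECRET rejects every request
    # compare_digest avoids leaking how many leading characters matched.
    ok = hmac.compare_digest(supplied.encode(), configured_secret.encode())
    return 200 if ok else 401
```

Prefer the header form when the caller allows it, since query strings tend to end up in access logs.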


GET /api/sync-committees

Syncs congressional committee membership. Protected by CRON_SECRET. Run once on setup, then weekly.

$ curl -H "x-cron-secret: your-secret" http://localhost:3001/api/sync-committees
{"ok":true,"synced":0}

GET /api/transactions

Queryable read endpoint. Returns transactions serialized to match the frontend Signal field names.

# All recent transactions (default limit 500)
curl http://localhost:3001/api/transactions
# Filter by ticker
curl "http://localhost:3001/api/transactions?ticker=AAPL"
# Filter by politician (LIKE match, case-insensitive)
curl "http://localhost:3001/api/transactions?politician=Pelosi"
# Date range
curl "http://localhost:3001/api/transactions?date_from=2026-04-01&date_to=2026-04-30"
# Type + owner + pagination
curl "http://localhost:3001/api/transactions?type=buy&owner=joint&limit=50&offset=0"
{
  "count": 2,
  "data": [
    {
      "id": "a3f...c1",
      "filer_name": "Nancy Pelosi",
      "filer_type": "congress",
      "trade_type": "purchase",
      "ticker": "NVDA",
      "asset_name": "NVIDIA Corporation",
      "asset_type": "Stock",
      "amount_low": 1000001,
      "amount_high": 5000000,
      "amount_midpoint": 3000000,
      "trade_date": "2026-04-29",
      "filing_date": "2026-04-29",
      "owner": "joint",
      "is_active": true
    }
  ]
}
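The `amount_low`, `amount_high`, and `amount_midpoint` fields in the response can be derived from a disclosed dollar range along these lines. This is a sketch only: the input format (`"$1,000,001 - $5,000,000"`) and the floor-division rounding are assumptions chosen to reproduce the sample values above, and `parse_amount` is a hypothetical helper name.

```python
import re
from typing import Optional


def parse_amount(text: str) -> Optional[dict]:
    """Turn a disclosed range like '$1,001 - $15,000' into low/high/midpoint."""
    numbers = [int(n.replace(",", "")) for n in re.findall(r"\d[\d,]*", text)]
    if len(numbers) != 2 or numbers[0] > numbers[1]:
        return None  # open-ended or malformed ranges are left unresolved here
    low, high = numbers
    return {
        "amount_low": low,
        "amount_high": high,
        "amount_midpoint": (low + high) // 2,  # rounding rule is assumed
    }
```

Since disclosures report brackets rather than exact values, the midpoint is only a convenient sorting and aggregation proxy, not the real trade size.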

Query parameters:

Param       Type                           Description
politician  string                         Substring match (LIKE)
ticker      string                         Exact match, auto-uppercased
date_from   YYYY-MM-DD                     Inclusive lower bound on transaction_date
date_to     YYYY-MM-DD                     Inclusive upper bound on transaction_date
type        buy | sell                     Exact match
owner       self | joint | spouse | child  Exact match
limit       integer 1–1000                 Default 500
offset      integer ≥ 0                    Default 0

Invalid params return 400:

{"error":{"date_from":["Must be YYYY-MM-DD"]}}
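A client can mirror these rules before sending a request. The sketch below builds the query URL in Python and validates locally; it assumes the default `http://localhost:3001` address from Setup. `transactions_url` is a hypothetical helper, and every error message except the date one is invented for illustration, since the README only shows the `date_from` error.

```python
import re
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:3001"  # default address from the Setup section
_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def transactions_url(politician: Optional[str] = None, ticker: Optional[str] = None,
                     date_from: Optional[str] = None, date_to: Optional[str] = None,
                     trade_type: Optional[str] = None, owner: Optional[str] = None,
                     limit: int = 500, offset: int = 0) -> str:
    """Validate filters locally, then build the /api/transactions URL."""
    errors = {}
    for key, value in (("date_from", date_from), ("date_to", date_to)):
        if value is not None and not _DATE.fullmatch(value):
            errors[key] = ["Must be YYYY-MM-DD"]
    # The messages below are hypothetical; only the date message is documented.
    if trade_type is not None and trade_type not in ("buy", "sell"):
        errors["type"] = ["Must be buy or sell"]
    if owner is not None and owner not in ("self", "joint", "spouse", "child"):
        errors["owner"] = ["Must be self, joint, spouse, or child"]
    if not 1 <= limit <= 1000:
        errors["limit"] = ["Must be between 1 and 1000"]
    if offset < 0:
        errors["offset"] = ["Must be 0 or greater"]
    if errors:
        raise ValueError(errors)

    params = {
        "politician": politician,
        "ticker": ticker.upper() if ticker else None,  # server uppercases too
        "date_from": date_from,
        "date_to": date_to,
        "type": trade_type,
        "owner": owner,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/transactions?{query}"
```

For example, `transactions_url(ticker="aapl", limit=50)` yields a URL whose query string is `ticker=AAPL&limit=50&offset=0`, which can then be fetched with any HTTP client.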

GET /api/debug

Dev diagnostics. No auth. Returns DB count and 2 sample records.

$ curl http://localhost:3001/api/debug

Cron schedule

Default: 0 */6 * * * (every 6 hours).

Change it via the CRON_SCHEDULE env var, which accepts any valid node-cron expression.

CRON_SCHEDULE="0 */2 * * *" npm run dev   # every 2 hours
CRON_SCHEDULE="0 8 * * *" npm run dev     # once daily at 08:00

Last run stats (timestamp, inserted, skipped, errors) are persisted to ./data/last_run.json after each run.


Seeding and smoke test

Load 20 realistic fake records covering edge cases (null tickers, spouse/child owners, large amounts, same-day multi-trades, clusters):

$ npm run seed

Verify the running server responds correctly:

# Terminal 1
npm run dev
# Terminal 2
npm run smoke

Smoke test exits 0 on all pass, 1 on any failure.


Phase 2 roadmap

House of Representatives disclosures (efd.house.gov) use a different filing format and will be added after Senate coverage is stable. Planned additions: PDF parsing for older PTRs that lack structured data, ticker enrichment via OpenFIGI or a static CUSIP mapping table (resolving the ticker: null cases currently stored as-is), a scoring engine that ranks transactions by conviction signal (cluster detection, filing delay, filer track record), and Telegram/email alerts for high-score transactions. Multi-tenant auth (Supabase RLS + Paddle billing) is tracked separately under the SaaS roadmap.


Data source

All data is sourced from the U.S. Senate Electronic Financial Disclosures system, a public government database. Senate PTR filings are required under the STOCK Act and are public domain. This pipeline does not scrape third-party aggregators.
