Crawlio Browser
@Crawlio-app
Crawlio Agent
Documentation | API Reference | Chrome Extension
MCP server that gives AI full control of a live Chrome browser via CDP. 100 tools (93 browser + 3 extraction + 3 recording + 1 compiler) with framework-aware intelligence, typed evidence infrastructure, and confidence-tracked findings. It captures what static crawlers can't see.
Note: This repo supersedes crawlio-browser-mcp. All development now happens here.
When to use Crawlio Agent
Use Crawlio Agent when your AI needs to interact with a real browser: SPAs, authenticated pages, dynamic content, JS-rendered frameworks. Unlike headless browser tools, Crawlio Agent connects to your actual Chrome via a lightweight extension, giving the AI access to your logged-in sessions, cookies, and full browser state.
Crawlio Agent vs headless browser tools: Headless tools launch a separate browser process. Crawlio Agent connects to your existing Chrome, so there is no separate browser and no login flow, and the AI has full access to your tabs and sessions.
Quick Start
- Install the Chrome Extension
- Run the init wizard:
npx crawlio-browser init
That's it. The wizard auto-detects and configures 14 MCP clients: Claude Code, Cursor, VS Code, Codex, Gemini CLI, Claude Desktop, ChatGPT Desktop, Windsurf, Cline, Zed, Goose, OpenCode, MCPorter, and Cline CLI.
Init wizard options
npx crawlio-browser init # Default: code mode, stdio transport
npx crawlio-browser init --full # Full mode (100 individual tools)
npx crawlio-browser init --portal # Portal mode (persistent HTTP server)
npx crawlio-browser init --cloudflare # Add Cloudflare MCP (89 tools, no wrangler)
npx crawlio-browser init --dry-run # Show what would happen
npx crawlio-browser init --yes # Skip prompts (CI / scripted installs)
npx crawlio-browser init -a claude # Target specific MCP client
Transport Modes
| Mode | Command / URL | Protocol | Best For |
|---|---|---|---|
| stdio | npx crawlio-browser | JSON-RPC over stdin/stdout | Claude Desktop, Cursor, Windsurf (client manages process lifecycle) |
| Portal (HTTP) | POST http://127.0.0.1:3001/mcp | MCP Streamable HTTP | Claude Code, ChatGPT Desktop (server survives session restarts) |
| Portal (SSE) | GET /sse + POST /message | Server-Sent Events | Legacy clients needing SSE transport |
Portal mode is recommended for Claude Code because the server persists across context compaction and session restarts. On macOS, --portal installs a launchd agent for auto-start on login.
Manual setup (any client)
How It Works
AI Client (stdio/http) --> MCP Server (Node.js) --> Chrome Extension (MV3)
                           crawlio-browser          WebSocket -> CDP
The MCP server communicates with the Chrome extension via WebSocket. The extension controls the browser through Chrome DevTools Protocol (CDP).
Capabilities
Framework-Aware Intelligence
Every execute call probes the browser for framework signatures and injects a shape-shifting smart object with framework-native accessors: React state, Vue reactivity, Next.js routing, Shopify cart data. There are 17 framework namespaces across 4 tiers, detected at runtime and rebuilt on every navigation. The AI doesn't query a generic DOM; it queries the framework's own data structures.
Evidence-Based Analysis
Method Mode adds higher-order methods and a typed evidence system on top of Code Mode. smart.extractPage() runs 7 parallel operations in a single call: page capture, performance metrics, security state, font detection, meta extraction, accessibility audit, and mobile-readiness check. Failed operations produce typed CoverageGap records instead of silent nulls. Findings created with smart.finding() get their confidence automatically adjusted when supporting data is missing. The result: structured, auditable research output with gap tracking and confidence propagation.
Session Recording & Replay
Record browser interactions as structured data, then compile them into reusable SKILL.md automations. 12 interaction tools are automatically intercepted during recording (clicks, typing, navigation, scrolling), each capturing args, result, timing, and page URL. One compileRecording() call converts the session into a deterministic automation script.
Auto-Settling & Actionability
Every mutative action (click, type, navigate, select_option) runs actionability checks before acting: polling visibility, dimensions, enabled state, and overlay detection. After the action, a progressive backoff settle delay ([0, 20, 100, 100, 500]ms) waits for DOM mutations to quiesce. The AI doesn't need manual sleep() calls between actions.
Architecture: JIT Context Runtime
The JIT Context MCP Runtime is a layered execution architecture where each layer absorbs a category of complexity that would otherwise fall on the model. The model sees three tools and a clean SDK. Everything beneath that surface is the runtime absorbing reality.
               ┌───────────────────────────────────┐
               │          AI Model (LLM)           │
               │ Writes code, reads errors, loops  │
               └─────────────────┬─────────────────┘
                                 │ 3 tools: search, execute, connect_tab
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                     JIT Context MCP Runtime                     │
│                                                                 │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ METHOD MODE                                                 │ │
│ │ Behavioral protocol + higher-order methods                  │ │
│ │ scrollCapture · waitForIdle · extractPage · comparePages    │ │
│ │ detectTables · extractTable · waitForNetworkIdle ·          │ │
│ │ extractData                                                 │ │
│ │                                                             │ │
│ │ ↳ Absorbs: behavioral variance, ad-hoc composition,         │ │
│ │   inconsistent output shapes, data extraction patterns      │ │
│ ├─────────────────────────────────────────────────────────────┤ │
│ │ POLYMORPHIC CONTEXT                                         │ │
│ │ 17 framework namespaces, injected Just-In-Time              │ │
│ │ react · vue · angular · nextjs · shopify · ...              │ │
│ │                                                             │ │
│ │ ↳ Absorbs: framework opacity, minified code,                │ │
│ │   devtools hook complexity                                  │ │
│ ├─────────────────────────────────────────────────────────────┤ │
│ │ ACTIONABILITY ENGINE                                        │ │
│ │ 7 core smart methods with built-in resilience               │ │
│ │ click · type · navigate · waitFor · evaluate ·              │ │
│ │ snapshot · screenshot                                       │ │
│ │                                                             │ │
│ │ ↳ Absorbs: DOM timing, hydration delays, CSS animations,    │ │
│ │   disabled states, overlapping elements                     │ │
│ ├─────────────────────────────────────────────────────────────┤ │
│ │ TETHERED IPC BRIDGE                                         │ │
│ │ WebSocket → Chrome extension, message queue,                │ │
│ │ heartbeat, auto-reconnect, stale detection                  │ │
│ │                                                             │ │
│ │ ↳ Absorbs: connection drops, tab refreshes,                 │ │
│ │   port conflicts, extension lifecycle                       │ │
│ ├─────────────────────────────────────────────────────────────┤ │
│ │ 133 RAW COMMANDS (bridge.send)                              │ │
│ │ CDP-level browser control via Chrome extension              │ │
│ └─────────────────────────────────────────────────────────────┘ │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                                 ▼
               ┌───────────────────────────────────┐
               │        Live Chrome Browser        │
               │   Persistent session, real DOM,   │
               │   framework runtime, user state   │
               └───────────────────────────────────┘
What Each Layer Absorbs
| Layer | Without It | With It |
|---|---|---|
| Tethered IPC Bridge | Script crashes on tab refresh, pending commands lost on reconnect, port conflicts on startup | Resilient WebSocket with message queue (100-msg capacity), heartbeat stale detection (15s intervals), auto-reconnect with drain |
| Actionability Engine | click('#btn') fires before the button renders, during CSS transitions, or while an overlay covers it | Progressive polling (exists → has dimensions → visible → not disabled → not obscured) with [0, 20, 100, 100, 500]ms backoff |
| Polymorphic Context | Model sees minified <div> elements; reading React state requires knowing exact hook paths, renderer maps, and fiber root API | Runtime probes live JS environment, detects 17 frameworks, injects namespace methods (smart.react.getVersion(), smart.nextjs.getData()) |
| Method Mode | Model composes primitives ad-hoc: inconsistent scroll loops, missed edge cases, varying return shapes | 8 tested methods encode correct patterns; behavioral protocol constrains workflow |
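
The polling order in the Actionability Engine row can be sketched as an ordered list of checks that stops at the first failure. This is a minimal illustration, not Crawlio's implementation: the element fields (`width`, `height`, `visible`, `disabled`, `obscuredBy`) and the `probe` callback are hypothetical stand-ins for whatever the extension reports.

```javascript
// Ordered checks mirroring: exists → has dimensions → visible → not disabled → not obscured.
// Field names on `el` are assumptions made for this sketch.
const CHECKS = [
  ["exists", (el) => el !== null && el !== undefined],
  ["hasDimensions", (el) => el.width > 0 && el.height > 0],
  ["visible", (el) => el.visible === true],
  ["enabled", (el) => el.disabled !== true],
  ["unobscured", (el) => el.obscuredBy == null],
];

// Returns the name of the first failing check, or null when the element is actionable.
function firstFailingCheck(el) {
  for (const [name, passes] of CHECKS) {
    if (!passes(el)) return name;
  }
  return null;
}

// Re-probes the element after each backoff delay and stops at the first full pass.
// `probe` is a hypothetical async callback that returns the current element state.
async function waitUntilActionable(probe, delays = [0, 20, 100, 100, 500]) {
  let failure = "exists";
  for (const ms of delays) {
    await new Promise((resolve) => setTimeout(resolve, ms));
    failure = firstFailingCheck(await probe());
    if (failure === null) return { actionable: true, failure: null };
  }
  return { actionable: false, failure };
}
```

Reporting the name of the failing check, rather than a bare boolean, gives the model a structured reason it can act on when a click is refused.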
Execution Lifecycle
- Discovery: Model calls search("page capture performance") and gets documentation for relevant commands
- Framework Detection: Runtime probes the live DOM, detects active frameworks, constructs polymorphic smart object with appropriate namespaces
- Scope Assembly: Model's code is compiled into an async function with injected parameters: bridge (133 commands), crawlio (HTTP client), sleep, TIMEOUTS, smart (7 core + 8 higher-order + up to 17 framework namespaces), compileRecording
- Execution: Method Mode methods compose the lower layers: extractPage() fires 7 parallel bridge.send() calls; click() runs the actionability engine; react.getVersion() evaluates framework-specific expressions
- Error Recovery (Agentic REPL): On failure, the browser stays in the exact state that produced the error. The model reads the structured error, adjusts, and calls execute again. Framework cache persists; no re-detection unless URL changed
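
The Scope Assembly step can be pictured with the standard AsyncFunction constructor, which turns a source string into an async function whose parameters are the injected names. This is a sketch under assumptions: `compileInScope` and the stub `bridge` and `sleep` values are hypothetical, and the real runtime may assemble its scope differently. Note that the Function constructors are not a security boundary on their own.

```javascript
// AsyncFunction is not a global; it is reached through an async function's prototype.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Compiles `source` into an async function whose parameters are the keys of `scope`.
function compileInScope(source, scope) {
  const names = Object.keys(scope);
  const fn = new AsyncFunction(...names, source);
  return () => fn(...names.map((name) => scope[name]));
}

// Stand-ins for the injected values; the real bridge talks to the extension.
const scope = {
  sleep: (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  bridge: { send: async (command) => ({ ok: true, type: command.type }) },
};

// Model-written code sees `sleep` and `bridge` as if they were local variables.
const run = compileInScope(
  "await sleep(1); return bridge.send({ type: 'take_screenshot' });",
  scope
);
```

Calling `run()` returns a promise for whatever the model's code returns, which is how a `return screenshot;` at the end of an execute snippet becomes the tool result.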
Design Principles
- Absorb complexity downward: Every category of difficulty (connection management, DOM timing, framework detection, multi-step composition) is handled by the layer best equipped for it. The model only encounters the clean interface at the top.
- Shape the SDK to the target: The polymorphic context system detects what the page is and reshapes available methods to match. The model writes against a stable interface; the runtime adapts underneath.
- Preserve state across cycles: The tethered architecture means the model can fail, learn, and retry against the same live environment, transforming error handling from "restart from scratch" into "adjust and continue."
How It Compares
| Dimension | Standard MCP | Cloudflare Code Mode | JIT Context Runtime |
|---|---|---|---|
| Tools in context | 50-100+ schemas | 2 (search, execute) | 3 (search, execute, connect_tab) |
| Execution environment | N/A (tool calls) | V8 isolate (stateless) | Local async sandbox (stateful, tethered to live browser) |
| DOM access | Via individual tool calls | None | Live, persistent, framework-aware |
| Framework awareness | None | None | 17 namespaces, injected JIT |
| Action resilience | Model must handle timing | N/A (no DOM) | Built-in actionability polling + settle delays |
| Error recovery | Re-call individual tool | Re-create isolate | Re-execute against same live state (Agentic REPL) |
| Multi-step patterns | Model improvises | Model writes loops | 8 tested higher-order methods + behavioral protocol |
Read the full architecture guide →
Two Modes
Code Mode (3 tools, default)
Collapses 100 tools into 3 high-level tools with ~95% schema token reduction:
| Tool | Description |
|---|---|
| search | Discover available commands by keyword |
| execute | Run async JS with bridge, crawlio, smart, sleep, and compileRecording in scope |
| connect_tab | Connect to a browser tab |
// Navigate and screenshot
await bridge.send({ type: 'browser_navigate', url: 'https://example.com' }, 30000);
await sleep(2000);
const screenshot = await bridge.send({ type: 'take_screenshot' }, 10000);
return screenshot;
Full Mode (100 tools)
Every tool exposed directly to the LLM. Enable with --full:
npx crawlio-browser init --full
Smart Object
In Code Mode, the smart object provides framework-aware helpers with auto-waiting and actionability checks.
Core Methods
| Method | Description |
|---|---|
| smart.evaluate(expression) | Execute JS in the page via CDP |
| smart.click(selector, opts?) | Auto-waiting click with 500ms settle |
| smart.type(selector, text, opts?) | Auto-waiting type with 300ms settle |
| smart.navigate(url, opts?) | Navigate with 1000ms settle |
| smart.waitFor(selector, timeout?) | Poll until element is actionable |
| smart.snapshot() | Accessibility tree snapshot |
| smart.screenshot() | Full-page screenshot (base64 PNG) |
Higher-Order Methods
| Method | Description |
|---|---|
| smart.scrollCapture(opts?) | Scroll to bottom, capturing screenshots at each position. Handles stuck-scroll detection, bottom detection, section capping, and scroll reset. |
| smart.waitForIdle(timeout?) | MutationObserver-based idle detection; waits for a 500ms quiet window. Timeout hard-capped at 15s. Replaces blind sleep() calls. |
| smart.extractPage(opts?) | 7 parallel operations in one call: page capture, performance, security, fonts, meta, accessibility, mobile-readiness. Returns typed PageEvidence with CoverageGap[] for anything that failed. |
| smart.comparePages(urlA, urlB) | Navigates to both URLs, runs extractPage() on each, returns a ComparisonScaffold with 11 dimensions, shared/missing fields, and comparable metrics. |
Typed Evidence
Methods for structured analysis findings with confidence propagation:
| Method | Description |
|---|---|
| smart.finding(data) | Create a validated Finding with claim, evidence, sourceUrl, confidence, and method. Rejects malformed input with specific errors. |
| smart.findings() | Get all session-accumulated findings (returns a copy) |
| smart.clearFindings() | Reset session findings and coverage gaps |
When a finding's dimension matches an active coverage gap, confidence is automatically capped:
| Input Confidence | Active Gap | Output |
|---|---|---|
| high | reducesConfidence: true | medium + confidenceCapped: true |
| medium | reducesConfidence: true | low + confidenceCapped: true |
| low | any | low (floor) |
| any | no matching gap | unchanged |
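
The capping table reads as a small pure function. The sketch below is an illustration of the rule, not the library's code: `capConfidence` is a hypothetical name, and the gap and finding shapes are reduced to the fields the table mentions.

```javascript
// Confidence levels in ascending order; capping moves one step toward "low".
const LEVELS = ["low", "medium", "high"];

// Applies the table: a matching gap with reducesConfidence lowers confidence
// by one level and marks the finding; "low" is the floor; otherwise unchanged.
function capConfidence(finding, gaps) {
  const hasGap =
    finding.dimension !== undefined &&
    gaps.some(
      (gap) => gap.dimension === finding.dimension && gap.reducesConfidence === true
    );
  const index = LEVELS.indexOf(finding.confidence);
  if (!hasGap || index <= 0) return { ...finding };
  return { ...finding, confidence: LEVELS[index - 1], confidenceCapped: true };
}

const gaps = [{ dimension: "performance", reducesConfidence: true }];
const capped = capConfidence({ confidence: "high", dimension: "performance" }, gaps);
// capped.confidence is "medium" and capped.confidenceCapped is true
```

A finding whose dimension has no matching gap passes through untouched, so the cap only applies where the supporting data is actually missing.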
Framework Namespaces
When a framework is detected, the smart object exposes framework-specific helpers:
Method Mode
Method Mode is a domain layer built on top of Code Mode. It adds higher-order methods, a typed evidence system, and a behavioral protocol to the execute sandbox without changing the tool surface. The model still sees three tools. The same smart object. The same 133-command catalog underneath. What changes is what happens inside execute.
The Maturity Ladder
| Layer | Optimizes For | Behavioral Variance | Evidence Quality |
|---|---|---|---|
| Raw MCP (100 tools) | Completeness | High: flat tool list, no composition guidance | None: unstructured text |
| Code Mode (3 tools) | Token efficiency | Medium: right primitives, ad-hoc composition | None: model-defined shapes |
| Method Mode v1 (+ 8 methods + protocol) | Consistency | Low: proper methods, protocol constraints | Convention: { finding, evidence, url } |
| Method Mode v2 (+ typed evidence + gaps + confidence) | Correctness | Minimal: typed schemas, tool-enforced findings | Structural: typed records, gap tracking, confidence propagation |
Architecture
┌────────────────────────────────────────────────────────────┐
│                      execute sandbox                       │
│                                                            │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Behavioral Protocol (web-research skill)               │ │
│ │ Acquire → Normalize → Analyze                          │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Evidence Infrastructure                                │ │
│ │ finding() · findings() · clearFindings()               │ │
│ │ Typed records · Coverage gaps · Confidence prop.       │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Higher-Order Methods [8]                               │ │
│ │ scrollCapture · waitForIdle · extractPage ·            │ │
│ │ comparePages · detectTables · extractTable ·           │ │
│ │ waitForNetworkIdle · extractData                       │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Smart Core [7 methods]                                 │ │
│ │ evaluate · click · type · navigate · waitFor ·         │ │
│ │ snapshot · screenshot                                  │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Framework Namespaces [up to 17, injected JIT]          │ │
│ │ react · vue · angular · nextjs · shopify · ...         │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ bridge.send(): 133 raw commands                        │ │
│ └────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────┘
Each layer up encodes more domain knowledge. bridge.send({ type: "capture_page" }) captures a page. smart.extractPage() captures a page AND runs performance metrics, security state, font detection, meta extraction, accessibility analysis, and mobile-readiness checks in parallel: seven operations, one call, graceful failure on supplementary data, typed gaps for anything that fails.
Evidence Infrastructure
Coverage Gaps: When supplementary operations in extractPage() fail, they don't silently return null. A typed gap is recorded with the dimension, reason, impact, and whether it reduces confidence on related findings:
// Example gap from a failed performance metrics call
{ dimension: "performance", reason: "CDP domain disabled", impact: "method-failed", reducesConfidence: true }
Tool-Enforced Findings: smart.finding() validates every field at the tool level. The model cannot produce a finding without meeting the schema; it either returns a valid Finding or gets a clear error. Findings accumulate across execute calls within a session via smart.findings().
Session Aggregation: Findings and coverage gaps persist across execute calls. A model can make findings across multiple calls, then retrieve the full set with smart.findings(). Reset with smart.clearFindings().
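
To make the validate-then-accumulate behavior concrete, here is a minimal in-memory sketch. It is not the library's code: `createFindingStore` is a hypothetical name, throwing a TypeError is an assumption about how "a clear error" surfaces, and the non-empty evidence rule is an assumption as well.

```javascript
// Required fields as listed for smart.finding(); `dimension` is optional.
const REQUIRED = ["claim", "evidence", "sourceUrl", "confidence", "method"];
const CONFIDENCE = new Set(["low", "medium", "high"]);

function createFindingStore() {
  const findings = [];
  return {
    // Validates every field before the record is accepted (assumed to throw on failure).
    finding(data) {
      const missing = REQUIRED.filter((key) => data == null || data[key] === undefined);
      if (missing.length > 0) {
        throw new TypeError(`finding: missing field(s): ${missing.join(", ")}`);
      }
      if (!Array.isArray(data.evidence) || data.evidence.length === 0) {
        throw new TypeError("finding: evidence must be a non-empty array");
      }
      if (!CONFIDENCE.has(data.confidence)) {
        throw new TypeError("finding: confidence must be low, medium, or high");
      }
      const record = { ...data };
      findings.push(record);
      return record;
    },
    // Returns a copy so callers cannot mutate the session list.
    findings() {
      return findings.slice();
    },
    clearFindings() {
      findings.length = 0;
    },
  };
}
```

Because the store lives outside any single call, findings made in one execute call remain visible in the next until `clearFindings()` is invoked.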
End-to-End Example: Competitive Audit
// 1. Extract and compare both sites (scaffold + gaps included)
const comparison = await smart.comparePages(
  'https://acme.com',
  'https://rival.com'
);
// 2. Make findings; confidence auto-adjusts based on data availability
smart.finding({
  claim: 'Rival loads 2.3x faster on Largest Contentful Paint',
  evidence: [
    `Acme LCP: ${comparison.siteA.performance?.webVitals?.lcp}ms`,
    `Rival LCP: ${comparison.siteB.performance?.webVitals?.lcp}ms`,
  ],
  sourceUrl: 'https://acme.com',
  confidence: 'high',
  method: 'comparePages + extractPage performance metrics',
  dimension: 'performance', // if perf data failed, confidence caps to "medium"
});
smart.finding({
  claim: 'Acme has 12 images without alt text; Rival has 0',
  evidence: [
    `Acme imagesWithoutAlt: ${comparison.siteA.accessibility?.imagesWithoutAlt}`,
    `Rival imagesWithoutAlt: ${comparison.siteB.accessibility?.imagesWithoutAlt}`,
  ],
  sourceUrl: 'https://acme.com',
  confidence: 'high',
  method: 'comparePages + extractPage accessibility summary',
  dimension: 'accessibility',
});
// 3. Capture visual evidence
await smart.navigate('https://acme.com');
await smart.waitForIdle();
const acmeVisuals = await smart.scrollCapture({ maxSections: 5 });
// 4. Return accumulated session findings + visual evidence
return {
  findings: smart.findings(),
  scaffold: comparison.scaffold,
  gaps: { acme: comparison.siteA.gaps, rival: comparison.siteB.gaps },
  visualEvidence: { acme: acmeVisuals.sectionCount + ' sections captured' },
};
Examples
Navigate, extract, and analyze
// Connect to active tab, extract structured page evidence
const page = await smart.extractPage();
const finding = smart.finding({
  claim: `Site uses ${page.capture.framework?.name || 'no detected framework'}`,
  evidence: [`Framework: ${JSON.stringify(page.capture.framework)}`],
  sourceUrl: page.meta?.canonical || 'active tab',
  confidence: 'high',
  method: 'extractPage framework detection',
});
return { page: page.meta, finding };
Mobile emulation + screenshot
// Emulate iPhone and capture
await bridge.send({ type: 'emulate_device', device: 'iPhone 14' }, 10000);
await smart.navigate('https://example.com');
await smart.waitForIdle();
const screenshot = await smart.screenshot();
return screenshot;
Record and compile automation
// Record a browser session, then compile to reusable skill
await bridge.send({ type: 'start_recording' }, 10000);
await smart.navigate('https://example.com');
await smart.click('button.submit');
await smart.type('#email', 'test@example.com');
const session = await bridge.send({ type: 'stop_recording' }, 10000);
return compileRecording(session.session, 'signup-flow');
Intercept and mock network
// Block analytics, mock API response
await bridge.send({
  type: 'browser_intercept',
  pattern: '*analytics*',
  action: 'block'
}, 10000);
await bridge.send({
  type: 'browser_intercept',
  pattern: '*/api/user',
  action: 'mock',
  body: JSON.stringify({ name: 'Test User' }),
  statusCode: 200
}, 10000);
await smart.navigate('https://example.com');
return await smart.snapshot();
Session Recording
Record browser sessions as structured data, then compile them into reusable automation skills. 12 interaction tools are automatically intercepted during recording (click, type, navigate, scroll, etc.), capturing args, result, timing, and page URL.
// In code mode: record, interact, compile
await bridge.send({ type: 'start_recording' }, 10000);
// ... interact with the page ...
const session = await bridge.send({ type: 'stop_recording' }, 10000);
const skill = compileRecording(session.session, 'my-automation');
return skill;
In full mode, recording is available as 4 individual tools: start_recording, stop_recording, get_recording_status, and compile_recording.
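
Interception during recording can be thought of as a wrapper around each interaction tool that logs args, result, timing, and page URL while a session is active. The sketch below is illustrative only: `createRecorder`, its entry field names, and the `getPageUrl` callback are hypothetical, and the real session format consumed by compileRecording may differ.

```javascript
// `getPageUrl` is a hypothetical callback returning the current tab URL.
function createRecorder(getPageUrl, now = Date.now) {
  let entries = null; // null means no recording is in progress
  return {
    start() {
      entries = [];
    },
    stop() {
      const session = { entries: entries ?? [] };
      entries = null;
      return session;
    },
    // Wraps a tool so each call is logged while a recording is active.
    wrap(name, tool) {
      return async (args) => {
        const startedAt = now();
        const result = await tool(args);
        if (entries !== null) {
          entries.push({
            tool: name,
            args,
            result,
            durationMs: now() - startedAt,
            pageUrl: getPageUrl(),
          });
        }
        return result;
      };
    },
  };
}
```

Wrapped tools behave identically when no recording is running, which is why interception can be left in place permanently.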
Auto-Settling
Mutative tools (browser_click, browser_type, browser_navigate, browser_select_option) use actionability checks:
- Pre-flight: Polls element visibility, stability, and enabled state before acting
- Action: Dispatches the CDP command
- Post-settle: Waits for DOM mutations to quiesce with progressive backoff [0, 20, 100, 100, 500]ms
This means the AI doesn't need to manually add sleep() or waitFor() calls between actions, because the tools handle SPA rendering delays automatically.
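
One way to picture the post-settle step is a loop over the backoff schedule that stops as soon as no new DOM mutations were seen during a delay. This is a sketch under assumptions: `settle` and `readMutationCount` are hypothetical names, and comparing a running mutation counter between delays is only one possible definition of "quiesced".

```javascript
const SETTLE_DELAYS_MS = [0, 20, 100, 100, 500];

// `readMutationCount` is a hypothetical callback returning how many DOM
// mutations have been observed so far (for example, fed by a MutationObserver).
async function settle(readMutationCount, delays = SETTLE_DELAYS_MS) {
  let previous = await readMutationCount();
  let waitedMs = 0;
  for (const ms of delays) {
    await new Promise((resolve) => setTimeout(resolve, ms));
    waitedMs += ms;
    const current = await readMutationCount();
    // No new mutations during this delay: treat the page as settled.
    if (current === previous) return { settled: true, waitedMs };
    previous = current;
  }
  return { settled: false, waitedMs };
}
```

On a static page this returns after the first 0ms tick, while a page that keeps mutating waits at most 720ms in total (the sum of the schedule) before giving up.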
Framework Detection
Detects 64 technologies across 4 tiers using globals, DOM markers, meta tags, HTTP headers, and script URLs:
| Tier | Frameworks | Signal Strength |
|---|---|---|
| Meta-frameworks | Next.js, Nuxt, SvelteKit, Remix, Gatsby | Unique globals + parent detection |
| Core | React, Vue.js, Angular, Svelte, Astro, Qwik, SolidJS, Lit, Preact | Globals + DOM markers |
| CMS & Platforms | WordPress, Shopify, Webflow, Squarespace, Wix, Drupal, Magento, Ghost, Bubble | Meta tags + globals |
| Libraries & Tools | jQuery, Bootstrap, Tailwind CSS, Alpine.js, HTMX, Turbo, Stencil, Redux, Ember.js, Backbone.js | DOM + globals |
Multi-framework detection returns a primary framework (meta-framework takes priority) plus a subFrameworks array for the full stack.
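
The primary/subFrameworks split can be sketched as a stable sort by tier. Only "meta-framework takes priority" comes from the description above; the ordering of the remaining tiers, the tier labels, and the `summarizeDetections` name are assumptions made for illustration.

```javascript
// Assumed tier order; only the meta-framework-first rule is documented.
const TIER_PRIORITY = ["meta-framework", "core", "cms", "library"];

// Picks the highest-priority detection as primary and lists the rest as subFrameworks.
function summarizeDetections(detections) {
  if (detections.length === 0) return { primary: null, subFrameworks: [] };
  const ranked = [...detections].sort(
    (a, b) => TIER_PRIORITY.indexOf(a.tier) - TIER_PRIORITY.indexOf(b.tier)
  );
  const [primary, ...rest] = ranked;
  return { primary: primary.name, subFrameworks: rest.map((d) => d.name) };
}

const stack = summarizeDetections([
  { name: "React", tier: "core" },
  { name: "Next.js", tier: "meta-framework" },
  { name: "Tailwind CSS", tier: "library" },
]);
// stack.primary is "Next.js"; stack.subFrameworks is ["React", "Tailwind CSS"]
```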
Tools Reference
Requirements
- Node.js >= 18
- Chrome (or Chromium) with the Crawlio Agent extension installed
- Crawlio.app (optional): for site crawling and enrichment
Resources
- Documentation
- API Reference
- Product Page
- Chrome Extension
- npm Package
- Changelog
- Previous repo: this project supersedes crawlio-browser-mcp
License
MIT
Server Config
{
  "mcpServers": {
    "crawlio-browser": {
      "command": "npx",
      "args": [
        "-y",
        "crawlio-browser"
      ]
    }
  }
}