URL: https://alterlab.io/docs/guides/authenticated-scraping


Authenticated Scraping (BYOS)

Bring Your Own Session: automate tasks you already do manually in your browser — checking member pricing, pulling dashboard data, monitoring your own orders and subscriptions.

Prerequisites

You need an AlterLab account and API key. Authenticated scraping works on all plans — no upgrade required. You also need a valid login on the target site whose cookies you want to use.

Why Authenticated Scraping?

If you already log into a site every day to check your orders, pull pricing, or read your dashboard — BYOS lets you automate that. You supply the session cookies from your own browser login, and AlterLab fetches the same pages you'd see manually, on your schedule, without you having to do it by hand.

Member Pricing

Amazon Prime, Walmart+, Costco, Tesco Clubcard — see the real prices that logged-in members pay, not the inflated public price.

Paywalled Content

News sites, research portals, and SaaS dashboards that gate content behind a subscription or login wall.

Private Communities

Reddit private subreddits, Discord servers, and Discourse forums that require authentication to view threads.

Personalized Data

Order history, account dashboards, saved lists, and other content that only appears for the logged-in user.

How It Works

1. Log in to the target site in your browser

Use Chrome, Firefox, or any browser. Sign in normally with your credentials.

2. Capture your session cookies

Open DevTools → Application → Cookies, or use the AlterLab dashboard to paste them.

3. Store them in AlterLab (or pass inline)

Create a session via the API/SDK/dashboard, or pass cookies directly in the scrape request.

4. Scrape with authentication

AlterLab injects your cookies into every request to the target domain, across all scraping tiers. The response contains the authenticated page content.

Session vs Inline Cookies

Stored sessions are ideal for recurring scrapes — create once, reuse across many requests. AlterLab tracks health, validates expiry, and alerts you when cookies need refreshing.

Inline cookies are best for one-off requests or when you rotate cookies externally. They are not stored and cannot be managed or monitored.

Legal Considerations

Your Responsibility

Authenticated scraping uses your credentials to access your account on the target site. You are responsible for ensuring your usage complies with the target site's Terms of Service. AlterLab does not provide legal advice.

Do: Scrape data you have legitimate access to (your own orders, your own subscriptions, content behind a paywall you are paying for).

Do not: Share stolen credentials, scrape accounts you do not own, or circumvent security measures in ways that violate the CFAA or equivalent laws in your jurisdiction.

Remember: AlterLab encrypts and isolates your cookies, but the legal responsibility for how you use authenticated access rests with you.

Quick Start

Method 1: Capture Cookies from Chrome DevTools

Step 1: Open the target site and log in

Navigate to the site (e.g. amazon.com) and sign in with your account.

Step 2: Open DevTools → Application → Cookies

Press F12 or Ctrl+Shift+I (Cmd+Option+I on macOS), click the Application tab, expand Cookies in the left panel, and select the target domain.

Step 3: Copy the session cookies

Identify the cookies that maintain your login. Common names include:

  • session-id, session-token (Amazon)
  • reddit_session (Reddit)
  • JSESSIONID, PHPSESSID (common patterns)
  • __Secure-next-auth.session-token (NextAuth sites)
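If you copy the raw Cookie request header from the DevTools Network tab instead of individual values, it has to be split into name/value pairs before it can be used as the cookies field. The following is a minimal sketch of that conversion; cookie_header_to_dict and its keep parameter are hypothetical helper names for illustration, not part of the AlterLab SDK, and the sample header is made up.

```python
from typing import Dict, Optional, Set


def cookie_header_to_dict(header: str, keep: Optional[Set[str]] = None) -> Dict[str, str]:
    """Parse a raw Cookie header ("a=1; b=2") into a name-to-value dict.

    `keep` optionally restricts the result to the named cookies, so only
    the login-related cookies are passed along.
    """
    cookies: Dict[str, str] = {}
    for part in header.split(";"):
        # Split on the first "=" only: cookie values may contain "=" themselves.
        name, sep, value = part.strip().partition("=")
        if not sep or not name:
            continue  # skip empty or malformed fragments
        if keep is None or name in keep:
            cookies[name] = value
    return cookies


# Hypothetical header copied from the browser; "theme" is not a login cookie.
raw = "session-id=144-1234567-8901234; session-token=abc123==; theme=dark"
session_cookies = cookie_header_to_dict(raw, keep={"session-id", "session-token"})
print(session_cookies)
```

Filtering with keep avoids sending unrelated cookies such as preferences or analytics identifiers.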

Step 4: Create a session in AlterLab

Bash
curl -X POST https://api.alterlab.io/api/v1/sessions \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Amazon Prime",
    "domain": "amazon.com",
    "cookies": {
      "session-id": "144-1234567-8901234",
      "session-token": "abc123xyz789..."
    }
  }'
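For readers who prefer Python without the SDK, the same request can be assembled with the standard library. This is a sketch only: build_create_session_request is a hypothetical helper, it mirrors the endpoint, headers, and body of the curl command above, and it builds the request without sending it because the response shape is not shown in this guide.

```python
import json
import urllib.request

API_BASE = "https://api.alterlab.io/api/v1"


def build_create_session_request(api_key, name, domain, cookies):
    """Build (but do not send) the POST /sessions request from the curl example."""
    body = json.dumps({"name": name, "domain": domain, "cookies": cookies})
    return urllib.request.Request(
        f"{API_BASE}/sessions",
        data=body.encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_session_request(
    "YOUR_API_KEY",
    name="My Amazon Prime",
    domain="amazon.com",
    cookies={"session-id": "144-1234567-8901234", "session-token": "abc123xyz789..."},
)
# To send it, pass the request to urllib.request.urlopen(request).
print(request.get_method(), request.full_url)
```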

Step 5: Scrape with your session

Bash
curl -X POST https://api.alterlab.io/api/v1/scrape \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.amazon.com/dp/B0EXAMPLE",
    "session_id": "SESSION_ID_FROM_STEP_4"
  }'

Method 2: Programmatic via SDK

Use the Python or Node SDK to create sessions and scrape programmatically. This is the recommended approach for automation.

Python
from alterlab import AlterLabSync

client = AlterLabSync(api_key="YOUR_API_KEY")

# Create a stored session
session = client.sessions.create(
    name="My Amazon Prime",
    domain="amazon.com",
    cookies={
        "session-id": "144-1234567-8901234",
        "session-token": "abc123xyz789...",
    },
)
print(f"Session created: {session['id']}")

# Validate the session works
validation = client.sessions.validate(session["id"])
print(f"Valid: {validation['is_valid']} (confidence: {validation['confidence']}%)")

# Scrape with the stored session
result = client.scrape(
    "https://www.amazon.com/dp/B0EXAMPLE",
    session_id=session["id"],
)
print(result["content"][:200])

Method 3: Inline Cookies (One-off Requests)

For one-off scrapes, pass cookies directly in the request without creating a stored session. The cookies are used for that single request and are not persisted.

Python
from alterlab import AlterLabSync

client = AlterLabSync(api_key="YOUR_API_KEY")

# Scrape with inline cookies (not stored)
result = client.scrape(
    "https://www.amazon.com/dp/B0EXAMPLE",
    cookies={
        "session-id": "144-1234567-8901234",
        "session-token": "abc123xyz789...",
    },
)
print(result["content"][:200])

# Check if authenticated content was returned
print(f"BYOS applied: {result.get('byos_applied', False)}")

session_id and cookies are mutually exclusive

You cannot provide both session_id and cookies in the same request. Use one or the other. The API returns SESSION_COOKIES_CONFLICT if both are present.
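A small client-side guard can catch this conflict before a request is sent. The sketch below assumes the request body fields shown in this guide (url, session_id, cookies); build_scrape_payload is a hypothetical helper, not an SDK function.

```python
from typing import Dict, Optional


def build_scrape_payload(
    url: str,
    session_id: Optional[str] = None,
    cookies: Optional[Dict[str, str]] = None,
) -> dict:
    """Return a scrape request body, refusing session_id and cookies together."""
    if session_id is not None and cookies is not None:
        # Mirrors the server-side SESSION_COOKIES_CONFLICT rule locally.
        raise ValueError("Pass either session_id or cookies, not both")
    payload: dict = {"url": url}
    if session_id is not None:
        payload["session_id"] = session_id
    if cookies is not None:
        payload["cookies"] = cookies
    return payload


stored = build_scrape_payload("https://www.amazon.com/dp/B0EXAMPLE", session_id="SESSION_ID")
inline = build_scrape_payload(
    "https://www.amazon.com/dp/B0EXAMPLE",
    cookies={"session-id": "144-1234567-8901234"},
)
print(stored)
print(inline)
```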

Supported Use Cases

| Use Case | Example Sites | What You Get with Auth |
| --- | --- | --- |
| Member Pricing | Amazon Prime, Walmart+, Costco | Real member-only prices, exclusive deals, Subscribe & Save rates |
| UK Retailer Pricing | Tesco Clubcard, Boots Advantage, Sainsbury's Nectar | Loyalty card prices, personalized offers, member promotions |
| Paywalled News | WSJ, Bloomberg, FT, The Athletic | Full article text behind the paywall (requires your subscription) |
| Private Communities | Reddit (private subs), Discourse forums | Private subreddit posts, members-only forum threads |
| SaaS Dashboards | Internal tools, admin panels | Dashboard data, report exports, account settings |
| Social Media | LinkedIn, X/Twitter (logged-in views) | Full profiles, connection-gated content, algorithmic feeds |

Session API Reference

See the Sessions API Reference for full details on creating and managing authenticated sessions.

Session Management

Stored sessions provide automatic health monitoring, expiry tracking, and usage analytics. Manage them via the API, SDK, or dashboard.

Health Monitoring

AlterLab tracks consecutive failures and success rates. Sessions are automatically marked as expired or invalid when cookies stop working.

Validation

The POST /sessions/:id/validate endpoint makes a real request to the target domain and checks for logged-in indicators, returning a confidence score.

Cookie Rotation

When cookies expire, use POST /sessions/:id/refresh to replace them. This resets failure counters and re-activates the session.

For full endpoint documentation, see the Sessions API Reference.

Kill All Sessions

Immediately deactivate all stored sessions for your account. Use this endpoint to revoke access in bulk — for example, after a suspected credential compromise or before offboarding a team member.

DELETE /api/v1/sessions

Bash
# Kill all sessions for your account
curl -X DELETE https://api.alterlab.io/api/v1/sessions \
  -H "Authorization: Bearer your_jwt_token"

# Response
{
  "deleted": 12,
  "message": "All sessions deactivated"
}

Irreversible Action

Killing all sessions deactivates them immediately. Any in-flight scraping jobs that reference these sessions will fail. Existing scrape results are preserved — only the session credentials are revoked.

Zero-Knowledge Mode

For maximum security, AlterLab supports client-side encryption. Your cookies are encrypted in your browser or application before they leave your device. AlterLab stores and transmits only the ciphertext; the plaintext exists on AlterLab's side only briefly, inside an ephemeral worker at scrape time.

How it works

  1. Fetch AlterLab's public key via GET /sessions/public-key
  2. Encrypt your cookies client-side using the public key
  3. Send the encrypted payload when creating or refreshing a session
  4. AlterLab decrypts only at scrape time, in an ephemeral worker — the plaintext is never stored
Python
from alterlab import AlterLabSync
from alterlab.crypto import encrypt_cookies

client = AlterLabSync(api_key="YOUR_API_KEY")

# Fetch the server's public key
pub_key = client.sessions.get_public_key()

# Encrypt cookies client-side
encrypted = encrypt_cookies(
    cookies={"session-id": "144-1234567-8901234"},
    public_key=pub_key["public_key"],
)

# Create session with client-encrypted cookies
session = client.sessions.create(
    name="My Amazon (Zero-Knowledge)",
    domain="amazon.com",
    cookies_encrypted=encrypted,
)

End-to-End Encryption

With zero-knowledge mode, cookie values are encrypted on your device and only decrypted in an ephemeral worker process at scrape time. They are never stored in plaintext, never logged, and never visible to AlterLab staff.

Security & Privacy

| Feature | Details |
| --- | --- |
| Encryption at Rest | Cookies are encrypted with AES-256-GCM before storage. The encryption key is domain-bound: a session for amazon.com cannot be decrypted for any other domain. |
| Domain Isolation | Cookies from a session are only sent to URLs matching the stored domain. A session for amazon.com will never leak cookies to walmart.com. |
| Value Redaction | Cookie values are never returned in API responses. You can see cookie names for debugging, but the actual values are only accessible internally during a scrape. |
| Audit Logging | Every session operation (create, update, delete, validate, refresh, use) is logged with timestamps and actor identity. View the audit trail via GET /sessions/:id/audit. |
| Organization Sharing | Organization admins can share sessions with team members. Shared sessions are read-only for non-admins: they can use them for scraping but cannot modify or delete them. |
| Soft Delete | Deleting a session marks it as inactive rather than erasing it. This preserves the audit trail and prevents accidental data loss. |

GDPR & Privacy API

AlterLab provides a dedicated privacy sub-API to help you comply with GDPR, CCPA, and similar regulations when managing sessions that contain personal data (e.g., user authentication cookies).

GET /api/v1/sessions/privacy/export

Export all session data associated with your account in machine-readable format (JSON). Use this to fulfill data access requests under GDPR Article 15.

Bash
curl https://api.alterlab.io/api/v1/sessions/privacy/export \
  -H "Authorization: Bearer your_jwt_token" \
  -o sessions-export.json

DELETE /api/v1/sessions/privacy/purge

Permanently and irreversibly delete all session data, including encrypted cookie values, usage logs, and audit trails. This fulfills erasure requests under GDPR Article 17 (Right to be Forgotten).

Bash
curl -X DELETE https://api.alterlab.io/api/v1/sessions/privacy/purge \
  -H "Authorization: Bearer your_jwt_token" \
  -H "Content-Type: application/json" \
  -d '{"confirm": "PURGE_ALL_SESSION_DATA"}'

# Response
{
  "purged_sessions": 12,
  "purged_audit_records": 847,
  "completed_at": "2025-01-15T12:00:00Z"
}

Permanent Deletion

Purging session data is irreversible. Cookie values, usage history, and audit logs are permanently erased. You cannot recover this data after purging.

GET /api/v1/sessions/privacy/retention

View and configure the data retention policy for session data. You can set automatic purge schedules to align with your organization's data handling policy.

Bash
curl https://api.alterlab.io/api/v1/sessions/privacy/retention \
  -H "Authorization: Bearer your_jwt_token"

# Response
{
  "retention_days": 90,
  "auto_purge_expired": true,
  "purge_on_account_deletion": true
}

Best Practices

Rotate cookies regularly

Most session cookies expire after hours or days. Set up a schedule to refresh cookies before they expire. Use the expires_at field when creating sessions so AlterLab can warn you.
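One way to schedule refreshes is to compare the session's expires_at against the current time with a safety margin. The sketch below assumes expires_at is an ISO 8601 UTC timestamp (the format used by the timestamps elsewhere in this guide); refresh_due and the six-hour default margin are hypothetical choices, not AlterLab defaults.

```python
from datetime import datetime, timedelta, timezone


def refresh_due(
    expires_at: str,
    now: datetime,
    lead_time: timedelta = timedelta(hours=6),
) -> bool:
    """Return True once `now` is within `lead_time` of the cookie expiry."""
    # Older Python versions do not accept a trailing "Z", so normalise it.
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    return now >= expiry - lead_time


now = datetime.now(timezone.utc)
if refresh_due("2025-01-15T12:00:00Z", now):
    print("Log in again and refresh the session cookies")
```

Run a check like this from the same scheduler that triggers your scrapes, so a refresh reminder arrives before the first failed request.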

Use one session per domain

Sessions are domain-scoped. Create separate sessions for amazon.com and amazon.co.uk. Mixing domains is not supported.

Validate before heavy usage

Call POST /sessions/:id/validate after creating or refreshing a session. This catches expired or invalid cookies before you waste credits on failed scrapes.

Monitor session health

Check consecutive_failures and success_rate in the session details. If failures spike, refresh your cookies.
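As an illustration, a health check over the session details might look like the sketch below. It assumes the details arrive as a dict and that success_rate is a fraction between 0 and 1; both the thresholds and the needs_refresh helper are hypothetical and should be tuned to your own tolerance.

```python
def needs_refresh(
    session: dict,
    max_failures: int = 3,
    min_success_rate: float = 0.8,
) -> bool:
    """Flag a session whose failures spike or whose success rate drops."""
    failures = session.get("consecutive_failures", 0)
    rate = session.get("success_rate")  # assumed to be a 0-1 fraction
    if failures >= max_failures:
        return True
    return rate is not None and rate < min_success_rate


healthy = {"consecutive_failures": 0, "success_rate": 0.97}
failing = {"consecutive_failures": 4, "success_rate": 0.91}
print(needs_refresh(healthy), needs_refresh(failing))
```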

Scrape at a reasonable rate

Authenticated scraping with your personal account cookies should mimic normal browsing patterns. Aggressive scraping can trigger the target site's abuse detection and invalidate your session.
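A simple way to approximate normal browsing is to space requests with a randomized pause. The generator below is a sketch under that assumption: paced is a hypothetical helper, the 5 to 15 second window is an arbitrary starting point, and the demo swaps in a recording function for time.sleep so it runs instantly.

```python
import random
import time
from typing import Callable, Iterable, Iterator


def paced(
    urls: Iterable[str],
    min_delay: float = 5.0,
    max_delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield URLs one at a time, pausing a random interval between them."""
    for index, url in enumerate(urls):
        if index > 0:
            # Jitter the gap so requests do not arrive on a fixed beat.
            sleep(random.uniform(min_delay, max_delay))
        yield url


# Demo: record the pauses instead of actually sleeping.
waits = []
pages = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
for url in paced(pages, sleep=waits.append):
    pass  # a real loop would call the scrape API for `url` here
print(f"{len(waits)} pauses between {len(pages)} requests")
```

In production, leave sleep at its default so the loop really waits between requests.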

Use zero-knowledge mode for sensitive sites

For banking, healthcare, or any site where cookie exposure is a concern, enable client-side encryption so your cookies are never stored or transmitted in plaintext.

Troubleshooting

Scrape returns public content instead of authenticated

  • Cookies may have expired. Validate the session and check consecutive_failures.
  • You may be missing a required cookie. Some sites need multiple cookies (e.g. Amazon needs both session-id and session-token).
  • The target site may have rotated your session server-side. Log in again and refresh the cookies.

SESSION_DOMAIN_MISMATCH error

The URL you are scraping does not match the session's domain. A session for amazon.com cannot be used to scrape walmart.com. Create a separate session for each domain.
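A pre-flight check can catch the mismatch before a request is made. The sketch below assumes a URL matches when its hostname equals the session domain or is a subdomain of it, which is consistent with the www.amazon.com examples in this guide; the exact server-side rule is not documented here, and url_matches_domain is a hypothetical helper.

```python
from urllib.parse import urlsplit


def url_matches_domain(url: str, domain: str) -> bool:
    """Return True if the URL's host is `domain` or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower().lstrip(".")
    # Requiring a leading dot prevents "notamazon.com" matching "amazon.com".
    return host == domain or host.endswith("." + domain)


print(url_matches_domain("https://www.amazon.com/dp/B0EXAMPLE", "amazon.com"))
print(url_matches_domain("https://www.walmart.com/", "amazon.com"))
```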

SESSION_EXPIRED error

The session has been marked as expired because the cookies have passed their expires_at timestamp or repeated validation failures were detected. Log in to the target site again and use POST /sessions/:id/refresh with fresh cookies.

Validation returns low confidence

The validator checks for logged-in indicators (user menus, account links, personalized greetings). If the page does not contain clear indicators, confidence will be low even if the cookies are valid. This is common for sites with minimal UI changes between logged-in and logged-out states.
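Because a low score does not necessarily mean the cookies are bad, it can help to treat the validation result as three-way rather than pass/fail. The sketch below uses the is_valid and confidence fields shown in the SDK example; the 70% threshold and the interpret_validation helper are hypothetical.

```python
def interpret_validation(validation: dict, min_confidence: float = 70.0) -> str:
    """Classify a validation result as 'ok', 'uncertain', or 'invalid'."""
    if not validation.get("is_valid", False):
        return "invalid"
    if validation.get("confidence", 0) < min_confidence:
        # Cookies may still work; confirm with a single test scrape.
        return "uncertain"
    return "ok"


print(interpret_validation({"is_valid": True, "confidence": 92}))
print(interpret_validation({"is_valid": True, "confidence": 35}))
```

For an uncertain result, scrape one known member-only page and inspect the content manually before committing to a large batch.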

SESSION_LIMIT_REACHED error

Each account can store up to 50 sessions. Delete unused or expired sessions to free up slots. Use GET /sessions?status=expired to find sessions that can be cleaned up.

For a complete list of session-related error codes, see the Error Codes Reference.