TikTok Keyword Scraper - Apify Actor
Scrapes TikTok video search results by keyword using Playwright, with persistent browser profiles, CAPTCHA solving, proxy rotation via a local relay, and an optional Apify proxy that can be fully disabled to run with no proxy at all.
This package runs two ways from the same codebase:
- As an Apify Actor (.actor/ + src/): the intended way to deploy on Apify's cloud.
- As a local FastAPI server (server.py): for local development, debugging, or running outside Apify.
Project Structure
    .
    ├── .actor/
    │   ├── actor.json          # Actor metadata
    │   ├── INPUT_SCHEMA.json   # Input fields shown in Apify Console (incl. "Use proxy" toggle)
    │   └── Dockerfile          # apify/actor-python-playwright:3.11 base image
    ├── src/
    │   ├── __init__.py
    │   ├── __main__.py         # Actor entrypoint (`python -m src`)
    │   └── main.py             # Reads input, runs keywords concurrently, pushes to dataset
    ├── browser.py              # Browser/context factory, cookies, scroll/nav helpers
    ├── captcha.py              # CAPTCHA detection + solving (SadCaptcha -> SolveCaptcha)
    ├── scrapers.py             # scrape_tiktok_search / hashtag / profile / download-url
    ├── data_helpers.py         # Video cleaning, API parsing, shared scroll loop
    ├── config.py               # Constants, proxy pool (now toggleable), selectors
    ├── profiles.py / profile_pool.py   # Persistent browser-profile pool + proxy assignment
    ├── proxy_relay.py          # Local TCP relay (works around Chromium proxy-auth bugs)
    ├── downloader.py           # Streaming video downloader
    ├── server.py               # FastAPI server for local/non-Apify use
    ├── tiktok_cookies.json     # Bundled fallback cookie set (see "Cookie fallback" below)
    ├── requirements.txt
    └── .dockerignore
Running No Proxy at All (the toggle you asked for)
The Actor input has a "Use proxy" checkbox (useProxy, default true).
- ON (default): every browser session routes through the Apify residential proxy pool (BUYPROXIES94952 group), same as before.
- OFF: every session connects directly, with no proxy. This is useful for local testing, debugging, or if your Apify plan has no proxy quota left.
How it works under the hood: src/main.py sets the environment variable
USE_PROXY=true|false from that checkbox before any scrape starts.
config.get_proxy_pool() checks USE_PROXY on every call (it's not cached at
import time), and returns an empty list when proxying is off. With an empty
pool, profiles.get_proxy_for_profile() returns None, and
browser.make_browser_and_context() already had a "no proxy configured"
direct-connection branch, so turning the toggle off requires no other code
changes anywhere in the scraper.
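The toggle flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the proxy URLs are placeholders, and the function bodies (including the round-robin assignment) are assumptions. Only the names get_proxy_pool, get_proxy_for_profile, and USE_PROXY come from the text.

```python
import os
from typing import List, Optional

# Placeholder pool; the real one lives in config.py and is not shown here.
_CONFIGURED_PROXIES = [
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
]


def proxy_enabled() -> bool:
    # Read the variable on every call instead of caching it at import time,
    # so a value set by main.py before the scrape starts is always seen.
    # Assumption: an unset variable means "proxy on" (the input default).
    return os.environ.get("USE_PROXY", "true").strip().lower() != "false"


def get_proxy_pool() -> List[str]:
    # An empty list is the signal for "connect directly".
    return list(_CONFIGURED_PROXIES) if proxy_enabled() else []


def get_proxy_for_profile(profile_index: int) -> Optional[str]:
    pool = get_proxy_pool()
    if not pool:
        return None  # caller falls through to its no-proxy branch
    # Assumption: simple round-robin assignment of proxies to profiles.
    return pool[profile_index % len(pool)]
```

Because the check happens at call time, flipping USE_PROXY between runs in the same process takes effect without re-importing anything.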
You can also flip this manually outside the Actor input by setting the
USE_PROXY env var directly (e.g. for local CLI/server runs):
    USE_PROXY=false python server.py
There's also an optional proxyPassword input field (marked secret) if you
want to override the Apify proxy password baked into config.py with your
own, without editing code.
Cookie Fallback (Apify /tmp wipe fix)
Apify's containers are ephemeral: /tmp (and therefore every per-profile
cookies.json under /tmp/tiktok_profiles/) is wiped between separate Actor
runs. A brand-new container's first navigation then has zero session cookies,
which is what causes TikTok to silently route searches to the Users tab
instead of the Videos tab.
browser.load_cookies() now falls back to the bundled tiktok_cookies.json
at the project root whenever a profile has no (or expired) per-profile cookie
file yet, giving every fresh container at least one valid baseline session
instead of a completely cold one. This file is copied into the Docker image
by COPY . ./ in the Dockerfile, so it ships with every build.
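A fallback of this kind might look like the sketch below. It is an illustration under stated assumptions, not the real browser.load_cookies(): the signature is hypothetical, cookie files are assumed to be JSON lists in Playwright's cookie format, and "expired" is taken to mean that no cookie in the file is still live.

```python
import json
import time
from pathlib import Path
from typing import List, Optional


def _read_cookies(path: Path) -> List[dict]:
    # A missing or unreadable file is treated the same as "no cookies".
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return []
    return data if isinstance(data, list) else []


def _has_live_cookie(cookies: List[dict], now: float) -> bool:
    # Playwright stores expires as Unix seconds, with -1 for session cookies.
    # Assumption: session cookies count as live.
    return any(
        c.get("expires", -1) == -1 or c.get("expires", -1) > now
        for c in cookies
    )


def load_cookies(profile_path: Path, bundled_path: Path,
                 now: Optional[float] = None) -> List[dict]:
    now = time.time() if now is None else now
    cookies = _read_cookies(profile_path)
    if cookies and _has_live_cookie(cookies, now):
        return cookies
    # Fresh container or stale profile: use the bundled baseline session.
    return _read_cookies(bundled_path)
```

The returned list could then be handed to Playwright's BrowserContext.add_cookies() before the first navigation.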
Deploying to Apify
    npm install -g apify-cli
    apify login
    cd tiktok-apify-actor/
    apify push
apify push builds the Docker image from .actor/Dockerfile and uploads
everything else (COPY . ./ in the Dockerfile copies the whole repo root
into the image, including src/, the scraper modules, and the bundled
cookie file).
Actor Input Example
    {
      "keywords": ["funny cats", "cooking recipe"],
      "maxResults": 50,
      "maxConcurrency": 3,
      "dateFilter": 0,
      "scrollPause": 3.0,
      "headless": true,
      "useProxy": true
    }
Run with no proxy at all:
    {"keywords": ["funny cats"], "useProxy": false}
Running via REST API
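    import requests

    APIFY_TOKEN = "your_token_here"
    # The API path expects "username~actor-name" (tilde, not slash).
    ACTOR_ID = "your_username~tiktok-keyword-scraper"

    # Start a run; the JSON body is passed to the Actor as its input.
    response = requests.post(
        f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs",
        headers={"Content-Type": "application/json"},
        params={"token": APIFY_TOKEN},
        json={"keywords": ["python tutorial"], "maxResults": 30, "useProxy": True},
    )
    response.raise_for_status()
    run_id = response.json()["data"]["id"]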
Results land in the run's default dataset (one row per video, plus a
search_keyword field), and a run-level summary is written to the
key-value store under OUTPUT.
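To read those results back, one option is to poll the run until it reaches a terminal status and then fetch the dataset items. The sketch below uses only the standard library. The helper names are illustrative. The endpoints (/v2/actor-runs/{runId}, /v2/datasets/{datasetId}/items, /v2/key-value-stores/{storeId}/records/OUTPUT) are Apify's public API.

```python
import json
import time
import urllib.request

API_BASE = "https://api.apify.com/v2"
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def run_url(run_id):
    return f"{API_BASE}/actor-runs/{run_id}"


def dataset_items_url(dataset_id):
    return f"{API_BASE}/datasets/{dataset_id}/items?format=json"


def output_record_url(store_id):
    # Run-level summary written by the Actor under the OUTPUT key.
    return f"{API_BASE}/key-value-stores/{store_id}/records/OUTPUT"


def get_json(url, token):
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_run(run_id, token, poll_seconds=5.0, fetch=get_json):
    # Poll until the run is finished, whatever the outcome.
    while True:
        run = fetch(run_url(run_id), token)["data"]
        if run["status"] in TERMINAL_STATUSES:
            return run
        time.sleep(poll_seconds)


def fetch_results(run_id, token, fetch=get_json):
    run = wait_for_run(run_id, token, fetch=fetch)
    if run["status"] != "SUCCEEDED":
        raise RuntimeError(f"Run {run_id} ended with status {run['status']}")
    # One item per video, each carrying a search_keyword field.
    return fetch(dataset_items_url(run["defaultDatasetId"]), token)
```

Calling fetch_results(run_id, APIFY_TOKEN) returns the list of video rows. The summary can be read the same way via output_record_url(run["defaultKeyValueStoreId"]).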
Running Locally (FastAPI server, unchanged)
    pip install -r requirements.txt
    playwright install chromium
    python server.py
    # -> http://localhost:8000/docs
| Method | Path | Description |
|---|---|---|
| POST | /search | Search by keyword |
| POST | /batch-search | Up to 100 keywords at once |
| GET | /job/{job_id} | Poll job status / results |
| GET | /proxy-test | Test every proxy in the pool |
| POST | /download | Download a video on demand |
| GET | /health | Health check |
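A small client for the local server could be sketched like this. The paths come from the table above, but the request body fields (keyword, maxResults) are assumptions modelled on the Actor input and may not match server.py. The interactive docs at /docs show the real schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_search_request(keyword, max_results=30, base_url=BASE_URL):
    # ASSUMPTION: the body field names are guesses; check /docs for the schema.
    body = json.dumps({"keyword": keyword, "maxResults": max_results}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def job_url(job_id, base_url=BASE_URL):
    # Polling endpoint from the table: GET /job/{job_id}
    return f"{base_url}/job/{job_id}"


def send(request):
    # Accepts a Request object or a plain URL string (for GET endpoints).
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)
```

With the server running, send(f"{BASE_URL}/health") is a quick liveness check, and send(build_search_request("funny cats")) submits a search.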
Notes
- maxConcurrency is capped at PROFILE_POOL_SIZE (10): each parallel job needs its own browser-profile slot.
- dateFilter matches TikTok's "Posted" search filter: 0=all, 1=24h, 7=week, 30=month, 90=3mo, 180=6mo.
- This tool is for educational/research purposes only.
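The two input constraints in these notes can be expressed as a small validation step. This is a hypothetical helper, not code from src/main.py. The default of 3 for maxConcurrency and the choice to reject unknown dateFilter values (rather than silently falling back to 0) are assumptions.

```python
PROFILE_POOL_SIZE = 10  # value stated in the notes

# dateFilter values and their meaning in TikTok's "Posted" filter.
DATE_FILTER_LABELS = {0: "all", 1: "24h", 7: "week", 30: "month", 90: "3mo", 180: "6mo"}


def normalize_input(raw: dict) -> dict:
    # Assumption: default concurrency of 3, taken from the input example.
    requested = int(raw.get("maxConcurrency", 3))
    date_filter = int(raw.get("dateFilter", 0))
    if date_filter not in DATE_FILTER_LABELS:
        allowed = sorted(DATE_FILTER_LABELS)
        raise ValueError(f"dateFilter must be one of {allowed}, got {date_filter}")
    return {
        **raw,
        # Each parallel job needs its own profile slot, so clamp to the pool size.
        "maxConcurrency": max(1, min(requested, PROFILE_POOL_SIZE)),
        "dateFilter": date_filter,
    }
```

For example, an input asking for maxConcurrency 25 would be clamped to 10, and dateFilter 3 would be rejected.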
