TikTok & Instagram Reels Transcription: AI Captions
Pricing
Pay per usage
Transcribe TikTok videos and Instagram Reels to text via automation. Get SRT captions for accessibility, subtitles for repurposing, and text content for scheduling tools. Batch multiple URLs. No Wisprs account needed.
TikTok & Instagram Reels Transcription: AI Subtitles & Captions
Transcribe TikTok videos, Instagram Reels, YouTube Shorts, and Facebook videos to text, SRT, and VTT subtitle files. 100+ languages. Self-hosted Whisper AI, so no OpenAI API key is required. Saves to your Apify Dataset automatically.
This Actor uses the Wisprs API, which transcribes audio from short-form social media videos using Whisper-based speech-to-text. Unlike caption-scraping approaches that fail on videos without auto-generated captions, Wisprs transcribes the actual audio, which means it works for every public video. Accuracy is excellent on clear audio; results vary by language, accent, and recording quality.
What does this Actor do?
- Accepts a list of TikTok, Instagram Reel, YouTube Shorts, or Facebook video URLs
- Submits each to the Wisprs transcription API (short-form videos typically complete in 30-90 seconds)
- Exports the transcript in your chosen formats: TXT, SRT, VTT, JSON
- Optionally generates a short summary (1-2 sentences) or a Twitter/X thread from the transcript
- Saves one dataset row per video, ready for bulk processing, captioning, or content analysis
How do I transcribe TikTok videos and generate SRT caption files?
{"startUrls":[{"url":"https://www.tiktok.com/@username/video/YOUR_VIDEO_ID"}],"language":"auto","exportFormats":["srt","vtt","txt"]}
Each video produces one dataset row with transcript_srt (SRT file), transcript_vtt (WebVTT file), and transcript_txt (plain text). The SRT and VTT files are ready to upload directly to TikTok, Instagram, or any video platform.
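As a rough illustration of the step after the run, the sketch below writes the subtitle fields of one dataset row to files on disk. It assumes the row has already been fetched as a plain dict (for example via the Apify API client) and that fields for formats you did not request are missing or empty. The helper name `save_subtitles` and the file-naming scheme are hypothetical.

```python
from pathlib import Path

# Dataset field names (from the Actor's output) mapped to file extensions.
FIELD_TO_EXT = {
    "transcript_srt": ".srt",
    "transcript_vtt": ".vtt",
    "transcript_txt": ".txt",
}


def save_subtitles(row: dict, out_dir: Path, stem: str) -> list[Path]:
    """Write each non-empty transcript field of one dataset row to its own file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for field, ext in FIELD_TO_EXT.items():
        content = row.get(field)
        if not content:  # format not requested, or the job failed
            continue
        path = out_dir / f"{stem}{ext}"
        path.write_text(content, encoding="utf-8")
        written.append(path)
    return written
```

Calling `save_subtitles(row, Path("captions"), "video1")` would produce `captions/video1.srt`, `captions/video1.vtt`, and `captions/video1.txt` for whichever of those fields the row contains.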
How do I transcribe Instagram Reels in bulk for content research?
{"startUrls":[{"url":"https://www.instagram.com/reel/REEL_ID/"},{"url":"https://www.instagram.com/reel/REEL_ID_2/"}],"language":"auto","exportFormats":["txt","json"]}
The transcript_json field contains word-level timestamps, which are useful for keyword extraction, hook analysis, and content research pipelines. The transcript_txt is plain text ready to pipe into a classifier or summarizer.
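For hook analysis, one possible approach is to pull out the words spoken in the first few seconds of each video. The sketch below is only illustrative: the exact shape of transcript_json is not documented here, so it assumes a list of entries with `word`, `start`, and `end` keys (times in seconds). Adjust the key names to whatever the field actually contains.

```python
import json


def hook_text(transcript_json, max_seconds: float = 3.0) -> str:
    """Return the words that start before `max_seconds`, joined into one string."""
    # ASSUMPTION: transcript_json is a list (or a JSON string encoding a list)
    # of {"word": str, "start": float, "end": float} entries, times in seconds.
    if isinstance(transcript_json, str):
        words = json.loads(transcript_json)
    else:
        words = transcript_json
    return " ".join(w["word"].strip() for w in words if w["start"] < max_seconds)
```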
How do I use this for content moderation at scale?
Submit flagged or monitored video URLs. Each transcribed video returns transcript_txt in the dataset row. Pipe that text into your keyword filter, LLM classifier, or moderation queue:
{"startUrls":[{"url":"https://www.tiktok.com/@username/video/VIDEO_ID"}],"exportFormats":["txt"],"language":"auto"}
At $0.001 per video, transcribing 100,000 videos for moderation costs approximately $100.
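A keyword filter over the returned rows could look roughly like the sketch below. It is a minimal example, not a moderation system: `flag_rows` is a hypothetical helper, the term list is yours to supply, and matching is whole-word and case-insensitive.

```python
import re


def flag_rows(rows, blocked_terms):
    """Yield (url, matched_terms) for completed rows whose transcript has a blocked term."""
    terms = [t for t in blocked_terms if t]
    if not terms:
        return
    # Whole-word, case-insensitive match, so "scam" does not flag "scampi".
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    for row in rows:
        if row.get("status") != "completed":
            continue  # failed rows carry no transcript to inspect
        hits = {m.lower() for m in pattern.findall(row.get("transcript_txt") or "")}
        if hits:
            yield row["url"], sorted(hits)
```

Rows that match can then be routed to a human review queue or a second-stage classifier.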
How do I transcribe a full TikTok profile?
Pair this Actor with a TikTok profile scraper to get all public video URLs for a creator, then pass them into this Actor. The async job model handles concurrent submissions, so one Actor run can cover an entire profile.
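Turning a scraper's URL list into an input for this Actor might be sketched as follows. The input keys (`startUrls`, `language`, `exportFormats`) come from the examples above; the helper name `build_run_input` and the deduplication step are assumptions added for illustration.

```python
def build_run_input(urls, export_formats=("srt", "vtt", "txt"), language="auto"):
    """Build an Actor input dict from an iterable of video URLs, dropping duplicates."""
    # dict.fromkeys keeps the first occurrence of each URL and preserves order.
    unique = list(dict.fromkeys(u.strip() for u in urls if u and u.strip()))
    return {
        "startUrls": [{"url": u} for u in unique],
        "language": language,
        "exportFormats": list(export_formats),
    }
```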
Using with AI agents (MCP)
This Actor is published on the Apify Store and automatically available as an MCP tool. AI agents using Claude Desktop, LangChain, CrewAI, or any MCP-compatible framework can discover and call this Actor directly, with no custom integration required.
What data does the Actor return?
| Field | Description |
|---|---|
| url | The submitted video URL |
| jobId | Wisprs transcription ID (integer) |
| transcriptionId | Same as jobId |
| status | completed or failed |
| language | Detected language ISO 639-1 code (e.g. "en") |
| transcript_txt | Full plain-text transcript |
| transcript_srt | SRT subtitle file content |
| transcript_vtt | WebVTT subtitle file content |
| transcript_json | Word-level timestamps in JSON |
| repurposed_summary | 1-2 sentence summary (if repurposeMode=summary) |
| repurposed_thread | Twitter/X thread text (if repurposeMode=thread) |
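If you consume rows in Python, a small typed wrapper can make the fields above easier to work with. The sketch below models only a subset of the fields and assumes that fields for formats you did not request are absent or null; `TranscriptRow` and `split_by_status` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranscriptRow:
    url: str
    status: str
    language: Optional[str] = None
    transcript_txt: Optional[str] = None
    transcript_srt: Optional[str] = None
    transcript_vtt: Optional[str] = None

    @classmethod
    def from_item(cls, item: dict) -> "TranscriptRow":
        # Ignore fields this sketch does not model (jobId, transcript_json, ...).
        known = {name: item.get(name) for name in cls.__dataclass_fields__}
        return cls(**known)


def split_by_status(items):
    """Return (completed, failed) lists of TranscriptRow built from raw dataset items."""
    rows = [TranscriptRow.from_item(i) for i in items]
    completed = [r for r in rows if r.status == "completed"]
    failed = [r for r in rows if r.status != "completed"]
    return completed, failed
```

The failed list gives you the URLs to inspect in the Actor logs or to resubmit.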
How much does it cost?
Pricing is $1.00 per 1,000 transcriptions ($0.001 per video), with no per-minute surcharge for short-form content:
| Volume | Cost |
|---|---|
| 1,000 videos | ~$1.00 |
| 10,000 videos | ~$10.00 |
| 100,000 videos | ~$100.00 |
The Apify free plan includes $5/month in credits, enough to transcribe 5,000 short-form videos.
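The arithmetic behind these figures is simple enough to script when budgeting a batch. The sketch below assumes the listed per-transcription price is the only charge; the function names are illustrative.

```python
import math

PRICE_PER_1000_USD = 1.00  # listed price per 1,000 transcriptions


def estimate_cost_usd(video_count: int) -> float:
    """Estimated cost in USD for transcribing `video_count` videos."""
    return round(video_count * PRICE_PER_1000_USD / 1000, 2)


def videos_for_budget(budget_usd: float) -> int:
    """How many videos a given budget covers at the listed price."""
    # Round away float noise before flooring to a whole number of videos.
    return math.floor(round(budget_usd * 1000 / PRICE_PER_1000_USD, 6))
```

For example, `estimate_cost_usd(100_000)` gives 100.0 and `videos_for_budget(5)` gives 5000, matching the table and the free-plan figure.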
Wisprs vs the competition
| Feature | Wisprs | tictechid Transcriber |
|---|---|---|
| TikTok + IG Reels + YT Shorts + FB | Yes | Yes |
| Language support | 100+ | 35 |
| Self-hosted AI (no external API key) | Yes | Unspecified |
| SRT / VTT subtitle export | Yes | Text only |
| Word-level JSON timestamps | Yes | No |
| Content repurposing (summary, thread) | Yes | No |
| Price per 1,000 transcriptions | $1.00 | $1.50 |
What can I build with this?
Bulk caption generation: scrape TikTok or Instagram profiles for public videos, submit all URLs, and get SRT/VTT files ready to upload back to the platforms. Full caption coverage for a creator's entire library in one run.
Content moderation pipeline: transcribe flagged social media videos for text analysis. The transcript_txt field is ready to pipe into a content classifier or keyword filter.
Trend analysis: transcribe trending TikTok videos in a niche to extract what creators are saying. Identify recurring phrases, topics, and hooks at scale.
Multilingual caption localization: transcribe a video in its original language (with word-level timestamps via exportFormats: ["json"]), pass the JSON to a translation API, and re-align the translated text to the original timestamps. The timing structure carries through the pipeline.
Creator research tool: transcribe competitor content and run keyword analysis to understand their messaging strategy. Identify content gaps and talking points your audience responds to.
Social listening: transcribe public videos mentioning your brand, product, or keywords. Extract what customers are saying in video format that standard social listening tools miss.
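For the caption localization idea, the last step is rendering timed text back into a subtitle file. The sketch below groups timed words into numbered SRT cues. It assumes each entry has `word`, `start`, and `end` keys with times in seconds, which may not match the real transcript_json layout, and it leaves the translation and re-alignment steps out entirely.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words: int = 7) -> str:
    """Group timed words into numbered SRT cues of up to `max_words` words each."""
    # ASSUMPTION: each entry looks like {"word": str, "start": float, "end": float}.
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["word"].strip() for w in chunk)
        start, end = chunk[0]["start"], chunk[-1]["end"]
        cues.append(
            f"{len(cues) + 1}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)  # cues are separated by a blank line
```

Fixed-size word groups are the simplest possible cue strategy; splitting on pauses or punctuation would usually read better.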
Supported platforms
| Platform | URL format |
|---|---|
| TikTok | tiktok.com/@username/video/ID |
| Instagram Reels | instagram.com/reel/ID/ |
| YouTube Shorts | youtube.com/shorts/ID |
| Facebook Reels/Watch | facebook.com/watch?v=ID |
| YouTube (standard) | youtube.com/watch?v=ID |
Public posts only. Private, followers-only, or age-restricted content cannot be transcribed.
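Before submitting a large batch, it can help to check URLs against the formats in the table. The patterns below are a rough sketch derived only from that table; the Actor may accept other URL variants (short links, mobile domains), so treat a non-match as a prompt to check rather than a definite rejection.

```python
import re
from typing import Optional

# Patterns derived from the "Supported platforms" table; not an official validator.
PLATFORM_PATTERNS = {
    "tiktok": re.compile(r"^https?://(www\.)?tiktok\.com/@[^/]+/video/[^/?#]+"),
    "instagram_reel": re.compile(r"^https?://(www\.)?instagram\.com/reel/[^/?#]+"),
    "youtube_shorts": re.compile(r"^https?://(www\.)?youtube\.com/shorts/[^/?#]+"),
    "facebook": re.compile(r"^https?://(www\.)?facebook\.com/watch/?\?v=[^&#]+"),
    "youtube": re.compile(r"^https?://(www\.)?youtube\.com/watch\?v=[^&#]+"),
}


def detect_platform(url: str) -> Optional[str]:
    """Return the platform key for a URL, or None if it matches no known format."""
    cleaned = url.strip()
    for name, pattern in PLATFORM_PATTERNS.items():
        if pattern.match(cleaned):
            return name
    return None
```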
Language support
100+ languages with automatic detection. The detected language appears as language in each dataset row. Notable languages: English, Spanish, Portuguese, Hindi, Indonesian, Arabic, French, German, Japanese, Korean, Mandarin, and 90+ more.
Short-form video presents unique transcription challenges (background music, cuts, fast speech). Accuracy is excellent on speech-forward content; results vary on heavily music-backed or ASMR content.
Related Actors
- Wisprs Audio & Video Transcription: universal transcription, including long-form video and podcasts
- Wisprs Podcast Show Notes Generator: turns podcast episodes into show notes, chapters, and quotes
- Wisprs YouTube Content Repurposer: turns YouTube videos into threads, blog posts, and show notes
FAQ
Does this require an OpenAI API key or a Wisprs account? No. No external API key or account is required. Wisprs runs Whisper on its own infrastructure and handles all transcription automatically. You pay only via Apify credits.
Does it work for videos without auto-generated captions? Yes. Wisprs transcribes the audio directly and does not rely on platform-generated captions.
What about private or age-restricted videos? Private, followers-only, and age-restricted content cannot be downloaded. The dataset row will have status: "failed". Check the Actor logs for details.
What's the maximum video length? This Actor is optimized for short-form content (under 10 minutes). For long-form video and podcasts, use the Wisprs Audio & Video Transcription Actor.
Can I process a full TikTok profile at once? Yes. Pair this Actor with a TikTok scraper to extract all public video URLs from a profile, then pass them into this Actor.
Support
- Documentation: wisprs.co/docs
- Email: tosh@belvadigital.com
100+ languages. $1.00 per 1,000 videos. No account or API key required.
