Pricing
from $3.00 / 1,000 results
Instagram Lead Extractor
Discover Instagram profiles from usernames, hashtags, locations, search queries, datasets, or CSV, and extract emails, phones, and social handles from their bios.
Instagram Bio & Email Extractor
An Apify actor that discovers Instagram profiles from multiple sources and extracts contact info, especially emails, from their bios.
What it does
- Discovers Instagram profiles from any combination of: direct usernames/URLs, hashtags, locations, search queries, an upstream Apify dataset, or a CSV / Google Sheets URL.
- Fetches each profile via Instagram's `web_profile_info` JSON endpoint (the cheapest and most stable public path).
- Extracts username, full name, bio, follower/following/post counts, external URL, business category, verification status, and `is_private`.
- Parses the bio (and optionally the link-in-bio page) for emails, phone numbers (E.164), and social handles for TikTok, YouTube, X, LinkedIn, WhatsApp, Telegram, Threads, Facebook, and Pinterest.
- Emits one row per profile into the default Apify dataset.
What it does NOT do
- Does not scrape private profiles' posts, stories, DMs, or follower lists.
- Does not log in by default. A session cookie is opt-in and required for auth-only modes (see below).
- Does not bypass Instagram's auth walls for protected content.
Discovery modes
| Mode | Input field | Requires session cookie | Notes |
|---|---|---|---|
| Direct usernames | usernames | No | Most reliable. Accepts @handle, username, or full profile URL. |
| Upstream Apify dataset | datasetId + datasetUsernameField | No | Chain from another actor. |
| CSV / Google Sheets | csvUrl + csvUsernameColumn | No | Public CSV URL or docs.google.com/spreadsheets/... (auto-converts to export?format=csv). |
| Search | searchQueries | No | Top accounts per query. Volume small. |
| Hashtags | hashtags | No | Discovers post authors. Cap with maxProfilesPerHashtag. |
| Locations | locations | No | Full IG location URL. Cap with maxProfilesPerLocation. |
| Followers of target | followersOf | Yes | Auth-only: IG does not serve follower lists unauthenticated. Not implemented in v1. |
| Following of target | followingOf | Yes | Same as above. Not implemented in v1. |
| Post engagers | postUrls | Mixed (likers auth-only) | Not implemented in v1. |
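The "Direct usernames" row accepts three input shapes. A minimal sketch of how such inputs could be normalised is shown here; it is an illustration, not the actor's own code. The helper name `normalize_username` and the username pattern (letters, digits, periods, underscores, up to 30 characters) are assumptions.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: usernames are letters, digits, periods and underscores, 1-30 chars.
USERNAME_RE = re.compile(r"^[A-Za-z0-9._]{1,30}$")


def normalize_username(raw: str) -> Optional[str]:
    """Return a lowercase username from '@handle', 'username', or a profile URL.

    Hypothetical helper: it takes the first path segment of an instagram.com URL,
    so it does not tell profile URLs apart from post or reel URLs.
    """
    value = raw.strip()
    if "instagram.com" in value.lower():
        # Tolerate URLs pasted without a scheme.
        if "://" not in value:
            value = "https://" + value
        segments = [s for s in urlparse(value).path.split("/") if s]
        if not segments:
            return None
        value = segments[0]
    value = value.lstrip("@").lower()
    return value if USERNAME_RE.match(value) else None
```

Inputs that do not look like a username return `None`, so a caller can skip them instead of spending a request on them.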
Discovery modes can be combined freely; results are merged and deduplicated by username (first-seen wins for discoveredVia / sourceRef).
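The "first-seen wins" rule can be sketched as follows. This is a minimal illustration under assumptions, not the actor's implementation: the function name `merge_candidates`, the tuple layout, and the `"direct"` and `"search"` values for `discoveredVia` are hypothetical (only `"hashtag"` appears in the sample output).

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical candidate shape: (username, discoveredVia, sourceRef)
Candidate = Tuple[str, str, str]


def merge_candidates(candidates: Iterable[Candidate]) -> List[dict]:
    """Deduplicate by case-insensitive username, keeping the first-seen source."""
    merged: Dict[str, dict] = {}
    for username, via, ref in candidates:
        key = username.lower()
        if key not in merged:  # later duplicates never overwrite the first entry
            merged[key] = {"username": key, "discoveredVia": via, "sourceRef": ref}
    # Python dicts preserve insertion order, so discovery order is kept.
    return list(merged.values())


rows = merge_candidates([
    ("natgeo", "hashtag", "wildlife"),
    ("NatGeo", "direct", "natgeo"),
    ("nasa", "search", "space"),
])
```

Here `natgeo` keeps `discoveredVia: "hashtag"` because the hashtag source produced it first.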
Output
One row per profile. Always emitted with the full shape, with empty arrays or nulls for missing fields.
    {
      "username": "natgeo",
      "fullName": "National Geographic",
      "profileUrl": "https://www.instagram.com/natgeo/",
      "biography": "Experience the world...",
      "externalUrl": "https://natgeo.com",
      "isVerified": true,
      "isPrivate": false,
      "isBusinessAccount": true,
      "businessCategory": "Media",
      "followersCount": 281000000,
      "followingCount": 132,
      "postsCount": 28500,
      "profilePicUrl": "https://...",
      "emails": ["press@natgeo.com"],
      "phones": ["+12025550100"],
      "socialHandles": {
        "tiktok": ["natgeo"],
        "youtube": ["natgeo"],
        "x": [],
        "linkedin": [],
        "whatsapp": [],
        "telegram": [],
        "threads": [],
        "facebook": [],
        "pinterest": []
      },
      "scrapedAt": "2026-05-23T12:34:56.000Z",
      "discoveredVia": "hashtag",
      "sourceRef": "veganbakeryberlin",
      "contactSource": "profile_page",
      "status": "ok"
    }
status is one of ok | private | not_found | deactivated | rate_limited | error.
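Because every row carries the full shape, downstream filtering needs no key checks beyond `status` and `emails`. A minimal sketch, assuming the rows have already been downloaded as a list of dicts; the helper name `rows_with_email` and the sample rows other than `natgeo` are made up for illustration.

```python
from typing import Iterable, List


def rows_with_email(rows: Iterable[dict]) -> List[dict]:
    """Keep rows that were scraped successfully and carry at least one email."""
    return [r for r in rows if r.get("status") == "ok" and r.get("emails")]


# Trimmed-down sample rows; real rows contain every field of the output shape.
sample = [
    {"username": "natgeo", "status": "ok", "emails": ["press@natgeo.com"]},
    {"username": "someprivate", "status": "private", "emails": []},
    {"username": "noemail", "status": "ok", "emails": []},
]
leads = rows_with_email(sample)
```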
Email extraction
The actor handles common bio obfuscations before matching:
- name [at] domain [dot] com → name@domain.com
- name (at) domain (dot) com → name@domain.com
- name AT domain DOT com → name@domain.com
- name@@domain → name@domain
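The rewrites above can be approximated with a few regular expressions. This is a sketch of the idea, not the actor's actual rules: the function name `deobfuscate` is hypothetical, and treating the bare words `AT` / `DOT` as upper-case only is an assumption made here to avoid rewriting the ordinary English word "at".

```python
import re

# Bracketed or parenthesised markers, any letter case, with optional padding.
_AT_MARK = re.compile(r"\s*(?:\[\s*at\s*\]|\(\s*at\s*\))\s*", re.IGNORECASE)
_DOT_MARK = re.compile(r"\s*(?:\[\s*dot\s*\]|\(\s*dot\s*\))\s*", re.IGNORECASE)
# Bare words: upper-case only (assumption), and they must be space-delimited.
_AT_WORD = re.compile(r"\s+AT\s+")
_DOT_WORD = re.compile(r"\s+DOT\s+")
_MULTI_AT = re.compile(r"@{2,}")


def deobfuscate(text: str) -> str:
    """Rewrite common bio obfuscations so a plain email regex can match."""
    text = _AT_MARK.sub("@", text)
    text = _DOT_MARK.sub(".", text)
    text = _AT_WORD.sub("@", text)
    text = _DOT_WORD.sub(".", text)
    return _MULTI_AT.sub("@", text)
```

A known limitation of this sketch: an all-caps sentence such as "LOOK AT THIS" would also be rewritten, so a real implementation would likely validate the result against an email pattern afterwards.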
Matched emails are then filtered to drop:
- Image / asset extensions (`.png`, `.jpg`, `.webp`, …)
- Known tracking domains (`sentry.io`, `wixpress.com`, `example.com`, …)
- `noreply@` / `no-reply@` / `donotreply@`
- Purely numeric local-parts longer than 8 characters (tracking IDs)
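The four drop rules can be sketched as a single predicate. The lists below contain only the entries named above (the actor's real lists are longer), and the name `keep_email`, the exact-match treatment of the `noreply` local-parts, and the subdomain matching for blocked domains are assumptions.

```python
# Only the entries named in this README; the actor's real lists are longer.
ASSET_EXTENSIONS = (".png", ".jpg", ".webp")
BLOCKED_DOMAINS = {"sentry.io", "wixpress.com", "example.com"}
NOREPLY_LOCALS = {"noreply", "no-reply", "donotreply"}


def keep_email(email: str) -> bool:
    """Return False for matches that are almost certainly not a contact address."""
    email = email.strip().lower()
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return False
    # "logo@2x.png" style asset names that merely look like emails.
    if domain.endswith(ASSET_EXTENSIONS):
        return False
    # Blocked domain itself or any subdomain of it (assumption).
    if domain in BLOCKED_DOMAINS or any(domain.endswith("." + d) for d in BLOCKED_DOMAINS):
        return False
    if local in NOREPLY_LOCALS:
        return False
    # Long all-digit local-parts are usually tracking IDs.
    if local.isdigit() and len(local) > 8:
        return False
    return True
```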
Phone numbers are parsed via libphonenumber-js and output in E.164.
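Consumers of the `phones` field can sanity-check the E.164 shape (a `+`, then up to 15 digits with a non-zero first digit). The sketch below only checks that shape; it does not parse or validate numbers the way libphonenumber-js does, and the name `is_e164` is hypothetical.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not zero.
E164_RE = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(value: str) -> bool:
    """Shape check only: says nothing about whether the number is assigned."""
    return E164_RE.fullmatch(value) is not None
```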
External URL scanning
When scrapeExternalUrl: true, the actor follows the profile's external_url and scans the response body for additional contacts. This is SSRF-guarded:
- Rejects private / loopback / link-local IPs (post-DNS).
- Rejects non-`http(s)` schemes.
- Caps response size at 2 MB.
- 10s total timeout.
- At most 3 redirects, re-validated per hop.
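The per-hop checks can be sketched with the standard library. This is an illustration of the guard logic only, not the actor's code: DNS resolution and the actual fetch (with the size cap and timeout) are left out, the resolved addresses are passed in by the caller, and the function names are hypothetical.

```python
import ipaddress
from typing import List, Sequence, Tuple


from urllib.parse import urlparse


def is_public_ip(ip: str) -> bool:
    """Reject private, loopback, link-local and other non-routable addresses."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def is_allowed_hop(url: str, resolved_ips: Sequence[str]) -> bool:
    """One hop passes if the scheme is http(s) and every resolved IP is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    return bool(resolved_ips) and all(is_public_ip(ip) for ip in resolved_ips)


def validate_chain(hops: List[Tuple[str, Sequence[str]]], max_redirects: int = 3) -> bool:
    """hops[0] is the initial request; every later entry is one redirect."""
    if len(hops) - 1 > max_redirects:
        return False
    return all(is_allowed_hop(url, ips) for url, ips in hops)
```

Checking the addresses after DNS resolution, rather than the hostname, is what stops a public-looking domain from pointing at an internal address.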
Anti-detection
- Residential proxy required for production scale. Datacenter IPs get challenged within a few requests.
- Session pool with rotation: sessions are retired after rate-limit / login-wall responses.
- Randomised delays between profiles (`minDelayMs` / `maxDelayMs`, default 1500–4500 ms).
- Login wall detection by both URL (`/accounts/login/`) and body markers (`LoginAndSignupPage`, etc.).
- Cheerio-based crawler by default: no headless browser, much cheaper.
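Two of the points above, the randomised delay and the login-wall check, are simple enough to sketch. The function names are hypothetical, the uniform distribution is an assumption, and only the one body marker named above is included because the full marker list is not given here.

```python
import random

LOGIN_PATH = "/accounts/login/"
# Only the marker named in this README; the actor checks more ("etc.").
BODY_MARKERS = ("LoginAndSignupPage",)


def next_delay_ms(min_delay_ms: int = 1500, max_delay_ms: int = 4500) -> float:
    """Pick a delay between profiles; a uniform distribution is assumed."""
    return random.uniform(min_delay_ms, max_delay_ms)


def is_login_wall(final_url: str, body: str) -> bool:
    """Flag a response as a login wall by its final URL or by a body marker."""
    return LOGIN_PATH in final_url or any(marker in body for marker in BODY_MARKERS)
```

Checking the final URL catches redirects to the login page, while the body check catches walls served with a normal profile URL.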
Local development
    npm install
    # place a test input
    mkdir -p storage/key_value_stores/default
    echo '{"usernames":["natgeo"],"maxProfiles":1}' > storage/key_value_stores/default/INPUT.json
    # run
    npm start
    # test
    npm test
Local runs use no proxy by default and will hit IG's login wall after a few requests. That's expected: local runs are for testing logic. Scale-test on the Apify platform with a residential proxy.
Legal & ethical
- This actor scrapes only public profiles. Private profiles return only basic metadata (no bio).
- Profile pictures are stored as URLs only, never as binaries.
- You are responsible for compliance with GDPR, CCPA, CAN-SPAM, and local data-protection laws when using extracted contact info for outreach.
- Do not use this actor for harassment, stalking, or targeted abuse.
