URL: https://apify.com/epctex/osint-scraper

OSINT Data Extractor · Apify


Pricing

$10.00/month + usage

Harness the power of OSINT data with our advanced OSINT Scraper. Discover keywords and leaked information from platforms like Ideone, Dumpz, Github Gist, Pastebin, Pasteorg and Textbin. You can specify search terms, customize and retrieve OSINT data out of the box.

Rating

5.0

(7)

Developer

๐Ÿ‘ epctex

epctex

Maintained by Community

Actor stats

41

Bookmarked

1.1K

Total users

14

Monthly active users

15 hours

Issues response

18 hours ago

Last modified

Actor - OSINT Scraper

OSINT scraper

This actor helps you retrieve sensitive data from websites where it may have been leaked.

The OSINT data scraper supports the following features:

  • Search any keyword - Search for any keyword you like and get the matching results.

  • Scrape multiple websites - Scrape Codepad, Dumpz, Github Gist, Ideone, Pastebin, Pasteorg, Textbin and many more websites.

  • Customizable - Create a custom scraping function that runs on the results to fit your needs.

Bugs, fixes, updates, and changelog

This scraper is under active development. If you have any feature requests, you can create an issue.

Input Parameters

The input to this scraper should be a JSON object specifying the keywords to search for and the websites (modules) to scrape. Possible fields are:

  • searchKeywords: (Required) (Array of strings) Keywords that you want to search for on the websites.

  • codepad: (Optional) (Boolean) Enables the codepad module, which scrapes http://codepad.org/.

  • githubgist: (Optional) (Boolean) Enables the Github Gist module, which scrapes https://gist.github.com/.

  • ideone: (Optional) (Boolean) Enables the ideone module, which scrapes https://ideone.com/.

  • pastebin: (Optional) (Boolean) Enables the pastebin module, which scrapes https://pastebin.com/.

  • pasteorg: (Optional) (Boolean) Enables the pasteorg module, which scrapes https://www.paste.org/.

  • textbin: (Optional) (Boolean) Enables the textbin module, which scrapes https://textbin.net/.

  • proxy: (Required) (Proxy Object) Proxy configuration.

  • extendOutputFunction: (Optional) (String) Function that takes a jQuery handle ($) as an argument and returns an object with data.

This solution requires proxy servers. You can use either your own proxy servers or Apify Proxy.

Tip

When you want to scrape only a couple of modules, set their module flags to true and the actor will scrape only those websites.

If you want to scrape Pastebin, please try to use US-based proxies because Pastebin is restricted in many countries.

Compute Unit Consumption

The actor is optimized to run fast and scrape as many listings as possible, so it prioritizes all listing detail requests. If the actor is not blocked very often, a run consumes ~0.01-0.03 compute units per 100 pages.
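As a rough illustration of the figure above, the estimate can be turned into a small calculation. This is a minimal sketch that assumes consumption scales linearly with the number of pages and that the run is rarely blocked. The function name `estimate_compute_units` is hypothetical and not part of the actor.

```python
# Documented range: ~0.01-0.03 compute units per 100 pages (when rarely blocked).
CU_PER_100_PAGES_LOW = 0.01
CU_PER_100_PAGES_HIGH = 0.03


def estimate_compute_units(pages: int) -> tuple[float, float]:
    """Return a (low, high) compute-unit estimate for a number of pages.

    Assumes linear scaling, which is only an approximation.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    hundreds = pages / 100
    return hundreds * CU_PER_100_PAGES_LOW, hundreds * CU_PER_100_PAGES_HIGH


if __name__ == "__main__":
    low, high = estimate_compute_units(5000)
    print(f"5000 pages: roughly {low:.2f}-{high:.2f} compute units")
```

Under these assumptions, 5,000 pages would cost somewhere between 0.5 and 1.5 compute units. Actual usage may be higher if the actor is blocked and has to retry.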

OSINT Scraper Input example

{
  "searchKeywords": [
    "@gmail",
    "db_pass"
  ],
  "codepad": true,
  "githubgist": true,
  "ideone": true,
  "pastebin": true,
  "pasteorg": true,
  "textbin": true,
  "proxy": {
    "useApifyProxy": true
  }
}
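If you build the input programmatically, a small helper can enable only the modules you need, as the Tip describes. The sketch below is illustrative: the helper name `build_input` is hypothetical, and it assumes that module flags left out of the input are treated as disabled, which the documentation implies but does not state outright.

```python
import json

# Module flags listed in the Input Parameters section.
MODULES = ("codepad", "githubgist", "ideone", "pastebin", "pasteorg", "textbin")


def build_input(keywords, modules, use_apify_proxy=True):
    """Build an input dict that enables only the given modules."""
    unknown = set(modules) - set(MODULES)
    if unknown:
        raise ValueError(f"unknown modules: {sorted(unknown)}")
    run_input = {"searchKeywords": list(keywords)}
    # Only the selected modules are set to true; the others are left out
    # (assumption: an omitted flag means the module is not scraped).
    for name in MODULES:
        if name in modules:
            run_input[name] = True
    run_input["proxy"] = {"useApifyProxy": use_apify_proxy}
    return run_input


if __name__ == "__main__":
    example = build_input(["db_pass"], ["pastebin", "githubgist"])
    print(json.dumps(example, indent=2))
```

The printed JSON can be pasted into the actor's input editor or passed to whichever API client you use.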

During the Run

During the run, the actor outputs messages letting you know what is going on. Each message contains a short label specifying which page from the provided list is currently being processed. When items are loaded from a page, you should see a message about this event with the loaded item count and total item count for that page.

If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.
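To catch mistakes before starting a run, you could check the input on your side first. The sketch below is a client-side pre-check based only on the field descriptions in Input Parameters. It is not the actor's own validation logic, and the function name `find_input_problems` is hypothetical.

```python
# Module flags listed in the Input Parameters section.
MODULES = ("codepad", "githubgist", "ideone", "pastebin", "pasteorg", "textbin")


def find_input_problems(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []

    # searchKeywords is required and is an array of strings.
    keywords = run_input.get("searchKeywords")
    if not isinstance(keywords, list) or not keywords:
        problems.append("searchKeywords must be a non-empty array")
    elif not all(isinstance(k, str) for k in keywords):
        problems.append("searchKeywords must contain only strings")

    # proxy is required and is an object.
    if not isinstance(run_input.get("proxy"), dict):
        problems.append("proxy is required and must be an object")

    # Module flags are optional, but must be booleans when present.
    for name in MODULES:
        if name in run_input and not isinstance(run_input[name], bool):
            problems.append(f"{name} must be a boolean")

    return problems


if __name__ == "__main__":
    bad_input = {"searchKeywords": "db_pass", "pastebin": "yes"}
    for problem in find_input_problems(bad_input):
        print("input problem:", problem)
```

An empty list means the input matches the documented shape. The actor may still reject it for reasons this check does not cover.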

OSINT Export

During the run, the actor stores results into a dataset. Each result is stored as a separate item in the dataset.

You can manage the results in any language (Python, PHP, Node.js/npm). See the FAQ or our API reference to learn more about getting results from this OSINT actor.

Scraped Output

The structure of each item in OSINT Scraper looks like this:

Item Detail

{
  "keyword": "a",
  "url": "https://gist.github.com/trin94/3381395adc8b2c3fea81a38b9a385369"
}
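Once the dataset items are downloaded, a common first step is to group the URLs by the keyword that matched. The sketch below assumes the items have already been loaded into a list of dicts with the `keyword` and `url` fields shown above. The function name `group_urls_by_keyword` is hypothetical, and the second sample URL is a placeholder.

```python
from collections import defaultdict


def group_urls_by_keyword(items):
    """Map each keyword to the list of unique URLs where it was found."""
    grouped = defaultdict(list)
    for item in items:
        keyword, url = item.get("keyword"), item.get("url")
        if keyword is None or url is None:
            continue  # skip items that lack the documented fields
        if url not in grouped[keyword]:
            grouped[keyword].append(url)
    return dict(grouped)


if __name__ == "__main__":
    sample_items = [
        {
            "keyword": "a",
            "url": "https://gist.github.com/trin94/3381395adc8b2c3fea81a38b9a385369",
        },
        # Placeholder item for illustration only.
        {"keyword": "db_pass", "url": "https://example.com/paste/1"},
    ]
    for keyword, urls in group_urls_by_keyword(sample_items).items():
        print(keyword, len(urls))
```

Duplicate URLs for the same keyword are collapsed, which helps when several modules surface the same paste.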

Contact

Visit epctex.com to see all available products. If you are looking for a custom integration, reach out to us through the chat box on epctex.com. For support, contact business@epctex.com.
