URL: https://apify.com/automation-lab/failory-startups-scraper

Failory Startups Scraper | Startup Directory Data · Apify


Pricing

Pay per event

Failory Startups Scraper

Scrape Failory startup directory pages into clean startup profiles with websites, industries, founders, funding details, and investors.

Rating

0.0

(0)

Developer

๐Ÿ‘ Stas Persiianenko

Stas Persiianenko

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 2

Monthly active users: 1

Last modified: 2 days ago

Extract structured startup profiles from public Failory startup directory pages.

Use this actor to collect startup names, websites, industries, descriptions, founders, funding snippets, headquarters, investor names, and source-page metadata from country and category lists such as United States startups, SaaS startups, fintech startups, and AI startups.

What does Failory Startups Scraper do?

Failory Startups Scraper turns public Failory directory pages into clean dataset rows.

It fetches Failory pages over HTTP, parses the startup cards and company detail tables, and exports one row per startup profile.

Typical rows include:

  • Startup name
  • Website URL
  • Failory source URL
  • Industries and category badges
  • Company description
  • Headquarters, city, and country
  • Founders
  • Funding amount and latest funding status
  • Top investors
  • Logo URL
  • Scrape timestamp

Who is it for?

This actor is useful for teams that need startup intelligence without manual copy-paste.

  • Lead generation teams building startup prospect lists
  • B2B sales teams targeting funded companies
  • Market researchers mapping startup categories
  • Investors and accelerators screening startup ecosystems
  • Product marketers researching competitor categories
  • Data teams feeding startup records into CRMs or BI tools

Why use this actor?

Failory pages are useful, but browsing them manually is slow.

This actor gives you structured, exportable records that can be filtered, enriched, deduplicated, and joined with other datasets.

Benefits:

  • No browser automation needed for normal runs
  • Public HTTP pages only
  • Country and category page support
  • Source metadata included for traceability
  • One dataset row per startup
  • Works with Apify datasets, webhooks, APIs, and integrations

What Failory pages can I scrape?

Use public URLs under https://www.failory.com/startups.

Examples:

  • https://www.failory.com/startups/united-states
  • https://www.failory.com/startups/saas
  • https://www.failory.com/startups/artificial-intelligence
  • https://www.failory.com/startups/fintech
  • https://www.failory.com/startups/united-kingdom

You can also provide slugs like united-states, saas, or artificial-intelligence.
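
The relationship between a slug and its directory URL can be sketched as follows. This is a minimal illustration, not the actor's own normalization logic, which is not documented here; the helper name `to_directory_url` and the space-to-hyphen rule are assumptions.

```python
BASE_URL = "https://www.failory.com/startups"


def to_directory_url(value: str) -> str:
    """Return a directory URL for a slug; pass full URLs through unchanged.

    Hypothetical helper: the lowercase and space-to-hyphen handling is an
    assumption about how a slug maps onto the URL pattern shown above.
    """
    value = value.strip()
    if value.startswith(("http://", "https://")):
        # Already a URL: only drop a trailing slash.
        return value.rstrip("/")
    slug = value.strip("/").lower().replace(" ", "-")
    return f"{BASE_URL}/{slug}"


if __name__ == "__main__":
    for entry in ["united-states", "saas", "artificial-intelligence"]:
        print(to_directory_url(entry))
```

A helper like this is handy for checking, before a run, which pages a list of slugs is expected to resolve to.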

Data table

Field               Description
startupName         Startup or company name
rank                Rank/order on the Failory page
websiteUrl          External website linked by Failory
sourceUrl           Failory page where the startup was found
sourcePageTitle     Page title or heading
sourcePageSlug      Failory slug after /startups/
sourcePageType      Country, category, directory, or unknown
sourcePageLabel     Human-readable page label
description         Startup description paragraph
industries          Array of Failory badges
industryText        Comma-separated industries for CSV tools
headquarters        Headquarters text
city                Parsed city from headquarters
country             Parsed country from headquarters
yearFounded         Founded year
founders            Founder names
fundingAmount       Funding amount text
startupSize         Size bucket from Failory
lastFundingStatus   Latest funding stage/status
topInvestors        Investor names
logoUrl             Logo image URL
scrapedAt           ISO timestamp

How much does it cost to scrape Failory startup profiles?

The default pay-per-event setup is designed for affordable startup lead generation.

Pricing uses:

  • A small run-start event
  • A per-profile result event for each startup saved

At the BRONZE price, 1,000 startup profiles cost $0.50 plus the small start fee. Tiered discounts apply on higher Apify plans.
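
The arithmetic behind that figure can be sketched as below. The per-profile rate is derived from the BRONZE price quoted above ($0.50 per 1,000 profiles). The start fee amount is not stated in this document, so it appears as a hypothetical parameter that defaults to zero.

```python
# Derived from the BRONZE price above: $0.50 per 1,000 profiles.
PRICE_PER_PROFILE_USD = 0.50 / 1000


def estimate_cost(profiles: int, start_fee_usd: float = 0.0) -> float:
    """Estimate the cost of one run in USD.

    start_fee_usd is an assumption: the document only says the run-start
    event is small, so pass the real value from the actor's pricing tab.
    """
    if profiles < 0:
        raise ValueError("profiles must be non-negative")
    return round(start_fee_usd + profiles * PRICE_PER_PROFILE_USD, 4)


if __name__ == "__main__":
    for count in (100, 1000, 5000):
        print(count, "profiles:", estimate_cost(count), "USD plus start fee")
```

Because cost scales linearly with saved profiles, maxItems works as a spending cap for a run.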

How to run it

  1. Open the actor on Apify.
  2. Add one or more Failory startup directory URLs.
  3. Optionally add slugs such as saas or united-states.
  4. Set maxItems to the number of startup profiles you need.
  5. Set maxPages if you start from a broad directory page.
  6. Click Start.
  7. Export the dataset as JSON, CSV, Excel, XML, RSS, or HTML.

Input example

{
  "startUrls": [
    {"url": "https://www.failory.com/startups/united-states"},
    {"url": "https://www.failory.com/startups/saas"}
  ],
  "slugs": ["artificial-intelligence"],
  "maxItems": 100,
  "maxPages": 5
}
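
When the input is generated from code rather than typed by hand, a small builder keeps the field names consistent. This is a sketch only: the field names come from the input example above, while the function name `build_input` and the positive-integer checks are assumptions rather than documented actor behaviour.

```python
import json


def build_input(urls=(), slugs=(), max_items=100, max_pages=5) -> dict:
    """Build an actor input dict using the field names from the example.

    The validation rules are assumptions, not documented constraints.
    """
    if max_items < 1 or max_pages < 1:
        raise ValueError("max_items and max_pages must be at least 1")
    return {
        "startUrls": [{"url": url} for url in urls],
        "slugs": list(slugs),
        "maxItems": max_items,
        "maxPages": max_pages,
    }


if __name__ == "__main__":
    payload = build_input(
        urls=["https://www.failory.com/startups/saas"],
        slugs=["artificial-intelligence"],
    )
    print(json.dumps(payload, indent=2))
```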

Output example

{
  "startupName": "Perplexity",
  "rank": 2,
  "websiteUrl": "https://www.perplexity.ai/?ref=failory",
  "sourceUrl": "https://www.failory.com/startups/united-states",
  "sourcePageType": "country",
  "sourcePageLabel": "United States",
  "description": "Perplexity has developed an AI-powered answer engine...",
  "industries": ["AI", "Chatbot", "Generative AI"],
  "headquarters": "San Francisco, California, United States",
  "country": "United States",
  "yearFounded": 2022,
  "fundingAmount": "$1.5B",
  "lastFundingStatus": "Venture Round"
}
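
In this example, websiteUrl carries a ?ref=failory query parameter. If you want canonical website URLs for matching against a CRM, one possible post-processing step is sketched below. The helper `strip_ref_param` is hypothetical and is not part of the actor; it assumes that only the ref parameter should be removed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_ref_param(url: str) -> str:
    """Remove the 'ref' query parameter and keep everything else intact."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "ref"
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


if __name__ == "__main__":
    print(strip_ref_param("https://www.perplexity.ai/?ref=failory"))
```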

Tips for best results

  • Start with one or two specific pages before scraping many categories.
  • Use maxItems to control dataset size and cost.
  • Use sourcePageLabel to group records by country or category.
  • Deduplicate by startupName and websiteUrl when combining pages.
  • Use industryText for spreadsheet filters.
  • Use industries when processing JSON programmatically.
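
The deduplication and grouping tips above could look like this when applied to exported JSON items. It is a sketch under assumptions: the key normalization (lowercasing, trimming, dropping a trailing slash) is one reasonable choice, and the sample records are made up for illustration.

```python
from collections import defaultdict


def dedupe(items: list) -> list:
    """Keep the first record for each (startupName, websiteUrl) pair."""
    seen = set()
    unique = []
    for item in items:
        key = (
            (item.get("startupName") or "").strip().lower(),
            (item.get("websiteUrl") or "").strip().lower().rstrip("/"),
        )
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def group_by_label(items: list) -> dict:
    """Group records by sourcePageLabel, using 'unknown' when it is missing."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("sourcePageLabel") or "unknown"].append(item)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample rows, not real Failory data.
    sample = [
        {"startupName": "Acme", "websiteUrl": "https://acme.example/", "sourcePageLabel": "SaaS"},
        {"startupName": "acme ", "websiteUrl": "https://acme.example", "sourcePageLabel": "United States"},
    ]
    print(len(dedupe(sample)), "unique record(s)")
```

Keeping the first occurrence means a startup listed on several pages retains the sourcePageLabel of the page scraped first; group before deduplicating if you need every label.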

Integrations

You can connect the dataset to common workflows:

  • Send startup records to a CRM
  • Trigger outreach workflows with Apify webhooks
  • Load startup datasets into BigQuery, Snowflake, or Sheets
  • Enrich websites with separate email or SEO actors
  • Feed profiles into research assistants or scoring models

API usage with Node.js

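import { ApifyClient } from 'apify-client';

// Read the API token from the environment instead of hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/failory-startups-scraper').call({
    startUrls: [{ url: 'https://www.failory.com/startups/saas' }],
    maxItems: 50,
    maxPages: 2,
});

// Fetch the startup profiles from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);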

API usage with Python

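import os

from apify_client import ApifyClient

# Read the API token from the environment instead of hard-coding it.
client = ApifyClient(os.environ['APIFY_TOKEN'])

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/failory-startups-scraper').call(run_input={
    'startUrls': [{'url': 'https://www.failory.com/startups/united-states'}],
    'maxItems': 50,
    'maxPages': 2,
})

# Fetch the startup profiles from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)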

API usage with cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~failory-startups-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"startUrls":[{"url":"https://www.failory.com/startups/saas"}],"maxItems":50,"maxPages":2}'

MCP integration

Use the actor from MCP-compatible tools through Apify MCP Server.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/failory-startups-scraper

Add it from Claude Code with a command like:

$ claude mcp add apify-failory-startups "https://mcp.apify.com/?tools=automation-lab/failory-startups-scraper"

JSON configuration example:

{
  "mcpServers": {
    "apify-failory-startups": {
      "url": "https://mcp.apify.com/?tools=automation-lab/failory-startups-scraper"
    }
  }
}

Example prompts:

  • "Use the Failory Startups Scraper MCP tool to scrape 50 SaaS startups and summarize the most common industries."
  • "Run automation-lab/failory-startups-scraper for United States startups and format the top funded companies as a table."
  • "Find Failory AI startups with the MCP tool and identify founders and investors mentioned in the data."

Claude Desktop MCP setup

Add Apify MCP Server to Claude Desktop and include this actor in the tools query.

Use your Apify token for authentication. Then ask Claude to run automation-lab/failory-startups-scraper with a Failory URL and a small maxItems value.

Claude Code MCP setup

Configure the Apify MCP endpoint with:

https://mcp.apify.com/?tools=automation-lab/failory-startups-scraper

Add it from Claude Code with a command like:

$ claude mcp add apify-failory-startups "https://mcp.apify.com/?tools=automation-lab/failory-startups-scraper"

You can also use a JSON-style MCP server configuration:

{
  "mcpServers": {
    "apify-failory-startups": {
      "url": "https://mcp.apify.com/?tools=automation-lab/failory-startups-scraper"
    }
  }
}

Then call the actor from your coding workflow to create fixtures, update prospect files, or refresh market datasets.

Example MCP prompts:

  • "Use the Failory Startups Scraper tool to get 20 SaaS startups and save the names and websites as CSV."
  • "Run the Failory actor for united-states with maxItems 10 and summarize funding stages."
  • "Collect AI startup profiles from Failory and return founder and investor columns."

Data quality notes

The actor extracts what Failory publishes in the page HTML.

Some records may not include every field. For example, some pages may omit funding amount, investor names, or founder names. Empty values are exported as null or empty arrays depending on field type.
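
Downstream code can check for these gaps before loading records into a CRM or BI tool. The sketch below flags fields that are absent, null, empty strings, or empty arrays. The helper name `missing_fields` and the choice of required fields are illustrative assumptions, and the sample record is made up.

```python
def missing_fields(record: dict, fields) -> list:
    """Return the names of fields that are absent, None, '' or []."""
    return [name for name in fields if record.get(name) in (None, "", [])]


if __name__ == "__main__":
    # Made-up record showing the kinds of gaps described above.
    record = {
        "startupName": "Acme",
        "fundingAmount": None,
        "founders": [],
        "topInvestors": ["Example Capital"],
    }
    required = ["startupName", "fundingAmount", "founders", "topInvestors"]
    print(missing_fields(record, required))
```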

Limitations

  • It does not log in to Failory.
  • It does not bypass private data walls.
  • It does not infer emails or phone numbers.
  • It does not guarantee that Failory's public data is current.
  • It extracts startup profiles from list pages, not unrelated blog posts.

FAQ

Can I scrape Failory without a login?

Yes. This actor only uses public Failory startup directory pages that are visible without an account.

Does it extract emails or phone numbers?

No. Failory startup directory pages do not consistently publish emails or phone numbers, so the actor does not invent or infer them.

Why do some pages return only a small number of records?

Some Failory pages expose a curated top list in the HTML. The actor exports the records available in that public page markup.

Can I start from the main /startups page?

Yes. The actor can discover country and category links from the main directory page. Increase maxPages if you want it to follow more discovered pages.

Troubleshooting

If you get zero records, check that your URL is a Failory startup directory page.

Good URL pattern:

https://www.failory.com/startups/<country-or-category-slug>

If a broad /startups page returns fewer records than expected, increase maxPages so the actor can follow more discovered directory links.
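
A quick pre-flight check against the URL pattern above can catch a wrong input before a run is started. The sketch below is an assumption about what counts as a valid directory URL, based only on the examples in this document: the www.failory.com host, the /startups path, and an optional lowercase slug. The actor itself may accept more.

```python
import re
from urllib.parse import urlsplit

# /startups, optionally followed by a single lowercase slug segment.
DIRECTORY_PATH = re.compile(r"^/startups(?:/[a-z0-9-]+)?/?$")


def is_directory_url(url: str) -> bool:
    """Return True if url looks like a Failory startup directory page."""
    parts = urlsplit(url)
    return (
        parts.scheme == "https"
        and parts.netloc == "www.failory.com"
        and bool(DIRECTORY_PATH.match(parts.path))
    )


if __name__ == "__main__":
    for candidate in [
        "https://www.failory.com/startups/saas",
        "https://www.failory.com/startups",
        "https://www.failory.com/blog/some-post",
    ]:
        print(candidate, is_directory_url(candidate))
```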

Legality

This actor is designed to scrape publicly available Failory pages. It does not access private accounts or restricted data. You are responsible for using the exported data in accordance with applicable laws, Failory's terms, and privacy rules that apply to your use case.

Related scrapers

Other automation-lab actors that may complement this workflow:

Changelog

0.1

Initial version with HTTP extraction for public Failory startup directory pages.

Support

If a Failory page layout changes or a specific startup category stops parsing, open an Apify issue with the input URL and run ID so we can reproduce it quickly.
