
URL: https://apify.com/tri_angle/e-commerce-product-matching-tool

E-commerce Product Matching Tool · Apify



Pricing

from $1.00 / 1,000 vector matching results


Match products across e-commerce datasets with E-Commerce Product Matching Tool. Use it with E-commerce Scraping Tool datasets to automatically find identical and similar products and power price monitoring or catalog comparison.


Rating: 0.0 (0)

Developer: Tri⟁angle (Maintained by Apify)

Actor stats: 0 bookmarked, 3 total users, 0 monthly active users

Last modified: a day ago

🛒 E-Commerce Product Matching Tool

Match and compare products across any two e-commerce datasets. Find identical, similar, and related products between your own catalog and any competitor - with optional AI validation for higher-confidence results.


🧠 What it does

The E-Commerce Product Matching Tool takes two product datasets and automatically finds which products in one match products in the other. It runs each dataset through a three-stage pipeline - converting products into comparable representations, scoring every possible pair for similarity, and optionally using an AI model to validate the results and explain its reasoning.

It is designed to work with datasets collected by the E-Commerce Scraping Tool.

The tool is useful for anyone who needs to reconcile, compare, or deduplicate product information across two sources - without manually reviewing thousands of rows.


⚙️ How it works

The tool runs your two datasets through a three-stage process:

🔢 Stage 1 - Vectorization

Five fields from each product are extracted and converted into numerical vectors: title, brand, category, description, and specifications. These vectors are stored in a vector database, which makes it possible to compare thousands of products in seconds based on semantic meaning rather than exact text matches. Products can therefore be matched even when retailers use different wording or formatting.

📐 Stage 2 - Similarity matching

Every product in Dataset A is compared against Dataset B, and each pair is assigned a similarity score from 0 to 100. You can include all evaluated pairs in the output, or keep only the pairs that meet your similarity threshold.

🤖 Stage 3 - AI validation (optional)

If you enable LLM matching, an AI model reviews each candidate pair and gives a final verdict on whether it is a genuine match. It also explains its reasoning so you can understand each decision. This stage runs only on the pairs that passed the similarity threshold, which keeps costs under control.

Dataset A + Dataset B
    ↓
Vectorization
    ↓
Similarity scoring ←── threshold filter (optional)
    ↓
AI validation ←── enable with "Use LLM matching" (optional)
    ↓
Output
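
The page does not say how the similarity score is computed. As a rough mental model only, the sketch below assumes cosine similarity between two embedding vectors, rescaled to the 0-100 range, followed by the threshold check. The function names and the idea of clamping negative values to 0 are illustrative assumptions, not the tool's documented behavior.

```python
import math


def similarity_score(vec_a, vec_b):
    """Cosine similarity rescaled to 0-100 (assumed scheme, for illustration only)."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector carries no direction, so treat it as no similarity.
        return 0.0
    cosine = dot / (norm_a * norm_b)
    # Clamp negative cosines to 0 so the score stays within 0-100.
    return max(0.0, cosine) * 100


def passes_threshold(score, threshold=70):
    """A pair qualifies when its score meets or exceeds the threshold."""
    return score >= threshold
```

With this model, two identical vectors score 100 and two orthogonal vectors score 0; the default threshold of 70 sits between those extremes.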

🚀 Before you start

You need two Apify datasets containing product data. The easiest way to collect them is with the E-Commerce Scraping Tool, which lets you scrape product listings from Amazon, Walmart, eBay, and hundreds of other retailers in a single run.

Once you have your datasets, copy their dataset IDs from Apify Console and paste them into the input fields below.


⚙️ Input

Required

Parameter | Type | Description
datasetIdA | string | Dataset ID for your first product list (e.g. your own catalog)
datasetIdB | string | Dataset ID for your second product list (e.g. a competitor's catalog)

Options

Parameter | Type | Default | Description
useLlmMatching | boolean | false | Run AI validation on similarity candidates for higher-confidence results with reasoning explanations
vectorMatchesOnly | boolean | false | Only include product pairs that meet the similarity threshold in the output. When disabled, all evaluated pairs are returned with their scores
maxOutputItems | number | unlimited | Stop processing after this many output items. Use this to cap cost on large datasets

Advanced options

Parameter | Type | Default | Description
vectorSimilarityThreshold | number (0-100) | 70 | Minimum similarity score for a pair to qualify as a match. Lower values return more results with more potential false positives; higher values return fewer, more precise results
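
The parameters above can be assembled and sanity-checked before a run. The helper below is a hypothetical convenience function, not part of any Apify library; it only mirrors the documented parameter names, defaults, and the 0-100 range of the threshold.

```python
def build_input(dataset_id_a, dataset_id_b, *, use_llm_matching=False,
                vector_matches_only=False, max_output_items=None,
                vector_similarity_threshold=70):
    """Build the Actor input dict using the documented parameter names."""
    if not dataset_id_a or not dataset_id_b:
        raise ValueError("Both datasetIdA and datasetIdB are required")
    if not 0 <= vector_similarity_threshold <= 100:
        raise ValueError("vectorSimilarityThreshold must be between 0 and 100")

    run_input = {
        "datasetIdA": dataset_id_a,
        "datasetIdB": dataset_id_b,
        "useLlmMatching": use_llm_matching,
        "vectorMatchesOnly": vector_matches_only,
        "vectorSimilarityThreshold": vector_similarity_threshold,
    }
    # Omit maxOutputItems entirely to keep the documented default (unlimited).
    if max_output_items is not None:
        if max_output_items <= 0:
            raise ValueError("maxOutputItems must be a positive number")
        run_input["maxOutputItems"] = max_output_items
    return run_input
```

For example, `build_input("<YOUR_FIRST_DATASET_ID>", "<YOUR_SECOND_DATASET_ID>", max_output_items=1000)` returns a dict that can be passed as the run input.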

📦 Output

Each output item represents one evaluated product pair. The output always includes the similarity assessment from Stage 2. When LLM matching is enabled, it also includes the AI verdict and reasoning from Stage 3.

📦 Output fields - similarity matching

Field | Type | Description
productA | object | Product data from Dataset A
productB | object | Product data from Dataset B
similarityScore | number | Similarity score from 0 to 100
is_match | boolean | Whether the pair meets the similarity threshold

Additional fields when LLM matching is enabled

Field | Type | Description
llm_is_match | boolean | AI verdict: true if the model considers this a genuine product match
llm_reasoning | string | The AI model's explanation of its verdict
llm_relationship | string | The AI model's classification of the relationship between the two products. Possible values: "same-product", "variant", "different-product"
llm_differences | array | List of specific differences identified by the AI model between the two products. Empty array when products are identical or near-identical

Example output - similarity matching only

{
  "productA": {
    "title": "Apple AirPods Pro (2nd Generation)",
    "price": 249,
    "brand": "Apple",
    "url": "https://www.amazon.com/..."
  },
  "productB": {
    "title": "Apple AirPods Pro 2nd Gen - USB-C",
    "price": 229,
    "brand": "Apple",
    "url": "https://www.walmart.com/..."
  },
  "similarityScore": 94,
  "is_match": true
}

Example output - with LLM matching enabled

{
  "productA": {
    "title": "Apple AirPods Pro (2nd Generation)",
    "price": 249,
    "brand": "Apple",
    "url": "https://www.amazon.com/..."
  },
  "productB": {
    "title": "Apple AirPods Pro 2nd Gen - USB-C",
    "price": 229,
    "brand": "Apple",
    "url": "https://www.walmart.com/..."
  },
  "similarityScore": 94,
  "is_match": true,
  "llm_is_match": true,
  "llm_reasoning": "Both products are the Apple AirPods Pro 2nd generation. The title variation reflects the USB-C connector variant, which is the same product sold under a slightly different listing title. Brand, model generation, and key features are identical.",
  "llm_relationship": "same-product",
  "llm_differences": []
}
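
Because the LLM fields are present only when LLM matching is enabled, downstream code has to handle both output shapes. The sketch below shows one possible way to do that on a list of output items already loaded as dicts. The helper names and the "not-validated" bucket label are hypothetical, and preferring the LLM verdict over the similarity verdict is a design choice, not something the tool prescribes.

```python
def is_confirmed_match(item):
    """Use the LLM verdict when present, otherwise fall back to is_match."""
    if "llm_is_match" in item:
        return bool(item["llm_is_match"])
    return bool(item.get("is_match", False))


def group_by_relationship(items):
    """Bucket items by llm_relationship; items without it go to 'not-validated'."""
    groups = {}
    for item in items:
        key = item.get("llm_relationship", "not-validated")
        groups.setdefault(key, []).append(item)
    return groups
```

Grouping by relationship makes it easy to review "variant" pairs separately from "same-product" pairs.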

💼 Use cases

🏷️ Competitive price monitoring

Scrape your own product catalog and a competitor's catalog using the E-Commerce Scraping Tool, then run both datasets through this tool to find where the same products are priced differently. Schedule it to run weekly for ongoing price intelligence.
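
As an illustration of the price comparison step, the sketch below takes output items (as dicts) and lists matched pairs with their price difference. It assumes each product object carries a numeric `price` field in the same currency, as in the example output; real datasets may need currency or format handling first. The function name is hypothetical.

```python
def price_gaps(items):
    """Return matched pairs with a price difference, largest gap first."""
    gaps = []
    for item in items:
        if not item.get("is_match"):
            continue
        price_a = item.get("productA", {}).get("price")
        price_b = item.get("productB", {}).get("price")
        # Skip pairs where either price is missing or not numeric.
        if not isinstance(price_a, (int, float)) or not isinstance(price_b, (int, float)):
            continue
        gaps.append({
            "titleA": item["productA"].get("title"),
            "titleB": item["productB"].get("title"),
            "priceA": price_a,
            "priceB": price_b,
            "difference": price_a - price_b,  # positive means A is more expensive
        })
    return sorted(gaps, key=lambda gap: abs(gap["difference"]), reverse=True)
```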

🗂️ Catalog deduplication

If you manage product feeds from multiple suppliers, run any two feeds through the tool to identify duplicate or near-duplicate listings before merging them into your master catalog.

🛍️ Marketplace comparison

Compare your Amazon listings against your Walmart listings to find products that exist in one place but not the other, or that have mismatched titles, prices, or descriptions across platforms.
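
The output contains evaluated pairs, so finding products that exist in only one place means comparing the pairs against the original catalog. A minimal sketch, assuming the catalog items and the `productA` objects share a stable key such as `url` (an assumption about your data, not a guarantee from the tool); the function name is hypothetical.

```python
def unmatched_products(catalog_a, pairs, key="url"):
    """Return catalog A items that have no matching pair in the output."""
    matched_keys = {
        pair["productA"].get(key)
        for pair in pairs
        if pair.get("is_match")
    }
    return [product for product in catalog_a if product.get(key) not in matched_keys]
```

Swapping `productA` for `productB` in the same logic gives the products that exist only in the second catalog.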

🔄 Product feed alignment

Reconcile an internal product database against an external feed (a distributor, a retailer, or a data provider) to verify coverage and spot discrepancies.


💰 Pricing

The tool uses a pay-per-event pricing model - you are charged based on the number of product pairs processed, not for the run itself.

Controlling costs

  • Set maxOutputItems to cap the number of pairs processed in a single run. The tool stops as soon as the limit is reached, so your cost is fully bounded.
  • Use vectorMatchesOnly: true to filter early - only pairs that pass the similarity threshold proceed to output (and to LLM validation if enabled), which reduces cost on datasets with low match rates.
  • LLM matching adds cost per validated item. Disable it if the similarity score alone gives you sufficient signal for your use case.
  • Run a small test with a sample of each dataset to calibrate your similarity threshold before processing the full dataset.
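
For a rough budget check, the sketch below estimates an upper bound on the vector matching charge. It assumes every output item counts as one chargeable vector matching result and uses the listed starting price of $1.00 per 1,000 results; the actual rate may differ, and LLM validation cost is left out because this page does not state its price. Both function names are hypothetical.

```python
def max_output_pairs(size_a, size_b, max_output_items=None):
    """Upper bound on output items: all A x B pairs, capped by maxOutputItems."""
    all_pairs = size_a * size_b
    if max_output_items is None:
        return all_pairs
    return min(all_pairs, max_output_items)


def estimate_vector_cost(pairs, price_per_1000=1.00):
    """Estimated vector matching charge in USD for the given number of results."""
    return pairs / 1000 * price_per_1000
```

Two datasets of 200 and 300 products give at most 60,000 pairs when all evaluated pairs are returned; setting maxOutputItems to 1,000 caps that bound at 1,000 items.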

🔗 API integration

JavaScript

import { ApifyClient } from 'apify-client';

// Initialize the client with your Apify API token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Actor input: the two datasets to compare and the matching options
const input = {
    datasetIdA: '<YOUR_FIRST_DATASET_ID>',
    datasetIdB: '<YOUR_SECOND_DATASET_ID>',
    useLlmMatching: true,
    vectorMatchesOnly: true,
    vectorSimilarityThreshold: 70,
    maxOutputItems: 1000,
};

// Start the Actor and wait for the run to finish
const run = await client.actor('tri_angle/e-commerce-product-matching-tool').call(input);
console.log(`Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);

Python

from apify_client import ApifyClient

# Initialize the client with your Apify API token
client = ApifyClient('<YOUR_API_TOKEN>')

# Actor input: the two datasets to compare and the matching options
run_input = {
    'datasetIdA': '<YOUR_FIRST_DATASET_ID>',
    'datasetIdB': '<YOUR_SECOND_DATASET_ID>',
    'useLlmMatching': True,
    'vectorMatchesOnly': True,
    'vectorSimilarityThreshold': 70,
    'maxOutputItems': 1000,
}

# Start the Actor and wait for the run to finish
run = client.actor('tri_angle/e-commerce-product-matching-tool').call(run_input=run_input)
print('Check your data here: https://console.apify.com/storage/datasets/' + run['defaultDatasetId'])
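
Once the run finishes, the results can be read from its default dataset instead of opening the Console link. A minimal sketch, assuming a `client` created as in the Python snippet above; `dataset(...).iterate_items()` is the apify-client call used here, while `fetch_matches` itself is a hypothetical helper name.

```python
def fetch_matches(client, dataset_id):
    """Collect the output items flagged as matches from a run's dataset."""
    matches = []
    # iterate_items() pages through the dataset and yields one dict per item.
    for item in client.dataset(dataset_id).iterate_items():
        if item.get("is_match"):
            matches.append(item)
    return matches
```

Typical usage would be `fetch_matches(client, run['defaultDatasetId'])` right after the `call(...)` line returns.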

CLI

echo '{
  "datasetIdA": "<YOUR_FIRST_DATASET_ID>",
  "datasetIdB": "<YOUR_SECOND_DATASET_ID>",
  "useLlmMatching": true,
  "vectorMatchesOnly": true,
  "vectorSimilarityThreshold": 70,
  "maxOutputItems": 1000
}' |
apify call tri_angle/e-commerce-product-matching-tool --input-file - --silent --output-dataset

🚀 Getting started

  1. Collect two product datasets - use the E-Commerce Scraping Tool or any Apify scraper that returns product data
  2. Find each dataset's ID in Apify Console under Storage > Datasets
  3. Open the E-Commerce Product Matching Tool and paste both dataset IDs into the input
  4. Choose your options: enable LLM matching for higher confidence, or keep it off for faster, lower-cost results
  5. Click Start and wait for results - the run time depends on dataset size and whether LLM matching is enabled
  6. Download your output as JSON, CSV, or Excel, or connect it to your data pipeline via the API

❓ FAQ

Do I have to use E-Commerce Scraping Tool to collect the data?
No. Any Apify dataset that contains product data works as input. E-Commerce Scraping Tool output is natively compatible, but you can use data from any source as long as it's stored in an Apify dataset.

How accurate is the matching?
Similarity matching works well for products with consistent names, brands, or standard identifiers (like EAN or UPC). For products with ambiguous or highly variable descriptions, enable LLM matching: the AI model reads the full product context and provides a verdict with reasoning, which significantly improves accuracy.

What similarity threshold should I use?
The default of 70 is a good starting point for most cases. Lower it (e.g. 50-60) if you want more results and are willing to review some false positives. Raise it (e.g. 85-90) if you want only very high-confidence matches. Test with a small dataset sample first.
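
One way to run that test is to process a small sample with vectorMatchesOnly disabled, collect the similarityScore of every returned pair, and count how many pairs would survive at several candidate thresholds. The helper below is a hypothetical sketch of that counting step; the candidate thresholds are arbitrary.

```python
def count_by_threshold(scores, thresholds=(50, 60, 70, 80, 90)):
    """Count how many scores meet or exceed each candidate threshold."""
    return {t: sum(1 for score in scores if score >= t) for t in thresholds}
```

A sharp drop in the count between two neighboring thresholds is a useful place to inspect pairs by hand before settling on a value.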

Can I match more than two datasets at once?
Not in a single run. To compare three datasets, run the tool twice: A vs. B, then B vs. C (or A vs. C). Each run produces a separate output dataset.

How do I control costs on large datasets?
Set maxOutputItems to a number that fits your budget. The tool stops processing as soon as that limit is reached, so your cost is fully bounded. You can also use vectorMatchesOnly: true to skip outputting low-similarity pairs, which reduces the number of items LLM matching needs to process.

Can I schedule this to run automatically?
Yes. Use Apify's built-in scheduler to run the tool on a recurring basis: daily, weekly, or at any custom interval. Combine it with the E-Commerce Scraping Tool on a matching schedule to keep your match data up to date automatically.

What export formats are available?
Output is available as JSON, CSV, Excel, XML, and HTML. You can also connect directly to the output dataset via the Apify API or integrate with tools like Google Sheets, Zapier, n8n, and others.
