URL: https://thenewstack.io/report-opensearch-bests-elasticsearch-at-vector-modeling/

Report: OpenSearch Bests ElasticSearch at Vector Modeling - The New Stack


2025-03-18 06:00:20
Report: OpenSearch Bests ElasticSearch at Vector Modeling
AI / Data / Open Source

A Trail of Bits report found OpenSearch outperformed Elasticsearch in vector search and general workloads, thanks to OpenSearch's support for multiple search engines, vector embeddings customization, and advanced metadata filtering options.
Mar 18th, 2025 6:00am by Jelani Harper
A recent research report from analysis firm Trail of Bits highlights some of the key differences between OpenSearch and Elasticsearch, differences that represent critical considerations for contemporary information retrieval. OpenSearch and the OpenSearch Project were created by Amazon; OpenSearch’s search and analytics platform was forked from Elasticsearch.

The offerings were evaluated with the OpenSearch Benchmark, which compares solutions across various workloads. The report indicates that OpenSearch v2.17.1 (the latest version at the time the research was performed) was 11 percent faster on the Vectorsearch workload than Elasticsearch v8.15.4.

It also reveals that OpenSearch was 1.6x faster on the Big5 workload. These figures come from aggregating the geometric mean of each solution’s queries. Both platforms have since released newer versions.
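
To make the aggregation concrete, here is a minimal sketch of how a geometric mean can summarize per-query latencies and yield a single speedup figure. The latency values and the names `engine_a_ms` and `engine_b_ms` are hypothetical and chosen only for illustration; they are not data from the report, and the report's exact aggregation procedure may differ.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-query latencies in milliseconds (illustrative only).
engine_a_ms = [10.0, 40.0, 160.0]
engine_b_ms = [20.0, 40.0, 320.0]

gm_a = geometric_mean(engine_a_ms)
gm_b = geometric_mean(engine_b_ms)

# A ratio above 1 means engine A is faster in aggregate.
speedup = gm_b / gm_a
print(f"engine A is {speedup:.2f}x faster by geometric mean")
```

A geometric mean is a common choice for this kind of summary because a single very slow query cannot dominate the result the way it would with an arithmetic mean.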

Trail of Bits chose to spotlight the results of these workloads in a recent blog post partly because of their meaningfulness to the enterprise. According to Evan Downing, Trail of Bits senior security engineer, AI/ML, and one of the preparers of the report, “Big5’s kind of your generic workload that will satisfy most users and the Vectorsearch workload will evaluate things that have to do with machine learning and vector embeddings.”

The Vectorsearch workload directly correlates to generative AI applications and applications of vector similarity search. According to Trail of Bits Engineering Director William Woodruff, the Big5 workload involves “things like searching for terms over a product database.”

Examining the different approaches OpenSearch and Elasticsearch take to these workloads, and to others in the OpenSearch Benchmark, illustrates some of the most useful capabilities in search today.

Multiple Search Engines

The solutions were assessed with the OpenSearch Benchmark. “To my knowledge, OpenSearch Benchmark was forked from the Elasticsearch benchmarking suite,” Downing said. Even though OpenSearch itself was forked from Elasticsearch, the report indicates that a comparison between the two solutions isn’t apples to apples.

One of the chief differences is that, at the time of the research (most of which occurred between September and December of 2024), OpenSearch supported a variety of search engines, including those designed for vector embedding retrieval use cases, while Elasticsearch supported just one, Apache Lucene. OpenSearch users can avail themselves of Lucene, Facebook AI Similarity Search (Faiss), and Non-Metric Space Library (NMSLIB).

This three-to-one ratio of engines between OpenSearch and Elasticsearch may have contributed to OpenSearch’s favorable results in the Vectorsearch workload.
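
As a rough illustration of what choosing an engine looks like in practice, the sketch below builds an index-creation body for each of the three engines. It assumes the mapping format described in the OpenSearch k-NN plugin documentation (a `knn_vector` field with a `method` block); the helper `knn_index_body`, the field name `embedding`, and the dimension are hypothetical, and the accepted parameters vary by OpenSearch version, so treat this as a sketch to check against the documentation rather than a reference.

```python
import json

# Engines named in the article as available in OpenSearch at the time.
SUPPORTED_ENGINES = ("lucene", "faiss", "nmslib")

def knn_index_body(engine, dimension, field="embedding"):
    """Build a request body for creating a k-NN index (assumed format)."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    # HNSW with L2 distance is an illustrative choice only.
                    "method": {
                        "name": "hnsw",
                        "space_type": "l2",
                        "engine": engine,
                    },
                }
            }
        },
    }

bodies = {engine: knn_index_body(engine, dimension=4) for engine in SUPPORTED_ENGINES}
print(json.dumps(bodies["faiss"], indent=2))
```

Only the `engine` value changes between the three bodies here, which is the sense in which one OpenSearch deployment can be benchmarked three different ways.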

Vector Search Algorithms and Quantizations

The various search engines assessed in the benchmark employ different approaches to information retrieval, which is not a monolithic process. According to Downing, Lucene, Faiss, and NMSLIB “support different algorithms for doing vector search and also different quantizations. So basically, you can think of this as a compression for the dataset size and the requirements that are required by the users of these algorithms.”

Quantization techniques are one of the factors that influence the performance of vector search databases. The compression to which Downing referred can impact the cost of using vector search systems, particularly in terms of storage. Although the three engines differ in many ways, what mattered for the actual benchmark was that “each of those workload engines requires different parameters in order to run, based on different API requirements and other things,” Downing said. “So, when we’re comparing this all on the line, we’re comparing OpenSearch with Lucene, OpenSearch with NMSLIB, OpenSearch with Faiss, and Elasticsearch with Lucene.”
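
The storage effect of quantization can be sketched with simple scalar quantization, in which each 4-byte float is mapped to a 1-byte code. This is a generic illustration, not the specific scheme used by Lucene, Faiss, or NMSLIB; the function names, the value range, and the sample vector are assumptions.

```python
def quantize(vector, lo, hi):
    """Map floats in [lo, hi] to one-byte codes in 0..255."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = (hi - lo) / 255
    # Clamp each value into range, then round to the nearest code.
    return bytes(round((min(max(x, lo), hi) - lo) / scale) for x in vector)

def dequantize(codes, lo, hi):
    """Recover approximate floats from one-byte codes."""
    scale = (hi - lo) / 255
    return [lo + c * scale for c in codes]

vector = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes = quantize(vector, -1.0, 1.0)
restored = dequantize(codes, -1.0, 1.0)

float32_bytes = 4 * len(vector)   # storage if kept as 32-bit floats
quantized_bytes = len(codes)      # storage after quantization
max_error = max(abs(a - b) for a, b in zip(vector, restored))

print(f"{float32_bytes} bytes -> {quantized_bytes} bytes, max error {max_error:.4f}")
```

The trade-off is visible in the two numbers: storage drops to a quarter, while each value picks up a small rounding error bounded by half a quantization step.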

Smart Metadata Filtering

Of the three, Lucene may be the most widely known engine. It’s an open source search engine library maintained by the Apache Software Foundation. For solutions that have multiple engines to choose from, as OpenSearch does, there are some applications for which Lucene is particularly appropriate. “It is my understanding that Lucene is generally a good option for smaller deployments,” Downing commented.

One of the more notable facets of Lucene is its metadata filtering. Typically, users can filter the results of vector database searches based on metadata about the actual embeddings. There are options for filtering metadata before searches and after searches, which can affect the overall quality of the results.
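
The difference between filtering before and after a search can be shown with a small brute-force nearest-neighbor sketch. The documents, the `color` metadata field, and the function names are hypothetical, and real engines use approximate indexes rather than a full sort; the point is only how the order of operations changes the result set.

```python
import math

def nearest(query, docs, k):
    """Exact k-nearest neighbors by Euclidean distance (brute force)."""
    return sorted(docs, key=lambda d: math.dist(query, d["vector"]))[:k]

def pre_filter_search(query, docs, k, predicate):
    """Filter on metadata first, then search only the matching documents."""
    return nearest(query, [d for d in docs if predicate(d)], k)

def post_filter_search(query, docs, k, predicate):
    """Search everything first, then drop results that fail the filter."""
    return [d for d in nearest(query, docs, k) if predicate(d)]

docs = [
    {"id": 1, "vector": [0.0, 0.0], "color": "red"},
    {"id": 2, "vector": [0.1, 0.0], "color": "blue"},
    {"id": 3, "vector": [0.2, 0.0], "color": "blue"},
    {"id": 4, "vector": [0.9, 0.0], "color": "red"},
]

def is_red(doc):
    return doc["color"] == "red"

query = [0.0, 0.0]
pre_ids = [d["id"] for d in pre_filter_search(query, docs, 2, is_red)]
post_ids = [d["id"] for d in post_filter_search(query, docs, 2, is_red)]

print("pre-filter ids: ", pre_ids)
print("post-filter ids:", post_ids)
```

Pre-filtering returns the full two matching documents, while post-filtering returns only one, because the second-nearest neighbor overall does not pass the filter and is discarded after the search.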

The distinction with Lucene is that it “offers some benefits, as does Faiss, with some things like smart filtering, where the optimum filtering strategy, like pre-filtering, or post-filtering, or exact K-Nearest Neighbors, is automatically applied depending on the different situation,” Downing said. Faiss is a software library (with few third-party dependencies) for vector similarity search and other applications that underpin use cases for generative models. NMSLIB is a vector embedding search library and toolset for assessing similarity search methods. “NMSLIB and Faiss are built mostly for large-scale use cases,” Downing said.

Big5 Workload

The Big5 workload illustrates how far information retrieval has come today. It encompasses aspects of text querying, sorting, date histograms, range queries, and term aggregations. These capabilities are useful for searching through documents, product and customer information, structured and unstructured data, and more.

OpenSearch outperformed Elasticsearch in all Big5 categories and was 16.55 times faster than Elasticsearch in the date histogram component. Date histogram features provide temporal aggregations. “This is sort of a chronological grouping, you could say, where you’re dividing the dataset into buckets or intervals,” Downing commented. “So, for example, we want to say give me all the documents from a specific day on this month.”
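
The bucketing Downing describes can be sketched in a few lines. This is a plain in-memory illustration of the idea behind a date histogram, not how either platform implements it; the timestamps and the `date_histogram` helper are hypothetical.

```python
from collections import Counter
from datetime import datetime

def date_histogram(timestamps, bucket_format="%Y-%m-%d"):
    """Count timestamps per bucket; the default bucket is one calendar day."""
    counts = Counter(ts.strftime(bucket_format) for ts in timestamps)
    return dict(sorted(counts.items()))

# Hypothetical document timestamps.
timestamps = [
    datetime(2025, 3, 17, 9, 0),
    datetime(2025, 3, 17, 18, 30),
    datetime(2025, 3, 18, 6, 0),
]

daily = date_histogram(timestamps)
monthly = date_histogram(timestamps, bucket_format="%Y-%m")

print(daily)
print(monthly)
```

Changing the bucket format from days to months is the in-memory analogue of choosing a different interval for the aggregation.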

Text queries are predicated in part on lexical, or keyword, search capabilities and are commonly applied to use cases involving user IDs, email addresses, or names. Range queries “are based on a specific range of values in a given field,” Downing explained. With these capabilities, users can retrieve results from a dataset in which the temperature is between 70 and 85 degrees, for example. Sorting enables organizations to order the results of queries according to any number of factors, which might include chronological, numeric, or alphabetical order.
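
The temperature example translates directly into a small sketch that combines a range query, sorting, and a term aggregation over in-memory records. The records, field names, and the `range_query` helper are hypothetical, and the bounds are treated as inclusive by assumption.

```python
from collections import Counter

def range_query(docs, field, gte, lte):
    """Return documents whose field value lies in [gte, lte]."""
    return [d for d in docs if gte <= d[field] <= lte]

# Hypothetical temperature readings in degrees.
readings = [
    {"city": "Austin", "temp": 92},
    {"city": "Denver", "temp": 71},
    {"city": "Austin", "temp": 84},
    {"city": "Seattle", "temp": 65},
    {"city": "Denver", "temp": 78},
]

# Range query: temperatures between 70 and 85 degrees.
hits = range_query(readings, "temp", 70, 85)

# Sorting: order the hits numerically by temperature.
sorted_temps = [d["temp"] for d in sorted(hits, key=lambda d: d["temp"])]

# Term aggregation: count the hits per city.
city_counts = Counter(d["city"] for d in hits)

print(sorted_temps)
print(dict(city_counts))
```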

Meaningful Findings

For the enterprise user, the most meaningful findings from the recent benchmark between OpenSearch and Elasticsearch have less to do with the performance of these solutions and more to do with their capabilities. The report indicates that not all vector search platforms are the same. They incorporate different engines, each supporting its own set of features.

Some of those distinctions pertain to libraries for vector embedding search and pivotal considerations like metadata filtering, as well as versatility for quantization and compression. Moreover, capabilities for sorting search results, aggregating search terms, issuing range queries, and other facets of the Big5 workload are also worthy of consideration when assessing search and analytics platforms and their performance.

Jelani Harper has worked as a research analyst, research lead, information technology editorial consultant, and journalist for over 10 years. During that time he has helped myriad vendors and publications in the data management space strategize, develop, compose, and place...
Amazon Web Services and Elastic are sponsors of The New Stack.  