The world changed with the advent of ChatGPT, sparking a revolution in how we interact with AI.
This breakthrough highlighted the immense potential of Large Language Models (LLMs), ushering in a new era of applications like semantic search and Retrieval-Augmented Generation (RAG). At the heart of these innovations lies vector search — a critical enabler for modern applications striving to deliver smarter, faster and more intuitive user experiences.
The demand for efficient vector search has propelled specialty databases like Pinecone and Zilliz into the spotlight. These systems have demonstrated the value of purpose-built vector databases in accelerating AI-driven workloads. At the same time, virtually all major SQL and NoSQL databases have responded by adding indexed vector search capabilities, recognizing this technology is no longer optional for a competitive DBMS.
However, when evaluating vector search solutions, one metric stands out: queries per second (QPS) per dollar at a fixed level of recall. Competitive QPS/$ is a requirement for many applications. But reaching the required level of performance for vector search on your primary DBMS has benefits for complexity and cost as well, including reduced data movement and lower spend on software, hardware and people. And let's not forget: most developers and architects still need the power of SQL to handle complex tasks like filters, joins, aggregates and window functions alongside vector search.
Enter SingleStore. In early 2024, with version 8.5, we introduced vector Approximate Nearest Neighbor (ANN) search, marking a significant milestone in our journey to unify the best of SQL with advanced technologies like vector search, full text and more. Since then, we’ve validated that SingleStore delivers competitive QPS/$ for vector workloads — all while offering the robust analytics and transactional capabilities our users rely on for structured and semi-structured data.
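The ANN capability introduced in 8.5 is exercised through ordinary SQL. As a minimal sketch (table, column and index names are hypothetical, and exact index options can vary by version), the following creates an HNSW-indexed vector column and runs a filtered top-k search:

```sql
-- Hypothetical table of 768-dimensional embeddings
CREATE TABLE articles (
  id BIGINT NOT NULL,
  category VARCHAR(64),
  embedding VECTOR(768) NOT NULL,
  SHARD KEY (id),
  VECTOR INDEX idx_emb (embedding)
    INDEX_OPTIONS '{"index_type": "HNSW_FLAT", "metric_type": "DOT_PRODUCT"}'
);

-- Top-10 ANN search combined with an ordinary SQL filter
SET @qv = ('[0.12, -0.03, ...]' :> VECTOR(768));  -- query embedding (truncated for illustration)
SELECT id, embedding <*> @qv AS score             -- <*> is SingleStore's dot-product operator
FROM articles
WHERE category = 'news'
ORDER BY score DESC
LIMIT 10;
```

Because the vector search is just another SQL expression, the same query can be extended with joins, aggregates or window functions over the rest of your data.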
In this blog, we’ll explore how SingleStore stacks up against two respected specialty vector databases: Pinecone and Zilliz. Read on to discover why SingleStore may be the only database you need for modern, vector-powered applications.
Performance
Our performance tests show SingleStore has cost-competitive performance with specialty vector databases. We benchmarked SingleStore versus Zilliz and Pinecone, using the popular, open-source VectorDBBench benchmark suite with the Cohere data set with 10M vectors of 768 dimensions and HNSW indexes.
The maximum queries per second (QPS) for two configurations of each database is shown here. For all configurations of all three systems, the maximum QPS is within a factor of two of each other. The subsequent graph shows the monthly cost of each configuration. SingleStore combines competitive vector search performance with fast SQL analytics, joins and aggregations across petabytes of structured and semi-structured data to power intelligent applications.
The recall for all six configurations was similar, ranging from 88.8% for Zilliz (recommended) to 91.5% for Pinecone. No product dominates across the board on price-performance, though Zilliz does have a strong price point at the smaller configuration for this particular benchmark. For real workloads, price-performance is close enough that other considerations will come into play in deciding which product to buy.
These include query language expressiveness, performance in the presence of filters, transaction processing and HA/DR requirements, partner tool integrations and the importance of using one DBMS platform vs. two platforms in an application architecture. We expect that vector search price-performance is an active area of R&D at all three companies; we know that is the case at SingleStore.
Measurement
The performance measurements were made with VectorDBBench, an open-source and widely used benchmark suite originally created by Zilliz that simulates realistic vector search use cases.
Our performance tests used the Cohere data set with 10M vectors of 768 dimensions, which is included with VectorDBBench. VectorDBBench consists of three phases: ingest, optimize and query. The query phase, with the metric maximum QPS, was used for the preceding performance results. Each VectorDBBench test driver instance was deployed in the same region as the DBaaS cluster, on a VM with 16 vCores and 32 GB RAM.
For these performance measurements, VectorDBBench 0.0.9 was forked and extended with support for SingleStore, a CLI and support for specifying when to create the index: either (i) at the start of the vector ingestion phase on an empty database, or (ii) during the optimize phase on an already-ingested vector data set. The benchmark process is automated with the benchANT platform, which integrates the VectorDBBench fork. The fork is publicly available on GitHub.
Configuring SingleStore
A few configuration changes yield significant performance improvements for vector and full-text index searches. For high-throughput index search workloads where the goal is high QPS (as in the experiments shared here), configure for the maximum table segment size. This produces fewer, larger indexes, since there is one vector index per segment. Because index search time is logarithmic in the size of the index, larger segments minimize the total compute time spent searching the indexes, which in turn yields better QPS.
In SingleStore, vector (and full-text) search performance is closely tied to the number of index segments examined. To optimize performance, maximize the index segment size so that queries process the fewest index segments. The following configurations are recommended:
- Use large segments. Allow columnstore segments to be as large as possible by setting `internal_columnstore_max_uncompressed_blob_size` and `columnstore_segment_rows` to their maximum possible values.
- Use few partitions and eliminate subpartitions. Use one partition per leaf node (or two partitions on a small, single-leaf or two-leaf system) and eliminate subpartitions.
- Evenly distribute data among partitions. Choose a shard key that distributes data evenly across partitions.
- Turn down flexible parallelism. Set `query_parallelism_per_leaf_core = 0.01` so that one thread scans each partition. This reduces per-query overhead and helps performance when the majority of query execution work is index searches.
This configuration works well for workloads in which multiple queries are run in parallel, and when queries do not have highly selective filters, as is the case with the VectorDBBench workload.
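The recommendations above can be sketched as a handful of statements. The specific values below are illustrative, not prescriptive: check the documented maximums for your SingleStore version before applying them, and size the partition count to your own cluster.

```sql
-- Illustrative settings sketch; the two segment-size values below are
-- assumptions, not documented maximums -- consult your version's docs.
SET GLOBAL columnstore_segment_rows = 102400000;                          -- illustrative large value
SET GLOBAL internal_columnstore_max_uncompressed_blob_size = 10737418240; -- illustrative (10 GB)

-- One thread per partition, as recommended above
SET GLOBAL query_parallelism_per_leaf_core = 0.01;

-- One partition per leaf node (here: a hypothetical two-leaf cluster)
CREATE DATABASE vectordb PARTITIONS 2;
```

Segment-size settings only affect newly written segments, so they should be in place before the vector data is ingested (or the data re-optimized afterward).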
Try SingleStore for vector search
SingleStore gives you the best of both worlds: competitive performance for vector search plus a full-fledged SQL database with modern full-text search. With SingleStore, you get fast vector search and the power of SQL to handle filters, joins, aggregates and more. We've shown that SingleStore has competitive performance with specialty vector databases like Pinecone and Zilliz. SingleStore may be the only database you need for modern, vector-powered applications.
Start free with SingleStore today.
Appendix: Configurations
The goal of this performance benchmark is to ensure a fair and transparent setup for all the vector databases considered. The baseline for the comparison is the SingleStore Helios S-2 and S-4 clusters. For Pinecone, comparably priced cluster sizes are selected. For Zilliz, one cluster size is selected based on the Zilliz Cloud Cost Calculator for the target workload of 10M vectors with 768 dimensions, resulting in the Zilliz (recommended) configuration. The second Zilliz configuration is selected to match the price of SingleStore Helios S-4.
The following table shows the applied DBaaS cluster configurations that were used to carry out the benchmarks. In addition, to ensure full transparency and reproducibility, the raw benchmarking data is also available on GitHub.