I’ve long admired Rockset for its simple approach to real-time analytics on structured and unstructured data coming from any database. Many database providers make these claims, but few deliver. Rockset was one of the few that did.
As a SingleStore Solution Engineer for the last five years (now leading customer solutions), I’ve seen a healthy bit of competition from Rockset, which has made both our technologies even better. With the acquisition of Rockset by OpenAI and the subsequent decommissioning of the technology for its customers, developers now face a decision on how to power their analytics applications going forward.
In my experience seeing Rockset in the field, there are three critical capabilities for which developers select it. Let’s take a look at those, and discuss your alternatives. If SingleStore sounds interesting to you after reading this, feel free to try it and reach out to our field engineering team directly for support.
1. Connectivity
One of the coolest things about Rockset has always been the ease of getting data from your OLTP databases. Of course, this feature is rooted in the ideology that OLTP and OLAP should be kept separate — a theory we at SingleStore have debunked with our patented Universal Storage (but I digress).
Without these capabilities from Rockset, users are forced to push data out to messaging infrastructure like Kafka or object storage. At SingleStore, we have built Pipelines, a customer-loved feature that does parallel ingest from MySQL, Postgres, MongoDB® and more with just five lines of SQL. My latest tests on our Free Starter Workspace showed 100k inserts per second, with some of our largest customers doing upwards of 12M upserts per second!
2. Unstructured support
Rockset does not require fixed schemas, making it super easy for developers to build analytics into their existing application codebase. Low-latency (sub-100ms) analytics on JSON is a monumental challenge, one that even the JSON gods (MongoDB) have not conquered yet.
The most common solution to this problem is to extract JSON into relational columns and run your analytics on those. This slows down ingestion and frankly… I just think it’s cheating 🤷. SingleStore users can do this too, but most of them have elected to use our native JSON query support for point lookups and large-scale aggregations, using our robust seekable JSON features. Check out this blog from Heap.
3. Complex, low-latency analytics
Rockset relies heavily on its real-time indexing capabilities, which in some cases slow down ingestion and create unnecessary storage (memory/disk) overhead. For workloads that are less sensitive to price-performance, this may not matter. However, it can become a challenge as workloads scale.
At SingleStore, our patented Universal Storage relies on two simple keys: a shard key (controlling distribution of data) and a sort key (controlling ordering within a column segment). Users are free to add indexes to their hearts’ content, but SingleStore delivers performance for most use cases with just these two definitions in the table DDL. Check out how Armis reduced their data pipeline cost by 70% by switching to SingleStore.
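To make the two keys a bit more concrete, here is a toy Python model of the idea. It is only a sketch, not how SingleStore is implemented: the partition count, the CRC32 hash, the `events` data, and the helper names `partition_for`, `load`, and `scan` are all assumptions made up for illustration. It shows rows being distributed by a shard key and then ordered by a sort key within each partition.

```python
from collections import defaultdict
from zlib import crc32

NUM_PARTITIONS = 4  # hypothetical value chosen for this sketch


def partition_for(shard_value) -> int:
    """Pick a partition from a stable hash of the shard key value."""
    return crc32(str(shard_value).encode("utf-8")) % NUM_PARTITIONS


def load(rows, shard_key, sort_key):
    """Distribute rows by shard key, then order each partition by sort key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_for(row[shard_key])].append(row)
    for part in partitions.values():
        part.sort(key=lambda r: r[sort_key])
    return dict(partitions)


def scan(partitions, shard_key, shard_value, sort_key, lo, hi):
    """Read only the partition that owns shard_value, filtered to [lo, hi]."""
    part = partitions.get(partition_for(shard_value), [])
    return [
        r for r in part
        if r[shard_key] == shard_value and lo <= r[sort_key] <= hi
    ]


# Made-up sample data for illustration only.
events = [
    {"device_id": "sensor-a", "ts": 30, "temp": 21.5},
    {"device_id": "sensor-b", "ts": 10, "temp": 19.0},
    {"device_id": "sensor-a", "ts": 10, "temp": 20.1},
    {"device_id": "sensor-c", "ts": 20, "temp": 22.3},
    {"device_id": "sensor-a", "ts": 20, "temp": 20.8},
]

partitions = load(events, shard_key="device_id", sort_key="ts")
recent = scan(partitions, "device_id", "sensor-a", "ts", lo=20, hi=30)
print(recent)
```

In this model, a lookup on the shard key touches a single partition instead of all of them, and the rows it finds are already ordered by the sort key. That is the intuition behind getting good performance from two definitions before adding any further indexes.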
Summary
The acquisition of Rockset by OpenAI has shown us one thing for sure: the worlds of AI and databases are one and will remain that way for a long time. Whether you’re building real-time analytics apps or you’re ahead of the pack with deploying AI, we believe SingleStore is a rich alternative that makes sure this big news has little (or no) impact on your day-to-day.
In fact, we’ll work with you to bring your data to SingleStore, migrating you over for free. See? Told you we’d make it easy 😊.