From Rockset to SingleStore: My Journey and a Guide for Rockset Developers Seeking a New Home

Rockset becomes part of OpenAI

In June 2024, OpenAI acquired Rockset, a real-time analytics database known for its speed and flexibility.

Following this acquisition, OpenAI announced it would cease Rockset’s cloud service operations, leaving its customers in need of alternative database solutions. Shortly after the acquisition, I joined SingleStore as a Solutions Engineer in the EMEA team, bringing with me the experience and insights I gained during my time in the field at Rockset.

As Rockset’s customers seek a new database to support their real-time analytics needs, SingleStore is a compelling option for many of them. Its distributed, SQL-based design handles both transactional (OLTP) and analytical (OLAP) workloads in a single engine, a combination often called HTAP, with high performance and scalability. SingleStore’s architecture, featuring in-memory rowstore tables and disk-backed columnstore tables together with bottomless storage, offers flexibility in managing diverse data types and workloads. Interestingly, the two platforms have a lot in common architecturally: for example, both are based on an aggregator-leaf model.

Migration process to SingleStore

For companies and prospects who were using Rockset or showed interest in it but hadn’t yet reached production, the transition to SingleStore offers an exciting opportunity to address similar requirements and use cases. Like Rockset, SingleStore excels in real-time analytics and can handle semi-structured and structured data at scale, making it a strong fit for use cases including real-time dashboards, IoT analytics, fraud analytics and personalized customer experiences. SingleStore’s ability to combine high-performance transactional and analytical capabilities in one platform means it can cover even more scenarios, while its flexibility in data ingestion and query optimization ensures it can adapt to varied workloads. If you’re exploring data platforms to solve these challenges, SingleStore provides a robust, future-proof solution worth considering.

To ensure a smooth migration process, it’s essential to understand the differences in data ingestion, data modeling and storage, querying capabilities and integration endpoints between the two platforms. SingleStore’s Pipelines feature, for instance, provides parallel streaming data ingestion from distributed sources, simplifying data workflows by reducing the need for ETL middleware.
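To make the Pipelines idea more concrete, here is a minimal sketch in Python that only assembles the text of a CREATE PIPELINE statement for JSON files exported to S3. The helper name, the pipeline and table names, the bucket path and the field mapping are all hypothetical, and the statement shape reflects my understanding of SingleStore’s pipeline syntax, so check it against the current documentation before running it. The credentials are deliberately left as a placeholder.

```python
import json


def build_s3_pipeline_sql(pipeline, table, s3_path, region, field_map):
    """Return a CREATE PIPELINE statement that loads JSON files from S3.

    All names passed in are illustrative. Identifiers are interpolated
    as-is, so only use trusted values here.
    """
    # Each entry maps a table column to a key in the exported JSON documents.
    mapping = ",\n    ".join(f"{col} <- {key}" for col, key in field_map.items())
    config = json.dumps({"region": region})
    # Credentials are a placeholder on purpose; supply real ones out of band.
    return (
        f"CREATE PIPELINE {pipeline} AS\n"
        f"LOAD DATA S3 '{s3_path}'\n"
        f"CONFIG '{config}'\n"
        f"CREDENTIALS '<aws-credentials-json>'\n"
        f"INTO TABLE {table}\n"
        f"FORMAT JSON (\n    {mapping}\n);"
    )


if __name__ == "__main__":
    # Hypothetical export location and column mapping.
    print(
        build_s3_pipeline_sql(
            pipeline="rockset_events_import",
            table="events",
            s3_path="my-export-bucket/rockset/events/",
            region="us-east-1",
            field_map={"id": "id", "ts": "event_time", "payload": "payload"},
        )
    )
```

Once a pipeline like this has been created, it would typically be started with a separate START PIPELINE statement, after which SingleStore pulls the files in parallel without any middleware in between.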

I was lucky enough to work with both platforms, and in my opinion the migration shouldn’t be very complex or painful. Both Rockset and SingleStore are SQL-based databases focused on real-time analytics (which means low latency on ingest and fast queries, even at high concurrency), so the process comes down to a few key steps:

Data migration. Export your data from Rockset and import it into SingleStore (via S3 for example), ensuring compatibility and data integrity during the transfer. If your data source is not Rockset, this step involves hooking up your original data source to SingleStore — you can use Pipelines or SingleStore Flow to get data in with little to no code written. If you use MongoDB®, you should evaluate SingleStore Kai™ as well.
Schema adaptation. Adjust your data schemas to align with SingleStore’s table formats, choosing between rowstore and columnstore based on your specific workload requirements. You’ll also need to choose your shard keys, and optionally sort keys and indexes, to help with query performance. Rockset is schemaless and indexes everything upfront via the Converged Index feature, which is great to start with but offers less flexibility later on (since you can’t decide which indexes are created) and triples your storage footprint (since the data is kept in multiple index copies). Additionally, I think SingleStore offers richer data management capabilities: you can ALTER your tables, add and drop indexes and so on.
Query translation. Rewrite your existing queries to conform to SingleStore’s SQL dialect, taking advantage of its performance optimization features. SingleStore supports a wide range of SQL functions and can be further extended with custom code (stored procedures, functions and even Wasm code!). This step may be iterative, depending on details such as how you handle JSON and whether your Rockset queries use hints. If you use similarity search (with vector embeddings) or full-text search, you are well covered by SingleStore’s capabilities in this space.
Application integration. Update your application configurations to connect to SingleStore, and test thoroughly to ensure seamless operation. Rockset had Query Lambdas, which let you expose saved SQL queries as API endpoints. SingleStore offers a Data API where you can submit your SQL over HTTP and get data back. SingleStore is also MySQL-wire compliant, and has a wide range of supported drivers, SDKs and external tools (like BI tools, ORMs, etc.).
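
The schema adaptation and application integration steps above can be sketched roughly as follows. This is an illustration under assumptions rather than a reference implementation: the helper names, table, columns, host and credentials are hypothetical, the DDL shape reflects my understanding of SingleStore’s SHARD KEY and SORT KEY clauses, and the Data API path and JSON payload should be verified against the current documentation. The code only builds the statement and the HTTP request; it does not send anything.

```python
import base64
import json
import urllib.request


def build_table_ddl(table, columns, shard_key, sort_key):
    """Build a CREATE TABLE statement with explicit shard and sort keys.

    Names are interpolated as-is, so only pass trusted identifiers.
    """
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE TABLE {table} (\n  {cols},\n"
        f"  SORT KEY ({sort_key}),\n"
        f"  SHARD KEY ({shard_key})\n);"
    )


def build_data_api_request(base_url, user, password, database, sql, args=()):
    """Prepare (but do not send) an HTTP request for a SQL query.

    Assumes a Data API endpoint at /api/v2/query/rows with basic auth.
    """
    body = json.dumps({"sql": sql, "args": list(args), "database": database})
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/api/v2/query/rows",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # Hypothetical table: shard on the lookup key, sort on the time column.
    print(
        build_table_ddl(
            table="events",
            columns={"id": "BIGINT", "ts": "DATETIME(6)", "payload": "JSON"},
            shard_key="id",
            sort_key="ts",
        )
    )
    # Hypothetical host and credentials; the request is built, not sent.
    req = build_data_api_request(
        "https://example-host.invalid",
        "app_user",
        "app_password",
        "analytics",
        "SELECT COUNT(*) AS n FROM events WHERE ts > ?",
        ["2024-06-01"],
    )
    print(req.get_method(), req.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it. Since SingleStore is MySQL-wire compliant, an existing MySQL driver is an equally reasonable way to run the same SQL from application code.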

In conclusion, having to move off Rockset, while painful and unexpected, offers an opportunity to leverage SingleStore’s robust features for your real-time analytics needs. I encourage you to explore SingleStore’s capabilities and consider it as a viable solution for your data infrastructure, covering transactions, analytics and search applications at scale. With additional features like deployment across AWS, Azure, GCP or on-prem, full-text search support with Lucene, Iceberg integration, no-code ETL via SingleStore Flow, extensibility via Wasm and fast vector search, you should be well covered for whatever real-time requirements you might have.