Hey Pierre,
To answer some of your questions about running with high availability, I'll explain from the perspective of running inside a Kubernetes environment and how it is leveraged in this context.
When running with Kubernetes, a Service is created to route traffic to the child or master aggregators. When a node goes offline or is unhealthy, the DNS for those Services will route traffic accordingly.
As for the current challenge of handling master aggregator failover, a common practice is to deploy with a persistent volume claim providing data availability, and to let Kubernetes spin up a new pod for the master aggregator. While this kind of master aggregator downtime will cause a DDL outage, the online child aggregators will still be able to process DML queries.
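To make the DDL/DML split concrete, here is a minimal sketch in Python. It is a toy model, not real SingleStore or Kubernetes behavior; the `Aggregator` class and `run_query` routine are assumptions invented for illustration:

```python
# Toy model of aggregator availability during a master-aggregator (MA) failover.
# All names here are illustrative, not real SingleStore/Kubernetes identifiers.

class Aggregator:
    def __init__(self, role, healthy=True):
        self.role = role          # "master" or "child"
        self.healthy = healthy

def run_query(query_type, aggregators):
    """DDL needs a healthy master; DML can go to any healthy child."""
    if query_type == "ddl":
        targets = [a for a in aggregators if a.role == "master" and a.healthy]
    else:  # dml
        targets = [a for a in aggregators if a.role == "child" and a.healthy]
    return "ok" if targets else "unavailable"

cluster = [Aggregator("master"), Aggregator("child"), Aggregator("child")]
cluster[0].healthy = False  # MA pod goes down; k8s respins it from the PVC

print(run_query("ddl", cluster))  # -> unavailable (DDL outage until MA returns)
print(run_query("dml", cluster))  # -> ok (child aggregators keep serving DML)
```

Once the replacement MA pod attaches the persistent volume and passes its health checks, DDL traffic resumes with no data loss in this model.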
To expand in a little more detail on the availability cases in this Kubernetes setup:
When a master aggregator fails - DDL queries are impacted until a new master aggregator pod is spun up, but DML queries to the child aggregators will remain functional.
When a child aggregator fails - when running with multiple child aggregators, traffic will be routed to the healthy child aggregators.
When a leaf fails - the cluster will initiate failovers for the database partitions of the impacted leaf, which provides better availability than waiting for a leaf to come back online.
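The child aggregator case above is essentially what a Kubernetes Service does: only pods passing their readiness checks stay in the rotation. A rough Python sketch of that behavior, with hypothetical pod names (`child-agg-*` is not a real naming convention):

```python
# Sketch of Service-style routing: only ready pods receive traffic.
# Hypothetical names; a real cluster uses readiness probes and Endpoints.
import random

def ready_endpoints(pods):
    """Mimic the Endpoints controller: keep only pods passing readiness."""
    return [p["name"] for p in pods if p["ready"]]

def route(pods):
    """Pick one ready backend, roughly as kube-proxy would."""
    eps = ready_endpoints(pods)
    if not eps:
        raise RuntimeError("no healthy child aggregators")
    return random.choice(eps)

child_aggs = [
    {"name": "child-agg-0", "ready": True},
    {"name": "child-agg-1", "ready": False},  # failed pod drops out of rotation
    {"name": "child-agg-2", "ready": True},
]
print(route(child_aggs))  # always child-agg-0 or child-agg-2, never child-agg-1
```

The point is that the application keeps a single stable DNS name for the child aggregator Service; failover is handled entirely by which endpoints sit behind it.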
To respond to your performance questions in this type of environment:
You are correct that network latency will be reduced, but you will be limited to the resources of a single machine.
For monitoring and maintenance, I would argue that Kubernetes makes the monitoring story scalable.
You are correct that there is a cost to high availability at the partition level on the leaves, but the benefits are:
a) the online aspect of leaf partition failovers;
b) at steady state, the primary database partitions will be balanced and split between the leaves. Because serving a primary partition is more CPU intensive than serving a secondary partition, spreading the primaries across the leaves will
help balance the load.
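A toy illustration of that placement in Python, assuming a hypothetical two-leaf, four-partition layout (the names and the placement table are invented, not SingleStore's actual metadata):

```python
# Toy leaf partition placement: primaries balanced across leaves, with each
# partition's secondary on the other leaf. Names are purely illustrative.

placement = {
    # partition: (primary_leaf, secondary_leaf)
    "p0": ("leaf-1", "leaf-2"),
    "p1": ("leaf-2", "leaf-1"),
    "p2": ("leaf-1", "leaf-2"),
    "p3": ("leaf-2", "leaf-1"),
}

def primaries_per_leaf(placement):
    counts = {}
    for primary, _ in placement.values():
        counts[primary] = counts.get(primary, 0) + 1
    return counts

def fail_leaf(placement, dead):
    """Promote the secondary of every partition whose primary was on `dead`."""
    return {
        p: ((sec, None) if pri == dead else (pri, sec))
        for p, (pri, sec) in placement.items()
    }

print(primaries_per_leaf(placement))               # balanced: 2 primaries each
print(primaries_per_leaf(fail_leaf(placement, "leaf-1")))  # leaf-2 takes all 4
```

At steady state each leaf carries half the primary (CPU-heavy) work; on a leaf failure the secondaries are promoted online, at the cost of concentrating the load until the leaf rejoins and the cluster rebalances.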
To summarize, running in this type of setup will provide child aggregator availability for the DML service and leaf failovers to minimize downtime.
Thanks Brooks for detailing the MA failover strategy with Kubernetes; for a moment I thought you were using SQLProxy with a "ping script" that would update the proxy config on the fly.
Sadly, Kubernetes requires a full-time qualified admin to keep it maintained and secured, so we are trading the complex failover of the MA for the complex administration of k8s. However, it's interesting to understand how you made it work in SingleStore Cloud.
I will experiment with a one-big-VM deployment over the coming months and see where it goes; so far I have noticed better response times on complex queries that require broadcasting JOINs.