Our data is spread across two different clusters, each with its own databases, but we need to query across them. Is there any way to create something like a DBLink in SingleStore? Any help with cross-cluster data access would be much appreciated.
Have you thought of using workspaces and workspace groups? We recently released that big feature, which decouples storage and compute. You can have multiple workspaces (a workspace is a pool of compute resources) connected to multiple databases (an n:m relationship). You just need to attach a database to a workspace to query it.
We don’t support queries that span multiple clusters (or workspaces). All the data you query must be on the same cluster. If you really need to combine data from a database on one cluster with data on another cluster, you’ll need to move the data over. One easy way to do that may be to use REPLICATE DATABASE to replicate the remote DB to a local DB. On cloud, workspaces might help you, as Arnaud said.
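As a rough sketch of the REPLICATE DATABASE route, the Python helper below only builds the statement text; it does not connect to anything. The user, host, port, and database names are placeholders, and the statement shape (`REPLICATE DATABASE local_db FROM user[:'password']@host:port[/remote_db]`) is the form as I recall it from the documentation, so please check it against the docs for your version before running it.

```python
import re

# Conservative patterns for the illustrative inputs (an assumption, not a product rule).
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")
_HOST = re.compile(r"^[A-Za-z0-9_.-]+$")


def build_replicate_statement(local_db, user, host, port=3306,
                              remote_db=None, password=None):
    """Return a REPLICATE DATABASE statement as a string; nothing is executed."""
    for name in (local_db, user) + ((remote_db,) if remote_db else ()):
        if not _IDENT.match(name):
            raise ValueError(f"unexpected characters in {name!r}")
    if not _HOST.match(host):
        raise ValueError(f"unexpected characters in host {host!r}")

    credentials = user
    if password is not None:
        # Keep the sketch simple: refuse passwords that would need escaping.
        if "'" in password or "\\" in password:
            raise ValueError("password needs escaping; handle it explicitly")
        credentials += f":'{password}'"

    source = f"{credentials}@{host}:{int(port)}"
    if remote_db:
        source += f"/{remote_db}"
    return f"REPLICATE DATABASE {local_db} FROM {source};"


# Hypothetical names, for illustration only.
print(build_replicate_statement(
    "orders_replica", "repl_user", "primary.example.internal",
    remote_db="orders"))
```

The statement would be run on the cluster that should receive the copy; once the replica exists locally, it can be joined with local data like any other database on that cluster.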
Someday we may do cross-cluster query support. We’ll make a note that you want it.
Just to add a vote to this. We have a similar requirement: we will be running two separate clusters in Azure for region redundancy. Ideally we could span a database over the two clusters and have the resiliency work across regions. Failing that, any ability for the two clusters to communicate would simplify our design for keeping the two databases in sync.
I was hoping I could bind the nodes to a shared, region-replicated Azure storage account, but I’m assuming that would not work, because there would be two master nodes, each controlling its own leaf nodes, unaware of each other but all writing to the same DB.
@william.pegg it sounds like you want disaster recovery (DR) for our cloud service. Rest assured we are working hard on that. It is in our future. It’s different from cross-cluster query, which is what I think @prathap_26 is asking about.
We already have multi-availability zone (AZ) redundancy for Premium Edition on our cloud – not quite DR but it gives more protection against failures, even of a whole data center (AZ).
For on-prem, you can use REPLICATE DATABASE to set up DR.
This is not in your cloud; we run SingleStore in our own AKS clusters in two Azure regions. The clusters would run in an active/active configuration, i.e. they (usually) have no knowledge of each other, and both duplicate the processing effort and results. If one region went down, our load balancer would fail over to the other region. The problem is that the data in these two clusters could drift as a result of random failures or other environment-specific events. To reduce the visibility of a failover, we would need to synchronise the data in these clusters (or at least compare them and know the differences) every so often. This would be easiest if the two clusters could talk to each other for this specific purpose.
Other solutions exist, of course: we would just need to broker this communication through our service layer. Still, being able to sync or copy data via SQL would be a useful option, as the data volume would be significant. As I said, it would be even better if the two regions actually used the same database, as that would open up the possibility of sharing some of this load instead of just duplicating it.
REPLICATE DATABASE might be what I need. I’ll take a look at that.
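For the "compare and know the differences" option brokered through a service layer, a minimal sketch could look like the following. It assumes each region's service has already fetched the rows of a table as tuples (by whatever client it uses) and that one column is a unique key; the function names and the sample data are made up for illustration, and this is not a SingleStore feature.

```python
import hashlib


def row_digest(row):
    """Hash one row (a tuple of column values) into a hex digest."""
    # repr() keeps types distinguishable (1 vs "1"); values are joined with a
    # unit separator so adjacent columns cannot run together.
    encoded = "\x1f".join(repr(value) for value in row).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def table_digests(rows, key_index=0):
    """Map each row's key column to the digest of the full row."""
    return {row[key_index]: row_digest(row) for row in rows}


def diff_tables(rows_a, rows_b, key_index=0):
    """Report keys missing on either side and keys whose rows differ."""
    a = table_digests(rows_a, key_index)
    b = table_digests(rows_b, key_index)
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


# Made-up rows standing in for the same table read from each region.
region_a = [(1, "alice", 10), (2, "bob", 20), (3, "carol", 30)]
region_b = [(1, "alice", 10), (2, "bob", 25), (4, "dave", 40)]
print(diff_tables(region_a, region_b))
# {'only_in_a': [3], 'only_in_b': [4], 'changed': [2]}
```

With significant data volumes, only the key-to-digest maps (or digests of key ranges) would need to cross regions rather than the rows themselves. Note that `repr()` of floats, decimals, or timestamps can differ between drivers, so both sides would need to read the data the same way.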
We’d love to hear how it turns out, including lessons learned and solutions, so we can highlight them in future discussions and livestreams for other members to learn from.