Hi!
I understand that across databases in the same cluster, the sharding strategy can be different. That is, if two tables that exist in different databases have the same shard key, they won’t necessarily be distributed the same way. Thus, joins of these tables across databases may require repartitioning.
I’m sure there’s a good reason for this, but I was wondering if in the future you might add a configuration option to force different DBs in the same cluster to use the same sharding strategy?
The use-case here is that after migrating from Vertica, my org had to arrange our data into separate DBs for backwards compatibility with existing APIs. Vertica has “schemas” which gave us a convenient way to organize data by a particular category within the same database, denoted by a prefix, IE:
my_metadata_store.table_1
my_fact_store.table_2
Where my_metadata_store
and my_fact_store
were in the same database, but different “schemas.” Since switching to SingleStore, we have organized it such that my_metadata_store
and my_fact_store
are separate databases to achieve the same organization style. In hindsight, knowing what we know now about sharding, we might have approached this differently, but reorganizing this data otherwise would be a pretty big endeavor.
At query time we have many instances where cross-database joins occur. So, if we could shard those tables the same way, it would be very helpful performance-wise.
Excuse the long post, the context felt relevant. Thanks for reading!