Changing log_file_size_partitions only impacts newly created databases as you noted.
Another knob you can modify to reduce space is snapshot_trigger_size, which is how much log we allow to accumulate on disk before a snapshot is taken. It's set to 1 GB by default.
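For reference, here's a rough sketch of how you could check and change these (assuming your version exposes both as engine variables readable with SHOW VARIABLES and settable with SET GLOBAL on the master aggregator, and that both values are in bytes; on some versions they may instead be config-file settings requiring a restart):

```sql
-- Check the current values (assumed to be expressed in bytes).
SHOW VARIABLES LIKE 'log_file_size_partitions';
SHOW VARIABLES LIKE 'snapshot_trigger_size';

-- Hypothetical example: lower the snapshot trigger from 1 GB to 256 MB.
SET GLOBAL snapshot_trigger_size = 268435456;
```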
See Adam’s earlier post for some context. We have pre-allocated large log files since 7.0, and they will fill up with user data if you use the DB significantly. So the issue is mostly for empty DBs. And disk is cheap these days.
In my use case I’m creating a DB per “project”, and as a SaaS we might have hundreds of “projects”… it won’t be cost-effective to reserve 6 GB on every node for each database that might remain empty.
I expect the active databases to grow at 5-10 GB / year. Could you please give me some insight into the right log_file_size_partitions to use?
What is the impact of having log_file_size_partitions at 8 MB? 16 MB? 64 MB?
It mainly impacts write throughput when there is a sustained burst of rowstore writes. With a smaller setting you could see dips in throughput when we need to create new log files; how much depends mostly on your disk throughput. If your workload is mostly columnstore, it's unlikely you would notice anything. Setting it to 64 MB should be fine.
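Something along these lines, for example (again assuming the value is in bytes and can be set with SET GLOBAL; project_123 is just a placeholder name), keeping in mind it only applies to databases created after the change:

```sql
-- 64 MB, assuming the variable is expressed in bytes.
SET GLOBAL log_file_size_partitions = 67108864;

-- Only databases created after the change pick up the smaller log files.
CREATE DATABASE project_123;
```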
Keep in mind that reducing log_file_size_partitions only saves you space for mostly empty databases. It doesn't save space for databases with data in them.
Probably the best way to save disk space is to use fewer partitions for some of your smaller databases (I see you are already doing that a bit; some of them are 8 partitions vs 16). You can increase the partition count later, but it is an offline operation right now.
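As a sketch (small_project and the partition count are just examples), you can choose the partition count explicitly when creating the database and verify it afterwards:

```sql
-- Create a smaller database with 8 partitions in total instead of the
-- cluster default (default_partitions_per_leaf × number of leaves).
CREATE DATABASE small_project PARTITIONS 8;

-- List the partitions (and the leaves hosting them) for that database.
SHOW PARTITIONS ON small_project;
```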
Can you confirm that reducing the number of partitions will also reduce the parallelism of the queries, and increase the response time for CPU-intensive queries?
The actual rate for Google Compute Engine SSD disks in my location is $8.50/month for 50 GB (without snapshots).
If the cluster doesn't take more space per empty DB than that, it's fine, but coming from other engines (MySQL, Mongo…) it's shocking to see 42 GB used for 200 MB of data!
Can you confirm that a 4 vCPU node will process a 4-partition DB as fast as an 8-partition one when it receives a query?
For most query shapes that's correct, Pierre. 4 partitions on a 4 vCPU node is optimal.
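If you want to sanity-check that on your cluster, a rough way (project_123 is a placeholder) is to count how many partitions each leaf hosts and compare it to that leaf's vCPU count:

```sql
-- One row per partition, including the leaf host/port it lives on;
-- count the rows per host and compare to the vCPUs on that machine.
SHOW PARTITIONS ON project_123;

-- Lists the leaf nodes in the cluster for cross-reference.
SHOW LEAVES;
```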
And yeah, we don't really optimize disk usage for the many-small-databases case. The system is more optimized for fewer, larger databases (which is why it does more upfront disk allocation for each database).