I upgraded SingleStore from 7.3 to 7.5 this weekend and the disk space usage has gone through the roof. Still trying to determine what has changed between these versions.
I have added some disk as an emergency measure but the disk usage is still growing at an alarming rate. And this is not due to any DBs being filled with a lot of data. This seems to be purely linked to the SingleStore server itself.
We went from a stable 150GB to about 600GB now and growing…
I’ve tracked down the increase to the data files on the leaf nodes. Checking one of the leaves, it’s grown by 30GB in about 12h (there are two leaf nodes, so a 60GB increase in 12h).
Could you provide some solution to this? Disk is not infinite and we’re now having to pay for more disk even though we’re not actually holding any more data.
Checking further, picking one DB as an example:
In 7.3 we used to have about 2 data files of log_file_size_ref_dbs size and 15 data files of log_file_size_partitions size.
In 7.5, this has gone up to 192 data files of log_file_size_ref_dbs size and 388 data files of log_file_size_partitions size.
Since we have over 50 DBs on the cluster you can see why this is a significant issue.
7.3 was already a huge disk waster (one could question why a DB would need 200MB of disk to hold 20MB worth of data), but 7.5 has gone beyond all expectations and now requires almost 5GB to hold 20MB…
Can we see the output of ls in the logs/snapshots directory of an example leaf? You can send it to us at bug-report@memsql.com
Are you running with snapshot_trigger_size set to the default value (2 GB) and snapshots_to_keep set to 2? By default S2 keeps a lot of log history around for replication (2 GB × 2, per the above sys vars) per partition, but if you’re not writing any new data then this history won’t build up.
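As a rough back-of-envelope sketch of how that retained log history can add up (the per-database partition count below is hypothetical, not from this thread):

```shell
# Hypothetical illustration only: partitions_per_db is assumed, not taken
# from this thread; databases=50 comes from the original report.
snapshot_trigger_bytes=$((2 * 1024 * 1024 * 1024))  # 2GB default
snapshots_to_keep=2                                 # default
partitions_per_db=16                                # assumed
databases=50

# Worst-case log history retained per database, in GB:
# snapshot_trigger_size x snapshots_to_keep x number of partitions
per_db_gb=$((snapshot_trigger_bytes * snapshots_to_keep * partitions_per_db / 1024 / 1024 / 1024))
echo "per-db worst case: ${per_db_gb}GB"
echo "cluster worst case: $((per_db_gb * databases))GB"
```

With steady writes to every DB, each partition can keep up to snapshot_trigger_size × snapshots_to_keep of log history before truncation, which is why the multiplier across many DBs matters so much.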
-Adam
Also, one other question: are you doing writes to any of the databases (even a trickle of writes)?
If you want to reduce the disk space right away I would lower snapshot_trigger_size to 256MB or lower while we look into what is going on.
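For example, something along these lines should do it (this assumes snapshot_trigger_size is specified in bytes; double-check against the engine-variables docs before running it):

```shell
# Lower the snapshot trigger so logs are snapshotted and truncated sooner.
# snapshot_trigger_size is a sync variable, so SET GLOBAL propagates it
# cluster-wide. 268435456 bytes = 256MB (assumes the variable is in bytes).
memsql -h 127.0.0.1 -u root -p -e "SET GLOBAL snapshot_trigger_size = 268435456;"
```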
Just wanted to confirm that all I did was upgrade from 7.3 to 7.5. The settings were not changed, so the change must be internal to the new version.
snapshot_trigger_size is set to 512MB
snapshots_to_keep is set to 1
Yes we are doing writes to all the DBs.
I’ll email the ls results to the email address you suggested.
We resolved this via e-mail.
The problem was that snapshots_to_keep is not a sync variable, whereas snapshot_trigger_size is. This means snapshots_to_keep needs to be set via update-config · SingleStore Documentation so the setting persists across restarts (that command writes it into memsql.cnf). Once we set snapshots_to_keep back to 1, disk usage dropped back to normal levels.
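For anyone hitting this later, the persistence step might look something like the following (a sketch only; the exact Toolbox flags are my assumption and may differ by version, so verify against the update-config docs first):

```shell
# Sketch: persist snapshots_to_keep=1 into memsql.cnf on every node so it
# survives restarts, and apply it to the running nodes as well.
# Flag names are assumed from the Toolbox update-config docs; verify first.
sdb-admin update-config --all --set-global --key "snapshots_to_keep" --value "1"
```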
More details about this are here:
Engine Variables · SingleStore Documentation
I’ve opened a task to make snapshots_to_keep a sync variable, so it can be set via SET GLOBAL directly and have its value persist across restarts, which should make this less confusing.