You should never remove old transaction logs; MemSQL does that automatically. If you delete them manually, you’ll likely end up losing or corrupting your data. The amount of log content that is kept around depends on snapshots_to_keep: MemSQL deletes all log content that precedes the oldest snapshot it keeps. For performance reasons, however, the log files are pre-sized (in this case you’re using the default 256MB size for partitions, which can be set at create database time), and extra unused files are kept around: a minimum of 1 and a maximum of 4, with the actual number depending dynamically on your workload.
If you need your log files to take up less space, you can adjust the globals log_file_size_partitions and log_file_size_ref_dbs (these default to 256MB and 64MB respectively) before creating the database or restoring it from a backup. At this time, the log file size of an existing database cannot be changed without restoring it from a backup.
Also, I see 71 16M log files for the memsql database on the leaf nodes:
-rw------- 1 memsql memsql 16777216 Apr 23 12:13 memsql_log_v1_282624
-rw------- 1 memsql memsql 16777216 Apr 23 10:50 memsql_log_v1_286720
.
.
.
Is that normal behavior?
This is expected, depending on your snapshots_to_keep and snapshot_trigger_size. We keep extra logs on “standby”, but we also keep around logs that have valid content. Log content between snapshots is split into multiple fixed-size logs, so it’s not unusual to have many small log files.
As an example:
Suppose you have snapshot_trigger_size set to 2GB, snapshots_to_keep set to 2, and log_file_size_partitions set to 16MB. In that case, you would keep 2GB worth of log content for each snapshot you keep on disk, or up to 4GB per partition. That means each partition would keep up to about 256 log files (4GB / 16MB), plus the few unused pre-sized files.
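To make the arithmetic in that example concrete, here is a rough sketch in Python. It only encodes the simple model described above (log content per snapshot times snapshots kept, divided by the log file size); the function name and the unused_files parameter are made up for illustration and are not part of MemSQL. With binary units (1GB = 1024MB) the count comes to 256 files; with decimal units it would be 250.

```python
MB = 1024 * 1024
GB = 1024 * MB

def max_log_files(snapshot_trigger_size, snapshots_to_keep, log_file_size, unused_files=0):
    """Rough upper bound on log files per partition under the simple model above.

    This is an illustrative estimate, not how MemSQL itself computes anything.
    """
    content_bytes = snapshot_trigger_size * snapshots_to_keep
    # Round up: a partially filled log still occupies a whole pre-sized file.
    content_files = -(-content_bytes // log_file_size)
    return content_files + unused_files

files = max_log_files(2 * GB, 2, 16 * MB)
print(files)                 # 256
print(files * 16 * MB / GB)  # 4.0 (GB per partition)
```

Passing unused_files=4 adds the maximum number of standby files mentioned earlier, giving 260 in this example.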
If you manually take snapshots, you can manually control how much space exists between each snapshot, forcing a cleanup of log files, but if you then let the system run, it’ll eventually create a bunch of new log files with log content.
If you want to reduce the total amount of space occupied by all log files, not just unused ones, you also need to adjust snapshots_to_keep and snapshot_trigger_size.
I have the following settings:
log_file_size_partitions = 10485760 (10MB)
snapshot_trigger_size = 536870912 (512MB)
snapshots_to_keep = 1
Yet there are a little over 300 10MB files in the data directory for one of the databases, taking up about 3GB of disk space.
Could you explain why? I was expecting these files to take up no more than about 512MB.
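For anyone who wants to double-check numbers like these, here is a minimal Python sketch that tallies the log files in a data directory by count, total size, and individual file size. It assumes log file names contain "_log_v", based only on the memsql_log_v1_... names in the listing above; adjust the marker if your files are named differently. The function name is hypothetical.

```python
import os
import sys
from collections import Counter

def tally_log_files(directory, marker="_log_v"):
    """Return (count, total_bytes, sizes) for regular files whose name contains marker.

    The "_log_v" marker is an assumption taken from the file names shown in
    this thread, not a documented naming rule.
    """
    sizes = Counter()
    count = 0
    total = 0
    for entry in os.scandir(directory):
        if entry.is_file() and marker in entry.name:
            size = entry.stat().st_size
            sizes[size] += 1
            count += 1
            total += size
    return count, total, sizes

if __name__ == "__main__":
    # Directory to inspect; defaults to the current directory.
    path = sys.argv[1] if len(sys.argv) > 1 else "."
    count, total, sizes = tally_log_files(path)
    print(f"{count} log files, {total / (1024 * 1024):.1f} MB total")
    for size, n in sorted(sizes.items()):
        print(f"  {n} file(s) of {size} bytes")
```

Run it against one database's log directory at a time, so the totals can be compared with what snapshot_trigger_size, snapshots_to_keep, and the log file size would suggest.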