I have a intermittent problem when I run backup job to S3,
My cluster: 2 EC2
1X Master
1X Aggregator
2X Leaf
Now my issue:
“mysql.connector.errors.DatabaseError: 1970 (HY000): Leaf Error (memsql-leaf-01.xxx.local:3306): Subprocess timed out receiving data. No stderr returned.”
It happen always 5 or 7 minutes after to start the process.
Error from aggregator log:
26865465487880 2021-04-15 19:43:47.181 ERROR: Failed taking a distributed backup for database xxx to directory ‘xx-xx backups/db/2021-04-15/xxx.backup’ failed with (1970:Leaf Error (memsql-leaf-01.xxx.local:3306): Subprocess timed out receiving data. No stderr returned.)
Hello!
This error is due to S3’s response time during the backup operation taking longer than what we expect, so the operation times out.
You can control this timeout by changing the global variable ‘subprocess_io_idle_timeout_ms’. Increasing this variable will make the subprocess wait longer for S3 before timing out.