sebaZ
August 10, 2021, 6:13pm
1
Hi Guys!
I'm doing daily backups to GCP. This worked for a long time, but in the last week the backups started failing every time, at different points in the backup: after 5, 20, 45 minutes, and so on.
The error I get:
Leaf Error (127.0.0.1:3307): Backup subprocess nonzero exit value. 408 Request Timeout
I restarted my VM just in case, and memory and CPU usage look fine the whole time.
MEMSQL Version: 7.3.3 Cluster in a box
Example:
113506922248 2021-08-10 19:02:45.290 DETAIL: Calling backup with: backup/backup --storage-type gcs --target XXX/backups/db/2021-08-10/XXXX.backup/BACKUP_INCOMPLETE
113507090843 2021-08-10 19:02:45.458 ERROR: Failed taking a distributed backup for database XXXX
to directory ‘XXXX’ failed with (2205:Leaf Error (127.0.0.1:3307): Backup subprocess nonzero exit value. 408 Request Timeout)
Can you help me?
adam
August 11, 2021, 8:39pm
2
It's possible you're encountering an issue related to:
@nhoran The backup also failed after restarting the NTP services.
I just attempted to create a new empty database and created a backup. It succeeded in 2 seconds.
The size of the last succeeded backup was ~400 GB. The failure always happens after ~15 minutes. Data is written to GCS before failing.
Could the issue be related to the process running too long resulting in the GCS access token expiring? Even though the last succeeded backup completed in ~25 minutes.
That problem was fixed in 7.5 and we are looking into backporting the fix into a 7.3 patch release in the next few weeks.
sebaZ
August 13, 2021, 1:06pm
3
NTP is not the issue. I have more than one DB and all the others are working fine. Could it be a problem with one specific DB?
adam
August 13, 2021, 1:10pm
4
The problem I linked is not related to NTP (though the error message may make you believe that). It was an issue in the client library we were using internally to upload files to GCP, and it impacted larger databases (those that took longer than ~15 minutes to back up).
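One rough way to check whether a given database falls into that category is to compare timestamps in the log and see how long the backup ran before it failed. Below is a minimal Python sketch, not an official tool. It assumes the line layout seen in the excerpt in the first post (a counter, a date, a time with milliseconds, then a level and message). The helper names `parse_line`, `seconds_between` and `exceeds_threshold` are made up for illustration, and the 15-minute default is only the approximate figure mentioned above.

```python
import re
from datetime import datetime

# Assumed line shape, inferred from the log excerpt in the first post:
#   <counter> <YYYY-MM-DD> <HH:MM:SS.mmm> <LEVEL>: <message>
LINE_RE = re.compile(
    r"^\d+\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+(\w+):\s+(.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        # Continuation lines (e.g. a wrapped error message) land here.
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return ts, m.group(2), m.group(3)


def seconds_between(start_line, end_line):
    """Elapsed seconds between two log lines in the assumed format."""
    start = parse_line(start_line)
    end = parse_line(end_line)
    if start is None or end is None:
        raise ValueError("line does not match the assumed log format")
    return (end[0] - start[0]).total_seconds()


def exceeds_threshold(start_line, end_line, threshold_minutes=15):
    """True if the gap between the two lines is longer than the threshold."""
    return seconds_between(start_line, end_line) > threshold_minutes * 60
```

Pass it the log line where the backup for the database started and the line with the `ERROR`. If the gap is consistently above the threshold for the failing database and below it for the ones that succeed, that would be consistent with the issue described here, though it does not prove it.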
-Adam