Backup Failure due to OOM

moralez.rodrigo · January 13, 2022, 1:09pm

Our backup command is failing because oom.

Leaf Error (memsql-leaf-01:3307): Memory used by MemSQL (28703.00 Mb) has reached the ‘maximum_memory’ setting (28665 Mb) on this node. Possible causes include (1) available query execution memory has been used up for table memory (in use table memory: 18855.88 Mb) and (2) the query is large and complex and requires more query execution memory than is available (in use query execution memory 0.00 Mb

Not only this is failing, the queries, instead of just taking more time are failing due to OOM. A simple count distinct or count_approx_distinct fails because of that. Other databases just run the queries until they timeout (I would prefer that instead).

I already analyzed the available memory (without any query running) and it’s even more confusing:

IP_ADDR	TYPE	MAX_MEMORY_MB	memory_used_mb	max_table_memory_mb	table_memory_used_mb
memsql-cluster-aggl	CA	14285	4579	12856	3410
memsql-cluster-leaf-04l	LEAF	28665	26186	25798	18411
memsql-cluster-leaf-03	LEAF	28665	26099	25798	18413
memsql-cluster-leaf-02	LEAF	28665	25954	25798	18351
memsql-cluster-leaf-01	LEAF	28665	27227	25798	18501
memsql-cluster-master	MA	14276	8327	12848	3443

we see 26GB of memory use which is a lot a probably the cause of the failure. It doesn’t decrease even if I run analyze table XXX flush;

Analyzing what’s is consuming that memory I ran:

select sum(memory_use)/(1024*1024)
from information_schema.table_statistics

select sum(memory_use)/(1024*1024)
from information_schema.INDEX_STATISTICS is2

Total: 9979 + 1652 = 11GB.

Any idea how can we resolve this issue? We have almost all the queries failing by memory issues.

MEMSQL VERSION: 7.3.12
Thanks in advance for your help.

moralez.rodrigo · January 14, 2022, 6:26pm

I’m trying to run OPTIMIZE TABLE table_name full, already ran flush… but it fails due to out of meemory as well… I’m blocked

hanson · January 15, 2022, 12:22am

Are you using rowstore or columstore tables? With Universal Storage features in 7.6, if you are using rowstore today, you may be able to switch to columnstore and still meet the needs of your application. That would free a lot of memory.

See SingleStore’s Patented Universal Storage - Part 4

7.6 also supports hash group-by operation spilling (the most common cause of out-of-memory during query execution). See docs on enable_spilling variable.

What’s your most common query that runs out of memory?