Ok, I’ve inherited some memSQL/SingleStore support and I’m a newbie with it. (Sr. Oracle DBA/Linux guy though)
I had a user create support ticket with the following:
[Informatica][ODBC 20101 driver][20101]Leaf Error
(MI-P-MEM-SQL2.sentryds.com:3306):
Memory used by MemSQL
(48099.50 Mb) has reached the ‘maximum_memory’ setting (57823 Mb) on this node.Possible causes include
(1) available query
execution memory has been used up for table memory (in use table memory:
18192.75 Mb) and
(2) the query is large and
complex and requires more query execution memory than is available (in use
query execution memory 26315.63 Mb).
See https://docs.memsql.com/troubleshooting/latest.There were two large sessions running at the same time
and they both failed. When I ran them manually one after another, they
were successful.
Checking the leaf memory on this cluster (2 leafs) I see that each are using:
26-27GB of RAM
I checked for data skew (as this is a relatively new cluster) and I see very, very little skew from the aggregator:
+-------------------+------------------------------------+-----------+----------+------------+-------------+
| DATABASE_NAME | TABLE_NAME | avg_rows | row_skew | avg_memory | memory_skew |
+-------------------+------------------------------------+-----------+----------+------------+-------------+
| db_1 | prch_fact_b | 832750 | 10.100 | 0 | NULL |
| db_1 | stg_prch_fact | 1010897 | 9.800 | 0 | NULL |
| db_1 | prch_fact | 1010897 | 9.800 | 0 | NULL |
| db_3 | t1 | 681 | 4.100 | 185587 | 1.200 |
| db_2 | user_drug_grouping_ndcs | 1306 | 2.300 | 303940 | 0.300 |
| db_1 | stg_population_rpt_enc_diag_import | 11927 | 1.200 | 0 | NULL |
| db_1 | drug_dim | 10996 | 0.900 | 0 | NULL |
| db_1 | stg_population_rpt_drug | 1869298 | 0.300 | 0 | NULL |
| db_1 | population_rpt_drug | 1869298 | 0.300 | 0 | NULL |
| db_1 | stg_population_rpt_enc | 3616520 | 0.100 | 0 | NULL |
| db_1 | population_rpt_enc | 3616520 | 0.100 | 0 | NULL |
| db_4 | segments | 6203 | 0.000 | 34408288 | 0.000 |
| db_1 | population_rpt_enc_diag | 14552272 | 0.000 | 0 | NULL |
| db_1 | code_ref | 202740 | 0.000 | 76111016 | 0.100 |
| db_4 | adt | 478981981 | 0.000 | 0 | NULL |
| db_1 | population_rpt_enc_proc | 13856358 | 0.000 | 0 | NULL |
| db_5 | feed_outpatient_mapping | 4241 | 0.000 | 872224 | 0.000 |
| db_1 | stg_population_rpt_enc_diag | 14552272 | 0.000 | 0 | NULL |
| db_1 | stg_population_rpt_enc_proc | 13856358 | 0.000 | 0 | NULL |
+-------------------+------------------------------------+-----------+----------+------------+-------------+
Running SHOW STATUS EXTENDED on the LEAF Nodes gives me:
| Aborted_clients | 31947 |
| Aborted_connects | 15 |
| Active_dedicated_admin_connections | 0 |
| Alloc_client_connection | 8.000 MB |
| Alloc_compiled_unit_sections | 32.567 MB |
| Alloc_databases_list_entry | 2.250 MB |
| Alloc_deleted_version | 159.875 MB |
| Alloc_hash_buckets | 506.448 MB |
| Alloc_internal_key_node | 71.125 MB |
| Alloc_mmap_file | 17744.000 MB |
| Alloc_object_code_images | 50.223 MB |
| Alloc_plan_cache | 1.750 MB |
| Alloc_protocol_packet | 142.500 MB |
| Alloc_query_execution | 0.000 (-3.125) MB |
| Alloc_replication | 17.750 MB |
| Alloc_sharding_partitions | 0.125 MB |
| Alloc_skiplist_tower | 1118.000 MB |
| Alloc_table_autostats | 56.979 MB |
| Alloc_table_memory | 3577.426 MB |
| Alloc_table_metadata_cache | 3.000 MB |
| Alloc_table_primary | 570.750 MB |
| Alloc_thread_stacks | 587.000 (-20.000) MB |
| Alloc_unit_ifn_thunks | 3.543 MB |
| Alloc_unit_images | 119.993 MB |
| Alloc_variable | 1094.250 MB |
| Alloc_variable_allocated | 104.9 MB |
| Alloc_variable_bucket_104 | allocs:29476 alloc_MB:2.9 buffer_MB:4.5 cached_buffer_MB:1.0 |
| Alloc_variable_bucket_1168 | allocs:187 alloc_MB:0.2 buffer_MB:3.4 cached_buffer_MB:2.1 |
| Alloc_variable_bucket_11896 | allocs:38 alloc_MB:0.4 buffer_MB:3.2 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_128 | allocs:25682 alloc_MB:3.1 buffer_MB:4.2 cached_buffer_MB:0.4 |
| Alloc_variable_bucket_130960 | allocs:2 alloc_MB:0.2 buffer_MB:2.1 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_14544 | allocs:65 alloc_MB:0.9 buffer_MB:4.1 cached_buffer_MB:1.8 |
| Alloc_variable_bucket_1480 | allocs:104 alloc_MB:0.1 buffer_MB:3.1 cached_buffer_MB:2.6 |
| Alloc_variable_bucket_16 | allocs:670299 alloc_MB:10.2 buffer_MB:466.9 cached_buffer_MB:2.0 |
| Alloc_variable_bucket_160 | allocs:7540 alloc_MB:1.2 buffer_MB:1.4 cached_buffer_MB:0.1 |
| Alloc_variable_bucket_1832 | allocs:31 alloc_MB:0.1 buffer_MB:2.6 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_18696 | allocs:1832 alloc_MB:32.7 buffer_MB:39.4 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_200 | allocs:1078 alloc_MB:0.2 buffer_MB:0.2 cached_buffer_MB:0.0 |
| Alloc_variable_bucket_21816 | allocs:1 alloc_MB:0.0 buffer_MB:1.9 cached_buffer_MB:1.8 |
| Alloc_variable_bucket_2288 | allocs:154 alloc_MB:0.3 buffer_MB:4.0 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_24 | allocs:49206 alloc_MB:1.1 buffer_MB:363.0 cached_buffer_MB:3.2 |
| Alloc_variable_bucket_248 | allocs:858 alloc_MB:0.2 buffer_MB:0.2 cached_buffer_MB:0.0 |
| Alloc_variable_bucket_26184 | allocs:0 alloc_MB:0.0 buffer_MB:1.9 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_2832 | allocs:11 alloc_MB:0.0 buffer_MB:2.4 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_312 | allocs:546 alloc_MB:0.2 buffer_MB:3.4 cached_buffer_MB:3.0 |
| Alloc_variable_bucket_32 | allocs:258108 alloc_MB:7.9 buffer_MB:59.5 cached_buffer_MB:3.2 |
| Alloc_variable_bucket_32728 | allocs:16 alloc_MB:0.5 buffer_MB:5.1 cached_buffer_MB:3.9 |
| Alloc_variable_bucket_3528 | allocs:10 alloc_MB:0.0 buffer_MB:2.6 cached_buffer_MB:2.0 |
| Alloc_variable_bucket_384 | allocs:30514 alloc_MB:11.2 buffer_MB:13.5 cached_buffer_MB:1.6 |
| Alloc_variable_bucket_40 | allocs:158401 alloc_MB:6.0 buffer_MB:30.2 cached_buffer_MB:2.6 |
| Alloc_variable_bucket_43648 | allocs:1 alloc_MB:0.0 buffer_MB:2.6 cached_buffer_MB:2.5 |
| Alloc_variable_bucket_4504 | allocs:8 alloc_MB:0.0 buffer_MB:2.1 cached_buffer_MB:1.6 |
| Alloc_variable_bucket_48 | allocs:82731 alloc_MB:3.8 buffer_MB:15.5 cached_buffer_MB:1.5 |
| Alloc_variable_bucket_480 | allocs:11 alloc_MB:0.0 buffer_MB:0.9 cached_buffer_MB:0.6 |
| Alloc_variable_bucket_56 | allocs:29813 alloc_MB:1.6 buffer_MB:4.4 cached_buffer_MB:1.1 |
| Alloc_variable_bucket_5680 | allocs:37 alloc_MB:0.2 buffer_MB:3.1 cached_buffer_MB:2.1 |
| Alloc_variable_bucket_600 | allocs:30 alloc_MB:0.0 buffer_MB:2.4 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_6224 | allocs:15 alloc_MB:0.1 buffer_MB:2.6 cached_buffer_MB:2.0 |
| Alloc_variable_bucket_64 | allocs:183328 alloc_MB:11.2 buffer_MB:13.0 cached_buffer_MB:0.8 |
| Alloc_variable_bucket_65472 | allocs:0 alloc_MB:0.0 buffer_MB:1.9 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_72 | allocs:26261 alloc_MB:1.8 buffer_MB:2.5 cached_buffer_MB:0.1 |
| Alloc_variable_bucket_7264 | allocs:43 alloc_MB:0.3 buffer_MB:4.4 cached_buffer_MB:2.1 |
| Alloc_variable_bucket_752 | allocs:42 alloc_MB:0.0 buffer_MB:2.2 cached_buffer_MB:1.9 |
| Alloc_variable_bucket_80 | allocs:22171 alloc_MB:1.7 buffer_MB:2.6 cached_buffer_MB:0.5 |
| Alloc_variable_bucket_88 | allocs:19022 alloc_MB:1.6 buffer_MB:2.6 cached_buffer_MB:0.8 |
| Alloc_variable_bucket_9344 | allocs:301 alloc_MB:2.7 buffer_MB:5.8 cached_buffer_MB:2.1 |
| Alloc_variable_bucket_936 | allocs:57 alloc_MB:0.1 buffer_MB:2.6 cached_buffer_MB:1.9 |
| Alloc_variable_cached_buffers | 69.9 (+0.1) MB |
| Alloc_warnings | 91.625 MB |
| Auto_attach_remaining_seconds | 0 |
| Average_garbage_collection_duration | 52 ms |
| Buffer_manager_cached_memory | 14424.9 (+6.1) MB |
| Buffer_manager_memory | 17710.2 MB |
| Buffer_manager_unrecycled_memory | 4.4 (-3.0) MB |
| Bytes_received | 6562409676366 |
| Bytes_sent | 5235231831848 |
| Columnstore_ingest_management_active_queries | 0 |
| Columnstore_ingest_management_estimated_memory | 0.000 MB |
| Columnstore_ingest_management_estimated_segments_to_flush | 0 |
| Columnstore_ingest_management_max_concurrency | 15 |
| Columnstore_ingest_management_queued_queries | 0 |
| Connections | 33149 |
| Context_switch_misses | 122162 |
| Context_switches | 9520082 |
| Data_directory | /var/lib/memsql/b3434ee4-af85-4706-ac41-a6fdf7271a90/data |
| Disk_space_reserved_for_secondary_index | 0 |
| Execution_time_of_reads | 2692128194 ms |
| Execution_time_of_write | 2696826256 ms |
| Failed_read_queries | 4723 |
| Failed_write_queries | 30892 |
| Free_io_pool_memory | 192.5 MB |
| GCed_versions_last_sweep | 0 |
| Idle_queue | 0 |
| Inflight_async_compilations | 0 |
| Ingest_errors_disk_space_use | 0 Bytes |
| Linux_resident_memory | 27127.524 (-4.414) MB |
| Linux_resident_shared_memory | 49.079 (-1.664) MB |
| Malloc_active_memory | 949.969 (-0.469) MB |
| Malloc_transaction_cached_memory | 297.968 MB |
| Max_used_connections | 1285 |
| Maximum_cluster_capacity | 18 units |
| Plancache_directory | /var/lib/memsql/b3434ee4-af85-4706-ac41-a6fdf7271a90/plancache |
| Prepared_stmt_count | 0 |
| Queries | 99119362 |
| Query_compilation_failures | 0 |
| Query_compilations | 8747 |
| Questions | 99119362 |
| Ready_queue | 0 |
| Row_lock_wait_time | 634 ms |
| Rows_affected_by_writes | 5445583824 |
| Rows_returned_by_reads | 2482418005 |
| Seconds_until_expiration | 130084664 |
| Segments_directory | /var/lib/memsql/b3434ee4-af85-4706-ac41-a6fdf7271a90/data/blobs |
| Snapshots_directory | /var/lib/memsql/b3434ee4-af85-4706-ac41-a6fdf7271a90/data/snapshots |
| Successful_read_queries | 26732949 |
| Successful_write_queries | 62274956 |
| Threads_background | 1 |
| Threads_cached | 203 |
| Threads_connected | 1141 |
| Threads_created | 586 |
| Threads_idle | 758 |
| Threads_running | 1 |
| Threads_shutdown | 16013 |
| Threads_waiting_for_disk_space | 0 |
| Total_blobs_processed_for_fsync | 918407 |
| Total_blobs_submitted_for_fsync | 918407 |
| Total_dedicated_admin_connections | 0 |
| Total_io_pool_memory | 214.9 MB |
| Total_server_memory | 20231.9 (-20.5) MB |
| Transaction_buffer_wait_time | 0 ms |
| Transaction_log_flush_wait_time | 46320214 ms |
| Transaction_logs_directory | /var/lib/memsql/b3434ee4-af85-4706-ac41-a6fdf7271a90/data/logs |
| Uptime | 6187211 |
| Used_instance_license_units | 2 |
| Workload_management_active_connections | 0 |
| Workload_management_active_queries | 0 |
| Workload_management_active_threads | 0 |
| Workload_management_queued_queries | 0
I thought I might need to reduce transaction_buffer size, but according to the SingleStore FAQ that variable is deprecated and I don’t see an entry for Alloc_durability_large
I notice this line however:
| Linux_resident_memory | 27127.524 (-4.414) MB
Any tips/suggestions for a newbie here?