Columnstore tables with unique keys insert into the in-memory segment. I’d be interested what numbers you get if you change salaries
to be a rowstore table. My simulation suggests these numbers are in line with what we see for rowstore tables.
Unique key inserting straight to disk is slated for 7.5.
2020-05-13 19:23:24.202790 create database db;
2020-05-13 19:23:26.367455 use db;
2020-05-13 19:23:26.367615 create table ai(id int auto_increment, a char(16), shard key(id));
2020-05-13 19:23:26.414820 insert ai(a) values (null);
2020-05-13 19:23:26.434448 insert ai(a) select a from ai;
2020-05-13 19:23:26.446610 insert ai(a) select a from ai;
2020-05-13 19:23:26.449789 insert ai(a) select a from ai;
2020-05-13 19:23:26.453297 insert ai(a) select a from ai;
2020-05-13 19:23:26.456515 insert ai(a) select a from ai;
2020-05-13 19:23:26.459757 insert ai(a) select a from ai;
2020-05-13 19:23:26.460906 insert ai(a) select a from ai;
2020-05-13 19:23:26.461789 insert ai(a) select a from ai;
2020-05-13 19:23:26.462725 insert ai(a) select a from ai;
2020-05-13 19:23:26.463983 insert ai(a) select a from ai;
2020-05-13 19:23:26.465723 insert ai(a) select a from ai;
2020-05-13 19:23:26.468323 insert ai(a) select a from ai;
2020-05-13 19:23:26.472506 insert ai(a) select a from ai;
2020-05-13 19:23:26.480496 insert ai(a) select a from ai;
2020-05-13 19:23:26.494401 insert ai(a) select a from ai;
2020-05-13 19:23:26.521402 insert ai(a) select a from ai;
2020-05-13 19:23:26.574805 insert ai(a) select a from ai;
2020-05-13 19:23:26.683676 insert ai(a) select a from ai;
2020-05-13 19:23:26.900184 insert ai(a) select a from ai;
2020-05-13 19:23:27.610899 insert ai(a) select a from ai;
2020-05-13 19:23:29.404897 insert ai(a) select a from ai;
2020-05-13 19:23:32.921005 insert ai(a) select a from ai;
2020-05-13 19:23:40.342757 insert ai(a) select a from ai;
2020-05-13 19:23:55.373456 create table rs(a char(16), shard key(a));
2020-05-13 19:23:55.575927 create table csu(a char(16), shard key(a), key(a) using clustered columnstore, unique key(a) using hash);
2020-05-13 19:23:56.265031 create table cs(a char(16), shard key(a), key(a) using clustered columnstore);
2020-05-13 19:23:56.749664 insert rs select id from ai;
2020-05-13 19:24:08.911213 insert cs select id from ai;
2020-05-13 19:24:18.652115 insert csu select id from ai;
sleep 10
2020-05-13 19:24:54.714452 select database_name, table_name, rows, format(sum(memory_use),0) mem
from information_schema.table_statistics
group by 1, 2;
+---------------+------------+---------+---------------+
| database_name | table_name | rows | mem |
+---------------+------------+---------+---------------+
| db | csu | 1301393 | 1,693,712,480 |
| db | ai | 1049313 | 1,493,565,440 |
| db | rs | 1046604 | 1,419,640,832 |
| db | cs | 1046604 | 925,216 |
+---------------+------------+---------+---------------+
4 rows in set
2020-05-13 19:24:54.717794 select database_name, table_name, format(sum(compressed_size), 0) disk
from information_schema.COLUMNAR_SEGMENTS
group by 1, 2;
+---------------+------------+------------+
| database_name | table_name | disk |
+---------------+------------+------------+
| db | csu | 39,049,154 |
| db | cs | 82,756,406 |
+---------------+------------+------------+
2 rows in set
2020-05-13 19:24:54.721979 optimize table cs flush;
2020-05-13 19:24:54.744249 optimize table csu flush;
2020-05-13 19:25:03.877798 select database_name, table_name, rows, format(sum(memory_use),0) mem
from information_schema.table_statistics
group by 1, 2;
+---------------+------------+---------+---------------+
| database_name | table_name | rows | mem |
+---------------+------------+---------+---------------+
| db | csu | 1353207 | 1,306,793,248 |
| db | ai | 1049313 | 1,493,565,440 |
| db | rs | 1046604 | 1,419,640,832 |
| db | cs | 1046604 | 1,310,720 |
+---------------+------------+---------+---------------+
4 rows in set
2020-05-13 19:25:03.879202 select database_name, table_name, format(sum(compressed_size), 0) disk
from information_schema.COLUMNAR_SEGMENTS
group by 1, 2;
+---------------+------------+------------+
| database_name | table_name | disk |
+---------------+------------+------------+
| db | csu | 82,800,382 |
| db | cs | 82,791,227 |
+---------------+------------+------------+
2 rows in set
# Wait for GC
master-agg $ memsql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1381
Server version: 5.5.58 MemSQL source distribution (compatible; MySQL Enterprise & MySQL Commercial)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
memsql> use db
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
memsql> select database_name, table_name, rows, format(sum(memory_use),0) mem
-> from information_schema.table_statistics
-> group by 1, 2;
+---------------+------------+---------+---------------+
| database_name | table_name | rows | mem |
+---------------+------------+---------+---------------+
| db | csu | 1046604 | 0 |
| db | ai | 1049313 | 1,493,565,440 |
| db | rs | 1046604 | 1,419,640,832 |
| db | cs | 1046604 | 0 |
+---------------+------------+---------+---------------+
4 rows in set (0.01 sec)
memsql>