Hi Hanson, thanks for escalating this one. The license should be correct, as it is reporting ‘4 units’ to me, and I am getting the same behaviour with our free license and a trial Enterprise license.
I appreciate that it probably looks like a misconfiguration, but the cgroup configuration does appear to be applied correctly.
The cgroups themselves, which I believe should be the source of truth for the applied limits, report the specified limits for both memory (32 GiB) and CPU (8), whether I query the node's filesystem directly or the Kubernetes cAdvisor endpoint:
│ └─kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice
│ ├─cri-containerd-2fad473233e86d93d261d0acf36fbefe8ae7beca953685dd89755cf89a703468.scope
│ │ ├─3147876 bash /etc/memsql/scripts/exporter-startup-script
│ │ ├─3147939 bash /etc/memsql/scripts/exporter-startup-script
│ │ └─3147941 /bin/memsql_exporter
│ ├─cri-containerd-4be5077585d9c8fcd5d65463963d1b2d6043de797d0dbe7d120285fe63adb571.scope
│ │ └─3147569 /pause
│ └─cri-containerd-17b0218bb66421af2e31f6208d4a29764dd691470616d12da14e11fc926d042e.scope
│ ├─3147671 bash /assets/startup-node
│ ├─3148292 /opt/memsql-server-8.0.17-0553658f69/memsqld_safe --auto-restart disable --defaults-file /var/lib/memsql/instance/memsql.cnf --memsqld /opt/memsql-server-8.0.17-0553658f69/memsqld --user 999
│ ├─3148370 /opt/memsql-server-8.0.17-0553658f69/memsqld --defaults-file /var/lib/memsql/instance/memsql.cnf --user 999
│ └─3148470 /opt/memsql-server-8.0.17-0553658f69/memsqld --defaults-file /var/lib/memsql/instance/memsql.cnf --user 999
ben@server$ cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice/cpu.max
810000 100000
ben@server$ cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice/memory.max
34464595968
ben@server:~$ cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice/cri-containerd-17b0218bb66421af2e31f6208d4a29764dd691470616d12da14e11fc926d042e.scope/cpu.max
800000 100000
ben@server:~$ cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice/cri-containerd-17b0218bb66421af2e31f6208d4a29764dd691470616d12da14e11fc926d042e.scope/memory.max
34359738368
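For clarity on how I am reading those raw values (my own arithmetic, not captured output): cpu.max is "quota period" in microseconds and memory.max is in bytes, so the node container works out to exactly 8 CPUs and 32 GiB, with the pod slice adding what I take to be the exporter's 0.1 CPU and 100 MiB on top:
# quota / period -> CPUs; bytes -> GiB / MiB
echo '800000 / 100000' | bc                        # node container cpu.max    -> 8 CPUs
echo 'scale=1; 810000 / 100000' | bc               # pod slice cpu.max         -> 8.1 CPUs (8 + 0.1 for the exporter)
echo '34359738368 / 1024^3' | bc                   # node container memory.max -> 32 GiB
echo '(34464595968 - 34359738368) / 1024^2' | bc   # pod slice minus container -> 100 MiB (exporter limit, I assume)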
ben@server:~$ curl http://localhost:8001/api/v1/nodes/server/proxy/metrics/cadvisor | grep "container_spec_cpu_quota" | grep "node-sdb-cluster-leaf-"
container_spec_cpu_quota{container="",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod56219a10_fc11_41bd_b38f_55c17052eba6.slice",image="",name="",namespace="default",pod="node-sdb-cluster-leaf-ag1-1"} 810000
container_spec_cpu_quota{container="",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice",image="",name="",namespace="default",pod="node-sdb-cluster-leaf-ag1-0"} 810000
container_spec_cpu_quota{container="",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod80e9b598_efc4_4cc9_a994_4c71c97fdf12.slice",image="",name="",namespace="default",pod="node-sdb-cluster-leaf-ag1-2"} 810000
container_spec_cpu_quota{container="exporter",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod56219a10_fc11_41bd_b38f_55c17052eba6.slice/cri-containerd-b348d5a3d45b20fab66856850564fab6ed8a5dffb1efab771ec2c62f61cec61c.scope",image="docker.io/singlestore/node:latest",name="b348d5a3d45b20fab66856850564fab6ed8a5dffb1efab771ec2c62f61cec61c",namespace="default",pod="node-sdb-cluster-leaf-ag1-1"} 10000
container_spec_cpu_quota{container="exporter",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice/cri-containerd-2fad473233e86d93d261d0acf36fbefe8ae7beca953685dd89755cf89a703468.scope",image="docker.io/singlestore/node:latest",name="2fad473233e86d93d261d0acf36fbefe8ae7beca953685dd89755cf89a703468",namespace="default",pod="node-sdb-cluster-leaf-ag1-0"} 10000
container_spec_cpu_quota{container="exporter",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod80e9b598_efc4_4cc9_a994_4c71c97fdf12.slice/cri-containerd-6c5bf1f3a444cacdc156c911249554df44997438afdb8c112c84492e7ce0fc59.scope",image="docker.io/singlestore/node:latest",name="6c5bf1f3a444cacdc156c911249554df44997438afdb8c112c84492e7ce0fc59",namespace="default",pod="node-sdb-cluster-leaf-ag1-2"} 10000
container_spec_cpu_quota{container="node",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod56219a10_fc11_41bd_b38f_55c17052eba6.slice/cri-containerd-a63c83b26f97159d9f04e052ec682ab47b4486ebead03cf93d1b72b51e08bf1e.scope",image="docker.io/singlestore/node:latest",name="a63c83b26f97159d9f04e052ec682ab47b4486ebead03cf93d1b72b51e08bf1e",namespace="default",pod="node-sdb-cluster-leaf-ag1-1"} 800000
container_spec_cpu_quota{container="node",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice/cri-containerd-17b0218bb66421af2e31f6208d4a29764dd691470616d12da14e11fc926d042e.scope",image="docker.io/singlestore/node:latest",name="17b0218bb66421af2e31f6208d4a29764dd691470616d12da14e11fc926d042e",namespace="default",pod="node-sdb-cluster-leaf-ag1-0"} 800000
container_spec_cpu_quota{container="node",id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod80e9b598_efc4_4cc9_a994_4c71c97fdf12.slice/cri-containerd-a71d39d49d070f71e2b3daef9cd5b21d613fc5af2190af397f9f34985cd1ab8d.scope",image="docker.io/singlestore/node:latest",name="a71d39d49d070f71e2b3daef9cd5b21d613fc5af2190af397f9f34985cd1ab8d",namespace="default",pod="node-sdb-cluster-leaf-ag1-2"} 800000
However, when I query the database for its view of the limits, it appears that the memory limit is being recognised, but the CPU limit is not:
MySQL [information_schema]> SELECT * FROM mv_nodes;
+----+---------------------------------------------+------+---------------+---------------+------+--------+--------------------+----------+---------------+---------------------+----------------+----------------------+--------------------+------------------------+--------+---------+
| ID | IP_ADDR | PORT | EXTERNAL_HOST | EXTERNAL_PORT | TYPE | STATE | AVAILABILITY_GROUP | NUM_CPUS | MAX_MEMORY_MB | MAX_TABLE_MEMORY_MB | MEMORY_USED_MB | TABLE_MEMORY_USED_MB | TOTAL_DATA_DISK_MB | AVAILABLE_DATA_DISK_MB | UPTIME | VERSION |
+----+---------------------------------------------+------+---------------+---------------+------+--------+--------------------+----------+---------------+---------------------+----------------+----------------------+--------------------+------------------------+--------+---------+
| 4 | node-sdb-cluster-leaf-ag1-2.svc-sdb-cluster | 3306 | NULL | NULL | LEAF | online | 1 | 32 | 29491 | 26541 | 6495 | 737 | 3752978 | 3424411 | 69913 | 8.0.17 |
| 3 | node-sdb-cluster-leaf-ag1-1.svc-sdb-cluster | 3306 | NULL | NULL | LEAF | online | 1 | 32 | 29491 | 26541 | 6464 | 758 | 3752978 | 3424411 | 69913 | 8.0.17 |
| 2 | node-sdb-cluster-leaf-ag1-0.svc-sdb-cluster | 3306 | NULL | NULL | LEAF | online | 1 | 32 | 29491 | 26541 | 6671 | 753 | 3752978 | 3424411 | 69913 | 8.0.17 |
| 1 | node-sdb-cluster-master-0.svc-sdb-cluster | 3306 | NULL | NULL | MA | online | NULL | 32 | 14745 | 13270 | 484 | 56 | 3752978 | 3424411 | 69990 | 8.0.17 |
+----+---------------------------------------------+------+---------------+---------------+------+--------+--------------------+----------+---------------+---------------------+----------------+----------------------+--------------------+------------------------+--------+---------+
4 rows in set (0.026 sec)
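The memory figures are consistent with the container limit if the engine is applying its usual defaults here; my assumption is maximum_memory at roughly 90% of the detected RAM and maximum_table_memory at roughly 90% of that, which matches to within rounding, whereas NUM_CPUS = 32 looks like the host's core count rather than the 8-CPU quota:
# my arithmetic, assuming the ~90% defaults for maximum_memory / maximum_table_memory
echo '34359738368 / 1024^2' | bc   # container memory.max  -> 32768 MB
echo '32768 * 0.90' | bc           # ~maximum_memory       -> 29491.20 (MAX_MEMORY_MB = 29491)
echo '29491 * 0.90' | bc           # ~maximum_table_memory -> 26541.90 (MAX_TABLE_MEMORY_MB = 26541)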
(For what it's worth, the disk calculation in the same output also looks wrong: it appears to be reading the host's OS disk rather than the PVs assigned to the pods.)
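Assuming the data directory is mounted at /var/lib/memsql, as the process list above suggests, comparing df inside the container against TOTAL_DATA_DISK_MB should show that discrepancy directly:
kubectl exec node-sdb-cluster-leaf-ag1-0 -c node -- df -BM /var/lib/memsql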
As an additional test, I reduced the number of CPUs assigned and ran the cluster under heavy load, and can confirm that it is genuinely constrained by the cgroup configuration: it is not able to exceed the limits set on the pods.
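If you want to reproduce that check on your side, the container cgroup's throttling counters are one way to see the quota being enforced: nr_throttled and throttled_usec climb once the 8-CPU quota is hit under sustained load (same scope path as above):
# watch nr_throttled / throttled_usec increase while the load is running
cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod60cdb7b8_0384_4b9b_b550_e9732e93388c.slice/cri-containerd-17b0218bb66421af2e31f6208d4a29764dd691470616d12da14e11fc926d042e.scope/cpu.stat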
The fact that the pods really are being throttled, that the memory limits are being correctly recognised, and that querying the cgroups either on the filesystem or via Kubernetes shows the correct limits, all leads me to think the cgroups are correctly defined and configured.
We will be looking to expand the Kubernetes cluster to multiple hosts in the future, which is why we need to get this right now, and given the limitations of the way the leaf height variable works, it's important that we can run more than one leaf per physical host.
Are you able to shed any light on how/where it is checking these CPU limits, and why it might be recognising the applied memory limit but not the CPU limit?
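In case it helps narrow things down, my working assumption is that the CPU count is being taken from an interface that is not cgroup-aware, such as /proc/cpuinfo or the affinity-based count that nproc reports, neither of which reflects the CFS quota in cpu.max. What those interfaces report inside the leaf container can be checked with something like:
# neither of these reflects the cpu.max quota, so I would expect both to report the host's CPU count
kubectl exec node-sdb-cluster-leaf-ag1-0 -c node -- nproc
kubectl exec node-sdb-cluster-leaf-ag1-0 -c node -- grep -c ^processor /proc/cpuinfo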