I am testing large IoT workloads with a MemSQL filesystem pipeline.
The target table is configured as a columnstore table.
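For reference, the target table is defined roughly like this (the table name and columns below are simplified placeholders, not our exact DDL):

```sql
-- Simplified sketch of the columnstore target table (placeholder name and columns)
CREATE TABLE sensor_data (
    equipment_id INT NOT NULL,
    event_time   DATETIME(6) NOT NULL,
    sensor_value DOUBLE,
    KEY (event_time) USING CLUSTERED COLUMNSTORE
);
```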
The specifications of the application server generating the data and the DB server processing it are as follows:
Test server specifications (application server)
Shape: OCI VM.DenseIO2.16
OCPU: 16 (32 vCPUs)
Memory: 128 GB
Disk: SSD
Network: 16.4 Gbps
Test server specifications (DB server, MemSQL 7.0)
Shape: OCI VM.Standard2.16
OCPU: 16 (32 vCPUs)
Memory: 128 GB
Disk: HDD
Network: 16.4 Gbps
Let me explain the test scenario. On the application server, one piece of equipment generates sensor data every 0.1 seconds and writes it into a single CSV file.
As we increase the number of these pieces of equipment, we measure CPU usage and the data loading rate.
The number of partitions in the target table has a 1:1 relationship with the number of pieces of equipment on the application server (see the sketch below).
We ran the test while increasing the amount of sensor data generated per CSV file, which increases the file size.
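For context, the database and pipeline are created roughly as follows (the names, path, and partition count are placeholders; in each run the partition count is set to match the number of pieces of equipment):

```sql
-- Rough sketch of the setup (placeholder names and path)
-- PARTITIONS is set equal to the number of pieces of equipment in the run
CREATE DATABASE iot_test PARTITIONS 30;
USE iot_test;

-- Filesystem pipeline reading the CSV files produced by the application server
CREATE PIPELINE sensor_pipeline AS
LOAD DATA FS '/data/sensor/*.csv'
INTO TABLE sensor_data
FIELDS TERMINATED BY ',';

START PIPELINE sensor_pipeline;
```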
However, as the attached verification table shows, CPU usage and loading time do not scale linearly.
In particular, look at the figures for processing 30,000 sensor records.
I cannot explain the CPU usage and loading time when processing CSV files that are smaller than the comparison case.
I would like to understand how the filesystem pipeline relates to CPU usage, network, file size, and partition count.
In addition, the filesystem pipeline was not able to process all of the data even though CPU usage was not high.
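As a side note, here is a sketch of how the backlog of unprocessed files can be checked, assuming the standard information_schema pipeline tables (the pipeline name is a placeholder):

```sql
-- Count the pipeline's files by state to see how many are still waiting to be loaded
SELECT FILE_STATE, COUNT(*) AS files
FROM information_schema.PIPELINES_FILES
WHERE PIPELINE_NAME = 'sensor_pipeline'
GROUP BY FILE_STATE;
```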
Please refer to the following.
I need a description of the filesystem pipeline mechanism behind this behavior.
The same thing happens when we add a leaf node.
thanks.