Greetings SingleStore Team,
I was wondering if the OPTIMIZE FULL operation could be substantially sped up if the rowsegment length requirement were relaxed?
I see the result of our OPTIMZE FULL resulting in equal length row segments.
If a merge inserts rows into the first segment, it seems like the evicted rows get pushed into the next segment and so on such that the entire columnstore get re-written?
If that is the case, it seems like if you merge in 500k rows news into the first row segment, you could instead just merge those rows plus the original 1mm rows (total 1.5mm rows) into two 750k rowsegments and leave all the other rowsegments untouched. I have to imagine the query performance profile of having rowsegments in the range of 70 to 130% of the “optimal” rowsegment length would be fine.
This could be specified in the OPTIMIZE TABLE FULL syntax where there is an option like
OPTIMIZE TABLE my_table FULL ROWSEGMENT_LENGTH_TOLERANCE N
where N is a parameter (default 0 =current behavior, maybe max 50) that permits re-built / merged-in rowsegments to be up to N% smaller or larger than the ideal size.
I think this feature would permit very fast full optimization of even the largest columnstore tables.
-Rich