It would be great if SingleStore had the MAX_BY/MIN_BY aggregation functions, as it’s a very common function used in OLAP.
Any suggestions around the best performant workaround to achieve what MAX_BY/MIN_BY do with the current functions supported by SingleStore, in a Columnstore table?
@rob@hanson Yes, whenever the ordering argument is a datetime or a timestamp, the FIRST and LAST functions work great. However, if you have a table like the one below, that stores temperature readings of multiple devices, what is the most efficient query that could be used to obtain, for each device, the dt when the maximum and minimum temperatures were recorded?
CREATE TABLE `temperatures` (
`device_id` int,
`dt` datetime series timestamp,
`temperature` float
);
select device_id, first(temperature), last(temperature),
min(temperature), max(temperature),
min(dt) as first_dt, max(dt) as last_dt
from temperatures
group by device_id;
Using min and max on dt can give you the first and last dt values for the device_id you are grouping by.
Oh, I see you are asking a different question. You could do it with a user-defined aggregate that keeps around the time when the min and max temperatures were recorded and includes that in the output.
Alternatively, you could find your min and max temps for each device in a CTE, and then join on temperature to get the corresponding dt value.