Generating monotonically increasing sequence id when using multiple aggregators

jay · August 28, 2020, 7:45pm

We are looking to use two aggregators for load balancing and high availability. One requirement we have is to have a unique monotonically increasing id for a column no matter which aggregator inserted it. Any ideas how to achieve that ? Currently we have only one aggregator so we’re able to easily accomplish this.

hanson · August 28, 2020, 11:11pm

Hi @jay. Welcome! You can simulate sequences like this:

CREATE TABLE sequences (sequence_name text primary key, value bigint not null);

-- Creating a sequence:
INSERT INTO sequences VALUES ('my_sequence_name', 0);

-- Generating a value:
UPDATE sequences SET value=LAST_INSERT_ID(value+1) WHERE sequence_name = 'my_sequence_name';

-- sample use

memsql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                1 |
+------------------+
1 row in set (0.01 sec)

memsql> UPDATE sequences SET value=LAST_INSERT_ID(value+1) WHERE sequence_name = 'my_sequence_name';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

memsql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                2 |
+------------------+
1 row in set (0.00 sec)

memsql> create table foo(a int);
Query OK, 0 rows affected (0.07 sec)

memsql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                2 |
+------------------+
1 row in set (0.00 sec)

memsql> insert foo values(last_insert_id());
Query OK, 1 row affected (0.11 sec)

memsql> select * from foo;
+------+
| a    |
+------+
|    2 |
+------+

Since the sequences table is sharded on sequence_name since it is a primary key, there will be only one copy of each sequence count value that everybody shares, independent of what aggregator the operation comes into.

If you need to, you can have client apps grab batches of sequence numbers, like 100 at a time, by updating the sequences table and adding the batch size to the sequence number, and just put the numbers into records as needed until they are used up, then grab another batch. Hope this helps.

jay · August 29, 2020, 12:06am

Thanks a lot Hanson for the detailed reply. I’m familiar with this method, but what this doesn’t address is the “monotonically increasing” requirement part of my question. For e.g. the processes that look at the tables currently assume if event 1 has sequence number 10 and event 2 has sequence number 11, then event 1 “happened before” event 2.

hanson · August 31, 2020, 3:54pm

If you absolutely, positively need to make sure events that happened are recorded in a specific counter order, try using SELECT FOR UPDATE to increment a counter and get one number at a time. You can shard your counters like I showed above so they only live on one aggregator. SELECT FOR UPDATE holds an exclusive lock on the row between the SELECT and the UPDATE.

hanson · August 31, 2020, 4:24pm

By the way, we have a feature request forum area and you can both request new features and vote on feature requests there.

https://www.memsql.com/forum/c/feature-requests/

If you have a feature request, are want to vote on one, we’d love to hear from you. E.g. we’ve contemplated creating an Oracle- or SQL Server-like SEQUENCES feature. If that’s what you want, please ask.