feat: add 53-bit unique ID generator & database-generated sharded sequences#858
feat: add 53-bit unique ID generator & database-generated sharded sequences#858
Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
pgdog/src/unique_id.rs
Outdated
| // Compact (JS-safe) IDs only have 6 node bits. | ||
| if node_id > COMPACT_MAX_NODE_ID { | ||
| return Err(Error::CompactNodeIdTooLarge(node_id)); | ||
| } | ||
|
|
There was a problem hiding this comment.
that's probably should be checked only if the function is compact. For standard it could cause unexpected error
pgdog/src/unique_id.rs
Outdated
| if min_id > MAX_OFFSET { | ||
| return Err(Error::OffsetTooLarge(min_id)); | ||
| } |
There was a problem hiding this comment.
check is done only for MAX_OFFSET which is based on the standard generator, but it really depends on the actual layout.
There was a problem hiding this comment.
might worth to move the checks into the id_type or state itself to avoid adding more conditions here and be ready for other extensions
| @@ -31,6 +32,14 @@ const TIMESTAMP_SHIFT: u8 = (SEQUENCE_BITS + NODE_BITS) as u8; // 22 | |||
| const MAX_OFFSET: u64 = i64::MAX as u64 | |||
There was a problem hiding this comment.
btw, MAX_OFFSET if effectively 0. That makes config.general.unique_id_function unusable except for default 0 value. It seems unit tests are bypasses this since the offset tests use next_id directly that has no validation for MAX_OFFSET
There was a problem hiding this comment.
I'm not sure what you mean. Could you elaborate? I think we test the min ID in unit tests by passing in as an argument.
There was a problem hiding this comment.
Nice catch. Let me see what we can do here...
7256483 to
6a91de6
Compare
| if now >= target_ms { | ||
| return now; | ||
| } | ||
| thread::sleep(Duration::from_millis(1)); |
There was a problem hiding this comment.
a sidenote:
I wonder if we really need the sleep here to correct the clock drift. It seems like it's used to make sure the call to SystemTime::now() fits our expectation for monotonically increasing timestamp. But it's not strictly necessary for this calculation since it could be emulated by now_ms().max(self.last_timestamp_ms). Yes, it won't be not exactly the system time, but we fulfill the requirements.
Just because the sleep is not reliable and scheduler could skew the time in this case
There was a problem hiding this comment.
I think we're just trying to guarantee that the next call to now() returns the subsequent second or greater. This is used when the internal sequence is exhausted and we have to wait until next clock tick to guarantee that the sequence is monotonically increasing.
9614cbe to
1778c85
Compare
702f74a to
3108b04
Compare
This reverts commit 8e9f761.

Smaller unique ID
This feature is still experimental and subject to change.
Add support for 53-bit unique ID. This makes them smaller and "javascript-safe" in case identifiers are exposed as integers to the app API.
This is configurable thourough
pgdog.toml:Trade-offs: it only allows 64 pgdog nodes and 64,000 IDs generated per second per node. This is considerably lower than the standard 64-bit unique ID. It should be used for omnisharded tables only with a low write frequency.
Database-generated sharded sequences
Add the ability to generate cross-shard unique IDs using sequences. This produces smaller starting integers, but can only work with direct-to-shard
INSERTs. For omnisharded tables, the unique ID function should continue to be used.This is configurable through
pgdog.toml:The sharded sequence is installed automatically on all tables with a
BIGINTprimary key when runningpgdog setuporSETUP SCHEMAvia the admin DB.Identity columns
GENERATED ... WITH IDENTITYis now removed entirely during schema sync. This constraint blocks our implementation of replication and doesn't work with sharded databases, incl. with our unique ID generator because it prevents the column value from being inserted by the query.Smaller features
RESET PREPAREDadmin command which evicts all unused prepared statements from the global cacheexecute_checkedBug fixes
unique_id_minsetting was being ignored by the unique ID generator. This didn't affect any deployments because unique IDs start very large already.SETUP SCHEMAwas applied to all database/user pairs; it's now applied to theschema_owneronly