feat: add 53-bit unique ID generator & database-generated sharded sequences #858

Merged
levkk merged 19 commits into main from levkk-configurable-unique-id
Apr 8, 2026

@levkk levkk commented Mar 30, 2026

Smaller unique ID

This feature is still experimental and subject to change.

Add support for 53-bit unique IDs. This makes them smaller and "JavaScript-safe" in case identifiers are exposed as integers through the app API.

This is configurable through pgdog.toml:

[general]
unique_id_function = "compact" # or "standard", which is default

Trade-offs: it only allows 64 pgdog nodes and 64,000 IDs generated per second per node. This is considerably lower than the standard 64-bit unique ID. It should only be used for omnisharded tables with a low write frequency.
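
For readers who want to check the arithmetic behind these limits, here is a minimal sketch. It assumes a Snowflake-style layout of 41 timestamp bits (milliseconds), 6 node bits, and 6 sequence bits. Only the 6 node bits are confirmed by the diff in this PR; the other widths are inferred from the 64,000 IDs per second figure, and `compact_id` is a hypothetical helper, not pgdog's API.

```rust
// Assumed layout: 41 timestamp bits + 6 node bits + 6 sequence bits = 53 bits.
const SEQUENCE_BITS: u32 = 6; // 64 IDs per millisecond, i.e. 64,000 per second
const NODE_BITS: u32 = 6; // 64 nodes
const TIMESTAMP_BITS: u32 = 41; // assumption: milliseconds since a custom epoch

const MAX_SEQUENCE: u64 = (1u64 << SEQUENCE_BITS) - 1;
const MAX_NODE_ID: u64 = (1u64 << NODE_BITS) - 1;
const MAX_TIMESTAMP: u64 = (1u64 << TIMESTAMP_BITS) - 1;

/// Largest integer a JavaScript number can represent exactly (2^53 - 1).
const JS_MAX_SAFE_INTEGER: u64 = (1u64 << 53) - 1;

/// Pack the three fields into one integer.
/// Returns `None` if any field does not fit in its bit width.
fn compact_id(timestamp_ms: u64, node_id: u64, sequence: u64) -> Option<u64> {
    if timestamp_ms > MAX_TIMESTAMP || node_id > MAX_NODE_ID || sequence > MAX_SEQUENCE {
        return None;
    }
    Some((timestamp_ms << (NODE_BITS + SEQUENCE_BITS)) | (node_id << SEQUENCE_BITS) | sequence)
}

fn main() {
    // With every field at its maximum, the ID is exactly 2^53 - 1.
    let largest = compact_id(MAX_TIMESTAMP, MAX_NODE_ID, MAX_SEQUENCE).unwrap();
    assert_eq!(largest, JS_MAX_SAFE_INTEGER);

    // Node 64 needs a 7th bit, so it is rejected.
    assert!(compact_id(0, 64, 0).is_none());

    println!("largest compact ID: {largest}");
}
```

Under these assumptions, every generated ID stays at or below 2^53 - 1, which is why it survives a round trip through a JavaScript number.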

Database-generated sharded sequences

Add the ability to generate cross-shard unique IDs using sequences. This produces smaller starting integers, but can only work with direct-to-shard INSERTs. For omnisharded tables, the unique ID function should continue to be used.

This is configurable through pgdog.toml:

[rewrite]
# Will only inject Unique ID to omnisharded table inserts,
# leaving others to be generated by the database instead.
primary_key = "rewrite_omni"

The sharded sequence is installed automatically on all tables with a BIGINT primary key when running pgdog setup or SETUP SCHEMA via the admin DB.

Identity columns

GENERATED ... AS IDENTITY is now removed entirely during schema sync. This constraint blocks our implementation of replication and doesn't work with sharded databases, including with our unique ID generator, because it prevents the column value from being inserted by the query.

Smaller features

  • Add RESET PREPARED admin command which evicts all unused prepared statements from the global cache
  • Log any notice/warning received from Postgres when executing queries via execute_checked

Bug fixes

  • unique_id_min setting was being ignored by the unique ID generator. This didn't affect any deployments because unique IDs start very large already.
  • Admin command SETUP SCHEMA was applied to all database/user pairs; it's now applied to the schema_owner only

Comment on lines +174 to +178
// Compact (JS-safe) IDs only have 6 node bits.
if node_id > COMPACT_MAX_NODE_ID {
    return Err(Error::CompactNodeIdTooLarge(node_id));
}

Contributor

This should probably be checked only if the function is compact. For standard, it could cause an unexpected error.

Collaborator Author

Good catch!

Comment on lines 179 to 181
if min_id > MAX_OFFSET {
    return Err(Error::OffsetTooLarge(min_id));
}

Contributor

@meskill meskill Mar 31, 2026

The check is done only against MAX_OFFSET, which is based on the standard generator, but the limit really depends on the actual layout.

Contributor

It might be worth moving the checks into the id_type or the state itself, to avoid adding more conditions here and to be ready for other extensions.
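
One way to read this suggestion is sketched below: each layout reports its own limits, and a single validation function asks the layout instead of branching on it. The `IdType` enum, its methods, and the bit widths are assumptions for illustration (the diff only confirms 6 node bits for compact and a 22-bit timestamp shift for standard), and `max_offset` is a deliberately loose placeholder rather than the real bound.

```rust
/// Hypothetical ID layouts; names and bit widths are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IdType {
    Standard,
    Compact,
}

#[derive(Debug, PartialEq)]
enum Error {
    NodeIdTooLarge(u64),
    OffsetTooLarge(u64),
}

impl IdType {
    /// Usable bits: 63 for a positive BIGINT, 53 for a JS-safe integer.
    fn total_bits(self) -> u32 {
        match self {
            IdType::Standard => 63,
            IdType::Compact => 53,
        }
    }

    /// Node bits: 6 for compact (from the diff); 10 for standard is assumed.
    fn node_bits(self) -> u32 {
        match self {
            IdType::Standard => 10,
            IdType::Compact => 6,
        }
    }

    fn max_node_id(self) -> u64 {
        (1u64 << self.node_bits()) - 1
    }

    /// Placeholder bound: the largest value the layout can represent.
    /// A real limit would be stricter, since offset + generated ID must fit.
    fn max_offset(self) -> u64 {
        (1u64 << self.total_bits()) - 1
    }
}

/// One validation path for every layout; no per-layout conditions here.
fn validate(id_type: IdType, node_id: u64, min_id: u64) -> Result<(), Error> {
    if node_id > id_type.max_node_id() {
        return Err(Error::NodeIdTooLarge(node_id));
    }
    if min_id > id_type.max_offset() {
        return Err(Error::OffsetTooLarge(min_id));
    }
    Ok(())
}

fn main() {
    // Node 100 is fine for the standard layout but not for compact.
    assert_eq!(validate(IdType::Standard, 100, 0), Ok(()));
    assert_eq!(validate(IdType::Compact, 100, 0), Err(Error::NodeIdTooLarge(100)));

    // An offset of 2^53 fits the standard layout but not the compact one.
    assert_eq!(validate(IdType::Standard, 0, 1u64 << 53), Ok(()));
    assert_eq!(validate(IdType::Compact, 0, 1u64 << 53), Err(Error::OffsetTooLarge(1u64 << 53)));
}
```

Adding another layout would then mean adding one enum variant and its match arms, with `validate` left unchanged.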

@@ -31,6 +32,14 @@ const TIMESTAMP_SHIFT: u8 = (SEQUENCE_BITS + NODE_BITS) as u8; // 22
const MAX_OFFSET: u64 = i64::MAX as u64

Contributor

By the way, MAX_OFFSET is effectively 0. That makes config.general.unique_id_function unusable except for the default value of 0. It seems the unit tests bypass this, since the offset tests call next_id directly, which has no validation against MAX_OFFSET.

Collaborator Author

I'm not sure what you mean. Could you elaborate? I think we test the min ID in unit tests by passing in as an argument.

Contributor

here: [image]

If we do the math, the value is 0. The tests are not catching this because the tests for id_offset only call the next_id function, which has no validation.

Collaborator Author

Nice catch. Let me see what we can do here...

@levkk levkk force-pushed the levkk-configurable-unique-id branch from 7256483 to 6a91de6 Compare April 1, 2026 19:04
if now >= target_ms {
    return now;
}
thread::sleep(Duration::from_millis(1));

Contributor

A side note:
I wonder if we really need the sleep here to correct for clock drift. It seems to be used to make sure the call to SystemTime::now() meets our expectation of a monotonically increasing timestamp. But it's not strictly necessary for this calculation, since it could be emulated by now_ms().max(self.last_timestamp_ms). Yes, it won't be exactly the system time, but it fulfills the requirements.
I ask because sleep is not reliable and the scheduler could skew the time in this case.

Collaborator Author

I think we're just trying to guarantee that the next call to now() returns the subsequent second or greater. This is used when the internal sequence is exhausted and we have to wait until next clock tick to guarantee that the sequence is monotonically increasing.
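
To make the two positions in this thread concrete, here is a rough sketch of both approaches side by side. `wait_until` follows the loop shown in the diff. `logical_timestamp` is a hypothetical take on the `now_ms().max(self.last_timestamp_ms)` idea, and its `sequence_exhausted` parameter is an addition made for this sketch so the exhausted-sequence case can be shown without sleeping. Neither function is pgdog's actual code.

```rust
use std::thread;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Wall-clock time in milliseconds since the Unix epoch.
fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before the Unix epoch")
        .as_millis() as u64
}

/// Blocking approach, as in the diff: sleep in 1 ms steps until the
/// wall clock reaches `target_ms`, then return the observed time.
fn wait_until(target_ms: u64) -> u64 {
    loop {
        let now = now_ms();
        if now >= target_ms {
            return now;
        }
        thread::sleep(Duration::from_millis(1));
    }
}

/// Non-blocking approach (hypothetical): never return a timestamp below
/// the last one used. If the sequence for `last_timestamp_ms` is exhausted,
/// step one tick past it instead of waiting for the wall clock.
fn logical_timestamp(last_timestamp_ms: u64, now: u64, sequence_exhausted: bool) -> u64 {
    let floor = if sequence_exhausted {
        last_timestamp_ms + 1
    } else {
        last_timestamp_ms
    };
    now.max(floor)
}

fn main() {
    // Blocking: the returned time is never earlier than the target.
    let target = now_ms() + 2;
    assert!(wait_until(target) >= target);

    // Non-blocking: a clock that went backwards is clamped to the last timestamp.
    assert_eq!(logical_timestamp(100, 90, false), 100);
    // Exhausted sequence: advance one tick without sleeping.
    assert_eq!(logical_timestamp(100, 90, true), 101);
    // A clock that is already ahead is used as-is.
    assert_eq!(logical_timestamp(100, 150, true), 150);
}
```

The trade-off is that the non-blocking variant can run ahead of the wall clock under sustained load, so the timestamp embedded in an ID stops reflecting real time, while the blocking variant stays tied to the clock at the cost of depending on the scheduler.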

@levkk levkk force-pushed the levkk-configurable-unique-id branch 3 times, most recently from 9614cbe to 1778c85 Compare April 7, 2026 21:54
@levkk levkk changed the title feat: 53-bit unique id feat: add 53-bit unique ID generator Apr 7, 2026
@levkk levkk changed the title feat: add 53-bit unique ID generator feat: add 53-bit unique ID generator & database-generated sharded sequences Apr 7, 2026
@levkk levkk force-pushed the levkk-configurable-unique-id branch from 702f74a to 3108b04 Compare April 8, 2026 14:56
@levkk levkk merged commit 38eccf3 into main Apr 8, 2026
10 of 11 checks passed
@levkk levkk deleted the levkk-configurable-unique-id branch April 8, 2026 15:50