Skip to content

BE-607: HashQL: Refactor ModuleRegistry into builder/immutable split with allocator-generic stdlib construction#8887

Open
indietyp wants to merge 12 commits into
bm/be-538-hashql-permission-system-integrationfrom
bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module
Open

BE-607: HashQL: Refactor ModuleRegistry into builder/immutable split with allocator-generic stdlib construction#8887
indietyp wants to merge 12 commits into
bm/be-538-hashql-permission-system-integrationfrom
bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module

Conversation

@indietyp

@indietyp indietyp commented Jun 19, 2026

Copy link
Copy Markdown
Member

🌟 What is the purpose of this PR?

Replaces the StandardLibrary type (12.9KB on the stack due to nested SmallVec) with a flat ModuleCache backed by MaybeUninit slots and a FiniteBitSet, and restructures ModuleRegistry construction around a builder pattern (PartialModuleRegistry). Reduces stdlib construction cost by ~20% in instructions and ~60% in L1D cache misses.

🔍 What does this change?

Module cache and construction:

  • Replaced nested SmallVec<(TypeId, ModuleDef)> in StandardLibrary with a flat ModuleCache using Box<[MaybeUninit<ModuleDef>]> and FiniteBitSet<CacheId, u64> for initialization tracking
  • Split StandardLibrary into StandardLibrary (coordinator), StandardLibraryContext (passed to define methods), and ModuleCache (shared across modules)
  • Added CacheId enum with one variant per stdlib module in preorder traversal order, replacing runtime TypeId linear scan with compile-time indices
  • ModuleDef tracks emitted_len for exact output capacity when building item slices
  • Direct match in build() instead of [Option<ItemKind>; 2].flatten() for item emission
  • Item slices allocated directly on the heap (bypassing InternSet) since module item slices are provably unique by ModuleId ownership

Registry restructuring:

  • Introduced PartialModuleRegistry as a mutable builder that collects modules during construction
  • ModuleRegistry is now immutable after construction, backed by a contiguous IdSlice instead of InternMap
  • Removed RwLock from the root namespace (mutable access during construction, shared reference after)
  • Added ModuleRegistry::builder() test helper and PartialModuleRegistry::insert_root_module() for tests that inject custom modules

Symbol and type cleanup:

  • Replaced all heap.intern_symbol("...") calls in stdlib modules with pre-interned sym:: constants
  • Replaced all raw string .opaque("::...") calls with sym::path:: nested constants
  • Interned::empty() for zero-generic decl! expansions and special form types
  • ModuleDef parameterized by allocator S for bump-scope 2.x support

Benchmark results (Apple Silicon, bump-scope 2.x):

Counter Before After Delta
Instructions 440.0K 351.2K -20.2%
L1D cache misses 1,213 487 -59.9%
Branch mispredictions 420 347 -17.4%
Walltime 23.49 us 19.57 us -16.7%

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

🐾 Next steps

Potential further optimizations identified but not implemented:

  • Cache primitive TypeIds (number, boolean, etc.) on TypeBuilder to avoid repeated intern round-trips
  • Hoist repeated monomorphic TypeDefs in modules like math and special_form
  • Pre-size hash maps for the finite stdlib construction
  • Batch generic declaration construction in the decl! macro

🛡 What tests cover this?

  • Existing hashql-core test suite (880 unit + 326 integration + 20 doc tests) all pass
  • The module::namespace and module::resolver tests exercise custom module construction through the new PartialModuleRegistry builder API

❓ How to test this?

  1. cargo test --package hashql-core
  2. Verify all 1226 tests pass

@vercel

vercel Bot commented Jun 19, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hash Ready Ready Preview, Comment Jul 3, 2026 4:34pm
petrinaut Ready Ready Preview Jul 3, 2026 4:34pm
1 Skipped Deployment
Project Deployment Actions Updated (UTC)
hashdotdesign-tokens Ignored Ignored Preview Jul 3, 2026 4:34pm

@cursor

cursor Bot commented Jun 19, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Touches core module resolution and stdlib construction used by every compile; behavior should be equivalent but the surface area is large and performance-sensitive.

Overview
Refactors how HashQL builds and stores the module graph: construction goes through PartialModuleRegistry (provision → insert → register → finish), and the finished ModuleRegistry is immutable with modules in an IdSlice and a lock-free root map instead of InternMap + RwLock.

Standard library registration is reworked around a flat ModuleCache with compile-time CacheId slots (replacing runtime TypeId scans and a large nested SmallVec), ModuleDef sized by emitted_len, heap-allocated item slices (no intern_items), and StandardLibraryContext / allocator-generic define hooks across stdlib modules. Item::absolute_path is removed in favor of absolute_path_rev; expander path formatting follows that API.

Tests use ModuleRegistry::builder and insert_root_module; stdlib paths lean on sym:: / sym::path:: instead of ad-hoc heap.intern_symbol strings. Minor: darwin-kperf drops as5-1 from M5 CPU name mapping.

Reviewed by Cursor Bugbot for commit 4abefab. Bugbot is set up for automated code reviews on this repo. Configure here.

indietyp commented Jun 19, 2026

Copy link
Copy Markdown
Member Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copilot AI review requested due to automatic review settings June 30, 2026 12:45
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from b14202a to 6696c2c Compare June 30, 2026 12:45
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from b2e4747 to 1e921af Compare June 30, 2026 12:45

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 2 comments.

b"a16" | b"as1" | b"as2" | b"as3" => Some(Self::M3),
b"as4" | b"as4-1" | b"as4-2" => Some(Self::M4),
b"as5" | b"as5-1" | b"as5-2" => Some(Self::M5),
b"as5" | b"as5-2" => Some(Self::M5),
Comment on lines 322 to 324
/// This method requires allocation for the traversal state:
/// - A vector for the module exploration stack
/// - A hash set for tracking visited modules
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from 1e921af to 5732e9b Compare July 3, 2026 11:35
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from 6696c2c to 5d51d92 Compare July 3, 2026 11:35
Copilot AI review requested due to automatic review settings July 3, 2026 11:39
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from 5d51d92 to f62be24 Compare July 3, 2026 11:39
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from 5732e9b to a7a901a Compare July 3, 2026 11:39

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 1 comment.

Comment thread libs/@local/hashql/core/src/module/mod.rs
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from a7a901a to 65d4a0b Compare July 3, 2026 14:27
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from f62be24 to 558d30c Compare July 3, 2026 14:27
Copilot AI review requested due to automatic review settings July 3, 2026 14:28
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from 558d30c to d08cc92 Compare July 3, 2026 14:28
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from 65d4a0b to 6019aef Compare July 3, 2026 14:28

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 1 comment.

Comment on lines 60 to 63
b"a16" | b"as1" | b"as2" | b"as3" => Some(Self::M3),
b"as4" | b"as4-1" | b"as4-2" => Some(Self::M4),
b"as5" | b"as5-1" | b"as5-2" => Some(Self::M5),
b"as5" | b"as5-2" => Some(Self::M5),
_ => None,

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 2 comments.

b"a16" | b"as1" | b"as2" | b"as3" => Some(Self::M3),
b"as4" | b"as4-1" | b"as4-2" => Some(Self::M4),
b"as5" | b"as5-1" | b"as5-2" => Some(Self::M5),
b"as5" | b"as5-2" => Some(Self::M5),
Comment on lines 56 to 63
pub const fn from_db_name(name: &str) -> Option<Self> {
match name.as_bytes() {
b"a14" => Some(Self::M1),
b"a15" => Some(Self::M2),
b"a16" | b"as1" | b"as2" | b"as3" => Some(Self::M3),
b"as4" | b"as4-1" | b"as4-2" => Some(Self::M4),
b"as5" | b"as5-1" | b"as5-2" => Some(Self::M5),
b"as5" | b"as5-2" => Some(Self::M5),
_ => None,
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/libs Relates to first-party libraries/crates/packages (area) type/eng > backend Owned by the @backend team

Development

Successfully merging this pull request may close these issues.

2 participants