FE-769: Add String token dimension type (string interning) #8956
Variable-length strings cannot live in fixed-stride packed token structs, so frames store a 64-bit reference (new `u64` physical kind) into an append-only per-run `StringPool` owned by the simulation, not the frame:

- StringPool: ID 0 is pre-seeded as `""` (zeroed buffers decode sanely); entries are immutable once assigned and never compacted mid-run (IDs stay valid for the whole retained frame history); each init gets a fresh pool; a `maxSize` guard (1M distinct values) fails loudly on unbounded unique-string workloads.
- Interactive runs ship append-only `newStrings` payload deltas; the main-thread frame store accumulates its own pool copy (ordered, `baseId`-asserted) and frame readers decode through it. Monte Carlo pools are per-run and never cross threads.
- Runtime records hold plain JS strings; coercion is total (`String(value)`, missing → `""`); interning is deterministic and equal strings always share an ID. `Distribution` on a string stays an error; dynamics read strings but cannot write them.
- LSP types string elements as `string` end to end; the spreadsheet gains text cells; type properties and the playground gain the String option (the memory view shows the pool-reference round-trip).
- Design record: docs/string-interning.md (options considered, mutability analysis, trade-offs, future iterations).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
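To make the pool contract described in this commit message concrete, here is a minimal sketch of an append-only intern pool. It is not the code in `string-pool.ts`: the class name `SketchStringPool`, the method signatures, and the choice to count the reserved `""` entry toward `maxSize` are all assumptions made for illustration.

```typescript
// Hypothetical sketch of an append-only intern pool; not the PR's implementation.
class SketchStringPool {
  // The index in `values` is the ID. ID 0 is reserved for the empty string.
  private readonly values: string[] = [""];
  private readonly ids = new Map<string, number>([["", 0]]);
  private readonly maxSize: number;

  constructor(maxSize: number = 1000000) {
    // Assumption: the reserved "" entry counts toward the limit.
    this.maxSize = maxSize;
  }

  /** Returns the existing ID for `value`, or appends it and returns the new ID. */
  intern(value: string): number {
    const existing = this.ids.get(value);
    if (existing !== undefined) {
      return existing;
    }
    if (this.values.length >= this.maxSize) {
      throw new Error(
        `String pool exceeded ${this.maxSize} distinct values; ` +
          "is a kernel generating unbounded unique strings?",
      );
    }
    const id = this.values.length;
    this.values.push(value);
    this.ids.set(value, id);
    return id;
  }

  /** Decodes an ID; an unknown ID is treated as a programmer error. */
  get(id: number): string {
    const value = this.values[id];
    if (value === undefined) {
      throw new Error(`Unknown string pool ID ${id}`);
    }
    return value;
  }

  /** Entries with ID >= `baseId`, usable as an append-only delta. */
  valuesFrom(baseId: number): string[] {
    return this.values.slice(baseId);
  }
}

const pool = new SketchStringPool();
const first = pool.intern("hello world");
const second = pool.intern("hello world");
console.log(first === second, pool.get(0) === "");
```

Because entries are never removed or reordered, an ID handed out early in a run keeps decoding to the same string for as long as frames that reference it are retained.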
…e-769-add-string-discrete-type-support # Conflicts: # libs/@hashintel/petrinaut-core/docs/architecture/engine.html
PR Summary
Medium Risk
Overview
Engine & protocol:
Surface area: schema/LSP virtual types/AI cheatsheet, scenario compile & clipboard validation, type properties + initial-state/scenario spreadsheets, token-encoding playground, user guide and
Reviewed by Cursor Bugbot for commit 8d623ca.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
  block.byteLength,
  );
- return readTokenRecord(layout, views, 0);
+ return readTokenRecord(layout, views, 0, stringPool);
Test helper omits string pool
Low Severity
In the same file, buildTokenBytes, decodeTokenBlock, and makeTestFrame pass a StringPool into readTokenRecord / encodeTokenToBytes, but decodePlaceTokens still calls readTokenRecord without a pool. For layouts with string elements, that throws the new programmer-error from token-layout.ts, so engine tests that decode frames via this helper cannot exercise string tokens even when the simulation instance has a pool.
Pull request overview
Adds string as a new token dimension type across Petrinaut UI + @hashintel/petrinaut-core, implementing per-run string interning via an append-only StringPool and shipping append-only pool deltas alongside interactive worker frame payloads so main-thread frame history remains decodable.
Changes:
- Introduces `StringPool` and a new `u64` physical kind for string pool references in packed token buffers, with encode/decode wired through the pool.
- Extends the interactive worker protocol + main-thread frame store to accumulate `newStrings` deltas and decode historical frames against the accumulated pool.
- Updates UI editors (type dropdown, spreadsheets, playground) and documentation to support string dimensions end-to-end.
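The delta accumulation in the second bullet can be sketched as follows. Only the `newStrings`, `baseId`, and `values` names come from this PR; `SketchFrameStore`, `SketchFramePayload`, and the `frameIndex` field are hypothetical stand-ins, so treat this as an illustration of the ordering check rather than the actual `frame-store.ts` code.

```typescript
// Hypothetical shapes; only `newStrings`, `baseId`, and `values` appear in the PR text.
interface NewStringsDelta {
  baseId: number;
  values: string[];
}

interface SketchFramePayload {
  frameIndex: number; // assumed field, for illustration only
  newStrings?: NewStringsDelta;
}

class SketchFrameStore {
  // Main-thread copy of the pool; ID 0 is the reserved empty string.
  private strings: string[] = [""];
  private frames: SketchFramePayload[] = [];

  push(payload: SketchFramePayload): void {
    const delta = payload.newStrings;
    if (delta !== undefined) {
      // The delta must start exactly where the local copy ends; anything else
      // means a payload was dropped, duplicated, or reordered.
      if (delta.baseId !== this.strings.length) {
        throw new Error(
          `newStrings baseId ${delta.baseId} does not match pool size ${this.strings.length}`,
        );
      }
      this.strings.push(...delta.values);
    }
    // Strings are merged before the frame is stored, so every stored frame
    // can be decoded as soon as it arrives.
    this.frames.push(payload);
  }

  readString(id: number): string {
    const value = this.strings[id];
    if (value === undefined) {
      throw new Error(`Unknown string pool ID ${id}`);
    }
    return value;
  }

  clear(): void {
    this.strings = [""];
    this.frames = [];
  }
}

const store = new SketchFrameStore();
store.push({ frameIndex: 0, newStrings: { baseId: 1, values: ["idle", "busy"] } });
store.push({ frameIndex: 1 }); // no new strings, so no delta is shipped
console.log(store.readString(2));
```

Asserting on `baseId` instead of merging blindly turns a lost or reordered payload into an immediate error rather than silently shifted IDs.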
Reviewed changes
Copilot reviewed 48 out of 48 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| libs/@hashintel/petrinaut/src/ui/views/Editor/panels/SimulateView/scenarios/scenario-mapping.ts | Keeps string scenario columns as literal text when mapping rows to spreadsheet values. |
| libs/@hashintel/petrinaut/src/ui/views/Editor/panels/PropertiesPanel/type-properties/subviews/main.tsx | Adds “String” to the dimension type selector in type properties. |
| libs/@hashintel/petrinaut/src/ui/views/Editor/panels/PropertiesPanel/place-properties/subviews/place-initial-state/initial-state-editor.tsx | Preserves string initial-marking values as text in the spreadsheet editor. |
| libs/@hashintel/petrinaut/src/ui/lib/compile-visualizer.ts | Extends visualizer token prop typing to include string values. |
| libs/@hashintel/petrinaut/src/ui/dev/token-encoding-playground/token-memory-view.tsx | Updates playground memory view to display string pool IDs + decoded text round-trip. |
| libs/@hashintel/petrinaut/src/ui/dev/token-encoding-playground/playground-monaco.ts | Updates Monaco defs generation to type string dimensions as string. |
| libs/@hashintel/petrinaut/src/ui/dev/token-encoding-playground/physical-layout.ts | Uses a throwaway StringPool in the playground encoder/decoder path for string fields. |
| libs/@hashintel/petrinaut/src/ui/dev/token-encoding-playground/physical-layout.test.ts | Adds playground tests for string field layout and pool-reference round-trips. |
| libs/@hashintel/petrinaut/src/ui/dev/token-encoding-playground/dimension-editor.tsx | Adds “String” to the playground dimension type selector. |
| libs/@hashintel/petrinaut/src/ui/components/spreadsheet.tsx | Adds string column type support (parsing, tooltips, input type handling). |
| libs/@hashintel/petrinaut/src/ui/components/spreadsheet.stories.tsx | Extends spreadsheet Storybook story with a string column + data. |
| libs/@hashintel/petrinaut/docs/petri-net-extensions.md | Documents the new String dimension type and its discrete semantics. |
| libs/@hashintel/petrinaut-core/src/types/sdcpn.ts | Adds "string" to ColorElementType and includes string in token attribute runtime union. |
| libs/@hashintel/petrinaut-core/src/simulation/worker/simulation.worker.ts | Ships append-only newStrings deltas and resets delta state per init/reset. |
| libs/@hashintel/petrinaut-core/src/simulation/worker/simulation.worker.test.ts | Tests initial-marking delta shipping and omission when no string fields exist. |
| libs/@hashintel/petrinaut-core/src/simulation/worker/frame-payload.ts | Extends worker frame payload type with optional newStrings delta. |
| libs/@hashintel/petrinaut-core/src/simulation/runtime/frame-store.ts | Accumulates main-thread string pool copy and asserts delta ordering before storing frames. |
| libs/@hashintel/petrinaut-core/src/simulation/runtime/frame-store.test.ts | New tests for pool accumulation, ordering assertions, and clear() reset behavior. |
| libs/@hashintel/petrinaut-core/src/simulation/monte-carlo/transition-effect.ts | Ensures MC decode/encode paths use the run’s string pool; adjusts error formatting. |
| libs/@hashintel/petrinaut-core/src/simulation/monte-carlo/monte-carlo-simulator.test.ts | Adds an end-to-end MC test covering string interning + metric decoding. |
| libs/@hashintel/petrinaut-core/src/simulation/monte-carlo/frame-reader.ts | Decodes MC tokens using the run-local string pool. |
| libs/@hashintel/petrinaut-core/src/simulation/frames/frame-reader.ts | Extends frame reader compilation to accept a StringPoolReader for string decoding. |
| libs/@hashintel/petrinaut-core/src/simulation/frames/frame-reader.test.ts | Adds coverage for decoding string fields through a provided pool accessor. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/types.ts | Adds stringPool to SimulationInstance so the pool is owned per run/init. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/token-values.ts | Adds string default/coercion and guards against encoding/decoding strings without a pool. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/token-layout.ts | Adds u64 physical kind, pool reader/writer types, and string pool integration for read/write/encode. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/token-layout.test.ts | Adds layout + round-trip tests for string fields stored as u64 pool references. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/token-layout.test-helpers.ts | Threads stringPool through test helpers so decoding works for string layouts. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/string-pool.ts | New append-only StringPool implementation with max-size guard and delta support. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/string-pool.test.ts | New unit tests for deduping, reserved "", valuesFrom, and max-size guard. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/execute-transitions.test.ts | Ensures test simulation instances include a string pool. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/encode-kernel-token.ts | Interns string outputs in kernel encoding and stores u64 pool IDs into buffers. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/compute-possible-transition.ts | Decodes inputs via the simulation pool and interns outputs via the pool; adjusts error formatting. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/compute-possible-transition.test.ts | Adds kernel output tests for string interning, forwarding, defaults, and Distribution rejection. |
| libs/@hashintel/petrinaut-core/src/simulation/engine/build-simulation.ts | Constructs a per-run StringPool and uses it while packing the initial marking + decoding dynamics input. |
| libs/@hashintel/petrinaut-core/src/simulation/authoring/scenario/compile-scenario.test.ts | Adds compile-scenario tests ensuring string columns pass through literally and default correctly. |
| libs/@hashintel/petrinaut-core/src/simulation/api.ts | Clarifies initial marking value semantics for string attributes. |
| libs/@hashintel/petrinaut-core/src/schemas/scenario-schema.ts | Clarifies schema docs: strings are literal for string elements; uuid strings still coerce for uuid elements. |
| libs/@hashintel/petrinaut-core/src/schemas/entity-schemas.ts | Extends element type enum and schema descriptions to include string semantics + interning. |
| libs/@hashintel/petrinaut-core/src/lsp/lib/generate-virtual-files.ts | Types string elements as string in LSP-generated TS defs (incl. metric session token record unions). |
| libs/@hashintel/petrinaut-core/src/lsp/lib/checker.test.ts | Adds LSP checker tests for string typing, kernel output acceptance, and Distribution rejection. |
| libs/@hashintel/petrinaut-core/src/index.ts | Exports StringPool and pool reader/writer types from the package entrypoint. |
| libs/@hashintel/petrinaut-core/src/default-codes.ts | Adds default source literals for string attributes in generated templates. |
| libs/@hashintel/petrinaut-core/src/clipboard/serialize.test.ts | Updates an invalid-type fixture now that "string" is a valid element type. |
| libs/@hashintel/petrinaut-core/src/ai.ts | Updates code-surface guidance to include string typing and scenario semantics. |
| libs/@hashintel/petrinaut-core/docs/string-interning.md | New design/decision doc describing the string interning architecture and trade-offs. |
| libs/@hashintel/petrinaut-core/docs/architecture/engine.html | Updates the format-v2 table and notes the interactive newStrings delta protocol. |
| .changeset/fe-769-string-token-dimension-type.md | Changeset documenting the new string element type and storage semantics. |
  <tr
    // eslint-disable-next-line react/no-array-index-key -- Row position is stable and meaningful
-   key={`row-${rowIndex}-${row.map(formatCellValue).join("-")}`}
+   key={`row-${rowIndex}-${row
+     .map(formatCellValue)
+     .join("-")}`}
   * - `uuid` attributes are OPTIONAL (omitted values are auto-generated from
   *   the seeded simulation RNG) and also accept UUID strings and the
   *   `Uuid.generate()` / `Uuid.from(value)` sentinels.
-  * - Other discrete attributes (`integer`, `boolean`) must be plain values.
+  * - Other discrete attributes (`integer`, `boolean`, `string`) must be plain
+  *   values (`string` never takes a Distribution or a sentinel).
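The doc comment above requires `string` attributes to be plain values, and the PR describes the coercion as total (`String(value)`, missing → `""`). A minimal sketch of that rule is shown below. The helper name `coerceStringAttribute` is hypothetical, and treating only `undefined` as "missing" is an assumption; the PR text does not say how `null` is handled.

```typescript
// Hypothetical helper; the name and the treatment of `null` are assumptions.
function coerceStringAttribute(value: unknown): string {
  // A missing attribute defaults to the empty string, which is also pool ID 0.
  if (value === undefined) {
    return "";
  }
  // Everything else goes through String(), so numbers and booleans become text.
  return String(value);
}

console.log(JSON.stringify(coerceStringAttribute(undefined)));
console.log(coerceStringAttribute(42));
console.log(coerceStringAttribute("idle"));
```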


🌟 What is the purpose of this PR?
Adds `string` as a fifth colour element type. Strings are variable-length, so they cannot live in the fixed-stride packed token structs. Instead, each frame stores a 64-bit reference into an append-only per-run string intern pool that is owned by the simulation, not the frame. A full design document ships with the PR: `libs/@hashintel/petrinaut-core/docs/string-interning.md`.

🔗 Related links
- `libs/@hashintel/petrinaut-core/docs/string-interning.md` (options considered, mutability analysis, trade-offs, future iterations)

🚫 Blocked by
🔍 What does this change?
The pool (`engine/string-pool.ts`):
- Append-only and per-run: entries are immutable once assigned and never compacted mid-run, and each `init` gets a fresh pool.
- ID 0 is pre-seeded as `""`, so zeroed buffers decode sanely and string-free nets ship zero protocol overhead.
- `maxSize` guard (1M distinct values) fails loudly with a targeted message if a kernel generates unbounded unique strings. This is the pathology that made us reject interning for UUIDs, contained here by design.

Buffers: new `u64` physical kind (8 B, align 8) holding the pool ID via the existing `BigUint64Array` view. Stride math, byte-range compaction, and all whole-token moves are untouched; references are just bytes.

Pool distribution (the "not part of the frame" consequence): `SimulationFramePayload` carries an append-only `newStrings: { baseId, values }` delta; the main-thread frame store accumulates its own pool copy (ordered, `baseId`-asserted) and frame readers decode through it. Delta ordering guarantees every stored frame is decodable on arrival.

Semantics: runtime `TokenRecord`s hold plain JS strings; coercion is total (`String(value)`, missing → `""`); interning is deterministic (same run ⇒ same IDs) and equal strings always share an ID; kernels/markings/scenarios write, dynamics are read-only (`?: never` derivative); `Distribution` on a string field stays an error (LSP + runtime).

UI: String option in the dimension type select and the playground; spreadsheet string columns (text editing, identity parse, Delete → `""`); the playground memory view shows the pool-reference round-trip (input "hello world" → pool id 1 → "hello world").

Docs: dimension-type list + kernel notes in the user guide; `string` row in the architecture format-v2 table; the design document.

Pre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
- `petri-net-extensions.md` shows a four-option dropdown; the UI now has five, so please re-capture

🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:

… `maxSize` with a clear error (see the design doc's trade-off section).

🐾 Next steps
🛡 What tests cover this?
- `string-pool.test.ts` (dedup, reserved `""`, `valuesFrom`, `maxSize` guard) and `frame-store.test.ts` (delta accumulation, ordering assertion, reset)
- … `String()` coercion, missing → `""`, Distribution-on-string throws
- `lint:tsc` + `lint:eslint` clean

❓ How to test this?
- `yarn dev` in `libs/@hashintel/petrinaut`.
- … `state.places.X.tokens[0].label`).
- … `label: input.Source[0].label`) and one producing new strings; confirm equal strings behave identically.
- … `input → pool id → value` round-trip.

📹 Demo
The playground story shows the wire format: a string field as a u64 pool reference with its round-trip in the hover panel.
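The same round-trip can be reproduced outside the playground with a few lines of TypeScript. This is a sketch under stated assumptions: the token is reduced to a single 8-byte string field at offset 0, and `internSketch`, `writeStringField`, and `readStringField` are hypothetical helpers standing in for the engine's pool and layout code.

```typescript
// Assumed layout for illustration: one token = one 8-byte string reference at offset 0.
const STRIDE_BYTES = 8;
const tokenBytes = new ArrayBuffer(STRIDE_BYTES * 2); // room for two tokens
const u64View = new BigUint64Array(tokenBytes);

// Stand-in for the run's pool: the array index is the ID, and ID 0 is "".
const poolValues: string[] = [""];

function internSketch(value: string): number {
  const existing = poolValues.indexOf(value);
  if (existing !== -1) {
    return existing;
  }
  poolValues.push(value);
  return poolValues.length - 1;
}

// Encoding stores only the pool ID in the packed buffer, never the text.
function writeStringField(tokenIndex: number, value: string): void {
  u64View[tokenIndex] = BigInt(internSketch(value));
}

// Decoding reads the ID back and resolves it through the pool.
function readStringField(tokenIndex: number): string {
  const id = Number(u64View[tokenIndex]);
  const value = poolValues[id];
  if (value === undefined) {
    throw new Error(`Unknown string pool ID ${id}`);
  }
  return value;
}

writeStringField(0, "hello world");
console.log(Number(u64View[0]), readStringField(0)); // pool ID, then decoded text
console.log(JSON.stringify(readStringField(1))); // token 1 was never written
```

Token 1 is still zeroed, so it resolves to ID 0 and decodes as the empty string, which is the reason the PR reserves ID 0 for `""`.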