Skip to content

Make TMerkleTreeNode dyn compatible + rm from_u8 panic!#406

Merged
malcolmgreaves merged 11 commits intomainfrom
mg/1_TMerkleTreeNode_object_safety
Apr 3, 2026
Merged

Make TMerkleTreeNode dyn compatible + rm from_u8 panic!#406
malcolmgreaves merged 11 commits intomainfrom
mg/1_TMerkleTreeNode_object_safety

Conversation

@malcolmgreaves
Copy link
Copy Markdown
Collaborator

@malcolmgreaves malcolmgreaves commented Apr 1, 2026

trait TMerkleTreeNode Changes
First in a series of PRs to abstract the Merkle tree store so that we can provide different
backing implementations other than the existing custom file format based implementation.

This PR changes drops the + Serialize constraint to trait TMerkleTreeNode to make it
dyn Compatible (this was formally known as object safe). Serialize has a method
serialize<S: Serializer> that prevents it from being compiled as a virtual table lookup,
hence it's not dyn Compatible. [1]

The change here drops the + Serialize constraint from the trait and instead adds a new
method to_msgpack_bytes() -> Result<Vec<u8>, ...>, which performs the serialization.

There's a blanket implementation that implements this updated trait for all existing concrete
node implementations: it uses their Serialize impl. to provide a generic implementation for
to_msgpack_bytes with a shared serialization error of rmp_serde::encode::Error.

Sites that were manually calling serialize on the node now call to_msgpack_bytes().
There's a new conversion of encode::Error into an OxenError wrapper.

[1] https://doc.rust-lang.org/reference/items/traits.html#r-items.traits.dyn-compatible

Return Serde Error More Often
Several functions were returning an OxenError even though they were only ever returning
a serde serialization error (rmp_serde::encode::Error). Since there's an OxenError
variant + From implementation to convert these, the functions' signatures have been
modified to return the more tightly scoped serde error and have callers rely on using
a conversion via the ? operator into an OxenError.

from_u8 changes
liboxen::model::merkle_tree::node_type::from_u8 has a breaking change: instead of a
panic! when an invalid u8 value is supplied, it returns an Err(.). A new error type
has been added InvalidMerkleTreeNodeType for this case and a new variant of
OxenError::MerkleTreeError has been added that wraps this error. Call sites have been
updated to propagate this error up to an OxenError.

- modified trait to use associated type for error
- to_msgpack_bytes returns result w/ that error or vec<u8>
- blanket impl uses something that also has serialize as to_msgpack_bytes implmentation
- made `from_u8` return error & added it to the OxenError heirarchy

WIP propigate changes
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Refactors Merkle node serialization and error handling: makes node-type parsing fallible, replaces serde panics with propagated rmp_serde encode/decode Results, adds an object-safe to_msgpack_bytes on TMerkleTreeNode, replaces per-node empty trait impls with a blanket impl, and updates DB/repo APIs to use generics and new error types.

Changes

Cohort / File(s) Summary
Error enum
crates/lib/src/error.rs
Added OxenError::MerkleTreeError(InvalidMerkleTreeNodeType) and OxenError::RmpEncodeError(rmp_serde::encode::Error).
Node type & trait refactor
crates/lib/src/model/merkle_tree/node_type.rs
Made from_u8 fallible (returns Result), added InvalidMerkleTreeNodeType error, removed Serialize supertrait from TMerkleTreeNode, added object-safe to_msgpack_bytes() and a blanket impl; added compile-time tests.
Concrete node deserializers
crates/lib/src/model/merkle_tree/node/{commit_node.rs, dir_node.rs, file_chunk_node.rs, file_node.rs, vnode.rs, merkle_tree_node.rs}
Changed deserialize/deserialize_id signatures to return rmp_serde::decode::Error instead of OxenError; removed empty impl TMerkleTreeNode blocks and adjusted imports.
DB layer
crates/lib/src/core/db/merkle_node/merkle_node_db.rs
Propagates MsgPack encode/decode errors (?) instead of panicking; to_node returns rmp decode errors; write paths use to_msgpack_bytes()?; generalized functions to N: TMerkleTreeNode; from_u8 parsing now returns errors instead of defaulting.
Repository tree writer
crates/lib/src/repositories/tree.rs
Private recursive writer p_write_tree changed to explicit generics N: TMerkleTreeNode (no trait-object param); added doc comment and serialization error constraints.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant Repo
    participant p_write_tree
    participant Node
    participant MerkleNodeDB
    participant Storage
    end
    Repo->>p_write_tree: write tree (root commit)
    p_write_tree->>Node: request to_msgpack_bytes()
    Node-->>p_write_tree: Vec<u8> or EncodeError
    p_write_tree->>MerkleNodeDB: write_node(bytes)
    MerkleNodeDB->>Storage: persist bytes
    MerkleNodeDB-->>p_write_tree: Ok or Decode/IO Error
    p_write_tree-->>Repo: final Result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • Oxen-AI/Oxen#500: Overlaps changes to Merkle node serialization APIs and DB read/write code — likely touches the same call sites and signatures.
  • Oxb 254/cleanup push #145: Touches MerkleTreeNodeType and TMerkleTreeNode serialization concerns related to this refactor.
  • Refactor CommitMerkleTree and repositories::tree #176: Modifies Merkle tree interfaces and repository write logic that this PR also updates.

Suggested reviewers

  • gschoeni
  • rpschoenburg
  • jcelliott

Poem

🐰 I pack the nodes in tidy rows,
I swap the panics for calmer flows.
From root to leaf the bytes now sing,
Errors caught — no sudden spring.
✨ Hop, patch, and let the trees take wing.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 41.03% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main refactoring changes: making TMerkleTreeNode trait dyn-compatible and removing panic-prone from_u8() calls.
Description check ✅ Passed The description accurately summarizes the key changes: removing the Serialize constraint from TMerkleTreeNode, adding to_msgpack_bytes(), implementing blanket implementations, updating call sites, and changing from_u8 to return Result instead of panicking.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mg/1_TMerkleTreeNode_object_safety

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
crates/lib/src/error.rs (1)

315-317: Consider adding RmpEncodeError to the internal error hints.

The RmpDecodeError is included in the hint() method (Line 381) as an internal error, but the new RmpEncodeError is not. For consistency, encoding errors should likely receive the same hint.

🔧 Suggested fix
             DB(_) | ArrowError(_) | BinCodeError(_) | RedisError(_) | R2D2Error(_)
-            | RmpDecodeError(_) => {
+            | RmpDecodeError(_) | RmpEncodeError(_) => {
                 "This is an internal error. Run with RUST_LOG=debug for more details."
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/error.rs` around lines 315 - 317, Add RmpEncodeError to the
internal error hints returned by the hint() method so encoding errors get the
same diagnostic guidance as decoding errors; locate the hint() implementation
and where it currently matches RmpDecodeError and include RmpEncodeError in that
same arm (or add a separate arm that returns the same internal error hint) so
both rmp_serde::encode::Error and rmp_serde::decode::Error map to the internal
error hint.
crates/lib/src/repositories/tree.rs (1)

1126-1128: Consider returning an error instead of panicking.

The panic! on unexpected node types could be replaced with an Err return for more graceful error handling, though this represents a programming error rather than a runtime condition.

🔧 Optional fix
             EMerkleTreeNode::File(file_node) => {
                 db.add_child(file_node)?;
             }
-            node => {
-                panic!("p_write_tree Unexpected node type: {node:?}");
-            }
+            EMerkleTreeNode::Commit(_) | EMerkleTreeNode::FileChunk(_) => {
+                return Err(OxenError::basic_str(format!(
+                    "p_write_tree unexpected node type: {:?}",
+                    child.node.node_type()
+                )));
+            }
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/repositories/tree.rs` around lines 1126 - 1128, Replace the
panic in the catch-all match arm inside p_write_tree with a propagated error
return: instead of panic!("p_write_tree Unexpected node type: {node:?}"), return
an Err variant (using the crate's repository error type or anyhow::Error) that
includes a descriptive message and the debug of node so callers can handle it;
update the function signature to return Result if needed and adjust call sites
to propagate the error via ? or map_err so this programming-error case is
reported without aborting the process.
crates/lib/src/core/db/merkle_node/merkle_node_db.rs (1)

257-264: ?Sized bounds needed only if trait objects are intended for these entry points.

The current code's implicit Sized bounds on N would reject &dyn TMerkleTreeNode<...> at these call sites. However, all current callers in p_write_tree pass concrete types (vnode, dir_node, file_node), so this is not a current issue.

If the design goal is to allow trait-object dispatch through these helpers, add ?Sized:

Suggested changes (if needed)
-    pub fn open_read_write_if_not_exists<N: TMerkleTreeNode>(
+    pub fn open_read_write_if_not_exists<N: TMerkleTreeNode + ?Sized>(
-    pub fn open_read_write<N: TMerkleTreeNode>(
+    pub fn open_read_write<N: TMerkleTreeNode + ?Sized>(
-    fn write_node<N: TMerkleTreeNode>(
+    fn write_node<N: TMerkleTreeNode + ?Sized>(
-    pub fn add_child<N: TMerkleTreeNode>(&mut self, item: &N) -> Result<(), OxenError>
+    pub fn add_child<N: TMerkleTreeNode + ?Sized>(&mut self, item: &N) -> Result<(), OxenError>

Applies to: lines 257–264, 277–284, 373–380, 424–427.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs` around lines 257 - 264,
Update the generic bounds to allow trait-object dispatch by adding ?Sized to the
N type parameter where needed: change the signatures of
open_read_write_if_not_exists, (and the other helper functions flagged in the
review at the same pattern) so the generic bound becomes N: TMerkleTreeNode +
?Sized and keep the parameter as node: &N; leave the existing where clauses
(e.g., OxenError: From<N::SerializationError>) intact so SerializationError
still resolves for unsized types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 331-334: The code currently uses
MerkleTreeNodeType::from_u8_unwrap on bytes read from disk (e.g., in the lookup
mapping shown) which will panic on invalid bytes; change these sites (the
occurrence in the lookup mapping and the similar uses at the other noted
locations) to call the fallible MerkleTreeNodeType::from_u8 and propagate the
error instead of unwrapping so open()/map() return Err for invalid node-type
bytes; update the surrounding logic in the functions handling disk reads (e.g.,
open(), map()) to propagate the Result from from_u8 rather than assuming a valid
value.

---

Nitpick comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 257-264: Update the generic bounds to allow trait-object dispatch
by adding ?Sized to the N type parameter where needed: change the signatures of
open_read_write_if_not_exists, (and the other helper functions flagged in the
review at the same pattern) so the generic bound becomes N: TMerkleTreeNode +
?Sized and keep the parameter as node: &N; leave the existing where clauses
(e.g., OxenError: From<N::SerializationError>) intact so SerializationError
still resolves for unsized types.

In `@crates/lib/src/error.rs`:
- Around line 315-317: Add RmpEncodeError to the internal error hints returned
by the hint() method so encoding errors get the same diagnostic guidance as
decoding errors; locate the hint() implementation and where it currently matches
RmpDecodeError and include RmpEncodeError in that same arm (or add a separate
arm that returns the same internal error hint) so both rmp_serde::encode::Error
and rmp_serde::decode::Error map to the internal error hint.

In `@crates/lib/src/repositories/tree.rs`:
- Around line 1126-1128: Replace the panic in the catch-all match arm inside
p_write_tree with a propagated error return: instead of panic!("p_write_tree
Unexpected node type: {node:?}"), return an Err variant (using the crate's
repository error type or anyhow::Error) that includes a descriptive message and
the debug of node so callers can handle it; update the function signature to
return Result if needed and adjust call sites to propagate the error via ? or
map_err so this programming-error case is reported without aborting the process.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 823320ab-2b3e-4dfe-81e9-95cc6d71f313

📥 Commits

Reviewing files that changed from the base of the PR and between 60389ea and ce6c05f.

📒 Files selected for processing (10)
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/error.rs
  • crates/lib/src/model/merkle_tree/node/commit_node.rs
  • crates/lib/src/model/merkle_tree/node/dir_node.rs
  • crates/lib/src/model/merkle_tree/node/file_chunk_node.rs
  • crates/lib/src/model/merkle_tree/node/file_node.rs
  • crates/lib/src/model/merkle_tree/node/merkle_tree_node.rs
  • crates/lib/src/model/merkle_tree/node/vnode.rs
  • crates/lib/src/model/merkle_tree/node_type.rs
  • crates/lib/src/repositories/tree.rs

Comment thread crates/lib/src/core/db/merkle_node/merkle_node_db.rs Outdated
Copy link
Copy Markdown
Contributor

@CleanCut CleanCut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can reduce the complexity here quite a bit and still accomplish the goal!

Comment thread crates/lib/src/repositories/tree.rs Outdated
Comment thread crates/lib/src/model/merkle_tree/node_type.rs Outdated
Comment on lines +92 to +97
/// Deserialize a `u8` value into a `MerkleTreeNodeType`.
/// Panics if the `u8` value is not a valid `MerkleTreeNodeType`.
pub fn from_u8_unwrap(val: u8) -> MerkleTreeNodeType {
Self::from_u8(val).expect("Invalid MerkleTreeNodeType: {val}")
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use from_u8 instead. Returning an error is the correct thing to do in every place that this panicky function is used.

Suggested change
/// Deserialize a `u8` value into a `MerkleTreeNodeType`.
/// Panics if the `u8` value is not a valid `MerkleTreeNodeType`.
pub fn from_u8_unwrap(val: u8) -> MerkleTreeNodeType {
Self::from_u8(val).expect("Invalid MerkleTreeNodeType: {val}")
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I would like to make this change, that's why I'm introducing it here :) I thought I got all of these -- let me double check the call sites.

Comment thread crates/lib/src/model/merkle_tree/node_type.rs
Comment thread crates/lib/src/model/merkle_tree/node_type.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/lib/src/model/merkle_tree/node_type.rs (1)

157-182: The new test still doesn't pin the two regressions this refactor cares about.

Right now it only proves concrete impls exist and their errors convert into OxenError. It won't fail if TMerkleTreeNode stops being dyn-compatible again, and it doesn't lock the repository-critical u8 mapping / invalid-byte path described above.

Proposed test additions
     #[test]
     fn test_nodes_implement_trait() {
         /// this only exists so we can check that all node types implement `TMerkleTreeNode`
         /// it will be a compile-time error if a node type does not implement the trait
-        fn is_tmerkletreenode<T: TMerkleTreeNode>(_: T)
+        fn is_tmerkletreenode<T: TMerkleTreeNode>(node: &T)
         where
             OxenError: From<T::SerializationError>,
         {
+            let _: &dyn TMerkleTreeNode<SerializationError = T::SerializationError> = node;
         }
 
-        is_tmerkletreenode(CommitNode::default());
-        is_tmerkletreenode(DirNode::default());
-        is_tmerkletreenode(VNode::default());
-        is_tmerkletreenode(FileNode::default());
-        is_tmerkletreenode(FileChunkNode::default());
+        is_tmerkletreenode(&CommitNode::default());
+        is_tmerkletreenode(&DirNode::default());
+        is_tmerkletreenode(&VNode::default());
+        is_tmerkletreenode(&FileNode::default());
+        is_tmerkletreenode(&FileChunkNode::default());
     }
+
+    #[test]
+    fn test_node_type_u8_mapping_is_stable() {
+        for (node_type, value) in [
+            (MerkleTreeNodeType::Commit, 0u8),
+            (MerkleTreeNodeType::Dir, 1u8),
+            (MerkleTreeNodeType::VNode, 2u8),
+            (MerkleTreeNodeType::File, 3u8),
+            (MerkleTreeNodeType::FileChunk, 4u8),
+        ] {
+            assert_eq!(node_type.to_u8(), value);
+            assert_eq!(
+                MerkleTreeNodeType::from_u8(value).expect("valid node type"),
+                node_type
+            );
+        }
+    }
+
+    #[test]
+    fn test_invalid_node_type_returns_error() {
+        assert!(matches!(
+            MerkleTreeNodeType::from_u8(u8::MAX),
+            Err(InvalidMerkleTreeNodeType(_))
+        ));
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 157 - 182, Add
two stronger tests: (1) verify TMerkleTreeNode is object-safe by defining a
helper that accepts &dyn TMerkleTreeNode (e.g., fn takes_dyn(_: &dyn
TMerkleTreeNode) {}) and passing references like &CommitNode::default(),
&DirNode::default(), &VNode::default(), &FileNode::default(),
&FileChunkNode::default() to it; (2) pin the u8 mapping by exercising the
NodeType (or equivalent) conversion from bytes (e.g., TryFrom<u8> / from_u8) and
asserting each concrete node variant maps to the expected byte value and that
out-of-range/invalid bytes produce an Err or InvalidByte result. Reference
TMerkleTreeNode, the existing is_tmerkletreenode test, and the NodeType/from-u8
conversion functions to locate where to add these checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 136-155: TMerkleTreeNode now adds an associated type
SerializationError and a method to_msgpack_bytes which breaks any manual impls;
update the crate by (1) adding clear migration docs and CHANGELOG entry
describing how to change any manual impls to declare type SerializationError =
rmp_serde::encode::Error and implement to_msgpack_bytes (or rely on the provided
blanket impl for types implementing Serialize + MerkleTreeNodeIdType + Debug +
Display), (2) mark this change as a major-version bump / deprecation period for
users with custom impls, and (3) optionally provide a short compatibility
note/example showing the exact replacement impl for TMerkleTreeNode (using type
SerializationError and the to_msgpack_bytes body from the blanket impl) so
external consumers can update easily.

---

Nitpick comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 157-182: Add two stronger tests: (1) verify TMerkleTreeNode is
object-safe by defining a helper that accepts &dyn TMerkleTreeNode (e.g., fn
takes_dyn(_: &dyn TMerkleTreeNode) {}) and passing references like
&CommitNode::default(), &DirNode::default(), &VNode::default(),
&FileNode::default(), &FileChunkNode::default() to it; (2) pin the u8 mapping by
exercising the NodeType (or equivalent) conversion from bytes (e.g., TryFrom<u8>
/ from_u8) and asserting each concrete node variant maps to the expected byte
value and that out-of-range/invalid bytes produce an Err or InvalidByte result.
Reference TMerkleTreeNode, the existing is_tmerkletreenode test, and the
NodeType/from-u8 conversion functions to locate where to add these checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 80e1e045-7d88-49f3-91f8-12027386cb23

📥 Commits

Reviewing files that changed from the base of the PR and between ce6c05f and 72fc293.

📒 Files selected for processing (1)
  • crates/lib/src/model/merkle_tree/node_type.rs

Comment thread crates/lib/src/model/merkle_tree/node_type.rs
@malcolmgreaves malcolmgreaves force-pushed the mg/1_TMerkleTreeNode_object_safety branch from 72fc293 to 4b3d9a4 Compare April 1, 2026 23:25
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
crates/lib/src/model/merkle_tree/node_type.rs (2)

48-51: Expose the invalid byte on this public error.

from_u8 is public, but callers outside this module cannot inspect which tag failed because InvalidMerkleTreeNodeType hides its only payload. If you want to keep construction private, a small accessor is enough.

♻️ Proposed refactor
 pub struct InvalidMerkleTreeNodeType(u8);
+
+impl InvalidMerkleTreeNodeType {
+    pub fn invalid_byte(&self) -> u8 {
+        self.0
+    }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 48 - 51, The
public error type InvalidMerkleTreeNodeType currently hides its payload so
callers of from_u8 cannot see the invalid byte; add a public accessor on
InvalidMerkleTreeNodeType (e.g. impl InvalidMerkleTreeNodeType { pub fn
invalid_byte(&self) -> u8 { self.0 } }) so external code can inspect the
offending u8 while keeping the tuple field private and preserving the existing
error message and derives; update any callsites that need the byte to use
invalid_byte().

166-181: Add behavioral tests for the new on-disk mapping helpers.

test_nodes_implement_trait only checks compile-time impl coverage. It doesn't protect the to_u8/from_u8 round-trip or the invalid-byte path, which are the actual compatibility contract in this file.

🧪 Proposed tests
     #[test]
     fn test_nodes_implement_trait() {
         /// this only exists so we can check that all node types implement `TMerkleTreeNode`
         /// it will be a compile-time error if a node type does not implement the trait
         fn is_tmerkletreenode<T: TMerkleTreeNode>(_: T)
         where
             OxenError: From<T::SerializationError>,
         {
         }
 
         is_tmerkletreenode(CommitNode::default());
         is_tmerkletreenode(DirNode::default());
         is_tmerkletreenode(VNode::default());
         is_tmerkletreenode(FileNode::default());
         is_tmerkletreenode(FileChunkNode::default());
     }
+
+    #[test]
+    fn test_node_type_round_trip() {
+        for node_type in [
+            MerkleTreeNodeType::Commit,
+            MerkleTreeNodeType::Dir,
+            MerkleTreeNodeType::VNode,
+            MerkleTreeNodeType::File,
+            MerkleTreeNodeType::FileChunk,
+        ] {
+            assert_eq!(MerkleTreeNodeType::from_u8(node_type.to_u8()).unwrap(), node_type);
+        }
+    }
+
+    #[test]
+    fn test_invalid_node_type_byte() {
+        assert!(matches!(
+            MerkleTreeNodeType::from_u8(255),
+            Err(InvalidMerkleTreeNodeType(_))
+        ));
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 166 - 181, Extend
the tests to verify runtime behavior of the on-disk mapping helpers: for each
node type used in test_nodes_implement_trait (CommitNode, DirNode, VNode,
FileNode, FileChunkNode) call the node's to_u8 (or equivalent mapping function)
and then from_u8 to assert the round-trip yields the original enum variant (or
an equivalent equality/assertion), and add a separate test that passes an
invalid/unknown byte value into from_u8 to assert it returns the expected
error/None result; update or add tests near test_nodes_implement_trait and
reference TMerkleTreeNode, to_u8, from_u8 to ensure coverage of both valid
round-trips and invalid-byte handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 94-96: The panic message in MerkleTreeNodeType::from_u8_unwrap
uses a literal "{val}" instead of the actual value and loses caller location;
annotate from_u8_unwrap with #[track_caller] and change the error to include the
value (e.g., call from_u8(val).unwrap_or_else(|| panic!("Invalid
MerkleTreeNodeType: {}", val)) or use expect with format!) so the actual u8 is
printed and the caller location is preserved; reference functions
MerkleTreeNodeType::from_u8_unwrap and MerkleTreeNodeType::from_u8 when applying
this change.

---

Nitpick comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 48-51: The public error type InvalidMerkleTreeNodeType currently
hides its payload so callers of from_u8 cannot see the invalid byte; add a
public accessor on InvalidMerkleTreeNodeType (e.g. impl
InvalidMerkleTreeNodeType { pub fn invalid_byte(&self) -> u8 { self.0 } }) so
external code can inspect the offending u8 while keeping the tuple field private
and preserving the existing error message and derives; update any callsites that
need the byte to use invalid_byte().
- Around line 166-181: Extend the tests to verify runtime behavior of the
on-disk mapping helpers: for each node type used in test_nodes_implement_trait
(CommitNode, DirNode, VNode, FileNode, FileChunkNode) call the node's to_u8 (or
equivalent mapping function) and then from_u8 to assert the round-trip yields
the original enum variant (or an equivalent equality/assertion), and add a
separate test that passes an invalid/unknown byte value into from_u8 to assert
it returns the expected error/None result; update or add tests near
test_nodes_implement_trait and reference TMerkleTreeNode, to_u8, from_u8 to
ensure coverage of both valid round-trips and invalid-byte handling.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1e3e88d7-ac2e-487d-9896-76f4d9ab3dd4

📥 Commits

Reviewing files that changed from the base of the PR and between 72fc293 and 4b3d9a4.

📒 Files selected for processing (1)
  • crates/lib/src/model/merkle_tree/node_type.rs

Comment thread crates/lib/src/model/merkle_tree/node_type.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
crates/lib/src/model/merkle_tree/node_type.rs (1)

94-96: ⚠️ Potential issue | 🟡 Minor

Interpolate the invalid byte and preserve the caller location.

Line 95's expect("Invalid MerkleTreeNodeType: {val}") prints {val} literally, and without #[track_caller] the panic is reported at this helper instead of the bad call site.

🐛 Proposed fix
+    #[track_caller]
     pub fn from_u8_unwrap(val: u8) -> MerkleTreeNodeType {
-        Self::from_u8(val).expect("Invalid MerkleTreeNodeType: {val}")
+        match Self::from_u8(val) {
+            Ok(node_type) => node_type,
+            Err(_) => panic!("Invalid MerkleTreeNodeType: {}", val),
+        }
     }
In Rust, does `Result::expect("Invalid MerkleTreeNodeType: {val}")` interpolate `{val}`? Also, does annotating a wrapper like `from_u8_unwrap` with `#[track_caller]` preserve the original caller location for the panic?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 94 - 96, The
helper from_u8_unwrap currently passes a literal "{val}" to expect and doesn't
preserve the caller location; update the function
MerkleTreeNodeType::from_u8_unwrap to (1) interpolate the byte into the panic
message by constructing the string (e.g., using format! with the val) instead of
a raw literal, and (2) add #[track_caller] to the from_u8_unwrap definition so
any panic from expect reports the original caller site rather than inside this
helper.
crates/lib/src/core/db/merkle_node/merkle_node_db.rs (1)

325-328: ⚠️ Potential issue | 🟠 Major

Return invalid node-type bytes as Err, not panics.

These bytes are loaded straight from the node files, so from_u8_unwrap() means one corrupt record—or a repo written with a newer node variant—can abort open()/map() instead of surfacing an OxenError.

🐛 Proposed fix
         let dtype = lookup
             .as_ref()
-            .map(|l| MerkleTreeNodeType::from_u8_unwrap(l.data_type))
+            .map(|l| MerkleTreeNodeType::from_u8(l.data_type))
+            .transpose()?
             .unwrap_or(MerkleTreeNodeType::Commit);
-        let data_type = MerkleTreeNodeType::from_u8_unwrap(lookup.data_type);
+        let data_type = MerkleTreeNodeType::from_u8(lookup.data_type)?;
-            let dtype = MerkleTreeNodeType::from_u8_unwrap(*dtype);
+            let dtype = MerkleTreeNodeType::from_u8(*dtype)?;

Also applies to: 498-499, 517-517

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs` around lines 325 - 328,
Replace uses of MerkleTreeNodeType::from_u8_unwrap (which can panic) with a safe
conversion that returns an Err when the byte is invalid: call the non-panicking
conversion (e.g., MerkleTreeNodeType::from_u8 or equivalent) and if it returns
None/Err, return an OxenError (or map into the function's Result) with context
including the invalid byte value and node id. Update the code paths where
lookup.as_ref().map(...).unwrap_or(...) is used (the snippet using lookup and
MerkleTreeNodeType::from_u8_unwrap and the other occurrences around the other
two locations) to propagate a proper Err instead of panicking so open()/map()
surfaces an OxenError for corrupt/unknown node types.
🧹 Nitpick comments (1)
crates/lib/src/core/db/merkle_node/merkle_node_db.rs (1)

257-261: Add ?Sized if these helpers are supposed to accept dyn TMerkleTreeNode.

Lines 257, 274, 367, and 415 use N: TMerkleTreeNode, which implicitly requires Sized. This prevents passing &dyn TMerkleTreeNode trait objects. Adding ?Sized to these generic bounds would allow dynamic dispatch if needed.

♻️ Proposed fix
-    pub fn open_read_write_if_not_exists<N: TMerkleTreeNode>(
+    pub fn open_read_write_if_not_exists<N: TMerkleTreeNode + ?Sized>(
         repo: &LocalRepository,
         node: &N,
         parent_id: Option<MerkleHash>,
     ) -> Result<Option<Self>, OxenError> {
@@
-    pub fn open_read_write<N: TMerkleTreeNode>(
+    pub fn open_read_write<N: TMerkleTreeNode + ?Sized>(
         repo: &LocalRepository,
         node: &N,
         parent_id: Option<MerkleHash>,
     ) -> Result<Self, OxenError> {
@@
-    fn write_node<N: TMerkleTreeNode>(
+    fn write_node<N: TMerkleTreeNode + ?Sized>(
         &mut self,
         node: &N,
         parent_id: Option<MerkleHash>,
     ) -> Result<(), OxenError> {
@@
-    pub fn add_child<N: TMerkleTreeNode>(&mut self, item: &N) -> Result<(), OxenError> {
+    pub fn add_child<N: TMerkleTreeNode + ?Sized>(&mut self, item: &N) -> Result<(), OxenError> {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs` around lines 257 - 261,
Change the generic bound from N: TMerkleTreeNode to N: TMerkleTreeNode + ?Sized
for the helper functions that need to accept trait objects (e.g.,
open_read_write_if_not_exists and the other methods at the same sites). Update
the generic declarations for those functions (the ones currently declared with
N: TMerkleTreeNode) to use N: TMerkleTreeNode + ?Sized so callers can pass &dyn
TMerkleTreeNode; keep parameters as &N and no other API changes are required.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/repositories/tree.rs`:
- Around line 1081-1088: The doc comment for the tree-writing function still
references a generic serialization error type `S` and the requirement that all
node types share the same `S`; update the comment to reflect the current
implementation where `TMerkleTreeNode` uses the concrete
rmp_serde::encode::Error (see crates/lib/src/model/merkle_tree/node_type.rs) or
simply remove the stale note about `S` so the comment no longer claims a generic
serialization type requirement; ensure references to the function that writes
the full tree (the doc block in crates/lib/src/repositories/tree.rs) mention
that serialization errors are now fixed to rmp_serde::encode::Error or omit the
`S` discussion entirely.

---

Duplicate comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 325-328: Replace uses of MerkleTreeNodeType::from_u8_unwrap (which
can panic) with a safe conversion that returns an Err when the byte is invalid:
call the non-panicking conversion (e.g., MerkleTreeNodeType::from_u8 or
equivalent) and if it returns None/Err, return an OxenError (or map into the
function's Result) with context including the invalid byte value and node id.
Update the code paths where lookup.as_ref().map(...).unwrap_or(...) is used (the
snippet using lookup and MerkleTreeNodeType::from_u8_unwrap and the other
occurrences around the other two locations) to propagate a proper Err instead of
panicking so open()/map() surfaces an OxenError for corrupt/unknown node types.

In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 94-96: The helper from_u8_unwrap currently passes a literal
"{val}" to expect and doesn't preserve the caller location; update the function
MerkleTreeNodeType::from_u8_unwrap to (1) interpolate the byte into the panic
message by constructing the string (e.g., using format! with the val) instead of
a raw literal, and (2) add #[track_caller] to the from_u8_unwrap definition so
any panic from expect reports the original caller site rather than inside this
helper.

---

Nitpick comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 257-261: Change the generic bound from N: TMerkleTreeNode to N:
TMerkleTreeNode + ?Sized for the helper functions that need to accept trait
objects (e.g., open_read_write_if_not_exists and the other methods at the same
sites). Update the generic declarations for those functions (the ones currently
declared with N: TMerkleTreeNode) to use N: TMerkleTreeNode + ?Sized so callers
can pass &dyn TMerkleTreeNode; keep parameters as &N and no other API changes
are required.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 042f2c23-3d27-4244-a7fc-7c6128a4ccc0

📥 Commits

Reviewing files that changed from the base of the PR and between 4b3d9a4 and 79b5e0a.

📒 Files selected for processing (3)
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/model/merkle_tree/node_type.rs
  • crates/lib/src/repositories/tree.rs

Comment thread crates/lib/src/repositories/tree.rs
@malcolmgreaves
Copy link
Copy Markdown
Collaborator Author

Made one sweeping change that hopefully simplifies things a bit for this review. Instead of having a generic SerializationError for the trait, I've hardcoded it to be a serde serialization error (rmp_serde::encode::Error). We lose the ability to change the error type w/o changing the trait, but after thinking on it a bit more, I feel like this is an ok tradeoff. If we do change how we serialize nodes into bytes, we will need to create a migration as the on-disk format will change.

This cuts down on a lot of the syntax -- there's no more where clauses constraining the SerailizationError to be convertible into an OxenError nor the constraint that these error types have to align in p_write_tree. The OxenError::RmpEncodeError already takes care of converting the serialization error into an OxenError.

I also made the change to use the new from_u8 function that returns an error instead of panic!ing on an error. This has its own OxenError::MerkleTreeError and conversion from the more specific error type that is returned.

@malcolmgreaves malcolmgreaves changed the title Make trait TMerkleTreeNode dyn compatible (fka object safe) Make trait TMerkleTreeNode dyn compatible + rm from_u8 panic! Apr 2, 2026
@malcolmgreaves malcolmgreaves changed the title Make trait TMerkleTreeNode dyn compatible + rm from_u8 panic! Make TMerkleTreeNode dyn compatible + rm from_u8 panic! Apr 2, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Line 70: In node_type.rs replace the misspelled word "conflcit" with
"conflict" in the comment text (the sentence containing "value to not conflcit
with any existing variants") so the comment reads "value to not conflict with
any existing variants"; ensure only the typo is corrected and surrounding
wording/formatting is preserved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d0bb04b8-4470-422f-9948-0e7cae43b2dd

📥 Commits

Reviewing files that changed from the base of the PR and between 79b5e0a and 66a3d15.

📒 Files selected for processing (3)
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/model/merkle_tree/node_type.rs
  • crates/lib/src/repositories/tree.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/lib/src/repositories/tree.rs

changed for existing `MerkleTreeNode` variants.

**NEW** variants can be added without a migration. New variants must have an incremented `u8`
value to not conflcit with any existing variants. Older repositories will still be able to be
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Typo: "conflcit" → "conflict".

📝 Proposed fix
-    **NEW** variants can be added without a migration. New variants must have an incremented `u8`
-    value to not conflcit with any existing variants. Older repositories will still be able to be
+    **NEW** variants can be added without a migration. New variants must have an incremented `u8`
+    value to not conflict with any existing variants. Older repositories will still be able to be
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
value to not conflcit with any existing variants. Older repositories will still be able to be
value to not conflict with any existing variants. Older repositories will still be able to be
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/model/merkle_tree/node_type.rs` at line 70, In node_type.rs
replace the misspelled word "conflcit" with "conflict" in the comment text (the
sentence containing "value to not conflcit with any existing variants") so the
comment reads "value to not conflict with any existing variants"; ensure only
the typo is corrected and surrounding wording/formatting is preserved.

@malcolmgreaves malcolmgreaves merged commit 086d324 into main Apr 3, 2026
8 of 9 checks passed
@malcolmgreaves malcolmgreaves deleted the mg/1_TMerkleTreeNode_object_safety branch April 3, 2026 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants