Migrate to v2 client baseline (reconcile with crates.io)#9
Conversation
The 4.x crates on crates.io (hyperfuel-format 4.0.1, hyperfuel-schema 4.0.0, hyperfuel-net-types 4.0.2, hyperfuel-client 3.0.1) were published from enviodev/hyperfuel-client-rust-v2, a private fork that diverged from this repo in late 2024 and became the de-facto source of truth. This repo's main was stuck on the 3.0.0 / 2.2.1 line with a stale Cargo.toml license, no homepage or repository metadata, and no git commit corresponding to the published 4.x crates. This commit replaces the working tree with v2's main (d618951, 2026-01-14, one commit past c185862 which produced the published hyperfuel-net-types-4.0.2) so that: - The git history matching the published crates lives in the public repo crates.io users navigate to. - Future version bumps build on a tree that matches the published Cargo.toml metadata (license MPL-2.0, capnp 0.19, polars-arrow, homepage/repository URLs, etc). - Subsequent PRs (notably the build.rs / capnp removal proposed in PR #8) can be applied to a current baseline and shipped as 4.0.x. Implemented as a single tree-replacement commit (git read-tree v2/main on top of v1 main) rather than a real merge, so review is a single diff against the existing main rather than against an unrelated-histories merge base. Known regressions vs v1 main (to be addressed in follow-up commits after tagging the published-state baseline): 1. README.md was copy-pasted from hypersync-client-rust; says "hypersync-client", links the wrong CI badge and crates.io page, and drops the quickstart example. 2. hyperfuel-format/Cargo.toml description reads "evm format library" (was "fuel network format library"). 3. hyperfuel-client/src/lib.rs top-level doc comment says "Hypersync client library for interacting with hypersync server" (wrong product). 4. hyperfuel-net-types Query/selection structs lost their field-level docstrings (from_block, to_block pagination semantics, include_all_blocks, etc). 5. examples/{asset-id,predicate-root,simple-logs,stream-logs} were removed from the workspace (v2 keeps examples on a separate `examples` branch; not yet ported to trunk). 6. Client::new dropped http1_only(), tcp_keepalive(7200s) and a separate connect_timeout from the reqwest builder. These were intentional in v1; worth confirming they weren't dropped by accident. 7. hyperfuel-format gained alloy-primitives, ethers, ethabi and sbbf-rs-safe deps that look copy-pasted from the EVM hypersync format crate and don't obviously belong in a Fuel format crate. 8. API surface lost beyond get_selected_arrow_data (PR #7): get_height_with_retry, get_arrow_data_with_retry, get_selected_data, preset_query_get_logs, and the filter_out_unselected_data client-side filter module. 9. simple_types module and use are commented out in lib.rs (dead code). 10. clippy::needless_lifetimes is silenced module-wide in hyperfuel-net-types; an internal "below 4 fields were not in the original fuel-client" comment leaks into the Query struct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rename hypersync-client to hyperfuel-client-rust and update README with new client details, quickstart example, and dependency installation instructions. Adjust configuration documentation to reflect HyperFuel terminology. Implement new preset_query_get_logs function for querying logs, and remove unused simple_types module. Update Cargo.toml descriptions and clean up unnecessary dependencies.
- README quickstart now uses the generic fuel.hypersync.xyz endpoint (matches Client::new's default), not the chain-id-15 testnet URL. - hyperfuel-format: drop xxhash-rust and nohash-hasher from Cargo.toml; nothing in hyperfuel-format/src/ references either, they were leftover EVM-crate baggage. - hyperfuel-client/src/lib.rs: delete the three commented-out /* ... */ stubs (collect_events, get_events, stream_events) that referenced the now-deleted Event / EventResponse types and would no longer compile if uncommented. - hyperfuel-client/src/config.rs: switch HexOutput to #[derive(Default)] + #[default] on NoEncode, matching the JoinMode pattern. Silences clippy::derivable_impls under -D warnings. cargo check --workspace and cargo clippy --workspace --all-targets -- -D warnings are both clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the last two commented-out method blocks in
hyperfuel-client/src/lib.rs:
- A /* */ block for `pub async fn collect(...) -> Result<QueryResponse>`
that constructed `QueryResponse { ..., rollback_guard: None }`.
- A //-line-commented `pub async fn stream(...)` block with the same
rollback_guard pattern.
Neither would compile if uncommented: `QueryResponse` in
hyperfuel-client/src/types.rs has no `rollback_guard` field. These
were leftover from the abandoned port of hypersync EVM client code,
same lineage as the Event / EventResponse stubs removed in b55ea2d.
cargo clippy --workspace --all-targets -- -D warnings is clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR performs a major version upgrade (v4) of the HyperFuel client library, migrating from arrow2 to polars-arrow storage, introducing async streaming APIs with parquet output, and refactoring the client configuration and response type system. License changed from MIT to MPL-2.0 across all crates. ChangesHyperFuel v4 Core Refactor
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
hyperfuel-client/tests/api_test.rs (1)
1-523:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftReintroduce at least one compiled smoke integration test before merge.
Line 1 through Line 523 disables the entire integration suite, leaving no runtime coverage for query/stream/parsing paths in this crate. Please keep a minimal compiled test (even
#[ignore]) that exercises one end-to-end happy path against fixtures or a mocked response to prevent silent regressions during the v2 baseline cutover.Suggested minimal direction
+#[tokio::test(flavor = "multi_thread")] +#[ignore = "requires network or fixture refresh during v4 migration"] +async fn smoke_query_path_compiles_and_runs() { + let client = Client::new(ClientConfig::default()).unwrap(); + // Use a tiny deterministic query/fixture-backed path here. + // Assert response envelope shape (e.g., next_block monotonic, non-panicking decode path). +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hyperfuel-client/tests/api_test.rs` around lines 1 - 523, The integration suite was fully commented out; reintroduce a minimal compiled smoke test by uncommenting and enabling the parse_nameless_abi test and its helper get_file_path function, and ensure the JsonAbi import is present (uncomment alloy_json_abi::JsonAbi). Specifically, restore the parse_nameless_abi function, the get_file_path function, and the JsonAbi import so parse_nameless_abi calls get_file_path("nameless.abi.json") and parses it; you may mark the test #[ignore] if you want it compiled but skipped by default.
🧹 Nitpick comments (4)
hyperfuel-client/src/from_arrow.rs (1)
24-82: 🏗️ Heavy liftAcknowledged tech debt:
expect()calls throughoutBlockHeader::from_arrow.The comment at line 24 notes this should be improved. The
expect()calls (e.g., lines 51, 54, 57, etc.) will panic if required columns are missing or null. This is acceptable if server schema is guaranteed, but risky otherwise.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hyperfuel-client/src/from_arrow.rs` around lines 24 - 82, BlockHeader::from_arrow currently panics via unwrap/expect when columns or values are missing; change it to return a Result<Vec<BlockHeader>, ArrowConversionError> (or another crate error type) and propagate errors instead of panicking: update the signature of FromArrow::from_arrow to return Result<..., _>, replace id: map_binary(...).unwrap() and all .expect(...) usages with fallible conversions that map missing columns/nulls into a descriptive ArrowConversionError (use ? on map_binary and on column lookups like batch.column::<...>("...") and on arr.get(idx) conversions), and update callers to handle the Result. Ensure you reference BlockHeader::from_arrow, ArrowBatch, map_binary, and the various column variables (id, da_height, transactions_count, etc.) when implementing the error type and propagation.hyperfuel-client/src/rayon_async.rs (3)
10-13: 💤 Low valueDocument panic handling behavior.
When the rayon task panics, the sender is dropped and the receiver will return
RecvError, losing the original panic information. This can make debugging harder, though it's a common pattern for spawn-like functions.Consider documenting this behavior in function documentation so callers understand that
RecvErrormay indicate either a panic or early receiver drop.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hyperfuel-client/src/rayon_async.rs` around lines 10 - 13, Document that the rayon::spawn closure using func() and tx.send(...) will drop the sender if the spawned task panics, causing the receiver to observe a RecvError and losing the original panic info; update the surrounding function's doc comment to explicitly state that RecvError may mean either the spawned task panicked or the sender was otherwise dropped, and recommend callers use alternate patterns (e.g., join handles or propagation channels) if they need original panic details.
3-16: ⚡ Quick winAdd documentation for public API function.
This public function lacks documentation explaining its purpose and behavior. Consider adding a doc comment that covers:
- Purpose: Bridge CPU-bound work on rayon thread pool with async Tokio code
- Use case: When to use this vs
tokio::task::spawn_blocking- Behavior: Rayon task runs to completion even if receiver is dropped (no cancellation)
- Error handling: Panics in the rayon task result in
RecvErrorwhen awaiting the receiver📝 Example documentation
/// Spawns a CPU-bound function on the rayon thread pool and returns a receiver for the result. /// /// This function bridges rayon's thread pool with Tokio's async runtime by executing /// the provided closure on a rayon worker thread and sending the result through a /// oneshot channel. /// /// # Use cases /// /// Use this for CPU-intensive batch transformations that benefit from rayon's work-stealing /// thread pool. For simple blocking operations, prefer `tokio::task::spawn_blocking`. /// /// # Behavior /// /// - The rayon task runs to completion even if the receiver is dropped /// - No cancellation support once spawned /// - If the rayon task panics, awaiting the receiver will return `RecvError` /// /// # Examples /// /// ```no_run /// # use hyperfuel_client::rayon_async; /// # async fn example() -> Result<(), Box<dyn std::error::Error>> { /// let rx = rayon_async::spawn(|| { /// // CPU-intensive work here /// 42 /// }); /// let result = rx.await?; /// assert_eq!(result, 42); /// # Ok(()) /// # } /// ``` pub fn spawn<F, T>(func: F) -> oneshot::Receiver<T>🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hyperfuel-client/src/rayon_async.rs` around lines 3 - 16, Add a Rust doc comment above the public function spawn that documents its purpose and behavior: explain that spawn bridges CPU-bound work on the rayon thread pool with async Tokio code (useful for CPU-intensive batch transformations) and when to prefer it vs tokio::task::spawn_blocking, state that the rayon task runs to completion even if the returned oneshot::Receiver is dropped (no cancellation), and note error handling semantics (a panic in the rayon::spawn task results in awaiting the oneshot::Receiver returning a RecvError). Include a short example showing calling spawn(|| ...) and awaiting the returned oneshot::Receiver<T>, and ensure the comment sits directly above the pub fn spawn<F, T>(func: F) -> oneshot::Receiver<T> signature.
6-6: ⚡ Quick winRemove the overly restrictive
Syncbound onT.The
Syncbound is unnecessary. Oneshot channels transfer ownership and only requireT: Send, notSync. TheSyncbound is only needed when sharing references across threads, which does not apply here since the value moves through the channel.pub fn spawn<F, T>(func: F) -> oneshot::Receiver<T> where F: 'static + FnOnce() -> T + Send, - T: 'static + Send + Sync, + T: 'static + Send,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hyperfuel-client/src/rayon_async.rs` at line 6, The generic bound on T is overly restrictive; change occurrences of the where/bound clause `T: 'static + Send + Sync` to remove `Sync` so it becomes `T: 'static + Send` (e.g., in the oneshot/async channel types or functions in rayon_async.rs where the type parameter T is declared). Update any associated trait bounds or function signatures that reference that exact `T` bound and run cargo check/tests to ensure no other code relied on `Sync`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@hyperfuel-client/src/column_mapping.rs`:
- Around line 86-89: The error context string passed to .context(...) after
map_column contains a typo ("colum"); update the format string in that call to
read "apply cast to column '{}'" so the context uses the correct word; the
change should be made where map_column(&**col, dt).context(format!("apply cast
to colum '{}'", field.name))? is invoked (refer to map_column and field.name).
- Around line 22-38: Update the doc comments on the ColumnMapping struct so each
comment accurately describes its corresponding field: adjust the comment above
`receipt` to "Mapping for receipt data.", the comment above `input` to "Mapping
for input/trace data." (or "Mapping for input data." if preferred), and the
comment above `output` to "Mapping for decoded output data." Keep existing
`block` and `transaction` comments but confirm they say "Mapping for block
data." and "Mapping for transaction data." respectively; edit the comments in
the `ColumnMapping` struct to match these exact field names (`block`,
`transaction`, `receipt`, `input`, `output`).
- Around line 111-132: The function map_l1_fee_scalar currently uses multiple
unwraps (downcast_ref on col, std::str::from_utf8, and parse) which can panic on
bad input; change it to return errors instead: validate and convert the input
safely by checking downcast_ref::<BinaryArray<i32>>() and returning an
Err(anyhow!(...)) if absent, then iterate over col.iter() and convert each byte
slice with std::str::from_utf8(...).with_context(...)?.and_then(|s|
s.parse::<f64>().with_context(...)?), collecting into a Float64Array (or
Vec<Option<f64>>) while preserving nulls, and only then call
compute::cast::cast(&*to_box(...), &target_data_type.into(), CastOptions { ...
}) propagating errors with context (use map_err/with_context); update
signatures/returns to propagate anyhow::Error instead of panicking so callers
get proper error information.
In `@hyperfuel-client/src/from_arrow.rs`:
- Around line 10-22: The FromArrow trait and its from_arrow implementation are
currently infallible but map_binary uses unwrap() on TryFrom which can panic on
malformed external data; change from_arrow to return a Result<Vec<Self>, E>
(make FromArrow::from_arrow -> Result<Vec<Self>, Box<dyn std::error::Error>> or
a concrete error type) and update map_binary to propagate errors instead of
calling unwrap (e.g., return Option<Result<T, _>> or Result<Option<T>, _> and
convert TryFrom errors into the trait's error type), updating callers of
FromArrow::from_arrow accordingly so failures from TryFrom on BinaryArray are
surfaced rather than panicking.
In `@hyperfuel-client/src/lib.rs`:
- Around line 262-264: The code currently treats a missing ArchiveHeight.height
as 0 (height.height.unwrap_or(0)), which masks server errors; change the logic
that consumes res.json().await into ArchiveHeight so that if height.height is
None you return an Err (e.g., using .ok_or_else(...).context("missing height in
/height response") or anyhow::bail with a clear message) instead of defaulting
to 0; locate the ArchiveHeight type and the call site where you parse
res.json().await and replace unwrap_or(0) with an explicit error path that
propagates a failure to the caller.
- Around line 371-373: The code in get_arrow_impl uses
tokio::task::block_in_place which panics on current-thread runtimes; replace
that call with tokio::task::spawn_blocking to run parse_query_response off the
async thread. Specifically, change the block_in_place invocation that runs
parse_query_response(&bytes) to a spawn_blocking(move ||
parse_query_response(&bytes)) call, .await the JoinHandle, propagate or map any
JoinError and the parse_query_response context error so the function's ?
handling still works, and ensure get_arrow/get_arrow_impl signatures remain
unchanged while returning the same error type.
In `@hyperfuel-client/src/parquet_out.rs`:
- Around line 61-62: The loop using rx.recv().await currently uses the ? on
resp.context("get query response") which can early-return and leave writer tasks
running; instead, change the error handling in the receive/send loop (the
rx.recv() unwrap/context and any other uses of ? inside that loop) to capture
errors (e.g., map_err or match) and break the loop with an error value instead
of returning immediately, then after the loop explicitly drop the senders and
iterate over the parquet writer task handles (e.g., writer_handles or
writer_tasks) to close/flushing/join them (await their JoinHandles) before
finally returning the error; ensure the same pattern is applied to all sections
referenced (the blocks around lines 66-128 and 131-156) so writers always finish
and write footers on error.
- Around line 329-330: The closure passed to flat_map currently calls
array_to_columns(...).unwrap(), which will panic on failure inside the
rayon_async::spawn worker; instead make the closure return a Result and
propagate the error using the ? operator (i.e., replace unwrap() with ? on the
array_to_columns call), adjust the surrounding iterator combinators to operate
on Results (e.g., use flat_map/try_flat_map or map then collect with Result) and
ensure the error type bubbles out of the rayon_async::spawn task into the
enclosing function's Result; locate the flat_map closure and change its
signature/returns so array_to_columns errors are returned (not panicked) and
handled by the caller.
In `@hyperfuel-client/src/stream.rs`:
- Around line 53-75: The initial forward bootstrap currently sends the first
mapped response before applying the entity-count caps, letting callers receive
too many entities; modify the block handling client.get_arrow(&query) so that
after obtaining res and mapping via map_responses(config.clone(), vec![res],
reverse).await you apply the same num_* accounting and max_num_* trimming logic
used elsewhere (the logic that inspects each Resp and updates num_* and may drop
or partially trim entities) before calling tx.send(Ok(res)). Ensure you also
update query.from_block only after any trimming so next queries start from the
correct next_block; reference functions/vars: client.get_arrow, map_responses,
tx.send, query.from_block, and the num_*/max_num_* accounting used in the later
191-209 block.
- Around line 361-414: reverse_array currently only handles legacy
BinaryArray/Utf8Array concrete types and will fail for view-backed arrays;
update reverse_array to also match and handle BinaryView and Utf8View cases by
downcasting array.as_any() to BinaryView::<i32> and Utf8View::<i32>
(respectively), using their iter() (and .rev()) like the existing arms, and
returning the boxed result (mirroring the BinaryArray/Utf8Array arms). Keep the
existing Boolean/UInt8/UInt64 arms unchanged and ensure the new
BinaryView/Utf8View branches return Box<dyn Array> in the same fashion as the
other match arms.
- Around line 33-41: Validate the stream configuration immediately after reading
values and before creating step or starting the pipeline: ensure concurrency > 0
and batch_size > 0, check min_batch_size <= batch_size <= max_batch_size, and
response_size_floor <= response_size_ceiling (and none are zero where zero is
invalid). If any invariant fails, return an Err (or propagate a validation
error) from the public constructor/initializer (the function that creates the
stream in stream.rs) instead of proceeding; this prevents BlockRangeIterator
from stalling and the scheduler from breaking. Use the existing symbols
concurrency, batch_size, max_batch_size, min_batch_size, response_size_ceiling,
response_size_floor, reverse and step to locate where to add these checks.
In `@hyperfuel-format/src/types/quantity.rs`:
- Around line 27-30: The From<u64> impl stores all 8 BE bytes causing
leading-zero representations; update the impl in the impl From<u64> for Quantity
block so it canonicalizes the byte representation by trimming leading 0x00 bytes
(but leaving a single 0x00 for value == 0) before constructing Quantity, thereby
preserving Quantity invariants used by equality/hash; locate the fn from(value:
u64) and replace the raw to_be_bytes().into() construction with a
trimmed/minimal-byte sequence construction that yields the canonical internal
form.
In `@hyperfuel-schema/src/lib.rs`:
- Around line 203-238: The code currently only treats DataType::Utf8 as UTF-8
and uses unwrap() on downcasts, causing panics for Utf8View columns; update the
type check to accept both DataType::Utf8 and DataType::Utf8View, and replace all
unwrap() calls (the downcast_ref::<Utf8Array<i32>>() and
downcast_ref::<BinaryArray<i32>>() usages and the final
compute::cast::binary_to_utf8().unwrap()) with fallible operations that return
Err/Context on failure so the surrounding function remains fully fallible;
specifically, in the closure mapping chunks (where is_utf8 is set), test for
DataType::Utf8 | DataType::Utf8View, and when converting arrays (the blocks
using Utf8Array::to_binary and compute::cast::binary_to_utf8) attempt downcasts
safely and propagate errors with context rather than panicking.
In `@hyperfuel-schema/src/util.rs`:
- Around line 5-15: In project_schema, do not silently ignore names in
field_selection that aren't present in schema.fields; instead detect missing
entries and fail fast: change the signature of project_schema to return
Result<Schema, Error> (or anyhow::Error), while iterating over field_selection
(used with select_indices) collect any missing field names into a Vec<String>,
and if non-empty return an Err containing a clear message listing the missing
names and context (e.g., which projection and schema). Update callers of
project_schema accordingly to handle the Result. Ensure references to
project_schema, schema.fields, and field_selection are used to locate and
implement this change.
In `@README.md`:
- Line 15: The "### Dependencies" heading violates MD001 by jumping from H1 to
H3; update the heading level for the "### Dependencies" line to H2 (change it to
"## Dependencies") so the document maintains proper hierarchy and satisfies
markdownlint rules, ensuring the header text "Dependencies" remains unchanged
except for the leading hashes.
---
Outside diff comments:
In `@hyperfuel-client/tests/api_test.rs`:
- Around line 1-523: The integration suite was fully commented out; reintroduce
a minimal compiled smoke test by uncommenting and enabling the
parse_nameless_abi test and its helper get_file_path function, and ensure the
JsonAbi import is present (uncomment alloy_json_abi::JsonAbi). Specifically,
restore the parse_nameless_abi function, the get_file_path function, and the
JsonAbi import so parse_nameless_abi calls get_file_path("nameless.abi.json")
and parses it; you may mark the test #[ignore] if you want it compiled but
skipped by default.
---
Nitpick comments:
In `@hyperfuel-client/src/from_arrow.rs`:
- Around line 24-82: BlockHeader::from_arrow currently panics via unwrap/expect
when columns or values are missing; change it to return a
Result<Vec<BlockHeader>, ArrowConversionError> (or another crate error type) and
propagate errors instead of panicking: update the signature of
FromArrow::from_arrow to return Result<..., _>, replace id:
map_binary(...).unwrap() and all .expect(...) usages with fallible conversions
that map missing columns/nulls into a descriptive ArrowConversionError (use ? on
map_binary and on column lookups like batch.column::<...>("...") and on
arr.get(idx) conversions), and update callers to handle the Result. Ensure you
reference BlockHeader::from_arrow, ArrowBatch, map_binary, and the various
column variables (id, da_height, transactions_count, etc.) when implementing the
error type and propagation.
In `@hyperfuel-client/src/rayon_async.rs`:
- Around line 10-13: Document that the rayon::spawn closure using func() and
tx.send(...) will drop the sender if the spawned task panics, causing the
receiver to observe a RecvError and losing the original panic info; update the
surrounding function's doc comment to explicitly state that RecvError may mean
either the spawned task panicked or the sender was otherwise dropped, and
recommend callers use alternate patterns (e.g., join handles or propagation
channels) if they need original panic details.
- Around line 3-16: Add a Rust doc comment above the public function spawn that
documents its purpose and behavior: explain that spawn bridges CPU-bound work on
the rayon thread pool with async Tokio code (useful for CPU-intensive batch
transformations) and when to prefer it vs tokio::task::spawn_blocking, state
that the rayon task runs to completion even if the returned oneshot::Receiver is
dropped (no cancellation), and note error handling semantics (a panic in the
rayon::spawn task results in awaiting the oneshot::Receiver returning a
RecvError). Include a short example showing calling spawn(|| ...) and awaiting
the returned oneshot::Receiver<T>, and ensure the comment sits directly above
the pub fn spawn<F, T>(func: F) -> oneshot::Receiver<T> signature.
- Line 6: The generic bound on T is overly restrictive; change occurrences of
the where/bound clause `T: 'static + Send + Sync` to remove `Sync` so it becomes
`T: 'static + Send` (e.g., in the oneshot/async channel types or functions in
rayon_async.rs where the type parameter T is declared). Update any associated
trait bounds or function signatures that reference that exact `T` bound and run
cargo check/tests to ensure no other code relied on `Sync`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7e1fefab-bacd-422a-b77a-b0328b3e32fd
📒 Files selected for processing (58)
.github/workflows/publish.yamlCargo.tomlLICENSEREADME.mdexamples/asset-id/Cargo.tomlexamples/asset-id/README.mdexamples/asset-id/src/main.rsexamples/predicate-root/Cargo.tomlexamples/predicate-root/README.mdexamples/predicate-root/src/main.rsexamples/simple-logs/Cargo.tomlexamples/simple-logs/README.mdexamples/simple-logs/src/main.rsexamples/stream-logs/Cargo.tomlexamples/stream-logs/README.mdexamples/stream-logs/src/main.rshyperfuel-client/Cargo.tomlhyperfuel-client/src/column_mapping.rshyperfuel-client/src/config.rshyperfuel-client/src/filter.rshyperfuel-client/src/from_arrow.rshyperfuel-client/src/lib.rshyperfuel-client/src/parquet_out.rshyperfuel-client/src/parse_response.rshyperfuel-client/src/rayon_async.rshyperfuel-client/src/stream.rshyperfuel-client/src/transport_format.rshyperfuel-client/src/types.rshyperfuel-client/src/util.rshyperfuel-client/test-data/ens_token_abi.jsonhyperfuel-client/test-data/erc20.abi.jsonhyperfuel-client/test-data/nameless.abi.jsonhyperfuel-client/tests/api_test.rshyperfuel-format/Cargo.tomlhyperfuel-format/src/error.rshyperfuel-format/src/lib.rshyperfuel-format/src/types/data.rshyperfuel-format/src/types/fixed_size_data.rshyperfuel-format/src/types/mod.rshyperfuel-format/src/types/quantity.rshyperfuel-format/src/types/transaction_status.rshyperfuel-format/src/types/transaction_type.rshyperfuel-format/src/types/uint.rshyperfuel-format/src/types/util.rshyperfuel-format/test-data/arbitrum_tx_receipt.jsonhyperfuel-format/test-data/arbitrum_tx_receipt2.jsonhyperfuel-format/test-data/base_tx_receipt.jsonhyperfuel-format/test-data/block_with_tx.jsonhyperfuel-format/test-data/block_without_tx.jsonhyperfuel-format/test-data/log.jsonhyperfuel-format/test-data/transaction.jsonhyperfuel-format/test-data/transaction_receipt.jsonhyperfuel-format/tests/deserialize_json.rshyperfuel-net-types/Cargo.tomlhyperfuel-net-types/src/lib.rshyperfuel-schema/Cargo.tomlhyperfuel-schema/src/lib.rshyperfuel-schema/src/util.rs
💤 Files with no reviewable changes (23)
- examples/asset-id/README.md
- examples/stream-logs/README.md
- examples/predicate-root/README.md
- hyperfuel-format/test-data/transaction.json
- hyperfuel-format/test-data/log.json
- examples/simple-logs/src/main.rs
- hyperfuel-format/test-data/arbitrum_tx_receipt.json
- examples/asset-id/src/main.rs
- hyperfuel-format/test-data/base_tx_receipt.json
- examples/stream-logs/src/main.rs
- hyperfuel-client/src/filter.rs
- hyperfuel-client/src/transport_format.rs
- examples/simple-logs/Cargo.toml
- hyperfuel-format/test-data/arbitrum_tx_receipt2.json
- hyperfuel-format/tests/deserialize_json.rs
- examples/stream-logs/Cargo.toml
- hyperfuel-format/test-data/transaction_receipt.json
- examples/simple-logs/README.md
- examples/predicate-root/src/main.rs
- Cargo.toml
- examples/predicate-root/Cargo.toml
- examples/asset-id/Cargo.toml
- hyperfuel-format/test-data/block_without_tx.json
| pub struct ColumnMapping { | ||
| /// Mapping for block data. | ||
| #[serde(default)] | ||
| pub block: BTreeMap<String, DataType>, | ||
| /// Mapping for transaction data. | ||
| #[serde(default)] | ||
| pub transaction: BTreeMap<String, DataType>, | ||
| /// Mapping for log data. | ||
| #[serde(default)] | ||
| pub receipt: BTreeMap<String, DataType>, | ||
| /// Mapping for trace data. | ||
| #[serde(default)] | ||
| pub input: BTreeMap<String, DataType>, | ||
| /// Mapping for decoded log data. | ||
| #[serde(default)] | ||
| pub output: BTreeMap<String, DataType>, | ||
| } |
There was a problem hiding this comment.
Doc comments don't match field names.
The comments appear to be copy-pasted from an EVM-centric codebase and don't match the Fuel-specific field names:
- Line 30: "Mapping for log data" → should be "receipt"
- Line 33: "Mapping for trace data" → should be "input"
- Line 36: "Mapping for decoded log data" → should be "output"
📝 Proposed fix
- /// Mapping for log data.
+ /// Mapping for receipt data.
#[serde(default)]
pub receipt: BTreeMap<String, DataType>,
- /// Mapping for trace data.
+ /// Mapping for input data.
#[serde(default)]
pub input: BTreeMap<String, DataType>,
- /// Mapping for decoded log data.
+ /// Mapping for output data.
#[serde(default)]
pub output: BTreeMap<String, DataType>,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub struct ColumnMapping { | |
| /// Mapping for block data. | |
| #[serde(default)] | |
| pub block: BTreeMap<String, DataType>, | |
| /// Mapping for transaction data. | |
| #[serde(default)] | |
| pub transaction: BTreeMap<String, DataType>, | |
| /// Mapping for log data. | |
| #[serde(default)] | |
| pub receipt: BTreeMap<String, DataType>, | |
| /// Mapping for trace data. | |
| #[serde(default)] | |
| pub input: BTreeMap<String, DataType>, | |
| /// Mapping for decoded log data. | |
| #[serde(default)] | |
| pub output: BTreeMap<String, DataType>, | |
| } | |
| pub struct ColumnMapping { | |
| /// Mapping for block data. | |
| #[serde(default)] | |
| pub block: BTreeMap<String, DataType>, | |
| /// Mapping for transaction data. | |
| #[serde(default)] | |
| pub transaction: BTreeMap<String, DataType>, | |
| /// Mapping for receipt data. | |
| #[serde(default)] | |
| pub receipt: BTreeMap<String, DataType>, | |
| /// Mapping for input data. | |
| #[serde(default)] | |
| pub input: BTreeMap<String, DataType>, | |
| /// Mapping for output data. | |
| #[serde(default)] | |
| pub output: BTreeMap<String, DataType>, | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-client/src/column_mapping.rs` around lines 22 - 38, Update the doc
comments on the ColumnMapping struct so each comment accurately describes its
corresponding field: adjust the comment above `receipt` to "Mapping for receipt
data.", the comment above `input` to "Mapping for input/trace data." (or
"Mapping for input data." if preferred), and the comment above `output` to
"Mapping for decoded output data." Keep existing `block` and `transaction`
comments but confirm they say "Mapping for block data." and "Mapping for
transaction data." respectively; edit the comments in the `ColumnMapping` struct
to match these exact field names (`block`, `transaction`, `receipt`, `input`,
`output`).
| } else { | ||
| map_column(&**col, dt) | ||
| .context(format!("apply cast to colum '{}'", field.name))? | ||
| } |
There was a problem hiding this comment.
Typo in error context message.
Line 88 has "colum" instead of "column".
📝 Proposed fix
- .context(format!("apply cast to colum '{}'", field.name))?
+ .context(format!("apply cast to column '{}'", field.name))?📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } else { | |
| map_column(&**col, dt) | |
| .context(format!("apply cast to colum '{}'", field.name))? | |
| } | |
| } else { | |
| map_column(&**col, dt) | |
| .context(format!("apply cast to column '{}'", field.name))? | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-client/src/column_mapping.rs` around lines 86 - 89, The error
context string passed to .context(...) after map_column contains a typo
("colum"); update the format string in that call to read "apply cast to column
'{}'" so the context uses the correct word; the change should be made where
map_column(&**col, dt).context(format!("apply cast to colum '{}'", field.name))?
is invoked (refer to map_column and field.name).
| pub fn map_l1_fee_scalar( | ||
| col: &dyn Array, | ||
| target_data_type: DataType, | ||
| ) -> Result<Box<dyn Array + 'static>> { | ||
| let col = col.as_any().downcast_ref::<BinaryArray<i32>>().unwrap(); | ||
| let col = Float64Array::from_iter( | ||
| col.iter() | ||
| .map(|v| v.map(|v| std::str::from_utf8(v).unwrap().parse().unwrap())), | ||
| ); | ||
|
|
||
| let arr = compute::cast::cast( | ||
| &*to_box(col), | ||
| &target_data_type.into(), | ||
| CastOptions { | ||
| wrapped: false, | ||
| partial: false, | ||
| }, | ||
| ) | ||
| .with_context(|| anyhow!("failed to l1_fee_scalar to {:?}", target_data_type))?; | ||
|
|
||
| Ok(arr) | ||
| } |
There was a problem hiding this comment.
Multiple unwrap() calls can cause panics on malformed data.
This public function uses unwrap() on downcast (line 115), from_utf8 (line 118), and parse (line 118). If the column isn't a BinaryArray<i32>, contains invalid UTF-8, or isn't a valid float, this will panic rather than return an error.
Since this is exposed publicly and processes external server data, consider propagating errors instead.
🔧 Proposed fix with proper error handling
pub fn map_l1_fee_scalar(
col: &dyn Array,
target_data_type: DataType,
) -> Result<Box<dyn Array + 'static>> {
- let col = col.as_any().downcast_ref::<BinaryArray<i32>>().unwrap();
+ let col = col
+ .as_any()
+ .downcast_ref::<BinaryArray<i32>>()
+ .context("l1_fee_scalar column must be BinaryArray<i32>")?;
let col = Float64Array::from_iter(
col.iter()
- .map(|v| v.map(|v| std::str::from_utf8(v).unwrap().parse().unwrap())),
+ .enumerate()
+ .map(|(i, v)| {
+ v.map(|v| {
+ std::str::from_utf8(v)
+ .map_err(|e| anyhow!("l1_fee_scalar row {}: invalid UTF-8: {}", i, e))
+ .and_then(|s| {
+ s.parse::<f64>()
+ .map_err(|e| anyhow!("l1_fee_scalar row {}: parse error: {}", i, e))
+ })
+ })
+ .transpose()
+ })
+ .collect::<Result<Vec<_>>>()?
+ .into_iter(),
);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-client/src/column_mapping.rs` around lines 111 - 132, The function
map_l1_fee_scalar currently uses multiple unwraps (downcast_ref on col,
std::str::from_utf8, and parse) which can panic on bad input; change it to
return errors instead: validate and convert the input safely by checking
downcast_ref::<BinaryArray<i32>>() and returning an Err(anyhow!(...)) if absent,
then iterate over col.iter() and convert each byte slice with
std::str::from_utf8(...).with_context(...)?.and_then(|s|
s.parse::<f64>().with_context(...)?), collecting into a Float64Array (or
Vec<Option<f64>>) while preserving nulls, and only then call
compute::cast::cast(&*to_box(...), &target_data_type.into(), CastOptions { ...
}) propagating errors with context (use map_err/with_context); update
signatures/returns to propagate anyhow::Error instead of panicking so callers
get proper error information.
| /// Used to do ArrowBatch-Native Rust type conversions while consuming the input value. | ||
| pub trait FromArrow: Sized { | ||
| /// Converts to the Vector type from the ArrowBatch type. | ||
| fn from_arrow(batch: &ArrowBatch) -> Vec<Self>; | ||
| } | ||
|
|
||
| pub fn typed_data_from_arrow_data(data: QueryResponseData) -> Result<QueryResponseDataTyped> { | ||
| let mut blocks = vec![]; | ||
| for batch in data.blocks.iter() { | ||
| let data = BlockHeader::from_arrow(batch).context("map blocks from arrow")?; | ||
| for block in data { | ||
| blocks.push(block) | ||
| } | ||
| } | ||
|
|
||
| let mut transactions = vec![]; | ||
| for batch in data.transactions.iter() { | ||
| let data = Transaction::from_arrow(batch).context("transaction from arrow")?; | ||
| for transaction in data { | ||
| transactions.push(transaction); | ||
| } | ||
| } | ||
|
|
||
| let mut receipts = vec![]; | ||
| for batch in data.receipts.iter() { | ||
| let data = Receipt::from_arrow(batch).context("map receipts from arrow")?; | ||
| for receipt in data { | ||
| receipts.push(receipt); | ||
| } | ||
| } | ||
|
|
||
| let mut inputs = vec![]; | ||
| for batch in data.inputs.iter() { | ||
| let data = Input::from_arrow(batch).context("map inputs from arrow")?; | ||
| for input in data { | ||
| inputs.push(input); | ||
| } | ||
| } | ||
|
|
||
| let mut outputs = vec![]; | ||
| for batch in data.outputs.iter() { | ||
| let data = Output::from_arrow(batch).context("map outputs from arrow")?; | ||
| for output in data { | ||
| outputs.push(output); | ||
| } | ||
| } | ||
|
|
||
| Ok(QueryResponseDataTyped { | ||
| blocks, | ||
| transactions, | ||
| receipts, | ||
| inputs, | ||
| outputs, | ||
| }) | ||
| fn map_binary<'a, T>(i: usize, arr: Option<&'a BinaryArray<i32>>) -> Option<T> | ||
| where | ||
| T: TryFrom<&'a [u8]>, | ||
| <T as TryFrom<&'a [u8]>>::Error: std::fmt::Debug, | ||
| { | ||
| arr.and_then(|arr| arr.get(i).map(|v| v.try_into().unwrap())) | ||
| } |
There was a problem hiding this comment.
Infallible trait with internal panics is risky for external data.
The FromArrow trait now returns Vec<Self> instead of Result<Vec<Self>>, but the implementation uses unwrap() (line 21) which will panic if TryFrom fails. Since this processes data from external servers, malformed responses could crash the client.
Consider either:
- Keeping the trait fallible with proper error propagation, or
- Documenting the panic behavior and ensuring server responses are validated upstream
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-client/src/from_arrow.rs` around lines 10 - 22, The FromArrow trait
and its from_arrow implementation are currently infallible but map_binary uses
unwrap() on TryFrom which can panic on malformed external data; change
from_arrow to return a Result<Vec<Self>, E> (make FromArrow::from_arrow ->
Result<Vec<Self>, Box<dyn std::error::Error>> or a concrete error type) and
update map_binary to propagate errors instead of calling unwrap (e.g., return
Option<Result<T, _>> or Result<Option<T>, _> and convert TryFrom errors into the
trait's error type), updating callers of FromArrow::from_arrow accordingly so
failures from TryFrom on BinaryArray are surfaced rather than panicking.
| let height: ArchiveHeight = res.json().await.context("read response body json")?; | ||
|
|
||
| Ok(height.height.unwrap_or(0)) |
There was a problem hiding this comment.
Treat a missing height as an invalid server response.
Falling back to 0 makes /height failures look like a healthy chain at genesis, which can silently truncate downstream queries and make health_check succeed for the wrong reason.
Suggested fix
- Ok(height.height.unwrap_or(0))
+ Ok(height
+ .height
+ .context("height endpoint returned no height")?)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-client/src/lib.rs` around lines 262 - 264, The code currently
treats a missing ArchiveHeight.height as 0 (height.height.unwrap_or(0)), which
masks server errors; change the logic that consumes res.json().await into
ArchiveHeight so that if height.height is None you return an Err (e.g., using
.ok_or_else(...).context("missing height in /height response") or anyhow::bail
with a clear message) instead of defaulting to 0; locate the ArchiveHeight type
and the call site where you parse res.json().await and replace unwrap_or(0) with
an explicit error path that propagates a failure to the caller.
| fn reverse_array(array: &dyn Array) -> Result<Box<dyn Array>> { | ||
| match array.data_type() { | ||
| ArrowDataType::Binary => Ok(BinaryArray::<i32>::from_iter( | ||
| array | ||
| .as_any() | ||
| .downcast_ref::<BinaryArray<i32>>() | ||
| .unwrap() | ||
| .iter() | ||
| .rev(), | ||
| ) | ||
| .boxed()), | ||
| ArrowDataType::Utf8 => Ok(Utf8Array::<i32>::from_iter( | ||
| array | ||
| .as_any() | ||
| .downcast_ref::<Utf8Array<i32>>() | ||
| .unwrap() | ||
| .iter() | ||
| .rev(), | ||
| ) | ||
| .boxed()), | ||
| ArrowDataType::Boolean => Ok(BooleanArray::from_iter( | ||
| array | ||
| .as_any() | ||
| .downcast_ref::<BooleanArray>() | ||
| .unwrap() | ||
| .iter() | ||
| .rev(), | ||
| ) | ||
| .boxed()), | ||
| ArrowDataType::UInt64 => Ok(UInt64Array::from_iter( | ||
| array | ||
| .as_any() | ||
| .downcast_ref::<UInt64Array>() | ||
| .unwrap() | ||
| .iter() | ||
| .map(|opt| opt.copied()) | ||
| .rev(), | ||
| ) | ||
| .boxed()), | ||
| ArrowDataType::UInt8 => Ok(UInt8Array::from_iter( | ||
| array | ||
| .as_any() | ||
| .downcast_ref::<UInt8Array>() | ||
| .unwrap() | ||
| .iter() | ||
| .map(|opt| opt.copied()) | ||
| .rev(), | ||
| ) | ||
| .boxed()), | ||
| dt => Err(anyhow!( | ||
| "reversing an array of datatype {:?} is not supported", | ||
| dt | ||
| )), | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
fd -p 'lib.rs' hyperfuel-schema hyperfuel-client | while read -r f; do
echo "== $f =="
rg -n "BinaryView|Utf8View|ArrowDataType::Binary|ArrowDataType::Utf8" "$f"
doneRepository: enviodev/hyperfuel-client-rust
Length of output: 3888
🏁 Script executed:
#!/bin/bash
# First, let's check the exact content of stream.rs around lines 361-414
fd -p 'stream.rs' hyperfuel-client/src | head -1 | xargs -I {} sh -c 'wc -l {} && echo "---" && sed -n "361,414p" {}'Repository: enviodev/hyperfuel-client-rust
Length of output: 1705
🏁 Script executed:
#!/bin/bash
# Check if reverse_array function handles BinaryView or Utf8View types
fd -p 'stream.rs' hyperfuel-client/src | head -1 | xargs -I {} rg -n "BinaryView|Utf8View" {}Repository: enviodev/hyperfuel-client-rust
Length of output: 58
🏁 Script executed:
#!/bin/bash
# Search for reverse_array calls to understand usage context
fd -p 'stream.rs' hyperfuel-client/src | head -1 | xargs -I {} rg -B5 -A5 "reverse_array" {}Repository: enviodev/hyperfuel-client-rust
Length of output: 698
reverse_array() must handle BinaryView and Utf8View types.
The schema was migrated to view-backed string/binary arrays (BinaryView, Utf8View), but the reverse_array() function only has match arms for legacy Binary and Utf8 types. When reverse mode is enabled, any batch with BinaryView or Utf8View columns will error out as unsupported.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-client/src/stream.rs` around lines 361 - 414, reverse_array
currently only handles legacy BinaryArray/Utf8Array concrete types and will fail
for view-backed arrays; update reverse_array to also match and handle BinaryView
and Utf8View cases by downcasting array.as_any() to BinaryView::<i32> and
Utf8View::<i32> (respectively), using their iter() (and .rev()) like the
existing arms, and returning the boxed result (mirroring the
BinaryArray/Utf8Array arms). Keep the existing Boolean/UInt8/UInt64 arms
unchanged and ensure the new BinaryView/Utf8View branches return Box<dyn Array>
in the same fashion as the other match arms.
| impl From<u64> for Quantity { | ||
| fn from(value: u64) -> Self { | ||
| Self(value.to_be_bytes().into()) | ||
| } |
There was a problem hiding this comment.
Canonicalize From<u64> to preserve Quantity invariants.
Line 29 stores all 8 big-endian bytes, introducing leading-zero representations. That makes semantically equal values compare/hash differently depending on constructor path.
Proposed fix
impl From<u64> for Quantity {
fn from(value: u64) -> Self {
- Self(value.to_be_bytes().into())
+ if value == 0 {
+ return Self::default();
+ }
+
+ let bytes = value.to_be_bytes();
+ let first_non_zero = bytes.iter().position(|b| *b != 0).unwrap();
+ Self::from(bytes[first_non_zero..].to_vec())
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| impl From<u64> for Quantity { | |
| fn from(value: u64) -> Self { | |
| Self(value.to_be_bytes().into()) | |
| } | |
| impl From<u64> for Quantity { | |
| fn from(value: u64) -> Self { | |
| if value == 0 { | |
| return Self::default(); | |
| } | |
| let bytes = value.to_be_bytes(); | |
| let first_non_zero = bytes.iter().position(|b| *b != 0).unwrap(); | |
| Self::from(bytes[first_non_zero..].to_vec()) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-format/src/types/quantity.rs` around lines 27 - 30, The From<u64>
impl stores all 8 BE bytes causing leading-zero representations; update the impl
in the impl From<u64> for Quantity block so it canonicalizes the byte
representation by trimming leading 0x00 bytes (but leaving a single 0x00 for
value == 0) before constructing Quantity, thereby preserving Quantity invariants
used by equality/hash; locate the fn from(value: u64) and replace the raw
to_be_bytes().into() construction with a trimmed/minimal-byte sequence
construction that yields the canonical internal form.
| let mut is_utf8 = false; | ||
| let arrs = chunks | ||
| .iter() | ||
| .map(|chunk| { | ||
| chunk | ||
| let col = chunk | ||
| .columns() | ||
| .get(col) | ||
| .map(|col| col.as_ref()) | ||
| .context("get column") | ||
| .context("get column")?; | ||
| is_utf8 = col.data_type() == &DataType::Utf8; | ||
| Ok(col) | ||
| }) | ||
| .collect::<Result<Vec<_>>>()?; | ||
| compute::concatenate::concatenate(&arrs).context("concat arrays") | ||
| if !is_utf8 { | ||
| compute::concatenate::concatenate(&arrs).context("concat arrays") | ||
| } else { | ||
| let arrs = arrs | ||
| .into_iter() | ||
| .map(|a| { | ||
| a.as_any() | ||
| .downcast_ref::<Utf8Array<i32>>() | ||
| .unwrap() | ||
| .to_binary() | ||
| .boxed() | ||
| }) | ||
| .collect::<Vec<_>>(); | ||
| let arrs = arrs.iter().map(|a| a.as_ref()).collect::<Vec<_>>(); | ||
| let arr = | ||
| compute::concatenate::concatenate(arrs.as_slice()).context("concat arrays")?; | ||
|
|
||
| Ok(compute::cast::binary_to_utf8( | ||
| arr.as_any().downcast_ref::<BinaryArray<i32>>().unwrap(), | ||
| DataType::Utf8, | ||
| ) | ||
| .unwrap() | ||
| .boxed()) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify Utf8View schema usage and Utf8-only handling/panic points in concat_chunks.
rg -n 'Utf8View|DataType::Utf8\b|downcast_ref::<Utf8Array|downcast_ref::<BinaryArray|unwrap\(\)' hyperfuel-schema/src/lib.rsRepository: enviodev/hyperfuel-client-rust
Length of output: 518
🏁 Script executed:
# Check concat_chunks function signature and surrounding context
sed -n '190,245p' hyperfuel-schema/src/lib.rs | cat -nRepository: enviodev/hyperfuel-client-rust
Length of output: 2271
🏁 Script executed:
# Verify what type concat_chunks returns
rg -B 5 'fn concat_chunks' hyperfuel-schema/src/lib.rsRepository: enviodev/hyperfuel-client-rust
Length of output: 229
🏁 Script executed:
# Check if there are other Utf8/Utf8View handling patterns in the codebase
rg -n 'Utf8View' hyperfuel-schema/src/lib.rs | head -20Repository: enviodev/hyperfuel-client-rust
Length of output: 133
🏁 Script executed:
# Check if there are other Utf8View fields in schema
rg -n 'Field::new.*DataType' hyperfuel-schema/src/lib.rs | grep -E 'Utf8|Binary'Repository: enviodev/hyperfuel-client-rust
Length of output: 3820
🏁 Script executed:
# Check where concat_chunks is called and what data flows through it
rg -n 'concat_chunks' hyperfuel-schema/src/lib.rs -B 2 -A 2Repository: enviodev/hyperfuel-client-rust
Length of output: 358
Make UTF-8 concatenation fully fallible and compatible with Utf8View.
The is_utf8 check at line 212 only matches DataType::Utf8 but the schema defines UTF-8 fields as DataType::Utf8View (e.g., the "reason" field at line 74). When a Utf8View column reaches this function, the downcast at line 224 will fail. Additionally, this function returns Result but uses unwrap() at lines 224, 234, and 237, converting recoverable errors into panics.
Proposed fix
- is_utf8 = col.data_type() == &DataType::Utf8;
+ is_utf8 = matches!(col.data_type(), DataType::Utf8 | DataType::Utf8View);
@@
- if !is_utf8 {
+ if !is_utf8 {
compute::concatenate::concatenate(&arrs).context("concat arrays")
} else {
+ // For Utf8View columns, concatenate directly to preserve the view representation.
+ if arrs
+ .iter()
+ .all(|a| a.data_type() == &DataType::Utf8View)
+ {
+ return compute::concatenate::concatenate(&arrs).context("concat utf8view arrays");
+ }
+
let arrs = arrs
.into_iter()
.map(|a| {
a.as_any()
.downcast_ref::<Utf8Array<i32>>()
- .unwrap()
+ .context("expected Utf8Array<i32> for utf8 concatenation")
+ .map(|a| a.to_binary().boxed())
- .to_binary()
- .boxed()
})
- .collect::<Vec<_>>();
+ .collect::<Result<Vec<_>>>()?;
@@
Ok(compute::cast::binary_to_utf8(
- arr.as_any().downcast_ref::<BinaryArray<i32>>().unwrap(),
+ arr.as_any()
+ .downcast_ref::<BinaryArray<i32>>()
+ .context("expected BinaryArray<i32> after utf8->binary conversion")?,
DataType::Utf8,
)
- .unwrap()
+ .context("binary_to_utf8 cast failed")?
.boxed())
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let mut is_utf8 = false; | |
| let arrs = chunks | |
| .iter() | |
| .map(|chunk| { | |
| chunk | |
| let col = chunk | |
| .columns() | |
| .get(col) | |
| .map(|col| col.as_ref()) | |
| .context("get column") | |
| .context("get column")?; | |
| is_utf8 = col.data_type() == &DataType::Utf8; | |
| Ok(col) | |
| }) | |
| .collect::<Result<Vec<_>>>()?; | |
| compute::concatenate::concatenate(&arrs).context("concat arrays") | |
| if !is_utf8 { | |
| compute::concatenate::concatenate(&arrs).context("concat arrays") | |
| } else { | |
| let arrs = arrs | |
| .into_iter() | |
| .map(|a| { | |
| a.as_any() | |
| .downcast_ref::<Utf8Array<i32>>() | |
| .unwrap() | |
| .to_binary() | |
| .boxed() | |
| }) | |
| .collect::<Vec<_>>(); | |
| let arrs = arrs.iter().map(|a| a.as_ref()).collect::<Vec<_>>(); | |
| let arr = | |
| compute::concatenate::concatenate(arrs.as_slice()).context("concat arrays")?; | |
| Ok(compute::cast::binary_to_utf8( | |
| arr.as_any().downcast_ref::<BinaryArray<i32>>().unwrap(), | |
| DataType::Utf8, | |
| ) | |
| .unwrap() | |
| .boxed()) | |
| let mut is_utf8 = false; | |
| let arrs = chunks | |
| .iter() | |
| .map(|chunk| { | |
| let col = chunk | |
| .columns() | |
| .get(col) | |
| .map(|col| col.as_ref()) | |
| .context("get column")?; | |
| is_utf8 = matches!(col.data_type(), DataType::Utf8 | DataType::Utf8View); | |
| Ok(col) | |
| }) | |
| .collect::<Result<Vec<_>>>()?; | |
| if !is_utf8 { | |
| compute::concatenate::concatenate(&arrs).context("concat arrays") | |
| } else { | |
| // For Utf8View columns, concatenate directly to preserve the view representation. | |
| if arrs | |
| .iter() | |
| .all(|a| a.data_type() == &DataType::Utf8View) | |
| { | |
| return compute::concatenate::concatenate(&arrs).context("concat utf8view arrays"); | |
| } | |
| let arrs = arrs | |
| .into_iter() | |
| .map(|a| { | |
| a.as_any() | |
| .downcast_ref::<Utf8Array<i32>>() | |
| .context("expected Utf8Array<i32> for utf8 concatenation") | |
| .map(|a| a.to_binary().boxed()) | |
| }) | |
| .collect::<Result<Vec<_>>>()?; | |
| let arrs = arrs.iter().map(|a| a.as_ref()).collect::<Vec<_>>(); | |
| let arr = | |
| compute::concatenate::concatenate(arrs.as_slice()).context("concat arrays")?; | |
| Ok(compute::cast::binary_to_utf8( | |
| arr.as_any() | |
| .downcast_ref::<BinaryArray<i32>>() | |
| .context("expected BinaryArray<i32> after utf8->binary conversion")?, | |
| DataType::Utf8, | |
| ) | |
| .context("binary_to_utf8 cast failed")? | |
| .boxed()) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-schema/src/lib.rs` around lines 203 - 238, The code currently only
treats DataType::Utf8 as UTF-8 and uses unwrap() on downcasts, causing panics
for Utf8View columns; update the type check to accept both DataType::Utf8 and
DataType::Utf8View, and replace all unwrap() calls (the
downcast_ref::<Utf8Array<i32>>() and downcast_ref::<BinaryArray<i32>>() usages
and the final compute::cast::binary_to_utf8().unwrap()) with fallible operations
that return Err/Context on failure so the surrounding function remains fully
fallible; specifically, in the closure mapping chunks (where is_utf8 is set),
test for DataType::Utf8 | DataType::Utf8View, and when converting arrays (the
blocks using Utf8Array::to_binary and compute::cast::binary_to_utf8) attempt
downcasts safely and propagate errors with context rather than panicking.
| pub fn project_schema(schema: &Schema, field_selection: &BTreeSet<String>) -> Schema { | ||
| let mut select_indices = Vec::new(); | ||
| for col_name in field_selection.iter() { | ||
| let (idx, _) = schema | ||
| if let Some((idx, _)) = schema | ||
| .fields | ||
| .iter() | ||
| .enumerate() | ||
| .find(|(_, f)| &f.name == col_name) | ||
| .context(format!("couldn't find column {col_name} in schema"))?; | ||
| select_indices.push(idx); | ||
| { | ||
| select_indices.push(idx); | ||
| } |
There was a problem hiding this comment.
Do not silently drop unknown projected fields.
The new behavior ignores missing field_selection entries and returns a partial schema. That masks typos and contract drift; this should fail fast with context.
Proposed fix
+use anyhow::{Context, Result};
use polars_arrow::datatypes::ArrowSchema as Schema;
-pub fn project_schema(schema: &Schema, field_selection: &BTreeSet<String>) -> Schema {
- let mut select_indices = Vec::new();
- for col_name in field_selection.iter() {
- if let Some((idx, _)) = schema
- .fields
- .iter()
- .enumerate()
- .find(|(_, f)| &f.name == col_name)
- {
- select_indices.push(idx);
- }
- }
-
- let schema: Schema = schema
- .fields
- .iter()
- .filter(|f| field_selection.contains(&f.name))
- .cloned()
- .collect::<Vec<_>>()
- .into();
-
- schema
+pub fn project_schema(schema: &Schema, field_selection: &BTreeSet<String>) -> Result<Schema> {
+ let mut selected = Vec::with_capacity(field_selection.len());
+ for col_name in field_selection {
+ let field = schema
+ .fields
+ .iter()
+ .find(|f| &f.name == col_name)
+ .with_context(|| format!("column '{}' not found in schema", col_name))?;
+ selected.push(field.clone());
+ }
+ Ok(selected.into())
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@hyperfuel-schema/src/util.rs` around lines 5 - 15, In project_schema, do not
silently ignore names in field_selection that aren't present in schema.fields;
instead detect missing entries and fail fast: change the signature of
project_schema to return Result<Schema, Error> (or anyhow::Error), while
iterating over field_selection (used with select_indices) collect any missing
field names into a Vec<String>, and if non-empty return an Err containing a
clear message listing the missing names and context (e.g., which projection and
schema). Update callers of project_schema accordingly to handle the Result.
Ensure references to project_schema, schema.fields, and field_selection are used
to locate and implement this change.
|
|
||
| [Documentation Page](https://docs.envio.dev/docs/hypersync-clients) | ||
|
|
||
| ### Dependencies |
There was a problem hiding this comment.
Fix heading level jump (MD001).
This heading jumps from H1 to H3. Use H2 here to satisfy markdownlint and keep hierarchy consistent.
Suggested fix
-### Dependencies
+## Dependencies📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ### Dependencies | |
| ## Dependencies |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 15-15: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@README.md` at line 15, The "### Dependencies" heading violates MD001 by
jumping from H1 to H3; update the heading level for the "### Dependencies" line
to H2 (change it to "## Dependencies") so the document maintains proper
hierarchy and satisfies markdownlint rules, ensuring the header text
"Dependencies" remains unchanged except for the leading hashes.
fd7dfcf to
32ed136
Compare
* Pre-publish hardening and version bumps Addresses CodeRabbit findings on the v2-import PR (#9) that were worth fixing before the next publish; deliberately skips theoretical/cosmetic ones and defers the larger FromArrow trait Result-typing refactor. Crate version bumps (all minor): - hyperfuel-net-types 4.0.3 -> 4.1.0 - hyperfuel-format 4.0.1 -> 4.1.0 - hyperfuel-schema 4.0.0 -> 4.1.0 - hyperfuel-client 3.0.1 -> 3.1.0 Security / correctness fixes: - hyperfuel-client/src/parse_response.rs: restore bounded capnp limits (nesting_limit(64), traversal_limit_in_words(Some(64M)) ~= 512 MiB). Previous nesting_limit(i32::MAX) + traversal_limit_in_words(None) removed the DoS safeguards capnp provides specifically for untrusted network input. - hyperfuel-format/src/types/quantity.rs: From<u64> now strips leading zeros to preserve the no-leading-zero invariant enforced by the Vec<u8>/&[u8] constructors. Previously Quantity::from(5u64) stored [0,0,0,0,0,0,0,5] which broke Eq/Hash/round-trip equality against the canonical form. New test_from_u64_canonical covers this. - hyperfuel-client/src/lib.rs get_arrow_impl: swap block_in_place for spawn_blocking. block_in_place panics on tokio current-thread runtimes (#[tokio::test], flavor = "current_thread"), making the library hostile to common configurations. - hyperfuel-client/src/lib.rs get_height_impl: propagate missing height instead of unwrap_or(0). Previously a malformed response collapsed into a successful "genesis height" result, hiding bad data from retry / health-check loops. Schema API improvement (additive, minor-safe): - hyperfuel-schema/src/util.rs: add try_project_schema returning Result<Schema> that errors when requested columns are absent. The existing project_schema is unchanged (still silently drops unknown fields) for backward compatibility. New code should prefer try_project_schema to catch typos in field selections. Cleanup: - hyperfuel-client/Cargo.toml: remove tokio's test-util feature from main deps (was unused; bringing testing-only timer-mocking machinery into production builds). - hyperfuel-client/src/column_mapping.rs: fix doc comments on receipt / input / output (were labeled "log data" / "trace data" / "decoded log data" from the EVM-client copy-paste). Fix "colum" typo in error message. - hyperfuel-client/tests/api_test.rs: deleted. 523 lines of commented-out tests that referenced hypersync_client / hypersync_format (wrong crate names from the v2 import). Track restoring API test coverage in a follow-up issue if wanted. Examples: - Restore examples/simple-logs (workspace member), ported to the current ClientConfig API. Used as a live smoke test against https://fuel.hypersync.xyz: archive_height returned correctly, full HTTP -> capnp -> spawn_blocking -> arrow path exercised. Verified: cargo build --workspace --all-targets, cargo clippy --workspace --all-targets -- -D warnings, cargo test --workspace (34 tests pass), cargo run -p simple-logs against the live server. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * quantity.rs: rustfmt the new test Splits the long assert_eq! in test_from_u64_canonical across multiple lines to satisfy cargo fmt --check. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why
The 4.x crates currently on crates.io were published from
enviodev/hyperfuel-client-rust-v2, a private fork that diverged from this repo in late 2024 and became the de-facto source of truth. This public repo'smainwas stuck on the 3.0.0 / 2.2.1 line with a stalelicense = \"MIT\"and no commit corresponding to any published 4.x crate. That left:cargo install/cargo docusers following therepositoryURL on crates.io to a repo whose tree didn't match the tarball.build.rscapnp dependency) being authored against an outdated baseline and bumping to a3.0.1version that crates.io would treat as a backport on a dead major line.Published versions vs this repo's
mainbefore this PR:mainhyperfuel-net-typeshyperfuel-formathyperfuel-schemahyperfuel-clientThe publish commit for
hyperfuel-net-types-4.0.2(c185862) doesn't exist in this repo's history at all; it lives only inenviodev/hyperfuel-client-rust-v2.What this PR does
Single tree-replacement commit (
dc95ac4) that brings v2'smain(d618951, one commit pastc185862) onto this repo'smain, followed by three cleanup commits addressing review.After this PR is merged, the public repo's tree will match the published 4.x crate metadata (license MPL-2.0,
capnp 0.19,polars-arrow, correcthomepage/repositoryURLs).Commits
dc95ac4Import hyperfuel-client-rust-v2 main as the new baseline019ceeePatches and cleanup from version bump rebase (HyperFuel branding, restoredQuerydocs,JoinModedefault, dropped EVM-only deps fromhyperfuel-format, restoredClient::newHTTP builder settings, re-introducedpreset_query_get_logs+LogContext/LogResponse, removedneedless_lifetimesallows, deletedsimple_types.rs)b55ea2dClean up unused deps and dead commented-out event stubs (unusedxxhash-rust/nohash-hasherinhyperfuel-format, threeEventResponse-referencing/* */blocks inlib.rs, README URL fix,HexOutputDefaultderive)32ed136Delete remaining dead commented-out method stubs inlib.rsDeliberate differences from v1's
mainA handful of v1 things did not survive the migration, with notes:
get_selected_arrow_data/get_selected_data(and thefilter.rsclient-side filter module). v2 introducesJoinMode::JoinNothingonQuery, which appears to be the server-side equivalent. Worth verifying against the server before declaring final; ifJoinNothingdoes what its name implies, no port is needed._with_retryvariants (get_height_with_retry,get_arrow_data_with_retry). v2'sClientdoes retries internally based onmax_num_retries/retry_backoff_ms/ etc., so these became redundant.examples/directory. v2 maintains examples on a separate branch (v2/examples). Not ported into trunk in this PR.Config→ClientConfigrename. Already published as part of v2's 4.x line; preserved here for consistency with crates.io.Test plan
cargo check --workspacecargo clippy --workspace --all-targets -- -D warningscargo test --workspace(per local run on019ceee)After merge
hyperfuel-net-types-v4.0.3,hyperfuel-client-v3.0.1,hyperfuel-format-v4.0.1,hyperfuel-schema-v4.0.0). Nocargo publishruns yet - just git tags so the published crates have traceable commit hashes in the public repo.enviodev/hyperfuel-client-rust-v2so it stops being a parallel source of truth.build.rscapnp dependency) rebased on this baseline with the version bump changed from3.0.1to4.0.4/4.1.0.JoinMode::JoinNothingserver-side behavior; if it doesn't replaceget_selected_arrow_data, port forward.Summary by CodeRabbit
New Features
collect_arrowandcollect_parquetcapabilities for large datasets.Breaking Changes
Configreplaced withClientConfigandStreamConfigfor improved flexibility.Documentation
Chores