refactor: move spamoor benchmark into testify suite (#3107)

Draft
chatton wants to merge 6 commits into main from cian/bench-refactor

Conversation

@chatton
Contributor

@chatton chatton commented Feb 25, 2026

Overview

Introduces a benchmarking package within the e2e dir, with a test suite to which we can add additional spamoor tests.

Summary by CodeRabbit

  • Tests

    • Added a new end-to-end benchmark suite for spam-generation, tracing, and performance metrics with JSON output.
    • Introduced helpers to collect, parse and validate trace spans and benchmark metrics.
    • Removed an older duplicate smoke test.
  • Refactor

    • Promoted and renamed several test utilities and trace/metric helpers to public, improving reuse and reporting.
  • Chores

    • Updated CI workflow triggers and benchmark artifact paths/handling.

…mark

- Create test/e2e/benchmark/ subpackage with SpamoorSuite (testify/suite)
- Move spamoor smoke test into suite as TestSpamoorSmoke
- Split helpers into focused files: traces.go, output.go, metrics.go
- Introduce resultWriter for defer-based benchmark JSON output
- Export shared symbols from evm_test_common.go for cross-package use
- Restructure CI to fan-out benchmark jobs and fan-in publishing
- Run benchmarks on PRs only when benchmark-related files change
Resolve conflicts keeping the benchmark suite refactoring:
- benchmark.yml: keep path filters and suite-style test command
- evm_spamoor_smoke_test.go: keep deleted (moved to benchmark pkg)
- evm_test_common.go: keep exported types, drop writeTraceBenchmarkJSON
  (now in benchmark/output.go)
@coderabbitai
Contributor

coderabbitai bot commented Feb 25, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Moves Spamoor benchmark logic into a new benchmark suite at test/e2e/benchmark/, adds evm-tagged helpers for metrics, traces, and JSON output, exports common EVM test helpers/types, updates CI workflow artifact paths, and removes the legacy evm_spamoor_smoke_test.go.

Changes

Cohort / File(s) Summary
CI Workflow
​.github/workflows/benchmark.yml
Updated PR trigger path filters; changed benchmark artifact upload/download and workspace paths to test/e2e/benchmark/; adjusted Spamoor smoke test invocation and EVMB binary path.
New Benchmark Suite
test/e2e/benchmark/suite_test.go, test/e2e/benchmark/spamoor_smoke_test.go
Adds SpamoorSuite with SetupTest and Smoke test orchestration (Jaeger, reth, sequencer, Spamoor), spammers, metrics collection, trace retrieval and assertions.
Benchmark Helpers
test/e2e/benchmark/metrics.go, test/e2e/benchmark/traces.go, test/e2e/benchmark/output.go
New evm build-tagged helpers: HTTP readiness polling, Prometheus counter summation, Jaeger trace parsing/adapters, span assertions, and JSON benchmark result writer.
E2E Common API
test/e2e/evm_test_common.go
Exported/renamed helpers and types: setup* → Setup*, traceSpan → TraceSpan, spanStats → SpanStats, aggregateSpanStats → AggregateSpanStats, printTraceReport → PrintTraceReport; removed legacy benchmark JSON writer.
Test Callsite Updates
test/e2e/evm_contract_bench_test.go, test/e2e/evm_force_inclusion_e2e_test.go, test/e2e/evm_full_node_e2e_test.go, test/e2e/failover_e2e_test.go
Updated call sites to use exported helper names and capitalized trace interfaces/report functions.
Removed legacy test
test/e2e/evm_spamoor_smoke_test.go
Deleted legacy Spamoor smoke test and associated helpers (migrated into test/e2e/benchmark/).

Sequence Diagram(s)

sequenceDiagram
    participant Test as "Spamoor Smoke Test"
    participant Suite as "SpamoorSuite"
    participant Jaeger as "Jaeger (tracing)"
    participant Reth as "Reth (execution)"
    participant Sequencer as "Sequencer"
    participant Spamoor as "Spamoor"

    Test->>Suite: run SetupTest
    Suite->>Jaeger: start Jaeger
    Suite->>Reth: start Reth (OTLP -> Jaeger)
    Suite->>Sequencer: start Sequencer
    Suite->>Spamoor: build & start Spamoor (RPC -> Sequencer)
    Spamoor->>Jaeger: export traces
    Test->>Spamoor: create & run spammers
    Spamoor->>Sequencer: submit txs
    Sequencer->>Reth: propose blocks
    Test->>Jaeger: collect traces & metrics
    Jaeger-->>Test: return trace payloads
    Test->>Test: parse/aggregate spans, assert, write results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

T:testing, T:code-hygiene

Suggested reviewers

  • alpe
  • tzdybal
  • gupadhyaya

Poem

🐰 I hopped through traces, metrics in paw,
Rearranged the nest for Spamoor to draw,
Helpers exported, benchmarks tucked neat,
CI paths redirected — what a treat!
Hop, run, report — the suite’s ready to eat.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 77.27%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Description check: ❓ Inconclusive. The description is vague and lacks sufficient detail about the changes, objectives, and rationale for the refactoring. Resolution: expand the description to explain why the refactoring was necessary, what benefits the testify suite provides, and clarify the relationship between the old and new test structures.
✅ Passed checks (1 passed)
  • Title check: ✅ Passed. The title 'refactor: move spamoor benchmark into testify suite' clearly and concisely describes the main change: moving an existing benchmark test into a new testify-based suite structure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@github-actions
Contributor

github-actions bot commented Feb 25, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed | Format: ⏩ skipped | Lint: ✅ passed | Breaking: ✅ passed | Updated (UTC): Feb 25, 2026, 2:46 PM

@claude
Contributor

claude bot commented Feb 25, 2026

Claude finished @chatton's task in 1m 54s.


Code Review

  • Gather context and understand PR scope
  • Review new benchmark package files
  • Review CI workflow changes
  • Review exported helper changes in evm_test_common.go
  • Compile and post review findings

Good refactor overall — isolating the Spamoor benchmark into its own testify suite with a clean env struct and per-test setup is much more maintainable than the previous monolithic approach. The SetupTest isolation pattern is correct and the use of t.Cleanup throughout is idiomatic.

I found a few issues ranging from functional bugs to style nitpicks.


Bugs / Functional Issues

1. OTLP endpoint configuration is incorrect (test/e2e/benchmark/suite_test.go:74-77)

The OpenTelemetry spec requires OTEL_EXPORTER_OTLP_ENDPOINT to be a base URL; the SDK appends /v1/traces automatically. The current code sets it to .../v1/traces, which will cause requests to go to .../v1/traces/v1/traces. Additionally, "http" is not a valid OTLP protocol value — the spec defines grpc, http/protobuf, and http/json.

// current (broken)
"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
"OTEL_EXPORTER_OTLP_PROTOCOL=http",
"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",

// correct
"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint(),   // base URL, no suffix
"OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf",
"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf",
// OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is fine as-is (per-signal = full URL)

Fix this →

2. Hardcoded rethTag: "pr-140" will break (test/e2e/benchmark/spamoor_smoke_test.go:21)

This references an ephemeral PR image tag that will become invalid once the image is cleaned up. It should be configurable via an env var with a sensible default so CI can override it without a code change.

rethTag := os.Getenv("BENCH_RETH_TAG")
if rethTag == "" {
    rethTag = "pr-140"
}
e := s.setupEnv(config{
    rethTag:     rethTag,
    serviceName: "ev-node-smoke",
})

Fix this →


Design / Robustness

3. flush() uses require.NoError in a deferred context (test/e2e/benchmark/output.go:74-75)

require.NoError calls runtime.Goexit via FailNow. When flush is called via defer, this means an I/O error writing the benchmark JSON file can mask the original test failure. Prefer non-fatal logging here:

data, err := json.MarshalIndent(w.entries, "", "  ")
if err != nil {
    w.t.Logf("WARNING: failed to marshal benchmark JSON: %v", err)
    return
}
if err := os.WriteFile(outputPath, data, 0644); err != nil {
    w.t.Logf("WARNING: failed to write benchmark JSON to %s: %v", outputPath, err)
    return
}

4. Fixed 60-second sleep for trace accumulation (test/e2e/benchmark/spamoor_smoke_test.go:70)

This is the canonical fragile-test antipattern. On slow CI runners it may be too short; on fast machines it wastes time. The collectServiceTraces helper already polls Jaeger with a timeout via WaitForTraces — the sleep before it could simply be removed, or converted to poll for a minimum span count.

5. addEntry method is dead code (test/e2e/benchmark/output.go:61)

resultWriter.addEntry is defined but never called anywhere in the package. Either add a use or remove it to avoid confusion about whether it's part of the intended API.


Code Quality Nitpicks

6. Loop variable t shadows test helper convention (test/e2e/benchmark/traces.go:25)

for _, t := range traces {   // 't' reads as testing.TB to any Go developer

Rename to traceEntry or raw to avoid confusion.

7. Stale doc comment (test/e2e/evm_test_common.go:532)

The comment still says setupCommonEVMEnv (the old unexported name); Go convention requires the doc comment to start with the exported function name:

// SetupCommonEVMEnv creates and initializes ev-reth instances...

8. requireHTTP has no cancellation path (test/e2e/benchmark/metrics.go:15)

The polling loop uses time.Now().Before(deadline) with time.Sleep inside, with no hook for test context cancellation (t.Context()). If the test is killed mid-flight, this loop will run until the deadline. Minor in practice but inconsistent with the rest of the suite which correctly passes ctx everywhere.


CI

9. Missing path trigger for Block Executor benchmark (.github/workflows/benchmark.yml:11-16)

The evm-benchmark job runs go test ./block/internal/executing/... but block/internal/executing/** is not in the PR path filter. Changes to those files won't trigger the workflow on PRs.

Note: I cannot modify workflow files directly (GitHub App permission restriction), but this is a straightforward one-liner addition to the paths: list.


Performance Regression

The benchmark-action comment shows 3 Spamoor metrics regressed >1.5x vs the previous run. These are likely noise from the infrastructure refactor rather than a code regression, but worth checking:

  • Store.NewBatch (avg): 1.75x
  • Batch.SetHeight (avg): 1.60x
  • Store.GetMetadata (avg): 1.57x

These could be caused by the new suite initialization overhead or Docker orchestration differences in the refactored setup.


Summary: The structural refactor is clean and the direction is right. The two hard blockers before merge are the OTLP endpoint misconfiguration (item 1), which will produce silently incorrect tracing data, and the hardcoded PR image tag (item 2), which will cause test failures once that image is cleaned up.

defer w.flush()

e := s.setupEnv(config{
rethTag: "pr-140",
Contributor Author


This is the PR in which the traces were added; we can change this to a proper tag in a follow-up.

@chatton chatton marked this pull request as ready for review February 25, 2026 12:30
pull_request:
branches:
- main
paths:
Contributor Author

@chatton chatton Feb 25, 2026


only run benchmark tests for PRs when relevant files are changed.

go test sets the working directory to the package under test, so the
env var should be relative to test/e2e/benchmark/, not test/e2e/.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
test/e2e/benchmark/spamoor_smoke_test.go (1)

20-23: ⚠️ Potential issue | 🟠 Major

Hardcoded rethTag: "pr-140" will become stale and cause opaque failures.

This tag refers to a specific container image built for a PR. Once that image is cleaned up or the tag is overwritten, the test will silently fail to start or pull the container with a confusing error. Expose this via an environment variable or CLI flag so it can be updated without a code change.

🔧 Suggested refactor
+rethTagFromEnv := os.Getenv("BENCH_RETH_TAG")
+if rethTagFromEnv == "" {
+    rethTagFromEnv = "pr-140"
+}
 e := s.setupEnv(config{
-    rethTag:     "pr-140",
+    rethTag:     rethTagFromEnv,
     serviceName: "ev-node-smoke",
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 20 - 23, Replace the
hardcoded rethTag value ("pr-140") in the test setup by reading it from
configuration (environment variable or test CLI flag) so tests don't break when
image tags change: update the call site where s.setupEnv(config{ rethTag:
"pr-140", ... }) is used in spamoor_smoke_test.go to instead pull the tag from
an env var (eg. os.Getenv("RETH_TAG")) or a provided flag, and adjust the config
struct and any test bootstrap that constructs config to accept and pass that
dynamic value (look for the config type, its rethTag field, and the setupEnv
function to make the change).
🧹 Nitpick comments (5)
test/e2e/benchmark/metrics.go (1)

15-34: Consider accepting a context.Context for cancellable polling.

requireHTTP drives a blocking polling loop with no way to interrupt it if the parent test is cancelled mid-flight (e.g., via t.Context() deadline or a FailNow from another goroutine). For long timeouts this can leave dangling goroutines/blocked tests.

🔧 Suggested refactor
-func requireHTTP(t testing.TB, url string, timeout time.Duration) {
+func requireHTTP(t testing.TB, ctx context.Context, url string, timeout time.Duration) {
 	t.Helper()
 	client := &http.Client{Timeout: 200 * time.Millisecond}
-	deadline := time.Now().Add(timeout)
+	ctx, cancel := context.WithTimeout(ctx, timeout)
+	defer cancel()
 	var lastErr error
-	for time.Now().Before(deadline) {
+	ticker := time.NewTicker(100 * time.Millisecond)
+	defer ticker.Stop()
+	for {
+		select {
+		case <-ctx.Done():
+			t.Fatalf("daemon not ready at %s: %v", url, lastErr)
+			return
+		case <-ticker.C:
+		}
 		resp, err := client.Get(url)
 		if err == nil {
 			_ = resp.Body.Close()
 			if resp.StatusCode >= 200 && resp.StatusCode < 300 {
 				return
 			}
 			lastErr = fmt.Errorf("status %d", resp.StatusCode)
 		} else {
 			lastErr = err
 		}
-		time.Sleep(100 * time.Millisecond)
 	}
-	t.Fatalf("daemon not ready at %s: %v", url, lastErr)
 }

As per coding guidelines, "Use context.Context for cancellation in Go."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/metrics.go` around lines 15 - 34, The requireHTTP function
currently blocks and can't be cancelled; change its signature to accept a
context.Context (e.g., requireHTTP(ctx context.Context, t testing.TB, url
string, timeout time.Duration) or use t.Context() inside) and use that context
for cancellation inside the polling loop: create requests with
http.NewRequestWithContext or attach ctx to the client.Do call, replace
time.Sleep with a select on ctx.Done() and time.After(100*time.Millisecond),
check ctx.Err() and call t.Fatalf when the context is cancelled, and ensure you
still honor the timeout logic and preserve the lastErr handling in the loop.
test/e2e/benchmark/spamoor_smoke_test.go (1)

68-70: Fixed 60s time.Sleep is fragile for trace accumulation.

On fast machines traces may be ready much sooner; on slow CI runners traces may not be ready even after 60s. Consider polling the Jaeger API with a timeout instead of an unconditional sleep, making the test both faster in the happy path and more robust in the slow path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 68 - 70, Replace the
fixed time.Sleep(60 * time.Second) wait with a polling loop that queries the
Jaeger API until the expected traces/spans from spamoor appear or a context
deadline elapses; implement this by replacing the direct call to time.Sleep with
a function (e.g., waitForTraces(ctx, serviceName, expectedSpanCount)) that uses
a context with timeout, exponential/backoff retries, and HTTP calls to the
Jaeger query endpoint, returning early when traces are observed and failing the
test only if the timeout is reached—update the test to call waitForTraces
instead of time.Sleep and use the existing service identifier used in the test
to locate traces.
test/e2e/benchmark/traces.go (1)

23-47: Loop variable t shadows the conventional test-helper name convention.

Using t as a range variable in extractSpansFromTraces (which imports "testing") is not a bug but is slightly confusing, particularly for readers accustomed to t meaning testing.TB. A more descriptive name like traceEntry or raw would improve clarity.

🔧 Suggested refactor
-	for _, t := range traces {
-		traceMap, ok := t.(map[string]any)
+	for _, traceEntry := range traces {
+		traceMap, ok := traceEntry.(map[string]any)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/traces.go` around lines 23 - 47, In
extractSpansFromTraces, rename the range variable t to a clearer name (e.g.,
traceEntry or rawTrace) to avoid shadowing the conventional testing variable t;
update all uses of t in that function (the outer loop in extractSpansFromTraces
and its casts to traceMap) so the logic remains unchanged but the variable is
more descriptive and less confusing to readers familiar with testing.TB.
test/e2e/benchmark/output.go (1)

67-76: require.NoError inside a deferred flush will call runtime.Goexit on failure, which is safe but worth documenting.

When flush is called via defer after a test has already failed, require.NoError → FailNow → runtime.Goexit is invoked from within the deferred function, which Go handles correctly (deferred calls complete). No behavioral bug here, but the interaction with require vs os.WriteFile returning an error silently (by being suppressed) could be surprising. Consider logging the error path rather than asserting fatally so benchmark output failures don't obscure the real test failure.

🔧 Suggested refactor
 func (w *resultWriter) flush() {
 	outputPath := os.Getenv("BENCH_JSON_OUTPUT")
 	if outputPath == "" || len(w.entries) == 0 {
 		return
 	}
 
 	data, err := json.MarshalIndent(w.entries, "", "  ")
-	require.NoError(w.t, err, "failed to marshal benchmark JSON")
-	require.NoError(w.t, os.WriteFile(outputPath, data, 0644), "failed to write benchmark JSON to %s", outputPath)
+	if err != nil {
+		w.t.Logf("WARNING: failed to marshal benchmark JSON: %v", err)
+		return
+	}
+	if err := os.WriteFile(outputPath, data, 0644); err != nil {
+		w.t.Logf("WARNING: failed to write benchmark JSON to %s: %v", outputPath, err)
+		return
+	}
 	w.t.Logf("wrote %d benchmark entries to %s", len(w.entries), outputPath)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/output.go` around lines 67 - 76, In resultWriter.flush,
avoid using require.NoError (which calls FailNow/Goexit when deferred) for the
json.MarshalIndent and os.WriteFile errors; instead check the error values from
json.MarshalIndent and os.WriteFile and log them non-fatally (e.g., use
w.t.Errorf or w.t.Logf with descriptive messages referencing outputPath and the
error) so flush reports failures without invoking runtime.Goexit from a deferred
context; update the calls that currently use require.NoError(w.t, ...) to
explicit if err != nil { w.t.Errorf("...: %v", err); return } or similar logging
behavior for both the marshal and write steps in resultWriter.flush.
test/e2e/evm_test_common.go (1)

532-534: Doc comment references the old unexported name.

Line 532 says setupCommonEVMEnv but the function is now exported as SetupCommonEVMEnv. Per Go convention, the doc comment should start with the exported identifier name.

Proposed fix
-// setupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
+// SetupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/evm_test_common.go` around lines 532 - 534, The doc comment
currently starts with the old unexported name; update the comment to begin with
the exported function name SetupCommonEVMEnv so it follows Go doc conventions
(e.g., change the leading sentence from "setupCommonEVMEnv creates..." to
"SetupCommonEVMEnv creates..."), leaving the rest of the description intact and
ensuring the comment now documents the exported SetupCommonEVMEnv function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/benchmark.yml:
- Around line 11-16: Update the PR path filter in the GitHub Actions workflow to
include the block executor benchmark sources so changes to that package trigger
the workflow; specifically add the path pattern 'block/internal/executing/**'
(or more broadly 'block/**' if preferred) to the existing paths array in
.github/workflows/benchmark.yml so the job that runs `go test
./block/internal/executing/...` will trigger on relevant PRs.

In `@test/e2e/benchmark/suite_test.go`:
- Around line 73-78: The OTLP env setup in the b.WithTag(...).WithEnv call is
wrong: change the value for "OTEL_EXPORTER_OTLP_ENDPOINT" to use the base ingest
URL (remove the "/v1/traces" suffix) so the SDK can append per-signal paths,
keep "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT" as-is (full URL), and update
"OTEL_EXPORTER_OTLP_PROTOCOL" and "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL" from
"http" to the spec-correct "http/protobuf" (or another valid spec value like
"grpc" or "http/json" if intended).


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 212ac08 and cc56590.

📒 Files selected for processing (12)
  • .github/workflows/benchmark.yml
  • test/e2e/benchmark/metrics.go
  • test/e2e/benchmark/output.go
  • test/e2e/benchmark/spamoor_smoke_test.go
  • test/e2e/benchmark/suite_test.go
  • test/e2e/benchmark/traces.go
  • test/e2e/evm_contract_bench_test.go
  • test/e2e/evm_force_inclusion_e2e_test.go
  • test/e2e/evm_full_node_e2e_test.go
  • test/e2e/evm_spamoor_smoke_test.go
  • test/e2e/evm_test_common.go
  • test/e2e/failover_e2e_test.go
💤 Files with no reviewable changes (1)
  • test/e2e/evm_spamoor_smoke_test.go

Comment on lines +11 to +16
paths:
- 'test/e2e/benchmark/**'
- 'test/e2e/evm_contract_bench_test.go'
- 'test/e2e/evm_test_common.go'
- 'test/e2e/sut_helper.go'
- '.github/workflows/benchmark.yml'

⚠️ Potential issue | 🟠 Major

PR path filter misses Block Executor benchmark sources.

This workflow runs go test ./block/internal/executing/... (Line 40), but PR paths do not include that tree. Changes to those benchmarks/sources can skip this workflow unexpectedly.

Suggested fix
   pull_request:
     branches:
       - main
     paths:
       - 'test/e2e/benchmark/**'
       - 'test/e2e/evm_contract_bench_test.go'
       - 'test/e2e/evm_test_common.go'
       - 'test/e2e/sut_helper.go'
+      - 'block/internal/executing/**'
       - '.github/workflows/benchmark.yml'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/benchmark.yml around lines 11 - 16, Update the PR path
filter in the GitHub Actions workflow to include the block executor benchmark
sources so changes to that package trigger the workflow; specifically add the
path pattern 'block/internal/executing/**' (or more broadly 'block/**' if
preferred) to the existing paths array in .github/workflows/benchmark.yml so the
job that runs `go test ./block/internal/executing/...` will trigger on relevant
PRs.

Comment on lines +73 to +78
b.WithTag(cfg.rethTag).WithEnv(
"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
"OTEL_EXPORTER_OTLP_PROTOCOL=http",
"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",
"RUST_LOG=debug",

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

OpenTelemetry OTEL_EXPORTER_OTLP_ENDPOINT specification base URL path suffix

💡 Result:

For OTLP/HTTP, OTEL_EXPORTER_OTLP_ENDPOINT is specified as a base URL. Exporters must append a signal-specific relative path to it:

  • Traces: v1/traces
  • Metrics: v1/metrics
  • Logs: v1/logs

So if you set:

  • OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4318

the actual requests go to:

  • http://collector:4318/v1/traces, .../v1/metrics, .../v1/logs (respectively). The base URL may itself include a path prefix (e.g. /mycollector/), in which case the v1/* suffix is appended under that prefix. [1]

If you instead set a per-signal endpoint (e.g. OTEL_EXPORTER_OTLP_TRACES_ENDPOINT), that URL is used as-is (no automatic /v1/traces appending); if it has no path component, / is used. [1]

Separately, the OTLP protocol spec defines the default OTLP/HTTP paths as /v1/traces, /v1/metrics, /v1/logs (server-side defaults), but clients may be configured to use non-default paths. [2]

For OTLP/gRPC, OTEL_EXPORTER_OTLP_ENDPOINT is a target for the gRPC client (typically host/port; path suffixes like /v1/traces are not part of the gRPC endpoint concept). [1]

Sources: [1] OpenTelemetry OTLP Exporter spec (Endpoint URLs for OTLP/HTTP) [2] OTLP Specification (default OTLP/HTTP URL paths)


🌐 Web query:

OTEL_EXPORTER_OTLP_PROTOCOL valid values http/protobuf http/json grpc

💡 Result:

OTEL_EXPORTER_OTLP_PROTOCOL valid values are exactly: grpc, http/protobuf, http/json. [1][2]


🏁 Scripts executed (repository: evstack/ev-node):

  • find . -name "suite_test.go" -path "*/benchmark/*" | head -5
  • sed -n '65,85p' ./test/e2e/benchmark/suite_test.go
  • rg "IngestHTTPEndpoint" -A 2 -B 2 | head -50
  • rg -i "reth.*otlp|otlp.*protocol" | head -20
  • rg -i "http/protobuf|http/json" test/ | head -20

🌐 Web query:

Reth OTLP exporter HTTP protocol http/protobuf http/json specification

💡 Result:

OTLP exporter over HTTP: http/protobuf vs http/json (spec)

Transport & endpoints (OTLP/HTTP):

  • Uses HTTP POST over HTTP/1.1 or HTTP/2. (opentelemetry.io)
  • Default URL paths:
    • Traces: POST /v1/traces with ExportTraceServiceRequest
    • Metrics: POST /v1/metrics with ExportMetricsServiceRequest
    • Logs: POST /v1/logs with ExportLogsServiceRequest (opentelemetry.io)
  • When SDKs are configured with a base endpoint (OTEL_EXPORTER_OTLP_ENDPOINT), they construct per-signal URLs by appending v1/traces, v1/metrics, v1/logs (unless a per-signal endpoint is set). (opentelemetry.io)

Encoding choices (same protobuf schema, different wire format):

  • http/protobuf: protobuf messages serialized in binary proto3. (opentelemetry.io)
  • http/json: protobuf messages serialized using protobuf JSON mapping, with OTLP-specific rules. (opentelemetry.io)

OTLP JSON-specific mapping rules (important for http/json):

Compression:

  • Client may gzip the request body; if so it MUST send Content-Encoding: gzip. (opentelemetry.io)
  • Client may send Accept-Encoding: gzip; server may gzip responses. (opentelemetry.io)

Responses:

  • Success is HTTP 200 OK with an Export<signal>ServiceResponse (e.g., ExportTraceServiceResponse). (opentelemetry.io)
  • Server must respond using the same Content-Type it received. (opentelemetry.io)

How exporters select protocol:

  • OTEL_EXPORTER_OTLP_PROTOCOL (and per-signal variants) supports: grpc, http/protobuf, http/json; default is typically http/protobuf. (opentelemetry.io)

Where the protobuf definitions live:

  • The canonical .proto files and OTLP protocol materials are in open-telemetry/opentelemetry-proto. (github.com)



Remove the /v1/traces suffix from OTEL_EXPORTER_OTLP_ENDPOINT and change protocol values to http/protobuf.

The OpenTelemetry specification requires that OTEL_EXPORTER_OTLP_ENDPOINT be a base URL, with the SDK automatically appending signal-specific paths (/v1/traces, /v1/metrics, /v1/logs). Setting it to …/v1/traces will cause requests to be sent to …/v1/traces/v1/traces. The per-signal endpoint OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is correct as-is since it expects the full URL without automatic appending.

Additionally, OTEL_EXPORTER_OTLP_PROTOCOL and OTEL_EXPORTER_OTLP_TRACES_PROTOCOL must use spec-defined values: grpc, http/protobuf, or http/json. The value "http" is not valid.
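The doubled-path failure mode is easy to see with a small resolver that mimics the spec's per-signal rule. This is a hypothetical sketch for illustration (`resolveTracesURL` is not SDK or ev-node code):

```go
package main

import (
	"fmt"
	"strings"
)

// resolveTracesURL is a hypothetical helper (not SDK or ev-node code) that
// mimics the spec rule: a per-signal endpoint is used verbatim, while the
// base endpoint gets the "v1/traces" path appended by the exporter.
func resolveTracesURL(baseEndpoint, tracesEndpoint string) string {
	if tracesEndpoint != "" {
		return tracesEndpoint
	}
	return strings.TrimRight(baseEndpoint, "/") + "/v1/traces"
}

func main() {
	// Misconfigured: the base already carries the signal path, so it doubles.
	fmt.Println(resolveTracesURL("http://collector:4318/v1/traces", ""))
	// → http://collector:4318/v1/traces/v1/traces

	// Correct: base URL only; the exporter appends the path itself.
	fmt.Println(resolveTracesURL("http://collector:4318", ""))
	// → http://collector:4318/v1/traces
}
```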

Proposed fix
-				"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
+				"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint(),
 				"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
-				"OTEL_EXPORTER_OTLP_PROTOCOL=http",
-				"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",
+				"OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf",
+				"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/suite_test.go` around lines 73 - 78, the OTLP env setup in
the b.WithTag(...).WithEnv call is wrong: change the value for
"OTEL_EXPORTER_OTLP_ENDPOINT" to use the base ingest URL (remove the
"/v1/traces" suffix) so the SDK can append per-signal paths, keep
"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT" as-is (full URL), and update
"OTEL_EXPORTER_OTLP_PROTOCOL" and "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL" from
"http" to the spec-correct "http/protobuf" (or another valid spec value like
"grpc" or "http/json" if intended).

go test treats all arguments after an unknown flag (--evm-binary) as
test binary args, so ./benchmark/ was never recognized as a package
pattern.
@chatton chatton marked this pull request as draft February 25, 2026 13:44
go test sets the cwd to the package directory (test/e2e/benchmark/),
so the binary path needs an extra parent traversal.
Contributor

@github-actions github-actions bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Spamoor Trace Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.

| Benchmark suite | Current: ae525ca | Previous: 212ac08 | Ratio |
| --- | --- | --- | --- |
| SpamoorSmoke - Batch.SetHeight (avg) | 14.966867469879517 us | 9.33933933933934 us | 1.60 |
| SpamoorSmoke - Store.GetMetadata (avg) | 9.872934263679765 us | 6.306730415593968 us | 1.57 |
| SpamoorSmoke - Store.NewBatch (avg) | 13.335843373493976 us | 7.635542168674699 us | 1.75 |

This comment was automatically generated by workflow using github-action-benchmark.

The benchmark package doesn't define the --binary flag that test-e2e
passes. It has its own CI workflow so it doesn't need to run here.
@codecov

codecov bot commented Feb 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.93%. Comparing base (212ac08) to head (85c9d2d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3107      +/-   ##
==========================================
- Coverage   60.95%   60.93%   -0.02%     
==========================================
  Files         113      113              
  Lines       11617    11617              
==========================================
- Hits         7081     7079       -2     
- Misses       3738     3739       +1     
- Partials      798      799       +1     
| Flag | Coverage Δ |
| --- | --- |
| combined | 60.93% <ø> (-0.02%) ⬇️ |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.