test(e2e): migrate test-rebuild-hermes.sh to vitest #5588

Merged: jyaunches merged 6 commits into main from e2e-migrate-test-rebuild-hermes on Jun 22, 2026

jyaunches (Contributor) commented Jun 22, 2026

Summary

Migrate test/e2e/test-rebuild-hermes.sh with the simplest equivalent live Vitest coverage.

Related Issues

Refs #5098
Refs #3025

Contract mapping

  • Legacy assertion: build an old Hermes base image, create an old Hermes OpenShell sandbox, seed Hermes state and Discord messaging placeholder config, run nemoclaw <sandbox> rebuild --yes, then verify marker state, Hermes version upgrade, messaging config preservation, registry refresh, optional post-rebuild inference, and backup credential hygiene.
    • Replacement: test/e2e-scenario/live/rebuild-hermes.test.ts asserts the same flow through Vitest with the real installer/Docker/OpenShell/rebuild boundaries.
    • Boundary preserved: bash install.sh, Docker base image builds/tags, openshell provider/sandbox create/exec/list, local ~/.nemoclaw registry/session files, nemoclaw rebuild, Hermes runtime hermes --version, and backup leak scan.
  • Legacy assertion: NEMOCLAW_HERMES_STALE_BASE_REBUILD_E2E=1 leaves the cached Hermes base image stale so rebuild must refresh it (issue #3025, "nemoclaw hermes rebuild fails to use user specified version").
    • Replacement: the same Vitest file runs under rebuild-hermes-stale-base-vitest with NEMOCLAW_HERMES_STALE_BASE_REBUILD_E2E=1.
    • Boundary preserved: stale ghcr.io/nvidia/nemoclaw/hermes-sandbox-base:latest cache state and real rebuild refresh path.
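
As a rough illustration of how one test file can serve both lanes, the sketch below derives the mode from the environment variable named above. The helper `resolveRebuildMode` and its return shape are hypothetical and are not taken from this PR; only the variable name and the two job names come from the description, and the rule that the lane is enabled by the exact value `1` is an assumption.

```typescript
// Hypothetical helper, not part of the PR. Only the env var name and the
// two job names are taken from the PR description.
type RebuildMode = {
  staleBase: boolean;
  jobName: "rebuild-hermes-vitest" | "rebuild-hermes-stale-base-vitest";
};

function resolveRebuildMode(env: Record<string, string | undefined>): RebuildMode {
  // Assumption: the stale-base lane is enabled only by the exact value "1".
  const staleBase = env.NEMOCLAW_HERMES_STALE_BASE_REBUILD_E2E === "1";
  return {
    staleBase,
    jobName: staleBase ? "rebuild-hermes-stale-base-vitest" : "rebuild-hermes-vitest",
  };
}
```

Keeping the switch in one small function means the rest of the scenario can branch on a single boolean instead of re-reading the environment.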

Simplicity check

  • Test shape: simple live Vitest test.
  • Original runner/lane: nightly-e2e.yaml jobs rebuild-hermes-e2e and rebuild-hermes-stale-base-e2e, reusable e2e-script.yaml, default runs-on: ubuntu-latest, Docker/OpenShell, hosted inference secret, 60m timeout.
  • Replacement runner: same runner class (ubuntu-latest) in .github/workflows/e2e-vitest-scenarios.yaml; Docker/OpenShell/install/rebuild stay real.
  • New shared helpers: none; one-off parsing/seeding/cleanup stays local to the test.
  • New framework/registry/ledger: none
  • Workflow changes: add selective Vitest jobs rebuild-hermes-vitest and rebuild-hermes-stale-base-vitest; legacy shell deletion/workflow retirement is deferred to Phase 11 of epic #5098 (Migrate legacy bash E2E into the Vitest E2E system).
  • Selective dispatch: e2e-vitest-scenarios.yaml with jobs=rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest on this PR branch.

Verification

Summary by CodeRabbit

  • Tests
    • Added a live E2E Hermes rebuild scenario that destructively rebuilds a sandbox and verifies Hermes messaging/state continuity, correct version upgrade behavior from the manifest, and rebuild-marker persistence.
    • Added checks to fail if Hermes rebuild backup artifacts contain credential-like data.
    • Expanded workflow boundary and dispatch-matrix selector coverage for both standard and stale-base Hermes rebuild Vitest job variants.
  • Chores
    • Updated CI to run two additional Hermes rebuild Vitest jobs (standard and stale-base) and include both in PR scenario result reporting.

coderabbitai Bot (Contributor) commented Jun 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds two new free-standing Vitest CI jobs (rebuild-hermes-vitest and rebuild-hermes-stale-base-vitest) to the e2e workflow, backed by a new 751-line live test that exercises nemoclaw rebuild end-to-end for a Hermes sandbox. A new workflow boundary validator enforces security and structural rules for both job variants, and dispatch-selector support tests cover the new jobs.

Changes

Hermes Rebuild E2E Vitest Integration

| Layer | File(s) | Summary |
|---|---|---|
| Live test configuration, helpers, and resource cleanup | test/e2e-scenario/live/rebuild-hermes.test.ts | Defines imports, repo/file constants for Hermes versions/tags, stale-base mode behavior and sandbox-name derivation/validation, configuration placeholders, base-image tags, hosted inference defaults, timeout settings, local helper types and functions for NemoClaw state snapshot/restore, JSON read/write, manifest version parsing, shell-result assertions, best-effort cleanup routine, Dockerfile generator for old Hermes sandbox, and waitForSandboxReady polling logic. |
| Test entry point, initialization, and pre-rebuild sandbox setup | test/e2e-scenario/live/rebuild-hermes.test.ts | Defines the live scenario test with stale-base vs non-stale naming and timeout, snapshots NemoClaw registry/session state, registers cleanup hooks, gates execution on Docker availability with CI-specific throw/skip behavior, writes contract.json describing expected preserved boundaries, executes install.sh, validates installed CLIs (nemoclaw/openshell), deletes pre-existing sandbox, stops Hermes forwarding, builds the old Hermes base Docker image using explicit version and tarball SHA, optionally retags old image for stale-base, creates Discord provider and old Hermes sandbox from generated Dockerfile, waits for sandbox readiness, writes in-sandbox marker file, and reads/asserts pre-rebuild .env and config.yaml placeholder expectations before seeding NemoClaw registry/onboarding-session state. |
| Rebuild execution and post-rebuild validation | test/e2e-scenario/live/rebuild-hermes.test.ts | Builds the current Hermes base image (non-stale mode) or writes a stale-base note, executes nemoclaw rebuild --yes --verbose, verifies marker file persistence, Hermes version upgrade via manifest expected-release matching, .env/config.yaml persistence, agentVersion migration away from old version, hosted inference curl with PONG check, backup directory presence, and credential leak scanning (API key + fake Discord token). |
| Workflow boundary validator for Hermes rebuild job variants | tools/e2e-scenarios/workflow-boundary.mts | Adds validateRebuildHermesVitestJob enforcing job env vars (NEMOCLAW_RUN_E2E_SCENARIOS, Hermes-specific config, OPENSHELL_GATEWAY), per-step secret-exposure rules, Docker Hub auth behavior with anonymous fallback, pinned action SHAs, required Vitest run commands and artifact upload naming/path conventions, and stable artifact upload settings; wires two invocations (staleBase: false and staleBase: true) into validateE2eVitestScenariosWorkflowBoundary. |
| CI workflow job definitions and report-to-pr integration | .github/workflows/e2e-vitest-scenarios.yaml | Adds rebuild-hermes-vitest and rebuild-hermes-stale-base-vitest jobs with Hermes-specific env, Docker Hub auth with anonymous fallback, Vitest execution with NVIDIA_INFERENCE_API_KEY, and artifact upload; extends report-to-pr needs list with both new jobs. |
| Dispatch-selector and matrix tests for Hermes rebuild jobs | test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts | Extends the workflow boundary test suite with evaluateE2eVitestWorkflowDispatchSelectors assertions for rebuild-hermes and rebuild-hermes-stale-base scenarios, validating both scenarios and jobs selector inputs and verifying the selected free-standing job names with liveScenariosRuns: false. Updates the free-standing-to-registry matrix exclusion test by replacing inventory-iteration checks with two explicit generateMatrixForDispatch assertions; increases test timeout to 300_000. |
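
The "gates execution on Docker availability with CI-specific throw/skip behavior" step in the summary above could be sketched roughly as follows. This is not the code from the PR: the function names are hypothetical, and the mapping (fail loudly in CI, skip quietly on a developer machine) is an assumption about what "CI-specific" means here.

```typescript
import { spawnSync } from "node:child_process";

type GateDecision = "run" | "skip" | "throw";

// Pure decision function. Assumption: a missing Docker daemon is a hard
// failure in CI but only a skip for local runs.
function dockerGateDecision(dockerAvailable: boolean, isCi: boolean): GateDecision {
  if (dockerAvailable) return "run";
  return isCi ? "throw" : "skip";
}

// Probe the local Docker daemon. `docker info` exits non-zero when the daemon
// is unreachable, and spawnSync sets `error` when the binary cannot be started.
function isDockerAvailable(): boolean {
  const probe = spawnSync("docker", ["info"], { stdio: "ignore", timeout: 15_000 });
  return probe.error === undefined && probe.status === 0;
}
```

Separating the probe from the decision keeps the policy testable without Docker installed; a test body would call `dockerGateDecision(isDockerAvailable(), Boolean(process.env.CI))` and act on the result.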

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5256: Adds a similar hermes-e2e-vitest free-standing job to the same workflow and extends workflow-boundary.mts and dispatch-selector tests using the same structural pattern as this PR.
  • NVIDIA/NemoClaw#5330: Adds rebuild-vitest workflow jobs and matching boundary validators in the same files, directly parallel to this PR's Hermes rebuild variant additions.
  • NVIDIA/NemoClaw#5542: Extends the same e2e Vitest CI boundary framework with a new free-standing job using the identical wiring pattern in e2e-vitest-scenarios.yaml, workflow-boundary.mts, and report-to-pr.

Suggested labels

area: e2e, integration: hermes, area: ci, chore

Suggested reviewers

  • cv

Poem

🐇 A sandbox is born, gets old and goes stale,
Then nemoclaw rebuild brings it back without fail.
The marker file survives, the version ticks up,
No Discord token leaked, not even a hiccup!
🔑 Credentials stay safe, the PONG test rings clear —
A Hermes reborn, with vitest to cheer! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the primary change: migrating a bash shell script test (test-rebuild-hermes.sh) to a Vitest implementation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

github-code-quality Bot (Contributor) commented Jun 22, 2026

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
| File | d5af7f1 | +/- |
|---|---|---|
| nemoclaw/src/se...cret-scanner.ts | 100% | |
| nemoclaw/src/commands/slash.ts | 100% | |
| nemoclaw/src/li...bprocess-env.ts | 100% | |
| nemoclaw/src/bl...eprint/state.ts | 98% | |
| nemoclaw/src/onboard/config.ts | 98% | |
| nemoclaw/src/bl...int/snapshot.ts | 97% | |
| nemoclaw/src/bl...print/runner.ts | 95% | |
| nemoclaw/src/co...ration-state.ts | 94% | |
| nemoclaw/src/bl...ate-networks.ts | 94% | |
| nemoclaw/src/index.ts | 94% | |

TypeScript / code-coverage/cli

The overall coverage in the branch is 46%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
| File | d5af7f1 | +/- |
|---|---|---|
| src/lib/state/o...oard-session.ts | 91% | |
| src/lib/inference/local.ts | 76% | |
| src/lib/sandbox/config.ts | 72% | |
| src/lib/actions...dbox/rebuild.ts | 67% | |
| src/lib/onboard/preflight.ts | 64% | |
| src/lib/actions...licy-channel.ts | 56% | |
| src/lib/state/sandbox.ts | 55% | |
| src/lib/policy/index.ts | 49% | |
| src/lib/onboard...er-gpu-patch.ts | 44% | |
| src/lib/onboard.ts | 18% | |

Updated June 22, 2026 17:54 UTC

Comment thread test/e2e-scenario/live/rebuild-hermes.test.ts Fixed
github-actions Bot (Contributor) commented Jun 22, 2026

E2E Advisor Recommendation

Required E2E: rebuild-hermes-vitest, rebuild-hermes-stale-base-vitest
Optional E2E: rebuild-openclaw-vitest, sandbox-rebuild-vitest

Dispatch hint: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • rebuild-hermes-vitest (high): This PR adds the rebuild-hermes free-standing workflow job and its live Vitest scenario. Run it as merge-blocking validation because it directly covers installer execution, Docker/OpenShell sandbox creation, Hermes rebuild, credential placeholder preservation, backup hygiene, and post-rebuild state survival.
  • rebuild-hermes-stale-base-vitest (high): This PR adds a separate stale-base cache lane using the same live scenario with NEMOCLAW_HERMES_STALE_BASE_REBUILD_E2E=1. Run it as merge-blocking validation because it covers the stale cached Hermes base-image regression boundary and verifies rebuild refreshes the base while preserving sandbox state.

Optional E2E

  • rebuild-openclaw-vitest (high): Adjacent confidence check for the existing rebuild free-standing job pattern near the edited workflow section. Useful to ensure the new Hermes rebuild jobs did not regress shared rebuild workflow conventions, artifact handling, Docker auth hygiene, or report-to-pr dependency inventory.
  • sandbox-rebuild-vitest (high): Adjacent sandbox lifecycle confidence check. The new Hermes scenario exercises a specialized rebuild path, while sandbox-rebuild-vitest covers the generic sandbox rebuild lane and helps detect workflow-boundary or OpenShell lifecycle regressions around the newly inserted jobs.

New E2E recommendations

  • interactive-hermes-rebuild (medium): The new live test explicitly keeps the interactive reproduction path of issue #3025 ("nemoclaw hermes rebuild fails to use user specified version") out of scope: ./bin/nemoclaw.js onboard --agent hermes, hermes rebuild, modal prompt handling, and Y confirmation. That leaves user-facing interactive rebuild confirmation coverage missing.
    • Suggested test: Add an interactive Hermes rebuild confirmation E2E scenario covering onboarding Hermes, invoking hermes rebuild, accepting the modal confirmation prompt, and verifying the rebuilt sandbox preserves Hermes state.

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest

github-actions Bot (Contributor) commented Jun 22, 2026

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: rebuild-hermes-stale-base-vitest, rebuild-hermes-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=rebuild-hermes-stale-base-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=rebuild-hermes-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • rebuild-hermes-stale-base-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/rebuild-hermes.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=rebuild-hermes-stale-base-vitest
  • rebuild-hermes-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/rebuild-hermes.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=rebuild-hermes-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/rebuild-hermes.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

github-actions (Contributor)

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27958197827
Workflow ref: e2e-migrate-test-rebuild-hermes
Requested scenarios: (default — all supported)
Requested jobs: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest
Summary: 1 passed, 2 failed, 53 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-hermes-stale-base-vitest ❌ failure
rebuild-hermes-vitest ❌ failure
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped

Failed jobs: rebuild-hermes-stale-base-vitest, rebuild-hermes-vitest. Check run artifacts for logs.

Comment thread test/e2e-scenario/live/rebuild-hermes.test.ts Fixed

coderabbitai Bot (Contributor) left a comment

🧹 Nitpick comments (2)
test/e2e-scenario/live/rebuild-hermes.test.ts (2)

139-145: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Unused bestEffort helper function.

This function is defined but never called anywhere in the file. Consider removing it or prefixing with underscore if it's intentionally reserved for future use.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/live/rebuild-hermes.test.ts` around lines 139 - 145, the
`bestEffort` function is defined but never called anywhere in the file, making
it unused code. Either remove the `bestEffort` function definition if it's not
needed, or, if it's intentionally reserved for future use, rename the function
to start with an underscore (for example, `_bestEffort`) to indicate to other
developers that it's deliberately unused.

208-220: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider avoiding dynamic RegExp construction.

Static analysis flagged potential ReDoS. While SANDBOX_NAME is validated at line 38, using String.includes() is safer and clearer here:

♻️ Suggested simplification
 async function waitForSandboxReady(host: HostCliClient, apiKey: string): Promise<void> {
   for (let attempt = 1; attempt <= 30; attempt += 1) {
     const list = await host.command("openshell", ["sandbox", "list"], {
       artifactName: `phase-3-sandbox-list-${attempt}`,
       env: testEnv(apiKey),
       redactionValues: [apiKey],
       timeoutMs: 30_000,
     });
-    if (new RegExp(`${SANDBOX_NAME}.*Ready`).test(resultText(list))) return;
+    const output = resultText(list);
+    if (output.includes(SANDBOX_NAME) && output.includes("Ready")) return;
     await sleep(5_000);
   }
   throw new Error(`sandbox ${SANDBOX_NAME} did not become Ready`);
 }

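Note: The original regex requires Ready to appear after the sandbox name on the same line, because `.` does not match line breaks. The includes() version is looser: it also returns when a different sandbox in the list is Ready. If that ordering matters, split the output into lines and check that a single line contains both.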
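
The line-by-line alternative mentioned in the note could look roughly like the sketch below. `isSandboxReady` is a hypothetical helper, not code from the PR, and it assumes `openshell sandbox list` prints one sandbox per line with the status after the name.

```typescript
// Hypothetical helper: true when some single line contains the sandbox name
// followed later on that same line by "Ready". No dynamic RegExp is built
// from the sandbox name.
function isSandboxReady(listOutput: string, sandboxName: string): boolean {
  return listOutput.split(/\r?\n/).some((line) => {
    const nameAt = line.indexOf(sandboxName);
    // Search for "Ready" only after the end of the name, on the same line.
    return nameAt !== -1 && line.indexOf("Ready", nameAt + sandboxName.length) !== -1;
  });
}
```

Like the original regex, this is a substring match, so a status such as `NotReady` or a longer sandbox name with the same prefix would still match; tightening that would require knowing the exact column layout of the list output.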

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/live/rebuild-hermes.test.ts` around lines 208 - 220, In the
waitForSandboxReady function, replace the dynamic RegExp construction using new
RegExp with the SANDBOX_NAME variable with a safer String.includes() check.
Instead of using the regex pattern that tests for SANDBOX_NAME followed by
Ready, check if the resultText(list) string contains both SANDBOX_NAME and the
word Ready using includes() method calls. This eliminates the ReDoS
vulnerability risk from dynamic regex construction while maintaining clarity and
correctness.

Source: Linters/SAST tools

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 734acc08-ee71-4836-89c8-4f909f9d282d

📥 Commits

Reviewing files that changed from the base of the PR and between 14abc34 and 709fa02.

📒 Files selected for processing (4)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/rebuild-hermes.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

github-actions Bot (Contributor) commented Jun 22, 2026

PR Review Advisor — No blocking findings

Merge posture: No blocking advisor findings
Primary next action: Add or justify PRA-T1 and any related test follow-ups.
Open items: 0 required · 0 warnings · 0 suggestions · 8 test follow-ups
Since last review: 0 prior items resolved · 0 still apply · 0 new items found

Action checklist

  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
  • PRA-T4 Add or justify test follow-up: Runtime validation
  • PRA-T5 Add or justify test follow-up: Runtime validation
  • PRA-T6 Add or justify test follow-up: Acceptance clause
  • PRA-T7 Add or justify test follow-up: Acceptance clause
  • PRA-T8 Add or justify test follow-up: Acceptance clause
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation: Run `rebuild-hermes-vitest` standard mode and confirm the post-rebuild `phase-7-hermes-version-after-rebuild` artifact reports the current `agents/hermes/manifest.yaml` expected version. Static review can verify workflow shape, selectors, secret placement, and test assertions, but the changed behavior crosses live installer, Docker image build/cache, OpenShell sandbox, Hermes runtime, hosted inference, and artifact boundaries.
  • PRA-T2 Runtime validation: Run `rebuild-hermes-stale-base-vitest` and confirm the cached `ghcr.io/nvidia/nemoclaw/hermes-sandbox-base:latest` starts stale but the rebuilt sandbox reports the current manifest version. Static review can verify workflow shape, selectors, secret placement, and test assertions, but the changed behavior crosses live installer, Docker image build/cache, OpenShell sandbox, Hermes runtime, hosted inference, and artifact boundaries.
  • PRA-T3 Runtime validation: Confirm both Hermes rebuild lanes produce `phase-7-backup-credential-scan.json` with `leaks: []` after rebuilding. Static review can verify workflow shape, selectors, secret placement, and test assertions, but the changed behavior crosses live installer, Docker image build/cache, OpenShell sandbox, Hermes runtime, hosted inference, and artifact boundaries.
  • PRA-T4 Runtime validation: If the literal interactive repro from issue #3025 ("nemoclaw hermes rebuild fails to use user specified version") must be accepted by this PR, add or identify coverage for `./bin/nemoclaw.js onboard --agent hermes` followed by `./bin/nemoclaw.js hermes rebuild`, including the update modal and `Y` confirmation path. Static review can verify workflow shape, selectors, secret placement, and test assertions, but the changed behavior crosses live installer, Docker image build/cache, OpenShell sandbox, Hermes runtime, hosted inference, and artifact boundaries.
  • PRA-T5 Runtime validation: If the `hermes rebuild` alias is considered in scope for this migration, add or identify coverage that invokes the alias rather than only `nemoclaw <sandbox> rebuild --yes --verbose`. Static review can verify workflow shape, selectors, secret placement, and test assertions, but the changed behavior crosses live installer, Docker image build/cache, OpenShell sandbox, Hermes runtime, hosted inference, and artifact boundaries.
  • PRA-T6 Acceptance clause 1, `./bin/nemoclaw.js onboard --agent hermes`: add test evidence or identify existing coverage. The migrated live test runs `bash install.sh --non-interactive`, deletes the sandbox produced by install/onboard setup, then creates a controlled old Hermes sandbox via OpenShell and seeded `~/.nemoclaw` registry/session state. The test header and `contract.json.outOfScope` explicitly state that the literal interactive `./bin/nemoclaw.js onboard --agent hermes` reproduction path is outside this shell-lane migration.
  • PRA-T7 Acceptance clause 3, `./bin/nemoclaw.js hermes rebuild`: add test evidence or identify existing coverage. The migrated test exercises the legacy non-interactive shell lane by running installed `nemoclaw <SANDBOX_NAME> rebuild --yes --verbose` through real OpenShell/Docker boundaries. It does not invoke the literal `./bin/nemoclaw.js hermes rebuild` alias, and the test documents the interactive alias path as out of scope.
  • PRA-T8 Acceptance clause 4, "You will be presented a modal that suggests the container will be updated to the specified version": add test evidence or identify existing coverage. The new lane is non-interactive and uses `--yes`; it verifies the underlying rebuild result and runtime version, but does not assert the interactive modal text. `contract.json.outOfScope` lists the modal prompt as outside this migration.
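
For readers checking PRA-T3, a backup credential scan that yields a `leaks: []` report could be sketched as below. This is an illustration under assumptions, not the scan implemented in the PR: `scanForLeaks`, the report shape (a list of offending file paths), and the demo secret value are all hypothetical.

```typescript
import { mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumed report shape: paths of files that contain any known secret value.
type LeakReport = { leaks: string[] };

// Recursively scan regular files under `root` for literal secret values.
function scanForLeaks(root: string, secrets: string[]): LeakReport {
  const leaks: string[] = [];
  const needles = secrets.filter((secret) => secret.length > 0);
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        // latin1 maps bytes one-to-one, so binary files do not break decoding.
        const text = readFileSync(full, "latin1");
        if (needles.some((secret) => text.includes(secret))) leaks.push(full);
      }
    }
  };
  walk(root);
  return { leaks };
}

// Usage against a throwaway directory holding only a placeholder value.
const demoDir = mkdtempSync(join(tmpdir(), "backup-scan-"));
writeFileSync(join(demoDir, "config.yaml"), "token: placeholder\n");
const report = scanForLeaks(demoDir, ["fake-discord-token-123"]);
console.log(JSON.stringify(report)); // prints {"leaks":[]}
```

Matching on the literal secret values the test itself injected keeps the scan free of false positives; a pattern-based scan for "credential-like" strings would be broader but needs tuning against real backup contents.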

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

cv added the v0.0.66 Release target label Jun 22, 2026
github-actions (Contributor)

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27966185354
Workflow ref: e2e-migrate-test-rebuild-hermes
Requested scenarios: (default — all supported)
Requested jobs: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest
Summary: 1 passed, 2 failed, 53 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-hermes-stale-base-vitest ❌ failure
rebuild-hermes-vitest ❌ failure
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped

Failed jobs: rebuild-hermes-stale-base-vitest, rebuild-hermes-vitest. Check run artifacts for logs.

github-actions (Contributor)

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27967279654
Workflow ref: e2e-migrate-test-rebuild-hermes
Requested scenarios: (default — all supported)
Requested jobs: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest
Summary: 3 passed, 0 failed, 53 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-hermes-stale-base-vitest ✅ success
rebuild-hermes-vitest ✅ success
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27968173871
Workflow ref: e2e-migrate-test-rebuild-hermes
Requested scenarios: (default — all supported)
Requested jobs: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest
Summary: 3 passed, 0 failed, 53 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-hermes-stale-base-vitest ✅ success
rebuild-hermes-vitest ✅ success
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27969318676
Workflow ref: e2e-migrate-test-rebuild-hermes
Requested scenarios: (default — all supported)
Requested jobs: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest
Summary: 3 passed, 0 failed, 53 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-hermes-stale-base-vitest ✅ success
rebuild-hermes-vitest ✅ success
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped

@jyaunches

Contributor Author

PR review advisor follow-up disposition for da8dbc2b5:

  • PRA-T1: rebuild-hermes-vitest passed in selective run 27969318676; artifact phase-7-hermes-version-after-rebuild.stdout.txt reports Hermes Agent v0.14.0 (2026.5.16), matching agents/hermes/manifest.yaml.
  • PRA-T2: rebuild-hermes-stale-base-vitest passed in selective run 27969318676; artifact phase-5-stale-base-note.txt records the stale base cache setup, and the post-rebuild version artifact reports 2026.5.16.
  • PRA-T3: both lanes produced phase-7-backup-credential-scan.json with leaks: [].
  • PRA-T4 through PRA-T7: the literal interactive ./bin/nemoclaw.js onboard --agent hermes, hermes rebuild, modal prompt, and Y confirmation are explicitly out of scope for this legacy non-interactive shell-lane migration; this is documented in the test header and in contract.json.outOfScope.
  • PRA-T8: the live test sets both COMPATIBLE_API_KEY and NVIDIA_INFERENCE_API_KEY, then exercises the installed nemoclaw <sandbox> rebuild --yes --verbose path through real OpenShell/Docker boundaries.

Current state: PR checks green, PR review advisor green, required Vitest E2E scenarios green.
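
For readers who want to repeat the artifact checks described in this disposition, the following is a minimal sketch rather than the code used by the test. It assumes the credential scan artifact is a JSON object with a `leaks` array (inferred only from the `leaks: []` note above) and that the version artifact contains a line shaped like `Hermes Agent v0.14.0 (2026.5.16)`. The helper names `scanIsClean` and `extractHermesVersions` are hypothetical.

```typescript
// Hypothetical helpers for inspecting the artifacts named in the comment above.
// Assumption: the scan artifact looks like { "leaks": [...] }.
interface CredentialScan {
  leaks: unknown[];
}

function parseCredentialScan(json: string): CredentialScan {
  const parsed: unknown = JSON.parse(json);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    !Array.isArray((parsed as { leaks?: unknown }).leaks)
  ) {
    throw new Error("credential scan artifact has no 'leaks' array");
  }
  return parsed as CredentialScan;
}

// True when the scan reports no leaked credentials.
function scanIsClean(json: string): boolean {
  return parseCredentialScan(json).leaks.length === 0;
}

// Pulls both version forms out of a line such as
// "Hermes Agent v0.14.0 (2026.5.16)"; returns null if the line does not match.
function extractHermesVersions(
  line: string,
): { semver: string; dated: string } | null {
  const match = /v(\d+\.\d+\.\d+)\s+\((\d{4}\.\d+\.\d+)\)/.exec(line);
  return match ? { semver: match[1], dated: match[2] } : null;
}
```

A caller would read phase-7-backup-credential-scan.json and phase-7-hermes-version-after-rebuild.stdout.txt from the downloaded run artifacts, pass their contents to these helpers, and compare the extracted version against the value expected by agents/hermes/manifest.yaml.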

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27973149846
Workflow ref: e2e-migrate-test-rebuild-hermes
Requested scenarios: (default — all supported)
Requested jobs: rebuild-hermes-vitest,rebuild-hermes-stale-base-vitest
Summary: 3 passed, 0 failed, 55 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
channels-stop-start-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-hermes-stale-base-vitest ✅ success
rebuild-hermes-vitest ✅ success
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
telegram-injection-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped
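
The summary line of a results comment can be cross-checked against its job rows. The sketch below is illustrative only: it assumes each row is plain text ending in one of the words `success`, `failure`, or `skipped`, as in the listing above, and the function name `tallyJobResults` is hypothetical.

```typescript
type JobResult = "success" | "failure" | "skipped";

// Counts rows by their trailing result word; rows that do not end in a
// recognised result (for example a header line) are ignored.
function tallyJobResults(rows: string[]): Record<JobResult, number> {
  const tally: Record<JobResult, number> = { success: 0, failure: 0, skipped: 0 };
  for (const row of rows) {
    const match = /(success|failure|skipped)\s*$/.exec(row.trim());
    if (match) {
      tally[match[1] as JobResult] += 1;
    }
  }
  return tally;
}

// Small sample taken from the listing above.
const sample = tallyJobResults([
  "Job Result",
  "generate-matrix ✅ success",
  "rebuild-hermes-stale-base-vitest ✅ success",
  "rebuild-hermes-vitest ✅ success",
  "telegram-injection-vitest ⏭️ skipped",
]);
console.log(sample);
```

Applied to all 58 rows of run 27973149846, the same count should agree with the reported summary of 3 passed, 0 failed, 55 skipped.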

@jyaunches

Contributor Author

PR review advisor follow-up disposition for refreshed head d5af7f1d8:

  • PRA-T1: rebuild-hermes-vitest passed in selective run 27973149846; artifact phase-7-hermes-version-after-rebuild.stdout.txt reports Hermes Agent v0.14.0 (2026.5.16), matching the current manifest expected version.
  • PRA-T2: rebuild-hermes-stale-base-vitest passed in selective run 27973149846; artifact phase-5-stale-base-note.txt records the stale base cache setup and the post-rebuild version artifact reports 2026.5.16.
  • PRA-T3: both Hermes rebuild lanes produced phase-7-backup-credential-scan.json with leaks: [].
  • PRA-T4/PRA-T5: the literal interactive repro and hermes rebuild alias are intentionally out of scope for this legacy non-interactive shell-lane migration; this is documented in the test header and in contract.json.outOfScope.
  • PRA-T6 through PRA-T8: the interactive onboard/modal/confirmation clauses are likewise out of scope for this conversion PR; the migrated lane validates the underlying rebuild behavior via installed nemoclaw <sandbox> rebuild --yes --verbose across real Docker/OpenShell/Hermes boundaries.

Current state: PR checks green, PR review advisor green, unresolved review threads 0, required Vitest E2E scenarios green on the refreshed head.

@jyaunches jyaunches merged commit 1bbc2e2 into main Jun 22, 2026
99 checks passed
@jyaunches jyaunches deleted the e2e-migrate-test-rebuild-hermes branch June 22, 2026 18:29

Labels

v0.0.66 Release target
