test(e2e): migrate test-rebuild-hermes.sh to vitest#5588
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds two new free-standing Vitest CI jobs ( ChangesHermes Rebuild E2E Vitest Integration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Code Coverage OverviewLanguages: TypeScript TypeScript / code-coverage/pluginThe overall coverage in the branch is 96%. Coverage data for the branch is not yet available. Show a code coverage summary of the most covered files.
TypeScript / code-coverage/cliThe overall coverage in the branch is 46%. Coverage data for the branch is not yet available. Show a code coverage summary of the most covered files.
Updated |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
Vitest E2E Scenario RecommendationRequired Vitest E2E scenarios: Dispatch required Vitest E2E scenarios:
Full Vitest E2E advisor summaryVitest E2E Scenario AdvisorBase: Required Vitest E2E scenarios
Optional Vitest E2E scenarios
Relevant changed files
|
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27958197827
|
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/e2e-scenario/live/rebuild-hermes.test.ts (2)
139-145: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueUnused
bestEfforthelper function.This function is defined but never called anywhere in the file. Consider removing it or prefixing with underscore if it's intentionally reserved for future use.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/rebuild-hermes.test.ts` around lines 139 - 145, The `bestEffort` function is defined but never called anywhere in the file, making it unused code. Either remove the entire `bestEffort` function definition entirely if it's not needed, or if it's intentionally reserved for future use, rename the function to start with an underscore (for example, `_bestEffort`) to indicate to other developers that it's deliberately unused.
208-220: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueConsider avoiding dynamic RegExp construction.
Static analysis flagged potential ReDoS. While
SANDBOX_NAMEis validated at line 38, usingString.includes()is safer and clearer here:♻️ Suggested simplification
async function waitForSandboxReady(host: HostCliClient, apiKey: string): Promise<void> { for (let attempt = 1; attempt <= 30; attempt += 1) { const list = await host.command("openshell", ["sandbox", "list"], { artifactName: `phase-3-sandbox-list-${attempt}`, env: testEnv(apiKey), redactionValues: [apiKey], timeoutMs: 30_000, }); - if (new RegExp(`${SANDBOX_NAME}.*Ready`).test(resultText(list))) return; + const output = resultText(list); + if (output.includes(SANDBOX_NAME) && output.includes("Ready")) return; await sleep(5_000); } throw new Error(`sandbox ${SANDBOX_NAME} did not become Ready`); }Note: The original regex ensures
Readyappears after the sandbox name on the same line. If that ordering matters, consider splitting output by lines and checking each line contains both.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/rebuild-hermes.test.ts` around lines 208 - 220, In the waitForSandboxReady function, replace the dynamic RegExp construction using new RegExp with the SANDBOX_NAME variable with a safer String.includes() check. Instead of using the regex pattern that tests for SANDBOX_NAME followed by Ready, check if the resultText(list) string contains both SANDBOX_NAME and the word Ready using includes() method calls. This eliminates the ReDoS vulnerability risk from dynamic regex construction while maintaining clarity and correctness.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/e2e-scenario/live/rebuild-hermes.test.ts`:
- Around line 139-145: The `bestEffort` function is defined but never called
anywhere in the file, making it unused code. Either remove the entire
`bestEffort` function definition entirely if it's not needed, or if it's
intentionally reserved for future use, rename the function to start with an
underscore (for example, `_bestEffort`) to indicate to other developers that
it's deliberately unused.
- Around line 208-220: In the waitForSandboxReady function, replace the dynamic
RegExp construction using new RegExp with the SANDBOX_NAME variable with a safer
String.includes() check. Instead of using the regex pattern that tests for
SANDBOX_NAME followed by Ready, check if the resultText(list) string contains
both SANDBOX_NAME and the word Ready using includes() method calls. This
eliminates the ReDoS vulnerability risk from dynamic regex construction while
maintaining clarity and correctness.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 734acc08-ee71-4836-89c8-4f909f9d282d
📒 Files selected for processing (4)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/live/rebuild-hermes.test.tstest/e2e-scenario/support-tests/e2e-scenarios-workflow.test.tstools/e2e-scenarios/workflow-boundary.mts
PR Review Advisor — No blocking findingsMerge posture: No blocking advisor findings Action checklist
Test follow-ups to resolve or justifyIf these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.
This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision. |
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27966185354
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27967279654
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27968173871
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27969318676
|
|
PR review advisor follow-up disposition for
Current state: PR checks green, PR review advisor green, required Vitest E2E scenarios green. |
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27973149846
|
|
PR review advisor follow-up disposition for refreshed head
Current state: PR checks green, PR review advisor green, unresolved review threads 0, required Vitest E2E scenarios green on the refreshed head. |
Summary
Migrate
test/e2e/test-rebuild-hermes.shwith the simplest equivalent live Vitest coverage.Related Issues
Refs #5098
Refs #3025
Contract mapping
nemoclaw <sandbox> rebuild --yes, then verify marker state, Hermes version upgrade, messaging config preservation, registry refresh, optional post-rebuild inference, and backup credential hygiene.test/e2e-scenario/live/rebuild-hermes.test.tsasserts the same flow through Vitest with the real installer/Docker/OpenShell/rebuild boundaries.bash install.sh, Docker base image builds/tags,openshell provider/sandboxcreate/exec/list, local~/.nemoclawregistry/session files,nemoclaw rebuild, Hermes runtimehermes --version, and backup leak scan.NEMOCLAW_HERMES_STALE_BASE_REBUILD_E2E=1leaves the cached Hermes base image stale so rebuild must refresh it (nemoclaw hermes rebuildfails to use user specified version #3025).rebuild-hermes-stale-base-vitestwithNEMOCLAW_HERMES_STALE_BASE_REBUILD_E2E=1.ghcr.io/nvidia/nemoclaw/hermes-sandbox-base:latestcache state and real rebuild refresh path.Simplicity check
nightly-e2e.yamljobsrebuild-hermes-e2eandrebuild-hermes-stale-base-e2e, reusablee2e-script.yaml, defaultruns-on: ubuntu-latest, Docker/OpenShell, hosted inference secret, 60m timeout.ubuntu-latest) in.github/workflows/e2e-vitest-scenarios.yaml; Docker/OpenShell/install/rebuild stay real.rebuild-hermes-vitestandrebuild-hermes-stale-base-vitest; legacy shell deletion/workflow retirement deferred to Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 Phase 11.e2e-vitest-scenarios.yamlwithjobs=rebuild-hermes-vitest,rebuild-hermes-stale-base-viteston this PR branch.Verification
npm ci --ignore-scriptsnpm run build:cliNEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/rebuild-hermes.test.ts --silent=false --reporter=default(local no-secret skip)npx vitest run --project cli test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.tsnpm run typecheck:clinpm run source-shape:checkgit diff --checkSummary by CodeRabbit