Add Gherkin acceptance E2E harness example #2887
Thanks @aboimpinto for taking the time to contribute. This repository is currently observing a maintainer-managed contribution gate in dry-run mode, so this pull request is staying open. When enforcement is enabled, pull requests from contributors who are not listed in Please read
Code Review
This pull request introduces Cucumber acceptance tests for directory listing and the public LLM/tool lifecycle in crates/tui, adding feature files and corresponding test runners. Feedback on the implementation focuses on improving the robustness of the test harness: resolving a potential deadlock in run_with_timeout by reading process output concurrently, recursively creating parent directories for workspace files to prevent write failures, and preserving terminal and locale environment variables when clearing the host environment.
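The three review points above can be illustrated with a minimal, std-only sketch. This is not the harness code from the PR: `RunOutput`, `write_workspace_file`, `clear_env_preserving`, and the signature of `run_with_timeout` shown here are assumptions made for illustration. The sketch drains stdout and stderr on separate threads so a child that writes a lot cannot block on a full pipe while the parent waits, creates parent directories before writing a workspace file, and re-adds selected variables after clearing the environment.

```rust
use std::fs;
use std::io::Read;
use std::path::Path;
use std::process::{Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Captured result of a child process (hypothetical type for this sketch).
pub struct RunOutput {
    pub status: ExitStatus,
    pub stdout: String,
    pub stderr: String,
}

/// Runs `cmd` with a deadline, reading both pipes concurrently so the child
/// cannot stall on a full pipe buffer while we poll for its exit.
pub fn run_with_timeout(mut cmd: Command, timeout: Duration) -> std::io::Result<RunOutput> {
    let mut child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    let mut out_pipe = child.stdout.take().expect("stdout was piped");
    let mut err_pipe = child.stderr.take().expect("stderr was piped");

    // One reader thread per pipe; each returns whatever bytes it collected.
    let out_reader = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = out_pipe.read_to_end(&mut buf);
        buf
    });
    let err_reader = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = err_pipe.read_to_end(&mut buf);
        buf
    });

    let deadline = Instant::now() + timeout;
    let status = loop {
        if let Some(status) = child.try_wait()? {
            break status;
        }
        if Instant::now() >= deadline {
            // Best effort: the child may have exited between the two checks.
            let _ = child.kill();
            break child.wait()?;
        }
        thread::sleep(Duration::from_millis(10));
    };

    let stdout = String::from_utf8_lossy(&out_reader.join().expect("stdout reader panicked")).into_owned();
    let stderr = String::from_utf8_lossy(&err_reader.join().expect("stderr reader panicked")).into_owned();
    Ok(RunOutput { status, stdout, stderr })
}

/// Writes `contents` to `root/relative`, creating missing parent directories first.
pub fn write_workspace_file(root: &Path, relative: &str, contents: &str) -> std::io::Result<()> {
    let path = root.join(relative);
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(path, contents)
}

/// Clears the child's environment, then copies back the listed variables
/// (for example terminal and locale settings) when they are set on the host.
pub fn clear_env_preserving(cmd: &mut Command, keep: &[&str]) {
    cmd.env_clear();
    for key in keep {
        if let Some(value) = std::env::var_os(key) {
            cmd.env(key, value);
        }
    }
}
```

One limitation of this sketch: if the child spawns grandchildren that inherit the pipes, the reader threads only finish once those processes close them, so a stricter harness may need to kill the whole process group.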
Summary
Refs #2886 and #2791. Reference branch/PR: #2851.
This PR adds the first Gherkin-style acceptance E2E example for the command/tool lifecycle work. It is intentionally a small layer: it does not refactor command structure. It adds an executable acceptance harness that can describe owner-level behavior in `Given / When / Then` language and then verify the first slice through public process and mocked-provider borders.
Why this layer
The command-strategy work is being split into smaller PRs. Before moving command ownership/routing code, we need behavior-level tests that describe the full user-visible flow. These tests complement the existing unit and narrower integration tests:
What changed
- Added `cucumber` as a TUI dev-dependency.
- Added `directory_listing_acceptance.rs` with a simple Gherkin feature for the current directory-listing happy path.
- Added `tool_lifecycle_acceptance.rs` with a fuller public-border lifecycle scenario.
- Added feature files under `crates/tui/tests/features/`.
Step-by-step behavior asserted
The main lifecycle scenario is written as:
The executable step definitions assert this through these components:
- Mocked LLM endpoints `/v1/models` and `/v1/chat/completions`.
- The `codewhale-tui exec --auto --output-format stream-json` binary.
- A `list_dir` tool call with `{"path":"."}`.
- The `tool_use` event, enforcing the running marker contract `[~]` from the scenario table.
- The `list_dir` result includes `README.md`, `notes.txt`, and `src` with file/folder metadata.
- A `tool` message sent back to the mocked LLM with the expected `tool_call_id` and directory entries.
Statusline and BlueWhale note
This PR does not claim to verify the interactive screen yet. The current executable slice uses the cross-platform `exec` stream, so it cannot honestly assert the rendered Statusline or the moving BlueWhale in the top-right UI.
The feature file includes a note for the next PTY/frame-capture layer. That layer should drive the real TUI and assert:
- `[~] list_dir .`
- `[✓] list_dir .`
Validation
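As a rough illustration of the marker contract being validated, the sketch below compares a marker taken from the scenario table against the marker at the start of an observed line. The helper names `leading_marker` and `check_marker` are hypothetical, and the line format `[~] list_dir .` is assumed from the feature-file note rather than taken from the real step definitions.

```rust
/// Returns the leading bracketed marker of a line, if any.
/// Assumed input format: "[~] list_dir ." (marker, tool name, argument).
fn leading_marker(line: &str) -> Option<&str> {
    if !line.starts_with('[') {
        return None;
    }
    // ']' is a single-byte ASCII character, so slicing through it is safe
    // even when the marker itself contains non-ASCII symbols such as '✓'.
    let end = line.find(']')?;
    Some(&line[..=end])
}

/// Compares the marker from the scenario table with the observed line.
/// The error text mirrors the left/right layout of an `assert_eq!` failure.
fn check_marker(scenario_marker: &str, observed_line: &str) -> Result<(), String> {
    let observed = leading_marker(observed_line)
        .ok_or_else(|| format!("no marker in {observed_line:?}"))?;
    if scenario_marker == observed {
        Ok(())
    } else {
        Err(format!("left: {scenario_marker:?}, right: {observed:?}"))
    }
}
```

With this shape, mutating the scenario marker produces a mismatch whose left side is the mutated scenario value and whose right side is the value actually observed.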
I also temporarily mutated the scenario markers to prove the contract fails as expected:
- `[~]` changed to `[x]` failed with `left: "[x]"`, `right: "[~]"`
- `✓` changed to `X` failed with `left: "X"`, `right: "✓"`
Paulo Aboim Pinto