ci: add bench-baseline workflow_dispatch (C-g step 3 followup) #89

Merged
chaploud merged 1 commit into main from develop/cg-bench-baseline-workflow
Apr 29, 2026

@chaploud

Summary

  • Closes the last open piece of Plan C-g. The C-g schema change (#86, "bench(c-g): make history.yaml multi-arch") and the 3-OS bench matrix (#88, "ci(c-g step 5): flip benchmark job to a 3-OS matrix") made bench/history.yaml natively multi-arch and made the on-PR regression check run on Linux, macOS, and Windows simultaneously. The remaining gap was that the user has no measurement-grade local hardware for native x86_64-linux or x86_64-windows. The OrbStack rows in history.yaml are Rosetta-translated, so they are useful for schema validation but not for absolute-time tracking.
  • A new manually triggered workflow runs scripts/record-merge-bench.sh on a GitHub-hosted runner of the requested OS and commits the resulting row directly to main, using the same naming convention as the local Mac per-merge bench.

Usage

gh workflow run bench-baseline.yml \
  -R clojurewasm/zwasm \
  -f os=ubuntu-latest

-f os= accepts ubuntu-latest, macos-latest, or windows-latest. The optional -f reason="..." overrides the bench-row reason, which defaults to the HEAD commit subject.
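
For scripted use, the invocation above could be wrapped in a small helper that rejects an os value outside the three choices before gh is called. This is only a sketch: the function name bench_baseline_cmd is hypothetical and not part of this PR, and it prints the command instead of running it. Only the workflow file name, the repository, and the two -f inputs come from the description above.

```shell
# Hypothetical helper: build (but do not run) the gh invocation for one OS.
# The quoting of the reason is naive and meant for display only.
bench_baseline_cmd() {
  os="$1"
  reason="${2:-}"
  # Reject anything that is not one of the three documented choices.
  case "$os" in
    ubuntu-latest|macos-latest|windows-latest) ;;
    *) echo "unsupported os: $os" >&2; return 1 ;;
  esac
  cmd="gh workflow run bench-baseline.yml -R clojurewasm/zwasm -f os=$os"
  # Append the optional reason override only when one was given.
  if [ -n "$reason" ]; then
    cmd="$cmd -f reason=\"$reason\""
  fi
  printf '%s\n' "$cmd"
}
```

Calling bench_baseline_cmd windows-latest "native baseline" prints the full command line for review before it is run by hand.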

Implementation notes

  • Linux and macOS are provisioned via the nix devshell (hyperfine, yq, zig, and wasi-sdk all come from the same pinned versions that test-nix uses).
  • Windows uses install-tools.ps1 -OnlyTool zig and -OnlyTool hyperfine (the two pieces the bench harness needs) plus a one-shot yq_windows_amd64.exe download. yq is not on install-tools.ps1's pinned list, and its only other consumer is bench/record.sh, which runs locally on hosts where the user already has yq via nix.
  • The push is retried once on collision, so a concurrent local Mac per-merge bench commit does not cause the new row to be lost.
  • permissions: contents: write so the default GITHUB_TOKEN can push to main.
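
The retry-on-collision behaviour is not spelled out in this description. A minimal sketch, assuming "retry" means rebasing the freshly committed row onto the updated main and pushing once more, might look like the following. The function name push_with_one_retry is hypothetical, and the actual workflow step may differ.

```shell
# Hypothetical sketch of "one retry on push collision".
# Assumes the new bench row is already committed on the local branch.
push_with_one_retry() {
  branch="$1"
  # First attempt: succeeds unless the remote branch moved since checkout.
  if git push origin "HEAD:$branch"; then
    return 0
  fi
  # Collision: replay the local commit on top of the new tip, then push once more.
  git pull --rebase origin "$branch" || return 1
  git push origin "HEAD:$branch"
}
```

In a workflow step this would be called as push_with_one_retry main. If both commits touch the same region of bench/history.yaml, the rebase itself can conflict, in which case this sketch simply fails rather than trying to resolve it.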

Test plan

  • Merge, then run gh workflow run bench-baseline.yml -f os=ubuntu-latest and verify that the resulting commit, "Record x86_64-linux bench baseline for <subject> (workflow_dispatch)", lands on main with one new history.yaml entry tagged arch: x86_64-linux.
  • Same with os=windows-latest once an opportunity to verify presents itself.
  • CI green on this PR (the change is a YAML-only workflow file, but GitHub Actions parses it on push).
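
The commit subject expected in the first test-plan item can be illustrated with a short sketch. It assumes the OS-to-arch mapping implied by this PR (ubuntu-latest produces x86_64-linux, and the other two runners produce x86_64-windows and aarch64-darwin). Both function names are hypothetical; the real logic lives in the workflow and scripts/record-merge-bench.sh, neither of which is shown here.

```shell
# Hypothetical mapping from the workflow's os input to the arch tag.
arch_suffix_for() {
  case "$1" in
    ubuntu-latest)  echo "x86_64-linux" ;;
    windows-latest) echo "x86_64-windows" ;;
    macos-latest)   echo "aarch64-darwin" ;;  # assumes an Apple Silicon runner
    *) echo "unknown os: $1" >&2; return 1 ;;
  esac
}

# Build the commit subject for a given os input and merge subject.
baseline_commit_subject() {
  arch="$(arch_suffix_for "$1")" || return 1
  printf 'Record %s bench baseline for %s (workflow_dispatch)\n' "$arch" "$2"
}
```

When no reason override is given, the second argument would be the HEAD commit subject, which git log -1 --format=%s prints.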

Closes the last open piece of Plan C-g — collecting native
x86_64-linux / x86_64-windows / aarch64-darwin bench rows for
merges where the user does not have measurement-grade local
hardware (the OrbStack `my-ubuntu-amd64` VM in particular is
Rosetta-translated, so its x86_64-linux rows in
`bench/history.yaml` are schema-shakedown only — not a true
native baseline).

`.github/workflows/bench-baseline.yml`: workflow_dispatch with
`os` (choice of ubuntu-latest / macos-latest / windows-latest)
and an optional `reason` override. Linux/macOS provision via nix
devshell (so hyperfine / yq / zig come from the same pinned
versions test-nix uses); Windows uses
`install-tools.ps1 -OnlyTool zig` + `-OnlyTool hyperfine` plus a
one-shot yq.exe download (yq is not on the install-tools.ps1
list of pinned tools; the only consumer outside this workflow is
`bench/record.sh` and that runs locally where the user already
has yq through nix). The workflow runs
`scripts/record-merge-bench.sh` and commits the new row directly
to main with subject
`Record <arch_suffix> bench baseline for <subject>
(workflow_dispatch)`. One retry on push collision so a
concurrent local Mac per-merge-bench commit does not lose the
new row.

`.claude/CLAUDE.md` Merge-Gate item 10 paragraph reworded:
the on-PR ci_compare regression check is now spelled as
3-OS (not Ubuntu-only) and a forward pointer to the new workflow
is added.

Why workflow_dispatch and not on-push: bench runs cost ~5–7 min
and the Mac aarch64-darwin row is already recorded locally on
every merge; this workflow is for the platforms the user can't
record locally with confidence. Manual trigger lets the user
pick which merge SHAs deserve a native baseline rather than
recording every one.

chaploud merged commit 3be1767 into main Apr 29, 2026
10 checks passed
chaploud deleted the develop/cg-bench-baseline-workflow branch April 29, 2026 13:07
chaploud added a commit that referenced this pull request Apr 29, 2026
github-actions Bot added a commit that referenced this pull request Apr 29, 2026
…ep 3 followup): add bench-baseline workflow_dispatch (#89) (workflow_dispatch)