Add downstream performance workflow by Stevengre · Pull Request #4141 · runtimeverification/haskell-backend

Stevengre · 2026-03-12T03:08:46Z

Summary

- add a standalone downstream-perf workflow for KEVM and Kontrol downstream checks
- run master and current for each suite as separate jobs, then compare them before publishing results
- publish one upserted PR comment driven by suite-level compare outputs

Goal

- automate a workflow that was previously manual
- make the execution model explicit: raw runs first, suite compare second, PR reporting last
- keep expensive downstream jobs off most PRs while still self-testing workflow changes
- preserve diagnostics and downloadable artifacts even when one side fails or times out

Security and Reliability

- workflow-level permissions stay at `contents: read`
- PR-comment write permissions are scoped to the final report job only
- checkout in non-report jobs uses `persist-credentials: false`
- suite collection and artifact upload run with `always()` so failure diagnostics are still preserved

Triggering

workflow_dispatch always enables the selected suites
for pull_request, downstream jobs run only when at least one of the following is true:
- the PR has the perf label
- the PR changes a configured downstream-perf path:
  - booster/**
  - kore/**
  - kore-rpc-types/**
  - flake.nix
  - deps/**
  - scripts/performance-tests-kevm.sh
  - scripts/performance-tests-kontrol.sh
  - scripts/compare.py
  - scripts/collect-downstream-perf-results.sh
  - scripts/downstream-perf-lib.sh
  - .github/actions/downstream-perf-suite/**
  - .github/actionlint.yaml
  - .github/workflows/downstream-perf.yml
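The trigger rule above can be sketched as a small predicate. This is only an illustration, not the workflow's actual implementation: the function name `should_run_downstream`, its arguments, the treatment of other event types, and the use of `fnmatchcase` to approximate GitHub's path-filter globs are all assumptions. Only the path list is copied from the description.

```python
from fnmatch import fnmatchcase

# Path patterns copied from the PR description.
PERF_PATHS = [
    "booster/**",
    "kore/**",
    "kore-rpc-types/**",
    "flake.nix",
    "deps/**",
    "scripts/performance-tests-kevm.sh",
    "scripts/performance-tests-kontrol.sh",
    "scripts/compare.py",
    "scripts/collect-downstream-perf-results.sh",
    "scripts/downstream-perf-lib.sh",
    ".github/actions/downstream-perf-suite/**",
    ".github/actionlint.yaml",
    ".github/workflows/downstream-perf.yml",
]


def should_run_downstream(event, labels, changed_files):
    """Hypothetical helper: decide whether the downstream jobs should run."""
    if event == "workflow_dispatch":
        # Manual dispatch always enables the selected suites.
        return True
    if event != "pull_request":
        # Assumption: the description only covers the two events above.
        return False
    if "perf" in labels:
        return True
    # fnmatchcase lets "*" match "/" too, which approximates "dir/**" here.
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in PERF_PATHS
    )
```

For example, a pull request that only touches `docs/README.md` and has no `perf` label would be skipped under this sketch, while one that touches `booster/library/Foo.hs` (a made-up path) would run.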

Confirmed Design

The intended workflow structure is:

1. Run four raw downstream jobs in parallel:
   - KEVM master
   - KEVM current
   - Kontrol master
   - Kontrol current
2. Run two suite compare jobs:
   - KEVM compare, which waits for KEVM master and KEVM current
   - Kontrol compare, which waits for Kontrol master and Kontrol current
3. Run one final reporting job:
   - Report PR comment, which waits for both compare jobs

The compare jobs are responsible for:

- consuming the raw results from master and current
- producing the suite-level comparison output
- producing the final suite artifact bundle that users download

The report job is responsible for:

- reading suite compare outputs
- writing a single PR comment
- linking to the final suite artifacts

Workflow

flowchart TD
    T[Trigger] --> S[Select]

    S --> KM[KEVM master]
    S --> KC[KEVM current]
    S --> QM[Kontrol master]
    S --> QC[Kontrol current]

    KM --> KX[KEVM compare]
    KC --> KX

    QM --> QX[Kontrol compare]
    QC --> QX

    KX --> R[Report PR comment]
    QX --> R
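
The dependency graph in the flowchart can be checked with a short sketch that groups jobs into execution stages. The job identifiers below are hypothetical stand-ins for the nodes in the chart (the real job ids in the workflow file may differ), and the staging logic uses Python's standard `graphlib` module rather than anything from this PR.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs it waits for (hypothetical job ids
# mirroring the flowchart nodes S, KM, KC, QM, QC, KX, QX, R).
NEEDS = {
    "select": set(),
    "kevm-master": {"select"},
    "kevm-current": {"select"},
    "kontrol-master": {"select"},
    "kontrol-current": {"select"},
    "kevm-compare": {"kevm-master", "kevm-current"},
    "kontrol-compare": {"kontrol-master", "kontrol-current"},
    "report": {"kevm-compare", "kontrol-compare"},
}


def stages(needs):
    """Group jobs into waves that could run in parallel."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Every job whose prerequisites are all done is ready now.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for index, wave in enumerate(stages(NEEDS)):
        print(index, wave)
```

Under these assumptions the graph resolves to four waves: selection, the four raw runs, the two suite compares, and the final report.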

Artifact Model

- raw run jobs produce suite-and-branch specific intermediate results
- compare jobs consume those raw results and emit the final downloadable suite artifact
- the final PR comment is derived from compare outputs, not from ad hoc status reconstruction

In other words, the intended data flow is:

master/current raw results -> suite compare -> final suite artifact + PR comment

Reporting Expectations

- checks should be named by suite and branch role, not by internal shard numbering
- suite comparison should be presented as current versus master
- the PR comment should contain the suite-level summary plus direct links to the final suite artifacts
- detailed logs, charts, top deltas, and raw summaries should live inside the final suite artifacts
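
The "single upserted comment" expectation can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: the hidden `MARKER` string, the dictionary shapes, and the function names are hypothetical, and no GitHub API is called. The sketch only shows the idea of finding a previous report by a marker and replacing it instead of posting a second one.

```python
# Hypothetical hidden marker used to recognize the report comment.
MARKER = "<!-- downstream-perf-report -->"


def render_comment(suites):
    """Build the comment body from suite-level compare outputs."""
    lines = [MARKER, "Downstream Performance", ""]
    for suite in suites:
        lines.append(suite["name"])
        lines.append(f"Artifacts: {suite['artifact_url']}")
        lines.append(suite["summary"])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def upsert_comment(comments, body):
    """Replace the existing report comment, or append one if none exists."""
    for comment in comments:
        if MARKER in comment["body"]:
            comment["body"] = body
            return comments
    comments.append({"body": body})
    return comments


if __name__ == "__main__":
    thread = [{"body": "unrelated review comment"}]
    report = render_comment([
        {
            "name": "KEVM",
            "artifact_url": "https://example.invalid/artifacts/kevm",
            "summary": "No KEVM final summary was produced.",
        }
    ])
    upsert_comment(thread, report)
    upsert_comment(thread, report)  # second run updates in place
    print(len(thread))
```

Keeping the detailed logs and charts in the artifacts, and only the summary plus links in the comment body, keeps the comment short enough to rewrite on every run.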

Current Policy Direction

- baseline target is current origin/master
- master and current runs use the same timeout setting within a suite
- compare happens only after both sides finish
- if one side fails or times out, the compare/report path should still surface that state with preserved diagnostics
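
A minimal sketch of the last two policy points follows, assuming a hypothetical `RawResult` record and `compare_suite` function. It is not the behavior of `scripts/compare.py`, which this description does not show. The point is only that a failed or timed-out side yields "no comparison" while both sides' states remain available to the report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawResult:
    """Hypothetical shape of one raw run's outcome."""
    side: str                      # "master" or "current"
    status: str                    # e.g. "success", "failure", "timeout"
    duration_s: Optional[float]    # wall-clock seconds, if known


def compare_suite(master, current):
    """Compare current versus master, preserving both sides' state."""
    sides = {
        r.side: {"status": r.status, "duration_s": r.duration_s}
        for r in (master, current)
    }
    comparable = (
        master.status == "success"
        and current.status == "success"
        and master.duration_s is not None
        and current.duration_s is not None
        and master.duration_s > 0
    )
    if not comparable:
        # No compare output, but the per-side diagnostics are kept.
        return {"compared": False, "delta_s": None, "relative": None, "sides": sides}
    delta = current.duration_s - master.duration_s
    return {
        "compared": True,
        "delta_s": delta,
        "relative": delta / master.duration_s,
        "sides": sides,
    }
```

Feeding it two failed sides, as in the Kontrol report on this PR (master failure at 410 s, current failure at 416 s), gives `compared: False` with both statuses and durations still present.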

Testing

- local shell syntax checks passed for downstream scripts (`bash -n`)
- local helper regression test passed (`bash scripts/test-downstream-perf-lib.sh`)
- live GitHub Actions runs on this PR are being used to validate the workflow shape and reporting behavior

github-actions · 2026-03-12T03:10:20Z

Downstream Performance

Trigger: perf-label
Target PR: Add downstream performance workflow #4141
Workflow run: https://github.com/runtimeverification/haskell-backend/actions/runs/23040085895

KEVM

Artifacts: https://github.com/runtimeverification/haskell-backend/actions/runs/23040085895#artifacts

No KEVM final summary was produced.

Kontrol

Artifacts: https://github.com/runtimeverification/haskell-backend/actions/runs/23040085895/artifacts/5906670814

| Side | Status | Duration (s) | Head commit |
| --- | --- | --- | --- |
| master | failure | 410 | `463a830` |
| current | failure | 416 | `463a830` |

Compare file: not generated

ci(actions): add downstream perf workflow

41bf365

Stevengre added perf and removed perf labels Mar 12, 2026

Stevengre self-assigned this Mar 12, 2026

Stevengre marked this pull request as draft March 12, 2026 03:19

Stevengre marked this pull request as ready for review March 12, 2026 03:19

Stevengre marked this pull request as draft March 12, 2026 03:21

Stevengre added 21 commits March 12, 2026 11:23

ci(actions): skip perf report comment on cancelled runs

acacc32

ci(actions): install uv for downstream perf jobs

d2bae31

ci(actions): pin downstream perf jobs to normal runners

717db1e

ci(actions): serialize downstream perf suites

3c73c82

ci(actions): restore normal runner label for downstream perf

76125d2

ci(actions): harden downstream perf workflow

513ef67

ci(actions): run downstream suites in parallel

e3650a5

ci(actions): write downstream manifests to workspace

73f2c97

ci(actions): fix downstream perf helper contract

917f131

ci(actions): inject plugin build toolchain for downstream suites

0c10dd4

ci(actions): pin downstream plugin cmake below v4

92faec5

ci(actions): install standalone cmake 3.x for suites

a52868b

ci(actions): include openssl and gmp in plugin build shell

aa906f2

ci(actions): inject k and clang bins without nested nix shell

2922ab7

ci(actions): export openssl and gmp hints for plugin build

52a40e0

ci(actions): pass openssl and gmp flags via libff cmake args

28dc420

ci(actions): export downstream toolchain path in suite shells

edf66e2

ci(actions): use downstream k release for injected toolchain

69b14af

ci(actions): pin injected clang toolchain to clang 14

0375231

ci(actions): classify downstream failures against baseline

a9e7c13

ci(actions): enforce downstream budget with command timeouts

e653695

Stevengre added 4 commits March 13, 2026 07:34

ci(actions): classify downstream budget timeouts against baseline

e5f5be7

Shard KEVM downstream perf and simplify timeout reporting

dcdddcd

Align downstream flow to master/current then compare

5a3e3a0

Rename KEVM shard jobs to proof targets

b93151b

Stevengre changed the title ~~Add downstream performance workflow~~ [EXPERIMENTAL] Add downstream performance workflow Mar 13, 2026

refactor(ci): split downstream perf into raw-compare-report stages

85e2d70

Stevengre changed the title ~~[EXPERIMENTAL] Add downstream performance workflow~~ Add downstream performance workflow Mar 13, 2026

fix(ci): keep local action while running master raw jobs

463a830

Stevengre wants to merge 28 commits into master from jh/downstream-perf-workflow
