Add downstream performance workflow #4141

Draft
Stevengre wants to merge 28 commits into master from jh/downstream-perf-workflow

Stevengre commented Mar 12, 2026

Summary

  • add a standalone downstream-perf workflow for KEVM and Kontrol downstream checks
  • run master and current for each suite as separate jobs, then compare them before publishing results
  • publish one upserted PR comment driven by suite-level compare outputs
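One common way to get a single upserted comment is to tag the comment body with a hidden marker and update the existing comment when the marker is found. The sketch below only shows that decision; the marker string, the `Comment` type, and `plan_upsert` are hypothetical names for illustration and are not taken from this PR.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical hidden marker used to recognise the workflow's own comment.
MARKER = "<!-- downstream-perf-report -->"


@dataclass
class Comment:
    id: int
    body: str


def plan_upsert(existing: list[Comment], new_body: str) -> tuple[str, Optional[int], str]:
    """Decide whether to update an existing report comment or create a new one.

    Returns ("update", comment_id, body) when a comment carrying MARKER
    already exists, otherwise ("create", None, body).
    """
    body = f"{MARKER}\n{new_body}"
    for comment in existing:
        if MARKER in comment.body:
            return ("update", comment.id, body)
    return ("create", None, body)
```

With this shape, repeated runs on the same PR rewrite one comment instead of appending a new one per run.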

Goal

  • automate a workflow that was previously manual
  • make the execution model explicit: raw runs first, suite compare second, PR reporting last
  • keep expensive downstream jobs off most PRs while still self-testing workflow changes
  • preserve diagnostics and downloadable artifacts even when one side fails or times out

Security and Reliability

  • workflow-level permissions stay at contents: read
  • PR-comment write permissions are scoped to the final report job only
  • checkout in non-report jobs uses persist-credentials: false
  • suite collection and artifact upload run with always() so failure diagnostics are still preserved

Triggering

  • workflow_dispatch always enables the selected suites
  • for pull_request, downstream jobs run only when at least one of the following is true:
    • the PR has the perf label
    • the PR changes a configured downstream-perf path:
      • booster/**
      • kore/**
      • kore-rpc-types/**
      • flake.nix
      • deps/**
      • scripts/performance-tests-kevm.sh
      • scripts/performance-tests-kontrol.sh
      • scripts/compare.py
      • scripts/collect-downstream-perf-results.sh
      • scripts/downstream-perf-lib.sh
      • .github/actions/downstream-perf-suite/**
      • .github/actionlint.yaml
      • .github/workflows/downstream-perf.yml
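The trigger rules above can be sketched as a small predicate. This is an illustration only, not the workflow's actual selection logic: `should_run` is a hypothetical helper, suite selection for workflow_dispatch is not modelled, returning False for other events is an assumption, and `fnmatchcase` only approximates GitHub's `**` glob semantics (its `*` also matches `/`).

```python
from fnmatch import fnmatchcase

PERF_LABEL = "perf"

# Paths listed in the PR description as downstream-perf relevant.
PERF_PATHS = [
    "booster/**",
    "kore/**",
    "kore-rpc-types/**",
    "flake.nix",
    "deps/**",
    "scripts/performance-tests-kevm.sh",
    "scripts/performance-tests-kontrol.sh",
    "scripts/compare.py",
    "scripts/collect-downstream-perf-results.sh",
    "scripts/downstream-perf-lib.sh",
    ".github/actions/downstream-perf-suite/**",
    ".github/actionlint.yaml",
    ".github/workflows/downstream-perf.yml",
]


def should_run(event: str, labels: set[str], changed_files: list[str]) -> bool:
    """Return True when the downstream jobs should be enabled."""
    if event == "workflow_dispatch":
        return True
    if event != "pull_request":
        # Assumption: other events never enable the downstream jobs.
        return False
    if PERF_LABEL in labels:
        return True
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in PERF_PATHS
    )
```

A PR touching only documentation and carrying no perf label would therefore skip the expensive jobs, while a change to the workflow file itself would self-test it.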

Confirmed Design

The intended workflow structure is:

  1. Run four raw downstream jobs in parallel:
    • KEVM master
    • KEVM current
    • Kontrol master
    • Kontrol current
  2. Run two suite compare jobs:
    • KEVM compare, which waits for KEVM master and KEVM current
    • Kontrol compare, which waits for Kontrol master and Kontrol current
  3. Run one final reporting job:
    • Report PR comment, which waits for both compare jobs

The compare jobs are responsible for:

  • consuming the raw results from master and current
  • producing the suite-level comparison output
  • producing the final suite artifact bundle that users download

The report job is responsible for:

  • reading suite compare outputs
  • writing a single PR comment
  • linking to the final suite artifacts
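The three-stage structure can be checked by modelling the job dependencies and grouping jobs into waves that may run in parallel. The job ids below are hypothetical stand-ins for whatever the workflow file actually uses; the sketch relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Map each (hypothetical) job id to the jobs it waits for.
NEEDS = {
    "kevm-master": set(),
    "kevm-current": set(),
    "kontrol-master": set(),
    "kontrol-current": set(),
    "kevm-compare": {"kevm-master", "kevm-current"},
    "kontrol-compare": {"kontrol-master", "kontrol-current"},
    "report": {"kevm-compare", "kontrol-compare"},
}


def stages(needs: dict[str, set[str]]) -> list[list[str]]:
    """Group jobs into waves; every job in a wave has all dependencies done."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the graph above, `stages(NEEDS)` yields three waves: the four raw runs, the two compare jobs, and the report job.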

Workflow

flowchart TD
    T[Trigger] --> S[Select]

    S --> KM[KEVM master]
    S --> KC[KEVM current]
    S --> QM[Kontrol master]
    S --> QC[Kontrol current]

    KM --> KX[KEVM compare]
    KC --> KX

    QM --> QX[Kontrol compare]
    QC --> QX

    KX --> R[Report PR comment]
    QX --> R

Artifact Model

  • raw run jobs produce suite-and-branch specific intermediate results
  • compare jobs consume those raw results and emit the final downloadable suite artifact
  • the final PR comment is derived from compare outputs, not from ad hoc status reconstruction

In other words, the intended data flow is:

master/current raw results -> suite compare -> final suite artifact + PR comment

Reporting Expectations

  • checks should be named by suite and branch role, not by internal shard numbering
  • suite comparison should be presented as current versus master
  • the PR comment should contain the suite-level summary plus direct links to the final suite artifacts
  • detailed logs, charts, top deltas, and raw summaries should live inside the final suite artifacts

Current Policy Direction

  • baseline target is the latest origin/master
  • master and current runs use the same timeout setting within a suite
  • compare happens only after both sides finish
  • if one side fails or times out, the compare/report path should still surface that state with preserved diagnostics
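A minimal sketch of a compare step that follows this policy is shown below: it computes a delta only when both sides succeeded and otherwise still returns the per-side status so the report can surface it. The `SideResult` type, the status strings, and the output keys are assumptions made for illustration; the real scripts/compare.py may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SideResult:
    role: str                    # "master" or "current"
    status: str                  # assumed values: "success", "failure", "timeout"
    duration_s: Optional[float]  # wall-clock seconds, if recorded


def compare_suite(master: SideResult, current: SideResult) -> dict:
    """Compare current versus master, keeping failure state visible."""
    sides = {"master": master.status, "current": current.status}
    both_ok = master.status == "success" and current.status == "success"
    if both_ok and master.duration_s and current.duration_s is not None:
        # Positive delta means current is slower than master.
        delta = (current.duration_s - master.duration_s) / master.duration_s * 100
        return {"comparable": True, "sides": sides, "delta_pct": round(delta, 2)}
    # One side failed, timed out, or lacks timing data: report state only.
    return {"comparable": False, "sides": sides, "delta_pct": None}
```

Returning a result in both branches means the report job never has to reconstruct status from job conclusions on its own.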

Testing

  • local shell syntax checks passed for downstream scripts (bash -n)
  • local helper regression test passed (bash scripts/test-downstream-perf-lib.sh)
  • live GitHub Actions runs on this PR are being used to validate the workflow shape and reporting behavior

@Stevengre Stevengre added perf and removed perf labels Mar 12, 2026

github-actions bot commented Mar 12, 2026

Downstream Performance

KEVM

No KEVM final summary was produced.

Kontrol

| Side    | Status  | Duration (s) | Head commit |
|---------|---------|--------------|-------------|
| master  | failure | 410          | 463a830     |
| current | failure | 416          | 463a830     |
  • Compare file: not generated

@Stevengre Stevengre self-assigned this Mar 12, 2026
@Stevengre Stevengre marked this pull request as draft March 12, 2026 03:19
@Stevengre Stevengre marked this pull request as ready for review March 12, 2026 03:19
@Stevengre Stevengre marked this pull request as draft March 12, 2026 03:21
@Stevengre Stevengre changed the title Add downstream performance workflow [EXPERIMENTAL] Add downstream performance workflow Mar 13, 2026
@Stevengre Stevengre changed the title [EXPERIMENTAL] Add downstream performance workflow Add downstream performance workflow Mar 13, 2026