[codex] Add Codex pace headroom hint #1789

Merged
steipete merged 8 commits into steipete:main from astuteprogrammer:codexbar-pace-headroom
Jul 3, 2026

Conversation

@astuteprogrammer astuteprogrammer commented Jun 30, 2026

Contributor

Owner decision — product-tone hold

Decision requested: Should the menu proactively show the user "Try 1.5x!" when the conservative headroom model says their remaining quota can support it, and should the minimum historical-week threshold be user-configurable in Advanced settings?

Tradeoff: The wording turns safe headroom into an immediate, useful action and the setting exposes a real confidence threshold. The copy can feel prescriptive or celebratory in a utility menu, while the setting adds another expert-facing control.

Recommendation: Land as written with the conservative gate and current copy for one release. Keep this PR unmerged only for the owner's tone/settings acceptance and exact-head CI; if the tone is declined, change the copy to 1.5x headroom without changing the model.

Summary

  • Show Try 1.5x! when Codex usage has more than 15% reserve and remaining quota supports at least 1.5x the projected remaining burn.
  • Add an Advanced setting for the minimum number of complete Codex weeks used to enable historical pace, defaulting to 3 and allowing 2-8.
  • Keep Ollama cookie diagnostics out of this PR so the pace/settings decision stays independently reviewable.

Behavior and tradeoff

The hint appears only when the quota is projected to last until reset, reserve exceeds 15%, and the safe remaining-burn multiplier is at least 1.5. Linear and workday pace use remaining capacity versus usage projected across the remaining window; historical pace uses remaining capacity versus the remaining expected historical curve.
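The gate described above can be sketched for the linear case as follows. This is a minimal illustration, not the PR's code: `PaceWindow`, `showsHeadroomHint`, and the parameter names are hypothetical, and "reserve" is assumed here to mean expected usage minus actual usage, which the source does not spell out. Only the name `speedMultiplierToReset` and the 15% and 1.5 thresholds come from this thread.

```swift
/// Hypothetical inputs for a linear pace window (names are illustrative, not CodexBar's API).
struct PaceWindow {
    var usedPercent: Double      // 0...100, share of quota already consumed
    var elapsedFraction: Double  // 0...1, share of the reset window already elapsed
}

/// Safe remaining-burn multiplier: remaining capacity divided by the usage
/// projected across the rest of the window at the rate observed so far.
func speedMultiplierToReset(_ w: PaceWindow) -> Double? {
    guard w.elapsedFraction > 0, w.elapsedFraction < 1,
          w.usedPercent > 0, w.usedPercent < 100 else { return nil }
    let remainingCapacity = 100 - w.usedPercent
    let projectedRemainingUsage = w.usedPercent / w.elapsedFraction * (1 - w.elapsedFraction)
    return remainingCapacity / projectedRemainingUsage
}

/// Applies the three conditions: quota lasts until reset, reserve above 15%, multiplier at least 1.5.
func showsHeadroomHint(_ w: PaceWindow,
                       minimumReservePercent: Double = 15,
                       minimumMultiplier: Double = 1.5) -> Bool {
    guard let multiplier = speedMultiplierToReset(w) else { return false }
    // Assumption: the quota "lasts until reset" when the linear projection stays within 100%.
    let projectedTotal = w.usedPercent / w.elapsedFraction
    // Assumption: reserve is expected usage (linear) minus actual usage.
    let reservePercent = w.elapsedFraction * 100 - w.usedPercent
    return projectedTotal <= 100
        && reservePercent > minimumReservePercent
        && multiplier >= minimumMultiplier
}
```

Under these assumptions, a window that is 50% elapsed with 40% used has a multiplier of exactly 1.5 but only 10% reserve, so the hint stays hidden; the reserve condition is what keeps the gate conservative.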

The current Try 1.5x! wording is intentionally prominent and actionable, but it is a product-tone choice. Recommendation: keep the conservative gate and use the current copy for one release; if it feels too celebratory in the menu, change only the copy to 1.5x headroom without changing the model.

Maintainer preparation

  • Split the unrelated Ollama access-denied change from this branch.
  • Fixed late-window headroom math and added linear, historical, UI, and CLI regression coverage.
  • Fixed Italian localization and repository formatting failures present on the contributor head.
  • Packaged the debug app and verified the Advanced setting in the live bundle.
  • Generated a redacted synthetic three-state card fixture for ahead/on-track/behind comparison. No real account usage is represented by that fixture.

Validation

  • make check
  • make test (43 shards)
  • focused pace, historical pace, CLI snapshot, localization, and Ollama regression suites
  • CODEXBAR_SIGNING=adhoc ./Scripts/package_app.sh debug
  • autoreview: clean, no accepted/actionable findings

The debug bundle requested macOS app-data access during live-account rendering. That permission was not granted; packaged UI proof is limited to the setting and synthetic redacted card states.

clawsweeper Bot commented Jun 30, 2026

Codex review: needs real behavior proof before merge. Reviewed July 3, 2026, 4:22 PM ET / 20:22 UTC.

Summary
The PR adds a Codex-only 1.5× headroom pace label in menu and CLI output by computing a remaining-burn speed multiplier and covering it with focused pace and snapshot tests.

Reproducibility: Not applicable: this is a feature PR, not a bug with a failing current-main reproduction path. Source and diff inspection show current main lacks the headroom hint and the branch adds it through pace text rendering.

Review metrics: 3 noteworthy metrics.

  • Changed surface: 33 files, +237/-39. The runtime change is focused, but localization and tests make exact-head review worthwhile.
  • Settings surface: 0 settings files changed. The latest branch removed the Advanced lookback setting that previously required separate product acceptance.
  • Proof freshness: 2 screenshots inspected, 0 showing latest copy. The available media proves an earlier UI state, not the final 1.5× headroom label.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until stronger real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add final-head proof showing the 1.5× headroom label and absence from non-qualifying states; redact private account details before posting.
  • Update the PR body so it matches the latest diff: no Advanced historical-week setting and no Try 1.5x! recommendation.
  • Remove the release-owned CHANGELOG.md entry from this normal feature branch.

Proof guidance:

  • [P1] Needs stronger real behavior proof before merge: The inspected screenshots predate the latest exact head and still show Try 1.5x! plus a removed setting, so the contributor should add a redacted final-head screenshot or terminal output and then update the PR body. ClawSweeper should re-review automatically; if it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
A fresh visible proof pass would materially help because the available screenshots show an older label and removed setting. A maintainer can ask Mantis to capture proof by posting this exact PR comment:

@openclaw-mantis visual task: verify the latest Codex pace card shows `1.5× headroom` only for qualifying Codex ahead states and not on-track/behind states.

Risk before merge

  • [P1] The PR body is stale: it still describes Try 1.5x! and an Advanced historical-week setting even though the latest diff uses 1.5× headroom and has no settings changes.
  • [P1] The inspected screenshots predate the latest exact head, so they do not prove the final 1.5× headroom copy or final no-settings scope.
  • [P1] CI on the latest head was still queued/unstable at review time, so exact-head merge readiness was not yet proven.

Maintainer options:

  1. Decide the mitigation before merge
    Land the focused Codex-only 1.5× headroom implementation after the PR body, final-head proof, release-note ownership, and exact-head checks are cleaned up.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Manual review remains because the blockers are final-head proof, release-owned changelog cleanup, maintainer copy acceptance, and exact-head CI rather than a safe automated repair.

Security
Cleared: The final diff is limited to pace calculation, UI/CLI text, localization, tests, and a changelog entry; no concrete security or supply-chain concern was found.

Review findings

  • [P3] Drop the release-owned changelog entry — CHANGELOG.md:5-6
Review details

Best possible solution:

Land the focused Codex-only 1.5× headroom implementation after the PR body, final-head proof, release-note ownership, and exact-head checks are cleaned up.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a feature PR, not a bug with a failing current-main reproduction path. Source and diff inspection show current main lacks the headroom hint and the branch adds it through pace text rendering.

Is this the best way to solve the issue?

Unclear until maintainer sign-off: the latest implementation is narrower than the original proposal and addresses the earlier scope concern, but the user-facing headroom label still needs exact-head proof and acceptance.

Full review comments:

  • [P3] Drop the release-owned changelog entry — CHANGELOG.md:5-6
    CHANGELOG.md is release-owned in this review workflow, and this normal feature PR already carries the release-note context in the PR body. Please remove this entry so release notes are assembled by the release path rather than the feature branch.
    Confidence: 0.78

Overall correctness: patch is correct
Overall confidence: 0.83

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against 692da32ecbbb.

Label changes

Label changes:

  • add rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. The inspected screenshots predate the latest exact head and still show Try 1.5x! plus a removed setting, so the contributor should add a redacted final-head screenshot or terminal output and then update the PR body. ClawSweeper should re-review automatically; if it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
  • remove proof: sufficient: Current real behavior proof status is insufficient, not sufficient.
  • remove status: 👀 ready for maintainer look: Current PR status label is status: 📣 needs proof.
  • remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦪 silver shellfish, so this older rating label is no longer current.

Label justifications:

  • P3: This is a low-risk UX/CLI feature held for proof freshness, copy acceptance, and normal exact-head readiness rather than an urgent regression.
  • rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. The inspected screenshots predate the latest exact head and still show Try 1.5x! plus a removed setting, so the contributor should add a redacted final-head screenshot or terminal output and then update the PR body. ClawSweeper should re-review automatically; if it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
  • proof: 📸 screenshot: Contributor real behavior proof includes screenshot evidence, but the inspected screenshots predate the latest exact head and still show Try 1.5x! plus a removed setting.
Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully; its focused-test, no-live-Keychain-validation, and UI/runtime proof guidance informed this review. (AGENTS.md:1, 692da32ecbbb)
  • Current main lacks the requested headroom surface: A targeted search on current main found no speedMultiplierToReset, 1.5× headroom, Try 1.5x, or persisted historical-pace setting symbols. (692da32ecbbb)
  • Current main pace model: Current main UsagePace exposes stage, expected/actual usage, ETA, reset survival, and run-out probability, but no speed multiplier field. (Sources/CodexBarCore/UsagePace.swift:18, 692da32ecbbb)
  • PR adds remaining-burn headroom: The final PR diff adds speedMultiplierToReset, computes it from remaining capacity versus projected remaining usage, and feeds historical projected remaining usage into the same model. (Sources/CodexBarCore/UsagePace.swift:18, 0580c3dc027f)
  • Provider scope narrowed: The final PR diff passes provider context into UsagePaceText and CLIRenderer and guards the headroom label to Codex only, addressing the earlier non-Codex scope concern. (Sources/CodexBar/UsagePaceText.swift:77, 0580c3dc027f)
  • Final diff no longer has settings changes: The latest GitHub file list has 33 changed files and no PreferencesAdvancedPane or SettingsStore changes, so the earlier 2...8 lookback setting and unreachable-8-week review comment are obsolete for this head. (0580c3dc027f)

Likely related people:

  • steipete: Peter Steinberger authored the maintainer prep and latest focus commits on this PR, posted the owner review comments, and has recent work in the pace model area. (role: owner-decision holder and recent area contributor; confidence: high; commits: 0af3a7cf70a5, d6a732cb6f89, d7a997d9e826; files: Sources/CodexBarCore/UsagePace.swift, Sources/CodexBar/UsagePaceText.swift, Sources/CodexBarCLI/CLIRenderer.swift)
  • Remedy92: Commit a679311 added the weekly pace indicator and the UsagePace/UsagePaceText surface this PR extends. (role: pace feature introducer; confidence: high; commits: a679311227fc; files: Sources/CodexBarCore/UsagePace.swift, Sources/CodexBar/UsagePaceText.swift, Sources/CodexBar/MenuDescriptor.swift)
  • Yuxin-Qiao: Commit f986661 corrected historical pace run-out forecasts in the evaluator/test surface that now feeds projected remaining usage into headroom. (role: recent historical pace repair contributor; confidence: medium; commits: f986661480c7; files: Sources/CodexBar/HistoricalUsagePace.swift, Tests/CodexBarTests/HistoricalUsagePaceTests.swift, Tests/CodexBarTests/UsagePaceTextTests.swift)
  • kmatsunami: Commit 5bfd1d1 added recent CLI pace rendering and JSON output, and this PR changes CLIRenderer pace text. (role: CLI pace contributor; confidence: medium; commits: 5bfd1d1474bb; files: Sources/CodexBarCLI/CLIRenderer.swift, Tests/CodexBarTests/CLISnapshotTests.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper (bot) added labels on Jun 30, 2026:

  • rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • P3: Low-risk cleanup, docs, polish, ergonomics, or speculative feature.
  • merge-risk: 🚨 auth-provider: Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials.

steipete commented Jul 1, 2026

Owner

Maintainer review against current main (55327375) and exact head 2c508fce:

The pace implementation is internally coherent, and the focused suites are green (UsagePaceTests, UsagePaceTextTests, HistoricalUsagePaceTests, OllamaUsageFetcherTests, and OllamaUsageFetcherRetryMappingTests). I am not landing it yet for two reasons:

  1. “Try 1.5x!” is a product/taste choice. Please add packaged menu proof for representative ahead/on-track/behind states, including the threshold where the hint appears, so the owner can judge whether this is useful guidance or distracting copy.
  2. The Ollama access-denied diagnostic is unrelated scope and should be split. As written, one denied browser can dominate the result even when another accessible browser simply has no Ollama session, so the error does not reliably identify the actionable cause. A separate PR should define deterministic browser precedence and test mixed denied/missing-session candidates.

Recommendation: keep the historical lookback setting and pace headroom work together, split the Ollama diagnostic, then request the copy/threshold decision with refreshed screenshots. No merge from this review.

steipete commented Jul 1, 2026

Owner

Maintainer preparation is complete on exact head d6a732c.

  • Split the unrelated Ollama diagnostic; the branch now contains only pace headroom and the historical lookback setting.
  • Corrected the headroom calculation to compare remaining quota with projected remaining burn. The late-window regression case (90% elapsed, 70% used) now exposes 3.86x safe headroom instead of suppressing the hint.
  • Fixed Italian localization and formatting failures from the contributor head.
  • Green proof: make check; all 43 make test shards; focused pace, historical pace, CLI snapshot, localization, and Ollama regression suites; clean autoreview.
  • Packaged proof: ad-hoc debug app built and launched. The Advanced setting below is from that bundle with the minimum set to 2 weeks.
[Screenshot: Packaged CodexBar Advanced settings showing Historical pace weeks set to 2 weeks]
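The 3.86x figure for the late-window regression case (90% elapsed, 70% used) can be reproduced by hand. The worked equation below assumes a linear projection in which the remaining burn continues at the rate observed so far; that reading is an assumption that happens to match the stated number, not a quotation from the diff. Here $u$ is the percent of quota used, $e$ is the elapsed fraction of the window, and $m$ is the safe multiplier.

```latex
\[
m \;=\; \frac{100 - u}{\dfrac{u}{e}\,(1 - e)}
  \;=\; \frac{100 - 70}{\dfrac{70}{0.9}\times 0.1}
  \;=\; \frac{30}{7.\overline{7}}
  \;\approx\; 3.86
\]
```

The numerator is the remaining capacity (30%) and the denominator is the usage projected over the final 10% of the window (about 7.8%), so the remaining quota could absorb roughly 3.86 times the current burn rate.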

The three-state comparison below is a redacted synthetic UsageMenuCardView fixture rendered from the exact branch. It is not real-account data; it proves copy, threshold, and layout for ahead/on-track/behind states.

[Screenshot: Synthetic redacted pace cards showing ahead, on-track, and behind states]

The debug bundle requested macOS app-data access while attempting live-account rendering. I did not grant that persistent permission, so I am not claiming live-account bar proof.

Recommendation: keep the conservative model and current Try 1.5x! copy for one release. If the tone feels too emphatic, change only the label to 1.5x headroom; the calculation and threshold no longer need a product decision. Leaving this draft unmerged for that copy/tone choice.

clawsweeper (bot) changed labels on Jul 1, 2026.

Added:

  • proof: sufficient: Contributor real behavior proof is sufficient.
  • proof: 📸 screenshot: Contributor real behavior proof includes screenshot evidence.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

Removed:

  • rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • merge-risk: 🚨 auth-provider: Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials.

@astuteprogrammer astuteprogrammer marked this pull request as ready for review July 2, 2026 06:34

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d6a732cb6f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

.foregroundStyle(.tertiary)
}
Spacer(minLength: 12)
Stepper(value: self.$settings.historicalPaceMinimumWeeks, in: 2...8, step: 1) {

[P2] Don't offer an unreachable 8-week minimum

When the user sets this to 8 after roughly the first day of the current weekly window, historical pace effectively never activates: the history store retains only 56 days of samples, while isCompleteWeek requires coverage in the first 24h of each completed week, so the 8th prior complete week loses its start-boundary samples before it can be counted. With minimumCompleteWeeks == 8, evaluation falls back to linear despite the UI accepting the value; either retain enough boundary history or cap this control at 7.
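The retention arithmetic behind this finding can be sketched as below. It is a simplified model, not the repository's `isCompleteWeek`: the function name and parameters are hypothetical, time is measured in days, and a week is treated as countable in the best case where any part of its first 24 hours still falls inside the retention horizon.

```swift
/// Best-case count of prior complete weeks whose first-24h boundary window
/// still overlaps retained history. All parameters are illustrative, in days.
func countableCompleteWeeks(daysIntoCurrentWeek: Double,
                            retentionDays: Double = 56,
                            maxWeeks: Int = 12) -> Int {
    var count = 0
    for week in 1...maxWeeks {
        // Age of the newest instant inside that week's first 24 hours:
        // the week began (daysIntoCurrentWeek + 7 * week) days ago, and its
        // boundary window ends one day after that start.
        let newestBoundaryAge = daysIntoCurrentWeek + Double(7 * week) - 1
        // Once even that instant is older than the retention horizon,
        // no start-boundary sample can survive for this or any older week.
        guard newestBoundaryAge < retentionDays else { break }
        count += 1
    }
    return count
}
```

With 56 days of retention, the eighth prior week is countable only during the first day of the current week; from day 1 onward the count tops out at 7, which is why the comment suggests either retaining more boundary history or capping the control at 7.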

steipete and others added 3 commits July 3, 2026 12:58
# Conflicts:
#	Sources/CodexBar/PreferencesAdvancedPane.swift
Co-authored-by: astuteprogrammer <astuteprogrammer@gmail.com>
clawsweeper (bot) changed labels on Jul 3, 2026.

Added:

  • rating: 🦪 silver shellfish: Thin PR readiness signal; proof, validation, or implementation needs work.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Removed:

  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

steipete commented Jul 3, 2026

Owner

Reworked and refreshed on exact head 0580c3dc027ff6df3fd7e32078520dfead8eeeed.

Changes:

  • removes the proposed new Advanced setting
  • limits the visible hint to Codex; Claude and other providers keep existing pace text
  • replaces imperative Try 1.5x! copy with neutral 1.5× headroom
  • keeps a fixed conservative three-week history threshold
  • corrects remaining-capacity versus projected-remaining-usage math for linear, workday-aware, and historical pace
  • adds localized copy and readable regression names
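The provider guard and neutral copy listed above could look roughly like this. It is a sketch under stated assumptions: the `Provider` enum, both function names, and the " · " separator are hypothetical, and the reserve check is assumed to have been applied upstream. Only the Codex-only scope, the 1.5 threshold, and the `1.5× headroom` label come from this thread.

```swift
/// Hypothetical provider list; the real app supports more providers than shown here.
enum Provider { case codex, claude, other }

/// Returns the headroom label only for Codex, and only when a multiplier
/// is available and clears the minimum. Other providers get no label.
func headroomLabel(provider: Provider,
                   speedMultiplierToReset: Double?,
                   minimumMultiplier: Double = 1.5) -> String? {
    guard provider == .codex,
          let multiplier = speedMultiplierToReset,
          multiplier >= minimumMultiplier else { return nil }
    // Fixed neutral copy; the label does not echo the computed multiplier.
    return "1.5× headroom"
}

/// Appends the label to existing pace text, leaving that text unchanged otherwise.
/// The separator is an assumption made for this example.
func paceLine(base: String, provider: Provider, speedMultiplierToReset: Double?) -> String {
    guard let label = headroomLabel(provider: provider,
                                    speedMultiplierToReset: speedMultiplierToReset) else {
        return base
    }
    return "\(base) · \(label)"
}
```

Keeping the guard in one small function means Claude and other providers fall through to their existing pace text without any provider-specific branching at the call sites.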

Proof:

  • focused pace/history/CLI suite: 128 tests passed across 4 suites
  • make check passed
  • SwiftFormat: 0/1237 files require formatting
  • SwiftLint: 0 violations in 1236 files
  • autoreview: clean, no accepted/actionable findings
  • exact-head CI: all required checks green
  • CI run: https://github.com/steipete/CodexBar/actions/runs/28681716776

@steipete steipete merged commit 53ff0af into steipete:main Jul 3, 2026
10 checks passed

Labels

  • P3: Low-risk cleanup, docs, polish, ergonomics, or speculative feature.
  • proof: 📸 screenshot: Contributor real behavior proof includes screenshot evidence.
  • rating: 🦪 silver shellfish: Thin PR readiness signal; proof, validation, or implementation needs work.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

2 participants