Skip to content

feat: auto-compact and retry on context window errors (rebased)#3226

Closed
TheArchitectit wants to merge 2 commits into
ultraworkers:mainfrom
TheArchitectit:worktree-auto-compact-retry
Closed

feat: auto-compact and retry on context window errors (rebased)#3226
TheArchitectit wants to merge 2 commits into
ultraworkers:mainfrom
TheArchitectit:worktree-auto-compact-retry

Conversation

@TheArchitectit
Copy link
Copy Markdown
Contributor

No description provided.

TheArchitectit and others added 2 commits June 4, 2026 21:25
When the model API returns a context window exceeded error, the CLI now
automatically compacts the session to free up token budget, then retries
the failed turn. This prevents users from hitting a hard stop when
sessions grow too long.

Problem:
Previously, auto-compact retry only worked in the interactive REPL path
(run_turn). The non-interactive paths (run_prompt_json,
run_prompt_compact, run_prompt_compact_json) simply propagated the
error with a result? and no retry. Additionally, context window
detection used ad-hoc string matching (contains("context_window") ||
contains("no parseable body")) instead of the canonical detection
method in the api crate.

Solution:
1. Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS in the api
   crate, so is_context_window_failure() now covers OpenAI-compat
   backends that return 400 with an un-parseable body when the request
   exceeds context limits.

2. Added RuntimeError::is_context_window_failure() method in the
   runtime crate. Since ApiError is erased into a string message when
   it crosses the runtime boundary, we need a runtime-level marker
   check that mirrors the api crate's detection. This replaces the
   ad-hoc string matching that was inlined in run_turn().

3. Extracted the auto-compact retry logic from run_turn() into a
   shared LiveCli::auto_compact_retry() method. This method:
   - Detects context window errors via RuntimeError::is_context_window_failure()
   - Compacts progressively (preserve 4 -> 2 -> 0 recent messages)
   - Retries the same user input with the compacted session
   - Is bounded by MAX_COMPACT_RETRIES = 3 to prevent infinite loops
   - Logs user-facing messages like "Context limit reached, auto-compacting
     session... (attempt N/3)"

4. Extended auto-compact retry to ALL turn execution paths:
   - run_turn() (interactive REPL) — now uses shared helper
   - run_prompt_compact() (-p --compact) — auto-retry added
   - run_prompt_compact_json() (-p --compact --json) — auto-retry added
   - run_prompt_json() (-p --json) — auto-retry added

Changes:
- rust/crates/api/src/error.rs: Added "no parseable body" marker
- rust/crates/runtime/src/conversation.rs: Added
  RUNTIME_CONTEXT_WINDOW_MARKERS constant and
  RuntimeError::is_context_window_failure() method
- rust/crates/rusty-claude-cli/src/main.rs: Extracted
  LiveCli::auto_compact_retry() with MAX_COMPACT_RETRIES = 3, replaced
  inline retry logic in run_turn(), added auto-compact retry to
  run_prompt_compact(), run_prompt_compact_json(), run_prompt_json()
Extract the inline preserve schedule into LiveCli::PRESERVE_SCHEDULE and
add a focused unit test asserting it covers every retry round, strictly
decreases, and ends at zero. The full auto_compact_retry loop is coupled
to live runtime/API execution, so only the pure progression logic is
unit-tested here.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants