feat: auto-compact and retry on context window errors (rebased) #3226
Closed
TheArchitectit wants to merge 2 commits into
When the model API returns a context window exceeded error, the CLI now
automatically compacts the session to free up token budget, then retries
the failed turn. This prevents users from hitting a hard stop when
sessions grow too long.
Problem:
Previously, auto-compact retry only worked in the interactive REPL path
(run_turn). The non-interactive paths (run_prompt_json,
run_prompt_compact, run_prompt_compact_json) simply propagated the
error via result? (the ? operator), with no retry. Additionally, context window
detection used ad-hoc string matching (contains("context_window") ||
contains("no parseable body")) instead of the canonical detection
method in the api crate.
Solution:
1. Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS in the api
crate, so is_context_window_failure() now covers OpenAI-compat
backends that return 400 with an un-parseable body when the request
exceeds context limits.
2. Added RuntimeError::is_context_window_failure() method in the
runtime crate. Since ApiError is erased into a string message when
it crosses the runtime boundary, we need a runtime-level marker
check that mirrors the api crate's detection. This replaces the
ad-hoc string matching that was inlined in run_turn().
3. Extracted the auto-compact retry logic from run_turn() into a
shared LiveCli::auto_compact_retry() method. This method:
- Detects context window errors via RuntimeError::is_context_window_failure()
- Compacts progressively (preserve 4 -> 2 -> 0 recent messages)
- Retries the same user input with the compacted session
- Is bounded by MAX_COMPACT_RETRIES = 3 to prevent infinite loops
- Logs user-facing messages like "Context limit reached, auto-compacting
session... (attempt N/3)"
4. Extended auto-compact retry to ALL turn execution paths:
- run_turn() (interactive REPL) — now uses shared helper
- run_prompt_compact() (-p --compact) — auto-retry added
- run_prompt_compact_json() (-p --compact --json) — auto-retry added
- run_prompt_json() (-p --json) — auto-retry added
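The retry behaviour in step 3 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Session`, its `compact` method, the tuple-struct `RuntimeError`, and the closure-based signature of `auto_compact_retry` are all hypothetical stand-ins, and the marker check is simplified to a single substring. Only the constant value, the 4 -> 2 -> 0 progression, and the log message come from the description above.

```rust
/// Upper bound on compaction attempts, as stated in the description.
const MAX_COMPACT_RETRIES: usize = 3;
/// Recent messages preserved on each attempt (4 -> 2 -> 0).
const PRESERVE_SCHEDULE: [usize; MAX_COMPACT_RETRIES] = [4, 2, 0];

/// Hypothetical stand-in for the runtime error type (message only).
#[derive(Debug)]
struct RuntimeError(String);

impl RuntimeError {
    /// Simplified check; the real method consults a list of markers.
    fn is_context_window_failure(&self) -> bool {
        self.0.contains("context_window")
    }
}

/// Hypothetical session: just an ordered list of messages.
struct Session {
    messages: Vec<String>,
}

impl Session {
    /// Stand-in for compaction: collapse everything except the last
    /// `preserve` messages into a single summary line.
    fn compact(&mut self, preserve: usize) {
        let keep_from = self.messages.len().saturating_sub(preserve);
        let mut compacted = vec![format!("[summary of {keep_from} earlier messages]")];
        compacted.extend(self.messages.drain(keep_from..));
        self.messages = compacted;
    }
}

/// Compacts and retries until the turn succeeds, a non-context-window
/// error appears, or the schedule is exhausted.
fn auto_compact_retry<F>(
    session: &mut Session,
    first_error: RuntimeError,
    mut run: F,
) -> Result<String, RuntimeError>
where
    F: FnMut(&Session) -> Result<String, RuntimeError>,
{
    let mut error = first_error;
    for (attempt, preserve) in PRESERVE_SCHEDULE.iter().enumerate() {
        // Any other kind of failure is returned to the caller untouched.
        if !error.is_context_window_failure() {
            return Err(error);
        }
        eprintln!(
            "Context limit reached, auto-compacting session... (attempt {}/{})",
            attempt + 1,
            MAX_COMPACT_RETRIES
        );
        session.compact(*preserve);
        match run(session) {
            Ok(output) => return Ok(output),
            Err(next) => error = next,
        }
    }
    // Schedule exhausted: surface the last error instead of looping forever.
    Err(error)
}
```

Driving the loop from a fixed-size schedule means the retry bound and the compaction progression cannot drift apart, which is presumably why the follow-up commit pulls the schedule into a constant.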
Changes:
- rust/crates/api/src/error.rs: Added "no parseable body" marker
- rust/crates/runtime/src/conversation.rs: Added
RUNTIME_CONTEXT_WINDOW_MARKERS constant and
RuntimeError::is_context_window_failure() method
- rust/crates/rusty-claude-cli/src/main.rs: Extracted
LiveCli::auto_compact_retry() with MAX_COMPACT_RETRIES = 3, replaced
inline retry logic in run_turn(), added auto-compact retry to
run_prompt_compact(), run_prompt_compact_json(), run_prompt_json()
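The runtime-level marker check listed for conversation.rs might look something like the sketch below. The two marker strings are the ones quoted in the Problem section. The struct layout, the `new` constructor, and the case-sensitive substring match are assumptions made for illustration, and the real constant may contain additional markers.

```rust
/// Substrings that signal a context window failure once the ApiError has
/// been flattened into a plain message. Only these two strings are
/// confirmed by the description; the real list may be longer.
const RUNTIME_CONTEXT_WINDOW_MARKERS: &[&str] = &["context_window", "no parseable body"];

/// Hypothetical shape of the runtime error: it carries only a message.
#[derive(Debug)]
struct RuntimeError {
    message: String,
}

impl RuntimeError {
    /// Hypothetical constructor, used here only to build examples.
    fn new(message: impl Into<String>) -> Self {
        Self { message: message.into() }
    }

    /// True when the message contains any known marker.
    /// Case-sensitive matching is an assumption of this sketch.
    fn is_context_window_failure(&self) -> bool {
        RUNTIME_CONTEXT_WINDOW_MARKERS
            .iter()
            .any(|marker| self.message.contains(*marker))
    }
}
```

Keeping the markers in one named constant gives every call site the same answer, in contrast to the inline contains() pairs that run_turn() used before.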
Extract the inline preserve schedule into LiveCli::PRESERVE_SCHEDULE and
add a focused unit test asserting it covers every retry round, strictly
decreases, and ends at zero. The full auto_compact_retry loop is coupled
to live runtime/API execution, so only the pure progression logic is
unit-tested here.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
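A unit test of the kind this commit describes could be written along these lines. The names `LiveCli::PRESERVE_SCHEDULE` and `MAX_COMPACT_RETRIES` come from the description. The empty `LiveCli` struct, the `schedule_is_valid` helper, and the test name are hypothetical, and the array values assume the 4 -> 2 -> 0 progression stated earlier.

```rust
const MAX_COMPACT_RETRIES: usize = 3;

/// Empty stand-in for the real CLI type; only the constant matters here.
struct LiveCli;

impl LiveCli {
    /// Recent messages preserved on each retry round.
    const PRESERVE_SCHEDULE: [usize; MAX_COMPACT_RETRIES] = [4, 2, 0];
}

/// Hypothetical helper bundling the three properties the commit names:
/// one entry per retry round, strictly decreasing, and ending at zero.
fn schedule_is_valid(schedule: &[usize], rounds: usize) -> bool {
    schedule.len() == rounds
        && schedule.windows(2).all(|pair| pair[0] > pair[1])
        && schedule.last() == Some(&0)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn preserve_schedule_covers_rounds_and_ends_at_zero() {
        assert!(schedule_is_valid(
            &LiveCli::PRESERVE_SCHEDULE,
            MAX_COMPACT_RETRIES
        ));
    }
}
```

Because the check touches only a constant, it runs without a live runtime or API, which matches the commit's stated reason for testing the progression logic in isolation.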