Overview
This issue documents the root cause analysis and resolution of several critical reliability and performance issues discovered during development.
Issues Resolved
1. Tool Use Conversion Gate (ISSUE-004)
- Problem: Many OpenAI-compatible models (qwen, deepseek, minimax) were returning tool calls as plain text or were being suppressed by a model-prefix gate.
- Fix: Removed the model-prefix gate in convert_litellm_to_anthropic and added a parser fallback for manual tool calls in content_text.
- Result: Models like deepseek-v3.1 now correctly produce structured tool_use blocks.
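A parser fallback of this kind could look roughly like the sketch below. It is illustrative only: the `<tool_call>` tag format, the function name `parse_text_tool_calls`, and the generated `toolu_` id scheme are assumptions, not the code in convert_litellm_to_anthropic. The text formats that qwen, deepseek, or minimax actually emit may differ.

```python
import json
import re
import uuid

# Assumption: the model wraps each call in <tool_call>...</tool_call> tags
# around a JSON object with "name" and "arguments" keys.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def parse_text_tool_calls(content_text):
    """Split plain text into Anthropic-style text and tool_use blocks."""
    tool_blocks = []
    text_parts = []
    cursor = 0
    for match in TOOL_CALL_RE.finditer(content_text):
        try:
            payload = json.loads(match.group(1))
            name = payload["name"]
            arguments = payload.get("arguments", {})
            if isinstance(arguments, str):
                # Handle arguments that arrive as a JSON-encoded string.
                arguments = json.loads(arguments)
        except (json.JSONDecodeError, KeyError):
            continue  # Malformed call: leave it in the text untouched.
        # Keep the text before the call, then skip over the call itself.
        text_parts.append(content_text[cursor:match.start()])
        cursor = match.end()
        tool_blocks.append({
            "type": "tool_use",
            "id": f"toolu_{uuid.uuid4().hex[:24]}",  # hypothetical id scheme
            "name": name,
            "input": arguments,
        })
    text_parts.append(content_text[cursor:])
    text = "".join(text_parts).strip()
    blocks = [{"type": "text", "text": text}] if text else []
    return blocks + tool_blocks
```

Text without any recognizable call comes back as a single text block, so the fallback is safe to run on every response regardless of model name.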
2. Streaming Response RuntimeError
- Problem: Agent would 'stall' or connections would drop during streaming.
- Root Cause: HTTPException was being raised after the first SSE blocks were yielded.
- Fix: Implemented a response_started flag. Any errors occurring after headers are sent are now yielded as standard Anthropic SSE error events.
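The response_started pattern can be sketched as follows. This is a minimal, framework-free illustration: `safe_stream` and `sse_error_event` are hypothetical names, the `api_error` type is a placeholder, and the real handler would wrap the upstream call rather than an arbitrary async iterator.

```python
import asyncio
import json


def sse_error_event(message, error_type="api_error"):
    """Format an error in the Anthropic streaming shape (event: error)."""
    payload = {"type": "error", "error": {"type": error_type, "message": message}}
    return f"event: error\ndata: {json.dumps(payload)}\n\n"


async def safe_stream(upstream):
    """Relay SSE chunks; convert late failures into an in-band error event."""
    response_started = False
    try:
        async for chunk in upstream:
            # Once the first chunk goes out, the status line and headers
            # are committed and can no longer be changed.
            response_started = True
            yield chunk
    except Exception as exc:
        if not response_started:
            # Nothing sent yet: let the framework return a normal HTTP error.
            raise
        # Too late for an HTTP error status, so report the failure in-band.
        yield sse_error_event(str(exc))
```

Errors raised before the first chunk still propagate, so the caller can map them to a regular 4xx/5xx response. Only failures after the stream has begun are turned into SSE error events.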
3. Upstream Stalls (502/504 Errors)
- Problem: Transient 502/504 errors from upstream providers were not being handled gracefully.
- Fix: Implemented retry_with_backoff for both non-streaming and the initial phase of streaming requests.
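The issue does not show the signature of retry_with_backoff, so the following is only a sketch of what such a helper might look like. The parameters, the `UpstreamStatusError` exception, the delay values, and the jitter are all assumptions made for illustration.

```python
import asyncio
import random

RETRYABLE_STATUS = {502, 504}


class UpstreamStatusError(Exception):
    """Hypothetical error carrying the upstream HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"upstream returned {status_code}")
        self.status_code = status_code


async def retry_with_backoff(call, max_attempts=3, base_delay=0.5,
                             max_delay=8.0, sleep=asyncio.sleep):
    """Await call(), retrying transient 502/504 failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await call()
        except UpstreamStatusError as exc:
            is_last = attempt == max_attempts - 1
            if exc.status_code not in RETRYABLE_STATUS or is_last:
                raise
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little jitter keeps concurrent clients from retrying in lockstep.
            await sleep(delay + random.uniform(0, delay / 10))
```

For streaming requests, a helper like this can only cover the phase before the first byte is forwarded to the client. Once chunks have been relayed, a retry would replay content the client has already received.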
4. Streaming Lifecycle Fix
- Problem: AsyncClient created outside the generator lifecycle was being closed prematurely.
- Fix: Moved AsyncClient lifecycle inside the stream generator.
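The lifecycle problem can be reproduced without any network access. In the sketch below, `FakeAsyncClient` is a hypothetical stand-in for the real AsyncClient, and `broken_handler` and `stream_generator` are illustrative names. The first shows the premature close, the second shows the fix of owning the client inside the generator.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for an async HTTP client used as a context manager."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        self.closed = True

    async def stream_lines(self):
        for line in ("data: one", "data: two"):
            if self.closed:
                raise RuntimeError("client closed before stream finished")
            await asyncio.sleep(0)
            yield line


async def broken_handler():
    # Bug: the client is closed when this block exits, which happens
    # before the returned generator has produced a single line.
    async with FakeAsyncClient() as client:
        return client.stream_lines()


async def stream_generator(client_factory=FakeAsyncClient):
    # Fix: the client lives exactly as long as the generator is being consumed.
    async with client_factory() as client:
        async for line in client.stream_lines():
            yield line
```

With httpx the same shape would nest `async with httpx.AsyncClient() as client:` and `async with client.stream(...) as response:` inside the generator body, so both are released only when the stream ends or the consumer stops iterating.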
Status
All fixes have been implemented and verified with complex tool-use tests. We have prepared a PR with these changes.