## fix(agent): call on_agent_done hook before storing assistant message in history (#6721)
Hey - I've left some high level feedback:

- Since `on_agent_done` is now invoked before `run_context.messages.append(...)`, consider auditing any existing hooks that may implicitly rely on the final assistant message already being present in `run_context.messages` and, if needed, add a defensive check or explicit comment near the hook call to clarify this ordering contract.
- The try/except wrapper around `agent_hooks.on_agent_done` is now duplicated in multiple places; consider extracting this into a small helper (e.g., `_safe_on_agent_done(llm_resp)`) to keep the control flow consistent and easier to maintain.
- When logging hook errors (`Error in on_agent_done hook`), consider including additional context such as the current run ID or agent name to make it easier to trace failures in multi-run or multi-agent environments.
```python
# call the on_agent_done hook BEFORE recording the message,
# so that plugins can clean metadata tags from completion_text first.
try:
    await self.agent_hooks.on_agent_done(self.run_context, llm_resp)
except Exception as e:
    logger.error(f"Error in on_agent_done hook: {e}", exc_info=True)
```
To improve maintainability and avoid duplication, consider extracting this try...except block into a private helper method. The same logic is repeated in the `_finalize_aborted_step` method (lines 976-979).

You could add a method like this to the `ToolLoopAgentRunner` class:

```python
async def _safe_run_on_agent_done(self, llm_resp: LLMResponse):
    """Safely runs the on_agent_done hook and logs any errors."""
    try:
        await self.agent_hooks.on_agent_done(self.run_context, llm_resp)
    except Exception as e:
        logger.error(f"Error in on_agent_done hook: {e}", exc_info=True)
```

You could then replace the current try...except block with a single line, `await self._safe_run_on_agent_done(llm_resp)`, and make the same replacement in `_finalize_aborted_step`.
Related Documentation: 1 document may need updating based on files changed in this PR: AstrBotTeam's Space, "pr4697的改动". Suggested changes:

```diff
@@ -740,7 +740,8 @@
 - **[PR #5850](https://github.com/AstrBotDevs/AstrBot/pull/5850) enhancement**: the stop signal now propagates to LLM calls and the tool executor; the `abort_signal` parameter makes the stop mechanism propagate consistently through the entire execution stack
 - **Tool execution interruption**: long-running tool waits (including sub-agent handoffs) can now be interrupted by a stop request; the `_iter_tool_executor_results()` method monitors `abort_signal`, using `asyncio.wait()` with `FIRST_COMPLETED` to race tool results against the abort signal
 - **Partial output preservation**: when an interruption occurs during tool execution, the `_finalize_aborted_step()` method preserves any partially generated LLM response
-- The agent transitions to the DONE state and fires the `on_agent_done` hook
+- **Hook ordering (PR #6721)**: when the agent completes or is aborted, the `on_agent_done` hook fires before the assistant message is stored in the conversation history. This ensures plugins can modify `completion_text` in an `on_llm_response` handler (for example, stripping metadata tags), and the modified content, rather than the raw LLM output, is saved to history
+- The agent transitions to the DONE state
 - Conversation history and session state are preserved, and the response type is marked "aborted"
 - The `was_aborted()` method returns `true`, indicating the task was actively aborted by the user
@@ -769,6 +770,7 @@
 - The history-save condition changed from `not event.is_stopped()` to `not event.is_stopped() or agent_runner.was_aborted()`
 - Ensures that even if the session is aborted, partially generated content is correctly saved to history
 - Empty responses on user abort are handled gracefully and do not cause history loss
+- **Hook/history ordering (PR #6721)**: the `on_agent_done` hook now fires before the assistant message is stored in history. This ensures plugin modifications to `completion_text` (for example, stripping metadata tags in an `on_llm_response` handler) are reflected in the stored conversation history, preventing uncleaned LLM output from leaking into subsequent context
```

##### Frontend implementation
- State management (`useMessages.ts`): added `currentRequestController`, `currentReader`, `currentRunningSessionId`, and `userStopRequested` state
Motivation / 动机
In `ToolLoopAgentRunner`, the assistant message was stored into `run_context.messages` before the `on_agent_done` hook (which fires `OnLLMResponseEvent`) was invoked. This means any modifications a plugin makes to `completion_text` in an `on_llm_response` handler were not reflected in the stored conversation history.

Concrete impact: a plugin that strips metadata tags (such as `[Favour: ...]`, `[Spend: ...]`, `[Profile: ...]`) from the LLM output via `on_llm_response` could successfully clean the current response's text, but the uncleaned raw text had already been written to the conversation history. On subsequent LLM calls, the model would see the leftover metadata tags in context, leading to problems such as duplicate spending records and duplicate memory updates. It is hard to imagine this went unnoticed before. I have analyzed the possible consequences of this change; as far as I can tell it does not affect existing plugin calls, and the risk is very low.

Modifications / 改动点
File: `astrbot/core/agent/runners/tool_loop_agent_runner.py`

Reordered the `on_agent_done` hook call to execute before `run_context.messages.append(...)` in two locations, including `_finalize_aborted_step` (~line 976), which runs when agent execution is aborted.

Before (bug):

After (fix):
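The reordering can be sketched with minimal stand-in objects. This is a simplified illustration built from the PR description's names (`run_context.messages`, `on_agent_done`, `llm_resp`), not the full runner implementation:

```python
import asyncio
from dataclasses import dataclass, field

# Minimal stand-ins for the runner's objects (illustrative only).
@dataclass
class LLMResponse:
    completion_text: str

@dataclass
class RunContext:
    messages: list = field(default_factory=list)

async def on_agent_done(ctx: RunContext, resp: LLMResponse) -> None:
    # Stands in for a plugin hook that cleans the text in place.
    resp.completion_text = resp.completion_text.replace("[Spend: 5]", "").strip()

async def finalize_fixed(ctx: RunContext, resp: LLMResponse) -> None:
    # Fixed order: run the hook first, then record the (cleaned) message.
    await on_agent_done(ctx, resp)
    ctx.messages.append({"role": "assistant", "content": resp.completion_text})

ctx, resp = RunContext(), LLMResponse("Done. [Spend: 5]")
asyncio.run(finalize_fixed(ctx, resp))
print(ctx.messages[0]["content"])  # → Done.
```

With the buggy order (append before hook), the stored message would still contain `[Spend: 5]` even though the hook cleaned `completion_text` afterwards.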
No new dependencies. No breaking changes. The hook contract is unchanged: `on_agent_done` still receives the same `run_context` and `llm_resp` objects.

Screenshots or Test Results / 运行截图或测试结果
Before fix (from real production logs):
After fix: the `on_agent_done` hook fires first, plugins clean `completion_text`, and the cleaned text is stored in `run_context.messages`. Subsequent LLM calls see clean history without metadata tags.

Syntax validation: `python -c "import ast; ast.parse(...)"` → OK

Lint: `ruff check` → All checks passed
ruff check→ All checks passedChecklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了"验证步骤"和"运行截图"。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Bug Fixes: