Skip to content

fix: simplify send_message_to_user schema to reduce LLM tool-call errors#6602

Open
kawayiYokami wants to merge 2 commits intoAstrBotDevs:masterfrom
kawayiYokami:chore/from-upstream-master-20260319
Open

fix: simplify send_message_to_user schema to reduce LLM tool-call errors#6602
kawayiYokami wants to merge 2 commits intoAstrBotDevs:masterfrom
kawayiYokami:chore/from-upstream-master-20260319

Conversation

@kawayiYokami
Copy link
Contributor

@kawayiYokami kawayiYokami commented Mar 19, 2026

This change simplifies send_message_to_user for LLM tool-calling reliability.

Current messages[] + type nested schema causes frequent argument-shape errors in practice. LLMs can already call tools multiple times in one turn, so message-chain batching is not required and increases failure risk.

Modifications / 改动点

  • Refactor SendMessageToUserTool input schema from nested messages[] to flat args.

  • Remove explicit type requirement from tool input; infer component type from path/url extension.

  • Enforce exactly one primary payload per call (text or path or url) to reduce ambiguous calls.

  • Keep mention_user_id composable with one primary payload (or mention-only).

  • Add unit tests for success/error paths and media inference behavior.

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

Verification steps:

.venv\\Scripts\\python.exe -m pytest -q tests/unit/test_astr_main_agent_resources.py

Result:

7 passed, 3 warnings in 2.34s

Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Simplify the send_message_to_user tool interface to a flat, single-payload schema and infer media types from paths/URLs for more reliable LLM tool calls.

New Features:

  • Support automatic inference of message component type (image, audio/record, video, file) from file extensions in paths and URLs.
  • Allow optional session targeting and file naming when sending media or files via the send_message_to_user tool.

Bug Fixes:

  • Prevent ambiguous or invalid send_message_to_user invocations by enforcing exactly one primary payload per call and validating input combinations.

Enhancements:

  • Refactor send_message_to_user parameters from a nested messages array schema to flat arguments for simpler LLM integration.
  • Improve error handling for message component construction to return descriptive error strings instead of raising exceptions.

Tests:

  • Add unit tests covering primary payload validation, mention-only and mention+payload flows, and media-type inference for URLs and sandbox paths.

@auto-assign auto-assign bot requested review from Soulter and anka-afk March 19, 2026 05:15
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 19, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the send_message_to_user tool to enhance its reliability and ease of use for Large Language Models (LLMs). By simplifying the input schema from a complex nested structure to a flatter set of parameters, the change aims to mitigate frequent argument-shape errors that LLMs encountered. The update also introduces automatic media type inference based on file extensions and enforces clearer rules for payload composition, making the tool more robust and predictable for automated agents.

Highlights

  • Schema Simplification: Refactored the SendMessageToUserTool input schema from a nested messages[] array to flat arguments for improved LLM tool-calling reliability.
  • Type Inference: Removed the explicit type requirement from tool input, now inferring the component type (e.g., image, record, video, file) automatically from file extensions in path or url.
  • Payload Enforcement: Enforced that exactly one primary payload (text, path, or url) is allowed per call to reduce ambiguous tool invocations.
  • Mention Composability: Ensured that mention_user_id can be combined with a single primary payload or used independently.
  • Unit Tests: Added comprehensive unit tests to cover success and error paths, as well as the new media type inference behavior.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • The logic for stripping query/hash and extracting the basename from URLs/paths is duplicated in _infer_component_type_from_ref and the URL file-name fallback; consider centralizing this into a small helper to keep the behavior consistent and easier to maintain.
  • In call, text is strip()ped and then used to determine the primary payload, which means whitespace-only text is treated as missing; if this is intentional, it may be worth explicitly handling or documenting it, otherwise you could check for None vs empty to avoid surprising rejections.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The logic for stripping query/hash and extracting the basename from URLs/paths is duplicated in `_infer_component_type_from_ref` and the URL file-name fallback; consider centralizing this into a small helper to keep the behavior consistent and easier to maintain.
- In `call`, `text` is `strip()`ped and then used to determine the primary payload, which means whitespace-only text is treated as missing; if this is intentional, it may be worth explicitly handling or documenting it, otherwise you could check for `None` vs empty to avoid surprising rejections.

## Individual Comments

### Comment 1
<location path="astrbot/core/astr_main_agent_resources.py" line_range="356-365" />
<code_context>
                 else:
-                    return (
-                        f"error: unsupported message type '{msg_type}' at index {idx}."
+                    file_name = (
+                        (str(name).strip() if name is not None else "")
+                        or os.path.basename(str(url).split("?", 1)[0].split("#", 1)[0])
+                        or "file"
                     )
</code_context>
<issue_to_address>
**suggestion:** The URL cleaning logic is duplicated and could be centralized for consistency with `_infer_component_type_from_ref`.

Here you manually strip query/fragment from `url` before `basename`, while `_infer_component_type_from_ref` performs similar normalization. Consider extracting a shared helper (or reusing that cleaning step) so URL/path normalization is defined in one place and stays consistent for both type inference and filename derivation.

Suggested implementation:

```python
                else:
                    file_name = (
                        (str(name).strip() if name is not None else "")
                        or os.path.basename(self._normalize_ref_path(str(url)))
                        or "file"
                    )

```

```python
    def _normalize_ref_path(self, ref: str) -> str:
        # Centralized normalization so URL/path handling is consistent across
        # filename derivation and component-type inference.
        # Strips query params and fragments but leaves the rest of the path intact.
        clean_ref = ref.split("?", 1)[0].split("#", 1)[0]
        return clean_ref

    def _infer_component_type_from_ref(

```

1. Inside `_infer_component_type_from_ref`, replace any manual `ref.split("?", 1)[0].split("#", 1)[0]` (or equivalent) logic with a call to `self._normalize_ref_path(ref)` so both type inference and filename derivation share the same normalization behavior.
2. If `os` is not already imported at the top of `astrbot/core/astr_main_agent_resources.py`, add `import os` to the imports section.
</issue_to_address>

### Comment 2
<location path="tests/unit/test_astr_main_agent_resources.py" line_range="98-107" />
<code_context>
+
+
+@pytest.mark.asyncio
+async def test_send_message_path_infers_record_component(monkeypatch: pytest.MonkeyPatch):
+    tool = SendMessageToUserTool()
+    run_context, send_message = _build_run_context()
+    monkeypatch.setattr(
+        tool,
+        "_resolve_path_from_sandbox",
+        AsyncMock(return_value=("/tmp/voice.mp3", False)),
+    )
+
+    result = await tool.call(run_context, path="/sandbox/voice.mp3")
+
+    assert result.startswith("Message sent to session")
+    send_message.assert_awaited_once()
+    _, chain = send_message.await_args.args
+    assert isinstance(chain, MessageChain)
+    assert len(chain.chain) == 1
+    assert isinstance(chain.chain[0], Record)
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for the error-handling path when component construction raises

Since you’re already using `monkeypatch` for the happy path, please also cover the error path where component construction fails and we return `"error: failed to build message component: {exc}"`.

You can monkeypatch `_resolve_path_from_sandbox` (or a `Comp.*` constructor) to raise, then assert that:
- `tool.call(...)` returns a string starting with `"error: failed to build message component"`, and
- `send_message` is never awaited.

This ensures the tool’s error handling and messaging are exercised when component construction fails.

Suggested implementation:

```python
import pytest
from unittest.mock import AsyncMock

from astrbot.core.agent.run_context import ContextWrapper

```

```python
@pytest.mark.asyncio
async def test_send_message_path_infers_record_component(monkeypatch: pytest.MonkeyPatch):
    tool = SendMessageToUserTool()
    run_context, send_message = _build_run_context()
    monkeypatch.setattr(
        tool,
        "_resolve_path_from_sandbox",
        AsyncMock(return_value=("/tmp/voice.mp3", False)),
    )

    result = await tool.call(run_context, path="/sandbox/voice.mp3")

    assert result.startswith("Message sent to session")
    send_message.assert_awaited_once()
    _, chain = send_message.await_args.args
    assert isinstance(chain, MessageChain)
    assert len(chain.chain) == 1
    assert isinstance(chain.chain[0], Record)


@pytest.mark.asyncio
async def test_send_message_path_component_construction_error(
    monkeypatch: pytest.MonkeyPatch,
):
    tool = SendMessageToUserTool()
    run_context, send_message = _build_run_context()

    async def _raise(*args, **kwargs):
        raise RuntimeError("boom")

    monkeypatch.setattr(
        tool,
        "_resolve_path_from_sandbox",
        AsyncMock(side_effect=_raise),
    )

    result = await tool.call(run_context, path="/sandbox/voice.mp3")

    assert isinstance(result, str)
    assert result.startswith("error: failed to build message component")
    send_message.assert_not_awaited()

```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +98 to +107
async def test_send_message_path_infers_record_component(monkeypatch: pytest.MonkeyPatch):
tool = SendMessageToUserTool()
run_context, send_message = _build_run_context()
monkeypatch.setattr(
tool,
"_resolve_path_from_sandbox",
AsyncMock(return_value=("/tmp/voice.mp3", False)),
)

result = await tool.call(run_context, path="/sandbox/voice.mp3")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a test for the error-handling path when component construction raises

Since you’re already using monkeypatch for the happy path, please also cover the error path where component construction fails and we return "error: failed to build message component: {exc}".

You can monkeypatch _resolve_path_from_sandbox (or a Comp.* constructor) to raise, then assert that:

  • tool.call(...) returns a string starting with "error: failed to build message component", and
  • send_message is never awaited.

This ensures the tool’s error handling and messaging are exercised when component construction fails.

Suggested implementation:

import pytest
from unittest.mock import AsyncMock

from astrbot.core.agent.run_context import ContextWrapper
@pytest.mark.asyncio
async def test_send_message_path_infers_record_component(monkeypatch: pytest.MonkeyPatch):
    tool = SendMessageToUserTool()
    run_context, send_message = _build_run_context()
    monkeypatch.setattr(
        tool,
        "_resolve_path_from_sandbox",
        AsyncMock(return_value=("/tmp/voice.mp3", False)),
    )

    result = await tool.call(run_context, path="/sandbox/voice.mp3")

    assert result.startswith("Message sent to session")
    send_message.assert_awaited_once()
    _, chain = send_message.await_args.args
    assert isinstance(chain, MessageChain)
    assert len(chain.chain) == 1
    assert isinstance(chain.chain[0], Record)


@pytest.mark.asyncio
async def test_send_message_path_component_construction_error(
    monkeypatch: pytest.MonkeyPatch,
):
    tool = SendMessageToUserTool()
    run_context, send_message = _build_run_context()

    async def _raise(*args, **kwargs):
        raise RuntimeError("boom")

    monkeypatch.setattr(
        tool,
        "_resolve_path_from_sandbox",
        AsyncMock(side_effect=_raise),
    )

    result = await tool.call(run_context, path="/sandbox/voice.mp3")

    assert isinstance(result, str)
    assert result.startswith("error: failed to build message component")
    send_message.assert_not_awaited()

@dosubot
Copy link

dosubot bot commented Mar 19, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

AstrBotTeam's Space

pr4697的改动
View Suggested Changes
@@ -434,17 +434,69 @@
 `send_message_to_user` 工具支持主动向用户发送消息,包括文本和多媒体内容。该工具可与定时任务、后台任务等结合,实现主动推送。
 
 - 支持多平台:Telegram、Slack、Discord 等
-- 支持多种消息类型:`plain`(文本)、`image`(图片)、`record`(语音消息)、`video`(视频)、`file`(文件)、`mention_user`(提及用户)
+- 支持多种消息类型:文本、图片、语音消息、视频、文件、提及用户
+
+> **架构改进(PR #6602)**:工具现使用扁平化参数结构,移除了嵌套的 `messages[]` 数组,提升 LLM 工具调用的可靠性。媒体类型通过文件扩展名自动推断,无需显式指定 `type` 字段。
+
+#### 参数说明
+
+工具支持以下扁平化参数:
+
+- **`text`**(字符串):纯文本内容
+- **`path`**(字符串):本地文件路径或沙箱路径,指向媒体/文件
+- **`url`**(字符串):远程媒体/文件的 URL
+- **`name`**(字符串,可选):当推断类型为 `file` 时的文件名
+- **`mention_user_id`**(字符串,可选):要提及的用户 ID,可单独使用或与一个主要载荷组合
+- **`session`**(字符串,可选):目标会话,默认为当前会话
+
+**调用约束:**
+- 每次调用只能包含**一个主要载荷**:`text` 或 `path` 或 `url`
+- `mention_user_id` 可以与一个主要载荷组合使用,或单独使用
+
+#### 媒体类型自动推断
+
+对于 `path` 或 `url` 参数,系统会根据文件扩展名自动推断媒体类型:
+
+- **图片(image)**:`.png`、`.jpg`、`.jpeg`、`.gif`、`.webp`、`.bmp`、`.tif`、`.tiff`、`.ico`、`.heic`、`.heif`、`.avif`
+- **音频/语音(record)**:`.mp3`、`.wav`、`.m4a`、`.aac`、`.flac`、`.ogg`、`.opus`、`.amr`、`.wma`
+- **视频(video)**:`.mp4`、`.mkv`、`.mov`、`.avi`、`.webm`、`.m4v`、`.flv`、`.wmv`、`.3gp`、`.mpeg`、`.mpg`
+- **文件(file)**:无法识别的扩展名会被视为通用文件类型
 
 #### 使用场景
 - **发送多媒体文件**:使用该工具发送图片、语音、视频、文件等媒体内容
 - **主动推送消息**:在定时任务或后台任务中主动向用户发送消息
 - **普通文本回复**:对于普通文本回复,可以直接在对话中输出,无需使用此工具
 
-> **注意**:在 `skills_like` 工具调用模式下,该工具描述已明确列出所有支持的消息类型,确保 AI 能够正确理解该工具支持发送多媒体消息,避免使用 Python 工具等绕弯路的方式处理媒体文件。
-
 #### 使用示例
-在定时任务触发后,主动调用 `send_message_to_user` 工具,将生成的日报、图片或其他内容直接发送给用户。
+
+**发送文本消息:**
+```python
+await send_message_to_user(text="您好,这是一条文本消息。")
+```
+
+**发送图片:**
+```python
+await send_message_to_user(path="/path/to/image.png")
+# 或使用 URL
+await send_message_to_user(url="https://example.com/image.jpg")
+```
+
+**发送文件并指定文件名:**
+```python
+await send_message_to_user(path="/path/to/report.pdf", name="月度报告.pdf")
+```
+
+**提及用户并发送消息:**
+```python
+await send_message_to_user(mention_user_id="12345", text="您的任务已完成!")
+```
+
+**仅提及用户:**
+```python
+await send_message_to_user(mention_user_id="12345")
+```
+
+> **注意**:该工具描述已针对 LLM 工具调用优化,确保 AI 能够正确理解该工具支持发送多媒体消息,避免使用 Python 工具等绕弯路的方式处理媒体文件。扁平化参数结构减少了 LLM 在调用工具时的参数形状错误。
 
 ---
 

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a solid improvement, simplifying the send_message_to_user tool by replacing a complex nested schema with flat arguments. This change should significantly improve reliability when used with LLMs. The automatic inference of media types from file extensions is a thoughtful addition, and the new unit tests provide good coverage for the new functionality. I have a couple of suggestions to further enhance the code's maintainability by reducing some minor code duplication and improving conciseness.

Comment on lines +327 to +333
primary_count = 0
if text:
primary_count += 1
if path:
primary_count += 1
if url:
primary_count += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for counting primary payloads can be made more concise and idiomatic by using sum() with map(). This improves readability by expressing the intent in a single line.

Suggested change
primary_count = 0
if text:
primary_count += 1
if path:
primary_count += 1
if url:
primary_count += 1
primary_count = sum(map(bool, (text, path, url)))

Comment on lines +343 to +378
try:
if text:
components.append(Comp.Plain(text=text))
elif path:
component_type = self._infer_component_type_from_ref(path)
local_path, _ = await self._resolve_path_from_sandbox(context, path)
if component_type == "image":
components.append(Comp.Image.fromFileSystem(path=local_path))
elif component_type == "record":
components.append(Comp.Record.fromFileSystem(path=local_path))
elif component_type == "video":
components.append(Comp.Video.fromFileSystem(path=local_path))
else:
file_name = (
(str(name).strip() if name is not None else "")
or os.path.basename(local_path)
or "file"
)
if path:
(
local_path,
file_from_sandbox,
) = await self._resolve_path_from_sandbox(context, path)
components.append(Comp.File(name=name, file=local_path))
elif url:
components.append(Comp.File(name=name, url=url))
else:
return f"error: messages[{idx}] must include path or url for file component."
elif msg_type == "mention_user":
mention_user_id = msg.get("mention_user_id")
if not mention_user_id:
return f"error: messages[{idx}].mention_user_id is required for mention_user component."
components.append(
Comp.At(
qq=mention_user_id,
),
)
components.append(Comp.File(name=file_name, file=local_path))
elif url:
component_type = self._infer_component_type_from_ref(url)
if component_type == "image":
components.append(Comp.Image.fromURL(url=url))
elif component_type == "record":
components.append(Comp.Record.fromURL(url=url))
elif component_type == "video":
components.append(Comp.Video.fromURL(url=url))
else:
return (
f"error: unsupported message type '{msg_type}' at index {idx}."
file_name = (
(str(name).strip() if name is not None else "")
or os.path.basename(str(url).split("?", 1)[0].split("#", 1)[0])
or "file"
)
except Exception as exc: # 捕获组件构造异常,避免直接抛出
return f"error: failed to build messages[{idx}] component: {exc}"
components.append(Comp.File(name=file_name, url=url))
except Exception as exc:
return f"error: failed to build message component: {exc}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for creating message components contains duplicated if/elif/else blocks for handling path and url. You can reduce this repetition and improve maintainability by using a dictionary to map component types to their corresponding classes. This makes the code cleaner and easier to extend with new component types in the future.

        COMPONENT_MAP = {
            "image": Comp.Image,
            "record": Comp.Record,
            "video": Comp.Video,
        }
        try:
            if text:
                components.append(Comp.Plain(text=text))
            elif path:
                component_type = self._infer_component_type_from_ref(path)
                local_path, _ = await self._resolve_path_from_sandbox(context, path)
                component_cls = COMPONENT_MAP.get(component_type)
                if component_cls:
                    components.append(component_cls.fromFileSystem(path=local_path))
                else:  # Fallback to File
                    file_name = (
                        (str(name).strip() if name is not None else "")
                        or os.path.basename(local_path)
                        or "file"
                    )
                    components.append(Comp.File(name=file_name, file=local_path))
            elif url:
                component_type = self._infer_component_type_from_ref(url)
                component_cls = COMPONENT_MAP.get(component_type)
                if component_cls:
                    components.append(component_cls.fromURL(url=url))
                else:  # Fallback to File
                    file_name = (
                        (str(name).strip() if name is not None else "")
                        or os.path.basename(str(url).split("?", 1)[0].split("#", 1)[0])
                        or "file"
                    )
                    components.append(Comp.File(name=file_name, url=url))
        except Exception as exc:
            return f"error: failed to build message component: {exc}"

@kawayiYokami
Copy link
Contributor Author

kawayiYokami commented Mar 19, 2026

已根据 review 修复:

  1. 抽取了 URL/路径标准化 helper,去掉重复逻辑;
  2. 补了组件构造失败分支测试;
  3. 明确并测试了空白视为无效输入。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant