
fix(dify): strip <think> tags from Dify runner output#6619

Open
NayukiChiba wants to merge 2 commits into AstrBotDevs:master from NayukiChiba:fix/dify-think-tag-filter

Conversation


@NayukiChiba NayukiChiba commented Mar 19, 2026

What does this PR do?

Fixes #6437

Some underlying models (such as DeepSeek-R1) still embed chain-of-thought content in <think>...</think> form even when thinking mode is disabled on the Dify side. Because the Dify runner did not filter it, these tags were passed through to users verbatim.

Changes

  • Added a static method _strip_think_tags() to dify_agent_runner.py that removes <think>...</think> blocks and stray orphan </think> tags, consistent with the existing handling in openai_source._parse_openai_completion.
  • Applied the filter on every output path of parse_dify_result(), covering the chat/agent/chatflow string results as well as the workflow text, list, and fallback outputs.
  • Also filter incremental streaming chunks, so <think> content is not printed to users piece by piece in streaming mode.
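Judging from the two regex lines quoted later in this review thread, the filter is roughly the following shape; this is a minimal self-contained sketch (the method name comes from the PR, the body is reconstructed from the quoted lines):

```python
import re

def strip_think_tags(text: str) -> str:
    """Remove <think>...</think> blocks and a stray trailing </think>."""
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    text = re.sub(r"</think>\s*$", "", text)
    return text.strip()
```

The non-greedy `.*?` with `re.DOTALL` keeps each block removal bounded at the nearest closing tag, and the second pattern catches models that emit only a dangling `</think>`.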

Summary by Sourcery

Strip unintended chain-of-thought tags from Dify runner responses before delivering them to users.

Bug Fixes:

  • Filter out <think>...</think> chain-of-thought blocks and trailing </think> tags from non-streaming Dify chat and workflow outputs.
  • Sanitize streaming delta chunks from Dify to avoid emitting <think>-wrapped content incrementally to clients.

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Mar 19, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue in the Dify agent runner where some underlying models still include <think>...</think> tags in their output even when thinking mode is disabled on the Dify side. By introducing a shared sanitization method and applying it to all output paths, it ensures the responses delivered to users are clean of these internal reasoning tags, improving the user experience and keeping the output consistent.

Highlights

  • New tag-stripping method: adds a static method _strip_think_tags() to dify_agent_runner.py that removes unexpected <think>...</think> blocks and orphan </think> tags from model output.
  • Filter applied comprehensively: the method is applied on every output path of parse_dify_result(), covering the chat, agent, and chatflow string results as well as the workflow text, list, and fallback outputs.
  • Streaming output filtering: incremental streaming chunks are filtered as well, so users never see <think> content in streaming mode.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@NayukiChiba NayukiChiba marked this pull request as draft March 19, 2026 11:06
@dosubot dosubot bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Mar 19, 2026
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • In parse_dify_result, for the workflow branch where output is a list, the non-file case inside the for item in output loop still does str(output) instead of str(item), which will duplicate the whole list instead of just the current element and likely isn't what you want.
  • Consider short-circuiting _strip_think_tags (e.g., if '<think' not in text and '</think>' not in text: return text.strip()) to avoid running two regexes on every chunk that doesn't contain these tags, especially in streaming mode where this method is called very frequently.
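The first point might translate into a loop like this hypothetical sketch (`join_workflow_outputs` and its file handling are illustrative, not the runner's actual code; the key change is `str(item)` in place of `str(output)`):

```python
def join_workflow_outputs(output: list) -> str:
    """Join workflow list output into one string, item by item."""
    parts = []
    for item in output:
        if isinstance(item, dict) and item.get("type") == "file":
            parts.append(item.get("url", ""))
        else:
            # Was str(output) in the reviewed code, which would repeat
            # the entire list once per element.
            parts.append(str(item))
    return "\n".join(parts)
```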
## Individual Comments

### Comment 1
<location path="astrbot/core/agent/runners/dify/dify_agent_runner.py" line_range="196-195" />
<code_context>
-                                    chain=MessageChain().message(chunk["answer"])
-                                ),
-                            )
+                            delta = self._strip_think_tags(chunk["answer"])
+                            if delta:
+                                yield AgentResponse(
+                                    type="streaming_delta",
+                                    data=AgentResponseData(
</code_context>
<issue_to_address>
**issue (bug_risk):** Streaming stripping of <think> tags may leak partial chain-of-thought when tags span multiple chunks.

Because `_strip_think_tags` runs per chunk, a `<think>` block split across chunks won’t be fully removed until the closing tag appears in the same chunk. Earlier chunks can leak partial chain-of-thought, and the final chunk may contain only the tail of the visible answer. To avoid this, buffer while inside `<think>...</think>` and only emit deltas when outside those regions, instead of stripping per chunk.
</issue_to_address>
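The buffered approach Sourcery suggests could be sketched as a small stateful filter (a standalone illustration, not the PR's code; it assumes a tag is never split mid-token across chunks, e.g. "<thi" + "nk>"):

```python
class ThinkTagStreamFilter:
    """Suppress text between <think> and </think> even when the two
    tags arrive in different streaming chunks."""

    def __init__(self) -> None:
        self._in_think = False  # True while inside an open <think> block

    def feed(self, chunk: str) -> str:
        out = []
        while chunk:
            if self._in_think:
                end = chunk.find("</think>")
                if end == -1:
                    # Still inside the think block: emit nothing yet.
                    return "".join(out)
                chunk = chunk[end + len("</think>"):]
                self._in_think = False
            else:
                start = chunk.find("<think>")
                if start == -1:
                    out.append(chunk)
                    break
                out.append(chunk[:start])
                chunk = chunk[start + len("<think>"):]
                self._in_think = True
        return "".join(out)
```

One instance per stream keeps the open/closed state across chunks, so a `<think>` opened in one chunk silences every later chunk until its `</think>` arrives.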


@dosubot

dosubot bot commented Mar 19, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

AstrBotTeam's Space

Changes from pr4697


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This PR addresses the problem of underlying models leaking <think> tags to users. The change introduces a _strip_think_tags method to filter these tags and correctly applies it on every streaming and non-streaming output path. The implementation is direct and effective. I have one suggestion about optimizing the regular-expression work to improve performance and code conciseness.

Comment on lines +293 to +294
text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
text = re.sub(r"</think>\s*$", "", text)

medium

For efficiency and conciseness, the two re.sub calls can be merged into one. If this function is called frequently (for example on every streaming chunk), precompiling the regular expression at class level (e.g. _THINK_TAG_PATTERN = re.compile(...)) can improve performance further.

Suggested change
- text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
- text = re.sub(r"</think>\s*$", "", text)
+ text = re.sub(r"<think>.*?</think>|</think>\s*$", "", text, flags=re.DOTALL)
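Combining this suggestion with Sourcery's short-circuit idea might look like the following sketch (illustrative only; the pattern name and fast-path check are assumptions, not merged code):

```python
import re

# Precompiled once at module/class level; matches full <think> blocks
# or a dangling </think> at the end of the text.
_THINK_TAG_PATTERN = re.compile(r"<think>.*?</think>|</think>\s*$", re.DOTALL)

def strip_think_tags(text: str) -> str:
    # Fast path: most chunks contain no think tags at all, which matters
    # in streaming mode where this runs once per chunk.
    if "think>" not in text:
        return text.strip()
    return _THINK_TAG_PATTERN.sub("", text).strip()
```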

@NayukiChiba NayukiChiba marked this pull request as ready for review March 19, 2026 11:13

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant