Skip to content

feat: lark platform auto-create thread and thread-aware context isolation#6716

Open
Nick2781 wants to merge 3 commits intoAstrBotDevs:masterfrom
Nick2781:feat/lark-thread-auto-create
Open

feat: lark platform auto-create thread and thread-aware context isolation#6716
Nick2781 wants to merge 3 commits intoAstrBotDevs:masterfrom
Nick2781:feat/lark-thread-auto-create

Conversation

@Nick2781
Copy link

@Nick2781 Nick2781 commented Mar 20, 2026

Summary

Add thread (话题) support for the Lark/Feishu platform adapter:

  1. Auto-create thread on reply: When replying to a non-thread group message, automatically set reply_in_thread=True to create a new thread. Messages already inside a thread reply naturally within it.

  2. Thread-aware session isolation: Use thread_id / root_id to build isolated session_ids (format: {chat_id}%thread%{thread_id} or {chat_id}%root%{root_id}), so conversation context is properly scoped per thread instead of per group.

  3. Streaming card in thread: Streaming cards (CardKit) also respect reply_in_thread, enabling the typewriter effect within auto-created threads.

Files Changed

  • astrbot/core/platform/sources/lark/lark_event.py — Pass reply_in_thread through the full send chain (text, file, audio, video, streaming card) instead of hardcoding False.
  • astrbot/core/platform/sources/lark/lark_adapter.py — Build thread-aware session_id; parse it correctly in send_by_session; pass is_in_thread flag to LarkMessageEvent.

Modifications / 改动点

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

Tested on a real Feishu group:

  • ✅ Bot auto-creates a thread when replying to a group message
  • ✅ Streaming (typewriter) reply works within the auto-created thread
  • ✅ Subsequent messages in the same thread maintain correct conversation context
  • ✅ Non-thread messages and DMs are unaffected

Checklist / 检查清单

  • 😊 New feature discussed with maintainers.
  • 👀 Well-tested with verification steps and screenshots provided above.
  • 🤓 No new dependencies introduced.
  • 😮 No malicious code introduced.

Summary by Sourcery

Add thread-aware messaging support to the Lark/Feishu adapter so replies can auto-create and stay within threads while preserving isolated conversation context per thread.

New Features:

  • Support replying in Lark/Feishu threads by propagating the reply_in_thread flag across all message send paths, including streaming cards.
  • Introduce thread- and root-based session IDs so conversation context is isolated per thread or reply chain in both group chats and DMs.

Enhancements:

  • Normalize session IDs back to plain chat or user IDs when sending by session to keep platform receive IDs valid while preserving internal thread scoping.

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Mar 20, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Lark/Feishu platform adapter by introducing comprehensive thread support. It enables the bot to automatically create new threads when replying to group messages, ensures that conversation contexts are correctly isolated per thread using unique session IDs, and allows streaming cards to function seamlessly within these new thread environments.

Highlights

  • Auto-create thread on reply: When replying to a non-thread group message, automatically set "reply_in_thread=True" to create a new thread. Messages already inside a thread reply naturally within it.
  • Thread-aware session isolation: Used "thread_id" / "root_id" to build isolated "session_id"s (format: "{chat_id}%thread%{thread_id}" or "{chat_id}%root%{root_id}"), so conversation context is properly scoped per thread instead of per group.
  • Streaming card in thread: Streaming cards (CardKit) also respect "reply_in_thread", enabling the typewriter effect within auto-created threads.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot dosubot bot added the area:platform The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on. label Mar 20, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • The is_in_thread flag on LarkMessageEvent is effectively used as a "should_reply_in_thread" switch (it’s set to not bool(message.thread_id) and then passed straight into reply_in_thread), which is confusingly named and contradicts the comment; consider renaming it and/or correcting the logic to clearly distinguish "message is in thread" from "we need to set reply_in_thread to create a thread".
  • In convert_msg, the _is_in_thread calculation is based only on message.thread_id; if Feishu ever sends root_id without thread_id for certain threaded cases, this heuristic may misbehave—consider explicitly handling/clarifying the relationship between thread_id and root_id here to avoid edge-case routing issues.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `is_in_thread` flag on `LarkMessageEvent` is effectively used as a "should_reply_in_thread" switch (it’s set to `not bool(message.thread_id)` and then passed straight into `reply_in_thread`), which is confusingly named and contradicts the comment; consider renaming it and/or correcting the logic to clearly distinguish "message is in thread" from "we need to set reply_in_thread to create a thread".
- In `convert_msg`, the `_is_in_thread` calculation is based only on `message.thread_id`; if Feishu ever sends `root_id` without `thread_id` for certain threaded cases, this heuristic may misbehave—consider explicitly handling/clarifying the relationship between `thread_id` and `root_id` here to avoid edge-case routing issues.

## Individual Comments

### Comment 1
<location path="astrbot/core/platform/sources/lark/lark_adapter.py" line_range="610-614" />
<code_context>
+        # 判断消息是否在话题/回复链中
+        # 没有已存在的 thread 时,需要 reply_in_thread 创建话题
+        # 已在话题中的消息回复自然在话题内,无需 reply_in_thread
+        _is_in_thread = not bool(message.thread_id)
+        await self.handle_msg(abm, is_in_thread=_is_in_thread)

-    async def handle_msg(self, abm: AstrBotMessage) -> None:
</code_context>
<issue_to_address>
**suggestion:** The `is_in_thread` flag name and its semantics are inverted, which is likely to cause confusion and future misuse.

Here `_is_in_thread` is `not bool(message.thread_id)`, so it is `True` when there is *no* existing thread and `False` when the message is already in a thread, yet it is named `is_in_thread` and then used as `reply_in_thread`. The behavior matches the comment, but the name suggests the opposite state and is likely to be misunderstood later.

Consider either:
- Renaming the argument to something like `should_reply_in_thread` / `create_thread_on_reply`, or
- Flipping the boolean so `is_in_thread == bool(message.thread_id)` and using a separate local flag for `reply_in_thread`.

This will make the code easier to reason about and safer to reuse.

```suggestion
        # 判断消息是否在话题/回复链中
        # 存在 thread_id 表示消息已在话题/回复链内
        is_in_thread = bool(message.thread_id)
        await self.handle_msg(abm, is_in_thread=is_in_thread)
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces thread-aware functionality for the Lark/Feishu platform adapter, enabling automatic thread creation for replies to non-thread group messages and isolating conversation context per thread. It also ensures streaming cards respect the reply_in_thread flag. The changes are well-structured and address the stated objectives. However, there are a few areas where clarity and consistency could be improved, particularly in session ID parsing and variable naming, to enhance maintainability and prevent potential misinterpretations.

…sion_id parsing, fix misleading log

Address review feedback from Sourcery AI and Gemini Code Assist:
- Rename is_in_thread -> should_reply_in_thread to match actual semantics
- Simplify send_by_session: unify all % handling to split('%')[0]
- Fix misleading log message in send_streaming
@Nick2781
Copy link
Author

@sourcery-ai review it

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The session ID encoding/decoding logic (%thread%, %root%) is currently duplicated and slightly inconsistent between convert_msg and send_by_session; consider centralizing this into helper functions to avoid subtle split bugs and make future changes to the format safer.
  • The _should_reply_in_thread = not bool(message.thread_id) condition applies to all messages, including DMs and non-group contexts; if thread auto-creation is only desired for group messages, it would be clearer to gate this on MessageType.GROUP_MESSAGE (and/or root_id) to avoid unexpected reply_in_thread=True in other scenarios.
  • The log line in send_streaming says 非话题消息,将通过 reply_in_thread=True 创建新话题 but is keyed off self.should_reply_in_thread, which may also be true in non-group contexts; consider tightening the condition or adjusting the message so logs accurately describe the scenario.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The session ID encoding/decoding logic (`%thread%`, `%root%`) is currently duplicated and slightly inconsistent between `convert_msg` and `send_by_session`; consider centralizing this into helper functions to avoid subtle split bugs and make future changes to the format safer.
- The `_should_reply_in_thread = not bool(message.thread_id)` condition applies to all messages, including DMs and non-group contexts; if thread auto-creation is only desired for group messages, it would be clearer to gate this on `MessageType.GROUP_MESSAGE` (and/or `root_id`) to avoid unexpected `reply_in_thread=True` in other scenarios.
- The log line in `send_streaming` says `非话题消息,将通过 reply_in_thread=True 创建新话题` but is keyed off `self.should_reply_in_thread`, which may also be true in non-group contexts; consider tightening the condition or adjusting the message so logs accurately describe the scenario.

## Individual Comments

### Comment 1
<location path="astrbot/core/platform/sources/lark/lark_event.py" line_range="781-783" />
<code_context>
         使用解耦发送循环,LLM token 到达时只更新 buffer 并唤醒发送协程,
         发送频率由网络 RTT 自然限流。
         """
+        # 话题内消息也尝试流式卡片 + reply_in_thread
+        if self.should_reply_in_thread:
+            logger.info("[Lark] 非话题消息,将通过 reply_in_thread=True 创建新话题")
+
         # Step 1: 创建流式卡片实体
</code_context>
<issue_to_address>
**issue:** The new comment and log message conflict with the actual `should_reply_in_thread` semantics and may cause confusion.

Here `should_reply_in_thread` is defined as `not bool(message.thread_id)`, so it is `True` only for non-thread messages and `False` for messages already in a thread. In this method, the inline comment mentions "话题内消息" while the log line correctly refers to "非话题消息" in line with the condition. Please align the comment and log message with the actual semantics (e.g., clarify that we create a new topic for non-thread messages, while existing topic messages rely on the thread context and don’t need `reply_in_thread=True`).
</issue_to_address>

### Comment 2
<location path="astrbot/core/platform/sources/lark/lark_adapter.py" line_range="470-481" />
<code_context>
             id_type = "chat_id"
             receive_id = session.session_id
+            # 从 session_id 中提取真实的 chat_id(去除 %thread% / %root% 等后缀)
             if "%" in receive_id:
-                receive_id = receive_id.split("%")[1]
+                receive_id = receive_id.split("%")[0]
         else:
             id_type = "open_id"
</code_context>
<issue_to_address>
**suggestion:** Splitting on any '%' in `session_id` assumes Lark IDs never contain '%', which could be brittle; consider splitting on the explicit suffix markers instead.

Both here and in the single-chat branch, `session_id` is normalized by splitting on any `%` and taking index 0. This relies on `chat_id` / `open_id` never containing `%`, so if the upstream ID format or appended metadata ever changes, you risk truncating valid IDs. To make this resilient and clearer, strip only the known suffixes (e.g. `receive_id.split('%thread%')[0].split('%root%')[0]` or similar logic using exact markers) instead of splitting on a bare `%`.

```suggestion
        if session.message_type == MessageType.GROUP_MESSAGE:
            id_type = "chat_id"
            receive_id = session.session_id
            # 从 session_id 中提取真实的 chat_id,仅去除已知后缀(%thread% / %root%),避免误截断包含 '%' 的合法 ID
            for suffix in ("%thread%", "%root%"):
                if suffix in receive_id:
                    receive_id = receive_id.split(suffix)[0]
        else:
            id_type = "open_id"
            receive_id = session.session_id
            # 单聊也需要去除已知后缀(%thread% / %root%),但不对任意 '%' 做截断
            for suffix in ("%thread%", "%root%"):
                if suffix in receive_id:
                    receive_id = receive_id.split(suffix)[0]
```
</issue_to_address>

### Comment 3
<location path="astrbot/core/platform/sources/lark/lark_adapter.py" line_range="474" />
<code_context>
             id_type = "chat_id"
             receive_id = session.session_id
+            # 从 session_id 中提取真实的 chat_id(去除 %thread% / %root% 等后缀)
             if "%" in receive_id:
-                receive_id = receive_id.split("%")[1]
+                receive_id = receive_id.split("%")[0]
</code_context>
<issue_to_address>
**issue (complexity):** Consider introducing helper functions to build and parse `session_id` and to derive `should_reply_in_thread` so that callers are agnostic of the `%thread%` / `%root%` encoding details.

You can reduce the new complexity by centralizing the `%thread%` / `%root%` encoding/decoding and making callers agnostic of the exact string format.

### 1. Centralize session_id build/parse

Instead of open‑coding `f"{base_id}%thread%{...}"` and `split("%")[0]` in multiple places, add small helpers:

```python
SESSION_THREAD_SUFFIX = "%thread%"
SESSION_ROOT_SUFFIX = "%root%"

def build_session_id(base_id: str, *, thread_id: str | None, root_id: str | None) -> str:
    if thread_id:
        return f"{base_id}{SESSION_THREAD_SUFFIX}{thread_id}"
    if root_id:
        return f"{base_id}{SESSION_ROOT_SUFFIX}{root_id}"
    return base_id

def extract_base_session_id(session_id: str) -> str:
    # We only care about the "base" right now
    for marker in (SESSION_THREAD_SUFFIX, SESSION_ROOT_SUFFIX):
        idx = session_id.find(marker)
        if idx != -1:
            return session_id[:idx]
    return session_id
```

Then update the call sites:

```python
# in convert_msg / webhook handling
if abm.type == MessageType.GROUP_MESSAGE:
    base_id = abm.group_id or ""
else:
    base_id = abm.sender.user_id

abm.session_id = build_session_id(
    base_id=base_id,
    thread_id=message.thread_id,
    root_id=message.root_id,
)
```

```python
# in send_by_session
if session.message_type == MessageType.GROUP_MESSAGE:
    id_type = "chat_id"
else:
    id_type = "open_id"

receive_id = extract_base_session_id(session.session_id)
```

This removes the “magic” `%`‑based splitting from your business logic and keeps all existing behavior.

### 2. Keep the boolean, but avoid coupling it to the string format

The `should_reply_in_thread` flag is a separate concern from how you encode `session_id`. Keeping it as a boolean is fine, but the current code implicitly relies on `thread_id` and `%thread%` semantics at different layers.

You can at least make the intent explicit and local:

```python
_should_reply_in_thread = message.thread_id is None
await self.handle_msg(abm, should_reply_in_thread=_should_reply_in_thread)
```

Combined with the helper above, callers no longer need to know or care that threads/root are represented via `%thread%` / `%root%` in the `session_id` string, which makes the control flow easier to follow and change later.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

…or session_id parsing

- Fix inline comment to match 'should_reply_in_thread' semantics
- Replace bare '%' split with explicit '%thread%'/'%root%' suffix matching
  to avoid truncating IDs that may legitimately contain '%'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:platform The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on. size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant