feat: lark platform auto-create thread and thread-aware context isolation#6716
feat: lark platform auto-create thread and thread-aware context isolation#6716Nick2781 wants to merge 3 commits intoAstrBotDevs:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Lark/Feishu platform adapter by introducing comprehensive thread support. It enables the bot to automatically create new threads when replying to group messages, ensures that conversation contexts are correctly isolated per thread using unique session IDs, and allows streaming cards to function seamlessly within these new thread environments. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The
is_in_threadflag onLarkMessageEventis effectively used as a "should_reply_in_thread" switch (it’s set tonot bool(message.thread_id)and then passed straight intoreply_in_thread), which is confusingly named and contradicts the comment; consider renaming it and/or correcting the logic to clearly distinguish "message is in thread" from "we need to set reply_in_thread to create a thread". - In
convert_msg, the_is_in_threadcalculation is based only onmessage.thread_id; if Feishu ever sendsroot_idwithoutthread_idfor certain threaded cases, this heuristic may misbehave—consider explicitly handling/clarifying the relationship betweenthread_idandroot_idhere to avoid edge-case routing issues.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `is_in_thread` flag on `LarkMessageEvent` is effectively used as a "should_reply_in_thread" switch (it’s set to `not bool(message.thread_id)` and then passed straight into `reply_in_thread`), which is confusingly named and contradicts the comment; consider renaming it and/or correcting the logic to clearly distinguish "message is in thread" from "we need to set reply_in_thread to create a thread".
- In `convert_msg`, the `_is_in_thread` calculation is based only on `message.thread_id`; if Feishu ever sends `root_id` without `thread_id` for certain threaded cases, this heuristic may misbehave—consider explicitly handling/clarifying the relationship between `thread_id` and `root_id` here to avoid edge-case routing issues.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/lark/lark_adapter.py" line_range="610-614" />
<code_context>
+ # 判断消息是否在话题/回复链中
+ # 没有已存在的 thread 时,需要 reply_in_thread 创建话题
+ # 已在话题中的消息回复自然在话题内,无需 reply_in_thread
+ _is_in_thread = not bool(message.thread_id)
+ await self.handle_msg(abm, is_in_thread=_is_in_thread)
- async def handle_msg(self, abm: AstrBotMessage) -> None:
</code_context>
<issue_to_address>
**suggestion:** The `is_in_thread` flag name and its semantics are inverted, which is likely to cause confusion and future misuse.
Here `_is_in_thread` is `not bool(message.thread_id)`, so it is `True` when there is *no* existing thread and `False` when the message is already in a thread, yet it is named `is_in_thread` and then used as `reply_in_thread`. The behavior matches the comment, but the name suggests the opposite state and is likely to be misunderstood later.
Consider either:
- Renaming the argument to something like `should_reply_in_thread` / `create_thread_on_reply`, or
- Flipping the boolean so `is_in_thread == bool(message.thread_id)` and using a separate local flag for `reply_in_thread`.
This will make the code easier to reason about and safer to reuse.
```suggestion
# 判断消息是否在话题/回复链中
# 存在 thread_id 表示消息已在话题/回复链内
is_in_thread = bool(message.thread_id)
await self.handle_msg(abm, is_in_thread=is_in_thread)
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request introduces thread-aware functionality for the Lark/Feishu platform adapter, enabling automatic thread creation for replies to non-thread group messages and isolating conversation context per thread. It also ensures streaming cards respect the reply_in_thread flag. The changes are well-structured and address the stated objectives. However, there are a few areas where clarity and consistency could be improved, particularly in session ID parsing and variable naming, to enhance maintainability and prevent potential misinterpretations.
…sion_id parsing, fix misleading log
Address review feedback from Sourcery AI and Gemini Code Assist:
- Rename is_in_thread -> should_reply_in_thread to match actual semantics
- Simplify send_by_session: unify all % handling to split('%')[0]
- Fix misleading log message in send_streaming
|
@sourcery-ai review it |
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- The session ID encoding/decoding logic (
%thread%,%root%) is currently duplicated and slightly inconsistent betweenconvert_msgandsend_by_session; consider centralizing this into helper functions to avoid subtle split bugs and make future changes to the format safer. - The
_should_reply_in_thread = not bool(message.thread_id)condition applies to all messages, including DMs and non-group contexts; if thread auto-creation is only desired for group messages, it would be clearer to gate this onMessageType.GROUP_MESSAGE(and/orroot_id) to avoid unexpectedreply_in_thread=Truein other scenarios. - The log line in
send_streamingsays非话题消息,将通过 reply_in_thread=True 创建新话题but is keyed offself.should_reply_in_thread, which may also be true in non-group contexts; consider tightening the condition or adjusting the message so logs accurately describe the scenario.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The session ID encoding/decoding logic (`%thread%`, `%root%`) is currently duplicated and slightly inconsistent between `convert_msg` and `send_by_session`; consider centralizing this into helper functions to avoid subtle split bugs and make future changes to the format safer.
- The `_should_reply_in_thread = not bool(message.thread_id)` condition applies to all messages, including DMs and non-group contexts; if thread auto-creation is only desired for group messages, it would be clearer to gate this on `MessageType.GROUP_MESSAGE` (and/or `root_id`) to avoid unexpected `reply_in_thread=True` in other scenarios.
- The log line in `send_streaming` says `非话题消息,将通过 reply_in_thread=True 创建新话题` but is keyed off `self.should_reply_in_thread`, which may also be true in non-group contexts; consider tightening the condition or adjusting the message so logs accurately describe the scenario.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/lark/lark_event.py" line_range="781-783" />
<code_context>
使用解耦发送循环,LLM token 到达时只更新 buffer 并唤醒发送协程,
发送频率由网络 RTT 自然限流。
"""
+ # 话题内消息也尝试流式卡片 + reply_in_thread
+ if self.should_reply_in_thread:
+ logger.info("[Lark] 非话题消息,将通过 reply_in_thread=True 创建新话题")
+
# Step 1: 创建流式卡片实体
</code_context>
<issue_to_address>
**issue:** The new comment and log message conflict with the actual `should_reply_in_thread` semantics and may cause confusion.
Here `should_reply_in_thread` is defined as `not bool(message.thread_id)`, so it is `True` only for non-thread messages and `False` for messages already in a thread. In this method, the inline comment mentions "话题内消息" while the log line correctly refers to "非话题消息" in line with the condition. Please align the comment and log message with the actual semantics (e.g., clarify that we create a new topic for non-thread messages, while existing topic messages rely on the thread context and don’t need `reply_in_thread=True`).
</issue_to_address>
### Comment 2
<location path="astrbot/core/platform/sources/lark/lark_adapter.py" line_range="470-481" />
<code_context>
id_type = "chat_id"
receive_id = session.session_id
+ # 从 session_id 中提取真实的 chat_id(去除 %thread% / %root% 等后缀)
if "%" in receive_id:
- receive_id = receive_id.split("%")[1]
+ receive_id = receive_id.split("%")[0]
else:
id_type = "open_id"
</code_context>
<issue_to_address>
**suggestion:** Splitting on any '%' in `session_id` assumes Lark IDs never contain '%', which could be brittle; consider splitting on the explicit suffix markers instead.
Both here and in the single-chat branch, `session_id` is normalized by splitting on any `%` and taking index 0. This relies on `chat_id` / `open_id` never containing `%`, so if the upstream ID format or appended metadata ever changes, you risk truncating valid IDs. To make this resilient and clearer, strip only the known suffixes (e.g. `receive_id.split('%thread%')[0].split('%root%')[0]` or similar logic using exact markers) instead of splitting on a bare `%`.
```suggestion
if session.message_type == MessageType.GROUP_MESSAGE:
id_type = "chat_id"
receive_id = session.session_id
# 从 session_id 中提取真实的 chat_id,仅去除已知后缀(%thread% / %root%),避免误截断包含 '%' 的合法 ID
for suffix in ("%thread%", "%root%"):
if suffix in receive_id:
receive_id = receive_id.split(suffix)[0]
else:
id_type = "open_id"
receive_id = session.session_id
# 单聊也需要去除已知后缀(%thread% / %root%),但不对任意 '%' 做截断
for suffix in ("%thread%", "%root%"):
if suffix in receive_id:
receive_id = receive_id.split(suffix)[0]
```
</issue_to_address>
### Comment 3
<location path="astrbot/core/platform/sources/lark/lark_adapter.py" line_range="474" />
<code_context>
id_type = "chat_id"
receive_id = session.session_id
+ # 从 session_id 中提取真实的 chat_id(去除 %thread% / %root% 等后缀)
if "%" in receive_id:
- receive_id = receive_id.split("%")[1]
+ receive_id = receive_id.split("%")[0]
</code_context>
<issue_to_address>
**issue (complexity):** Consider introducing helper functions to build and parse `session_id` and to derive `should_reply_in_thread` so that callers are agnostic of the `%thread%` / `%root%` encoding details.
You can reduce the new complexity by centralizing the `%thread%` / `%root%` encoding/decoding and making callers agnostic of the exact string format.
### 1. Centralize session_id build/parse
Instead of open‑coding `f"{base_id}%thread%{...}"` and `split("%")[0]` in multiple places, add small helpers:
```python
SESSION_THREAD_SUFFIX = "%thread%"
SESSION_ROOT_SUFFIX = "%root%"
def build_session_id(base_id: str, *, thread_id: str | None, root_id: str | None) -> str:
if thread_id:
return f"{base_id}{SESSION_THREAD_SUFFIX}{thread_id}"
if root_id:
return f"{base_id}{SESSION_ROOT_SUFFIX}{root_id}"
return base_id
def extract_base_session_id(session_id: str) -> str:
# We only care about the "base" right now
for marker in (SESSION_THREAD_SUFFIX, SESSION_ROOT_SUFFIX):
idx = session_id.find(marker)
if idx != -1:
return session_id[:idx]
return session_id
```
Then update the call sites:
```python
# in convert_msg / webhook handling
if abm.type == MessageType.GROUP_MESSAGE:
base_id = abm.group_id or ""
else:
base_id = abm.sender.user_id
abm.session_id = build_session_id(
base_id=base_id,
thread_id=message.thread_id,
root_id=message.root_id,
)
```
```python
# in send_by_session
if session.message_type == MessageType.GROUP_MESSAGE:
id_type = "chat_id"
else:
id_type = "open_id"
receive_id = extract_base_session_id(session.session_id)
```
This removes the “magic” `%`‑based splitting from your business logic and keeps all existing behavior.
### 2. Keep the boolean, but avoid coupling it to the string format
The `should_reply_in_thread` flag is a separate concern from how you encode `session_id`. Keeping it as a boolean is fine, but the current code implicitly relies on `thread_id` and `%thread%` semantics at different layers.
You can at least make the intent explicit and local:
```python
_should_reply_in_thread = message.thread_id is None
await self.handle_msg(abm, should_reply_in_thread=_should_reply_in_thread)
```
Combined with the helper above, callers no longer need to know or care that threads/root are represented via `%thread%` / `%root%` in the `session_id` string, which makes the control flow easier to follow and change later.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
…or session_id parsing - Fix inline comment to match 'should_reply_in_thread' semantics - Replace bare '%' split with explicit '%thread%'/'%root%' suffix matching to avoid truncating IDs that may legitimately contain '%'
Summary
Add thread (话题) support for the Lark/Feishu platform adapter:
Auto-create thread on reply: When replying to a non-thread group message, automatically set
reply_in_thread=Trueto create a new thread. Messages already inside a thread reply naturally within it.Thread-aware session isolation: Use
thread_id/root_idto build isolatedsession_ids (format:{chat_id}%thread%{thread_id}or{chat_id}%root%{root_id}), so conversation context is properly scoped per thread instead of per group.Streaming card in thread: Streaming cards (CardKit) also respect
reply_in_thread, enabling the typewriter effect within auto-created threads.Files Changed
astrbot/core/platform/sources/lark/lark_event.py— Passreply_in_threadthrough the full send chain (text, file, audio, video, streaming card) instead of hardcodingFalse.astrbot/core/platform/sources/lark/lark_adapter.py— Build thread-awaresession_id; parse it correctly insend_by_session; passis_in_threadflag toLarkMessageEvent.Modifications / 改动点
Screenshots or Test Results / 运行截图或测试结果
Tested on a real Feishu group:
Checklist / 检查清单
Summary by Sourcery
Add thread-aware messaging support to the Lark/Feishu adapter so replies can auto-create and stay within threads while preserving isolated conversation context per thread.
New Features:
Enhancements: