auto-recall can block gateway startup / first-turn path long enough to fail health checks #1452

## Symptom
With `auto-recall` enabled and existing memories present, gateway startup or the first user turn can block long enough to trigger health-check failures and restart loops.

## Trigger
- `@memtensor/memos-local-openclaw-plugin@1.0.8`
- Linux + `systemctl --user`
- existing memory records already in the database
- `allowPromptInjection=true` and `auto-recall` enabled
- recall filtering path uses a slow or timeout-prone model

## Minimal Reproduction
1. Enable `auto-recall`.
2. Have enough existing memories in the local database.
3. Use a slow model on the recall/filter path.
4. Restart the gateway or send the first message after startup.
5. On affected runs, recall/filter work blocks the critical path for tens of seconds.
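
The blocking pattern in the steps above can be imitated without the plugin. The following is only a simulation under assumed names: `slowFilter`, `timeToReadyBlocking`, and `probeWouldFail` are hypothetical stand-ins, not functions from the plugin or the gateway, and the delays are scaled down from the tens of seconds seen in practice.

```typescript
// Simulation only: every name here is hypothetical, not taken from the plugin.

/** Stand-in for a slow recall/filter model call that takes `delayMs` to answer. */
function slowFilter(delayMs: number): Promise<string[]> {
  return new Promise((resolve) => setTimeout(() => resolve(["memory"]), delayMs));
}

/** Measures time-to-ready (ms) when the filter call is awaited before readiness. */
async function timeToReadyBlocking(filterDelayMs: number): Promise<number> {
  const started = Date.now();
  await slowFilter(filterDelayMs); // recall/filter work sits on the critical path
  // A gateway built this way could only report `ready` at this point.
  return Date.now() - started;
}

/** True when readiness arrives later than an assumed health-probe timeout. */
function probeWouldFail(timeToReadyMs: number, probeTimeoutMs: number): boolean {
  return timeToReadyMs > probeTimeoutMs;
}
```

Comparing the measured time-to-ready against whatever probe timeout the service manager actually uses shows whether a given model latency is enough to cause a restart.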

## Actual Result
In the affected environment, this stalled the startup / first-turn path for ~30-40s, which was enough to trip health checks and contribute to restart loops.

## Local Workaround
The local mitigation was:
- add a hard timeout around recall/filter LLM work
- fail open on timeouts and errors
- do not let recall/filter exceptions propagate to the gateway top level
- keep startup `ready` independent of slow recall work
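
A minimal sketch of the first three mitigation points, assuming the recall/filter step can be expressed as a function returning a promise. `withTimeout` and `recallOrFallback` are hypothetical helper names, not part of the plugin's API, and this is not the actual local patch.

```typescript
// Hypothetical helpers; none of these names come from the plugin's source.

/** Races `work` against a hard deadline and rejects if the deadline wins. */
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`recall timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

/**
 * Fail-open wrapper: on timeout or any error, log and return `fallback`
 * so nothing propagates to the caller.
 */
async function recallOrFallback<T>(
  recall: () => Promise<T>,
  fallback: T,
  timeoutMs: number,
): Promise<T> {
  try {
    return await withTimeout(recall(), timeoutMs);
  } catch (err) {
    console.warn("auto-recall skipped:", err instanceof Error ? err.message : err);
    return fallback;
  }
}
```

A caller would pass an empty memory list as `fallback`, so a turn proceeds without recalled context when the model is slow. Note that the timeout only stops waiting: the underlying request keeps running unless the model client supports cancellation, for example through an `AbortSignal`.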

## Suggested Fix
Please move auto-recall filtering out of the startup-critical / first-turn-critical path, and enforce timeout + fail-open semantics for slow recall models. Recall enrichment should be optional context, not something that can delay readiness or destabilize the gateway.
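
One possible shape for this, sketched under assumptions: the `Gateway` class, its `start` / `handleTurn` methods, and the `budgetMs` parameter are all hypothetical and do not reflect the real gateway or plugin interfaces. The idea is that readiness is signalled without awaiting recall, and a turn waits only a bounded time for recalled context.

```typescript
// Illustrative only: the class and method names are assumptions.
type Memory = string;

class Gateway {
  ready = false;
  private warmRecall: Promise<Memory[]> | null = null;
  private loadAndFilter: () => Promise<Memory[]>;

  constructor(loadAndFilter: () => Promise<Memory[]>) {
    this.loadAndFilter = loadAndFilter;
  }

  /** Starts recall in the background and reports ready without awaiting it. */
  start(): void {
    this.warmRecall = Promise.resolve()
      .then(() => this.loadAndFilter())
      .catch((): Memory[] => []); // fail open: errors become "no memories"
    this.ready = true;
  }

  /** Waits at most `budgetMs` for recalled context, then proceeds without it. */
  async handleTurn(
    message: string,
    budgetMs: number,
  ): Promise<{ message: string; context: Memory[] }> {
    const pending = this.warmRecall ?? Promise.resolve<Memory[]>([]);
    const context = await new Promise<Memory[]>((resolve) => {
      const timer = setTimeout(() => resolve([]), budgetMs);
      pending.then((memories) => { clearTimeout(timer); resolve(memories); });
    });
    return { message, context };
  }
}
```

With this arrangement a slow model costs the first turn its recalled context, not the gateway its health check; later turns pick up the memories once the background work finishes.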

