fix: add timeout + fail-open to recall search path by kagura-agent · Pull Request #1455 · MemTensor/MemOS

kagura-agent · 2026-04-10T04:16:37Z

Summary

Fixes #1452 — auto-recall can block gateway startup / first-turn path long enough to fail health checks when embedding or LLM calls are slow.

Problem

With auto-recall enabled and existing memories, the recall search path can block for 30-40 seconds when:

- Embedding model is slow to respond
- LLM skill relevance judgment hangs
- Hub memory search times out

This is enough to trip health checks and cause restart loops.

Solution

Add configurable timeout + fail-open semantics at three layers:

1. Recall engine (`recall/engine.ts`)

- `withTimeout()` helper: races any promise against a deadline, returns fallback on timeout
- `embedder.embedQuery()` wrapped with timeout → falls back to FTS-only search (no vector candidates)
- Hub memory embedding wrapped with timeout → skipped on timeout
- `judgeSkillRelevance()` (LLM call) wrapped with timeout → returns all candidates on timeout
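The race-against-a-deadline idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the signature of `withTimeout()` is an assumption, and `slowEmbedQuery` and `embedOrSkip` are hypothetical stand-ins for the real embedder call.

```typescript
// Hypothetical sketch: the real withTimeout() in recall/engine.ts may differ.
async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number,
  fallback: T,
): Promise<T> {
  // 0 (or a negative value) means "no timeout": just await the original promise.
  if (timeoutMs <= 0) return promise;

  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });

  try {
    // Whichever settles first wins; the slow call is abandoned, not cancelled.
    return await Promise.race([promise, deadline]);
  } finally {
    // Clear the timer so a fast result does not leave a pending timeout behind.
    clearTimeout(timer);
  }
}

// Hypothetical slow embedder: resolves with a vector after 200 ms.
function slowEmbedQuery(_query: string): Promise<number[] | null> {
  return new Promise((resolve) => setTimeout(() => resolve([0.1, 0.2]), 200));
}

// A null result is the signal a caller could treat as "FTS-only, no vector candidates".
function embedOrSkip(query: string, timeoutMs: number): Promise<number[] | null> {
  return withTimeout(slowEmbedQuery(query), timeoutMs, null);
}
```

In this sketch only the deadline is handled. A rejection from the wrapped promise still propagates, so a fail-open caller would also need its own `catch`.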

2. Tool handler (`tools/memory-search.ts`)

- Top-level timeout on the entire `memory_search` handler
- Returns empty results with `timedOut: true` in meta on timeout
- Never throws; always returns a valid response shape
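A handler with these three properties might look something like the sketch below. The response shape is an assumption: only the `timedOut` flag in `meta` is taken from the description, while `results`, the `error` field, and the injected `search` function are hypothetical placeholders for the real recall call.

```typescript
// Hypothetical response shape; fields other than meta.timedOut are assumptions.
interface MemorySearchResponse {
  results: string[];
  meta: { timedOut: boolean; error?: string };
}

async function handleMemorySearch(
  search: () => Promise<string[]>, // stand-in for the real recall call
  timeoutMs: number,
): Promise<MemorySearchResponse> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  try {
    const work = search().then(
      (results): MemorySearchResponse => ({ results, meta: { timedOut: false } }),
    );
    // 0 disables the deadline entirely.
    if (timeoutMs <= 0) return await work;

    const deadline = new Promise<MemorySearchResponse>((resolve) => {
      timer = setTimeout(
        () => resolve({ results: [], meta: { timedOut: true } }),
        timeoutMs,
      );
    });
    return await Promise.race([work, deadline]);
  } catch (err) {
    // Fail open: report the failure in meta instead of throwing to the caller.
    return { results: [], meta: { timedOut: false, error: String(err) } };
  } finally {
    clearTimeout(timer);
  }
}
```

Keeping the `try`/`catch` at the outermost level is what makes the "never throws" guarantee hold here, including for a `search` that throws synchronously.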

3. Configuration (`types.ts`, `config.ts`)

- New `recall.timeoutMs` option (default: 10000 ms, 0 = no timeout)
- Operators can tune based on their model latency
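As a rough illustration of how such an option could be resolved, the sketch below applies the 10000 ms default and keeps 0 as "no timeout". The `PluginConfig` and `RecallConfig` types, the `resolveRecallConfig` name, and the policy of falling back to the default on invalid values are all assumptions, not taken from `types.ts` or `config.ts`.

```typescript
// Hypothetical config shapes; only recall.timeoutMs and its default come from the PR.
interface RecallConfig {
  timeoutMs: number;
}
interface PluginConfig {
  recall?: Partial<RecallConfig>;
}

const DEFAULT_RECALL_TIMEOUT_MS = 10000;

function resolveRecallConfig(raw: PluginConfig): RecallConfig {
  const value = raw.recall?.timeoutMs;
  // Assumed policy: missing, non-finite, or negative values fall back to the default.
  // An explicit 0 is preserved and means "no timeout".
  const timeoutMs =
    typeof value === "number" && isFinite(value) && value >= 0
      ? value
      : DEFAULT_RECALL_TIMEOUT_MS;
  return { timeoutMs };
}

// Example: an operator with a slow local embedding model raises the deadline.
const tuned = resolveRecallConfig({ recall: { timeoutMs: 25000 } });
console.log(tuned.timeoutMs);
```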

Key Principles

- Fail-open: timeout/error → partial or empty results, never block
- No propagation: recall exceptions never reach gateway top level
- Startup independence: ready state doesn't depend on slow recall

Testing

- TypeScript typecheck passes (`tsc --noEmit`)
- No behavioral changes when recall completes within timeout
- Timeout only activates when operations exceed configured `timeoutMs`

Fixes MemTensor#1452: auto-recall can block gateway startup / first-turn path long enough to fail health checks when embedding or LLM calls are slow.

Changes:
- Add configurable `recall.timeoutMs` (default 10s)
- Wrap `embedder.embedQuery()` with timeout in `RecallEngine.search()`; falls back to FTS-only results on timeout
- Wrap LLM skill relevance judgment with timeout in `searchSkills()`; falls back to returning all candidates on timeout
- Add top-level timeout in `memory_search` tool handler; returns empty results with `timedOut: true` flag on timeout
- All timeouts fail-open: partial/empty results, never throw
- Recall exceptions never propagate to gateway top level

kagura-agent wants to merge 1 commit into MemTensor:main from kagura-agent:fix/recall-timeout-fail-open
