Skip to content

fix(opencode): resume the OpenCode session on follow-ups instead of starting an empty one#3617

Open
vdmkotai wants to merge 8 commits into
pingdotgg:mainfrom
vdmkotai:fix/3604-opencode-session-resume
Open

fix(opencode): resume the OpenCode session on follow-ups instead of starting an empty one#3617
vdmkotai wants to merge 8 commits into
pingdotgg:mainfrom
vdmkotai:fix/3604-opencode-session-resume

Conversation

@vdmkotai

@vdmkotai vdmkotai commented Jun 30, 2026

Copy link
Copy Markdown

Fixes #3604.

What

The OpenCode adapter never read or emitted a resume cursor, so the upstream ses_… id lived only in process memory. When that in-memory binding was lost — ProviderSessionReaper stopping an idle session (~30 min), or an app/server restart — the next follow-up in the same visible thread was sent to a brand-new, empty OpenCode session. t3code kept rendering its own projection DB, so the user still saw the full history while the model had no context.

This makes the OpenCode adapter resumable, mirroring the existing Grok/Cursor/Codex pattern, entirely within apps/server/src/provider/Layers/OpenCodeAdapter.ts:

  • startSession now emits resumeCursor: { schemaVersion, sessionId } on the returned ProviderSession (and sendTurn echoes it), so ProviderService persists it into provider_session_runtime.resume_cursor_json.
  • When a cursor is present, startSession validates the id with session.get and re-adopts that session instead of calling session.create. OpenCode scopes history by session id, so prompting the same id restores the full prior conversation.
  • A missing/closed session (or any session.get failure) falls back to a fresh session, so a stale cursor can never wedge the thread.
  • The start race-cleanup only aborts the upstream session when we actually created it — never one we merely resumed.

Why

This is the documented root cause in #3604 (with DB-level evidence of two OpenCode sessions per visible thread). The persistence/recovery plumbing is already provider-agnostic: ProviderService.startSession falls back to the stored binding cursor when the reactor passes none, so no changes are needed outside the adapter — it just needed the adapter to start producing and consuming a cursor like every other provider already does.

Scope

Intentionally small and focused — 2 files, no contract / persistence / orchestration changes:

  • apps/server/src/provider/Layers/OpenCodeAdapter.ts (+118/-20)
  • apps/server/src/provider/Layers/OpenCodeAdapter.test.ts (+147) — regression tests for: fresh-session cursor emission, resume re-adopting the persisted id (no create), follow-up turns targeting the resumed id, stale-cursor fallback to create, and malformed/foreign-cursor rejection.

Validation

  • tsgo --noEmit clean; OpenCodeAdapter.test.ts (21 tests) plus the full src/provider + src/orchestration suites (531 tests) pass.
  • Ran a desktop build carrying this fix for a full day across many OpenCode sessions, including idle-past-reaper and app-restart between turns — every follow-up retained full context, no regressions observed.

Note

Medium Risk
Changes provider session lifecycle and error handling for OpenCode threads; misclassified errors could still lose context, but the PR explicitly hardens against silent fresh sessions on non-404 failures.

Overview
Fixes #3604 by making the OpenCode adapter produce and consume a versioned resumeCursor (schemaVersion + sessionId) so follow-ups reuse the same upstream ses_… conversation after idle reaper or restart, instead of silently starting an empty session.

startSession now parses the persisted cursor, probes with session.get, and either re-adopts the session (with session.update to re-apply permissions for the current runtimeMode), forks into a new working directory when cwd changed (preserving history), or creates fresh only on a confirmed not-found. Transient/auth/server errors propagate via isOpenCodeNotFound (structured 404/NotFoundError only—no message substring matching). sendTurn echoes the cursor; concurrent-start cleanup aborts only when this call created a session, not when resuming.

Tests extend the OpenCode runtime mock (get/update/fork) and add regression coverage for resume, stale cursor, bad cursor, transient failures, cwd fork, and not-found classification.

Reviewed by Cursor Bugbot for commit 0d6eb88. Bugbot is set up for automated code reviews on this repo. Configure here.

Note

Resume existing OpenCode sessions on follow-up turns instead of creating new ones

  • startSession in OpenCodeAdapter.ts now parses a versioned resumeCursor from input and probes the OpenCode server via session.get to check if the session still exists.
  • If the session exists and the working directory matches (or is unset), it reuses the session and re-applies permission rules via session.update instead of calling session.create.
  • If the working directory has changed, it forks the session with session.fork to preserve conversation history while binding to the new directory.
  • On a confirmed 404, it falls back to creating a new session; non-404 errors propagate.
  • sendTurn now includes the active resumeCursor in its result so callers can persist it across turns.
  • The abort-on-race logic only fires when this call actually created a new session, not when resuming an existing one.

Macroscope summarized 0d6eb88.

…tarting an empty one

The OpenCode adapter always called session.create and never read or
emitted a resume cursor, so the upstream ses_… id lived only in memory.
When that in-memory binding was lost — the ProviderSessionReaper stopping
an idle session (~30 min) or an app/server restart — the next follow-up
in the same visible thread was sent to a brand-new, empty OpenCode
session. t3code kept rendering its own projection DB, so the user still
saw the full history while the model had no context (issue pingdotgg#3604).

Mirror the Grok/Cursor/Codex resume pattern, entirely within the adapter:

- Emit resumeCursor { schemaVersion, sessionId } on the started
  ProviderSession (and echo it from sendTurn) so ProviderService persists
  it into provider_session_runtime.resume_cursor_json.
- On startSession, when a cursor is present, validate the id with
  session.get and re-adopt that session instead of creating a new one.
  OpenCode scopes history by session id, so prompting the same id
  restores the full prior conversation. A missing/closed session (or any
  get failure) falls back to a fresh session so a stale cursor can't wedge
  the thread.
- Only abort the upstream session in the start race-cleanup when we
  actually created it; never abort a session we merely resumed.

The persistence/recovery plumbing is provider-agnostic and already feeds
a persisted cursor back into startSession on a reaped/restarted follow-up
(ProviderService falls back to the stored binding cursor when the reactor
passes none), so no changes outside the adapter are needed.

Adds regression tests: fresh-session cursor emission, resume re-adopting
the persisted id (no create), follow-up turns targeting the resumed id,
stale-cursor fallback to create, and malformed-cursor rejection.

Fixes pingdotgg#3604
@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 35dc3648-d191-48cc-8fa8-fc0e5117454e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@github-actions github-actions Bot added vouch:unvouched PR author is not yet trusted in the VOUCHED list. size:M 30-99 changed lines (additions + deletions). labels Jun 30, 2026
Comment thread apps/server/src/provider/Layers/OpenCodeAdapter.ts Outdated
Comment thread apps/server/src/provider/Layers/OpenCodeAdapter.ts
@macroscopeapp

macroscopeapp Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Approvability

Verdict: Needs human review

This PR fundamentally changes OpenCode session lifecycle behavior - adding session resumption, directory-change forking, and sophisticated error detection. While well-tested and addressing a real bug, the scope of changes to core session management warrants human review.

You can customize Macroscope's approvability policy. Learn more.

…ass)

Addresses review feedback on pingdotgg#3617 (macroscope + Cursor Bugbot + a deep
multi-agent review) without widening scope beyond the adapter:

- Re-apply permissions on resume. `session.create` is the only place the
  runtimeMode permission ruleset is set, so re-adopting a session skipped
  it; the reactor restarts with the persisted cursor on a runtime-mode
  change, which would leave a resumed OpenCode session on stale (e.g.
  full-access) permissions. Resume now calls session.update with
  buildOpenCodePermissionRules(runtimeMode).

- Don't resume into the wrong directory. OpenCode routes a prompt to the
  session's own stored directory, so reusing a session created under a
  different cwd would silently run there. Resume now starts a fresh
  session when the re-adopted session's directory differs from the
  requested cwd.

- Distinguish "not found" from transient failures. The SDK client uses
  throwOnError:true, so session.get rejects on any non-2xx. Only a
  confirmed 404 / NotFoundError now falls back to creating a fresh
  session; transport/auth/server errors propagate instead of silently
  resetting a live thread to an empty session.

Tests model throwOnError:true (session.get rejects) and add coverage for
the permission re-application, cwd-mismatch fallback, and transient-error
propagation paths.
@github-actions github-actions Bot added size:L 100-499 changed lines (additions + deletions). and removed size:M 30-99 changed lines (additions + deletions). labels Jun 30, 2026
@vdmkotai

Copy link
Copy Markdown
Author

Thanks for the reviews — addressed in 7be6629. Three hardening changes, all kept within the adapter:

  1. Permission re-application on resume (Cursor Bugbot): session.create was the only place buildOpenCodePermissionRules(runtimeMode) was applied, so re-adopting a session via session.get left it on its original permissions — and ProviderCommandReactor restarts with the persisted cursor on a runtime-mode change. Resume now calls session.update({ sessionID, permission }), so a runtime-mode change takes effect on the re-adopted session.

  2. Confirmed-not-found vs. transient errors (macroscope): the SDK client is created with throwOnError: true, so session.get rejects on any non-2xx. The fallback to a fresh session now fires only on a confirmed 404 / NotFoundError; transport/auth/server errors propagate instead of silently resetting a live thread to an empty session (matching [Bug]: OpenCode provider loses thread context on follow-up — t3code starts a new OpenCode session instead of resuming (no durable session binding) #3604's own "surface an explicit error" suggestion).

  3. Directory-aware resume: OpenCode routes a prompt to the session's own stored directory, so resuming a session created under a different cwd would silently run there. Resume now starts a fresh session when the re-adopted session's directory differs from the requested cwd.

Tests updated to model throwOnError: true (get rejects, not a result tuple) and add coverage for permission re-application, the cwd-mismatch fallback, and transient-error propagation. Full server provider + orchestration suite green (534 passing).

Comment thread apps/server/src/provider/Layers/OpenCodeAdapter.ts
Address review (Cursor Bugbot, high): isOpenCodeNotFound only walked the
`cause` chain checking a numeric `status`, so it relied on the 404 sitting
at one specific nesting and ignored `response.status` and the
OpenCodeRuntimeError `detail` string. Reworked it into a bounded BFS that
also checks `statusCode`, nested `response.status`, the NotFoundError
`name`/`body`, and `message`/`detail` text, descending cause/body/error/data.
Export it and add direct unit tests across every shape (incl. the real
wrapped-Error production shape and a response.status-only 404), plus the
transient/auth/network cases that must still propagate.

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit b984684. Configure here.

Comment thread apps/server/src/provider/Layers/OpenCodeAdapter.ts Outdated
vdmkotai added 3 commits June 30, 2026 20:43
…e text

Address review (Cursor Bugbot): isOpenCodeNotFound matched the free-text
message/detail, so a non-404 error whose text merely contains 'not found'
(a 500 saying 'upstream X not found', an auth error, or a serialized body
from openCodeRuntimeErrorDetail) was misclassified as a missing session and
silently started a fresh one. Decide only on structured signals — a numeric
404 (status/statusCode/nested response.status) or an explicit NotFoundError
name — which already cover the real throwOnError:true production shape
(cause.status=404 + body.name). Update unit tests to assert free-text-only
inputs (incl. a 500 whose message contains 'not found') now propagate.
@vdmkotai

vdmkotai commented Jul 1, 2026

Copy link
Copy Markdown
Author

UPDATE:
Seems like this fix is not enough. Making more changes

…ontext

The directory-mismatch guard added while hardening this PR started a fresh, EMPTY
session whenever the resumed session's stored directory differed from the requested
cwd. Its stated rationale -- "OpenCode routes a prompt to the session's own stored
directory, so resuming under a different cwd would run in the wrong tree" -- does not
hold: OpenCode resolves tool execution, snapshots and file ops from the per-request
directory param (instance context), not from session.info.directory. So the guard
solved a non-problem and introduced a real one: any time a thread's cwd changes (most
commonly when it moves from the project root into a git worktree between turns) the
whole conversation was stranded in the old session and the follow-up landed in an
empty one -- the exact pingdotgg#3604 symptom this PR set out to fix.

Fix: when the persisted session exists but was created under a different directory,
fork it INTO the requested directory (client.session.fork({ sessionID, directory }))
instead of creating an empty session. OpenCode clones the full message history into a
new session bound to the requested worktree, so the follow-up keeps its context and
tools still run on the correct tree. The fork id becomes the durable resume cursor. A
genuinely missing (404) session still starts fresh, unchanged.

Verified against the OpenCode source that fork copies all messages/parts with fresh
ids and stamps the new session's directory/path from the request context.

Test: the former "starts fresh on directory mismatch" case now asserts the session is
forked with history (fork called, no session.create, cursor -> fork id) via a new fork
mock; the not-found path still starts fresh.
@vdmkotai

vdmkotai commented Jul 1, 2026

Copy link
Copy Markdown
Author

Pushed a follow-up correcting the directory-mismatch handling that was added during hardening.

Problem: the guard that started a fresh session when the resumed session's stored directory differed from the requested cwd was justified by "OpenCode routes a prompt to the session's own stored directory." That premise doesn't hold — OpenCode resolves tool execution, snapshots and file ops from the per-request directory param (instance context), not session.info.directory. So on any cwd change (most commonly a thread moving from the project root into a git worktree between turns) the whole conversation was stranded in the old session and the follow-up landed in an empty one — reproducing the exact #3604 symptom this PR fixes. I hit this in real use with a worktree-backed thread (turn 1 created the session in the project root before the worktree metadata settled; turn 2 requested the worktree cwd → empty session).

Fix: when the persisted session exists but was created under a different directory, client.session.fork({ sessionID, directory }) into the requested directory instead of creating an empty session. OpenCode clones the full history into a new session bound to the requested worktree, so the follow-up keeps context and tools still run on the correct tree. The fork id becomes the durable resume cursor. A genuinely missing (404) session still starts fresh.

Verified against the OpenCode source that fork copies all messages/parts with fresh ids and stamps the new session's directory/path from the request context. The former "starts fresh on directory mismatch" test now asserts fork-with-history.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L 100-499 changed lines (additions + deletions). vouch:unvouched PR author is not yet trusted in the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: OpenCode provider loses thread context on follow-up — t3code starts a new OpenCode session instead of resuming (no durable session binding)

1 participant