feat(agents): claim-scoped write tokens #4287
Write tokens are now issued when a consumer claims a wake and revoked on done. This prevents leaked credentials from granting permanent write access. Removes writeToken from webhook notifications and spawn response headers. Adds autoClaim to IdempotentProducer instances. Includes fixes for done-clobbers-newer-claim race and kill-path cleanup of stale claim state. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
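The lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the map names come from the PR, but the value shapes, the function names (`issueClaimToken`, `isValidWriteToken`, `revokeOnDone`), and the reading of "one-to-one" as one claim per stream and one stream per consumer are assumptions.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical value shape; the PR names the maps but not what they store.
interface ActiveClaim {
  consumerId: string;
  token: string;
  issuedAt: number;
}

// stream path -> current claim
const activeClaimWriteTokens = new Map<string, ActiveClaim>();
// consumerId -> stream path it currently holds (assumed reverse index)
const activeClaimWriteTokensByConsumer = new Map<string, string>();

// Issue a fresh token on claim, superseding any older claim on the stream.
function issueClaimToken(streamPath: string, consumerId: string): string {
  const previous = activeClaimWriteTokens.get(streamPath);
  if (previous) activeClaimWriteTokensByConsumer.delete(previous.consumerId);

  // Assumption: a consumer holds at most one claim at a time.
  const oldStream = activeClaimWriteTokensByConsumer.get(consumerId);
  if (oldStream !== undefined && oldStream !== streamPath) {
    activeClaimWriteTokens.delete(oldStream);
  }

  const token = randomUUID();
  activeClaimWriteTokens.set(streamPath, { consumerId, token, issuedAt: Date.now() });
  activeClaimWriteTokensByConsumer.set(consumerId, streamPath);
  return token;
}

// A write is accepted only if it carries the token of the current claim.
function isValidWriteToken(streamPath: string, token: string): boolean {
  return activeClaimWriteTokens.get(streamPath)?.token === token;
}

// Revoke on done, but only if this consumer still owns the claim,
// so a late done cannot clobber a newer claim.
function revokeOnDone(streamPath: string, consumerId: string): boolean {
  const claim = activeClaimWriteTokens.get(streamPath);
  if (!claim || claim.consumerId !== consumerId) return false;
  activeClaimWriteTokens.delete(streamPath);
  activeClaimWriteTokensByConsumer.delete(consumerId);
  return true;
}
```

The ownership check in `revokeOnDone` is the part that corresponds to the done-clobbers-newer-claim fix: a `done` from a superseded consumer returns `false` and leaves the newer token in place.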
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4287 +/- ##
==========================================
+ Coverage 54.87% 55.48% +0.60%
==========================================
Files 193 193
Lines 19567 19599 +32
Branches 5062 5065 +3
==========================================
+ Hits 10737 10874 +137
+ Misses 8828 8721 -107
- Partials 2 4 +2
✅ Deploy Preview for electric-next ready!
These tests use ctx.currentWriteToken (now null) for tag operations. They need to be adapted to the claim-scoped token flow in a follow-up. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- `tag update on stopped entity`: kill clears claims, so the correct response is 401 (no valid claim), not 409. Updated assertion.
- `sequential tag updates accumulate`: uses the claim flow (send message → expectWebhook → claim via callback-forward → get write token) to obtain a claim-scoped token before writing tags.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The durable-streams callback endpoint requires a Bearer token for authentication. Pass notification.parsed.token as the Authorization header when claiming via callback-forward. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
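As a rough illustration of the change in this commit, a test helper could build the claim request like this. The `ParsedNotification` shape, the helper name `buildClaimRequest`, and the JSON body are assumptions; only `notification.parsed.token` and the `Bearer` Authorization header come from the commit message.

```typescript
// Hypothetical notification shape; only `parsed.token` is named in the commit.
interface ParsedNotification {
  parsed: { token: string };
}

interface ClaimRequestInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Build fetch options for a claim sent through callback-forward.
// The claim body is passed in by the caller because its format is not shown here.
function buildClaimRequest(
  notification: ParsedNotification,
  claimBody: unknown,
): ClaimRequestInit {
  return {
    method: "POST",
    headers: {
      "content-type": "application/json",
      // The durable-streams callback endpoint authenticates via this header.
      authorization: `Bearer ${notification.parsed.token}`,
    },
    body: JSON.stringify(claimBody),
  };
}
```

A caller would pass the result to `fetch` together with the callback-forward URL, which is not shown in this conversation and is therefore left out of the sketch.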
balegas left a comment
Invariant clarification
The "only the current claim holder can write" invariant does not strictly hold: handleStreamAppend checks the token synchronously and then forwards asynchronously, so an in-flight write from a superseded claim can still land. Runtime writes are epoch-fenced via IdempotentProducer, but direct HTTP writers (external clients, tag writes) only have the bearer check. Probably fine in practice (short window, rare supersession), just noting it.
TTL for orphaned claims
issuedAt is stored but unread. A crashed consumer wedges the stream until another claim, kill, or restart. Suggest a lazy check in isValidEntityWriteToken (or a small sweep) that evicts entries older than ~3× heartbeat. Cheap, and closes the orphan case.
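A lazy check along these lines might look like the sketch below. The map value shape, the `HEARTBEAT_MS` value, and the function signature (including the injectable `now`) are assumptions for illustration; only `isValidEntityWriteToken`, `issuedAt`, and the "~3× heartbeat" threshold come from the comment.

```typescript
// Hypothetical claim record; `issuedAt` is the field the review refers to.
interface ActiveClaim {
  token: string;
  issuedAt: number; // epoch milliseconds
}

const HEARTBEAT_MS = 10_000; // assumed heartbeat interval, not taken from the PR
const CLAIM_TTL_MS = 3 * HEARTBEAT_MS;

const activeClaimWriteTokens = new Map<string, ActiveClaim>();

// Validate a write token, evicting the claim lazily if it is older than the TTL.
function isValidEntityWriteToken(
  streamPath: string,
  token: string,
  now: number = Date.now(),
): boolean {
  const claim = activeClaimWriteTokens.get(streamPath);
  if (!claim) return false;

  if (now - claim.issuedAt > CLAIM_TTL_MS) {
    // Orphaned claim: drop it so the stream is no longer wedged.
    activeClaimWriteTokens.delete(streamPath);
    return false;
  }
  return claim.token === token;
}
```

With a fixed `issuedAt`, a healthy long-running consumer would also expire, so a real version would presumably refresh the timestamp on each heartbeat or track a separate last-seen value.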
Bug: done clears the token before updating status
if (stillOwnsClaim) clearActiveClaimForStream(...)
...
if (entity && stillOwnsClaim) await registry.updateStatus(entity.url, `idle`)

If updateStatus throws, the token is gone but the entity stays running: a retried done sees stillOwnsClaim === false and never sets idle. Swap the order: update status first, clear the token after.
If updateStatus throws after the token is already cleared, a retried done sees stillOwnsClaim === false and never transitions to idle. Fix by updating status first, clearing the token only on success. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
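The reordering can be sketched as below. The `Registry` interface, the `handleDone` name, and the options object are hypothetical stand-ins; the real handler's surroundings are not shown in this conversation.

```typescript
// Hypothetical stand-in for the registry used by the done handler.
interface Registry {
  updateStatus(url: string, status: string): Promise<void>;
}

interface DoneOptions {
  registry: Registry;
  entityUrl: string | null; // null when the entity no longer exists
  stillOwnsClaim: boolean;
  clearActiveClaim: () => void;
}

// Update status first; clear the claim token only after that succeeds.
async function handleDone(opts: DoneOptions): Promise<void> {
  if (!opts.stillOwnsClaim) return;

  if (opts.entityUrl !== null) {
    // If this throws, the claim is still intact, so a retried done
    // still sees stillOwnsClaim === true and can set idle.
    await opts.registry.updateStatus(opts.entityUrl, "idle");
  }
  opts.clearActiveClaim();
}
```

The trade-off is that a failed status update leaves the token valid until the retry succeeds, which seems preferable to an entity stuck in running with no way to reach idle.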
Summary
Replace static entity write tokens with ephemeral, claim-scoped tokens. Under the old model, the entity's permanent `write_token` was returned via the `x-write-token` header on spawn and embedded in webhook notification payloads, so any consumer that ever saw it retained permanent write access. Now, write tokens are issued when a consumer claims a wake and revoked when it sends `done`.

Approach
Token lifecycle:
- … `randomUUID()` token, stores it in an in-memory `activeClaimWriteTokens` map
- `done` → token is revoked from the map

Key implementation detail: the callback-forward endpoint acts as a proxy between the consumer and the durable-streams server's `/callback/{consumerId}` endpoint. Claim requests must include the durable-streams claim token (`notification.token`) as a `Bearer` auth header. When the durable-streams server responds `{ok: true}`, the agents-server enriches the response with a claim-scoped write token.

Key invariants:
- `autoClaim: true` on `IdempotentProducer` ensures the producer participates in the claim protocol

Non-goals:
- … (`issuedAt` is stored for future use but no sweep is implemented yet)
- `write`/`set_tag`/`writeStateProtocol` DSL actions (they used static tokens; adapting them to the claim flow is a follow-up)

Trade-offs:
- Removed `writeToken` from webhook notification payloads entirely rather than deprecating: clean break since the runtime already uses the claim flow.

Verification
Files changed
| File | Change |
| --- | --- |
| `agents-server/src/server.ts` | … (`activeClaimWriteTokens` + `activeClaimWriteTokensByConsumer`) for one-to-one invariant. Done-clobber fix. |
| `agents-server/src/electric-agents-manager.ts` | … `writeTokenValidator` via `setWriteTokenValidator()` dependency injection |
| `agents-server/src/electric-agents-routes.ts` | Remove `x-write-token` from spawn response, clear claims on kill path |
| `agents-runtime/src/process-wake.ts` | Add `autoClaim: true` to `IdempotentProducer`, simplify writeToken fallback |
| `agents-runtime/src/types.ts` | Remove `writeToken` from `WebhookNotification` interface |
| `agents-server/test/server-claim-write-token.test.ts` | |
| `agents-server/test/wake-registry.test.ts` | … `appendInternalEvent` helper (bypass HTTP write auth) |
| `agents-server/test/scheduler-integration.test.ts` | |
| `conformance-tests/src/electric-agents-tests.ts` | `sequential tag updates` uses callback-forward with auth, `tag update on stopped entity` expects 401, new spawn-token-absence assertions. Skip old `write`/`set_tag` tests. Remove dead code from property tests. |
| `conformance-tests/src/electric-agents-dsl.ts` | … `write`, `set_tag`, `writeStateProtocol` from DSL action types |
| `website/docs/.../programmatic-runtime-client.md` | … `setTag`/`removeTag` docs to reference claim-scoped write tokens |

🤖 Generated with Claude Code