docs(specs): PR Shepherds design #4296

Draft
balegas wants to merge 34 commits into main from balegas/pr-shepherds-design

balegas commented May 8, 2026

Summary

Draft design for PR Shepherds — a multi-agent system that shepherds GitHub PRs through a fixed set of gates (template, CI, conflicts, review threads, docs) until ready to merge. First non-trivial multi-agent system on the Electric Agents reactive-blackboard pattern.

The primary goal is to exercise the framework end-to-end on a real workload, find platform bugs, and surface improvements before tackling harder agentic systems.

Three worker agents (pr-reviewer, pr-build-doctor, pr-doc-editor) plus a thin pr-manager and discovery pr-watcher. Each is a hybrid: TS entity shell wires subscriptions/tools/prelude; the agent's reasoning lives in a markdown skill at packages/agents/skills/pr/<role>.md.

Spec at docs/superpowers/specs/2026-05-08-pr-shepherds-design.md. Sections cover entities, shared blackboard schema, signal vocabulary, subscription mechanism, per-role decision trees, iteration caps and the /continue and /stop slash-commands, safety gates (agents label entry, worktree lock, no force-push), failure modes, phase-1/phase-2 boundary, testing strategy, component layout, and §15 Templates (PR description, review thread comment, status comment, commit message, slash-command grammar).

Deliberately limited (per §2): no coding agent, one repo per watcher, no webhooks (phase 1), no automerge, not a general-purpose framework.
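
As a rough illustration of how iteration caps and the /continue and /stop slash-commands might interact, here is a minimal sketch. The real grammar is defined in §15 of the spec and is not reproduced in this PR description, so the regex, the optional role argument, the `IterationBudget` class, and the "grant N more iterations" semantics are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the names, the regex, and the "role" argument are
// assumptions for illustration, not the grammar defined in §15 of the spec.
type CommandName = "continue" | "stop"

interface SlashCommand {
  name: CommandName
  role: string | undefined // short role name, e.g. "reviewer" (assumed)
}

// Parse a comment body that consists solely of a slash-command.
function parseSlashCommand(body: string): SlashCommand | null {
  const match = /^\/(continue|stop)(?:\s+([a-z][a-z-]*))?$/.exec(body.trim())
  if (match === null) return null
  return { name: match[1] as CommandName, role: match[2] }
}

// Tracks how many iterations a role has used against its cap.
class IterationBudget {
  private used = 0
  private cap: number
  private stopped = false

  constructor(cap: number) {
    this.cap = cap
  }

  // Returns true and counts one iteration if the role may run again.
  tryConsume(): boolean {
    if (this.stopped || this.used >= this.cap) return false
    this.used += 1
    return true
  }

  // "/stop" halts the role; "/continue" resumes it with `extra` more iterations.
  apply(command: SlashCommand, extra: number): void {
    if (command.name === "stop") {
      this.stopped = true
    } else {
      this.stopped = false
      this.cap = this.used + extra
    }
  }
}
```

In this sketch a role that has exhausted its cap stays paused until a human posts /continue, which mirrors the pause/resume behaviour the test plan asks reviewers to sanity-check.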

Test plan

  • Review the spec for correctness, ambiguity, and unstated assumptions.
  • Confirm the framework's reactive-observe API shape (ctx.observe(...).where(...)) matches what §3.4 assumes.
  • Sanity-check the iteration caps and pause/resume semantics against your expectations.
  • Validate template defaults (§15.2 PR description; §15.4 status comment) against existing repo conventions.
  • Approve before moving to the implementation plan via the writing-plans skill.

🤖 Generated with Claude Code

balegas and others added 30 commits May 8, 2026 16:05
Reactive blackboard architecture for shepherding GitHub PRs through
gates (template, CI, conflicts, threads, docs) using independent
observer agents that wake on signals. Phase 1 polls; phase 2 will
swap in webhooks without changing the observer contracts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the dedicated pr-observer entity with the existing generic
worker entity. Each role becomes a skill under skills/pr/ that the
worker loads via use_skill on spawn. pr-manager owns the only
persistent subscription to the signals collection and dispatches a
fresh worker per signal+role pair, with iteration counters persisted
in agent_state so caps work across spawns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Consolidate seven roles into three long-lived worker entities
(pr-reviewer, pr-build-doctor, pr-doc-editor) plus pr-manager that
absorbs the four mechanical roles (sync, description, gates,
lifecycle). All five entities are hybrid: small TS shell wires
subscriptions/tools/prelude, agent reasoning lives in a markdown
skill at packages/agents/skills/pr/<role>.md. Each entity has its
own persistent timeline so it can reason about prior work on the PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop historical references to prior design drafts.
- Add per-watcher state schema (managed_prs ledger).
- Clarify reviewer skill decision tree to handle new_human_comment
  and continue_granted signals correctly (review pass and address
  pass decided independently).
- Document signal payload shapes (head_sha_changed, ci_failed,
  new_human_comment, commits_pushed, human_input_required,
  continue_granted).
- Fix role-naming convention: entity = pr-<role>, role name in
  state/payloads/slash-commands = short form.
- Detect human-authored pushes by checking head_sha against the
  agent-authored commits table.
- Consistency: 'observers' -> 'workers'; 'emit/write a signal' ->
  'insert' where appropriate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
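
The human-push check described in the commit above can be sketched as a simple membership test. The `AgentCommit` row shape and the `isHumanPush` name are hypothetical; the spec's actual table schema is not shown in this PR.

```typescript
// Hypothetical shape for a row in the agent-authored commits table.
interface AgentCommit {
  sha: string
  role: string // short role name of the agent that pushed it (assumed field)
}

// A push counts as human-authored when the new head SHA is not one that an
// agent recorded after pushing.
function isHumanPush(headSha: string, agentCommits: readonly AgentCommit[]): boolean {
  return !agentCommits.some((commit) => commit.sha === headSha)
}
```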
- §15 Templates defines the six concrete artifacts agents produce or
  consume: gate computation rules, PR description template (default
  + repo override path), review-thread comment, status comment,
  commit message, thread reply, slash-command grammar.
- Spell out what ready_to_merge does: apply 'agents:ready' label,
  update status comment. No auto-merge, no LGTM, no draft removal.
- Add 2s wake debounce to the manager so chatty PRs don't burn
  through one agent run per signal.
- Add mergeable + status_comment_id to pr_meta schema since they're
  now referenced.
- Drop the now-defined items from §14 Items deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
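
To make the 2s wake debounce concrete, here is a minimal sketch with an explicit clock so the behaviour is easy to reason about. It assumes a trailing-edge debounce (the run fires once the PR has been quiet for the full window); the commit does not say which edge is used, and `WakeDebouncer` is a hypothetical name.

```typescript
// Hypothetical trailing-edge debouncer driven by caller-supplied timestamps.
class WakeDebouncer {
  private waitMs: number
  private deadline: number | null = null

  constructor(waitMs: number) {
    this.waitMs = waitMs
  }

  // Record a signal; every new signal pushes the deadline back.
  signal(nowMs: number): void {
    this.deadline = nowMs + this.waitMs
  }

  // Returns true exactly once per burst, when the quiet window has elapsed.
  shouldRun(nowMs: number): boolean {
    if (this.deadline !== null && nowMs >= this.deadline) {
      this.deadline = null
      return true
    }
    return false
  }
}
```

With a 2000 ms window, signals arriving at 0, 500, and 1000 ms collapse into a single agent run at 3000 ms instead of three separate runs.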
The primary goal is exercising the Electric Agents framework on a
real workload to find platform bugs, not building a fully autonomous
PR-shipping system. Spell out the deliberate limits up front:
no coding agent, one repo per watcher, no webhooks, no automerge.

Also drop redundant non-goals now covered by §1's limits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "no coding agent / one repo / no webhooks / no automerge"
items are scope decisions, not goals. Consolidate them into §2 with
the rest of the non-goals so the goal section stays focused on
intent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Existing fixtures (runtime-dsl tests, process-wake tests) all set 'type'
on every SharedStateCollectionSchema entry. The plan omitted it, which
typechecks today only because no consumer wires the schema yet — the
moment Tasks 13-18 do db('pr-...', schema), strict structural matching
against SharedStateCollectionSchema would fail.

Set distinct event-type strings (pr:meta, pr:check, pr:signal, etc.) so
collectionNameByEventType in entity-stream-db.ts can route change events
deterministically, instead of relying on the state:${name} fallback.
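
A sketch of why explicit event-type strings make routing deterministic. This is not the code in entity-stream-db.ts: `buildEventTypeIndex` and the `CollectionSchemaEntry` shape are hypothetical, and the fallback rule is inferred only from the `state:${name}` wording above.

```typescript
// Hypothetical schema entry: `type` is the event-type string for a collection.
interface CollectionSchemaEntry {
  type?: string
}

// Build an index from event type to collection name. Entries without an
// explicit type fall back to `state:<collection name>` (assumed behaviour).
function buildEventTypeIndex(
  schema: Record<string, CollectionSchemaEntry>
): Map<string, string> {
  const index = new Map<string, string>()
  for (const [name, entry] of Object.entries(schema)) {
    index.set(entry.type ?? `state:${name}`, name)
  }
  return index
}
```

With distinct strings such as pr:meta and pr:signal, each change event maps to exactly one collection without depending on the collection's name.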
pr_synced has Record<string, never> per spec §3.3 so {foo:1} no longer
type-checks. Switch to head_sha_changed which has a typed payload to
exercise the same generic-payload-passthrough behavior.
…to-discovers them

SkillsRegistry.scanDir is non-recursive — skills under skills/pr/ would
not have been loaded into the catalog, so use_skill('pr-reviewer') would
return null. Move the four worker skills to skills/pr-<role>.md (flat).
Templates stay at skills/pr/templates/ since they're not registered
skills, just data files the skills reference.
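
The effect of a non-recursive scan can be reproduced with Node's fs module. The sketch below is a stand-in, not the actual SkillsRegistry.scanDir; `scanFlat` and `demoScan` are hypothetical names, and the temporary directory layout is made up for the demonstration.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, writeFileSync } from "node:fs"
import { tmpdir } from "node:os"
import { basename, extname, join } from "node:path"

// Hypothetical stand-in for a non-recursive scan: it lists only the
// Markdown files that sit directly in `dir` and ignores subdirectories.
function scanFlat(dir: string): string[] {
  return readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile() && extname(entry.name) === ".md")
    .map((entry) => basename(entry.name, ".md"))
    .sort()
}

// Builds a throwaway skills directory with one flat and one nested skill,
// then returns what the flat scan discovers.
function demoScan(): string[] {
  const skillsDir = mkdtempSync(join(tmpdir(), "skills-"))
  writeFileSync(join(skillsDir, "pr-reviewer.md"), "# reviewer skill\n")
  mkdirSync(join(skillsDir, "pr"))
  writeFileSync(join(skillsDir, "pr", "build-doctor.md"), "# nested skill\n")
  return scanFlat(skillsDir)
}
```

Only the flat pr-reviewer.md is found; the file under pr/ is skipped, which is why the skills were moved up a level.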
execFile's promisified return is { stdout: string | NonSharedBuffer }
when no encoding is specified. GhRunner expects string. Coerce explicitly.
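
One way to do the explicit coercion is a small helper that accepts either form of stdout. `toText` and `runGh` are hypothetical names; the actual GhRunner signature is not shown in this PR, so treat this as a sketch under that assumption.

```typescript
import { execFile } from "node:child_process"
import { promisify } from "node:util"

const execFileAsync = promisify(execFile)

// Normalise stdout to a string whether it arrives as text or as a Buffer.
function toText(output: string | Buffer): string {
  return typeof output === "string" ? output : output.toString("utf8")
}

// Hypothetical runner: invokes the gh CLI and always resolves to a string.
async function runGh(args: string[]): Promise<string> {
  const { stdout } = await execFileAsync("gh", args)
  return toText(stdout)
}
```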
balegas and others added 4 commits May 9, 2026 01:55
Register all 5 PR shepherd entities (pr-watcher, pr-manager, pr-reviewer, pr-build-doctor, pr-doc-editor) in the bootstrap registry alongside horton and worker. Includes a comprehensive test to ensure all entity types are properly exposed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… idempotent firstWake

Three correctness fixes from final review:

1. ctx.observe now passes { wake: { on: 'change', collections: ['signals'] } }
   so the manager wakes on any worker-emitted signal (commits_pushed,
   review_complete, human_input_required), not just timer/inbox.
2. Insert gate_state_changed when a gate flips and ready_to_merge when
   it newly becomes true — spec §3.3 lists pr-manager as producer.
3. Guard pr_meta and agent_state inserts with empty-collection checks so
   re-delivered firstWake events don't crash on duplicate-key inserts.
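
Fixes 2 and 3 can be sketched against a small in-memory collection. Everything here is illustrative: the `Collection` class, the row and signal shapes, and the reading of ready_to_merge as "all gates now pass" are assumptions, not the framework's API or the spec's payload definitions.

```typescript
type Row = { key: string; [field: string]: unknown }

// Minimal in-memory collection that rejects duplicate keys.
class Collection {
  private rows = new Map<string, Row>()

  get size(): number {
    return this.rows.size
  }

  insert(row: Row): void {
    if (this.rows.has(row.key)) throw new Error(`duplicate key: ${row.key}`)
    this.rows.set(row.key, row)
  }
}

// Fix 3: seed state only when the collections are empty, so a re-delivered
// firstWake is a no-op instead of a duplicate-key crash.
function onFirstWake(prMeta: Collection, agentState: Collection, prNumber: number): void {
  if (prMeta.size === 0) prMeta.insert({ key: `pr-${prNumber}`, number: prNumber })
  if (agentState.size === 0) agentState.insert({ key: "manager", iterations: 0 })
}

type Gates = Record<string, boolean>
type Signal =
  | { kind: "gate_state_changed"; gate: string; passing: boolean }
  | { kind: "ready_to_merge" }

function allPassing(gates: Gates): boolean {
  const values = Object.values(gates)
  return values.length > 0 && values.every(Boolean)
}

// Fix 2: one gate_state_changed per flipped gate, plus ready_to_merge only
// on the transition from "not all passing" to "all passing".
function diffGates(previous: Gates, next: Gates): Signal[] {
  const signals: Signal[] = []
  for (const [gate, passing] of Object.entries(next)) {
    if (previous[gate] !== passing) {
      signals.push({ kind: "gate_state_changed", gate, passing })
    }
  }
  if (!allPassing(previous) && allPassing(next)) signals.push({ kind: "ready_to_merge" })
  return signals
}
```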

codecov Bot commented May 9, 2026

Codecov Report

❌ Patch coverage is 80.75117% with 82 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.21%. Comparing base (590aabb) to head (f435de3).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| packages/agents/src/agents/pr-manager.ts | 79.24% | 33 Missing ⚠️ |
| packages/agents/src/agents/pr-reviewer.ts | 75.00% | 13 Missing ⚠️ |
| packages/agents/src/agents/pr-shared/github.ts | 55.17% | 13 Missing ⚠️ |
| packages/agents/src/agents/pr-watcher.ts | 75.00% | 11 Missing ⚠️ |
| packages/agents/src/agents/pr-shared/worktree.ts | 60.86% | 9 Missing ⚠️ |
| ...ages/agents/src/agents/pr-shared/status-comment.ts | 92.10% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4296      +/-   ##
==========================================
+ Coverage   57.23%   59.21%   +1.98%     
==========================================
  Files         227      266      +39     
  Lines       23013    24892    +1879     
  Branches     6006     6493     +487     
==========================================
+ Hits        13171    14741    +1570     
- Misses       9837    10144     +307     
- Partials        5        7       +2     

| Flag | Coverage Δ |
|---|---|
| packages/agents | 70.83% <80.75%> (+10.52%) ⬆️ |
| packages/agents-mcp | 77.19% <ø> (?) |
| packages/agents-runtime | 80.10% <ø> (+0.87%) ⬆️ |
| packages/agents-server | 69.27% <ø> (+0.19%) ⬆️ |
| packages/agents-server-ui | 5.31% <ø> (-0.09%) ⬇️ |
| packages/electric-ax | 38.04% <ø> (ø) |
| packages/experimental | 87.73% <ø> (ø) |
| packages/react-hooks | 86.48% <ø> (ø) |
| packages/start | 82.83% <ø> (ø) |
| packages/typescript-client | 94.32% <ø> (ø) |
| packages/y-electric | 56.05% <ø> (ø) |
| typescript | 59.21% <80.75%> (+1.98%) ⬆️ |
| unit-tests | 59.21% <80.75%> (+1.98%) ⬆️ |

netlify Bot commented May 9, 2026

Deploy Preview for electric-next ready!

Name Link
🔨 Latest commit f435de3
🔍 Latest deploy log https://app.netlify.com/projects/electric-next/deploys/69fe89c9c6e7df0008e98231
😎 Deploy Preview https://deploy-preview-4296--electric-next.netlify.app