Skip to content

Step 2.5: Inbox substrate + dumb v1 ordering + operator-action logging #23

@heavygee

Description

@heavygee

Goal

Land the inbox_items substrate and a deliberately dumb v1: promote per-session terminal states into persistent operator attention items, ordered oldest-first with category labels (no learned salience), and log every operator action as a training signal. This is the instrumented training ground for the future cross-session salience model — not a smart prioritizer.

Spec

  • docs/plans/2026-06-03-overseer-build-sequence.md Step 2.5
  • docs/plans/2026-06-03-overseer-contracts.md §3 (inbox item model), §1 (attention_candidate promotion gate), §11 (provenance)
  • docs/plans/2026-06-03-overseer-prioritization.md §5 (v0 simplifications; implicit/explicit operator signals)
  • docs/plans/2026-06-05-overseer-rollout-and-diagrams.md §4 (attention-inbox mocks + confirmed v1 decisions) — primary for v1 scope
  • Upstream: tiann Voice roadmap: picker -> provider abstraction -> local stack -> fleet command centre tiann/hapi#691 (the "priority model" = category taxonomy + coarse rank; learned salience is later)

Acceptance (v1 — deliberately dumb)

  • Migration adds inbox_items table (per contracts §3, full-model schema) + a join/index path from events.id to inbox_items.source_event_ids. Schema is designed for the full salience model; v1 leaves the learned fields inert.
  • Per-session promotion: each session terminal/turn state with attention_candidate = 1 becomes one attention item. No cross-session dedupe / merge / root-cause rollup in v1 — deferred to the Overseer (Step 3: Read-only Overseer wired to voice #25).
  • Dumb ordering: default sort is oldest-first (email discipline). Category badge (QUESTION / BLOCKED / ERROR / FINALE / STALE / needs-approval) is a label, not computed priority. No learned weights. The coarse permission > blocked > completion rank exists only as an optional grouping (the Triage-Stack view), not the default.
  • Operator-action logging from day one: open / snooze / done / dismiss / delete / route / retry are recorded against the item as training signal (per inbox_items status + operator_feedback), even though nothing consumes them yet. The dataset is the point.
  • Artifact-centric title where available: when an item carries an artifact_ref (PR / issue / branch / deploy), surface that as the title; fall back to the session name otherwise.
  • Lightweight explain_priority/provenance: "why am I seeing this?" answerable from category + age + source event IDs (full aging/criticality reasoning is a later layer).

Out of scope (deferred)

Dependencies

Suggested PR breakdown

1-2 PRs: inbox schema migration; per-session promotion + oldest-first ordering + category labels + operator-action logging.

Risks

  • "Becomes email" — unread-count anxiety. Guardrails: stale expiry, severity-label discipline, FYI ⇒ captured-only (never queued), bulk handling, "why am I seeing this?" answerable. (See rollout doc §4 failure-mode list.)
  • Resisting premature smartness: the temptation to add priority inference in v1 defeats the purpose — the first useful dataset is operator response, not agent judgement.

Metadata

Metadata

Assignees

No one assigned

    Labels

    architectureArchitectural / substrate workfleet-overseerFleet attention-arbitration architecturemvpPart of the Overseer MVP acceptance bar (Steps 1-4)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions