Skip to content

feat(selflearn): L1 per-turn background reviewer implementation (#52)#75

Merged
yanmxa merged 77 commits into
mainfrom
feat/selflearn-l1
Jun 3, 2026
Merged

feat(selflearn): L1 per-turn background reviewer implementation (#52)#75
yanmxa merged 77 commits into
mainfrom
feat/selflearn-l1

Conversation

@yanmxa
Copy link
Copy Markdown
Member

@yanmxa yanmxa commented May 30, 2026

L1 self-learning implementation for #52, the per-turn background reviewer
designed in PR #58. Wires the feature end-to-end: trigger core →
restricted fork → memory + skill stores → action-permission gates →
runtime UI surfaces.

What's in this PR

Core (internal/selflearn/)

  • Reviewer (per-turn trigger) — two arms with independent enable +
    cadence, single-flight drop-don't-reset, defensive snapshot copy before
    the goroutine, hydration on session resume.
  • MemoryStore (memory_write tool surface) — project-partitioned
    store under ~/.gen/projects/<encoded-cwd>/memory/, atomic temp+rename
    writes, per-file 25 KB cap matching the injection cap (§4.2 invariant),
    injection scanner rejecting prompt-injection / exfiltration payloads.
  • SkillManager (skill_manage tool surface) — six actions (create,
    edit, patch, write_file, remove_file, delete) gated by Deny-encoded
    permission flags + one advanced opt-in for patching user-authored
    skills; injection scanner on bodies + descriptions + patch results +
    support files.
  • RunReview (fork mechanics) — fresh core.Agent inheriting the
    parent's cached system prompt verbatim for prefix-cache parity (§6
    invariant push #2), restricted toolset whitelist + action-level allowlist,
    5-min wall-clock deadline, best-effort + recover, single-flight.
  • Three review prompts (memory / skill / combined) implementing the
    full §5.1 trigger conditions — UPDATE > DELETE > CREATE preference,
    CREATE four required criteria, the "user voiced a correction" UPDATE
    path, eviction-first memory framing with the cap-near retire loop.
  • ResolveSettings bridges setting.SelfLearn → runtime values
    (Config, ActionPermissions, MemoryMaxChars) with Validate() up front
    so an illegal §3.1 combination is rejected at startup.

Config (internal/setting/)

  • SelfLearnSettings + SelfLearnMemory + SelfLearnSkills types
    with Deny-encoded permission booleans (zero ⇒ allow, Go idiom).
  • Validate() enforces the three illegal combinations from §3.1
    (no-update-with-create, no-delete-when-create+update,
    advanced-opt-in-without-base).
  • UpdateSelfLearnAt(cfg, userLevel) persists the panel's edits
    through the existing atomic merge path.

Wire-up (internal/app/)

  • wireSelfLearn at session start; reads setting.SelfLearn, applies
    the GEN_DISABLE_SELF_LEARN env override (§3.1 escape hatch),
    builds stores + manager + reviewer, hooks write observers, captures
    build params for cache-parity-preserving fork construction.
  • OnTurnEnd forwards every Result to the reviewer; cancelled /
    interrupted turns are silently skipped inside Observe.
  • StopAgentSession clears the reviewer reference; in-flight goroutines
    finish on the 5-min deadline.
  • memory-auto reminder provider registered alongside memory-user
    and memory-project so agent-written entries are injected as their
    own scope and never mixed with user-authored GEN.md / CLAUDE.md.

User-visible surface (internal/app/selflearn_ui.go + UI hooks)

  • SelfLearnUIState four-phase state machine (idle / reviewing / done
    / failed) with target debounce and auto-decay holds (2s / 3s).
  • Status-bar evolving <braille> <target> indicator using the shared
    kit.BrailleSpinnerFrames table; hidden in both idle and disabled
    states (status line is pixel-identical to a feature-off session).
  • In-stream recap block built from the action log (the actual tool
    calls captured via observers), not the model's narration — the recap
    reads as structurally accurate data, not LLM self-report.
  • /config slash command opens a Self-Learning panel with editable
    rows for both arms, inline validation, scope toggle between user-wide
    and project-local saves.

Tests

  • -race clean across internal/selflearn, internal/setting,
    internal/reminder, internal/app, internal/app/conv,
    internal/app/input, internal/app/kit.
  • Coverage includes: trigger cadence (turns / iters / combined),
    interrupted-turn skip, concurrency cap behaviour, restricted-toolset
    enforcement, permission gate (every legal §5.5 combination +
    startup-rejection of the three illegal ones), allowUpdateUserCreated
    scope extension, injection scan on every memory + skill write path,
    recap-block formatting + verb mapping, state-machine transitions
    including done-hold decay survival across DrainActions, snapshot
    isolation under concurrent Observe.

What's NOT in this PR (per the design)

  • §8 L2 curator — explicitly deferred to its own issue
  • §10 user-level auto-memory (cross-project persona) — open question
  • §10 topic-file count cap — L2's job
  • §10 cache parity verification on non-Anthropic providers — needs
    real-provider testing
  • §10 usage telemetry — Phase 2 prereq

Manual e2e

Not yet run against a real LLM. Reviewers / users who pull this branch
can exercise by setting in settings.json:

{
  "selfLearn": {
    "memory": { "enabled": true, "everyTurns": 3, "maxKB": 25 },
    "skills": { "enabled": true, "everyToolIters": 5 }
  }
}

Then watch for evolving in the bottom bar after 3 user turns / 5
cumulative tool iters, and the recap block appearing in the
conversation flow on completion. Disk artifacts will land in
~/.gen/projects/<encoded-cwd>/memory/ and ~/.gen/skills/<name>/ (or
./.gen/skills/ for project-level entries).

Design alignment

Design note (#58, now in main) tracked at every step; cross-references
in code comments use the stable §-anchors. Highlights of where this PR
deviates from the merged design and why:

  • Permission flags renamed from allowCreate / allowUpdate / allowDelete
    (*bool) to denyCreate / denyUpdate / denyDelete (bool) — zero
    value should be the safe default. The design note was updated to
    match in the same PR that landed the rename (commit e2780e2).
  • Inventory() reads SKILL.md directly from disk instead of via
    skill.Registry. The registry is loaded once at startup and only
    refreshed via ReloadPluginBackedState; a registry-backed Inventory
    would not see skills the reviewer created earlier in the same
    session, causing it to re-create duplicates. The disk scan is the
    correct freshness for this caller. Documented inline.

Closes #52 (implementation).

🤖 Generated with Claude Code

yanmxa and others added 30 commits June 2, 2026 23:22
First slice of the self-learning loop (#52). Package internal/selflearn (not a
cryptic "l1") holds the per-turn Reviewer:

- Two-signal trigger: memory on user-turn cadence, skills on tool-iteration
  threshold; combined when both trip the same turn.
- Only StopEndTurn turns count (cancelled/interrupted/max-turns skipped).
- ≤1 review in flight; a trigger that arrives mid-review is dropped and NOT
  reset, so it fires again on the next clean turn.
- The fork+review is an injected ReviewFunc so the trigger is unit-testable
  without an LLM; SeedTurns hydrates cadence on resume.
- selfLearn settings (memory/skills enabled + intervals), off by default.

Tests cover memory cadence, skill threshold, combined, interrupted-turn skip,
concurrency cap (drop + retry), and seed. Build + format-check + tests green.

Next slices: the fork (restricted core.Agent), memory_write + skill_manage
tools, the memory store + injection source, skill origin field, review prompts,
and Task.Start wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the L1 reviewer's durable-memory write surface: a project-partitioned
MemoryStore under ~/.gen/projects/<encoded-cwd>/memory/ with add/replace/
remove-by-substring entry semantics (mirroring the hermes memory tool), atomic
temp+rename writes, exact-duplicate dedup, a per-file char budget, and a
coarse injection/exfiltration scan since entries are injected into a future
system prompt. The memory_write tool exposes these to the fork only.

Add the read side as a distinct source so agent-written memory and
user-authored GEN.md/CLAUDE.md never mix: system.AutoMemoryDir/IndexPath and
LoadAutoMemory (capped, line-boundary truncation). Reminder-layer injection is
wired in a later slice.

Also fix two stale "l1:" log prefixes left from the package rename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the Origin frontmatter field to Skill (absent = user-created) so the
reviewer can tell apart skills it owns. Add the L1-only skill_manage tool over
gen-code's existing user/project skill scopes (no agent-created/ subdir):
create / patch / edit / write_file / remove_file / delete, all refusing to
touch user-created skills.

patch uses a pragmatic fuzzy-match chain (exact -> line-trimmed ->
whitespace-collapsed) with an escape-drift guard and a unique-match
requirement; the block-anchor and context-similarity tiers from the design are
deferred. Class-level kebab names are enforced (also a traversal guard), and
support files are confined to references/templates/scripts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add RunReview: forks a restricted core.Agent over the just-completed turn
snapshot to capture durable learnings. The fork inherits the parent's system
prompt verbatim via a fresh System (avoids clobbering the parent's telemetry
observer that NewAgent installs), is granted only memory_write + skill_manage
behind a static allow-only permission policy (never prompts the TUI), runs
headless (OutboxBuf -1) with a bounded MaxTurns, and is driven via
SetMessages + Append + ThinkAct.

Add the three review prompts (memory / skill / combined) selected by which arms
fired; each embeds the current memory store and skill inventory so the fork
refreshes and dedupes rather than blindly appending. Thread the turn snapshot
through ReviewFunc so the reviewer reviews the turn it observed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Security — extend the injection scan from memory writes to all four skill
write paths (Create, Edit, Patch, WriteFile). Skills are loaded into a
future system prompt the same way memory is, so a poisoned skill body or
description is the same stored-injection vector with the same fix.

Refactor scan.go to expose two checks:
  - scanContent — rejects empty + threats; for required content
    (memory entries, skill bodies).
  - scanForThreats — threats only, allows empty; for optional fields
    (skill descriptions, support-file contents, patch results that may
    legitimately remove text).

Add TestSkillRejectsInjectionContent covering create / edit / patch /
write_file paths plus the rollback guarantee (a rejected write must not
mutate the on-disk skill).

memory.go's two existing callsites are renamed scanMemoryContent
→ scanContent to match the refactored API.

Docs alignment (post-rebase) — update three stale §3a / §3b code-comment
references to the merged design doc's new structure: §4 Memory flow,
§5.2 Provenance / write scope, §5.3 Tool surface. invariant #8 and §3
references remain valid and are untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement the action-level permission scheme defined in the merged design
note: three booleans (allowCreate / allowUpdate / allowDelete) gate the
corresponding skill_manage actions on the agent-created scope; one
advanced opt-in (allowUpdateUserCreated) extends patch — and only patch —
to user-created skills.

Config (internal/setting/settings.go)
- SelfLearnMemory grows a MaxKB field with the 25 KB upper-bound invariant
  (§4.2). Constants: SelfLearnMaxMemoryKB, SelfLearnDefaultMemoryKB.
- SelfLearnSkills grows four permission fields. The three core ones are
  *bool so unset (nil) cleanly distinguishes from explicit false; default
  resolves to true via AllowCreateOr / AllowUpdateOr / AllowDeleteOr.
- SelfLearnSettings.Validate enforces the three rejected combinations
  documented in §3.1: create-without-update, only-add-never-subtract,
  advanced-opt-in-without-update-base; plus the maxKB range check.
- MaxKBOr returns the default fallback so callers don't repeat the
  zero-means-default branch.

SkillManager (internal/selflearn/skill.go)
- ActionPermissions struct + DefaultActionPermissions() constructor. Carries
  the four booleans. NewSkillManager now takes ActionPermissions explicitly
  so the caller's intent is visible at every construction site.
- requirePatchable extends the existing requireAgentOwned guard: patch on
  user-created is allowed iff AllowUpdateUserCreated is set. All other
  update-shaped actions (edit, write_file, remove_file) stay on
  requireAgentOwned — only patch crosses the user-created boundary
  (§5.2 / §5.5 invariant).
- Each of the six action entry points (Create, Edit, Patch, WriteFile,
  RemoveFile, Delete) opens with a uniform errActionDenied early-return
  when its allow flag is off, so the model sees a consistent
  "permission denied" shape on the veto path.

Tests
- TestSelfLearnValidate covers the four legal combinations (default, freeze,
  patch-only, feature-off), the three rejected combinations, and the maxKB
  range — both the upper bound and the negative case.
- TestSelfLearnAllowAccessors confirms the nil→true / explicit-pass-through
  contract every downstream consumer depends on.
- TestSelfLearnMemoryMaxKBOr exercises the default fallback.
- TestSkillActionPermissions runs each allow* flag through its action(s),
  asserting both the error message and the rollback guarantee (a denied
  write must not mutate disk).
- TestSkillAllowUpdateUserCreated asserts the advanced opt-in extends patch
  to user-created, but never Edit, Delete, or WriteFile — the design
  invariant that only patch crosses the boundary.

Existing helpers (newTestSkillManager, fork tests) are updated to pass
DefaultActionPermissions, preserving prior behavior. -race passes for both
selflearn and setting packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps up the Phase 1 implementation surface of the L1 design note
(notes/active/l1-background-review.md): configurable memory cap, fork
deadline, defensive snapshot copy, permission-aware prompt synthesis,
the setting→runtime bridge, and the memory-auto reminder provider.

Memory cap configurable (memory.go)
- Rename memoryFileCharLimit → DefaultMemoryFileCharLimit and store it
  on MemoryStore via NewMemoryStoreWithCap(cwd, maxFile). Add/Replace
  enforce per-instance maxFile instead of the package constant, so the
  app can lower the cap via memory.maxKB (§3.1) without forking the
  store. NewMemoryStore preserves the old default for callers that
  don't know about §4.2 yet.

Fork deadline (fork.go)
- ForkConfig grows a Deadline (default 5 min, DefaultForkDeadline).
  RunReview wraps the caller's ctx in context.WithTimeout, satisfying
  invariant #5: a hung provider call now returns instead of leaving
  inFlight=true and silently disabling all future reviews.

Snapshot copy (reviewer.go)
- Observe copies result.Messages into a fresh slice before handing it
  to the background goroutine. The main agent loop reuses its message
  slice; without the copy the goroutine would race on truncate-on-compact
  / append. -race passes for concurrent Observe (concurrency_test.go).

Prompt synthesis per permissions (prompts.go)
- skillSection constant replaced with skillSectionFor(mgr): action steps
  the SkillManager will veto at dispatch are stripped from the prompt.
  AllowCreate=false swaps "create is last resort" for "creation is
  disabled"; AllowDelete=false drops the retire-skill bullet entirely;
  AllowUpdateUserCreated=true widens the scope sentence to "patch any
  existing skill (including user-created)". Permission layer remains
  the hard floor — this is steering, not enforcement.

Setting → runtime bridge (config_setting.go, new)
- ResolveSettings(setting.SelfLearnSettings) → Resolved{Config, Perms,
  MemoryMaxChars}. Single conversion point with Validate() up front so
  wire-up code never branches on per-field defaults.

Auto-memory reminder provider (reminder + app/agent.go)
- New ProviderMemoryAuto constant. agent.go's reminder registration
  picks up the L1 store via system.LoadAutoMemory(cwd) and emits it
  under scope="auto" — distinct from memory-user / memory-project so
  agent-written and user-authored memory never mix. Wraps the same
  PostCompact / cwd-change refresh path the other memory providers use.

Tests
- concurrency_test.go: snapshot-copy isolation + concurrent Observe
  hammered by 8×20 goroutines under -race.
- prompts_test.go: 5 permission-scenario coverage of skillSectionFor.
- config_setting_test.go: happy path, validation error propagation,
  MaxKB default fallback and pass-through.

-race passes for selflearn, setting, reminder, and app packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Connects the implementation of the previous commits into the running
session so the feature actually fires when configured. Implements §9
step 4 of the design note.

services.SelfLearn (services.go)
- New field on the app's services struct that holds the live Reviewer
  when at least one arm is enabled. Nil ⇒ zero overhead: no goroutine,
  no counters, no extra model calls (§3.1 promise).

wireSelfLearn (internal/app/selflearn.go, new)
- Called from ensureAgentSession after Agent.Start so the ReviewFunc can
  capture the live System.
- Reads setting.SelfLearn via selflearn.ResolveSettings, which also runs
  Validate() so an illegal config is logged and skipped instead of
  crashing the session.
- Builds NewMemoryStoreWithCap + NewSkillManager with the resolved cap
  and permissions, threads both into a ForkConfig the ReviewFunc
  constructs at fire time.
- Rebuilds the LLM client from the captured BuildParams.Provider/ModelID
  /MaxTokens so the fork hits the same prefix-cache key as the parent
  (§6 invariant #2) without sharing the parent's request cycle.
- Seeds the memory arm's counter from the preloaded user-turn history
  via SeedTurns (invariant #8 resume).
- Review summary is logged for Phase 1; user-visible MessageEvent +
  status-bar surface is Phase 1.5 (§"User-visible surface").

OnTurnEnd hook (model_agent_events.go)
- Forwards every turn Result to forwardTurnToSelfLearn unconditionally.
  The Reviewer gates on StopEndTurn internally so cancelled / interrupted
  turns are silently skipped at one place rather than scattered up the
  call stack.

StopAgentSession cleanup (agent.go)
- clearSelfLearn nils out services.SelfLearn so a future ReviewFunc
  closure can't hold references to a torn-down agent. In-flight
  goroutines complete on their own via DefaultForkDeadline.

systemOnlyAgent shim
- Minimal core.Agent wrapper that exposes only System(). The ReviewFunc
  needs the parent's System for prefix-cache parity but nothing else
  from the live agent — keeps the dependency surface narrow and lets us
  avoid extending the core.Agent interface with an LLM() accessor that
  nothing else needs.

ReviewKind.String (reviewer.go)
- Stable log-friendly labels (none / memory / skill / memory+skill) for
  the wire-up's review log entry. Covered by TestReviewKindString.

-race passes across selflearn, setting, reminder, and app packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements two of the three §"User-visible surface" pieces from the design
note (the /config Self-Learning panel remains for a follow-up):

Status-bar "evolving …" indicator (conv/message.go + view.go)
- New SelfLearnEvolving flag on OperationModeParams and a paired
  renderSelfLearnIndicator producing a muted, italic "evolving …" segment
  in the bottom-left when a review fork is in flight. Plain text — no
  emoji, no spinner — so it reads as a static piece of structural UI
  rather than a decoration (matches the design's text-not-icon ask).
- view.go reads services.SelfLearnRunning at render time. Hidden in both
  the idle and disabled cases — the status bar pixel-matches a
  feature-off session whenever no review is actively running.

Completion / failure notices (selflearn.go + services.go)
- services.SelfLearnRunning (*atomic.Bool) is the bridge between the
  ReviewFunc goroutine and the render path; pointer-typed so the
  surrounding services struct stays freely copy-able via the View()
  path.
- ReviewFunc sets/clears the flag with defer, then on success publishes
  a hub.Event{Target: "main", Subject: header, Data: summary} that the
  app's main event loop routes through onMainEvent →
  injectNotification, surfacing the recap as a notice + content block
  in the conversation flow. Failures publish a parallel
  selflearn.review.failed event with the trimmed error message — silent
  swallowing would leave the user wondering what happened.
- "Nothing to save" stays silent (the caller already filters on
  empty summary).

Tests
- TestRenderModeStatusShowsSelfLearnIndicatorWhenEvolving locks in the
  visibility contract: hidden when the flag is false, visible — and only
  then — when true.

Deferred to a follow-up
- Braille / target-name spinner with tea.Tick animation and per-tool-call
  status updates (requires the fork to publish progress events back).
- /config multi-panel Self-Learning panel.

-race passes across selflearn, setting, reminder, and the app + conv
packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…vers

Promotes the bottom-left "evolving" indicator from a static text to the full
§"User-visible surface" specification:

  hidden → "evolving ⠋"           (review started)
         → "evolving ⠋ go-testing" (each successful tool call)
         → "evolved · N changes"   (2 s after success)
         → "evolving failed"       (3 s after failure)
         → hidden

State machine (selflearn_ui.go, new)
- SelfLearnUIState packs phase + target + frame + change-count behind a
  single mutex. BeginReview / Step / Complete / Fail are called from the
  reviewer fork's goroutine; Tick is called from the Bubble Tea Update
  goroutine. Snapshot returns a consistent triple for the render path.
- Step debounces target swaps at ≥ 400 ms so rapid tool calls don't flicker
  the bar; the change-counter is unaffected and counts every write.
- Done/failed phases auto-decay back to idle after 2 s / 3 s respectively;
  Tick reports active so the dispatcher knows when to stop re-arming.

Write observers (selflearn package)
- MemoryStore.SetWriteObserver and SkillManager.SetWriteObserver register
  goroutine-safe callbacks fired after each successful write. The fork
  package itself stays UI-agnostic; the app layer hooks the callbacks to
  Step the UI state with the affected name (skill kebab name) or
  "memory · <topic>" (basename minus .md, MEMORY.md collapses to bare
  "memory"). memoryTopicSuffix encodes that mapping.

Tick + hub plumbing (update.go, model_turn_queue.go, selflearn.go)
- selflearnTickMsg + scheduleSelflearnTick drive the spinner at ~100 ms.
- ReviewFunc publishes a "selflearn.review.started" hub event the instant
  it begins so the main loop schedules the first tick — the indicator is
  visible from frame zero rather than waiting for a render.
- onMainEvent intercepts the started event and routes it to the tick
  scheduler instead of letting it become a user-visible notice (the recap
  block at completion is still routed normally for "selflearn.review.done"
  / "selflearn.review.failed").
- handleSelflearnTick advances the state and re-arms while non-idle; the
  loop quiesces automatically when the state decays to idle.

services / view
- services.SelfLearnUI replaces the previous atomic.Bool. Always non-nil
  (constructed by newServices) so render can take Snapshot without a
  nil-check and idle just renders empty.
- conv.OperationModeParams.SelfLearnSegment carries the pre-rendered text
  so the conv layer doesn't have to import the app-side state struct.

Tests
- TestSelfLearnUIPhaseTransitions: idle→reviewing→done lifecycle, render
  format, and decay timing.
- TestSelfLearnUIFailDecay: failed-phase hold outlasts done-hold.
- TestSelfLearnUIStepDebouncesTarget: rapid Step calls keep the previous
  target but still bump the change counter.
- TestSelfLearnUITickFrameAdvances: spinner moves each tick.
- TestMemoryTopicSuffix: index / topic file → status tail mapping.

-race passes for selflearn, setting, reminder, and the app + conv + input
packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements the §"User-visible surface" /config slash command — the third
and last runtime UI surface in the design note (status-bar spinner and
completion notice landed earlier). A multi-panel sidebar (Provider,
Permissions, Appearance, …) is planned; for now Self-Learning is the
single panel and the scaffolding is shaped so siblings can be added
without breaking the slash command path.

setting.UpdateSelfLearnAt (loader.go)
- New helper that validates the config (§3.1) before writing and merges
  the change into either ~/.gen/settings.json (user-wide) or
  ./.gen/settings.json (project-local) via the existing atomic
  saveToFile path. Re-uses Validate() so disk never ends up holding an
  illegal combination, and clears the loadedSettings cache so the next
  Snapshot picks up the new values.

input/on_config.go (new) — the panel
- ConfigSelector struct holding a working snapshot of the L1 config that
  the user mutates in place; Save merges back to disk, Esc discards.
- rows() materializes the panel layout as a flat list of typed rows:
  header / spacer / bool / int / advanced-hint / save. The same function
  drives navigation bounds and Render so they can never disagree.
- Boolean toggles cycle the *bool fields through nil→true/false so users
  can land on any of the four §5.5 legal combinations. Int rows
  enter/edit/commit with backspace + digit handling and clamp the
  result to the row's [min, max] range — MaxKB clamps to
  SelfLearnMaxMemoryKB so the §4.2 invariant is enforced from the UI.
- Render lays the panel out with rounded borders, a cursor caret, and a
  ≈ N EN words / N 中文字 hint next to the MaxKB row. Validation errors
  surface inline beneath the rows; the Save row reads as muted/grey when
  the snapshot is illegal and green when it is legal, so the user has a
  pre-write affordance for the same check setting.UpdateSelfLearnAt
  applies.
- Tab toggles the save target between user-wide and project-local;
  surfaced in the title row so the action key is one keystroke.

Wiring
- input.Model grows a Config field; NewConfigSelector takes deps.Setting.
- /config registered in slash_command.go's dispatch table; the handler
  just calls Config.Enter(width, height).
- ConfigSelector added to popups() so the active panel intercepts keys
  before the textarea.
- Update handles ConfigSavedMsg by adding a "Self-learning config saved
  (<scope>)" notice to the conversation flow.

Tests
- TestConfigSelectorIsActivated: Enter/Esc lifecycle.
- TestConfigSelectorTogglesBool: space and enter both flip a bool row.
- TestConfigSelectorIntEditAndClamp: int edit flow + clamp to row max.
- TestConfigSelectorRenderShowsValidationError: illegal §3.1 combination
  surfaces the underlying error message in the rendered panel.
- TestConfigSelectorTabFlipsScope: Tab cycles user ↔ project.

-race passes for selflearn, setting, reminder, and the app + conv +
input packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, prompt rewrites

Closes four §-level discrepancies between the L1 implementation and the
design note:

§6 invariant #7 — silent on "Nothing to save"
- selflearn.go now suppresses the user-visible recap when the model's
  reply is empty or a case/whitespace/period variant of "Nothing to
  save". The previous code only checked summary == "" so any "Nothing
  to save." reply produced a noisy notice. Helper isNothingToSave +
  TestIsNothingToSave covers the variants.

§3.1 — GEN_DISABLE_SELF_LEARN env override
- New const selfLearnDisableEnv. wireSelfLearn returns early when the
  env var is "1" / "true" regardless of settings.json. Matches the
  Claude-Code-style CLAUDE_CODE_DISABLE_AUTO_MEMORY documented in §3.1.

§4.1 — eviction-first memory prompt
- memorySection rewritten with the 4-step decision flow the design
  diagram encodes: (1) retire stale entries first; (2) replace, don't
  duplicate; (3) at-cap forces another prune; (4) only then add. The
  "a pass that only adds is a missed pruning opportunity" framing is
  now explicit.

§5.1 — full trigger conditions in skill prompt
- skillSectionFor now enumerates the table's CREATE-ALL-FOUR criteria
  (non-trivial + no coverage + class-level name + not anti-pattern),
  the UPDATE three paths (including the previously-missing
  "user-voiced style/format/workflow correction"), and the DELETE
  three paths verbatim. Preference order UPDATE > DELETE > CREATE is
  stated once at the top instead of being implicit in step ordering.
- Tests updated for the new prose anchors.

reviewPreamble + reviewClosing also adjusted: the model no longer
writes a free-form summary line — the user-visible recap is built from
the actual tool calls (the action log via the existing observers), so
the closing instruction reduces to "reply with 'Nothing to save.' or
empty".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ocks, stale comments

Mostly the audit's "should-fix" bucket — nothing changes behavior, but the
package's surface and read-flow tighten up:

- systemOnlyAgent shim deleted. currentSystem() now returns (core.System,
  bool) directly; the previous shim was a core.Agent that lied about its
  capabilities for one caller.
- forwardTurnToSelfLearn / clearSelfLearn one-line wrappers inlined at
  their two call sites.
- MaxFileChars (zero callers) and SelfLearnUISnapshot.IsActive (no
  non-test callers; redundant with Tick's bool return) deleted.
- NewMemoryStoreWithCap merged into NewMemoryStore (one constructor, one
  default behavior). Two test call sites updated to pass 0 explicitly.
- Observer fields (MemoryStore.onWrite, SkillManager.onWrite) drop their
  RWMutex. The reviewer fork is single-flight per session (§6 invariant
  #8), so the field is written once at wire-up and read on every write
  by the same goroutine — no race. Type docs now state the
  "SetWriteObserver before first write" contract.
- m.services.Agent == nil dead-branch dropped from currentSystem (Agent
  is set at newServices and never nilled).
- Empty `if !replaceAll {}` block in skillpatch.go removed; comment moved
  to the outer loop where it actually applies.
- errActionDenied's stale §3.1 anchor corrected to §5.5 (action
  permissions live there in the merged design doc).
- Validate "0 < maxKB <= 25" error message corrected to "0 <= maxKB <=
  25 (0 = default)" — 0 is intentionally accepted as "unset".
- ConfigSelector.Render caches Validate() once instead of calling it
  twice per frame (Save row label + footer error line).
- Trivial restate-the-name comments on Complete/Fail/Dir trimmed; the
  comments that remain explain why, not what.

Snapshot-copy test rewritten to exercise the actual failure mode (caller
truncates the slice while the goroutine still holds the original) rather
than mutating the immutable element it used to.

-race passes across selflearn, setting, reminder, and app + conv + input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rses

A · MessageEvent-style recap block (§"User-visible surface")
- New ReviewAction type + SelfLearnUIState.RecordAction supersede the
  earlier Step(target). Each successful tool call now records (verb,
  kind, target) into the per-pass action log; the spinner-tail target
  still swaps with the existing 400 ms debounce.
- MemoryWriteObserver gains an action parameter ("add" | "replace" |
  "remove"); SkillWriteObserver already had one. The wire-up maps both
  to verb / kind / target rows via memoryVerb / skillVerb.
- formatRecapBlock turns the drained log into the delimiter-bounded
  block from the design example:
    ─────────────────────────────────────────────────
      · updated skill   go-testing
      · saved memory   memory · debugging
      · retired skill   outdated-thing
    ─────────────────────────────────────────────────
  Empty input ⇒ "" so the no-write pass keeps quiet (§6 invariant #7).
- publishSelfLearnSummary now takes []ReviewAction (not the model's
  one-line summary) and routes through the existing hub → notice path,
  so the recap surfaces as structural data rather than LLM self-report.
- DrainActions clears the log after the publish so back-to-back passes
  don't cross-contaminate.

D · de-duplicate the double frontmatter parse
- New parsed{path, origin, fm, body} plus parseSkill(name) — guards
  requireAgentOwned and requirePatchable now return the full parsed
  struct, so Edit and Patch no longer call ParseFrontmatterFile a
  second time for the same file. WriteFile / RemoveFile / Delete
  ignore fm/body (they only need the path).
- The mutex range itself is unchanged: ≤1 in-flight per session
  (§6 invariant #8) means the broad mu doesn't actually contend, and
  collapsing the locks would invite TOCTOU on origin checks.

Tests
- TestFormatRecapBlock locks the delimiter shape + per-row layout.
- TestVerbMapping covers every memory_write / skill_manage action's
  verb mapping so the recap stays in sync with the design example.
- TestSelfLearnUIPhaseTransitions / TestSelfLearnUIStepDebouncesTarget
  updated to the RecordAction API.

-race passes across selflearn, setting, reminder, app + conv + input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ero defaults

settings.json public schema change. The three skill permission flags switch
from *bool tri-state (nil = default true) to Deny-encoded bool (zero =
allow). Eliminates the AllowCreateOr/AllowUpdateOr/AllowDeleteOr accessor
sprawl and three local boolPtr / ptr helpers that existed only to construct
the previous shape.

settings.json schema (§3.1)
- skills.allowCreate / allowUpdate / allowDelete *bool fields → DenyCreate
  / DenyUpdate / DenyDelete bool fields. Zero ⇒ allowed (Go idiom "zero
  value should be sensible default"). Every omitted field reads as
  "allow"; setting true is the explicit lockdown.
- skills.allowUpdateUserCreated stays an Allow-encoded bool — it is an
  opt-in for an off-by-default capability, so the polarity stays
  positive.
- Three new read-side methods AllowCreate() / AllowUpdate() /
  AllowDelete() return the negated stored value. Downstream code reads
  through these so the polarity inversion is invisible past this layer.

Validate (§3.1)
- Same three rejected combinations, error messages updated to the new
  field names:
  · denyUpdate=true requires denyCreate=true (created-but-never-refined)
  · denyDelete=true requires denyCreate=true (only-add-never-subtract)
  · allowUpdateUserCreated=true requires denyUpdate=false

Callers
- ResolveSettings (selflearn package) reads via the new accessors.
- ConfigSelector's three skill rows toggle s.Skills.DenyX directly (no
  more *bool ceremony). The toggleAllowCreate/Update/Delete helpers
  and boolPtr in on_config.go are dropped; the row toggle is now a
  one-line lambda.
- Three test helpers (boolPtr / ptr in selflearn / setting / input
  test files) deleted alongside the *bool fields they were built for.

Design doc (notes/active/l1-background-review.md)
- §3.1 config block, fields table, validation table, and rationale
  paragraph all switched to the Deny-encoded schema.
- §5.1 trigger-conditions table and §5.0 trigger-layers row also
  updated so the permission-check column shows the new field names.
- §5.5 meaningful-combinations table now shows the effective Allow
  permissions in the columns (preserving the at-a-glance reading) but
  annotates each row with the corresponding denyX field setting so the
  config-side correspondence is unambiguous.

-race passes across selflearn, setting, reminder, app + conv + input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ead code

Four parallel /simplify agents found ~40 items. Applying the high-leverage
subset; skipping the rest with notes below.

Two correctness regressions caught by the audit (both introduced by this
branch, not pre-existing):

- setting/settings.go Data.Clone() never copied the new SelfLearn field,
  so Settings.Snapshot() silently returned the on-disk default instead
  of the user's saved config. /config would always have shown stale
  defaults. Restored the row + extended clone_test.go to lock it down
  alongside the rest of the scalar-drift guard.
- app/selflearn.go publishSelfLearnSummary/Failure put the recap block
  in hub.Event.Data, which routes through injectNotification's
  Content branch into SubmitToAgent — every successful review would
  have re-prompted the LLM with its own audit trail, breaking §6's
  "out-of-band" promise. Recap now goes in Subject (Notice-only path).
  Documented the trap so the next caller doesn't relearn it.

Simplifications:

- isNothingToSave removed. The action log is authoritative (the design
  doc says so explicitly); empty log ⇒ no writes ⇒ silent. Substring
  matching the model's text was redundant + undermined the invariant.
- currentSystem(*model) inlined at its one caller. The systemOnlyAgent
  shim it once defended against has been deleted upstream; the helper
  no longer earned its existence.
- ForkConfig.MaxTurns / Deadline override fields removed (always
  zero-defaulted in practice; no caller sets them). Exported
  DefaultMaxTurns / DefaultForkDeadline collapsed to package-private
  forkMaxTurns / forkDeadline constants used inline.
- selflearn.readOrigin (dead outside tests) removed. The two test call
  sites use the production parseSkill helper now.
- configRow.tail (write-only field) removed.
- SelfLearnUIState.changes (kept in lockstep with len(actions)) merged
  into a single doneCount snapshot that's only needed for the done-hold
  render after DrainActions clears the live log.
- Stale "Step" reference in the package doc comment fixed (the method
  was renamed to RecordAction earlier).

Reuse:

- brailleSpinnerFrames promoted to internal/app/kit.BrailleSpinnerFrames.
  on_provider_view.go's providerSpinnerFrames is now a one-line alias
  so the two consumers share the table and any non-Unicode fallback
  lands in one place.

Skipped (each noted, not argued):

- Inventory() disk re-read on every prompt build — would need a
  per-write cache invalidation hook that crosses observers; out of
  scope for /simplify, queued for a perf pass.
- Tick re-arming during done/failed hold — same; the cleanest fix
  schedules a single deadline tick instead of polling, which is a
  small but real refactor of the state machine.
- Snapshot fast-path on idle (lock-free check) — would require an
  atomic phase field that mirrors the mutex-guarded one; the cost of
  divergence outweighs the per-frame lock saving.
- ResolveSettings / ActionPermissions / SkillManager observer
  constructor — the three "drop the indirection" findings each touch
  multiple packages; they read cleanly but would be a separate
  refactor commit, not a quality cleanup.
- hub.Event "selflearn.review.started" special case in onMainEvent —
  the proper fix is a dedicated hub target with its own subscriber;
  out of /simplify scope.
- yaml.Marshal vs hand-rolled buildSkillMD — a careful refactor;
  hand-rolled quoter works for current inputs.
- Inventory via skill.Registry — already documented as deliberate
  (registry is stale-cached against same-session writes).

-race passes across selflearn, setting, reminder, and the app + conv +
input + kit packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#1 mergeSettings dropped SelfLearn (setting/merger.go)
- The merger enumerated every Data field except the new SelfLearn block,
  so every Load + every SaveToFile silently wiped selfLearn from disk.
  The L1 feature was effectively unreachable from settings.json. Added
  mergeSelfLearn with field-level semantics (int coalesce, bool OR for
  enable/deny — safety-biased), plus a regression test that exercises
  base→overlay → base+overlay merges and the symmetric base-only case.

#2 /config Space on rowHeader panicked (app/input/on_config.go)
- Cursor started at index 0 (the Memory header) whose toggle field is
  nil, so the first Space keystroke after /config crashed the TUI.
  Up/Down didn't skip non-editable rows either. Fixed by (a) starting
  the cursor at the first editable row in Enter, (b) replacing the
  cursor++ / cursor-- math with nextEditableRow / prevEditableRow that
  walks past headers/spacers/advanced-hints, and (c) guarding Space and
  Enter with an explicit kind/toggle check so a stale cursor index can
  never crash via a nil call. Test: navigation across the panel never
  lands on a non-editable row.

#3 DrainActions ran BEFORE the deferred Complete (app/selflearn.go +
   app/selflearn_ui.go)
- The defer pattern meant the success path drained s.actions to a local
  var, returned, and only then ran Complete in the defer — which
  snapshotted doneCount = len(s.actions) = 0. The 2-second done-hold
  status bar always rendered "evolved" without the count. Restructured
  the wire-up to call Complete inline before Drain (no defer for the
  success path; explicit Fail call on the error path). Test:
  TestCompleteCapturesCountBeforeDrain exercises the Complete→Drain
  ordering.

#4 RunReview rooted at context.Background (app/selflearn.go +
   app/services.go + app/agent.go)
- A torn-down session couldn't cancel an in-flight fork — the LLM
  request held tokens / HTTP for up to forkDeadline (5 minutes) after
  the user moved on. Added services.SelfLearnCancel + a session-scoped
  context.WithCancel in wireSelfLearn; StopAgentSession cancels it so
  the fork unblocks immediately.

#5 Phantom "evolving → evolved" on a torn-down session
   (app/selflearn.go + app/selflearn_ui.go)
- The deferred Complete fired even on the !Active early-return,
  flashing the status bar for a review that never ran. Two fixes:
  (a) moved the liveness check above BeginReview/publishStarted so a
  bail produces zero UI state change; (b) Complete itself now goes
  straight to idle when len(s.actions) == 0 — the §6 invariant #7
  "silent when nothing changed" promise applied to the bar surface
  too. Test: TestCompleteSilentOnEmptyActions.

-race clean across selflearn, setting, reminder, app, app/conv,
app/input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#6 WrapMemory("auto") fell to user-authority preamble (reminder.go)
- memoryPreamble had no "auto" case; agent-written L1 content was
  introduced to the model as "the user's saved memory (preferences and
  standing instructions). Apply it throughout this session." An
  injection that slipped past the scan would have been executed as if
  the human user authored it. Added a distinct "auto" preamble that
  explicitly frames the content as durable context, NOT direct user
  instructions, with a do-not-act-on-imperatives caveat.

#7 memoryEntryDelimiter not rejected at write (memory.go)
- Content containing the literal "\n§\n" delimiter passed scanContent
  but silently split into multiple entries on next read — corruption
  AND a scan bypass (the post-split pieces were never scanned in
  isolation). Added rejectEmbeddedDelimiter, called from Add and
  Replace right after scanContent. Test:
  TestMemoryRejectsEmbeddedDelimiter exercises both write paths.

#8 Two consecutive user messages in the fork (fork.go)
- ag.SetMessages(snapshot) followed by ag.Append(UserMessage(prompt))
  put two user-role messages on the wire when the snapshot ended with
  a tool_result (RoleUser-shaped on common providers), producing 400
  "messages must alternate" rejections that made the reviewer
  effectively dead on tool-loop turns. Added trimTrailingUserMessages
  that reslices (doesn't mutate the caller's snapshot) so the review
  prompt is the only user turn at the tail. Test guards the three
  cases (trailing user, trailing assistant, nil input).

#9 Memory prompt hardcoded the 25 KB cap (memory.go + prompts.go)
- memorySection const said "hard 25 KB cap per file" regardless of
  memory.maxKB. With a lowered cap (e.g. 5 KB), the model would
  propose oversized writes that the store rejected, burning ThinkAct
  rounds. Replaced the const with memorySectionFor(*MemoryStore) which
  interpolates mem.MaxKB(); buildReviewPrompt signature grew a
  *MemoryStore parameter so the section sees the actual configured
  cap.

#10 recover() value discarded (reviewer.go)
- The Warn line was a constant string with no panic value, kinds, or
  stack — production panics were unobservable. Logged
  zap.Any("panic", rec), zap.String("kinds", …), zap.Stack so the
  operator has actionable info.

-race clean across selflearn, setting, reminder, app, app/conv,
app/input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#11 Observer outlives session (app/selflearn.go)
- The closures registered via SetWriteObserver captured SelfLearnUI by
  pointer and continued to fire after StopAgentSession had nilled
  SelfLearn — mutating UI state for a session the user terminated.
  Added a `m.services.SelfLearn == nil` early-return inside both
  closures so a write that lands post-stop is silently dropped.

#12 Tick chain stacking on back-to-back reviews
   (app/selflearn_ui.go + app/model_turn_queue.go)
- Every selflearn.review.started event unconditionally scheduled a
  fresh tick chain; if review B started before review A's done-hold
  finished, two chains advanced the spinner together at 2× cadence.
  Added a tickArmed bool guarded by the existing mutex plus an
  ArmTick() reservation method; onMainEvent only schedules a new tick
  when ArmTick returns true. tickArmed is cleared in Tick when the
  state decays to idle, so subsequent reviews rearm cleanly.

#13 itersSinceSkill unbounded during in-flight (reviewer.go)
- A 4-minute review during which 50 turns each contributed 10 tool
  iters left the counter at 500, then fired on every turn for many
  turns after the release. Capped the accumulator at 2× the threshold
  so a stuck review produces at most two immediate refires once it
  releases — preserves the "fire again next clean turn" intent
  without the post-release burst. Test:
  TestSkillIterCounterCappedDuringInFlight.

#14 Snapshot took mutex on every render frame (app/selflearn_ui.go)
- TUI repaints on every key press, resize, and inbound message;
  Snapshot's unconditional Lock/Unlock was pure overhead in the
  dominant idle case. Added phaseAtomic (sync/atomic.Int32) mirroring
  phase; writers update both together under the mutex; Snapshot
  fast-paths to the empty value when phaseAtomic reads idle without
  touching the mutex.

#15 Tick re-armed at 100 ms during done/failed hold
   (app/selflearn_ui.go + app/selflearn.go)
- ~20 useless tick round-trips during the 2 s done-hold and ~30
  during the 3 s failed-hold, each forcing a full tea.Cmd → scheduler
  → Msg → Update dispatch. Tick now returns the delay for the next
  scheduled tick: selflearnTickInterval while reviewing,
  REMAINING hold time while done/failed (one deadline tick instead
  of polling). handleSelflearnTick uses the returned delay
  to schedule via tea.Tick directly. Test call sites updated for the
  new (delay, active) signature.

-race clean across selflearn, setting, reminder, app, app/conv,
app/input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
format-check was failing in CI on three files left over from manual
edits across the recent batches:

- internal/app/input/on_config.go
- internal/app/services.go
- internal/app/view.go

Pure whitespace / alignment changes — no behavior delta.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two CI failures on this branch, both fixed:

#1 internal/core/system → internal/session (layer violation)
- AutoMemoryDir called session.EncodePath. core ranks 3 and session
  ranks 2 (feature), so this was an upward import in the layered
  architecture (caught by tools/layercheck). Inlined a 5-line
  encodeProjectPath helper in core/system that mirrors
  session.EncodePath — both functions are tiny string-ops and must
  stay in lockstep so memory and transcript stores resolve to the
  same project partition; comment makes the dependency explicit.

#2 internal/selflearn unmapped in layercheck
- Added "internal/selflearn": "feature" to tools/layercheck/main.go's
  layerOf map. selflearn imports core (system, agent) and infra
  (markdown) — both downward, consistent with feature-layer peers.

Pre-push tooling
- New `make ci` target runs every step the .github workflow runs in
  the same order (format-check → build-all → lint → unit tests →
  integration tests). Catches format/vet/layercheck/test failures
  locally instead of round-tripping through Actions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…om /code-review

Findings from the max-effort /code-review pass on this branch:

- #1 config disable can't persist: UpdateSelfLearnAt now replaces the
  selfLearn block instead of OR-merging it via mergeSelfLearn, so a
  true->false toggle from /config actually sticks (other settings in the
  file are preserved). Cross-level Load layering still ORs by design.
- #2/#3 lifecycle race + leak: write observers gate on a captured
  atomic.Bool instead of racing on m.services.SelfLearn; new
  teardownSelfLearn() cancels the prior fork context before a re-wire
  (the agent-toggle rebuild path bypassed StopAgentSession) and on stop.
- #4 skill name traversal: resolve() validates name with skillNameRe so
  patch/edit/delete/write_file/remove_file can't escape the skills dir.
- #5 yamlScalar under-quoting: emit the description via yaml.Marshal so a
  value like "[draft] ..." can't produce invalid frontmatter that bricks
  every later parseSkill.
- #7 applyPatch overlap: lineWindowReplace collects non-overlapping fuzzy
  windows (advance by w on a hit), mirroring the exact tier.
- #8 stale settings: reload the in-memory Settings handle on ConfigSavedMsg.
- #10 memory delimiter: reject a bare "§" line that fuses with the entry
  delimiter and shifts boundaries on read.
- #9 relax Validate: "create + update allowed, delete denied" is now a
  legal config (delete is already agent-created-only, so opting out
  removes no safety). Rule, test, and design note updated.

Adds regression tests for the config round-trip, traversal, description
round-trip, and fuzzy-overlap cases (previously untested).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rename `SelfLearnSegment` field → `SelfLearnStatus`, `SelfLearnUIState`
→ `SelfLearnIndicator` (file `selflearn_ui.go` → `selflearn_indicator.go`),
`ArmTick` → `TryStartTicker`, `tickArmed` → `tickerRunning`, and
`internal/selflearn/config_setting.go` → `resolve.go`. Trim verbose doc
blocks across the L1 surface (selflearn.go, indicator, reviewer, fork,
memory, skill, resolve) — keep WHY comments where they're load-bearing.

No behaviour change.
- Rename `selflearn.Resolved` → `Runtime` (clearer: it's the runtime bundle ResolveSettings returns)
- Extract `SelfLearnIndicator.decayPhase` to dedupe done/failed branches in Tick
- Extract `ConfigSelector.renderCursor` helper (3 inline call sites)
- Use `s.Memory.MaxKBOr()` in /config panel instead of local `defaultIfZero`
- `slices.Delete` for memory entry removal
- `countUserTurns([]core.ChatMessage)` reads `m.conv.Messages` directly, avoiding a `ConvertToProvider()` allocation just for counting
- Flatten `selflearn.Config`: drop the awkward `Resolved`/`Runtime`/`Triggers` wrappers; the resolved bundle is just `Config` with `Memory`, `Skills`, `Perms`, `MemoryMaxChars` at the top level.
- Rename `internal/selflearn/resolve.go` → `config.go` to match.
- Group the four `services.SelfLearn*` fields under a `SelfLearnServices` sub-struct (`services.SelfLearn.Reviewer / .Cancel / .Live / .Indicator`) — cleaner field cluster and aligned with how the data moves together.
- Extract a Panel interface so /config can host multiple sub-panels in
  the future. ConfigSelector becomes the popup shell: breadcrumb,
  optional tab strip when >1 panel is registered, esc/ctrl-tab handling.
  Today only Self-Learning is registered; adding a sibling panel
  (Provider, Permissions, …) is a one-liner in NewConfigSelector.
- Move all Self-Learning UI into a private selfLearnPanel that implements
  Panel: rows, scope, snapshot, validate, save.
- Adopt the /plugin overlay layout via kit.Panel — centered floating box,
  Padding(1, 2), full-width separator lines, kit.PanelTab tab strip,
  kit.HintLine footer. Right-aligned value column on int rows.
- Add "/config › Self-Learning" breadcrumb so the popup advertises which
  command opened it.
- Replace on_config.go / on_config_test.go with config_panel.go +
  config_selflearn.go + config_selflearn_test.go.
Layout:
- Cap popup width to ~84 chars so on a wide terminal the form values
  stay near their labels instead of drifting to the far right.
- Move the cursor inside the indent so ▸ sits right next to the row it
  marks, not orphaned in the leftmost column.
- Flatten the section hierarchy: promote "Allowed actions" and "Advanced"
  from sub-headers under Skills to top-level sections. Result is four
  clean sections (Memory / Skills / Allowed actions / Advanced).
- "Review cadence (user turns)" → "Run every" with the unit ("user turns",
  "tool iterations", "KB") rendered as a muted suffix next to the value.
  Reads naturally: "Run every  10 user turns".
- The "Max size" footnote (≈ words / 中文字 equivalence) moves under the
  label column instead of right-indented.

Slash:
- Register /config in the builtin command list so it shows in the
  command picker — handler was already wired, but the metadata entry
  was missing so autocomplete never offered it.
- Hoist explicit layout columns: every row's label starts at the same
  column (labelCol=8) regardless of row type; numeric values right-end
  at valueCol=40 with units flowing past in muted style. Bool rows
  render "[ ] " between cursor and label; int rows pad an equivalent
  gutter so the label lands on the same column. Result: labels align
  vertically across the form, and "10" / "25" line up in a clean value
  column instead of trailing off to the box edge.
- Slightly wider cap (84→92) and taller (32→34) so the form has room
  to breathe at "expanded" sizes.
- Save button is column-aligned with the row content above it.
Drop the right-alignment that pushed values to col 40 with a wide gap
between label and value. Render int rows as a compact phrase instead:
"Run every  10 user turns". The value sits right next to the label
(highlighted), the unit follows in muted style — no floating value in
the middle of empty space.
- Int rows render value-first as a compact phrase with the underline
  highlighting the number, aligned in the same column as "[":
    10 user turns  -  Run every
    25 KB  -  Max size  ~  4500 EN words / 8500 中文字 (UTF-8)
- "Max size" footnote moves inline after the label with "~" separator
  instead of a separate line below.
- Merge the rowAdvHint row into the bool row via a new advHint field.
  "Update user-authored skills  ⚠ rewrites your authored skill files"
  renders on one line — at narrow widths the hint wraps naturally.
- Save button aligns at the same indent as the "[ ]" rows.
yanmxa added 20 commits June 2, 2026 23:22
Strip the recap back to a clean grouped list — no card frame, no top
"Self-improvement" header, no horizontal rules. The kind sub-header
(memory / skill) carries the only color in the block so the eye can
tell groups apart at a glance:

  memory  (blue)
    · index — noted that lint runs via make ci, not go vet
    · debugging — added 3 race-condition repro tips
  skill  (purple)
    · go-testing — trimmed verbose examples
    · python-typing — new · typing-hints and Protocol patterns
    · outdated-thing

- memory header → AdaptiveColor{Dark: #6BA4D6, Light: #1F77B4}
- skill  header → AdaptiveColor{Dark: #B68EE0, Light: #7B2D8E}
- rows stay italic + TextDim so the bulk of the block still recedes
- created actions get an inline "new ·" marker in accent green + bold
  before the note ("new · typing-hints"), echoing the user's design
- rows without a note degrade to "  · <target>" (no trailing dash)
- memory index file still renders as "index" so the column lines up

Indent provides the hierarchy; color and "new ·" are the only signals
that compete for attention.
…chat

Two follow-ups from the live demo:

1. The "new ·" inline marker on created skills was duplicating what
   the LLM's note already said ("python-typing — new · new skill,
   typing-hints …"). Dropped the marker entirely — the note describes
   the change, including its newness when relevant.

2. The recap shared visual weight with chat text, making it hard to
   tell apart at a glance. Every recap line now leads with a dim
   "│ " rail so the whole block reads as a quoted self-learn surface,
   not part of the conversation:

       │ memory
       │   · index — noted that lint runs via make ci, not go vet
       │   · debugging — added 3 race-condition repro tips
       │ skill
       │   · go-testing — trimmed verbose examples
       │   · python-typing — new skill, typing-hints and Protocol patterns

   Kind sub-headers keep their color (memory blue, skill purple); the
   rail itself is TextDim so the column reads as soft chrome.
Hand the visual styling over to the chat's markdown renderer. The
recap publishes plain markdown — no ANSI escapes, no inline lipgloss
styles — so the conversation pipeline owns how bold/code-span/bullet
list renders. Output:

  **memory**
  - `index` — noted that lint runs via make ci, not go vet
  - `debugging` — added 3 race-condition repro tips

  **skill**
  - `go-testing` — trimmed verbose examples
  - `python-typing` — new skill, typing-hints and Protocol patterns
  - `outdated-thing`

- Kind sub-headers use **bold** markdown
- Targets render as `code-span` so they pop from the prose note
- Em-dash separates target and LLM note; rows without a note drop
  the em-dash entirely
- Memory index file still renders as `index` so the column lines up
- Blank line between groups so the markdown parser breaks the list

Removes the lipgloss imports and all the recap-specific styles
(kind/memory/skill/row/rail/new) since the renderer handles them now.
The block is also free to inherit any future markdown theming the
conversation gets without touching this code.
Per user feedback ("前面那一道线有点丑"), the per-row "│ " left rail
felt heavy. Drop it in favour of one thin horizontal rule above the
block — the markdown "---" idiom drawn inline since Notices render
raw text:

  ──────────────────────────────────────────────────
  memory
    · index — noted that lint runs via make ci, not go vet
    · debugging — added 3 race-condition repro tips
  skill
    · go-testing — trimmed verbose examples
    · python-typing — new skill, typing-hints and Protocol patterns

The rule (50 chars of "─" in dim+faint) separates the recap from
chat above; the block ends naturally with the last row. Kind sub-
headers still carry the only color (memory blue, skill purple).
Per user feedback (the standalone horizontal rule didn't look great
either), wrap the whole recap in a thin rounded box so it reads as a
self-contained card:

  ╭──────────────────────────────────────────────╮
  │  memory                                      │
  │    · index — noted that lint runs via make ci│
  │    · debugging — added 3 race-condition tips │
  │  skill                                       │
  │    · go-testing — trimmed verbose examples   │
  │    · python-typing — new skill, typing-hints │
  ╰──────────────────────────────────────────────╯

Frame is lipgloss RoundedBorder in TextDim — soft chrome. Padding(0,2)
gives a 2-col gutter inside the border so the content breathes. Kind
sub-headers still color-code (memory blue, skill purple); rows stay
italic + TextDim.
Per user feedback: the recap should surface a session ID the user can
"gen --resume <id>" to replay the L1 review's LLM calls. The previous
sidechain approach (sharing the main session's ID) would resume the
main chat instead.

Each L1 fork now creates its OWN session in the transcript store:

    <main-session>.selflearn-review.<unix-seconds>

NewSidechainRecorder generates that ID and binds the Recorder to a
fresh session with it. The fork's session.started, messages, and
inferences all land under the new ID — so "gen --resume <fork-id>"
loads only the L1 review.

Recap footer now reads:

    gen --resume <fork-session-id>

shown in dim+faint inside the recap box, ready to be copy/pasted.
Multiple concurrent forks coexist via the unix-seconds suffix.

Recorder gets a SessionID() accessor so the caller can surface it
without poking the private field.

Demo command fabricates a plausibly-shaped ID so the recap footer
looks identical to the real path.
Swap the solid rounded border for a quadruple-dash light style
("┈" horizontal, "┊" vertical) with rounded corners kept, and dim
the border color from TextDim to a noticeably fainter shade
(AdaptiveColor #4A4A52 dark / #C8C8CC light) so the frame whispers
instead of competing for attention.
Quadruple-dash chars (┈/┊) pack four tiny dashes per cell; the gap
between dashes doesn't align with the cell boundary, so consecutive
cells produce an uneven "long-short-long-short" pattern. Triple-dash
(┄/┆) lays out one dash-gap-dash-gap-dash per cell — the gaps land
on cell midpoints and the pattern reads uniform when tiled.
Two structural changes per user direction:

1. spinner moves from status bar to chat flow.

   renderChatSection now emits the L1 indicator inline (above the
   prompt, below the active content) while the review is running or
   failed. The status bar entry (SelfLearnStatus) is removed —
   selfLearnStatus() and the param field are deleted. During the
   done phase the inline row is suppressed; the recap box that
   publishes into the conversation flow already carries the summary.

   The inline row uses selflearnLiveStyle (italic + TextDim) so it
   sits softly in the chat thread without pulling focus.

2. "gen --resume <id>" rides on the bottom border.

   formatRecapBlock no longer leans on lipgloss Border — the box is
   built by hand so the footer can ride on the bottom edge itself:

       ╭┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╮
       ┊  memory                                       ┊
       ┊    · index — noted that lint runs via make ci ┊
       ┊  skill                                        ┊
       ┊    · go-testing — trimmed verbose examples    ┊
       ╰┄ gen --resume demo.selflearn-review.123 ┄┄┄┄┄╯

   The footer is italic + faint so it reads as a label baked into the
   chrome, not a content row. innerWidth widens automatically if the
   footer is longer than every content row, so the text never wraps.
   Falls back to a plain dashed bottom border when sessionID == "".
Two follow-ups from the screenshot:

1. Bottom border overshoots the box.

   The width math for "╰┄ <footer> ┄…╯" was off by 2 — the lead/trail
   accounting double-counted the dashes. Spelt out the constraint:

       top span    = innerWidth
       bottom span = lead(1) + " "(1) + footerLen + " "(1) + trail(N)
                   = footerLen + N + 3
       align       → N = innerWidth - footerLen - 3

   And the auto-widen check now compares footerLen directly to
   contentWidth (the constraint is contentWidth >= footerLen, derived
   from innerWidth >= footerLen + 4). End result: every box now has
   four perfectly aligned corners.

2. Live indicator needs breathing room.

   renderChatSection now wraps the inline row with blank lines on
   both sides ("", live, "") so it reads as its own block — clearly
   apart from the active content above and the prompt below.
The first row (memory) sat against the top border and the last row
sat against the bottom border with the footer. Add a one-line empty
row (vertical sides + spaces) inside the box on both edges so the
content breathes and the "gen --resume …" footer reads as part of
the frame, not crowded by the row above it.
Two compactness wins per user feedback:

- Drop the top padding row. The first content row ("memory") now sits
  directly under the top border; the horizontal gutter handles the
  breathing room. The bottom padding row stays so the "gen --resume"
  footer keeps its visual separation from the last content row.

- Shrink row indent from 4 spaces to 2 (" · " instead of "    · ").
  The bullet lands right under the kind sub-header instead of being
  dragged two extra columns right, so the whole list pulls in.
Italic on a shell command makes it look ornamental rather than runnable.
Keep TextDim + Faint for the muted-chrome feel but render it upright so
the user reads it as 'a command I can paste', not 'a styled label'.
User confirmed it wasn't needed. The card is now as tight as it can be
while still framed — no internal padding rows on either edge, only the
horizontal gutter on each side. The last content row sits directly
above the bottom border carrying "gen --resume <id>".
1. Blank row between kind groups. memory + its rows and skill + its
   rows now sit in separate visual blocks with one empty padded row
   between them — the eye reads them as two sections instead of one
   long list.

2. Footer affordance. The "gen --resume <id>" footer gains a "↪ "
   lead so it reads as a next-action rather than a passive label.
   The bottom-border auto-widens slightly to keep the corners aligned.

3. Desaturated kind colors. memory (blue) and skill (purple) pulled
   ~15-20% toward gray:

       memory dark  #6BA4D6 → #82A0BA
       memory light #1F77B4 → #487192
       skill  dark  #B68EE0 → #A89AC4
       skill  light #7B2D8E → #745783

   Still recognisable as blue / purple, but they blend with the
   italic+TextDim aesthetic instead of fighting it.
Terminal can't give us fractional rows, so user feedback of "feels
cramped, but don't add a whole row" gets addressed horizontally: an
extra column of padding on each side lowers the perceived density
without touching the row count. Last skill row keeps the same
distance from the bottom border, but every line gets one column of
extra breathing on each end.
User/linter inlined memory.go's fireWrite but left skill.go's call
sites pointing at the free function — added the free-function
fireWrite + str helpers local to skill.go and dropped the unused
tool import. Build green; gutter widening change ships in this
commit too (was rolled into the original push but blocked by the
helper drift).
…elow

The hand-built box was a hack: manual character placement, manual
width math, off-by-one bugs already paid for. Drop it.

- Box rendering goes back to lipgloss.Border with custom dashed glyphs
  and Padding(0, 2). lipgloss owns the width math; the corners can't
  drift apart.
- "↪ gen --resume <id>" lives on its own line BELOW the box, indented
  to align with the first content column. No more cramming text onto
  the bottom border, no more auto-widening to fit the footer.
- Blank line between kind groups is now just "\n\n" between the inner
  blocks — lipgloss propagates it.

Same visual idea, half the code, zero geometry math.
- selflearn/reviewer.go: ReviewKind.String iterates bits instead of
  enumerating the 2-bit power set; inline Reviewer.run into the
  goroutine in Observe (its only caller).
- app/selflearn_indicator.go: actionLabel collapses 3 identical
  switch arms into one if/else.
- app/selflearn.go: memoryTopicName consults system.AutoMemoryIndexName
  instead of hardcoding "MEMORY"/"memory" literals.
- app/input/config_panel.go: drop hand-rolled visibleWidth ANSI
  stripper in favor of lipgloss.Width.
- skill/types.go: neutralize Skill.Origin doc — keep provenance
  semantics, drop L1/L2-specific narrative from the shared type.
- skill.go: use tool.GetString/GetBool instead of local str/boolOf helpers
- conv/message.go: drop the dead SelfLearnStatus status-bar path (field,
  branch, renderSelfLearnIndicator, style, test) — superseded by the inline
  renderSelfLearnLive indicator
- memory.go: drop redundant empty-content check in Add (scanContent covers
  it); range with strings.SplitSeq in readEntries
- app: extract selflearn.review.{started,done,failed} event types into shared
  consts (publisher/consumer agreement)
- kit/spinner.go: correct the BrailleSpinnerFrames doc (selflearn uses ASCII)
@yanmxa yanmxa force-pushed the feat/selflearn-l1 branch from f0e4d3e to 41d41c3 Compare June 2, 2026 15:33
yanmxa added 5 commits June 3, 2026 07:49
- fork: trimTrailing* also strips RoleTool so a trailing tool_result no
  longer puts two consecutive user-role messages on the wire (renamed to
  trimTrailingPendingMessages, test extended)
- session: NewSidechainRecorder sets Sidechain: true so L1 fork
  messages stay out of the inspector's main-thread view
- skill: joinFrontmatter returns body alone when input had no
  frontmatter (no more silent --- prepending on user-authored files);
  Create rejects empty descriptions (preserve UPDATE/CREATE
  disambiguation in Inventory); NewSkillManager leaves userDir empty on
  UserHomeDir failure and dirFor/resolve/Inventory guard against it
  instead of aliasing user-scope writes onto cwd/.gen/skills
- app/selflearn: wireSelfLearn takes pendingSend so countUserTurns
  excludes the in-flight submit and the cadence seed no longer
  double-counts; failure path drains the indicator and publishes a
  partial recap so writes before a timeout still surface
- app/update + input: ConfigSavedMsg re-wires the L1 reviewer when the
  agent is active (toggle was a silent no-op until next session start)
  and surfaces a notice when the just-saved Enabled flag is overridden
  by the other settings level (merger ORs across user+project)
- app/selflearn: add live.Load() guards before each phase mutation
  (BeginReview, the post-LLM Fail/Complete entry, the publish funcs)
  so a teardown during the 5-min fork window can't flash 'evolving' or
  a stale recap on a session the user just cleared
- scan: tighten role_hijack regex to require an actual role keyword
  (admin/root/developer/jailbroken/in <X> mode/...) — the old
  unanchored `you are now ` matched benign English like
  "you are now in the repo root" and rejected legitimate memory
  entries on every retry
- prompts: append a reviewToolScope clarifier so the fork's reviewer
  LLM knows only memory_write / skill_manage are available, despite
  the inherited (verbatim-for-cache-parity) system prompt advertising
  Bash/Read/Edit/etc.
- scan + skill.Patch: introduce scanNewThreats(original, candidate)
  so a Patch only fails when it INTRODUCES a new threat pattern. A
  skill whose body legitimately quotes a threat-pattern substring
  (e.g. an injection-defense example) stays patchable; the previous
  scanForThreats(patched) bricked any such skill permanently
- fixes_test: regression coverage for the regex tightening and the
  scanNewThreats semantics (clean patch, threat-removing patch, new
  attack family blocked, invisible-rune smuggling blocked)
Signed-off-by: Meng Yan <yanmxa@gmail.com>
The default cadence (10) was duplicated three ways: a magic literal in
the /config panel, parallel Default* constants in the reviewer, and two
near-identical helpers (defaultIfZero / positiveOr).

Consolidate onto the existing setting-package resolver convention
(SelfLearnMaxMemoryKB + ResolvedMaxKB): add SelfLearnDefaultEveryTurns /
SelfLearnDefaultEveryToolIters constants and ResolvedEveryTurns /
ResolvedEveryToolIters methods, apply them in ResolveSettings (mirroring
ResolvedMaxKB), and drop the reviewer's Default* constants + positiveOr
and the config panel's defaultIfZero. Also rename the MaxKBOr accessor to
ResolvedMaxKB for a consistent, non-dangling name.
Signed-off-by: Meng Yan <yanmxa@gmail.com>
@yanmxa yanmxa merged commit 5767b0e into main Jun 3, 2026
1 check passed
@yanmxa yanmxa deleted the feat/selflearn-l1 branch June 3, 2026 01:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

L1: Background Review (per-turn self-learning)

1 participant