feat(selflearn): L1 per-turn background reviewer implementation (#52)#75
Merged
Conversation
First slice of the self-learning loop (#52). Package internal/selflearn (not a cryptic "l1") holds the per-turn Reviewer: - Two-signal trigger: memory on user-turn cadence, skills on tool-iteration threshold; combined when both trip the same turn. - Only StopEndTurn turns count (cancelled/interrupted/max-turns skipped). - ≤1 review in flight; a trigger that arrives mid-review is dropped and NOT reset, so it fires again on the next clean turn. - The fork+review is an injected ReviewFunc so the trigger is unit-testable without an LLM; SeedTurns hydrates cadence on resume. - selfLearn settings (memory/skills enabled + intervals), off by default. Tests cover memory cadence, skill threshold, combined, interrupted-turn skip, concurrency cap (drop + retry), and seed. Build + format-check + tests green. Next slices: the fork (restricted core.Agent), memory_write + skill_manage tools, the memory store + injection source, skill origin field, review prompts, and Task.Start wiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the L1 reviewer's durable-memory write surface: a project-partitioned MemoryStore under ~/.gen/projects/<encoded-cwd>/memory/ with add/replace/ remove-by-substring entry semantics (mirroring the hermes memory tool), atomic temp+rename writes, exact-duplicate dedup, a per-file char budget, and a coarse injection/exfiltration scan since entries are injected into a future system prompt. The memory_write tool exposes these to the fork only. Add the read side as a distinct source so agent-written memory and user-authored GEN.md/CLAUDE.md never mix: system.AutoMemoryDir/IndexPath and LoadAutoMemory (capped, line-boundary truncation). Reminder-layer injection is wired in a later slice. Also fix two stale "l1:" log prefixes left from the package rename. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the Origin frontmatter field to Skill (absent = user-created) so the reviewer can tell apart skills it owns. Add the L1-only skill_manage tool over gen-code's existing user/project skill scopes (no agent-created/ subdir): create / patch / edit / write_file / remove_file / delete, all refusing to touch user-created skills. patch uses a pragmatic fuzzy-match chain (exact -> line-trimmed -> whitespace-collapsed) with an escape-drift guard and a unique-match requirement; the block-anchor and context-similarity tiers from the design are deferred. Class-level kebab names are enforced (also a traversal guard), and support files are confined to references/templates/scripts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add RunReview: forks a restricted core.Agent over the just-completed turn snapshot to capture durable learnings. The fork inherits the parent's system prompt verbatim via a fresh System (avoids clobbering the parent's telemetry observer that NewAgent installs), is granted only memory_write + skill_manage behind a static allow-only permission policy (never prompts the TUI), runs headless (OutboxBuf -1) with a bounded MaxTurns, and is driven via SetMessages + Append + ThinkAct. Add the three review prompts (memory / skill / combined) selected by which arms fired; each embeds the current memory store and skill inventory so the fork refreshes and dedupes rather than blindly appending. Thread the turn snapshot through ReviewFunc so the reviewer reviews the turn it observed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Security — extend the injection scan from memory writes to all four skill
write paths (Create, Edit, Patch, WriteFile). Skills are loaded into a
future system prompt the same way memory is, so a poisoned skill body or
description is the same stored-injection vector with the same fix.
Refactor scan.go to expose two checks:
- scanContent — rejects empty + threats; for required content
(memory entries, skill bodies).
- scanForThreats — threats only, allows empty; for optional fields
(skill descriptions, support-file contents, patch results that may
legitimately remove text).
Add TestSkillRejectsInjectionContent covering create / edit / patch /
write_file paths plus the rollback guarantee (a rejected write must not
mutate the on-disk skill).
memory.go's two existing callsites are renamed scanMemoryContent
→ scanContent to match the refactored API.
Docs alignment (post-rebase) — update three stale §3a / §3b code-comment
references to the merged design doc's new structure: §4 Memory flow,
§5.2 Provenance / write scope, §5.3 Tool surface. invariant #8 and §3
references remain valid and are untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement the action-level permission scheme defined in the merged design note: three booleans (allowCreate / allowUpdate / allowDelete) gate the corresponding skill_manage actions on the agent-created scope; one advanced opt-in (allowUpdateUserCreated) extends patch — and only patch — to user-created skills. Config (internal/setting/settings.go) - SelfLearnMemory grows a MaxKB field with the 25 KB upper-bound invariant (§4.2). Constants: SelfLearnMaxMemoryKB, SelfLearnDefaultMemoryKB. - SelfLearnSkills grows four permission fields. The three core ones are *bool so unset (nil) cleanly distinguishes from explicit false; default resolves to true via AllowCreateOr / AllowUpdateOr / AllowDeleteOr. - SelfLearnSettings.Validate enforces the three rejected combinations documented in §3.1: create-without-update, only-add-never-subtract, advanced-opt-in-without-update-base; plus the maxKB range check. - MaxKBOr returns the default fallback so callers don't repeat the zero-means-default branch. SkillManager (internal/selflearn/skill.go) - ActionPermissions struct + DefaultActionPermissions() constructor. Carries the four booleans. NewSkillManager now takes ActionPermissions explicitly so the caller's intent is visible at every construction site. - requirePatchable extends the existing requireAgentOwned guard: patch on user-created is allowed iff AllowUpdateUserCreated is set. All other update-shaped actions (edit, write_file, remove_file) stay on requireAgentOwned — only patch crosses the user-created boundary (§5.2 / §5.5 invariant). - Each of the six action entry points (Create, Edit, Patch, WriteFile, RemoveFile, Delete) opens with a uniform errActionDenied early-return when its allow flag is off, so the model sees a consistent "permission denied" shape on the veto path. Tests - TestSelfLearnValidate covers the four legal combinations (default, freeze, patch-only, feature-off), the three rejected combinations, and the maxKB range — both the upper bound and the negative case. - TestSelfLearnAllowAccessors confirms the nil→true / explicit-pass-through contract every downstream consumer depends on. - TestSelfLearnMemoryMaxKBOr exercises the default fallback. - TestSkillActionPermissions runs each allow* flag through its action(s), asserting both the error message and the rollback guarantee (a denied write must not mutate disk). - TestSkillAllowUpdateUserCreated asserts the advanced opt-in extends patch to user-created, but never Edit, Delete, or WriteFile — the design invariant that only patch crosses the boundary. Existing helpers (newTestSkillManager, fork tests) are updated to pass DefaultActionPermissions, preserving prior behavior. -race passes for both selflearn and setting packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps up the Phase 1 implementation surface of the L1 design note (notes/active/l1-background-review.md): configurable memory cap, fork deadline, defensive snapshot copy, permission-aware prompt synthesis, the setting→runtime bridge, and the memory-auto reminder provider. Memory cap configurable (memory.go) - Rename memoryFileCharLimit → DefaultMemoryFileCharLimit and store it on MemoryStore via NewMemoryStoreWithCap(cwd, maxFile). Add/Replace enforce per-instance maxFile instead of the package constant, so the app can lower the cap via memory.maxKB (§3.1) without forking the store. NewMemoryStore preserves the old default for callers that don't know about §4.2 yet. Fork deadline (fork.go) - ForkConfig grows a Deadline (default 5 min, DefaultForkDeadline). RunReview wraps the caller's ctx in context.WithTimeout, satisfying invariant #5: a hung provider call now returns instead of leaving inFlight=true and silently disabling all future reviews. Snapshot copy (reviewer.go) - Observe copies result.Messages into a fresh slice before handing it to the background goroutine. The main agent loop reuses its message slice; without the copy the goroutine would race on truncate-on-compact / append. -race passes for concurrent Observe (concurrency_test.go). Prompt synthesis per permissions (prompts.go) - skillSection constant replaced with skillSectionFor(mgr): action steps the SkillManager will veto at dispatch are stripped from the prompt. AllowCreate=false swaps "create is last resort" for "creation is disabled"; AllowDelete=false drops the retire-skill bullet entirely; AllowUpdateUserCreated=true widens the scope sentence to "patch any existing skill (including user-created)". Permission layer remains the hard floor — this is steering, not enforcement. Setting → runtime bridge (config_setting.go, new) - ResolveSettings(setting.SelfLearnSettings) → Resolved{Config, Perms, MemoryMaxChars}. Single conversion point with Validate() up front so wire-up code never branches on per-field defaults. Auto-memory reminder provider (reminder + app/agent.go) - New ProviderMemoryAuto constant. agent.go's reminder registration picks up the L1 store via system.LoadAutoMemory(cwd) and emits it under scope="auto" — distinct from memory-user / memory-project so agent-written and user-authored memory never mix. Wraps the same PostCompact / cwd-change refresh path the other memory providers use. Tests - concurrency_test.go: snapshot-copy isolation + concurrent Observe hammered by 8×20 goroutines under -race. - prompts_test.go: 5 permission-scenario coverage of skillSectionFor. - config_setting_test.go: happy path, validation error propagation, MaxKB default fallback and pass-through. -race passes for selflearn, setting, reminder, and app packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Connects the implementation of the previous commits into the running session so the feature actually fires when configured. Implements §9 step 4 of the design note. services.SelfLearn (services.go) - New field on the app's services struct that holds the live Reviewer when at least one arm is enabled. Nil ⇒ zero overhead: no goroutine, no counters, no extra model calls (§3.1 promise). wireSelfLearn (internal/app/selflearn.go, new) - Called from ensureAgentSession after Agent.Start so the ReviewFunc can capture the live System. - Reads setting.SelfLearn via selflearn.ResolveSettings, which also runs Validate() so an illegal config is logged and skipped instead of crashing the session. - Builds NewMemoryStoreWithCap + NewSkillManager with the resolved cap and permissions, threads both into a ForkConfig the ReviewFunc constructs at fire time. - Rebuilds the LLM client from the captured BuildParams.Provider/ModelID /MaxTokens so the fork hits the same prefix-cache key as the parent (§6 invariant #2) without sharing the parent's request cycle. - Seeds the memory arm's counter from the preloaded user-turn history via SeedTurns (invariant #8 resume). - Review summary is logged for Phase 1; user-visible MessageEvent + status-bar surface is Phase 1.5 (§"User-visible surface"). OnTurnEnd hook (model_agent_events.go) - Forwards every turn Result to forwardTurnToSelfLearn unconditionally. The Reviewer gates on StopEndTurn internally so cancelled / interrupted turns are silently skipped at one place rather than scattered up the call stack. StopAgentSession cleanup (agent.go) - clearSelfLearn nils out services.SelfLearn so a future ReviewFunc closure can't hold references to a torn-down agent. In-flight goroutines complete on their own via DefaultForkDeadline. systemOnlyAgent shim - Minimal core.Agent wrapper that exposes only System(). The ReviewFunc needs the parent's System for prefix-cache parity but nothing else from the live agent — keeps the dependency surface narrow and lets us avoid extending the core.Agent interface with an LLM() accessor that nothing else needs. ReviewKind.String (reviewer.go) - Stable log-friendly labels (none / memory / skill / memory+skill) for the wire-up's review log entry. Covered by TestReviewKindString. -race passes across selflearn, setting, reminder, and app packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements two of the three §"User-visible surface" pieces from the design
note (the /config Self-Learning panel remains for a follow-up):
Status-bar "evolving …" indicator (conv/message.go + view.go)
- New SelfLearnEvolving flag on OperationModeParams and a paired
renderSelfLearnIndicator producing a muted, italic "evolving …" segment
in the bottom-left when a review fork is in flight. Plain text — no
emoji, no spinner — so it reads as a static piece of structural UI
rather than a decoration (matches the design's text-not-icon ask).
- view.go reads services.SelfLearnRunning at render time. Hidden in both
the idle and disabled cases — the status bar pixel-matches a
feature-off session whenever no review is actively running.
Completion / failure notices (selflearn.go + services.go)
- services.SelfLearnRunning (*atomic.Bool) is the bridge between the
ReviewFunc goroutine and the render path; pointer-typed so the
surrounding services struct stays freely copy-able via the View()
path.
- ReviewFunc sets/clears the flag with defer, then on success publishes
a hub.Event{Target: "main", Subject: header, Data: summary} that the
app's main event loop routes through onMainEvent →
injectNotification, surfacing the recap as a notice + content block
in the conversation flow. Failures publish a parallel
selflearn.review.failed event with the trimmed error message — silent
swallowing would leave the user wondering what happened.
- "Nothing to save" stays silent (the caller already filters on
empty summary).
Tests
- TestRenderModeStatusShowsSelfLearnIndicatorWhenEvolving locks in the
visibility contract: hidden when the flag is false, visible — and only
then — when true.
Deferred to a follow-up
- Braille / target-name spinner with tea.Tick animation and per-tool-call
status updates (requires the fork to publish progress events back).
- /config multi-panel Self-Learning panel.
-race passes across selflearn, setting, reminder, and the app + conv
packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…vers
Promotes the bottom-left "evolving" indicator from a static text to the full
§"User-visible surface" specification:
hidden → "evolving ⠋" (review started)
→ "evolving ⠋ go-testing" (each successful tool call)
→ "evolved · N changes" (2 s after success)
→ "evolving failed" (3 s after failure)
→ hidden
State machine (selflearn_ui.go, new)
- SelfLearnUIState packs phase + target + frame + change-count behind a
single mutex. BeginReview / Step / Complete / Fail are called from the
reviewer fork's goroutine; Tick is called from the Bubble Tea Update
goroutine. Snapshot returns a consistent triple for the render path.
- Step debounces target swaps at ≥ 400 ms so rapid tool calls don't flicker
the bar; the change-counter is unaffected and counts every write.
- Done/failed phases auto-decay back to idle after 2 s / 3 s respectively;
Tick reports active so the dispatcher knows when to stop re-arming.
Write observers (selflearn package)
- MemoryStore.SetWriteObserver and SkillManager.SetWriteObserver register
goroutine-safe callbacks fired after each successful write. The fork
package itself stays UI-agnostic; the app layer hooks the callbacks to
Step the UI state with the affected name (skill kebab name) or
"memory · <topic>" (basename minus .md, MEMORY.md collapses to bare
"memory"). memoryTopicSuffix encodes that mapping.
Tick + hub plumbing (update.go, model_turn_queue.go, selflearn.go)
- selflearnTickMsg + scheduleSelflearnTick drive the spinner at ~100 ms.
- ReviewFunc publishes a "selflearn.review.started" hub event the instant
it begins so the main loop schedules the first tick — the indicator is
visible from frame zero rather than waiting for a render.
- onMainEvent intercepts the started event and routes it to the tick
scheduler instead of letting it become a user-visible notice (the recap
block at completion is still routed normally for "selflearn.review.done"
/ "selflearn.review.failed").
- handleSelflearnTick advances the state and re-arms while non-idle; the
loop quiesces automatically when the state decays to idle.
services / view
- services.SelfLearnUI replaces the previous atomic.Bool. Always non-nil
(constructed by newServices) so render can take Snapshot without a
nil-check and idle just renders empty.
- conv.OperationModeParams.SelfLearnSegment carries the pre-rendered text
so the conv layer doesn't have to import the app-side state struct.
Tests
- TestSelfLearnUIPhaseTransitions: idle→reviewing→done lifecycle, render
format, and decay timing.
- TestSelfLearnUIFailDecay: failed-phase hold outlasts done-hold.
- TestSelfLearnUIStepDebouncesTarget: rapid Step calls keep the previous
target but still bump the change counter.
- TestSelfLearnUITickFrameAdvances: spinner moves each tick.
- TestMemoryTopicSuffix: index / topic file → status tail mapping.
-race passes for selflearn, setting, reminder, and the app + conv + input
packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements the §"User-visible surface" /config slash command — the third and last runtime UI surface in the design note (status-bar spinner and completion notice landed earlier). A multi-panel sidebar (Provider, Permissions, Appearance, …) is planned; for now Self-Learning is the single panel and the scaffolding is shaped so siblings can be added without breaking the slash command path. setting.UpdateSelfLearnAt (loader.go) - New helper that validates the config (§3.1) before writing and merges the change into either ~/.gen/settings.json (user-wide) or ./.gen/settings.json (project-local) via the existing atomic saveToFile path. Re-uses Validate() so disk never ends up holding an illegal combination, and clears the loadedSettings cache so the next Snapshot picks up the new values. input/on_config.go (new) — the panel - ConfigSelector struct holding a working snapshot of the L1 config that the user mutates in place; Save merges back to disk, Esc discards. - rows() materializes the panel layout as a flat list of typed rows: header / spacer / bool / int / advanced-hint / save. The same function drives navigation bounds and Render so they can never disagree. - Boolean toggles cycle the *bool fields through nil→true/false so users can land on any of the four §5.5 legal combinations. Int rows enter/edit/commit with backspace + digit handling and clamp the result to the row's [min, max] range — MaxKB clamps to SelfLearnMaxMemoryKB so the §4.2 invariant is enforced from the UI. - Render lays the panel out with rounded borders, a cursor caret, and a ≈ N EN words / N 中文字 hint next to the MaxKB row. Validation errors surface inline beneath the rows; the Save row reads as muted/grey when the snapshot is illegal and green when it is legal, so the user has a pre-write affordance for the same check setting.UpdateSelfLearnAt applies. - Tab toggles the save target between user-wide and project-local; surfaced in the title row so the action key is one keystroke. Wiring - input.Model grows a Config field; NewConfigSelector takes deps.Setting. - /config registered in slash_command.go's dispatch table; the handler just calls Config.Enter(width, height). - ConfigSelector added to popups() so the active panel intercepts keys before the textarea. - Update handles ConfigSavedMsg by adding a "Self-learning config saved (<scope>)" notice to the conversation flow. Tests - TestConfigSelectorIsActivated: Enter/Esc lifecycle. - TestConfigSelectorTogglesBool: space and enter both flip a bool row. - TestConfigSelectorIntEditAndClamp: int edit flow + clamp to row max. - TestConfigSelectorRenderShowsValidationError: illegal §3.1 combination surfaces the underlying error message in the rendered panel. - TestConfigSelectorTabFlipsScope: Tab cycles user ↔ project. -race passes for selflearn, setting, reminder, and the app + conv + input packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, prompt rewrites Closes four §-level discrepancies between the L1 implementation and the design note: §6 invariant #7 — silent on "Nothing to save" - selflearn.go now suppresses the user-visible recap when the model's reply is empty or a case/whitespace/period variant of "Nothing to save". The previous code only checked summary == "" so any "Nothing to save." reply produced a noisy notice. Helper isNothingToSave + TestIsNothingToSave covers the variants. §3.1 — GEN_DISABLE_SELF_LEARN env override - New const selfLearnDisableEnv. wireSelfLearn returns early when the env var is "1" / "true" regardless of settings.json. Matches the Claude-Code-style CLAUDE_CODE_DISABLE_AUTO_MEMORY documented in §3.1. §4.1 — eviction-first memory prompt - memorySection rewritten with the 4-step decision flow the design diagram encodes: (1) retire stale entries first; (2) replace, don't duplicate; (3) at-cap forces another prune; (4) only then add. The "a pass that only adds is a missed pruning opportunity" framing is now explicit. §5.1 — full trigger conditions in skill prompt - skillSectionFor now enumerates the table's CREATE-ALL-FOUR criteria (non-trivial + no coverage + class-level name + not anti-pattern), the UPDATE three paths (including the previously-missing "user-voiced style/format/workflow correction"), and the DELETE three paths verbatim. Preference order UPDATE > DELETE > CREATE is stated once at the top instead of being implicit in step ordering. - Tests updated for the new prose anchors. reviewPreamble + reviewClosing also adjusted: the model no longer writes a free-form summary line — the user-visible recap is built from the actual tool calls (the action log via the existing observers), so the closing instruction reduces to "reply with 'Nothing to save.' or empty". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ocks, stale comments Mostly the audit's "should-fix" bucket — nothing changes behavior, but the package's surface and read-flow tighten up: - systemOnlyAgent shim deleted. currentSystem() now returns (core.System, bool) directly; the previous shim was a core.Agent that lied about its capabilities for one caller. - forwardTurnToSelfLearn / clearSelfLearn one-line wrappers inlined at their two call sites. - MaxFileChars (zero callers) and SelfLearnUISnapshot.IsActive (no non-test callers; redundant with Tick's bool return) deleted. - NewMemoryStoreWithCap merged into NewMemoryStore (one constructor, one default behavior). Two test call sites updated to pass 0 explicitly. - Observer fields (MemoryStore.onWrite, SkillManager.onWrite) drop their RWMutex. The reviewer fork is single-flight per session (§6 invariant #8), so the field is written once at wire-up and read on every write by the same goroutine — no race. Type docs now state the "SetWriteObserver before first write" contract. - m.services.Agent == nil dead-branch dropped from currentSystem (Agent is set at newServices and never nilled). - Empty `if !replaceAll {}` block in skillpatch.go removed; comment moved to the outer loop where it actually applies. - errActionDenied's stale §3.1 anchor corrected to §5.5 (action permissions live there in the merged design doc). - Validate "0 < maxKB <= 25" error message corrected to "0 <= maxKB <= 25 (0 = default)" — 0 is intentionally accepted as "unset". - ConfigSelector.Render caches Validate() once instead of calling it twice per frame (Save row label + footer error line). - Trivial restate-the-name comments on Complete/Fail/Dir trimmed; the comments that remain explain why, not what. Snapshot-copy test rewritten to exercise the actual failure mode (caller truncates the slice while the goroutine still holds the original) rather than mutating the immutable element it used to. -race passes across selflearn, setting, reminder, and app + conv + input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rses
A · MessageEvent-style recap block (§"User-visible surface")
- New ReviewAction type + SelfLearnUIState.RecordAction supersede the
earlier Step(target). Each successful tool call now records (verb,
kind, target) into the per-pass action log; the spinner-tail target
still swaps with the existing 400 ms debounce.
- MemoryWriteObserver gains an action parameter ("add" | "replace" |
"remove"); SkillWriteObserver already had one. The wire-up maps both
to verb / kind / target rows via memoryVerb / skillVerb.
- formatRecapBlock turns the drained log into the delimiter-bounded
block from the design example:
─────────────────────────────────────────────────
· updated skill go-testing
· saved memory memory · debugging
· retired skill outdated-thing
─────────────────────────────────────────────────
Empty input ⇒ "" so the no-write pass keeps quiet (§6 invariant #7).
- publishSelfLearnSummary now takes []ReviewAction (not the model's
one-line summary) and routes through the existing hub → notice path,
so the recap surfaces as structural data rather than LLM self-report.
- DrainActions clears the log after the publish so back-to-back passes
don't cross-contaminate.
D · de-duplicate the double frontmatter parse
- New parsed{path, origin, fm, body} plus parseSkill(name) — guards
requireAgentOwned and requirePatchable now return the full parsed
struct, so Edit and Patch no longer call ParseFrontmatterFile a
second time for the same file. WriteFile / RemoveFile / Delete
ignore fm/body (they only need the path).
- The mutex range itself is unchanged: ≤1 in-flight per session
(§6 invariant #8) means the broad mu doesn't actually contend, and
collapsing the locks would invite TOCTOU on origin checks.
Tests
- TestFormatRecapBlock locks the delimiter shape + per-row layout.
- TestVerbMapping covers every memory_write / skill_manage action's
verb mapping so the recap stays in sync with the design example.
- TestSelfLearnUIPhaseTransitions / TestSelfLearnUIStepDebouncesTarget
updated to the RecordAction API.
-race passes across selflearn, setting, reminder, app + conv + input.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ero defaults settings.json public schema change. The three skill permission flags switch from *bool tri-state (nil = default true) to Deny-encoded bool (zero = allow). Eliminates the AllowCreateOr/AllowUpdateOr/AllowDeleteOr accessor sprawl and three local boolPtr / ptr helpers that existed only to construct the previous shape. settings.json schema (§3.1) - skills.allowCreate / allowUpdate / allowDelete *bool fields → DenyCreate / DenyUpdate / DenyDelete bool fields. Zero ⇒ allowed (Go idiom "zero value should be sensible default"). Every omitted field reads as "allow"; setting true is the explicit lockdown. - skills.allowUpdateUserCreated stays an Allow-encoded bool — it is an opt-in for an off-by-default capability, so the polarity stays positive. - Three new read-side methods AllowCreate() / AllowUpdate() / AllowDelete() return the negated stored value. Downstream code reads through these so the polarity inversion is invisible past this layer. Validate (§3.1) - Same three rejected combinations, error messages updated to the new field names: · denyUpdate=true requires denyCreate=true (created-but-never-refined) · denyDelete=true requires denyCreate=true (only-add-never-subtract) · allowUpdateUserCreated=true requires denyUpdate=false Callers - ResolveSettings (selflearn package) reads via the new accessors. - ConfigSelector's three skill rows toggle s.Skills.DenyX directly (no more *bool ceremony). The toggleAllowCreate/Update/Delete helpers and boolPtr in on_config.go are dropped; the row toggle is now a one-line lambda. - Three test helpers (boolPtr / ptr in selflearn / setting / input test files) deleted alongside the *bool fields they were built for. Design doc (notes/active/l1-background-review.md) - §3.1 config block, fields table, validation table, and rationale paragraph all switched to the Deny-encoded schema. - §5.1 trigger-conditions table and §5.0 trigger-layers row also updated so the permission-check column shows the new field names. - §5.5 meaningful-combinations table now shows the effective Allow permissions in the columns (preserving the at-a-glance reading) but annotates each row with the corresponding denyX field setting so the config-side correspondence is unambiguous. -race passes across selflearn, setting, reminder, app + conv + input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ead code Four parallel /simplify agents found ~40 items. Applying the high-leverage subset; skipping the rest with notes below. Two correctness regressions caught by the audit (both introduced by this branch, not pre-existing): - setting/settings.go Data.Clone() never copied the new SelfLearn field, so Settings.Snapshot() silently returned the on-disk default instead of the user's saved config. /config would always have shown stale defaults. Restored the row + extended clone_test.go to lock it down alongside the rest of the scalar-drift guard. - app/selflearn.go publishSelfLearnSummary/Failure put the recap block in hub.Event.Data, which routes through injectNotification's Content branch into SubmitToAgent — every successful review would have re-prompted the LLM with its own audit trail, breaking §6's "out-of-band" promise. Recap now goes in Subject (Notice-only path). Documented the trap so the next caller doesn't relearn it. Simplifications: - isNothingToSave removed. The action log is authoritative (the design doc says so explicitly); empty log ⇒ no writes ⇒ silent. Substring matching the model's text was redundant + undermined the invariant. - currentSystem(*model) inlined at its one caller. The systemOnlyAgent shim it once defended against has been deleted upstream; the helper no longer earned its existence. - ForkConfig.MaxTurns / Deadline override fields removed (always zero-defaulted in practice; no caller sets them). Exported DefaultMaxTurns / DefaultForkDeadline collapsed to package-private forkMaxTurns / forkDeadline constants used inline. - selflearn.readOrigin (dead outside tests) removed. The two test call sites use the production parseSkill helper now. - configRow.tail (write-only field) removed. - SelfLearnUIState.changes (kept in lockstep with len(actions)) merged into a single doneCount snapshot that's only needed for the done-hold render after DrainActions clears the live log. - Stale "Step" reference in the package doc comment fixed (the method was renamed to RecordAction earlier). Reuse: - brailleSpinnerFrames promoted to internal/app/kit.BrailleSpinnerFrames. on_provider_view.go's providerSpinnerFrames is now a one-line alias so the two consumers share the table and any non-Unicode fallback lands in one place. Skipped (each noted, not argued): - Inventory() disk re-read on every prompt build — would need a per-write cache invalidation hook that crosses observers; out of scope for /simplify, queued for a perf pass. - Tick re-arming during done/failed hold — same; the cleanest fix schedules a single deadline tick instead of polling, which is a small but real refactor of the state machine. - Snapshot fast-path on idle (lock-free check) — would require an atomic phase field that mirrors the mutex-guarded one; the cost of divergence outweighs the per-frame lock saving. - ResolveSettings / ActionPermissions / SkillManager observer constructor — the three "drop the indirection" findings each touch multiple packages; they read cleanly but would be a separate refactor commit, not a quality cleanup. - hub.Event "selflearn.review.started" special case in onMainEvent — the proper fix is a dedicated hub target with its own subscriber; out of /simplify scope. - yaml.Marshal vs hand-rolled buildSkillMD — a careful refactor; hand-rolled quoter works for current inputs. - Inventory via skill.Registry — already documented as deliberate (registry is stale-cached against same-session writes). -race passes across selflearn, setting, reminder, and the app + conv + input + kit packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#1 mergeSettings dropped SelfLearn (setting/merger.go) - The merger enumerated every Data field except the new SelfLearn block, so every Load + every SaveToFile silently wiped selfLearn from disk. The L1 feature was effectively unreachable from settings.json. Added mergeSelfLearn with field-level semantics (int coalesce, bool OR for enable/deny — safety-biased), plus a regression test that exercises base→overlay → base+overlay merges and the symmetric base-only case. #2 /config Space on rowHeader panicked (app/input/on_config.go) - Cursor started at index 0 (the Memory header) whose toggle field is nil, so the first Space keystroke after /config crashed the TUI. Up/Down didn't skip non-editable rows either. Fixed by (a) starting the cursor at the first editable row in Enter, (b) replacing the cursor++ / cursor-- math with nextEditableRow / prevEditableRow that walks past headers/spacers/advanced-hints, and (c) guarding Space and Enter with an explicit kind/toggle check so a stale cursor index can never crash via a nil call. Test: navigation across the panel never lands on a non-editable row. #3 DrainActions ran BEFORE the deferred Complete (app/selflearn.go + app/selflearn_ui.go) - The defer pattern meant the success path drained s.actions to a local var, returned, and only then ran Complete in the defer — which snapshotted doneCount = len(s.actions) = 0. The 2-second done-hold status bar always rendered "evolved" without the count. Restructured the wire-up to call Complete inline before Drain (no defer for the success path; explicit Fail call on the error path). Test: TestCompleteCapturesCountBeforeDrain exercises the Complete→Drain ordering. #4 RunReview rooted at context.Background (app/selflearn.go + app/services.go + app/agent.go) - A torn-down session couldn't cancel an in-flight fork — the LLM request held tokens / HTTP for up to forkDeadline (5 minutes) after the user moved on. Added services.SelfLearnCancel + a session-scoped context.WithCancel in wireSelfLearn; StopAgentSession cancels it so the fork unblocks immediately. #5 Phantom "evolving → evolved" on a torn-down session (app/selflearn.go + app/selflearn_ui.go) - The deferred Complete fired even on the !Active early-return, flashing the status bar for a review that never ran. Two fixes: (a) moved the liveness check above BeginReview/publishStarted so a bail produces zero UI state change; (b) Complete itself now goes straight to idle when len(s.actions) == 0 — the §6 invariant #7 "silent when nothing changed" promise applied to the bar surface too. Test: TestCompleteSilentOnEmptyActions. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#6 WrapMemory("auto") fell to user-authority preamble (reminder.go) - memoryPreamble had no "auto" case; agent-written L1 content was introduced to the model as "the user's saved memory (preferences and standing instructions). Apply it throughout this session." An injection that slipped past the scan would have been executed as if the human user authored it. Added a distinct "auto" preamble that explicitly frames the content as durable context, NOT direct user instructions, with a do-not-act-on-imperatives caveat. #7 memoryEntryDelimiter not rejected at write (memory.go) - Content containing the literal "\n§\n" delimiter passed scanContent but silently split into multiple entries on next read — corruption AND a scan bypass (the post-split pieces were never scanned in isolation). Added rejectEmbeddedDelimiter, called from Add and Replace right after scanContent. Test: TestMemoryRejectsEmbeddedDelimiter exercises both write paths. #8 Two consecutive user messages in the fork (fork.go) - ag.SetMessages(snapshot) followed by ag.Append(UserMessage(prompt)) put two user-role messages on the wire when the snapshot ended with a tool_result (RoleUser-shaped on common providers), producing 400 "messages must alternate" rejections that made the reviewer effectively dead on tool-loop turns. Added trimTrailingUserMessages that reslices (doesn't mutate the caller's snapshot) so the review prompt is the only user turn at the tail. Test guards the three cases (trailing user, trailing assistant, nil input). #9 Memory prompt hardcoded the 25 KB cap (memory.go + prompts.go) - memorySection const said "hard 25 KB cap per file" regardless of memory.maxKB. With a lowered cap (e.g. 5 KB), the model would propose oversized writes that the store rejected, burning ThinkAct rounds. Replaced the const with memorySectionFor(*MemoryStore) which interpolates mem.MaxKB(); buildReviewPrompt signature grew a *MemoryStore parameter so the section sees the actual configured cap. #10 recover() value discarded (reviewer.go) - The Warn line was a constant string with no panic value, kinds, or stack — production panics were unobservable. Logged zap.Any("panic", rec), zap.String("kinds", …), zap.Stack so the operator has actionable info. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#11 Observer outlives session (app/selflearn.go) - The closures registered via SetWriteObserver captured SelfLearnUI by pointer and continued to fire after StopAgentSession had nilled SelfLearn — mutating UI state for a session the user terminated. Added a `m.services.SelfLearn == nil` early-return inside both closures so a write that lands post-stop is silently dropped. #12 Tick chain stacking on back-to-back reviews (app/selflearn_ui.go + app/model_turn_queue.go) - Every selflearn.review.started event unconditionally scheduled a fresh tick chain; if review B started before review A's done-hold finished, two chains advanced the spinner together at 2× cadence. Added a tickArmed bool guarded by the existing mutex plus an ArmTick() reservation method; onMainEvent only schedules a new tick when ArmTick returns true. tickArmed is cleared in Tick when the state decays to idle, so subsequent reviews rearm cleanly. #13 itersSinceSkill unbounded during in-flight (reviewer.go) - A 4-minute review during which 50 turns each contributed 10 tool iters left the counter at 500, then fired on every turn for many turns after the release. Capped the accumulator at 2× the threshold so a stuck review produces at most two immediate refires once it releases — preserves the "fire again next clean turn" intent without the post-release burst. Test: TestSkillIterCounterCappedDuringInFlight. #14 Snapshot took mutex on every render frame (app/selflearn_ui.go) - TUI repaints on every key press, resize, and inbound message; Snapshot's unconditional Lock/Unlock was pure overhead in the dominant idle case. Added phaseAtomic (sync/atomic.Int32) mirroring phase; writers update both together under the mutex; Snapshot fast-paths to the empty value when phaseAtomic reads idle without touching the mutex. #15 Tick re-armed at 100 ms during done/failed hold (app/selflearn_ui.go + app/selflearn.go) - ~20 useless tick round-trips during the 2 s done-hold and ~30 during the 3 s failed-hold, each forcing a full tea.Cmd → scheduler → Msg → Update dispatch. Tick now returns the delay for the next scheduled tick: selflearnTickInterval while reviewing, REMAINING hold time while done/failed (one deadline tick instead of polling). handleSelflearnTick uses the returned delay to schedule via tea.Tick directly. Test call sites updated for the new (delay, active) signature. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
format-check was failing in CI on three files left over from manual edits across the recent batches: - internal/app/input/on_config.go - internal/app/services.go - internal/app/view.go Pure whitespace / alignment changes — no behavior delta. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two CI failures on this branch, both fixed: #1 internal/core/system → internal/session (layer violation) - AutoMemoryDir called session.EncodePath. core ranks 3 and session ranks 2 (feature), so this was an upward import in the layered architecture (caught by tools/layercheck). Inlined a 5-line encodeProjectPath helper in core/system that mirrors session.EncodePath — both functions are tiny string-ops and must stay in lockstep so memory and transcript stores resolve to the same project partition; comment makes the dependency explicit. #2 internal/selflearn unmapped in layercheck - Added "internal/selflearn": "feature" to tools/layercheck/main.go's layerOf map. selflearn imports core (system, agent) and infra (markdown) — both downward, consistent with feature-layer peers. Pre-push tooling - New `make ci` target runs every step the .github workflow runs in the same order (format-check → build-all → lint → unit tests → integration tests). Catches format/vet/layercheck/test failures locally instead of round-tripping through Actions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…om /code-review Findings from the max-effort /code-review pass on this branch: - #1 config disable can't persist: UpdateSelfLearnAt now replaces the selfLearn block instead of OR-merging it via mergeSelfLearn, so a true->false toggle from /config actually sticks (other settings in the file are preserved). Cross-level Load layering still ORs by design. - #2/#3 lifecycle race + leak: write observers gate on a captured atomic.Bool instead of racing on m.services.SelfLearn; new teardownSelfLearn() cancels the prior fork context before a re-wire (the agent-toggle rebuild path bypassed StopAgentSession) and on stop. - #4 skill name traversal: resolve() validates name with skillNameRe so patch/edit/delete/write_file/remove_file can't escape the skills dir. - #5 yamlScalar under-quoting: emit the description via yaml.Marshal so a value like "[draft] ..." can't produce invalid frontmatter that bricks every later parseSkill. - #7 applyPatch overlap: lineWindowReplace collects non-overlapping fuzzy windows (advance by w on a hit), mirroring the exact tier. - #8 stale settings: reload the in-memory Settings handle on ConfigSavedMsg. - #10 memory delimiter: reject a bare "§" line that fuses with the entry delimiter and shifts boundaries on read. - #9 relax Validate: "create + update allowed, delete denied" is now a legal config (delete is already agent-created-only, so opting out removes no safety). Rule, test, and design note updated. Adds regression tests for the config round-trip, traversal, description round-trip, and fuzzy-overlap cases (previously untested). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rename `SelfLearnSegment` field → `SelfLearnStatus`, `SelfLearnUIState` → `SelfLearnIndicator` (file `selflearn_ui.go` → `selflearn_indicator.go`), `ArmTick` → `TryStartTicker`, `tickArmed` → `tickerRunning`, and `internal/selflearn/config_setting.go` → `resolve.go`. Trim verbose doc blocks across the L1 surface (selflearn.go, indicator, reviewer, fork, memory, skill, resolve) — keep WHY comments where they're load-bearing. No behaviour change.
- Rename `selflearn.Resolved` → `Runtime` (clearer: it's the runtime bundle ResolveSettings returns) - Extract `SelfLearnIndicator.decayPhase` to dedupe done/failed branches in Tick - Extract `ConfigSelector.renderCursor` helper (3 inline call sites) - Use `s.Memory.MaxKBOr()` in /config panel instead of local `defaultIfZero` - `slices.Delete` for memory entry removal - `countUserTurns([]core.ChatMessage)` reads `m.conv.Messages` directly, avoiding a `ConvertToProvider()` allocation just for counting
- Flatten `selflearn.Config`: drop the awkward `Resolved`/`Runtime`/`Triggers` wrappers; the resolved bundle is just `Config` with `Memory`, `Skills`, `Perms`, `MemoryMaxChars` at the top level. - Rename `internal/selflearn/resolve.go` → `config.go` to match. - Group the four `services.SelfLearn*` fields under a `SelfLearnServices` sub-struct (`services.SelfLearn.Reviewer / .Cancel / .Live / .Indicator`) — cleaner field cluster and aligned with how the data moves together.
- Extract a Panel interface so /config can host multiple sub-panels in the future. ConfigSelector becomes the popup shell: breadcrumb, optional tab strip when >1 panel is registered, esc/ctrl-tab handling. Today only Self-Learning is registered; adding a sibling panel (Provider, Permissions, …) is a one-liner in NewConfigSelector. - Move all Self-Learning UI into a private selfLearnPanel that implements Panel: rows, scope, snapshot, validate, save. - Adopt the /plugin overlay layout via kit.Panel — centered floating box, Padding(1, 2), full-width separator lines, kit.PanelTab tab strip, kit.HintLine footer. Right-aligned value column on int rows. - Add "/config › Self-Learning" breadcrumb so the popup advertises which command opened it. - Replace on_config.go / on_config_test.go with config_panel.go + config_selflearn.go + config_selflearn_test.go.
Layout:
- Cap popup width to ~84 chars so on a wide terminal the form values
stay near their labels instead of drifting to the far right.
- Move the cursor inside the indent so ▸ sits right next to the row it
marks, not orphaned in the leftmost column.
- Flatten the section hierarchy: promote "Allowed actions" and "Advanced"
from sub-headers under Skills to top-level sections. Result is four
clean sections (Memory / Skills / Allowed actions / Advanced).
- "Review cadence (user turns)" → "Run every" with the unit ("user turns",
"tool iterations", "KB") rendered as a muted suffix next to the value.
Reads naturally: "Run every 10 user turns".
- The "Max size" footnote (≈ words / 中文字 equivalence) moves under the
label column instead of right-indented.
Slash:
- Register /config in the builtin command list so it shows in the
command picker — handler was already wired, but the metadata entry
was missing so autocomplete never offered it.
- Hoist explicit layout columns: every row's label starts at the same column (labelCol=8) regardless of row type; numeric values right-end at valueCol=40 with units flowing past in muted style. Bool rows render "[ ] " between cursor and label; int rows pad an equivalent gutter so the label lands on the same column. Result: labels align vertically across the form, and "10" / "25" line up in a clean value column instead of trailing off to the box edge. - Slightly wider cap (84→92) and taller (32→34) so the form has room to breathe at "expanded" sizes. - Save button is column-aligned with the row content above it.
Drop the right-alignment that pushed values to col 40 with a wide gap between label and value. Render int rows as a compact phrase instead: "Run every 10 user turns". The value sits right next to the label (highlighted), the unit follows in muted style — no floating value in the middle of empty space.
- Int rows render value-first as a compact phrase with the underline
highlighting the number, aligned in the same column as "[":
10 user turns - Run every
25 KB - Max size ~ 4500 EN words / 8500 中文字 (UTF-8)
- "Max size" footnote moves inline after the label with "~" separator
instead of a separate line below.
- Merge the rowAdvHint row into the bool row via a new advHint field.
"Update user-authored skills ⚠ rewrites your authored skill files"
renders on one line — at narrow widths the hint wraps naturally.
- Save button aligns at the same indent as the "[ ]" rows.
Strip the recap back to a clean grouped list — no card frame, no top
"Self-improvement" header, no horizontal rules. The kind sub-header
(memory / skill) carries the only color in the block so the eye can
tell groups apart at a glance:
memory (blue)
· index — noted that lint runs via make ci, not go vet
· debugging — added 3 race-condition repro tips
skill (purple)
· go-testing — trimmed verbose examples
· python-typing — new · typing-hints and Protocol patterns
· outdated-thing
- memory header → AdaptiveColor{Dark: #6BA4D6, Light: #1F77B4}
- skill header → AdaptiveColor{Dark: #B68EE0, Light: #7B2D8E}
- rows stay italic + TextDim so the bulk of the block still recedes
- created actions get an inline "new ·" marker in accent green + bold
before the note ("new · typing-hints"), echoing the user's design
- rows without a note degrade to " · <target>" (no trailing dash)
- memory index file still renders as "index" so the column lines up
Indent provides the hierarchy; color and "new ·" are the only signals
that compete for attention.
…chat
Two follow-ups from the live demo:
1. The "new ·" inline marker on created skills was duplicating what
the LLM's note already said ("python-typing — new · new skill,
typing-hints …"). Dropped the marker entirely — the note describes
the change, including its newness when relevant.
2. The recap shared visual weight with chat text, making it hard to
tell apart at a glance. Every recap line now leads with a dim
"│ " rail so the whole block reads as a quoted self-learn surface,
not part of the conversation:
│ memory
│ · index — noted that lint runs via make ci, not go vet
│ · debugging — added 3 race-condition repro tips
│ skill
│ · go-testing — trimmed verbose examples
│ · python-typing — new skill, typing-hints and Protocol patterns
Kind sub-headers keep their color (memory blue, skill purple); the
rail itself is TextDim so the column reads as soft chrome.
Hand the visual styling over to the chat's markdown renderer. The recap publishes plain markdown — no ANSI escapes, no inline lipgloss styles — so the conversation pipeline owns how bold/code-span/bullet list renders. Output: **memory** - `index` — noted that lint runs via make ci, not go vet - `debugging` — added 3 race-condition repro tips **skill** - `go-testing` — trimmed verbose examples - `python-typing` — new skill, typing-hints and Protocol patterns - `outdated-thing` - Kind sub-headers use **bold** markdown - Targets render as `code-span` so they pop from the prose note - Em-dash separates target and LLM note; rows without a note drop the em-dash entirely - Memory index file still renders as `index` so the column lines up - Blank line between groups so the markdown parser breaks the list Removes the lipgloss imports and all the recap-specific styles (kind/memory/skill/row/rail/new) since the renderer handles them now. The block is also free to inherit any future markdown theming the conversation gets without touching this code.
Per user feedback ("前面那一道线有点丑"), the per-row "│ " left rail
felt heavy. Drop it in favour of one thin horizontal rule above the
block — the markdown "---" idiom drawn inline since Notices render
raw text:
──────────────────────────────────────────────────
memory
· index — noted that lint runs via make ci, not go vet
· debugging — added 3 race-condition repro tips
skill
· go-testing — trimmed verbose examples
· python-typing — new skill, typing-hints and Protocol patterns
The rule (50 chars of "─" in dim+faint) separates the recap from
chat above; the block ends naturally with the last row. Kind sub-
headers still carry the only color (memory blue, skill purple).
Per user feedback (the standalone horizontal rule didn't look great either), wrap the whole recap in a thin rounded box so it reads as a self-contained card: ╭──────────────────────────────────────────────╮ │ memory │ │ · index — noted that lint runs via make ci│ │ · debugging — added 3 race-condition tips │ │ skill │ │ · go-testing — trimmed verbose examples │ │ · python-typing — new skill, typing-hints │ ╰──────────────────────────────────────────────╯ Frame is lipgloss RoundedBorder in TextDim — soft chrome. Padding(0,2) gives a 2-col gutter inside the border so the content breathes. Kind sub-headers still color-code (memory blue, skill purple); rows stay italic + TextDim.
Per user feedback: the recap should surface a session ID the user can
"gen --resume <id>" to replay the L1 review's LLM calls. The previous
sidechain approach (sharing the main session's ID) would resume the
main chat instead.
Each L1 fork now creates its OWN session in the transcript store:
<main-session>.selflearn-review.<unix-seconds>
NewSidechainRecorder generates that ID and binds the Recorder to a
fresh session with it. The fork's session.started, messages, and
inferences all land under the new ID — so "gen --resume <fork-id>"
loads only the L1 review.
Recap footer now reads:
gen --resume <fork-session-id>
shown in dim+faint inside the recap box, ready to be copy/pasted.
Multiple concurrent forks coexist via the unix-seconds suffix.
Recorder gets a SessionID() accessor so the caller can surface it
without poking the private field.
Demo command fabricates a plausibly-shaped ID so the recap footer
looks identical to the real path.
Swap the solid rounded border for a quadruple-dash light style
("┈" horizontal, "┊" vertical) with rounded corners kept, and dim
the border color from TextDim to a noticeably fainter shade
(AdaptiveColor #4A4A52 dark / #C8C8CC light) so the frame whispers
instead of competing for attention.
Quadruple-dash chars (┈/┊) pack four tiny dashes per cell; the gap between dashes doesn't align with the cell boundary, so consecutive cells produce an uneven "long-short-long-short" pattern. Triple-dash (┄/┆) lays out one dash-gap-dash-gap-dash per cell — the gaps land on cell midpoints and the pattern reads uniform when tiled.
Two structural changes per user direction:
1. spinner moves from status bar to chat flow.
renderChatSection now emits the L1 indicator inline (above the
prompt, below the active content) while the review is running or
failed. The status bar entry (SelfLearnStatus) is removed —
selfLearnStatus() and the param field are deleted. During the
done phase the inline row is suppressed; the recap box that
publishes into the conversation flow already carries the summary.
The inline row uses selflearnLiveStyle (italic + TextDim) so it
sits softly in the chat thread without pulling focus.
2. "gen --resume <id>" rides on the bottom border.
formatRecapBlock no longer leans on lipgloss Border — the box is
built by hand so the footer can ride on the bottom edge itself:
╭┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╮
┊ memory ┊
┊ · index — noted that lint runs via make ci ┊
┊ skill ┊
┊ · go-testing — trimmed verbose examples ┊
╰┄ gen --resume demo.selflearn-review.123 ┄┄┄┄┄╯
The footer is italic + faint so it reads as a label baked into the
chrome, not a content row. innerWidth widens automatically if the
footer is longer than every content row, so the text never wraps.
Falls back to a plain dashed bottom border when sessionID == "".
Two follow-ups from the screenshot:
1. Bottom border overshoots the box.
The width math for "╰┄ <footer> ┄…╯" was off by 2 — the lead/trail
accounting double-counted the dashes. Spelt out the constraint:
top span = innerWidth
bottom span = lead(1) + " "(1) + footerLen + " "(1) + trail(N)
= footerLen + N + 3
align → N = innerWidth - footerLen - 3
And the auto-widen check now compares footerLen directly to
contentWidth (the constraint is contentWidth >= footerLen, derived
from innerWidth >= footerLen + 4). End result: every box now has
four perfectly aligned corners.
2. Live indicator needs breathing room.
renderChatSection now wraps the inline row with blank lines on
both sides ("", live, "") so it reads as its own block — clearly
apart from the active content above and the prompt below.
The first row (memory) sat against the top border and the last row sat against the bottom border with the footer. Add a one-line empty row (vertical sides + spaces) inside the box on both edges so the content breathes and the "gen --resume …" footer reads as part of the frame, not crowded by the row above it.
Two compactness wins per user feedback:
- Drop the top padding row. The first content row ("memory") now sits
directly under the top border; the horizontal gutter handles the
breathing room. The bottom padding row stays so the "gen --resume"
footer keeps its visual separation from the last content row.
- Shrink row indent from 4 spaces to 2 (" · " instead of " · ").
The bullet lands right under the kind sub-header instead of being
dragged two extra columns right, so the whole list pulls in.
Italic on a shell command makes it look ornamental rather than runnable. Keep TextDim + Faint for the muted-chrome feel but render it upright so the user reads it as 'a command I can paste', not 'a styled label'.
User confirmed it wasn't needed. The card is now as tight as it can be while still framed — no internal padding rows on either edge, only the horizontal gutter on each side. The last content row sits directly above the bottom border carrying "gen --resume <id>".
1. Blank row between kind groups. memory + its rows and skill + its
rows now sit in separate visual blocks with one empty padded row
between them — the eye reads them as two sections instead of one
long list.
2. Footer affordance. The "gen --resume <id>" footer gains a "↪ "
lead so it reads as a next-action rather than a passive label.
The bottom-border auto-widens slightly to keep the corners aligned.
3. Desaturated kind colors. memory (blue) and skill (purple) pulled
~15-20% toward gray:
memory dark #6BA4D6 → #82A0BA
memory light #1F77B4 → #487192
skill dark #B68EE0 → #A89AC4
skill light #7B2D8E → #745783
Still recognisable as blue / purple, but they blend with the
italic+TextDim aesthetic instead of fighting it.
Terminal can't give us fractional rows, so user feedback of "feels cramped, but don't add a whole row" gets addressed horizontally: an extra column of padding on each side lowers the perceived density without touching the row count. Last skill row keeps the same distance from the bottom border, but every line gets one column of extra breathing on each end.
User/linter inlined memory.go's fireWrite but left skill.go's call sites pointing at the free function — added the free-function fireWrite + str helpers local to skill.go and dropped the unused tool import. Build green; gutter widening change ships in this commit too (was rolled into the original push but blocked by the helper drift).
…elow The hand-built box was a hack: manual character placement, manual width math, off-by-one bugs already paid for. Drop it. - Box rendering goes back to lipgloss.Border with custom dashed glyphs and Padding(0, 2). lipgloss owns the width math; the corners can't drift apart. - "↪ gen --resume <id>" lives on its own line BELOW the box, indented to align with the first content column. No more cramming text onto the bottom border, no more auto-widening to fit the footer. - Blank line between kind groups is now just "\n\n" between the inner blocks — lipgloss propagates it. Same visual idea, half the code, zero geometry math.
- selflearn/reviewer.go: ReviewKind.String iterates bits instead of enumerating the 2-bit power set; inline Reviewer.run into the goroutine in Observe (its only caller). - app/selflearn_indicator.go: actionLabel collapses 3 identical switch arms into one if/else. - app/selflearn.go: memoryTopicName consults system.AutoMemoryIndexName instead of hardcoding "MEMORY"/"memory" literals. - app/input/config_panel.go: drop hand-rolled visibleWidth ANSI stripper in favor of lipgloss.Width. - skill/types.go: neutralize Skill.Origin doc — keep provenance semantics, drop L1/L2-specific narrative from the shared type.
- skill.go: use tool.GetString/GetBool instead of local str/boolOf helpers
- conv/message.go: drop the dead SelfLearnStatus status-bar path (field,
branch, renderSelfLearnIndicator, style, test) — superseded by the inline
renderSelfLearnLive indicator
- memory.go: drop redundant empty-content check in Add (scanContent covers
it); range with strings.SplitSeq in readEntries
- app: extract selflearn.review.{started,done,failed} event types into shared
consts (publisher/consumer agreement)
- kit/spinner.go: correct the BrailleSpinnerFrames doc (selflearn uses ASCII)
- fork: trimTrailing* also strips RoleTool so a trailing tool_result no longer puts two consecutive user-role messages on the wire (renamed to trimTrailingPendingMessages, test extended) - session: NewSidechainRecorder sets Sidechain: true so L1 fork messages stay out of the inspector's main-thread view - skill: joinFrontmatter returns body alone when input had no frontmatter (no more silent --- prepending on user-authored files); Create rejects empty descriptions (preserve UPDATE/CREATE disambiguation in Inventory); NewSkillManager leaves userDir empty on UserHomeDir failure and dirFor/resolve/Inventory guard against it instead of aliasing user-scope writes onto cwd/.gen/skills - app/selflearn: wireSelfLearn takes pendingSend so countUserTurns excludes the in-flight submit and the cadence seed no longer double-counts; failure path drains the indicator and publishes a partial recap so writes before a timeout still surface - app/update + input: ConfigSavedMsg re-wires the L1 reviewer when the agent is active (toggle was a silent no-op until next session start) and surfaces a notice when the just-saved Enabled flag is overridden by the other settings level (merger ORs across user+project)
- app/selflearn: add live.Load() guards before each phase mutation (BeginReview, the post-LLM Fail/Complete entry, the publish funcs) so a teardown during the 5-min fork window can't flash 'evolving' or a stale recap on a session the user just cleared - scan: tighten role_hijack regex to require an actual role keyword (admin/root/developer/jailbroken/in <X> mode/...) — the old unanchored `you are now ` matched benign English like "you are now in the repo root" and rejected legitimate memory entries on every retry - prompts: append a reviewToolScope clarifier so the fork's reviewer LLM knows only memory_write / skill_manage are available, despite the inherited (verbatim-for-cache-parity) system prompt advertising Bash/Read/Edit/etc. - scan + skill.Patch: introduce scanNewThreats(original, candidate) so a Patch only fails when it INTRODUCES a new threat pattern. A skill whose body legitimately quotes a threat-pattern substring (e.g. an injection-defense example) stays patchable; the previous scanForThreats(patched) bricked any such skill permanently - fixes_test: regression coverage for the regex tightening and the scanNewThreats semantics (clean patch, threat-removing patch, new attack family blocked, invisible-rune smuggling blocked)
The default cadence (10) was duplicated three ways: a magic literal in the /config panel, parallel Default* constants in the reviewer, and two near-identical helpers (defaultIfZero / positiveOr). Consolidate onto the existing setting-package resolver convention (SelfLearnMaxMemoryKB + ResolvedMaxKB): add SelfLearnDefaultEveryTurns / SelfLearnDefaultEveryToolIters constants and ResolvedEveryTurns / ResolvedEveryToolIters methods, apply them in ResolveSettings (mirroring ResolvedMaxKB), and drop the reviewer's Default* constants + positiveOr and the config panel's defaultIfZero. Also rename the MaxKBOr accessor to ResolvedMaxKB for a consistent, non-dangling name.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
L1 self-learning implementation for #52, the per-turn background reviewer
designed in PR #58. Wires the feature end-to-end: trigger core →
restricted fork → memory + skill stores → action-permission gates →
runtime UI surfaces.
What's in this PR
Core (
internal/selflearn/)Reviewer(per-turn trigger) — two arms with independent enable +cadence, single-flight drop-don't-reset, defensive snapshot copy before
the goroutine, hydration on session resume.
MemoryStore(memory_writetool surface) — project-partitionedstore under
~/.gen/projects/<encoded-cwd>/memory/, atomic temp+renamewrites, per-file 25 KB cap matching the injection cap (§4.2 invariant),
injection scanner rejecting prompt-injection / exfiltration payloads.
SkillManager(skill_managetool surface) — six actions (create,edit, patch, write_file, remove_file, delete) gated by Deny-encoded
permission flags + one advanced opt-in for patching user-authored
skills; injection scanner on bodies + descriptions + patch results +
support files.
RunReview(fork mechanics) — freshcore.Agentinheriting theparent's cached system prompt verbatim for prefix-cache parity (§6
invariant push #2), restricted toolset whitelist + action-level allowlist,
5-min wall-clock deadline, best-effort + recover, single-flight.
full §5.1 trigger conditions — UPDATE > DELETE > CREATE preference,
CREATE four required criteria, the "user voiced a correction" UPDATE
path, eviction-first memory framing with the cap-near retire loop.
ResolveSettingsbridgessetting.SelfLearn→ runtime values(Config, ActionPermissions, MemoryMaxChars) with Validate() up front
so an illegal §3.1 combination is rejected at startup.
Config (
internal/setting/)SelfLearnSettings+SelfLearnMemory+SelfLearnSkillstypeswith Deny-encoded permission booleans (zero ⇒ allow, Go idiom).
Validate()enforces the three illegal combinations from §3.1(no-update-with-create, no-delete-when-create+update,
advanced-opt-in-without-base).
UpdateSelfLearnAt(cfg, userLevel)persists the panel's editsthrough the existing atomic merge path.
Wire-up (
internal/app/)wireSelfLearnat session start; readssetting.SelfLearn, appliesthe
GEN_DISABLE_SELF_LEARNenv override (§3.1 escape hatch),builds stores + manager + reviewer, hooks write observers, captures
build params for cache-parity-preserving fork construction.
OnTurnEndforwards everyResultto the reviewer; cancelled /interrupted turns are silently skipped inside
Observe.StopAgentSessionclears the reviewer reference; in-flight goroutinesfinish on the 5-min deadline.
memory-autoreminder provider registered alongsidememory-userand
memory-projectso agent-written entries are injected as theirown scope and never mixed with user-authored
GEN.md/CLAUDE.md.User-visible surface (
internal/app/selflearn_ui.go+ UI hooks)SelfLearnUIStatefour-phase state machine (idle / reviewing / done/ failed) with target debounce and auto-decay holds (2s / 3s).
evolving <braille> <target>indicator using the sharedkit.BrailleSpinnerFramestable; hidden in both idle and disabledstates (status line is pixel-identical to a feature-off session).
calls captured via observers), not the model's narration — the recap
reads as structurally accurate data, not LLM self-report.
/configslash command opens a Self-Learning panel with editablerows for both arms, inline validation, scope toggle between user-wide
and project-local saves.
Tests
-raceclean acrossinternal/selflearn,internal/setting,internal/reminder,internal/app,internal/app/conv,internal/app/input,internal/app/kit.interrupted-turn skip, concurrency cap behaviour, restricted-toolset
enforcement, permission gate (every legal §5.5 combination +
startup-rejection of the three illegal ones),
allowUpdateUserCreatedscope extension, injection scan on every memory + skill write path,
recap-block formatting + verb mapping, state-machine transitions
including done-hold decay survival across
DrainActions, snapshotisolation under concurrent
Observe.What's NOT in this PR (per the design)
real-provider testing
Manual e2e
Not yet run against a real LLM. Reviewers / users who pull this branch
can exercise by setting in
settings.json:{ "selfLearn": { "memory": { "enabled": true, "everyTurns": 3, "maxKB": 25 }, "skills": { "enabled": true, "everyToolIters": 5 } } }Then watch for
evolvingin the bottom bar after 3 user turns / 5cumulative tool iters, and the recap block appearing in the
conversation flow on completion. Disk artifacts will land in
~/.gen/projects/<encoded-cwd>/memory/and~/.gen/skills/<name>/(or./.gen/skills/for project-level entries).Design alignment
Design note (#58, now in main) tracked at every step; cross-references
in code comments use the stable §-anchors. Highlights of where this PR
deviates from the merged design and why:
allowCreate / allowUpdate / allowDelete(
*bool) todenyCreate / denyUpdate / denyDelete(bool) — zerovalue should be the safe default. The design note was updated to
match in the same PR that landed the rename (commit
e2780e2).Inventory()reads SKILL.md directly from disk instead of viaskill.Registry. The registry is loaded once at startup and onlyrefreshed via
ReloadPluginBackedState; a registry-backed Inventorywould not see skills the reviewer created earlier in the same
session, causing it to re-create duplicates. The disk scan is the
correct freshness for this caller. Documented inline.
Closes #52 (implementation).
🤖 Generated with Claude Code