docs: implement self-improvement loop architecture by rachaelrenk · Pull Request #239 · warpdotdev/docs

rachaelrenk · 2026-06-19T00:17:35Z

Summary

Implements the self-improvement loop architecture for Warp docs content operations. This PR adds the signal infrastructure and outer loop skills that allow agent-authored docs to improve automatically from accumulated feedback.

Architecture

Four coordinated loops — three active now, one documented for future deployment:

- Loop 1 (Drafting improvement): Signal logs capture style lint violations, agent review patterns, and human reviewer feedback from every agent-authored PR. A monthly cloud agent (improve-drafting-skills) reads these logs and proposes targeted edits to drafting skills and templates.
- Loop 2 (Triage, future): Designed but deferred until the docs repo has higher external contribution volume.
- Loop 3 (404 → redirect fix): Phase 2 added to weekly-404-monitor. After posting the weekly Slack report, the agent now also proposes redirect entries for high-confidence uncovered 404 gaps.
- Loop 4 (AEO meta-improvement): A quarterly cloud agent (improve-aeo-crosslink-skill) reads the aeo_crosslink_audit run log and proposes improvements to the audit skill itself. Deploy in month 3, once aeo_crosslink_audit has 8+ run log entries.

Changes

New log files (`.agents/logs/`)

- style_lint_runs.jsonl: JSONL, one record per style lint run on an agent-authored PR
- pr_review_runs.md: Markdown log of review-docs-pr runs on agent-authored PRs
- human_review_feedback.jsonl: JSONL, human review comments and edits collected from merged agent PRs

New skills

- .agents/skills/improve-drafting-skills/SKILL.md: monthly outer loop (Loop 1)
- .agents/skills/improve-aeo-crosslink-skill/SKILL.md: quarterly outer loop (Loop 4)

Modified skills (additive changes only)

- .agents/skills/draft_docs/SKILL.md: step 8 extended to append a violation record to style_lint_runs.jsonl on cloud agent runs
- .agents/skills/review-docs-pr/SKILL.md: new Signal logging section that appends a summary entry to pr_review_runs.md after reviewing an agent-authored PR
- .agents/skills/weekly-404-monitor/SKILL.md: new Phase 2 section with a redirect drafter, confidence scoring, and a draft PR for HIGH-confidence matches
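
As an illustration of the Phase 2 redirect drafter, here is a minimal sketch in Python. The PR does not say how confidence is scored, so the string-similarity scoring, the `HIGH_CONFIDENCE` value, and the function names are all assumptions made for this example. The output uses the `source` / `destination` / `permanent` shape of `redirects` entries in vercel.json.

```python
import json
from difflib import SequenceMatcher

# Hypothetical threshold; the PR does not state how HIGH confidence is defined.
HIGH_CONFIDENCE = 0.85


def best_match(missing_path, live_paths):
    """Return (closest live path, similarity score in [0, 1]) for a 404 path."""
    scored = [(SequenceMatcher(None, missing_path, p).ratio(), p) for p in live_paths]
    score, path = max(scored)
    return path, score


def draft_redirects(missing_paths, live_paths):
    """Build vercel.json-style redirect entries for HIGH-confidence matches only."""
    if not live_paths:
        return []  # nothing to redirect to
    redirects = []
    for missing in missing_paths:
        target, score = best_match(missing, live_paths)
        if score >= HIGH_CONFIDENCE:
            redirects.append({"source": missing, "destination": target, "permanent": True})
    return redirects


if __name__ == "__main__":
    # Example paths are made up for illustration.
    live = ["/agents/skills/overview", "/terminal/blocks"]
    print(json.dumps(draft_redirects(["/agents/skill/overview", "/zzz"], live), indent=2))
```

Low-scoring paths are simply left out, which matches the description's intent that only high-confidence gaps reach the draft PR.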

Next steps (after merge)

Two new Oz scheduled agents need to be configured in the Oz web app:

- improve-drafting-skills: monthly, first Monday of each month at 9am PT
- improve-aeo-crosslink-skill: quarterly, first Monday of Jan/Apr/Jul/Oct (start in month 3, when the run log has 8+ entries)
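
The calendar rule behind these two schedules can be checked with a short Python sketch. It only illustrates the "first Monday" logic; it is not Oz scheduler configuration, it ignores the 9am PT time of day, and the function names are made up for this example.

```python
import datetime as dt

QUARTER_MONTHS = (1, 4, 7, 10)  # Jan/Apr/Jul/Oct, per the schedule above


def first_monday(year, month):
    """Return the date of the first Monday in the given month."""
    first = dt.date(year, month, 1)
    # weekday(): Monday is 0, so this is the number of days until the next Monday.
    return first + dt.timedelta(days=(7 - first.weekday()) % 7)


def is_monthly_run_day(day):
    """True on the first Monday of any month."""
    return day == first_monday(day.year, day.month)


def is_quarterly_run_day(day):
    """True on the first Monday of a quarter-start month."""
    return day.month in QUARTER_MONTHS and is_monthly_run_day(day)
```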

weekly-404-monitor already runs as a scheduled agent — no new agent is needed; Phase 2 runs within the same existing agent.

Architecture plan: https://staging.warp.dev/drive/notebook/LiSAdtZGryD78gSNj5kGPx
Conversation: https://staging.warp.dev/conversation/652a054e-d757-4632-8554-5176f5529ee2

Co-Authored-By: Oz <oz-agent@warp.dev>

- Add three signal log files: style_lint_runs.jsonl, pr_review_runs.md, human_review_feedback.jsonl
- Extend draft_docs/SKILL.md step 8 to append style lint violation records to style_lint_runs.jsonl on agent-authored PRs
- Extend review-docs-pr/SKILL.md to append review summaries to pr_review_runs.md
- Add improve-drafting-skills skill: monthly outer loop that reads all three logs and proposes targeted edits to skills/templates
- Add Phase 2 redirect-drafter to weekly-404-monitor: auto-drafts vercel.json redirects for high-confidence uncovered 404 gaps
- Add improve-aeo-crosslink-skill: quarterly outer loop that reads aeo_crosslink_audit_runs.md and proposes edits to the audit skill

Co-Authored-By: Oz <oz-agent@warp.dev>

vercel · 2026-06-19T00:17:39Z

Project	Deployment	Actions	Updated (UTC)
docs	Ready	Preview, Comment	Jun 19, 2026 4:55pm

oz-for-oss · 2026-06-19T00:21:25Z

@rachaelrenk

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

oz-for-oss

Overview

This PR adds self-improvement loop infrastructure for docs operations: new signal logs, new outer-loop skills, and updates to drafting/review/404-monitor skills.

Concerns

- The new JSONL log files are seeded with # comment rows, which conflicts with the documented one-record-per-line JSONL format and can break consumers before any real signal is recorded.
- Several logging steps require automated agents to commit directly to main; that either bypasses the normal PR review path when credentials allow it, or silently loses the durable signal when branch protection blocks the push.
- The human feedback collector misses inline review comments and attributes the full PR diff as human edits, so the primary training signal can be incomplete and polluted with agent-authored content.

Security

The outer-loop skills consume human comments, PR text, and run logs as inputs for editing skills, but they do not define a prompt-injection boundary for treating that content strictly as data.

Spec alignment

No approved or repository spec context was provided for this PR, so there were no spec-alignment findings.

Verdict

Found: 0 critical, 5 important, 0 suggestions

Request changes

oz-for-oss · 2026-06-19T00:24:24Z

@@ -0,0 +1,4 @@
+# Human review feedback log — one JSON record per line, appended by the feedback collector step.


⚠️ [IMPORTANT] These # header rows make the file invalid JSONL, so any line-by-line JSON parser in improve-drafting-skills will fail before it sees real feedback. Keep JSONL logs empty until the first record or explicitly require consumers to skip comment lines; the same applies to style_lint_runs.jsonl.
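
To make the failure concrete, here is a minimal Python sketch of a line-by-line JSONL reader. The function name and the `skip_comments` option are hypothetical and not taken from improve-drafting-skills; the sketch only shows why a `#` header line breaks a strict parser and what the "skip comment lines" alternative would look like.

```python
import json


def read_jsonl(text, skip_comments=False):
    """Parse JSONL text into a list of records.

    With skip_comments=False this is a strict reader: any non-JSON line,
    including a '#' header, raises json.JSONDecodeError.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, including an empty file
        if skip_comments and line.startswith("#"):
            continue
        records.append(json.loads(line))
    return records
```

An empty file parses to an empty list in either mode, which is why leaving the logs empty until the first record is the simpler fix.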

oz-for-oss · 2026-06-19T00:24:24Z

+   ```json
+   {"date":"YYYY-MM-DD","pr":"NNN","branch":"BRANCH_NAME","authored_by":"agent","skill_used":"SKILL_NAME","files_scanned":N,"violations":{"check_name":count}}
+   ```
+4. From a clean checkout or worktree based on the latest `main`, stage only `.agents/logs/style_lint_runs.jsonl` and commit directly to `main`:


⚠️ [IMPORTANT] Committing log updates directly to main makes the signal path depend on agents having default-branch push rights and bypasses the normal PR review path when they do. Use a draft log PR or an artifact consumed by a scheduled collector instead; this pattern also appears in the new review/feedback logging steps.

oz-for-oss · 2026-06-19T00:24:24Z

+Before reading the logs, run the feedback collector to capture any merged agent-authored PRs from the past 30 days that have not yet been logged to `human_review_feedback.jsonl`:
+
+1. Use `gh pr list --repo warpdotdev/docs --state merged --label oz-agent` or search for PRs with `oz-agent@warp.dev` as a commit author in the past 30 days.
+2. For each such PR, use `gh pr view NNN --json reviews,comments` to extract human review comments and verdicts.


⚠️ [IMPORTANT] gh pr view --json reviews,comments does not collect inline review comments, which are the main [skill-feedback] signal this loop is designed to learn from. Fetch review threads or pull review comments as well before appending human_review_feedback.jsonl records.
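
One way to pick up inline comments is the GitHub REST endpoint that lists review comments on a pull request, called through `gh api`. The Python sketch below assumes an authenticated `gh` on the PATH; the helper names are hypothetical. With `--paginate`, `gh api` can print one JSON array per page, so the parser accepts concatenated arrays.

```python
import json
import subprocess

REPO = "warpdotdev/docs"


def inline_comments_cmd(pr_number):
    """Build the gh command for the 'list review comments on a pull request' endpoint."""
    return ["gh", "api", "--paginate", f"repos/{REPO}/pulls/{pr_number}/comments"]


def parse_pages(text):
    """Parse one or more concatenated JSON arrays (one per page) into a flat list."""
    decoder = json.JSONDecoder()
    items, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        page, pos = decoder.raw_decode(text, pos)
        items.extend(page)
    return items


def human_comments(comments):
    """Drop comments authored by bots (REST user objects carry a 'type' field)."""
    return [c for c in comments if (c.get("user") or {}).get("type") != "Bot"]


def fetch_inline_comments(pr_number):
    """Run gh and return human-authored inline review comments."""
    result = subprocess.run(
        inline_comments_cmd(pr_number), check=True, capture_output=True, text=True
    )
    return human_comments(parse_pages(result.stdout))
```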

oz-for-oss · 2026-06-19T00:24:24Z

+
+1. Use `gh pr list --repo warpdotdev/docs --state merged --label oz-agent` or search for PRs with `oz-agent@warp.dev` as a commit author in the past 30 days.
+2. For each such PR, use `gh pr view NNN --json reviews,comments` to extract human review comments and verdicts.
+3. Also run `git diff MERGE_BASE..PR_HEAD -- src/content/docs/` to capture human follow-up edits made to the branch after the agent's last commit.


⚠️ [IMPORTANT] MERGE_BASE..PR_HEAD captures the whole PR diff, not just human edits after the agent's last commit, so the collector will log agent-authored content as human feedback. Diff from the last bot-authored commit to PR_HEAD or classify commits by author before writing human_edit records.
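
The suggested fix can be sketched in Python as a small helper that finds the last bot-authored commit and returns a diff range starting there. The input format and function name are assumptions; the commit list could come from something like `git log --reverse --format='%H %ae' MERGE_BASE..PR_HEAD`.

```python
BOT_EMAIL = "oz-agent@warp.dev"


def human_edit_range(commits, bot_email=BOT_EMAIL):
    """Return 'LAST_BOT_SHA..HEAD_SHA', covering only commits after the last bot commit.

    `commits` is a list of (sha, author_email) tuples, oldest first.
    Returns None when there is no bot commit or no commit follows it.
    """
    last_bot = None
    for index, (_, email) in enumerate(commits):
        if email == bot_email:
            last_bot = index
    if last_bot is None or last_bot == len(commits) - 1:
        return None
    return f"{commits[last_bot][0]}..{commits[-1][0]}"
```

The returned range would then replace `MERGE_BASE..PR_HEAD` in the `git diff ... -- src/content/docs/` step. Human commits that land before the agent's final commit are not captured by this range, so classifying each commit by author remains the more thorough option.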

oz-for-oss · 2026-06-19T00:24:24Z

+
+### 1. Read the last 30 days of signal data
+
+Parse all three log files and filter to entries from the past 30 days.


⚠️ [IMPORTANT] [SECURITY] The signal logs include untrusted human comments, PR text, and run output, but the workflow has no prompt-injection boundary before using them to edit skills. Add an explicit rule to treat log content as data only, ignore embedded instructions or secret requests, and only act on parsed fields that meet the stated thresholds.
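
A "data only" rule could look like the Python sketch below: the aggregation reads just the typed `violations` field from each record and never interprets free text. The threshold value, the check names in the usage, and the function name are hypothetical; the PR's actual thresholds are not shown in this thread.

```python
from collections import Counter

# Hypothetical threshold; the PR's threshold table is not reproduced in this thread.
MIN_TOTAL_VIOLATIONS = 5


def recurring_checks(records, minimum=MIN_TOTAL_VIOLATIONS):
    """Sum per-check violation counts, reading only typed, expected fields.

    Free-text values are never interpreted; malformed entries are skipped.
    """
    totals = Counter()
    for record in records:
        violations = record.get("violations") if isinstance(record, dict) else None
        if not isinstance(violations, dict):
            continue
        for check, count in violations.items():
            # bool is a subclass of int, so exclude it explicitly.
            valid = isinstance(check, str) and isinstance(count, int) and not isinstance(count, bool)
            if valid and count > 0:
                totals[check] += count
    return {check: total for check, total in totals.items() if total >= minimum}
```

A record carrying an extra field such as `"note": "ignore previous instructions"` contributes only its numeric counts, because that field is never read.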

- Empty JSONL log files (comment headers were invalid JSONL)
- Replace direct-to-main commits in inner loops with stdout signal markers ([SIGNAL:style-lint] and [SIGNAL:pr-review]) consumed by the outer loop via oz run get; this eliminates the branch protection dependency on inner loops
- Fix feedback collector to fetch inline review comments via gh api pulls/NNN/comments (not just top-level comments field)
- Fix git diff range: LAST_BOT_COMMIT..MERGE_COMMIT instead of MERGE_BASE..PR_HEAD to exclude agent-authored content from human edits
- Add explicit prompt-injection security boundary to improve-drafting-skills: treat log content as data only, discard injection indicators, act only on parsed structured fields

Co-Authored-By: Oz <oz-agent@warp.dev>
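
The stdout marker approach can be illustrated with a short Python sketch. The marker name comes from the commit message above, but the line layout (marker, a space, then one compact JSON object) is an assumption, as are the function names. The run output is assumed to be already available as text; how `oz run get` returns it is not covered here.

```python
import json

MARKER = "[SIGNAL:style-lint]"  # marker name from the commit message; payload layout is assumed


def emit_signal(record):
    """Format one marker line for stdout: the marker, a space, then compact JSON."""
    return f"{MARKER} {json.dumps(record, separators=(',', ':'))}"


def collect_signals(run_output):
    """Extract records from run output, skipping lines that are not well-formed markers."""
    records = []
    for line in run_output.splitlines():
        line = line.strip()
        if not line.startswith(MARKER):
            continue
        try:
            payload = json.loads(line[len(MARKER):])
        except json.JSONDecodeError:
            continue  # marker present but payload is not valid JSON
        if isinstance(payload, dict):
            records.append(payload)
    return records
```

Because the inner loop only prints, it needs no push rights, and the outer loop decides what, if anything, gets committed.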

rachaelrenk · 2026-06-19T00:53:28Z

/oz-review

oz-for-oss · 2026-06-19T00:53:41Z

@rachaelrenk

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

oz-for-oss

Overview

This PR adds documentation and skill instructions for self-improvement loops around docs drafting, PR review, 404 redirects, and AEO cross-link audits. The architecture is directionally coherent, but several changed instructions disagree about where signals are written/read, which would cause the monthly loop to miss automated feedback. I could not inspect the contents of .agents/logs/human_review_feedback.jsonl or .agents/logs/style_lint_runs.jsonl because the attached diff reports those patches as unavailable, so any concerns about those files are summarized rather than commented inline.

Concerns

- The new pr_review_runs.md log describes a committed log writer, but the updated review skill only emits a stdout marker.
- improve-drafting-skills still instructs the monthly analysis to parse log files for automated signals even though the same skill now says those signals are only collected from Oz run output.
- Raw human review comment text is committed to main without an explicit pre-append injection/secret filtering step.
- The new review signal is described as running after submission, but the skill workflow produces review.json for the control plane to publish, so the marker may never be emitted.

Verdict

Found: 0 critical, 4 important, 0 suggestions

Request changes

oz-for-oss · 2026-06-19T00:56:24Z

@@ -0,0 +1,17 @@
+# PR review run log
+
+New entries are prepended by each `review-docs-pr` agent run on an agent-authored PR. Most recent entry first.


⚠️ [IMPORTANT] This log is never written by the updated review-docs-pr skill; it now emits [SIGNAL:pr-review] to stdout, so either wire the skill to prepend this file or remove the committed log stub to avoid a dead signal source.

oz-for-oss · 2026-06-19T00:56:24Z

+   - Set `tag` to the prefix found in the comment (`[skill-feedback]`, `[template-feedback]`, `[style-rule-gap]`) or `""` if none.
+   - Set `feedback_type` to `"review_comment"`, `"human_edit"`, or `"review_verdict"`.
+   - **Skip** comments from `oz-agent@warp.dev`, `vercel`, `github-actions`, or any other bot actor (check the author login or `authorAssociation`).
+5. Append accepted records to `.agents/logs/human_review_feedback.jsonl` and commit directly to `main` as part of this monthly outer loop run:


⚠️ [IMPORTANT] [SECURITY] This commits raw human review text to main, but injection/secret filtering is only described later for analysis; filter/redact records before appending so malicious comments or accidental sensitive data are not persisted.
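
A pre-append filter might look like the Python sketch below. Everything here is illustrative: the `text` field name is hypothetical (the excerpt above only shows `tag` and `feedback_type`), and the regular expressions are examples, not a vetted secret or injection detector.

```python
import re

# Illustrative patterns only; a real deployment would need a vetted list.
SECRET_PATTERNS = [
    re.compile(r"gh[pousr]_[A-Za-z0-9]{20,}"),  # GitHub-style token prefixes
    re.compile(r"(?i)\b(?:api[_-]?key|password|secret)\s*[:=]\s*\S+"),
]
INJECTION_HINTS = re.compile(
    r"(?i)ignore (?:all )?(?:previous|prior) instructions|system prompt"
)


def sanitize(record):
    """Return a redacted copy of the record, or None if it looks like an injection attempt."""
    text = record.get("text", "")
    if INJECTION_HINTS.search(text):
        return None  # drop the record entirely rather than persist it
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return {**record, "text": text}
```

Running this before the append step means rejected or redacted content never reaches `human_review_feedback.jsonl`, which is the ordering the comment asks for.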

oz-for-oss · 2026-06-19T00:56:24Z

+
+## Workflow
+
+### 1. Read the last 30 days of signal data


⚠️ [IMPORTANT] This step still says to parse all three log files, but style-lint and PR-review signals are collected from Oz run output in Step A and not written to two of those files; update the analysis to use the in-memory Step A records or the monthly loop will miss automated signals.
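
The corrected input step could be sketched in Python as follows: filter both sources to the last 30 days, then hand the analysis one combined structure. The function names are hypothetical, and the sketch assumes every record carries a `date` field in `YYYY-MM-DD` form, as the style-lint record shown earlier in this thread does.

```python
import datetime as dt


def recent(records, today, days=30):
    """Keep records whose 'date' (YYYY-MM-DD) falls within the last `days` days of `today`."""
    cutoff = today - dt.timedelta(days=days)
    kept = []
    for record in records:
        try:
            when = dt.date.fromisoformat(record["date"])
        except (KeyError, TypeError, ValueError):
            continue  # skip records with a missing or malformed date
        if cutoff <= when <= today:
            kept.append(record)
    return kept


def monthly_inputs(step_a_records, feedback_records, today):
    """Combine in-memory Step A signals with on-disk human feedback for analysis."""
    return {
        "automated": recent(step_a_records, today),
        "human": recent(feedback_records, today),
    }
```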

oz-for-oss · 2026-06-19T00:56:24Z

+
+## Signal logging
+
+After submitting the PR review, emit a summary record for the `improve-drafting-skills` outer loop. Apply this step only when reviewing an agent-authored PR (branch created by a drafting skill, or commit author is `oz-agent@warp.dev`).


⚠️ [IMPORTANT] This runs after the review is submitted, but this skill's workflow only creates review.json for the control plane to publish; emit the signal immediately after creating/validating review.json so the marker appears in the run output.

- Fix pr_review_runs.md header: now correctly states it is written by the outer loop (improve-drafting-skills), not by review-docs-pr
- Add Step A.5: outer loop writes parsed [SIGNAL:pr-review] records to pr_review_runs.md as a human-readable audit trail, closing the dead signal source
- Pre-append security filtering in Step B: injection detection and secret redaction happen before records reach human_review_feedback.jsonl
- Workflow step 1: now reads in-memory Step A records + on-disk human_review_feedback.jsonl; no longer references non-existent files
- Update threshold table: source labels now say 'Step A in-memory records'
- review-docs-pr signal logging: emit after validating review.json (not 'after submitting') so the marker appears in Oz run output regardless of how the review is published

Co-Authored-By: Oz <oz-agent@warp.dev>

cla-bot Bot added the cla-signed label Jun 19, 2026

rachaelrenk self-assigned this Jun 19, 2026

vercel Bot deployed to Preview June 19, 2026 00:20 View deployment

rachaelrenk marked this pull request as ready for review June 19, 2026 00:21

oz-for-oss Bot reviewed Jun 19, 2026

rachaelrenk changed the title ~~feat(skills): implement self-improvement loop architecture~~ docs: implement self-improvement loop architecture Jun 19, 2026

vercel Bot deployed to Preview June 19, 2026 00:44 View deployment

oz-for-oss Bot reviewed Jun 19, 2026

vercel Bot deployed to Preview June 19, 2026 16:55 View deployment
