fix: tooling hardening — runs.json crash, image-copy abort, jargon, dead code by Muizzkolapo · Pull Request #706 · Muizzkolapo/agent-actions

Muizzkolapo · 2026-06-19T00:01:37Z

Summary

Six findings from hardening/reviews/tooling-review.md for agent_actions/tooling/.

#	Severity	File	Fix
1	P0 — HIGH	`docs/run_tracker.py`	`record_action_start`, `record_action_complete`, `finalize_workflow_run` now catch `(OSError, JSONDecodeError)` on `json.load`. A corrupted `runs.json` from a previous interrupted write no longer crashes the live workflow; a warning is logged and the call returns. `start_workflow_run` already had this guard.
2	P1 — MEDIUM	`code_scanner.py`	`scan_tool_functions` now catches `OSError` alongside `SyntaxError`/`UnicodeDecodeError`. A broken symlink or file deleted between `rglob` and `read_text` no longer aborts catalog generation. Matches sibling pattern in `docs/scanner/component_scanners.py`.
3	P1 — MEDIUM	`docs/generator.py`	`_copy_readme_images` wraps the `mkdir`+`copy2` pair in `try/except OSError`. A single bad image (disk full, permission denied) no longer aborts the entire catalog. Log message uses "stage" because the same block covers both mkdir and copy failures.
4	P1 — MEDIUM	`docs/generator.py`	`_copy_readme_images` switched from `str.replace` to span-based substitution with right-to-left splicing. The naive replace corrupted shared filename suffixes — `logo.png` would rewrite inside `dark/logo.png` on the first replace, double-prefixing the second.
5	P2 — LOW	`lsp/diagnostics.py`, `lsp/server.py`	Rewrote IDE diagnostic messages: dropped "context_scope reference" framework jargon in favour of action/field language users actually author. Signature-help documentation similarly clarified.
6	P3 — LOW	`lsp/models.py`	Removed unused `ActionDefinition` dataclass (and its `DEFAULT_ACTION_KIND` import). Superseded by `ActionMetadata`; no callers anywhere — `grep -r "ActionDefinition"` returns only the class definition.

Verification

ruff format --check agent_actions tests — clean on touched files
ruff check agent_actions/tooling tests/unit/tooling — clean
pytest tests/unit/tooling tests/tooling — 279 passed (152 unit + 127 integration)
pytest -q --ignore=tests/integrations — 7409 passed, 2 skipped
pytest tests/integrations -q — 96 passed
Pi consensus (5 reviewers): 2 confirmed YES with full positive reviews, 1 UNCLEAR due to verdict-line parse mismatch on a positive review, 1 Pi process error, 1 truncated. Reviewer 4's cosmetic note on the error message wording ("Failed to copy" when block also covers mkdir) addressed before commit by switching to "Failed to stage".

New tests

tests/unit/tooling/test_run_tracker.py — corrupted and empty runs.json no-crash cases for record_action_start, record_action_complete, finalize_workflow_run. Verifies warning is logged.

tests/unit/tooling/test_copy_readme_images.py (new file) — substring-collision regression (logo.png vs dark/logo.png), selective copy2 failure does not abort or propagate, external URLs (http://, https://, data:) pass through unchanged.

A _allow_log_propagation fixture re-enables propagation on the agent_actions logger for the duration of caplog-based assertions, since LoggerFactory sets propagate=False at the root.

…ead code Six findings from hardening/reviews/tooling-review.md. P0 — RunTracker json.load now wrapped in try/except (OSError, JSONDecodeError) in record_action_start, record_action_complete and finalize_workflow_run. A corrupted runs.json from a previous interrupted write no longer crashes the live workflow thread; the call logs a warning and returns. start_workflow_run already had this guard. P1 — code_scanner.scan_tool_functions now catches OSError around read_text, alongside the existing SyntaxError/UnicodeDecodeError. A broken symlink or file deleted between rglob and read_text no longer aborts catalog generation. P1 — generator._copy_readme_images guards the mkdir+copy2 pair with try/except OSError so a single bad image (disk full, permission denied) no longer aborts the entire catalog. The log message uses "stage" because the same block covers both mkdir and copy failures. P1 — generator._copy_readme_images switched from str.replace to span-based substitution with right-to-left splicing. The naive replace corrupted shared filename suffixes — 'logo.png' would rewrite inside 'dark/logo.png' on the first replace, double-prefixing the second. P2 — LSP diagnostics and signature help rewritten to drop the "context_scope reference" phrasing in favour of action/field language users actually author. server.py signature documentation similarly clarified. P3 — Removed the unused ActionDefinition dataclass (and its DEFAULT_ACTION_KIND import) from lsp/models.py. Superseded by ActionMetadata; no callers anywhere. Tests: 7 new cases covering corrupted/empty runs.json across all three methods, the substring-collision fix, the copy-failure fallback, and URL passthrough. A `_allow_log_propagation` fixture restores log propagation on the `agent_actions` logger for the duration of caplog- based assertions (LoggerFactory sets propagate=False). Verification: ruff format --check + ruff check clean on touched files; pytest -q passes 7409 + 96 integration tests with the new ones included. Pi consensus: 2/5 YES with full positive reviews, 1 UNCLEAR due to verdict-parse mismatch on a YES review, 1 Pi process error, 1 review truncated. Reviewer 4's cosmetic note on the error message wording addressed before commit.

@Retry

Eight findings from a recall-mode simplify review of PR #706. Correctness: - run_tracker.py now `.touch(exist_ok=True)` the runs.json file before acquiring the portalocker lock in record_action_start / record_action_complete / finalize_workflow_run. Without the touch, `portalocker.Lock(file, "r+", ...)` raises FileNotFoundError if the file is missing — and the @Retry decorator only catches LockException, so FNFE propagated to the workflow thread. This was the exact crash the previous commit's docstring claimed to prevent. start_workflow_run already had the touch; the other three did not. - The three new try/except guards no longer just `return` on a corrupted file. They now repair runs.json to the `_empty_runs_data(extended=True)` shape before returning, matching the sibling pattern in _load_runs_data_from_file (line 125) and start_workflow_run (line 247). The old code's silent-fallback shape left the file permanently broken so every subsequent record/finalize call kept logging "cannot record action ..." warnings forever. The helper _load_or_repair was extracted to centralise this behaviour. - code_scanner.scan_tool_functions now also catches OSError raised by `Path.rglob("*.py")` itself during iteration (it raises during the for-statement, not inside the body, so the inner try/except didn't cover it). A 0700-mode subdir under tools/ no longer aborts the whole scan. - lsp/server.py SignatureHelp documentation now accurately describes the data: variables are aggregated across every action in the file, not "this action's context_scope.observe". The diagnostics layer uses a stricter per-action set for guard-condition validation, so the doc now points users at per-action diagnostics for strict scope. Cleanup: - Removed the dead `seen_spans: set[tuple[int, int]]` dedup in _copy_readme_images. The html and markdown URL-group spans live in disjoint surrounding chars (`src="..."` vs `](...)`), so their (start, end) tuples can never coincide; `finditer` already returns non-overlapping matches within a single pattern. - Hoisted `images_dir.mkdir(parents=True, exist_ok=True)` out of the per-image try/except. A systemic mkdir failure (read-only artefact dir) now logs one "cannot create images directory" warning and aborts cleanly instead of producing N misleading per-image warnings. The per-image try/except is back to covering only copy2. - Moved the _allow_log_propagation fixture into a new tests/unit/tooling/conftest.py (mirroring tests/unit/llm/conftest.py and tests/unit/unification/conftest.py) and marked it autouse. The two test files no longer carry duplicated fixture bodies or pass it as a method parameter. Tests: - Added TestCorruptedRunsFileRecovers covering all three methods plus a follow-up test proving the healed file is usable by a normal start_workflow_run + record_action_start sequence afterwards. - Added TestMissingRunsFileNoCrash covering the missing-file case for all three methods. - Added test_mkdir_failure_aborts_cleanly_with_one_warning covering the hoisted-mkdir behaviour. Verification: - ruff format --check / ruff check clean on touched files. - pytest tests/unit/tooling tests/tooling: 284 passed. - pytest -q --ignore=tests/integrations: 7414 passed, 2 skipped (up from 7409 due to the new heal/missing-file tests).

mypy --strict-equivalents flagged 'Returning Any from function declared to return dict[str, Any]'. Annotate the loaded value explicitly so the type checker sees the intended shape without changing runtime behavior.

Muizzkolapo added 3 commits June 19, 2026 01:01

fix(tooling): narrow json.load return type in _load_or_repair

3a470c6

mypy --strict-equivalents flagged 'Returning Any from function declared to return dict[str, Any]'. Annotate the loaded value explicitly so the type checker sees the intended shape without changing runtime behavior.

Muizzkolapo merged commit 55b3124 into main Jun 19, 2026
5 checks passed

github-actions Bot locked and limited conversation to collaborators Jun 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: tooling hardening — runs.json crash, image-copy abort, jargon, dead code#706

fix: tooling hardening — runs.json crash, image-copy abort, jargon, dead code#706
Muizzkolapo merged 3 commits into
mainfrom
fix/tooling-hardening

Muizzkolapo commented Jun 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Muizzkolapo commented Jun 19, 2026

Summary

Verification

New tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant