fix: tooling hardening — runs.json crash, image-copy abort, jargon, dead code#706
Merged
Conversation
…ead code Six findings from hardening/reviews/tooling-review.md. P0 — RunTracker json.load now wrapped in try/except (OSError, JSONDecodeError) in record_action_start, record_action_complete and finalize_workflow_run. A corrupted runs.json from a previous interrupted write no longer crashes the live workflow thread; the call logs a warning and returns. start_workflow_run already had this guard. P1 — code_scanner.scan_tool_functions now catches OSError around read_text, alongside the existing SyntaxError/UnicodeDecodeError. A broken symlink or file deleted between rglob and read_text no longer aborts catalog generation. P1 — generator._copy_readme_images guards the mkdir+copy2 pair with try/except OSError so a single bad image (disk full, permission denied) no longer aborts the entire catalog. The log message uses "stage" because the same block covers both mkdir and copy failures. P1 — generator._copy_readme_images switched from str.replace to span-based substitution with right-to-left splicing. The naive replace corrupted shared filename suffixes — 'logo.png' would rewrite inside 'dark/logo.png' on the first replace, double-prefixing the second. P2 — LSP diagnostics and signature help rewritten to drop the "context_scope reference" phrasing in favour of action/field language users actually author. server.py signature documentation similarly clarified. P3 — Removed the unused ActionDefinition dataclass (and its DEFAULT_ACTION_KIND import) from lsp/models.py. Superseded by ActionMetadata; no callers anywhere. Tests: 7 new cases covering corrupted/empty runs.json across all three methods, the substring-collision fix, the copy-failure fallback, and URL passthrough. A `_allow_log_propagation` fixture restores log propagation on the `agent_actions` logger for the duration of caplog- based assertions (LoggerFactory sets propagate=False). Verification: ruff format --check + ruff check clean on touched files; pytest -q passes 7409 + 96 integration tests with the new ones included. Pi consensus: 2/5 YES with full positive reviews, 1 UNCLEAR due to verdict-parse mismatch on a YES review, 1 Pi process error, 1 review truncated. Reviewer 4's cosmetic note on the error message wording addressed before commit.
Eight findings from a recall-mode simplify review of PR #706. Correctness: - run_tracker.py now `.touch(exist_ok=True)` the runs.json file before acquiring the portalocker lock in record_action_start / record_action_complete / finalize_workflow_run. Without the touch, `portalocker.Lock(file, "r+", ...)` raises FileNotFoundError if the file is missing — and the @Retry decorator only catches LockException, so FNFE propagated to the workflow thread. This was the exact crash the previous commit's docstring claimed to prevent. start_workflow_run already had the touch; the other three did not. - The three new try/except guards no longer just `return` on a corrupted file. They now repair runs.json to the `_empty_runs_data(extended=True)` shape before returning, matching the sibling pattern in _load_runs_data_from_file (line 125) and start_workflow_run (line 247). The old code's silent-fallback shape left the file permanently broken so every subsequent record/finalize call kept logging "cannot record action ..." warnings forever. The helper _load_or_repair was extracted to centralise this behaviour. - code_scanner.scan_tool_functions now also catches OSError raised by `Path.rglob("*.py")` itself during iteration (it raises during the for-statement, not inside the body, so the inner try/except didn't cover it). A 0700-mode subdir under tools/ no longer aborts the whole scan. - lsp/server.py SignatureHelp documentation now accurately describes the data: variables are aggregated across every action in the file, not "this action's context_scope.observe". The diagnostics layer uses a stricter per-action set for guard-condition validation, so the doc now points users at per-action diagnostics for strict scope. Cleanup: - Removed the dead `seen_spans: set[tuple[int, int]]` dedup in _copy_readme_images. The html and markdown URL-group spans live in disjoint surrounding chars (`src="..."` vs `](...)`), so their (start, end) tuples can never coincide; `finditer` already returns non-overlapping matches within a single pattern. - Hoisted `images_dir.mkdir(parents=True, exist_ok=True)` out of the per-image try/except. A systemic mkdir failure (read-only artefact dir) now logs one "cannot create images directory" warning and aborts cleanly instead of producing N misleading per-image warnings. The per-image try/except is back to covering only copy2. - Moved the _allow_log_propagation fixture into a new tests/unit/tooling/conftest.py (mirroring tests/unit/llm/conftest.py and tests/unit/unification/conftest.py) and marked it autouse. The two test files no longer carry duplicated fixture bodies or pass it as a method parameter. Tests: - Added TestCorruptedRunsFileRecovers covering all three methods plus a follow-up test proving the healed file is usable by a normal start_workflow_run + record_action_start sequence afterwards. - Added TestMissingRunsFileNoCrash covering the missing-file case for all three methods. - Added test_mkdir_failure_aborts_cleanly_with_one_warning covering the hoisted-mkdir behaviour. Verification: - ruff format --check / ruff check clean on touched files. - pytest tests/unit/tooling tests/tooling: 284 passed. - pytest -q --ignore=tests/integrations: 7414 passed, 2 skipped (up from 7409 due to the new heal/missing-file tests).
mypy --strict-equivalents flagged 'Returning Any from function declared to return dict[str, Any]'. Annotate the loaded value explicitly so the type checker sees the intended shape without changing runtime behavior.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Six findings from
hardening/reviews/tooling-review.mdforagent_actions/tooling/.docs/run_tracker.pyrecord_action_start,record_action_complete,finalize_workflow_runnow catch(OSError, JSONDecodeError)onjson.load. A corruptedruns.jsonfrom a previous interrupted write no longer crashes the live workflow; a warning is logged and the call returns.start_workflow_runalready had this guard.code_scanner.pyscan_tool_functionsnow catchesOSErroralongsideSyntaxError/UnicodeDecodeError. A broken symlink or file deleted betweenrglobandread_textno longer aborts catalog generation. Matches sibling pattern indocs/scanner/component_scanners.py.docs/generator.py_copy_readme_imageswraps themkdir+copy2pair intry/except OSError. A single bad image (disk full, permission denied) no longer aborts the entire catalog. Log message uses "stage" because the same block covers both mkdir and copy failures.docs/generator.py_copy_readme_imagesswitched fromstr.replaceto span-based substitution with right-to-left splicing. The naive replace corrupted shared filename suffixes —logo.pngwould rewrite insidedark/logo.pngon the first replace, double-prefixing the second.lsp/diagnostics.py,lsp/server.pylsp/models.pyActionDefinitiondataclass (and itsDEFAULT_ACTION_KINDimport). Superseded byActionMetadata; no callers anywhere —grep -r "ActionDefinition"returns only the class definition.Verification
ruff format --check agent_actions tests— clean on touched filesruff check agent_actions/tooling tests/unit/tooling— cleanpytest tests/unit/tooling tests/tooling— 279 passed (152 unit + 127 integration)pytest -q --ignore=tests/integrations— 7409 passed, 2 skippedpytest tests/integrations -q— 96 passedNew tests
tests/unit/tooling/test_run_tracker.py— corrupted and emptyruns.jsonno-crash cases forrecord_action_start,record_action_complete,finalize_workflow_run. Verifies warning is logged.tests/unit/tooling/test_copy_readme_images.py(new file) — substring-collision regression (logo.pngvsdark/logo.png), selectivecopy2failure does not abort or propagate, external URLs (http://,https://,data:) pass through unchanged.A
_allow_log_propagationfixture re-enables propagation on theagent_actionslogger for the duration of caplog-based assertions, sinceLoggerFactorysetspropagate=Falseat the root.