feat(triage): annotation-backlog triage (pull → score → review/unlabel) + viewer mode + DVC sharding by Chouffe · Pull Request #63 · pyronear/temporal-model

Chouffe · 2026-06-16T16:25:25Z

What

Adds triage/ — a new sibling package that shrinks the pyro-annotator's
annotation backlog. It pulls the unannotated queue (processing_stage=ready_to_annotate)
read-only, scores every sequence with the temporal smoke classifier in-process,
and splits it at a threshold (default 0.35) into:

To Review (score ≥ 0.35) — browsed locally in the viewer
Unlabel (score < 0.35) — a read-only worklist (ids + ready-to-send bulk
payload) to later mark as the unlabeled false-positive type

triage never writes to the annotator (enforced by test).

Run against prod: 21,489 sequences scored → 16,348 To Review / 5,141 Unlabel,
shared on S3 and verified pullable from a clean clone.

Package (`temporal_model.triage`)

annotator_api.py — read-only HTTP client (GET-only + the login POST; no
patch/put/delete method exists).
pull.py — incremental pull with parallel frame downloads (--workers) and
across-sequence concurrency (--seq-workers).
score.py — in-process core.predict(), sequence score = max kept-tube
probability, bucket at threshold; progress heartbeat.
report.py — eval-viewer contract + results.parquet + worklists; tags every
prediction with the release model_version (e.g. 0.2.0).
shards.py — pack/unpack: the loose store (~247k files) + report (~43k
files) would be a DVC object explosion, so per-sequence data is bundled into
~36 tar objects. Frames (immutable, append-only) and predictions (per-run)
are separate shard sets, so re-scores never re-pack the 26 GB of frames.
cli.py — pull / score / pack / unpack.

Viewer (`viewer/`)

Triage mode (detected via triage_bucket): To Review / Unlabel cards, a
clickable per-organization breakdown, a threshold slider + clickable sweep
table (smooth on 21k rows via useDeferredValue + React.memo), a triage
score column, and the correctness column/slider hidden (no ground truth). Shared
monitor/eval paths untouched.

Data sharing (DVC)

Single tracked artifact data/02_shards (dvc add); no dvc.yaml pipeline (the
247k-file store is impractical as a stage dep). Consumer flow: dvc pull →
unpack → view — no annotator creds, model, GPU, or Docker.

Docs

triage/README.md (producer + verified consumer flow), docs/specs/2026-06-16-triage-design.md,
docs/specs/2026-06-16-triage-sharding-design.md; root README + Makefile wired.

Testing

27 offline Python tests (mocked HTTP, fake store, stub model — no network/Docker)

48 viewer tests; lint clean. The full consumer flow (clone → install → dvc pull
→ unpack → viewer serving 21k with frames) verified end-to-end from a clean clone.

….org The deployed annotator pre-creates a SequenceAnnotation record at ready_to_annotate for the whole human queue, so has_annotation=false is near-empty (2) while ready_to_annotate is the real backlog (21,489, matching the UI). Parameterize the pull filter (--stage, default ready_to_annotate) and correct the API host (annotationapi.pyronear.org; annotator.* is the SPA).

Per-sequence frames are fetched+downloaded concurrently (--workers, default 16) via a ThreadPoolExecutor; the client mounts a larger urllib3 pool so concurrent signed-URL fetches don't exhaust connections. pull logs a progress line every 25 sequences. Verified no API/object-store rate limiting (conc=40 all 200s).

Adds a columnar results.parquet twin of results.json (matches eval) for analytical reuse. Tracks the 500-sequence ready_to_annotate store + scored report in DVC (cache:true report, like monitor) — data lives in the S3 remote s3://pyro-vision-rd/dvc/temporal-model/triage/, only pointers in git.

Process seq_workers sequences in parallel (ThreadPoolExecutor), each still fetching its frames with the per-sequence workers pool. Total in-flight ~= seq_workers*workers. Cuts the full backlog pull from ~7h to ~1h. Default stays serial (seq_workers=1) for backward compatibility.

Detect triage trees via triage_bucket and render TriageCards (Review/Unlabel split + bars, clickable per-organization breakdown, triage explainer). Table gains a bucket + triage_score column and drops the correctness column (triage rows are unlabeled); the logistic-threshold slider and outcome/GT filters are hidden; the verdict filter relabels to Review/Unlabel. 44 viewer tests pass.

…tail pane Triage bucket label is now 'To Review' (card, table, filter, explainer). The detail pane's correctness Stat is hidden for triage rows (unlabeled, no ground truth), matching the dropped table column.

…l) in triage

…ency, parquet) Document the consumer flow (dvc pull + viewer, no creds/model/Docker) front and center, plus the correct API host (annotationapi), ready_to_annotate scope, --seq-workers/--workers concurrency, results.parquet, full CLI reference, and data layout. Also add model_config.py to the score stage deps (was missing, so a change there wouldn't trigger dvc repro).

…age count)

Re-enable the threshold slider in triage mode so sliding re-buckets To Review (>= t) vs Unlabel (< t) live via applyThreshold; default to the triage 0.35 (top-level cfg.threshold) not the model logistic_threshold, and relabel as 'triage threshold'. TriageCards header + counts track the live value.

…ep table Defer the big table re-render so dragging the threshold stays smooth on large stores (rail cards stay live; React skips intermediate drag values, no manual debounce). Add a ThresholdSweep table under the slider showing To Review/Unlabel counts at standard thresholds, highlighting the one nearest the slider.

Defer the threshold INPUT (not output) so applyThreshold throttles to settled values; memoize TriageCards/ThresholdSweep computations; wrap SequenceTable in React.memo with stable props (useCallback sort handler) so the 21k-row table is skipped during a drag. Also widen the sweep table: 0.05 steps around the ~0.45 operating point, bigger gaps at the extremes.

…tions Add shards.py (pack: per-sequence frame tars [append-only] + report tars [per run] + loose aggregates; unpack: restore loose store+report) and pack/unpack CLI subcommands. Tag every prediction with the release model_version from the model.zip manifest (e.g. 0.2.0), in results rows + model_config.

Replace loose-store + report-output DVC tracking with a single dvc-add of data/02_shards (the packed artifact, ~36 objects). Drop the dvc.yaml pipeline (the 247k-file store is impractical as a stage dep; the workflow is staged/manual) and the now-unused params.yaml. README + Makefile updated for the pull → score → pack → dvc add → push producer flow and the dvc pull → unpack → view consumer flow.

…, model_version, no pipeline)

…) + verified note

…+ specs)

… slider gating, root README unpack - pull: _pull_one_safe catches any error (not just RequestException) so one bad detection/disk error can't abort a 20k-seq concurrent run; sort detections tolerantly (recorded_at may be null). - annotator client: stop paging on a short page instead of relying on a possibly-absent 'pages' field (avoided silent under-pull). - viewer: gate the triage slider on triageMode, not the eval-shaped decision.aggregation field. - root README: triage consumer flow was missing the 'unpack' step (+ npm install).

Chouffe added 30 commits June 16, 2026 11:28

docs(triage): design spec for annotation-backlog triage project

7cfd1e2

docs(triage): implementation plan

a1041a8

feat(triage): scaffold package

56cbf4c

feat(triage): sequence store (meta.json + frame loading)

120639f

feat(triage): read-only annotator client (GET-only + login)

b810e70

feat(triage): incremental pull of unannotated sequences

f96a0f3

feat(triage): score sequences and bucket by threshold

78df393

feat(triage): viewer contract + unlabel/review worklists

819bdba

feat(triage): pull/score CLI (+ model_config reader)

435ceb0

feat(triage): DVC pipeline, params, README, repo wiring

b685c61

docs(triage): tar-sharding design (append-only sealed shards)

3b8cdf2

feat(viewer): rename Review->To Review; hide correctness in triage de…

26a5ed1

…tail pane Triage bucket label is now 'To Review' (card, table, filter, explainer). The detail pane's correctness Stat is hidden for triage rows (unlabeled, no ground truth), matching the dropped table column.

feat(viewer): relabel detail-pane verdict as bucket (to review/unlabe…

c591b6e

…l) in triage

docs: surface triage in root README (quick-start subsection, fix pack…

20a064d

…age count)

feat(triage): score progress heartbeat (every 250 sequences)

d167e97

feat(viewer): click a threshold-sweep row to set the triage threshold

cc9b806

docs(triage): mark sharding spec implemented (+ deltas: report shards…

82a67a6

…, model_version, no pipeline)

docs(triage): consumer prerequisites (branch, AWS profile, ~55GB disk…

5c2737c

…) + verified note

docs(triage): remove implementation plan (superseded by shipped code …

486da1f

…+ specs)

Chouffe added 2 commits June 16, 2026 18:32

style(viewer): prettier format TriageCards (CI format:check)

1fa070c

Chouffe merged commit 0b9f2f2 into main Jun 16, 2026
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(triage): annotation-backlog triage (pull → score → review/unlabel) + viewer mode + DVC sharding#63

feat(triage): annotation-backlog triage (pull → score → review/unlabel) + viewer mode + DVC sharding#63
Chouffe merged 32 commits into
mainfrom
arthur/chore-pyro-annotator-temporal-model-predictions

Chouffe commented Jun 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Chouffe commented Jun 16, 2026

What

Package (temporal_model.triage)

Viewer (viewer/)

Data sharing (DVC)

Docs

Testing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Package (`temporal_model.triage`)

Viewer (`viewer/`)