Skip to content

Pipeline Resilience Phase 4: Add retry logic to periodic LLM scripts #56

@madjin

Description

@madjin

Context

After Phases 1-3 harden the daily-critical scripts, 5 periodic/on-demand scripts still lack retry logic. They fail less often (monthly/quarterly runs) but the same failure modes apply.

Scope

Apply the same retry pattern (MAX_ATTEMPTS = 2, RETRY_DELAY_SECONDS = 5, completion token check, debug sidecar) to:

Script API call line Frequency
scripts/etl/helpers.py ~743 Monthly
scripts/etl/generate-monthly-retro.py ~303 Monthly
scripts/etl/generate-quarterly-summary.py ~200 Quarterly
scripts/etl/extract-entities.py ~262 Per-run
scripts/integrations/hackmd/update.py ~685 Daily

Same pattern each time: import time, retry constants, wrap API call in for attempt in range(MAX_ATTEMPTS) loop, token sanity check, debug sidecar on final failure.

Verification

Run a periodic script and verify [attempt 1/2] appears in logs and _metadata.attempts field in output.

Dependencies

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions