
Phase 7: precipitation, S2F disaggregator, CBRFC LID helper, polish #13

Merged
tmart234 merged 1 commit into dev from claude/phase-7-precipitation on May 18, 2026

@tmart234 (Owner)

The model didn't see precipitation at all -- not observed (encoder) and not
forecast (decoder). For a 14-day Colorado snowmelt forecast that's a real
omission; SWE and antecedent soil moisture covered "water in the system"
but not recent or upcoming rain. Add a precipitation feature wired
end-to-end:

  • get_noaa.py now fetches GHCND PRCP alongside TMIN/TMAX and converts
    tenths-of-mm to mm so units match Open-Meteo's precipitation_sum.
    Adds a realized-coverage gate (fix for the long-standing TODO at the
    top of the module): walks past stations whose actual returned data is
    too NaN-heavy on TMIN/TMAX, not just the lifetime metadata coverage.
  • get_forecast.py adds precipitation_sum to the Open-Meteo request and
    gracefully falls back to zeros if an older deployment omits it.
  • combine_data.py threads precipitation through merge_dataframes with a
    short (2-day) interior interpolation limit, because precip is spiky and
    smearing a missed storm across surrounding dry days is misleading, and
    0-fills anything still missing (zero is the modal daily value for most
    basins).
  • normalize_data.py registers precipitation as optional + log1p-scaled
    (heavy-tailed, same rationale as flow).
  • windowing.py adds precipitation to both ENCODER_FEATURES (observed)
    and DECODER_FEATURES (forecast QPF) -- the one auxiliary that has a
    skillful 14-day forecast available.
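
The precipitation handling described in the list above can be sketched roughly as follows. This is a minimal stdlib-only illustration, not the code in this PR: the function names (tenths_mm_to_mm, fill_short_interior_gaps, zero_fill, log1p_scale) are hypothetical, missing values are represented as None, and the gap rule is one possible reading of "2-day interior interpolation limit" (gaps longer than the limit are left entirely unfilled before the 0-fill).

```python
import math


def tenths_mm_to_mm(values):
    """Convert GHCND PRCP (tenths of mm) to mm; None marks a missing day."""
    return [None if v is None else v / 10.0 for v in values]


def fill_short_interior_gaps(series, limit=2):
    """Linearly interpolate interior gaps of at most `limit` days.

    Assumption: longer gaps, and gaps touching either end of the series,
    are left as None so a missed storm is not smeared across dry days.
    """
    out = list(series)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # index of the first observed value after the gap, or n
        gap = end - start
        is_interior = start > 0 and end < n
        if is_interior and gap <= limit:
            left, right = out[start - 1], out[end]
            for k in range(gap):
                frac = (k + 1) / (gap + 1)
                out[start + k] = left + (right - left) * frac
    return out


def zero_fill(series):
    """Replace anything still missing with 0.0 (a dry day)."""
    return [0.0 if v is None else v for v in series]


def log1p_scale(series):
    """Compress the heavy tail; log1p keeps dry days at exactly 0."""
    return [math.log1p(v) for v in series]


# Hypothetical station record in tenths of mm, with three kinds of gap:
# a leading gap, a 1-day interior gap, and a 3-day interior gap.
raw = [None, 0, 30, None, 90, None, None, None, 0]
mm = tenths_mm_to_mm(raw)
filled = zero_fill(fill_short_interior_gaps(mm, limit=2))
scaled = log1p_scale(filled)
print(filled)
```

In this example only the single-day gap between 3.0 mm and 9.0 mm is interpolated; the leading gap and the 3-day gap fall through to the 0-fill.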

Other items in the same sweep:

  • get_s2f.py: ships disaggregate_seasonal_to_daily() so when someone
    wires the USBR archive fetch, the baseline lights up immediately
    instead of needing the math written then too.
  • data/find_lid.py: maintainer helper that queries the NWS NWPS gauge
    index near a USGS/DWR site and prints candidate AHPS LIDs, so the
    CBRFC LID map gets populated with real values instead of guesses.
  • get_poly.py: HUC-aware simplification tolerance (HUC4 polygons have
    ~10x the points of HUC8; same fixed tolerance was over/under-doing it
    depending on the level).
  • export_mobile.py: optional dynamic-range int8 TFLite output, gated on
    a parity check vs the float32 Keras model. Ships only when max abs
    diff < 0.05 (scaled space); manifest records the parity number.
  • VegDRI removed (two never-wired modules + their xfailed test). Two
    years of dead code, no integration path that didn't require a new
    USGS time-series scraper from scratch.
  • docs/INFERENCE.md: documents precipitation handling, int8 artifact,
    and the gh workflow run one-liner for the first release after merge.
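
The int8 parity gate from the export_mobile.py item could look something like the sketch below. It assumes the float32 and int8 model outputs have already been collected as flat lists of floats in scaled space; the function names and manifest keys are hypothetical, and only the 0.05 threshold comes from the description above.

```python
PARITY_THRESHOLD = 0.05  # max abs diff in scaled space, per the description


def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two output lists."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    if not reference:
        raise ValueError("outputs must not be empty")
    return max(abs(r - c) for r, c in zip(reference, candidate))


def parity_gate(reference, candidate, threshold=PARITY_THRESHOLD):
    """Return a manifest fragment recording the parity number and decision.

    The int8 artifact is marked as shipped only when the difference is
    strictly below the threshold. Key names here are illustrative.
    """
    diff = max_abs_diff(reference, candidate)
    return {"int8_max_abs_diff": diff, "int8_shipped": diff < threshold}


# Hypothetical outputs from the float32 Keras model and the int8 TFLite model.
float_out = [0.10, 0.52, 0.98]
int8_close = [0.11, 0.50, 0.97]
int8_far = [0.10, 0.60, 0.98]

print(parity_gate(float_out, int8_close))
print(parity_gate(float_out, int8_far))
```

Recording the parity number in the manifest even when the gate fails keeps the decision auditable after the fact.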

https://claude.ai/code/session_01XfhRQmztLSqmz6qeLSi9kw
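
For readers unfamiliar with the seasonal-to-daily disaggregation mentioned for get_s2f.py: the description does not spell out the math, so the following is only a generic sketch of one common approach, not the method shipped in this PR. It spreads a seasonal volume across days in proportion to non-negative daily weights (for example a climatological hydrograph shape) so that the daily values sum back to the seasonal total. The function name and inputs are hypothetical.

```python
def disaggregate_by_weights(seasonal_volume, daily_weights):
    """Split a seasonal total into daily values proportional to the weights.

    Assumption: weights are non-negative and not all zero. The result
    conserves the seasonal total up to floating-point rounding.
    """
    if any(w < 0 for w in daily_weights):
        raise ValueError("weights must be non-negative")
    total = sum(daily_weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [seasonal_volume * w / total for w in daily_weights]


# Hypothetical 4-day "season" with a peaked shape; units are arbitrary.
daily = disaggregate_by_weights(1000.0, [1, 3, 4, 2])
print(daily)
```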

@tmart234 merged commit 8e20d54 into dev on May 18, 2026
1 check failed
@tmart234 deleted the claude/phase-7-precipitation branch on May 18, 2026 17:52