
feat: extensible transforms pipeline for zarr build #80

Merged
turban merged 3 commits into main from feat/transforms-pipeline on May 9, 2026

@turban commented May 8, 2026

Warning

This PR was accidentally merged and its changes were immediately reverted from main. The code did not ship.
Please see the replacement PR: #86


Summary

  • Replaces the hardcoded _UNIT_CONVERSIONS dict and pre_process list with a single transforms pipeline in the dataset YAML
  • Each entry names a callable by dotted path, either as a plain string or as a {function, params} dict, and is resolved at runtime the same way ingestion.function is
  • Adds src/climate_api/transforms/ with two built-in transforms: convert_units and deaccumulate_era5
  • Updates era5_land.yaml to use transforms: for both temperature and precipitation datasets
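
For reviewers who want a picture of the resolution step, here is a minimal sketch of how a list of entries could be resolved and applied in order. The names resolve_callable and run_transforms are hypothetical and are not taken from this PR; the demo uses stdlib callables so it runs without climate_api installed.

```python
import importlib
from typing import Any, Callable


def resolve_callable(dotted_path: str) -> Callable[..., Any]:
    """Import 'package.module.attr' and return the attribute (hypothetical helper)."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected a dotted path, got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


def run_transforms(data: Any, specs: list) -> Any:
    """Apply each transform in order, feeding the output of one into the next."""
    for spec in specs:
        if isinstance(spec, str):
            # Plain string form: no extra parameters.
            path, params = spec, {}
        else:
            # Dict form: {"function": "...", "params": {...}}; params is optional.
            path, params = spec["function"], spec.get("params", {})
        data = resolve_callable(path)(data, **params)
    return data


# Demo with stdlib callables standing in for real transforms.
specs = [
    "builtins.abs",
    {"function": "builtins.round", "params": {"ndigits": 1}},
]
result = run_transforms(-3.14159, specs)
print(result)
```

The actual implementation may differ, for example in how it validates entries or reports import errors.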

Closes #79

Usage

transforms:
  - climate_api.transforms.deaccumulate_era5
  - climate_api.transforms.convert_units

External transforms from dhis2eo or any other package can be referenced by dotted path without changes to core code.
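
As an illustration of the external case, the sketch below fakes a third-party package in sys.modules and resolves one of its functions from a {function, params} entry. The package name my_plugin, the function scale, and its factor parameter are all made up for the example, and the real transform signature in this PR may take a dataset object rather than a list.

```python
import importlib
import sys
import types


def scale(values, factor=1.0):
    """Hypothetical external transform: multiply every value by a factor."""
    return [v * factor for v in values]


# Simulate an installed third-party package called "my_plugin" (assumption).
plugin = types.ModuleType("my_plugin")
plugin.scale = scale
sys.modules["my_plugin"] = plugin

# The dict form of a pipeline entry, as it might look after the YAML is parsed.
entry = {"function": "my_plugin.scale", "params": {"factor": 1000.0}}

# Resolve the dotted path and call the function with the configured params.
module_name, _, func_name = entry["function"].rpartition(".")
func = getattr(importlib.import_module(module_name), func_name)
scaled = func([0.5, 0.25], **entry["params"])
print(scaled)
```

Nothing in the core package needs to know about my_plugin here; it only has to be importable in the environment that runs the build.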

Test plan

  • uv run pytest tests/test_transforms.py: 12 new tests covering unit conversion, deaccumulation, pipeline execution, and edge cases
  • uv run pytest: full suite passes (153 + 12 tests)
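
For a rough idea of what tests in this area check, here is a sketch with toy stand-ins. kelvin_to_celsius and deaccumulate are hypothetical simplifications that work on plain lists; they are not the convert_units or deaccumulate_era5 implementations from this PR, which may handle resets at forecast boundaries and other details.

```python
import math


def kelvin_to_celsius(values):
    """Toy unit conversion: Kelvin to degrees Celsius."""
    return [v - 273.15 for v in values]


def deaccumulate(cumulative):
    """Toy deaccumulation: turn running totals into per-step amounts.

    The first step is kept as-is; later steps are differences
    between consecutive totals.
    """
    steps = []
    previous = 0.0
    for total in cumulative:
        steps.append(total - previous)
        previous = total
    return steps


def test_kelvin_to_celsius():
    converted = kelvin_to_celsius([273.15, 283.15])
    assert converted[0] == 0.0
    # Compare with a tolerance to avoid floating-point surprises.
    assert math.isclose(converted[1], 10.0, abs_tol=1e-9)


def test_deaccumulate():
    assert deaccumulate([0.0, 1.0, 3.0, 6.0]) == [0.0, 1.0, 2.0, 3.0]


def test_deaccumulate_empty():
    # Edge case: no time steps at all.
    assert deaccumulate([]) == []


# Run directly so the sketch works without pytest installed.
test_kelvin_to_celsius()
test_deaccumulate()
test_deaccumulate_empty()
```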

@turban marked this pull request as draft May 8, 2026 12:37
@turban marked this pull request as ready for review May 9, 2026 12:47
@turban merged commit 1cb8b63 into main May 9, 2026
Linked issue: feat: replace convert_units and pre_process with a unified transforms pipeline