Merged
94 commits
5e95250
docs: update README for current API state
turban May 5, 2026
3160cf6
fix: use python -m uvicorn to avoid conda PATH shadowing
turban May 5, 2026
3c2f40d
docs: add user guide and usage examples
turban May 5, 2026
a8411cb
fix: correct coordinate names and consolidated flag in examples
turban May 5, 2026
0b0bb22
docs: add setup guide with Rwanda example
turban May 5, 2026
f32eb82
fix: add missing docstrings and fix formatting in examples
turban May 5, 2026
fb208ea
docs: fix Python version requirement in setup guide
turban May 5, 2026
f97ec86
docs: use python -m uvicorn in pip and conda setup instructions
turban May 5, 2026
c0eb232
docs: open first catalog child instead of hardcoded SLE collection in…
turban May 5, 2026
57b6196
fix: detect valid_time vs time dimension in examples
turban May 5, 2026
0805636
fix: handle lon/lat coordinate names alongside longitude/latitude and…
turban May 5, 2026
8e1eb1f
docs: document coordinate name variants and use dynamic selection in …
turban May 5, 2026
755214d
docs: fix WorldPop variable name from pop to pop_total
turban May 5, 2026
19f5812
docs: correct ERA5-Land lag to 120 hours per lag_hours config
turban May 5, 2026
48952f5
docs: add make as a prerequisite in setup guide
turban May 5, 2026
6427e90
fix: correct Freetown coordinate label from E to W
turban May 5, 2026
c8952e6
docs: replace hardcoded SLE collection URLs with catalog-discovered h…
turban May 5, 2026
0cb9c42
fix: discover dataset from catalog instead of hardcoding SLE id in za…
turban May 5, 2026
5da532a
chore: update .env.example to remove DHIS2 and CDS API vars, add Dest…
turban May 5, 2026
a7543c8
chore: document all env vars in .env.example, grouped by purpose
turban May 5, 2026
f72f86e
docs: use hourly period strings for ERA5-Land ingestion example
turban May 5, 2026
4bb2318
docs: use dynamic variable name instead of hardcoded precip in user g…
turban May 5, 2026
733fe97
docs: fix zarr_direct_access.py module docstring to match current beh…
turban May 5, 2026
0d34932
fix: replace nonexistent DOWNLOAD_DIR with CACHE_OVERRIDE in .env.exa…
turban May 5, 2026
607f1ca
fix: correct OGCAPI_BASE_URL comment in .env.example
turban May 5, 2026
3eb2657
docs: clarify extent_id placeholder in setup guide ingestion example
turban May 5, 2026
ec61517
docs: use STAC catalog discovery in setup guide Step 7 xarray example
turban May 5, 2026
aa9420c
docs: replace prescriptive coord table with runtime detection guidance
turban May 5, 2026
4331d22
docs: remove misleading coord name comment in zarr_direct_access example
turban May 5, 2026
cbde36f
docs: separate / and /health in README endpoint table
turban May 5, 2026
16037a8
fix: comment out PYGEOAPI_CONFIG/OPENAPI in .env.example to preserve …
turban May 5, 2026
ed72ad5
docs: guard empty catalog in README STAC example and add precondition…
turban May 5, 2026
180c0ba
docs: add jq as optional prerequisite in setup guide
turban May 5, 2026
ce5eb4b
docs: clarify extent_id substitution in sync section of setup guide
turban May 5, 2026
d28b151
fix: raise_for_status on HTTP responses in stac_discover_and_open exa…
turban May 5, 2026
19b6d5c
fix: raise_for_status on HTTP responses in zarr_direct_access example
turban May 5, 2026
0a7435a
fix: replace requests with httpx in examples and docs snippets
turban May 5, 2026
df73bcb
fix: clarify CLIMATE_API_BASE_URL scope — does not affect pygeoapi OG…
turban May 5, 2026
a2cb758
docs: use isel(0) instead of hardcoded date in user guide time select…
turban May 5, 2026
f2a07ff
docs: replace hardcoded Sierra Leone coordinates with domain-centre p…
turban May 5, 2026
80de78e
feat: normalise Zarr coordinate names to longitude/latitude/time at w…
turban May 5, 2026
19f766e
docs: update ERA5-Land auth reference in ogcapi.md from CDS API to De…
turban May 5, 2026
3a09a54
docs: remove link to stale project_description.md from README intro
turban May 5, 2026
2d60f6d
test: add regression test for coordinate normalisation on pyramid path
turban May 5, 2026
ae32f2f
docs: clarify CLIMATE_API_ZARR_BROWSER_ORIGINS is not a CORS access-c…
turban May 5, 2026
f9de317
test: add regression test for coordinate normalisation on x/y source …
turban May 5, 2026
f5edb0e
fix: wrap long test function signature to satisfy ruff E501
turban May 5, 2026
05f8d1f
fix: update ERA5-Land source_url to DestinE Earth Data Hub
turban May 5, 2026
5489a6a
Code cleaning
turban May 5, 2026
1c41525
docs: correct /zarr description — serves any managed dataset, not onl…
turban May 5, 2026
9d93a4b
fix: use catalog href directly in open_dataset to avoid rebasing URL …
turban May 5, 2026
772384f
test: add STAC integration test for normalised zarr coordinate names
turban May 5, 2026
1a07f3f
fix: limit spatial mean time series to first 10 steps in zarr_direct_…
turban May 5, 2026
de3e28a
docs: slice to 10 time steps before spatial mean to avoid full-datase…
turban May 5, 2026
8bc8315
Revert change
turban May 5, 2026
f3b5eed
fix: restore DHIS2 connection vars to .env.example so it remains the …
turban May 5, 2026
d4521ac
feat: add climate_api.client with open_dataset and list_datasets (clo…
turban May 6, 2026
b9c71f0
feat: add Client class and env var fallback to climate_api.client
turban May 6, 2026
3ae9903
refactor: rename Client.open_dataset to Client.open
turban May 6, 2026
ed229a5
refactor: rename Client.list_datasets to Client.catalog
turban May 6, 2026
1982d5f
refactor: remove unused xr import and type annotation from stac example
turban May 6, 2026
3ad39eb
feat: expose dataset id in STAC catalog child links
turban May 6, 2026
502bb4b
refactor: simplify dataset listing in stac example
turban May 6, 2026
5211b79
refactor: pretty-print catalog output with json.dumps
turban May 6, 2026
f4d0364
refactor: replace float() casts with .item() in examples
turban May 6, 2026
c19b4a1
refactor: use .mean() for domain centre coordinate
turban May 6, 2026
4c5787f
fix: use human-readable DestinE landing page as ERA5-Land source_url
turban May 6, 2026
dccbab3
feat: installable package with CLIMATE_API_CONFIG and create_app()
turban May 6, 2026
8754f13
docs: update setup guide for CLIMATE_API_CONFIG and pip install
turban May 6, 2026
cfe9825
docs: add custom dataset guide covering YAML templates and datasets_dir
turban May 6, 2026
bcb8d1f
docs: rename custom_datasets_guide to adding_custom_datasets
turban May 6, 2026
1992e7a
feat: merge custom datasets_dir with bundled templates instead of rep…
turban May 6, 2026
0ab648a
refactor: revert dataset YAMLs to data/datasets/, drop importlib.reso…
turban May 6, 2026
a5d45c1
docs: remove premature pip install step from setup guide
turban May 6, 2026
f7696d4
feat: require CLIMATE_API_CONFIG for extent — remove data/extents.yam…
turban May 6, 2026
27c6df6
feat: add default climate-api.yaml with Sierra Leone extent
turban May 6, 2026
12e51fa
fix: address post-review issues before merging
turban May 6, 2026
6618cac
uv.lock
turban May 6, 2026
948bb48
fix: move non-standard id field from STAC response to client only
turban May 6, 2026
e596174
docs: correct download function interface in custom datasets guide
turban May 6, 2026
b277cfa
refactor: rename cache_info to ingestion in dataset templates
turban May 6, 2026
592daf2
refactor: rename extents to extent throughout — single extent per ins…
turban May 6, 2026
741e8b3
refactor: remove GET /extent/{extent_id} endpoint
turban May 6, 2026
a88568a
docs: fix unresolved Copilot review comments
turban May 6, 2026
3eaae09
fix: address second round of Copilot review comments
turban May 6, 2026
c6a386e
fix: address third round of Copilot review comments
turban May 6, 2026
ebd7d8b
fix: address fourth round of Copilot review comments
turban May 6, 2026
34f5f9d
fix: validate zarr asset href and open_kwargs before calling xr.open_…
turban May 6, 2026
17f20c7
fix: add httpx timeouts, fix CORS credentials flag, use tuple in isin…
turban May 6, 2026
1f08f26
fix: validate href in child links and assets dict in open_dataset; fi…
turban May 6, 2026
1e4c735
fix: load built-in dataset templates via importlib.resources
turban May 6, 2026
ab22df5
fix: consistent ValueError in open_dataset, don't mutate response dic…
turban May 6, 2026
8a5000e
fix: default DOWNLOAD_DIR to XDG user data dir instead of package-rel…
turban May 6, 2026
c3dd230
fix: make all runtime paths XDG-compliant and package data importlib-…
turban May 7, 2026
74 changes: 46 additions & 28 deletions .env.example
@@ -1,36 +1,54 @@
PYGEOAPI_CONFIG=data/pygeoapi/pygeoapi-config.yml
PYGEOAPI_OPENAPI=data/pygeoapi/pygeoapi-openapi.yml

# DHIS2 Connection
DHIS2_BASE_URL=https://play.im.dhis2.org/stable-2-42-4/api
DHIS2_USERNAME=admin
DHIS2_PASSWORD=district

# CDS API (required for ERA5-Land downloads)
# Get your API key from: https://cds.climate.copernicus.eu/how-to-api
CDSAPI_URL=https://cds.climate.copernicus.eu/api
CDSAPI_KEY=your-cds-api-key

# Download cache directory for climate data
# DOWNLOAD_DIR=./target/data

# Default EO download extent when a dataset requires bbox and the request does not provide one
# ── Instance configuration ────────────────────────────────────────────────────
# Path to the instance config file (extent, optional datasets_dir).
# Copy the example before editing: cp climate-api.yaml.example climate-api.yaml
# climate-api.yaml is gitignored so your local extent stays out of version control.
# When running via `make run` from the repo root, the relative path below works.
# When running the installed `climate-api` CLI from another directory, use an
# absolute path instead: CLIMATE_API_CONFIG=/path/to/your/climate-api.yaml
CLIMATE_API_CONFIG=./climate-api.yaml

# ── pygeoapi ──────────────────────────────────────────────────────────────────
# Absolute paths are set automatically at startup; override only if you move the files.
# PYGEOAPI_CONFIG=/absolute/path/to/pygeoapi-config.yml
# PYGEOAPI_OPENAPI=/absolute/path/to/pygeoapi-openapi.yml

# ── ERA5-Land (DestinE Earth Data Hub) ────────────────────────────────────────
# Authentication uses ~/.netrc — no API key is required here.
# See docs/setup_guide.md for registration and .netrc setup instructions.

# ── Download and ingestion ────────────────────────────────────────────────────
# Override the download cache directory (default: data/downloads).
# CACHE_OVERRIDE=/path/to/cache

# Fallback bounding box used when a request does not include an explicit bbox.
# Format: xmin,ymin,xmax,ymax
# DOWNLOAD_BBOX=-13.5,6.9,-10.1,10.0

# Default country code for datasets that require one (for example WorldPop)
# Country code for datasets that require one (e.g. WorldPop).
# COUNTRY_CODE=SLE

# OGC API base URL (used by Prefect tasks to call back into the API)
# OGCAPI_BASE_URL=http://localhost:8000/ogcapi

# Public Climate API base URL used for absolute STAC links behind a reverse proxy
# ── API deployment ────────────────────────────────────────────────────────────
# Set when running behind a reverse proxy so that STAC hrefs and native dataset
# links use the public address instead of the internal request URL.
# Note: pygeoapi's OGC self/root links are controlled separately via server.url
# in config/pygeoapi/base.yml and are not affected by this variable.
# CLIMATE_API_BASE_URL=http://localhost:8000

# TiTiler tile server base URL
# TITILER_BASE_URL=http://127.0.0.1:8000
# Fallback used to derive the native API base URL when CLIMATE_API_BASE_URL is not set.
# Only needed if CLIMATE_API_BASE_URL is unset and the OGC API path is known.
# Note: pygeoapi's server.url is controlled separately in config/pygeoapi/base.yml.
# OGCAPI_BASE_URL=http://localhost:8000/ogcapi

# Prefect (embedded server)
PREFECT_API_URL=http://localhost:8000/prefect/api
PREFECT_SERVER_ANALYTICS_ENABLED=false
PREFECT_SERVER_UI_SHOW_PROMOTIONAL_CONTENT=false
# Comma-separated list of origins used for extra browser/private-network headers
# on Zarr responses (default: https://inspect.geozarr.org).
# This does not restrict general CORS access; the API allows all origins globally.
# CLIMATE_API_ZARR_BROWSER_ORIGINS=http://localhost:3000

# ── DHIS2 connection ──────────────────────────────────────────────────────────
# Required for DHIS2 org unit lookups (Step 2 — spatial aggregation).
# Not needed for Step 1 (data ingestion, sync, and serving).
# DHIS2_BASE_URL=https://play.im.dhis2.org/stable-2-42-4/api
# DHIS2_USERNAME=admin
# DHIS2_PASSWORD=district
# DHIS2_HTTP_TIMEOUT_SECONDS=30
# DHIS2_HTTP_RETRIES=3
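
Reviewer note: the comments above fix two value formats, `DOWNLOAD_BBOX` as `xmin,ymin,xmax,ymax` and `CLIMATE_API_ZARR_BROWSER_ORIGINS` as a comma-separated list defaulting to `https://inspect.geozarr.org`. The diff does not show how the application parses them, so the following is only a minimal sketch of one way to read both. The helper names `parse_bbox`, `parse_origins`, and `read_settings` are hypothetical, and the ordering check assumes the box does not cross the antimeridian.

```python
DEFAULT_ZARR_ORIGINS = "https://inspect.geozarr.org"  # default named in .env.example


def parse_bbox(raw: str) -> tuple[float, float, float, float]:
    """Parse 'xmin,ymin,xmax,ymax' into four floats and check the ordering."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    xmin, ymin, xmax, ymax = (float(p) for p in parts)
    # Assumption: the extent does not cross the antimeridian.
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("bbox must satisfy xmin < xmax and ymin < ymax")
    return xmin, ymin, xmax, ymax


def parse_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list, dropping blanks and trailing slashes."""
    return [o.strip().rstrip("/") for o in raw.split(",") if o.strip()]


def read_settings(env: dict[str, str]) -> dict:
    """Collect the two optional values; an unset DOWNLOAD_BBOX stays None."""
    raw_bbox = env.get("DOWNLOAD_BBOX")
    raw_origins = env.get("CLIMATE_API_ZARR_BROWSER_ORIGINS", DEFAULT_ZARR_ORIGINS)
    return {
        "bbox": parse_bbox(raw_bbox) if raw_bbox else None,
        "zarr_origins": parse_origins(raw_origins),
    }
```

Calling `read_settings(dict(os.environ))` after the `.env` file has been loaded would apply the same rules to the real environment. Passing a plain dict keeps the helpers easy to exercise without touching process state.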
1 change: 1 addition & 0 deletions .gitignore
@@ -2,6 +2,7 @@
__pycache__/
.venv/
.env
climate-api.yaml
eo_api.egg-info/
data/downloads
data/artifacts
8 changes: 4 additions & 4 deletions CLAUDE.md
@@ -8,7 +8,7 @@ Key concepts:

- **Dataset templates** — YAML files in `data/datasets/` describing a data source (variable, period type, download function). These are blueprints.
- **Artifacts / managed datasets** — ingested instances of a template for a specific spatial extent and time range. Exposed under `/datasets` and `/zarr/{dataset_id}`.
- **Extents** — named spatial bounding boxes configured at instance setup time (`id`, `bbox`, optional `country_code`).
- **Extent** — a single named spatial bounding box configured at instance setup time (`id`, `bbox`, optional `country_code`). Exposed at `GET /extent`.

## Repository layout

@@ -41,11 +41,11 @@ The `.env` file is required for `make run` and `make openapi`. Copy `.env.exampl

## Dataset templates

Each YAML in `data/datasets/` defines a dataset template. The `cache_info` block controls download and zarr build behaviour:
Each YAML in `data/datasets/` defines a dataset template. The `ingestion` block controls download and zarr build behaviour:

```yaml
cache_info:
  eo_function: dhis2eo.data.worldpop.pop_total.yearly.download
ingestion:
  function: dhis2eo.data.worldpop.pop_total.yearly.download
  default_params: {} # passed to the download function
  multiscales: # optional — triggers pyramid build
    levels: 4
54 changes: 29 additions & 25 deletions README.md
@@ -4,7 +4,7 @@ Climate and Earth Observation data is distributed across dozens of providers —

Each instance is configured for a specific country or region, and all data extraction, processing, and storage is scoped to that spatial extent. It abstracts data access across heterogeneous sources (CHIRPS, ERA5, WorldPop, and others), stores outputs as GeoZarr, and exposes them through standards-based endpoints.

The platform is designed to operate independently of DHIS2 and can be deployed on local, cloud-hosted, or sovereign country infrastructure. See [docs/project_description.md](docs/project_description.md) for a full description of the vision and technical architecture, [docs/managed_data_api_guide.md](docs/managed_data_api_guide.md) for the current API surface, and [docs/roadmap.md](docs/roadmap.md) for the planned development steps.
The platform is designed to operate independently of DHIS2 and can be deployed on local, cloud-hosted, or sovereign country infrastructure. See [docs/setup_guide.md](docs/setup_guide.md) for a step-by-step setup walkthrough, [docs/user_guide.md](docs/user_guide.md) for data access examples, [docs/managed_data_api_guide.md](docs/managed_data_api_guide.md) for the full API reference, and [docs/roadmap.md](docs/roadmap.md) for the planned development steps.

> **Status: active development.** Current focus is on dataset ingestion, sync workflows, and GeoZarr storage. APIs and data models may change without notice.

@@ -18,13 +18,7 @@ Install dependencies (requires [uv](https://docs.astral.sh/uv/)):
uv sync
```

Copy `.env.example` to `.env` and adjust values as needed. Environment variables are loaded automatically from `.env` at runtime.

Key environment variables:

- `DHIS2_BASE_URL` — DHIS2 API base URL (defaults to play server in `.env.example`)
- `DHIS2_USERNAME` — DHIS2 username
- `DHIS2_PASSWORD` — DHIS2 password
Copy `.env.example` to `.env` and adjust values as needed. Environment variables are loaded automatically from `.env` at runtime. See `.env.example` for the full list of available options.

Start the app:

@@ -40,7 +34,7 @@ If you cannot use uv (e.g. mixed conda/forge environments):
python -m venv .venv
source .venv/bin/activate
pip install -e .
uvicorn climate_api.main:app --reload
python -m uvicorn climate_api.main:app --reload
```

### Using conda
@@ -49,7 +43,7 @@ uvicorn climate_api.main:app --reload
conda create -n dhis2-climate-api python=3.13
conda activate dhis2-climate-api
pip install -e .
uvicorn climate_api.main:app --reload
python -m uvicorn climate_api.main:app --reload
```

## Development
@@ -73,33 +67,43 @@ Once running, the API is available at:

| Endpoint | Description |
| ----------------------------------------- | ------------------------------------------ |
| `http://localhost:8000/` | Welcome / health check |
| `http://localhost:8000/` | Navigation document |
| `http://localhost:8000/health` | Health check |
| `http://localhost:8000/docs` | Interactive API documentation (Swagger UI) |
| `http://localhost:8000/extent` | Configured spatial extent |
| `http://localhost:8000/datasets` | Managed dataset catalogue |
| `http://localhost:8000/stac/catalog.json` | STAC catalog for published GeoZarr data |
| `http://localhost:8000/zarr/{dataset_id}` | GeoZarr store for a managed dataset |
| `http://localhost:8000/ogcapi` | OGC API root |
| `http://localhost:8000/zarr/{dataset_id}` | GeoZarr store for a published dataset |
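
Reviewer note: a quick way to confirm the routes in the table respond after startup is to probe each one. The sketch below is illustrative only. It uses the standard library instead of `httpx` to stay dependency-free, the helper names `endpoint_urls` and `probe` are hypothetical, and the `/zarr/{dataset_id}` route is left out because it needs a real dataset id.

```python
import urllib.error
import urllib.request

# Paths copied from the README endpoint table.
PATHS = ["/", "/health", "/docs", "/extent", "/datasets", "/stac/catalog.json", "/ogcapi"]


def endpoint_urls(base_url: str) -> list[str]:
    """Join the base URL with each documented path, avoiding double slashes."""
    base = base_url.rstrip("/")
    return [base + path for path in PATHS]


def probe(url: str, timeout: float = 5.0):
    """Return the HTTP status code, or None when the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still carry a status code worth reporting.
        return exc.code
    except OSError:
        # Covers URLError, connection refusal, and timeouts.
        return None
```

With the server running locally, `for url in endpoint_urls("http://localhost:8000"): print(url, probe(url))` lists one status per route.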

## STAC

Published Zarr-backed managed datasets are exposed under `/stac` as one STAC Collection per dataset.
Published GeoZarr datasets are discoverable under `/stac` as one STAC Collection per dataset. Each collection includes a `zarr` asset with direct xarray-compatible access metadata derived from the live Zarr store.

- `/stac/catalog.json` is the entrypoint catalog
- `/stac/collections/{dataset_id}` exposes a Collection with a direct `/zarr/{dataset_id}` asset href
- `xstac` derives Datacube metadata from the real Zarr-backed dataset
- `pygeoapi` remains the OGC query layer under `/ogcapi`
Discover available datasets and open one with xarray:

Minimal example:
The catalog is populated once at least one dataset has been ingested and published (see [docs/setup_guide.md](docs/setup_guide.md)).

```python
import requests
import httpx
import xarray as xr

collection = requests.get(
    "http://127.0.0.1:8000/stac/collections/chirps3_precipitation_daily_sle"
).json()

asset = collection["assets"]["zarr"]
ds = xr.open_zarr(asset["href"], consolidated=asset["xarray:open_kwargs"]["consolidated"])
catalog = httpx.get("http://127.0.0.1:8000/stac/catalog.json").json()
children = [link for link in catalog["links"] if link["rel"] == "child"]

if not children:
    print("No published datasets found. Run an ingestion first.")
else:
    for link in children:
        print(link["title"], "—", link["href"])

    collection = httpx.get(children[0]["href"]).json()
    asset = collection["assets"]["zarr"]
    ds = xr.open_zarr(
        asset["href"],
        consolidated=asset["xarray:open_kwargs"]["consolidated"],
    )
    print(ds)
```
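
Reviewer note: several commits in this PR mention validating the `zarr` asset `href` and `xarray:open_kwargs` before opening the store, and raising `ValueError` consistently. The client code itself is not part of this diff, so the following is only a sketch of what such a check could look like for the README example. The function name `zarr_open_args` is hypothetical, and forwarding every key of `xarray:open_kwargs` to xarray is an assumption.

```python
def zarr_open_args(collection: dict) -> tuple[str, dict]:
    """Extract (href, open_kwargs) from a STAC Collection dict or raise ValueError."""
    assets = collection.get("assets")
    if not isinstance(assets, dict):
        raise ValueError("collection has no 'assets' mapping")

    asset = assets.get("zarr")
    if not isinstance(asset, dict):
        raise ValueError("collection has no 'zarr' asset")

    href = asset.get("href")
    if not isinstance(href, str) or not href.startswith(("http://", "https://")):
        raise ValueError("zarr asset 'href' must be an absolute http(s) URL")

    open_kwargs = asset.get("xarray:open_kwargs", {})
    if not isinstance(open_kwargs, dict):
        raise ValueError("'xarray:open_kwargs' must be a mapping")

    # Return a copy so the caller cannot mutate the parsed response.
    return href, dict(open_kwargs)
```

In the README example this would replace the direct indexing: `href, kwargs = zarr_open_args(collection)` followed by `ds = xr.open_zarr(href, **kwargs)`. A malformed collection then fails with a readable message instead of a `KeyError`.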

## pygeoapi
11 changes: 11 additions & 0 deletions climate-api.yaml.example
@@ -0,0 +1,11 @@
# Climate API instance configuration.
# Replace the extent below with your country before deploying.
# See docs/setup_guide.md for field descriptions.

extent:
  id: sle
  name: Sierra Leone
  bbox: [-13.5, 6.9, -10.1, 10.0]
  country_code: SLE

# datasets_dir: ./datasets/ # optional — custom templates merged with built-ins
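
Reviewer note: the example file defines a single `extent` with `id`, `name`, `bbox`, and `country_code`, and CLAUDE.md describes `country_code` as optional. The loader is not shown in this diff, so the following is only a sketch of how the parsed mapping could be validated. The `Extent` dataclass and `parse_extent` are hypothetical names. Treating `name` as optional and reading `bbox` as `[xmin, ymin, xmax, ymax]` in degrees are assumptions based on the `DOWNLOAD_BBOX` comment in `.env.example`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Extent:
    id: str
    bbox: tuple[float, float, float, float]
    name: Optional[str] = None
    country_code: Optional[str] = None


def parse_extent(config: dict) -> Extent:
    """Validate the 'extent' mapping of an already-parsed climate-api.yaml."""
    raw = config.get("extent")
    if not isinstance(raw, dict):
        raise ValueError("config must contain a single 'extent' mapping")
    if "id" not in raw or "bbox" not in raw:
        raise ValueError("extent requires 'id' and 'bbox'")

    bbox = tuple(float(v) for v in raw["bbox"])
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four numbers")

    # Assumption: [xmin, ymin, xmax, ymax] in degrees, not crossing the antimeridian.
    xmin, ymin, xmax, ymax = bbox
    if not (-180 <= xmin < xmax <= 180 and -90 <= ymin < ymax <= 90):
        raise ValueError("bbox is out of range or incorrectly ordered")

    return Extent(
        id=str(raw["id"]),
        bbox=bbox,
        name=raw.get("name"),
        country_code=raw.get("country_code"),
    )
```

If PyYAML is installed, `parse_extent(yaml.safe_load(Path(path).read_text()))` would run the same checks against the file that `CLIMATE_API_CONFIG` points to.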
6 changes: 0 additions & 6 deletions data/extents.yaml

This file was deleted.
