General development #86

Merged
mitfik merged 9 commits into master from development on Apr 15, 2026

mitfik commented on Apr 15, 2026

  • Improve error handling and communication between the witness and the watcher
  • Expose delegation through the SDK
  • Expose the low-level API in the SDK to the client until we move fully to the SDK

mitfik added 9 commits April 15, 2026 09:39
…nc fixes

Overhauls the watcher's KEL synchronization to be self-reliant rather
than purely reactive to external queries.

Key changes:
- Fix update_local_kel to compare local SN against witness-reported KSN
  instead of relying solely on escrow state, preventing cases where KSN
  was accepted but full KEL events were never fetched
- Add WitnessPoller for background polling of tracked AIDs with adaptive
  intervals (recently active AIDs polled more frequently, stable ones
  less) and AID subscription support for priority polling
- Replace fire-and-forget channel with completion-signaling mechanism so
  Logs queries await KEL updates (10s timeout) before responding,
  eliminating the need for external retry loops
- Add WitnessHealthTracker recording per-witness response statistics
  (success/failure counts, response times, consecutive failures) and
  expose GET /health endpoint for monitoring
- Add range-based KEL fetching to avoid re-downloading the entire event log
- Propagate and log witness query errors instead of silently discarding
  them

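The SN-versus-KSN comparison and the range-based fetch described above can be sketched as one small pure function. This is a minimal sketch, not the watcher's actual code: `missing_range` and its parameters are hypothetical names, and it assumes sequence numbers start at 0 and that the witness-reported SN is read from its KSN.

```rust
use std::ops::RangeInclusive;

/// Decide which sequence numbers still have to be fetched from a witness.
///
/// `local_sn` is the highest SN stored locally (`None` if no event is stored),
/// `witness_sn` is the SN the witness reported in its KSN.
/// Both names are illustrative assumptions, not identifiers from the PR.
fn missing_range(local_sn: Option<u64>, witness_sn: u64) -> Option<RangeInclusive<u64>> {
    // First event we do not have yet; `checked_add` avoids overflow at u64::MAX.
    let first_missing = match local_sn {
        None => 0,
        Some(sn) => sn.checked_add(1)?,
    };
    if first_missing > witness_sn {
        // Local KEL is already at or ahead of the witness: nothing to fetch.
        None
    } else {
        // Fetch only the gap instead of the whole event log.
        Some(first_missing..=witness_sn)
    }
}
```

Under these assumptions, a watcher holding events up to SN 2 and seeing a KSN at SN 5 would request only events 3 through 5, whatever the escrow state says.
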
When a remote endpoint returned a non-success status with an empty body,
serde_json::from_str("") failed with "EOF while parsing a value",
producing a misleading "network error" message. The code now checks for
an empty body first and returns a clear error that includes the HTTP
status code.

Also changed the JSON parse error fallback to use UnknownError(body)
instead of NetworkError(serde_err), since the raw body is more useful
for debugging.

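The order of checks described above (empty body first, then parse, then fall back to the raw body) can be sketched as follows. This is an illustration under stated assumptions rather than the code from the commit: only `UnknownError` is named in the source, while `TransportError`, `EmptyResponse`, `Remote`, and `classify_error_response` are hypothetical, and the `parse` closure stands in for the real `serde_json::from_str` call so the sketch needs no external crates.

```rust
/// Hypothetical error type; only `UnknownError` is named in the commit message.
#[derive(Debug, PartialEq)]
enum TransportError {
    /// Non-success status with nothing in the body.
    EmptyResponse { status: u16 },
    /// The body parsed into a structured error message.
    Remote(String),
    /// The body could not be parsed; keep it verbatim for debugging.
    UnknownError(String),
}

/// Classify a non-success HTTP response.
/// `parse` is a stand-in for JSON deserialization of the error body.
fn classify_error_response<F>(status: u16, body: &str, parse: F) -> TransportError
where
    F: Fn(&str) -> Option<String>,
{
    // Check for an empty body before parsing, so the caller sees the
    // HTTP status instead of a parser's "EOF while parsing a value".
    if body.trim().is_empty() {
        return TransportError::EmptyResponse { status };
    }
    match parse(body) {
        Some(message) => TransportError::Remote(message),
        // Parse failure: the raw body is more useful than the parser error.
        None => TransportError::UnknownError(body.to_string()),
    }
}
```

With this shape, a 502 with an empty body is reported with its status code, and an unparseable body such as an HTML error page is preserved in full.
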
reqwest::Client was using defaults with no timeouts, causing requests
to hang indefinitely when endpoints were slow or unreachable. Added a
10s connect timeout and a 30s overall request timeout via a shared
http_client() builder.

The watcher's TEL transport used reqwest::Client with no timeouts,
causing requests to witnesses to hang indefinitely when a witness was
unreachable. Added a 10s connect timeout and a 30s request timeout,
consistent with the keri-core transport layer.

forward_query_from iterated over witnesses sequentially, so a slow or
unreachable first witness would delay fetching events from all others.
It now uses join_all() to query all witnesses concurrently (consistent
with query_state) and processes all successful responses, improving
latency for multi-witness AIDs.

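The fan-out pattern can be illustrated without async dependencies. The commit itself uses join_all() over futures; the sketch below swaps in std scoped threads to show the same idea (start every witness query at once, then keep every successful response). `query_witness` and `query_all` are hypothetical stand-ins, and the sleep only simulates network latency.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for one witness query: waits `delay_ms` to
/// simulate latency, then returns events or an error.
fn query_witness(id: &str, delay_ms: u64, fail: bool) -> Result<Vec<String>, String> {
    thread::sleep(Duration::from_millis(delay_ms));
    if fail {
        Err(format!("{id}: unreachable"))
    } else {
        Ok(vec![format!("event-from-{id}")])
    }
}

/// Query all witnesses concurrently and return (events, errors).
/// Each tuple is (witness id, simulated delay in ms, whether it fails).
fn query_all(witnesses: &[(&str, u64, bool)]) -> (Vec<String>, Vec<String>) {
    // Spawn one scoped thread per witness so no query waits on another.
    let results: Vec<Result<Vec<String>, String>> = thread::scope(|s| {
        let handles: Vec<_> = witnesses
            .iter()
            .map(|&(id, delay, fail)| s.spawn(move || query_witness(id, delay, fail)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("witness worker panicked"))
            .collect()
    });

    // Process every successful response; collect errors instead of dropping them.
    let mut events = Vec::new();
    let mut errors = Vec::new();
    for result in results {
        match result {
            Ok(mut batch) => events.append(&mut batch),
            Err(e) => errors.push(e),
        }
    }
    (events, errors)
}
```

In this sketch the total wait is roughly that of the slowest witness rather than the sum of all of them, and a failing witness does not stop events from the others being used.
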
Before moving fully to the keri SDK we need to keep them available.
mitfik merged commit ddcd2ab into master on Apr 15, 2026
0 of 2 checks passed
mitfik deleted the development branch on April 15, 2026 08:24