Merged
mitfik (Contributor) commented Apr 15, 2026
- Improve error handling and communication between witness and watcher
- Expose delegation through the SDK
- Expose the low-level API in the SDK to the client before we move fully to the SDK
…nc fixes

Overhauls the watcher's KEL synchronization to be self-reliant rather than purely reactive to external queries. Key changes:
- Fix update_local_kel to compare local SN against witness-reported KSN instead of relying solely on escrow state, preventing cases where KSN was accepted but full KEL events were never fetched
- Add WitnessPoller for background polling of tracked AIDs with adaptive intervals (recently active AIDs polled more frequently, stable ones less) and AID subscription support for priority polling
- Replace fire-and-forget channel with completion-signaling mechanism so Logs queries await KEL updates (10s timeout) before responding, eliminating the need for external retry loops
- Add WitnessHealthTracker recording per-witness response statistics (success/failure counts, response times, consecutive failures) and expose GET /health endpoint for monitoring
- Range-based KEL fetching to avoid re-downloading the entire event log
- Propagate and log witness query errors instead of silently discarding
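To make the per-witness statistics concrete, here is a minimal, dependency-free sketch of a tracker that records success/failure counts, response times, and consecutive failures. It is not the `WitnessHealthTracker` from this PR: the names `HealthTrackerSketch`, `WitnessStats`, `record_success`, `record_failure`, and `is_healthy` are assumptions made for illustration, and the real type may store or expose its data differently.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical per-witness counters, mirroring the statistics named above.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct WitnessStats {
    pub successes: u32,
    pub failures: u32,
    pub consecutive_failures: u32,
    pub total_response_time: Duration,
}

impl WitnessStats {
    /// Mean response time over successful requests, if any were recorded.
    pub fn mean_response_time(&self) -> Option<Duration> {
        if self.successes == 0 {
            None
        } else {
            Some(self.total_response_time / self.successes)
        }
    }
}

/// Hypothetical tracker keyed by a witness identifier string.
#[derive(Debug, Default)]
pub struct HealthTrackerSketch {
    stats: HashMap<String, WitnessStats>,
}

impl HealthTrackerSketch {
    /// Record a successful response; a success resets the failure streak.
    pub fn record_success(&mut self, witness_id: &str, elapsed: Duration) {
        let s = self.stats.entry(witness_id.to_string()).or_default();
        s.successes += 1;
        s.consecutive_failures = 0;
        s.total_response_time += elapsed;
    }

    /// Record a failed request and extend the failure streak.
    pub fn record_failure(&mut self, witness_id: &str) {
        let s = self.stats.entry(witness_id.to_string()).or_default();
        s.failures += 1;
        s.consecutive_failures += 1;
    }

    /// Look up the counters for one witness.
    pub fn get(&self, witness_id: &str) -> Option<&WitnessStats> {
        self.stats.get(witness_id)
    }

    /// A witness with no recorded data is treated as healthy (an assumption).
    pub fn is_healthy(&self, witness_id: &str, max_consecutive_failures: u32) -> bool {
        self.stats
            .get(witness_id)
            .map_or(true, |s| s.consecutive_failures < max_consecutive_failures)
    }
}
```

A monitoring endpoint such as the GET /health route mentioned above could serialize these counters per witness; how the PR actually shapes that response is not shown here.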
When a remote endpoint returns a non-success status with an empty body,
serde_json::from_str("") would fail with "EOF while parsing a value",
producing a misleading "network error" message. Now checks for empty
bodies first and returns a clear error with the HTTP status code.
Also changed JSON parse error fallback to use UnknownError(body) instead
of NetworkError(serde_err) since the raw body is more useful for debugging.
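The ordering of checks described above can be sketched as follows. This is an illustration only: `SketchError`, `classify_error_response`, and `toy_parse` are hypothetical names, and the parser is passed in as a closure so the example stays dependency-free. In the actual code the parse step is `serde_json::from_str`, and the real error enum has its own variants.

```rust
/// Hypothetical error type; only `UnknownError` is named in the commit message.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    /// Non-success status with nothing in the body.
    EmptyBody { status: u16 },
    /// The body parsed into a structured error message.
    Remote(String),
    /// The body could not be parsed; keep the raw text for debugging.
    UnknownError(String),
}

/// Classify a non-success response. The empty-body check runs first, so an
/// empty string never reaches the parser and cannot surface as a parse error.
pub fn classify_error_response<P>(status: u16, body: &str, parse: P) -> SketchError
where
    P: Fn(&str) -> Option<String>,
{
    if body.trim().is_empty() {
        return SketchError::EmptyBody { status };
    }
    match parse(body) {
        Some(message) => SketchError::Remote(message),
        // Fall back to the raw body rather than the parser's own error.
        None => SketchError::UnknownError(body.to_string()),
    }
}

/// Toy stand-in for a JSON parser: accepts exactly `{"error":"<message>"}`.
pub fn toy_parse(body: &str) -> Option<String> {
    body.strip_prefix("{\"error\":\"")?
        .strip_suffix("\"}")
        .map(str::to_string)
}
```

Carrying the status code in the empty-body variant is what lets the caller report a clear message instead of a parser EOF error.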
reqwest::Client was using defaults with no timeouts, causing requests to hang indefinitely when endpoints are slow or unreachable. Added 10s connect timeout and 30s overall request timeout via a shared http_client() builder.
Watcher's TEL transport used reqwest::Client with no timeouts, causing requests to witnesses to hang indefinitely when unreachable. Added 10s connect timeout and 30s request timeout, consistent with keri-core transport layer.
forward_query_from iterated witnesses sequentially, so a slow or unreachable first witness would delay fetching events from all others. Now uses join_all() to query all witnesses concurrently (consistent with query_state) and processes all successful responses, improving latency for multi-witness AIDs.
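The change from sequential to concurrent fan-out can be illustrated with a small sketch. Note the substitution: the PR uses `join_all()` over async futures, whereas this example uses `std::thread::scope` so that it compiles without external crates. The function names `query_all` and `successes` and the string witness identifiers are assumptions for illustration.

```rust
use std::thread;

/// Query every witness concurrently and return one result per witness,
/// in the same order as the input slice.
pub fn query_all<T, E, F>(witnesses: &[&str], query: F) -> Vec<(String, Result<T, E>)>
where
    T: Send,
    E: Send,
    F: Fn(&str) -> Result<T, E> + Sync,
{
    thread::scope(|scope| {
        // Spawn one scoped thread per witness so a slow witness does not
        // delay the start of the others.
        let handles: Vec<_> = witnesses
            .iter()
            .copied()
            .map(|witness| {
                let query = &query;
                scope.spawn(move || (witness.to_string(), query(witness)))
            })
            .collect();

        // Joining in spawn order keeps the output order deterministic.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("witness query thread panicked"))
            .collect()
    })
}

/// Keep only the successful responses, dropping failed witnesses.
pub fn successes<T, E>(results: Vec<(String, Result<T, E>)>) -> Vec<(String, T)> {
    results
        .into_iter()
        .filter_map(|(witness, result)| result.ok().map(|value| (witness, value)))
        .collect()
}
```

With this shape, total latency is bounded by the slowest witness rather than the sum of all witnesses, and every successful response is still available for processing.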
Before moving fully to the keri sdk, we need to keep them available.