Implement timeout capability. Apply timeout to crypto response #278
AlexLanzano wants to merge 8 commits into wolfSSL:main
Pull request overview
This PR introduces a generic timeout utility and wires client-side response timeouts into crypto operations, plus unit tests for the timeout helper.
Changes:
- Add a generic timeout module (wh_timeout.[ch]) based on WH_GETTIME_US() and expose it via configuration (WOLFHSM_CFG_ENABLE_TIMEOUT) and a new error code WH_ERROR_TIMEOUT.
- Extend whClientContext/whClientConfig and add wh_Client_RecvResponseTimeout, then route all crypto client receive paths through a new _recvCryptoResponse helper that uses the timeout when enabled.
- Add unit tests for the timeout helper and enable timeout support in the test configuration.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| wolfhsm/wh_timeout.h | Declares timeout context/config types and the timeout API used by the client; documentation establishes that timeoutUs == 0 disables the timeout. |
| wolfhsm/wh_settings.h | Documents WOLFHSM_CFG_ENABLE_TIMEOUT as enabling timeout helpers and client response timeouts; also defines WH_GETTIME_US(), which the timeout code relies on. |
| wolfhsm/wh_error.h | Introduces WH_ERROR_TIMEOUT to signal an expired timeout from client operations. |
| wolfhsm/wh_client.h | Adds a per-client respTimeout context, an optional respTimeout config, and declares wh_Client_RecvResponseTimeout behind WOLFHSM_CFG_ENABLE_TIMEOUT. |
| test/wh_test_timeout.h | Declares the whTest_Timeout unit test entry point. |
| test/wh_test_timeout.c | Implements unit tests for wh_Timeout_*, including callback invocation, stop/disable behavior, and bad-argument handling. |
| test/wh_test.c | Wires whTest_Timeout() into the unit test suite when WOLFHSM_CFG_ENABLE_TIMEOUT is defined. |
| test/config/wolfhsm_cfg.h | Enables WOLFHSM_CFG_ENABLE_TIMEOUT in the test configuration and ensures a valid time source via WOLFHSM_CFG_PORT_GETTIME. |
| src/wh_timeout.c | Implements the timeout helper functions; currently treats timeoutUs == 0 as an error in wh_Timeout_Start, which conflicts with the documented “0 disables” semantics and impacts higher-level usage. |
| src/wh_client_crypto.c | Introduces _recvCryptoResponse and switches all crypto receive loops to use it; with timeout support enabled this always goes through wh_Client_RecvResponseTimeout and the per-client respTimeout. |
| src/wh_client.c | Initializes the optional respTimeout context in wh_Client_Init and adds wh_Client_RecvResponseTimeout, which loops on WH_ERROR_NOTREADY until a valid response arrives or the timeout expires. |
Force-pushed from 5498633 to f7a30b3.
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
bigbrett left a comment
Overall looks great. Some smaller tweaks and also proposed an extension of functionality that we might want.
src/wh_timeout.c (Outdated)
    nowUs = WH_GETTIME_US();
    expired = (nowUs - timeout->startUs) >= timeout->timeoutUs;
    if (expired && (timeout->expiredCb != NULL)) {
        timeout->expiredCb(timeout->cbCtx);
OK so what if we allowed the callback to override the timeout here, based on the return value? So the callback could return a code that either says 1) "yes, I acknowledge the timeout, proceed with the failure return to the caller, I've done what I need to do to bring the system to a safe state" or 2) "I know the timeout expired, but I want wh_Client_RecvResponseTimeout() to KEEP polling for the response".
This is necessary, since we don't really have a good way to "retry" a blocking operation like crypto: calling the crypto function again would try to send the initial request, which would fail due to a request already being in flight. Perhaps "wait a little longer" is something we should support?
For this to work, we would need to pass the actual timeout context into the callback, not just the callback context. This would allow the callback to "restart" the timer. I think passing the message parameters for the request (group and action) for additional context would be important as well.
Then in wh_Client_RecvResponseTimeout(), we could check the actual return value of wh_Timeout_Expired() when deciding whether or not to break from the loop.
Thoughts?
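
To make the proposal concrete, here is a minimal, self-contained sketch of a callback that can override an expiry. Everything in it is hypothetical and not part of this PR: the `DemoTimeout` type, the `DEMO_TIMEOUT_*` return codes, the callback signature taking the timeout context plus group/action, and the fake clock standing in for `WH_GETTIME_US()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback return codes (not defined by the PR). */
enum { DEMO_TIMEOUT_ACCEPT = 0, DEMO_TIMEOUT_KEEP_WAITING = 1 };

typedef struct DemoTimeout DemoTimeout;
/* Hypothetical signature: the callback receives the timeout context itself. */
typedef int (*DemoExpiredCb)(DemoTimeout* t, uint16_t group, uint16_t action);

struct DemoTimeout {
    uint64_t      startUs;
    uint64_t      timeoutUs;
    DemoExpiredCb expiredCb;
    void*         cbCtx;
};

/* Fake clock standing in for WH_GETTIME_US(); each read advances 10 us. */
static uint64_t demo_nowUs = 0;
static uint64_t demo_gettime_us(void)
{
    demo_nowUs += 10;
    return demo_nowUs;
}

/* Returns 1 if the caller should give up, 0 if it should keep polling. */
static int demo_timeout_expired(DemoTimeout* t, uint16_t group, uint16_t action)
{
    uint64_t nowUs;
    if (t == NULL || t->timeoutUs == 0) {
        return 0; /* disabled timeouts never expire */
    }
    nowUs = demo_gettime_us();
    if ((nowUs - t->startUs) < t->timeoutUs) {
        return 0;
    }
    if (t->expiredCb != NULL &&
        t->expiredCb(t, group, action) == DEMO_TIMEOUT_KEEP_WAITING) {
        return 0; /* the callback overrode the expiry */
    }
    return 1;
}

/* Example callback: count the expiry, extend twice, then accept. */
static int demo_extend_twice_cb(DemoTimeout* t, uint16_t group, uint16_t action)
{
    int* fired = (int*)t->cbCtx;
    (void)group;
    (void)action;
    (*fired)++;
    if (*fired <= 2) {
        t->startUs = demo_gettime_us(); /* restart the interval */
        return DEMO_TIMEOUT_KEEP_WAITING;
    }
    return DEMO_TIMEOUT_ACCEPT;
}

/* Polls (with no server response) until the timeout is accepted. */
static int demo_poll_until_timeout(DemoTimeout* t)
{
    int polls = 0;
    t->startUs = demo_gettime_us();
    do {
        polls++;
    } while (!demo_timeout_expired(t, 0, 0));
    return polls;
}
```

With this shape the polling loop breaks only when the expiry check reports a timeout that the callback did not override, which would cover both the "log and keep waiting" and the "fail now" cases.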
I'm not sure I follow the use case you laid out for 2. I figure if the user enabled timeouts, they would never want to poll indefinitely. And if they did, they could just configure the timeout to be extremely high to essentially achieve this.
Yeah I guess just adding additional flexibility, similar to how wolfSSL cert verify callbacks let you override a decision made by the library based on the scenario. But I guess it is a little different here, since the timeout can be specified before any operation - I was just trying to think generically. Punt this for now, I can stew on usage a little more.
@AlexLanzano OK consider this case: timeout expiration is not intended to be fatal, but perhaps just to perform some instrumentation/logging/tracking/GPIO wiggling/thing we haven't thought of yet. Doing what I proposed would allow for this, AND allow for periodic timeout checks (e.g. timeout fires, you do some action (log, printf, whatever), but then you want to reset the timeout for another interval in the callback and keep waiting). Currently, once the timeout expires, the polling loop exits.
I'd like to see testing of the client API pathway too, even for just one algorithm. Perhaps a whTest_TimeoutClientCtx() function that could be called in whTest_ClientServerSequential() where we control the server, to essentially do what you have here but for a crypto request like AES CBC?
This is a tricky test to implement since you need a client and server thread. And if you want to test the timeout-expired case, you need to somehow resync the client and server by dropping the server's eventual response. We may need to punt this effort; I'll need some time to think about how to implement this.
@AlexLanzano you just need to step the server separately from the client, you don't need multiple threads. That is why I suggested whTest_ClientServerSequential(). Did you peek at this function? There should be no getting out of sync involved. Or am I misunderstanding?
That said, I would amend my original comment to have a sequential test helper, instead of a purely client-context driven helper, in order to function in this harness. But otherwise it should do exactly what you want to do?
EDIT: OK I forgot, we only added the timeouts to BLOCKING crypto, so you can't use split-transaction on client side. You right, you right.... Carry on.
@AlexLanzano OK wait, actually my idea above is still fine and is indeed what we should do. I think it's totally possible to do this in the sequential test and prevent any out-of-sync while ensuring the callback fires through the client API. Consider the following operation sequence pseudocode in the sequential client/server test:
ClientInitTimeout()
ClientAesCbcBlocking()
/* Timeout fires, since we can't step the server, breaking us out of the client's blocking loop */
TestAssertTimeoutFired()
/* Prevent out-of-sync by proceeding with message processing as usual */
ServerHandleRequestMessage()
ClientCommLayerRecvRequest() /* dip down to comm layer to directly recv the request since we don't (yet) have split request/response for crypto (PR for this pending :) )*/
/* synchronization restored, so subsequent transport usage should be normal */
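
As a rough illustration of why this sequence keeps client and server in sync, the sketch below replays it against a toy one-slot transport. It is not wolfHSM code: `SimTransport`, the `sim_*` functions, the `SIM_*` codes, and the poll budget that stands in for the real timeout are all made up for this example.

```c
enum { SIM_OK = 0, SIM_NOTREADY = -1, SIM_TIMEOUT = -2 };

/* Toy transport: at most one request and one response in flight. */
typedef struct {
    int hasRequest;
    int hasResponse;
    int payload;
} SimTransport;

static int sim_client_send(SimTransport* tr, int value)
{
    if (tr->hasRequest || tr->hasResponse) {
        return SIM_NOTREADY; /* a transaction is still in flight */
    }
    tr->payload    = value;
    tr->hasRequest = 1;
    return SIM_OK;
}

static int sim_client_recv(SimTransport* tr, int* out)
{
    if (!tr->hasResponse) {
        return SIM_NOTREADY;
    }
    *out            = tr->payload;
    tr->hasResponse = 0;
    return SIM_OK;
}

/* One server step: turns a pending request into a response (value + 1). */
static int sim_server_handle(SimTransport* tr)
{
    if (!tr->hasRequest) {
        return SIM_NOTREADY;
    }
    tr->payload += 1;
    tr->hasRequest  = 0;
    tr->hasResponse = 1;
    return SIM_OK;
}

/* Blocking client call; maxPolls stands in for the response timeout. */
static int sim_client_blocking_op(SimTransport* tr, int value, int maxPolls,
                                  int* out)
{
    int polls;
    int ret = sim_client_send(tr, value);
    if (ret != SIM_OK) {
        return ret;
    }
    for (polls = 0; polls < maxPolls; polls++) {
        ret = sim_client_recv(tr, out);
        if (ret != SIM_NOTREADY) {
            return ret;
        }
    }
    return SIM_TIMEOUT;
}

/* Replays the sequence; returns 0 if every step behaves as expected. */
static int sim_sequential_timeout_test(void)
{
    SimTransport tr  = {0, 0, 0};
    int          out = 0;

    /* The server is never stepped, so the blocking call must time out. */
    if (sim_client_blocking_op(&tr, 41, 5, &out) != SIM_TIMEOUT) return 1;
    /* Resynchronize: let the server answer, then drain the late response. */
    if (sim_server_handle(&tr) != SIM_OK) return 2;
    if (sim_client_recv(&tr, &out) != SIM_OK || out != 42) return 3;
    /* The transport is idle again, so a fresh request is accepted. */
    if (sim_client_send(&tr, 1) != SIM_OK) return 4;
    return 0;
}
```

The key point mirrored from the pseudocode is the drain step: the late response is consumed directly at the transport level before any new request is sent.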
Implemented this test
Co-authored-by: Brett Nicholas <7547222+bigbrett@users.noreply.github.com>
    /* Pick up compile-time configuration */
    #include "wolfhsm/wh_settings.h"
Everything below the wolfhsm/wh_settings.h include needs to be protected by WOLFHSM_CFG_ENABLE_TIMEOUT.
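
A minimal sketch of the requested guard, assuming the macro meant here is the WOLFHSM_CFG_ENABLE_TIMEOUT option documented in wh_settings.h. The `#define` below only stands in for the settings include so the fragment compiles on its own, and `demo_timeout_module_enabled` is a placeholder for the module body.

```c
/* Stand-in for: #include "wolfhsm/wh_settings.h"
 * In the real file the option would come from the build configuration;
 * it is defined here only so this sketch compiles by itself. */
#define WOLFHSM_CFG_ENABLE_TIMEOUT

#ifdef WOLFHSM_CFG_ENABLE_TIMEOUT

/* Placeholder for the timeout module body: compiled only when enabled. */
static int demo_timeout_module_enabled(void)
{
    return 1;
}

#endif /* WOLFHSM_CFG_ENABLE_TIMEOUT */
```

Keeping the settings include outside the guard matters, since that header is what decides whether the option is defined.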
src/wh_timeout.c
Outdated
    }

    timeout->startUs = 0;
    timeout->timeoutUs = 0;
This would require all subsequent requests to re-set the timeout. If the intent is only to stop the current measurement, wouldn't we only need to clear startUs? Maybe we also need some sort of stateful active flag?
timeoutUs being > 0 determines whether or not a timeout is active. I can add a reset function that only clears startUs, though.
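
For illustration, the difference between the two operations could look like the sketch below. The `DemoTimeoutCtx` type, the `DEMO_*` codes, and both function names are hypothetical stand-ins; only the field names `startUs` and `timeoutUs` and the "timeoutUs > 0 means active" rule come from the thread.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_OK = 0, DEMO_BADARGS = -1 };

typedef struct {
    uint64_t startUs;   /* 0 when no measurement is running */
    uint64_t timeoutUs; /* 0 means the timeout is disabled */
} DemoTimeoutCtx;

/* Hypothetical reset: ends the current measurement but keeps the
 * configured duration, so later requests need no reconfiguration. */
static int demo_timeout_reset(DemoTimeoutCtx* t)
{
    if (t == NULL) {
        return DEMO_BADARGS;
    }
    t->startUs = 0;
    return DEMO_OK;
}

/* Stop as in the reviewed snippet: clears both fields, which also
 * disables the timeout until a new duration is set. */
static int demo_timeout_stop(DemoTimeoutCtx* t)
{
    if (t == NULL) {
        return DEMO_BADARGS;
    }
    t->startUs   = 0;
    t->timeoutUs = 0;
    return DEMO_OK;
}
```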
wolfhsm/wh_timeout.h (Outdated)
    #define WH_MSEC_TO_USEC(usec) (usec * 1000ULL)
    #define WH_SEC_TO_USEC(sec) (sec * 1000000ULL)
    #define WH_MIN_TO_USEC(min) (min * WH_SEC_TO_USEC(60))
Macro arguments need additional parens so they are evaluated before the math.
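
One possible corrected form is sketched below; it is an assumption about the fix, not the code that was merged. The first parameter is also renamed to `msec`, and `demo_three_ms_in_us` is a made-up helper that shows the case the unparenthesized version gets wrong.

```c
#include <stdint.h>

/* Each argument is wrapped in parentheses before it is used. */
#define WH_MSEC_TO_USEC(msec) ((msec) * 1000ULL)
#define WH_SEC_TO_USEC(sec)   ((sec) * 1000000ULL)
#define WH_MIN_TO_USEC(min)   ((min) * WH_SEC_TO_USEC(60))

/* With the original macro, 1 + 2 would expand to (1 + 2 * 1000ULL),
 * i.e. 2001 instead of 3000. */
static uint64_t demo_three_ms_in_us(void)
{
    return WH_MSEC_TO_USEC(1 + 2);
}
```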
wolfhsm/wh_settings.h (Outdated)
     * functionality
     *
     * WOLFHSM_CFG_ENABLE_TIMEOUT - If defined, include client-side support for
     blocking request timeouts
re-flow comment *
Suggested change:
    -  blocking request timeouts
    +  * blocking request timeouts
    int whTest_Timeout(void)
    {
        int cb_count = 0;
        whTimeoutConfig cfg;
        whTimeoutCtx timeout[1];

        cfg.timeoutUs = 1;
        cfg.expiredCb = whTest_TimeoutCb;
        cfg.cbCtx = &cb_count;

        wh_Timeout_Init(timeout, &cfg);
        WH_TEST_ASSERT_RETURN(timeout->startUs == 0);
        WH_TEST_ASSERT_RETURN(timeout->timeoutUs == cfg.timeoutUs);
        WH_TEST_ASSERT_RETURN(timeout->expiredCb == cfg.expiredCb);
        WH_TEST_ASSERT_RETURN(timeout->cbCtx == cfg.cbCtx);

        wh_Timeout_Start(timeout);
        WH_TEST_ASSERT_RETURN(timeout->timeoutUs > 0);

        wh_Timeout_Stop(timeout);
        WH_TEST_ASSERT_RETURN(timeout->startUs == 0);
        WH_TEST_ASSERT_RETURN(timeout->timeoutUs == 0);

        /* No expiration when disabled */
        WH_TEST_ASSERT_RETURN(wh_Timeout_Expired(timeout) == 0);

        WH_TEST_ASSERT_RETURN(wh_Timeout_Init(0, 0) == WH_ERROR_BADARGS);
        WH_TEST_ASSERT_RETURN(wh_Timeout_Set(0, 0) == WH_ERROR_BADARGS);
        WH_TEST_ASSERT_RETURN(wh_Timeout_Start(0) == WH_ERROR_BADARGS);
        WH_TEST_ASSERT_RETURN(wh_Timeout_Stop(0) == WH_ERROR_BADARGS);
        WH_TEST_ASSERT_RETURN(wh_Timeout_Expired(0) == 0);

        return 0;
    }
This test should assert that the callback has fired as well (e.g. the counter is incremented to the expected value for however many timeouts have happened).
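
A sketch of what such an assertion could look like, written against a small mock rather than the real wh_Timeout_* API. `DemoTimer`, `demo_start`, `demo_expired`, and the manually advanced `demo_clockUs` are all hypothetical; the expiry check follows the snippet from src/wh_timeout.c quoted earlier in this review, where the callback runs on every check that finds the timeout expired.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*DemoCb)(void* ctx);

typedef struct {
    uint64_t startUs;
    uint64_t timeoutUs;
    DemoCb   expiredCb;
    void*    cbCtx;
} DemoTimer;

/* Fake clock that the test advances by hand. */
static uint64_t demo_clockUs = 0;

static void demo_start(DemoTimer* t)
{
    t->startUs = demo_clockUs;
}

/* Same shape as the reviewed expiry check: compare, then fire the callback. */
static int demo_expired(DemoTimer* t)
{
    int expired;
    if (t == NULL || t->timeoutUs == 0) {
        return 0;
    }
    expired = (demo_clockUs - t->startUs) >= t->timeoutUs;
    if (expired && (t->expiredCb != NULL)) {
        t->expiredCb(t->cbCtx);
    }
    return expired;
}

static void demo_count_cb(void* ctx)
{
    (*(int*)ctx)++;
}

/* Returns 0 on success, or the number of the first failed check. */
static int demo_test_callback_fires(void)
{
    int       cb_count = 0;
    DemoTimer t        = {0, 5, demo_count_cb, &cb_count};

    demo_clockUs = 100;
    demo_start(&t);
    if (demo_expired(&t) != 0 || cb_count != 0) return 1; /* not due yet */

    demo_clockUs += 5;
    if (demo_expired(&t) != 1 || cb_count != 1) return 2; /* fired once */
    if (demo_expired(&t) != 1 || cb_count != 2) return 3; /* fires per check */
    return 0;
}
```

Driving the clock by hand keeps the counter value deterministic, which is harder to guarantee with a 1 us timeout against a real time source.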
Force-pushed from 43255a9 to 066ea59.
fixes #130
Timeout Functionality: Client Perspective
1. Configuration at Init Time
When creating a client, you provide a whTimeoutConfig specifying the timeout duration and an optional callback.
During wh_Client_Init (src/wh_client.c:84-89), the config is copied into an embedded whTimeoutCtx respTimeout[1] inside the client context via wh_Timeout_Init(). This stores the timeout duration and callback but doesn't start any timer yet.
If respTimeoutConfig is NULL, the timeout context is left zeroed and effectively disabled (a timeoutUs of 0 means "never expires").
2. What Happens During a Crypto Call
Before this PR, every crypto function in wh_client_crypto.c had its own receive loop after sending a request. If the server never responded, the client would spin forever.
The PR replaces all ~30 of these with a single helper, _recvCryptoResponse() (src/wh_client_crypto.c:165-180). When timeout is enabled, it delegates to wh_Client_RecvResponseTimeout. When disabled, the old infinite-loop behavior is preserved.
3. The Timeout Receive Loop
wh_Client_RecvResponseTimeout (src/wh_client.c:211-231) does this:
- Starts the timer: calls wh_Timeout_Start(), which snapshots the current time via WH_GETTIME_US() into timeout->startUs.
- Polls for a response: calls wh_Client_RecvResponse() in a loop.
- On each WH_ERROR_NOTREADY, checks wh_Timeout_Expired(), which:
  - reads the current time via WH_GETTIME_US()
  - tests (now - startUs) >= timeoutUs
  - if expired, invokes expiredCb (if set), after which the loop returns WH_ERROR_TIMEOUT
- On any other return value (success or error), returns immediately.
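
The loop described above can be sketched roughly as follows. This is a mock, not the code in src/wh_client.c: the `Demo*` types, the `DEMO_*` return codes, the fake clock, and the fake server that becomes ready after a fixed number of polls are all assumptions made so the example runs on its own.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_OK = 0, DEMO_NOTREADY = -1, DEMO_TIMEOUT = -2 };

typedef void (*DemoExpiredCb)(void* ctx);

typedef struct {
    uint64_t      startUs;
    uint64_t      timeoutUs;
    DemoExpiredCb expiredCb;
    void*         cbCtx;
} DemoTimeout;

/* Fake clock standing in for WH_GETTIME_US(); each read advances 1 us. */
static uint64_t demo_nowUs = 0;
static uint64_t demo_gettime_us(void)
{
    return ++demo_nowUs;
}

/* Fake server: the response becomes available on poll number readyAfter. */
typedef struct {
    int polls;
    int readyAfter;
} DemoServer;

static int demo_recv_response(DemoServer* s)
{
    s->polls++;
    return (s->polls >= s->readyAfter) ? DEMO_OK : DEMO_NOTREADY;
}

static int demo_timeout_expired(DemoTimeout* t)
{
    int expired;
    if (t->timeoutUs == 0) {
        return 0; /* 0 means "never expires" */
    }
    expired = (demo_gettime_us() - t->startUs) >= t->timeoutUs;
    if (expired && (t->expiredCb != NULL)) {
        t->expiredCb(t->cbCtx);
    }
    return expired;
}

/* Start the timer, then poll until a response, an error, or expiry. */
static int demo_recv_response_timeout(DemoServer* s, DemoTimeout* t)
{
    int ret;
    t->startUs = demo_gettime_us();
    for (;;) {
        ret = demo_recv_response(s);
        if (ret != DEMO_NOTREADY) {
            return ret; /* success or a real error: return immediately */
        }
        if (demo_timeout_expired(t)) {
            return DEMO_TIMEOUT;
        }
    }
}
```

Note that the expiry check runs only after a not-ready poll, so a response that is already available is returned even if the deadline has technically passed.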
4. What the Client Sees
From the application's perspective, the crypto APIs (wh_Client_AesCbc, wh_Client_RsaFunction, wh_Client_EccSign, etc.) now return WH_ERROR_TIMEOUT (-2010) instead of hanging indefinitely. The application can then decide how to handle it: retry, log, fail gracefully, etc.
The expiredCb fires before the error is returned, so you can use it for logging or cleanup without needing to check the return code first.
5. Scope Limitations
A few things to note about the current design:
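- Timeouts apply only to the blocking crypto receive paths that go through _recvCryptoResponse.
- All crypto operations share the single per-client respTimeout context with the same duration. You can call wh_Timeout_Set(ctx->respTimeout, newValue) to change it between calls, but there's no per-operation override.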
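
Since there is no per-operation override, an application that wants a different duration for one call would have to change the shared value and restore it afterwards. The sketch below shows that pattern with stand-ins only: `DemoClient`, `demo_timeout_set` (playing the role of wh_Timeout_Set), and `demo_with_timeout` are hypothetical and simply mirror the `(context, value)` call shape mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t startUs;
    uint64_t timeoutUs;
} DemoTimeout;

/* Mirrors the embedded respTimeout[1] member described in section 1. */
typedef struct {
    DemoTimeout respTimeout[1];
} DemoClient;

/* Stand-in for wh_Timeout_Set(): updates only the duration. */
static int demo_timeout_set(DemoTimeout* t, uint64_t timeoutUs)
{
    if (t == NULL) {
        return -1;
    }
    t->timeoutUs = timeoutUs;
    return 0;
}

typedef int (*DemoCryptoOp)(DemoClient* c);

/* Runs one operation with a temporary duration, then restores the
 * shared one so later calls see the original setting. */
static int demo_with_timeout(DemoClient* c, uint64_t tempUs, DemoCryptoOp op)
{
    uint64_t savedUs = c->respTimeout->timeoutUs;
    int      ret;

    demo_timeout_set(c->respTimeout, tempUs);
    ret = op(c);
    demo_timeout_set(c->respTimeout, savedUs);
    return ret;
}

/* Example operation: reports the duration it observed, in milliseconds. */
static int demo_op_report_ms(DemoClient* c)
{
    return (int)(c->respTimeout->timeoutUs / 1000);
}
```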