Avoid bad timing on posits #656

Merged
jakmeier merged 3 commits into sig-net:develop from
jakmeier:fix-posit-timeout
Mar 2, 2026
@jakmeier
Contributor

@jakmeier jakmeier commented Feb 5, 2026

For presignature and triple generation, the proposer and deliberator roles time out at the same time.

On a timeout, the proposer tries to move forward if the threshold has been met, whereas a deliberator simply aborts. This leads to bad timing and non-determinism if even a single node is not responding.

Increasing the deliberator timeout resolves the problem.

This also increases how frequently expiration is checked, to speed up the process.
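The timing problem described above can be illustrated with a minimal sketch. This is not the code from this PR: `Role`, `Outcome`, `check_expiry`, and the 10s and 2s values are hypothetical names and numbers chosen for illustration, and the rule that a proposer below the threshold aborts is an assumption.

```rust
use std::time::Duration;

/// Hypothetical roles a node can take for one posit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Proposer,
    Deliberator,
}

/// Hypothetical result of one expiration check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Pending,
    Start,
    Abort,
}

/// Decide what a node does when its expiration is checked.
/// Before the timeout nothing happens. After it, a proposer moves
/// forward if enough nodes accepted; everyone else aborts.
fn check_expiry(
    role: Role,
    elapsed: Duration,
    timeout: Duration,
    accepts: usize,
    threshold: usize,
) -> Outcome {
    if elapsed < timeout {
        return Outcome::Pending;
    }
    match role {
        Role::Proposer if accepts >= threshold => Outcome::Start,
        // Assumption: a proposer below the threshold also gives up.
        Role::Proposer | Role::Deliberator => Outcome::Abort,
    }
}

fn main() {
    let proposer_timeout = Duration::from_secs(10); // illustrative value
    let now = proposer_timeout; // the moment the proposer expires

    let proposer = check_expiry(Role::Proposer, now, proposer_timeout, 3, 3);
    // Same timeout: the deliberator aborts just as the proposer starts.
    let equal = check_expiry(Role::Deliberator, now, proposer_timeout, 0, 3);
    // Longer timeout: the deliberator is still waiting for the start.
    let longer = check_expiry(
        Role::Deliberator,
        now,
        proposer_timeout + Duration::from_secs(2),
        0,
        3,
    );
    println!("proposer={proposer:?} equal={equal:?} longer={longer:?}");
}
```

With equal timeouts, the proposer's start and the deliberator's abort fall on the same instant, so which one wins depends on scheduling and message latency. Giving the deliberator a longer timeout keeps it in the pending state while the proposer's start is in flight.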

@jakmeier
Contributor Author

jakmeier commented Feb 5, 2026

@ChaoticTempest This is what I mentioned in today's meeting. I just saw that a test (test_sign_contention_5_nodes) failed; I will need to look into that.

ChaoticTempest previously approved these changes Feb 5, 2026
@jakmeier
Contributor Author

Unfortunately, the tests are non-deterministic.

test_sign_contention_5_nodes and test_presignature_timeout sometimes fail in CI.

Locally, I got test_sign_contention_5_nodes to fail once, but mostly it runs without issues.

I probably won't have time to dig deeper today, but we shouldn't merge this while the issue remains.

jakmeier added 3 commits March 2, 2026 11:07
For presignature and triple generation, the proposer
and deliberator roles time out at the same time.

The proposer tries to move forward on a timeout, if
the threshold has been met. But a deliberator will
simply abort. This leads to bad timing and non-determinism
if even just a single node is not responding.

Increasing the deliberator timeout resolves the problem.
Waiting for double the time makes it difficult for rounds
other than the first round to get a posit through.
@jakmeier
Contributor Author

jakmeier commented Mar 2, 2026

I had to change the delay for deliberator expiration to make this work reliably.

In my initial fix, I used 2x the proposer timeout for deliberators. This makes it much more likely that round 1 of posits gets accepted. But I didn't consider that for round 2, this creates another (perfectly) bad timing. 2x is pretty much the worst choice for the delay, which resulted in the non-deterministic test failures I had.

Now I use a fixed 2s of extra time before deliberators expire. That should be enough to receive the proposer's START message in time. Combined with a proposer expiration of 10s, this also leaves enough time for the next round to conclude.
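One possible reading of why 2x is the worst choice can be worked through with the numbers above. The sketch below is illustrative only: `DeliberatorDelay` and the helper functions are hypothetical names, and it assumes that round 2 begins at the moment the round 1 proposer times out, which the discussion here does not confirm.

```rust
use std::time::Duration;

/// Hypothetical policy for deriving the deliberator timeout
/// from the proposer timeout.
enum DeliberatorDelay {
    /// Deliberators wait a multiple of the proposer timeout.
    Multiple(u32),
    /// Deliberators wait the proposer timeout plus a fixed extra.
    Extra(Duration),
}

/// When a round 1 deliberator expires, measured from the start of round 1.
fn deliberator_timeout(proposer: Duration, delay: &DeliberatorDelay) -> Duration {
    match delay {
        DeliberatorDelay::Multiple(k) => proposer * *k,
        DeliberatorDelay::Extra(extra) => proposer + *extra,
    }
}

/// Assumption: round 2 starts when the round 1 proposer times out,
/// so the round 2 proposer times out after twice the proposer timeout.
fn round_two_proposer_deadline(proposer: Duration) -> Duration {
    proposer * 2
}

/// Distance between the round 1 deliberator expiry and the
/// round 2 proposer deadline. Zero means the two events collide.
fn gap_to_round_two(proposer: Duration, delay: &DeliberatorDelay) -> Duration {
    let deliberator = deliberator_timeout(proposer, delay);
    let round_two = round_two_proposer_deadline(proposer);
    if round_two >= deliberator {
        round_two - deliberator
    } else {
        deliberator - round_two
    }
}

fn main() {
    let proposer = Duration::from_secs(10);
    let doubled = DeliberatorDelay::Multiple(2);
    let fixed = DeliberatorDelay::Extra(Duration::from_secs(2));

    println!("2x gap: {:?}", gap_to_round_two(proposer, &doubled));
    println!("+2s gap: {:?}", gap_to_round_two(proposer, &fixed));
}
```

Under that assumption, a 2x deliberator timeout expires at 20s, exactly when the round 2 proposer times out, which recreates the original collision one round later. With a 10s proposer timeout and 2s extra, the deliberator expires at 12s and the two events are 8s apart.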

@jakmeier jakmeier merged commit 2c9d032 into sig-net:develop Mar 2, 2026
3 of 4 checks passed
@jakmeier jakmeier deleted the fix-posit-timeout branch March 2, 2026 14:49
@volovyks
Contributor

volovyks commented Mar 3, 2026

@ppca cases::mpc::test_sign_contention_5_nodes is still failing often, so the issue is still relevant.
cases::mpc::test_sign_requests_wait_for_presignatures is unstable as well.

3 participants