Skip to content

Move sleep slicing from musl's __wait to the lower level emscripten_futex_wait. NFC#26471

Open
sbc100 wants to merge 1 commit intoemscripten-core:mainfrom
sbc100:handle_pthread_cancel_during_futext_wait
Open

Move sleep slicing from musl's __wait to the lower level emscripten_futex_wait. NFC#26471
sbc100 wants to merge 1 commit intoemscripten-core:mainfrom
sbc100:handle_pthread_cancel_during_futext_wait

Conversation

@sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Mar 17, 2026

This reduces the number of places we need to breaking up our wait operation. It also means that other users the emscripten_futex_wait API don't break pthread proxying or async cancellation.

This change removes one more place where we were erroneously calling pthread_self() in the Wasm Workers build of libc, so this change also makes the code less broken in the Wasm Worker case.

Needed as part of #26487

@sbc100 sbc100 force-pushed the handle_pthread_cancel_during_futext_wait branch 4 times, most recently from 65227c2 to dc4a6db Compare March 18, 2026 17:47
double max_ms_slice_to_sleep = emscripten_is_main_runtime_thread() ? 1 : 100;
// Break up sleeping so that we process proxied work at regular intervals.
// TODO(sbc): This should be remove and/or moved down into
// `emscripten_futex_wait`.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlively do you think we can just remove this loop? i.e. to background threads actually want to be be running proxied work when they call emscripten_thread_sleep?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. It looks like the ProxyWorker used by the WasmFS OPFS backend does not need this: https://github.com/emscripten-core/emscripten/blob/main/system/lib/wasmfs/thread_utils.h#L67. I don't know about all the users of the system proxying queue, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, what about dlopen? Will that depend on sleeping threads waking up to perform loading work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the dlopen handling is done down in emscripten_futex_wait at the entry and exit from the wait.

The contract is that you can sleep as long as you want, as long you syncronize your loaded code as soon as you wake up.

Copy link
Collaborator Author

@sbc100 sbc100 Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. a thread that is "sleeping" is considered "up-to-date" from the POV of the "sync-the-world" algorithm.

@sbc100 sbc100 requested review from dschuff and kripken March 18, 2026 18:03
@sbc100 sbc100 force-pushed the handle_pthread_cancel_during_futext_wait branch from dc4a6db to c83cec6 Compare March 18, 2026 18:05
@sbc100 sbc100 force-pushed the handle_pthread_cancel_during_futext_wait branch from c83cec6 to 56c46cb Compare March 18, 2026 23:25
double max_ms_slice_to_sleep = is_runtime_thread ? 1 : 100;

while (*addr==val) {
if (is_runtime_thread || pthread_self()->cancelasync == PTHREAD_CANCEL_ASYNCHRONOUS) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the second removal here (cancellation) but what about is_runtime_thread?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved that down into emscripten_futex_wait too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment that starts 1. We are the main runtime thread.

@sbc100
Copy link
Collaborator Author

sbc100 commented Mar 18, 2026

@dschuff this is the next PR in the patch chain leading to #26487. (In fact is the last one). I hope I did an OK job describing the rational in the description. Let me know if anything is not clear here.

@sbc100 sbc100 force-pushed the handle_pthread_cancel_during_futext_wait branch from d025d13 to ee3f1b4 Compare March 19, 2026 17:36
@sbc100
Copy link
Collaborator Author

sbc100 commented Mar 19, 2026

I had gemini CLI go over potentially issues with this PR and it found a few subtle issues with the slicing.. all of which are now fixed.

@sbc100 sbc100 force-pushed the handle_pthread_cancel_during_futext_wait branch from ee3f1b4 to dba24ed Compare March 19, 2026 17:37
// 1. We are the main runtime thread. In this case we need to be able to
// process proxied events from workers. Note that this is not always
// the same as being the main browser thread. For example, when running
// under node or when launching and emscripten-built program in a Web
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// under node or when launching and emscripten-built program in a Web
// under node or when launching an emscripten-built program in a Web

// process proxied events from workers. Note that this is not always
// the same as being the main browser thread. For example, when running
// under node or when launching and emscripten-built program in a Web
// Worker. This this case we limit our wait slices to 1ms intervals.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Worker. This this case we limit our wait slices to 1ms intervals.
// Worker. In this case we limit our wait slices to 1ms intervals.

double max_ms_slice_to_sleep = emscripten_is_main_runtime_thread() ? 1 : 100;
// Break up sleeping so that we process proxied work at regular intervals.
// TODO(sbc): This should be remove and/or moved down into
// `emscripten_futex_wait`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, what about dlopen? Will that depend on sleeping threads waking up to perform loading work?

…n_futex_wait`. NFC

This means we only need to do this breaking up on our `wait` operations
in a single place.  It also means that other users the
`emscripten_futex_wait` API don't break pthread proxying or async
cancellation.

The moved code is only included in pthread-enabled builds so should not
effect Wasm Workers builders.

This change also paves the way for enabling musl's `__wait` to work
with `WASM_WORKERS`.
@sbc100 sbc100 force-pushed the handle_pthread_cancel_during_futext_wait branch from dba24ed to a44b167 Compare March 19, 2026 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants