gh-149146: Add dealloc-depth counter fallback to trashcan trigger#151595
gh-149146: Add dealloc-depth counter fallback to trashcan trigger#151595bhuvi27 wants to merge 1 commit into
Conversation
Under memory pressure (for example RLIMIT_AS), Python's trashcan trigger in _Py_Dealloc never fires. The trigger relies on _Py_RecursionLimit_GetMargin, which compares the machine stack pointer against c_stack_soft_limit. When RLIMIT_AS prevents the kernel from growing the C stack, the kernel SIGSEGVs while the stack pointer is still megabytes above the soft limit (see sibling issue pythongh-150722 for an LLDB trace showing SP ~7.2 MB above c_stack_hard_limit at the SIGSEGV), so the trashcan never deposits and the recursive tuple_dealloc chain runs out the stack. Add a per-thread c_dealloc_depth counter as a fallback trigger, mirroring the historical _PyTrash_UNWIND_LEVEL=50 protection that was removed when the trashcan was consolidated into _Py_Dealloc. _Py_RecursionLimit_GetMargin remains the primary signal; the counter only kicks in when the stack-pointer signal cannot fire. _PyTrash_thread_destroy_chain itself bumps c_dealloc_depth for the duration of the drain so any _Py_Dealloc invoked while draining cannot recursively re-enter destroy_chain (which would rebuild the same unbounded recursion the trashcan exists to prevent). This mirrors the historical delete_nesting bookkeeping. The counter is per-tstate (no atomics needed in the free-threaded build) and is only touched for GC types, so non-GC types pay zero overhead. Add a regression test in test_gc that builds a 100,000-deep (b, None) tuple chain and asserts that ``del b`` cleans it up without a C-stack overflow.
|
IMO the existing
Issue gh-149146 is really a corner case: when the OS fails to enlarge the stack because the system memory is exhausted. IMO we should do nothing for this case. |
vstinner
left a comment
There was a problem hiding this comment.
I don't think that this approach is correct (see my previous comment).
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be poked with soft cushions! |
|
cc @markshannon |
Fixes #149146.
while True: b = (b, None)under RLIMIT_AS segfaults during cleanup:Docker with
ulimit -v 524288:MemoryError→lost sys.stderr→ segfault, exit 139.The trashcan deposit in
_Py_Dealloconly fires when_Py_RecursionLimit_GetMargin(tstate) < 2, which compares the stack pointer againstc_stack_soft_limit. Under RLIMIT_AS the kernel SIGSEGVs trying to grow the stack while SP is still nowhere near the soft limit — issue #150722 has an LLDB trace from the same reporter showing SP ~7 MB abovec_stack_hard_limitat the fault. So the trashcan never deposits and the recursivetuple_dealloc → Py_XDECREF → _Py_Dealloc → tuple_deallocchain just runs into the wall.This adds a per-thread
c_dealloc_depthcounter and uses it as a secondary trigger inside_Py_Dealloc, depositing at depth 50 (matches the old_PyTrash_UNWIND_LEVEL). The stack-pointer margin stays the primary signal; the counter only kicks in when SP can't.One thing I missed on the first pass: the drain side has the same problem.
margin >= 4is also true in this scenario, so every_Py_Deallocexit would re-enter_PyTrash_thread_destroy_chainfrom inside the deallocator it just called. Fix:destroy_chainbumps the counter for the duration of the drain (same idea as the olddelete_nesting), and the drain check in_Py_Deallocbecomes justc_dealloc_depth == 0. Counter is only touched for GC types so non-GC types pay nothing.Tests: new
test_trashcan_nested_tuple_deepintest_gc.py(100k nested tuples, must clean up without crash);test_gc test_exceptions test_capiall green locally. Docker repro now exits 1 withMemoryErrorinstead of 139 withSegmentation fault.#150722 is the same SP-signal blindness in the comparison path (
Py_EnterRecursiveCall) — separate code path, separate PR.