gh-151436: Fix missing tstate->last_profiled_frame updates #151437

Merged
pablogsal merged 4 commits into python:main from maurycy:gh-151436 on Jun 17, 2026

@maurycy maurycy commented Jun 13, 2026

Contributor

Please see #151436 for more details.

The issue is fixed, drv_a is back:

def gen():
    while True:
        n = yield
        sum(range(n))

g = gen()
next(g)

def drv_a(): g.send(500)
def drv_b(): g.send(50000)

while True:
    drv_a(); drv_b()
2026-06-13T11:47:41.099330000+0200 maurycy@gimel /Users/maurycy/work/cpython (gh-151436 4eab5a0?) % sudo ./python.exe -m profiling.sampling run --collapsed -o /tmp/stacks.txt -d  10 genrepro.py&& cat /tmp/stacks.txt
Captured 10,000 samples in 10.00 seconds
Sample rate: 999.99 samples/sec
Error rate: 0.01
Collapsed stack output written to /tmp/stacks.txt
tid:13799271;<frozen runpy>:_run_module_as_main:201;<frozen runpy>:_run_code:87;genrepro.py:<module>:13;genrepro.py:drv_b:10;genrepro.py:gen:4 9945
tid:13799271;<frozen runpy>:_run_module_as_main:201;<frozen runpy>:_run_code:87;genrepro.py:<module>:13;genrepro.py:drv_a:9;genrepro.py:gen:4 48
tid:13799271;<frozen runpy>:_run_module_as_main:201;<frozen runpy>:_run_code:87;genrepro.py:<module>:13;genrepro.py:drv_b:10;genrepro.py:gen:3 3
tid:13799271;<frozen runpy>:_run_module_as_main:201;<frozen runpy>:_run_code:87;genrepro.py:<module>:13 1
tid:13799271;<frozen runpy>:_run_module_as_main:201;<frozen runpy>:_run_code:87;genrepro.py:<module>:13;genrepro.py:drv_a:9 1
tid:13799271;<frozen runpy>:_run_module_as_main:201;<frozen runpy>:_run_code:87;genrepro.py:<module>:13;genrepro.py:drv_b:10 1
2026-06-13T11:47:51.429974000+0200 maurycy@gimel /Users/maurycy/work/cpython (gh-151436 4eab5a0?) % 

A drv_a share of about 0.5% (48/9996 = 0.48% of the samples inside gen) is roughly correct, since the time shouldn't scale exactly linearly with n.
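
The 48/9996 figure can be re-derived from the collapsed output. The sketch below is illustrative only: `parse_collapsed` and `gen_samples_by_driver` are hypothetical helpers (not part of `profiling.sampling`), and the stacks are abbreviated to the frames from `genrepro.py:<module>:13` onward, with the counts copied unchanged.

```python
# Collapsed-stack lines from the run above, abbreviated to the frames starting
# at genrepro.py:<module>:13 (an assumption made for readability; counts are as reported).
COLLAPSED = """\
genrepro.py:<module>:13;genrepro.py:drv_b:10;genrepro.py:gen:4 9945
genrepro.py:<module>:13;genrepro.py:drv_a:9;genrepro.py:gen:4 48
genrepro.py:<module>:13;genrepro.py:drv_b:10;genrepro.py:gen:3 3
genrepro.py:<module>:13 1
genrepro.py:<module>:13;genrepro.py:drv_a:9 1
genrepro.py:<module>:13;genrepro.py:drv_b:10 1
"""


def parse_collapsed(text):
    """Yield (frames, count) pairs from 'frame;frame;... count' lines."""
    for line in text.splitlines():
        if not line.strip():
            continue
        stack, _, count = line.rpartition(" ")
        yield stack.split(";"), int(count)


def gen_samples_by_driver(text):
    """Count samples whose leaf frame is gen(), keyed by the calling driver."""
    totals = {}
    for frames, count in parse_collapsed(text):
        if ":gen:" not in frames[-1]:
            continue  # sample was not taken inside the generator
        caller = frames[-2].split(":")[1]  # 'genrepro.py:drv_a:9' -> 'drv_a'
        totals[caller] = totals.get(caller, 0) + count
    return totals


totals = gen_samples_by_driver(COLLAPSED)
in_gen = sum(totals.values())
share_a = totals["drv_a"] / in_gen
print(totals, in_gen, f"{share_a:.2%}")
```

Before the fix, the `drv_a` row was missing from this breakdown, so a non-zero share that is small relative to `drv_b` is the expected shape of the result.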

I can try to come up with a reproduction for each missing update.

Should it be a helper?

If you ask me, the whole mechanism smells of ABA...

@maurycy maurycy marked this pull request as ready for review June 13, 2026 09:58
@maurycy maurycy requested a review from markshannon as a code owner June 13, 2026 09:58

maurycy commented Jun 13, 2026

Contributor Author

cc @pablogsal @savannahostrowski

@pablogsal pablogsal added the "needs backport to 3.15" label (pre-release feature fixes, bugs and security fixes) Jun 17, 2026

pablogsal commented Jun 17, 2026

Member

> Should it be a helper?

Yes, this should be a macro.

> If you ask me, the whole mechanism smells of ABA...

Yeah, I think it is still possible to have this sequence of events:

  1. Profiler samples frame address A and caches stack under it.
  2. Target yields/pops/removes that frame.
  3. Target or allocator later has a different logical frame at address A, or the profiler races and writes stale A back after the target already moved on.
  4. Profiler sees A again and says “cache hit”.
  5. It reuses stale callers for a different logical stack.

We can think about how to tackle this in a separate issue; there are some mitigations we can apply.
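
The five-step sequence above can be mimicked with a toy model. This is only a sketch of the ABA pattern, not CPython's profiler: `FrameCache`, `sample`, and `ADDR_A` are hypothetical names, and the "address" is a made-up integer.

```python
class FrameCache:
    """Toy stack cache keyed only by a frame's address (hypothetical)."""

    def __init__(self):
        self._callers_by_addr = {}

    def sample(self, addr, walk_callers):
        """Return (callers, was_cache_hit); walk_callers is the slow path."""
        if addr in self._callers_by_addr:
            # The address matches, so the cache assumes the same logical frame.
            return self._callers_by_addr[addr], True
        callers = walk_callers()
        self._callers_by_addr[addr] = callers
        return callers, False


ADDR_A = 0x1000  # made-up frame address

cache = FrameCache()

# Step 1: the profiler samples the frame at A while it runs under drv_a.
first, hit1 = cache.sample(ADDR_A, lambda: ["<module>", "drv_a"])

# Steps 2-3: that frame goes away and a different logical frame, now called
# from drv_b, ends up at the same address. Nothing tells the cache about it.

# Steps 4-5: the profiler sees A again, reports a hit, and reuses stale callers.
second, hit2 = cache.sample(ADDR_A, lambda: ["<module>", "drv_b"])

print(hit1, first)
print(hit2, second)
```

The second sample reports `drv_a` as the caller even though the real caller is `drv_b`, because an address match alone cannot distinguish the old frame from the new one.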

pablogsal commented Jun 17, 2026

Member

I have pushed 1348756, making this a macro.

@pablogsal pablogsal enabled auto-merge (squash) June 17, 2026 19:02
@pablogsal pablogsal disabled auto-merge June 17, 2026 19:21
@pablogsal pablogsal merged commit a8d74c0 into python:main Jun 17, 2026
77 checks passed
@miss-islington-app

Thanks @maurycy for the PR, and @pablogsal for merging it 🌮🎉.. I'm working now to backport this PR to: 3.15.
🐍🍒⛏🤖

bedevere-app Bot commented Jun 17, 2026

GH-151612 is a backport of this pull request to the 3.15 branch.

@bedevere-app bedevere-app Bot removed the "needs backport to 3.15" label (pre-release feature fixes, bugs and security fixes) Jun 17, 2026
@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot aarch64 Android 3.x (tier-3) has failed when building commit a8d74c0.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1594/builds/5113) and take a look at the build logs.
  4. Check if the failure is related to this commit (a8d74c0) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1594/builds/5113

Summary of the results of the build (if available):

remote: Enumerating objects: 20, done.        
remote: Counting objects: 100% (12/12), done.        
remote: Compressing objects: 100% (6/6), done.        
remote: Total 20 (delta 8), reused 6 (delta 6), pack-reused 8 (from 2)        
From https://github.com/python/cpython
 * branch                    main       -> FETCH_HEAD
Note: switching to 'a8d74c062fe3c5cb2962dde8bee83704fcfa1bc9'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at a8d74c062fe gh-151436: Fix missing `tstate->last_profiled_frame` updates (#151437)
Switched to and reset branch 'main'

configure: WARNING: no system libmpdec found; falling back to pure-Python version for the decimal module
configure: WARNING: pkg-config is missing. Some dependencies may not be detected correctly.

../../configure: line 4112: pkg-config: command not found
configure: WARNING: no system libmpdec found; falling back to pure-Python version for the decimal module
configure: WARNING: pkg-config is missing. Some dependencies may not be detected correctly.

../../Python/fileutils.c:458:1: warning: unused function 'decode_current_locale' [-Wunused-function]
  458 | decode_current_locale(const char* arg, wchar_t **wstr, size_t *wlen,
      | ^~~~~~~~~~~~~~~~~~~~~
../../Python/fileutils.c:677:1: warning: unused function 'encode_current_locale' [-Wunused-function]
  677 | encode_current_locale(const wchar_t *text, char **str,
      | ^~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
../../Modules/_localemodule.c:195:1: warning: unused function 'is_all_ascii' [-Wunused-function]
  195 | is_all_ascii(const char *str)
      | ^~~~~~~~~~~~
1 warning generated.
../../Modules/pwdmodule.c:69:16: warning: unused variable 'pwd_db_mutex' [-Wunused-variable]
   69 | static PyMutex pwd_db_mutex = {0};
      |                ^~~~~~~~~~~~
1 warning generated.
../../Modules/_hacl/Lib_Memzero0.c:66:6: warning: "Your platform does not support any safe implementation of memzero -- consider a pull request!" [-W#warnings]
   66 |     #warning "Your platform does not support any safe implementation of memzero -- consider a pull request!"
      |      ^
1 warning generated.

  + Exception Group Traceback (most recent call last):
  |   File "<frozen runpy>", line 198, in _run_module_as_main
  |   File "<frozen runpy>", line 88, in _run_code
  |   File "/Users/android/buildarea/3.x.mhsmith-android-aarch64/build/Platforms/Android/__main__.py", line 1059, in <module>
  |     main()
  |   File "/Users/android/buildarea/3.x.mhsmith-android-aarch64/build/Platforms/Android/__main__.py", line 1035, in main
  |     asyncio.run(result)
  |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 194, in run
  |     return runner.run(main)
  |            ^^^^^^^^^^^^^^^^
  |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 118, in run
  |     return self._loop.run_until_complete(task)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
  |     return future.result()
  |            ^^^^^^^^^^^^^^^
  |   File "/Users/android/buildarea/3.x.mhsmith-android-aarch64/build/Platforms/Android/__main__.py", line 726, in run_testbed
  |     async with asyncio.TaskGroup() as tg:
  |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/taskgroups.py", line 145, in __aexit__
  |     raise me from None
  | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    |     return await fut
    |            ^^^^^^^^^
    |   File "/Users/android/buildarea/3.x.mhsmith-android-aarch64/build/Platforms/Android/__main__.py", line 488, in find_device
    |     await asyncio.sleep(1)
    |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/tasks.py", line 665, in sleep
    |     return await future
    |            ^^^^^^^^^^^^
    | asyncio.exceptions.CancelledError
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File "/Users/android/buildarea/3.x.mhsmith-android-aarch64/build/Platforms/Android/__main__.py", line 541, in logcat_task
    |     serial = await wait_for(find_device(context, initial_devices), startup_timeout)
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/tasks.py", line 519, in wait_for
    |     async with timeouts.timeout(timeout):
    |   File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/timeouts.py", line 115, in __aexit__
    |     raise TimeoutError from exc_val
    | TimeoutError
    +------------------------------------

3 participants