Prepare NonNull for pattern types by oli-obk · Pull Request #152702 · rust-lang/rust

oli-obk · 2026-02-16T10:21:47Z

View all comments

Pull out the changes that affect some tests, but do not require pattern types.

See #136006 (comment) for what triggered this PR

r? @scottmcm

oli-obk · 2026-02-16T10:22:00Z

@bors try @rust-timer queue

Prepare NonNull for pattern types

rust-bors · 2026-02-16T12:32:40Z

☀️ Try build successful (CI)
Build commit: ea5f33e (ea5f33ec13511047f4f28548657d6f308b541315, parent: fef627b1ebdc7369ddf8a4031a5d25733ac1fb34)

rust-timer · 2026-02-16T13:12:45Z

Finished benchmarking commit (ea5f33e): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	1.3%	[1.3%, 1.3%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.5%, -0.5%]	1
Improvements ✅ (secondary)	-0.5%	[-0.5%, -0.5%]	2
All ❌✅ (primary)	0.4%	[-0.5%, 1.3%]	2

Max RSS (memory usage)

Results (primary -0.7%, secondary 8.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	4.4%	[4.4%, 4.4%]	1
Regressions ❌ (secondary)	8.5%	[8.5%, 8.5%]	1
Improvements ✅ (primary)	-3.3%	[-3.8%, -2.7%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.7%	[-3.8%, 4.4%]	3

Cycles

Results (secondary 0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.9%	[2.9%, 2.9%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.2%	[-2.2%, -2.2%]	1
All ❌✅ (primary)	-	-	0

Binary size

Results (primary 0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.6%, 0.6%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.6%	[0.6%, 0.6%]	1

Bootstrap: 481.396s -> 480.639s (-0.16%)
Artifact size: 397.80 MiB -> 395.76 MiB (-0.51%)

scottmcm

This basically looks good, but did leave me with two thoughts.

I have no idea how important the !align metadata is. Might be ok to just file a follow-up bug to bring it back? Dunno how hard it would be to restore.

View changes since this review

scottmcm · 2026-02-17T01:20:27Z

library/core/src/ptr/non_null.rs

+        let pointer: *const T = crate::ptr::without_provenance(addr.get());
        // SAFETY: we know `addr` is non-zero.
-        unsafe { NonNull { pointer } }
+        unsafe { NonNull { pointer: transmute(pointer) } }


up to you: if this is going to transmute anyway, maybe just transmute directly to NonNull?

Going through the usize then the pointer then the aggregate doesn't seem particularly meaningful; maybe just go straight NonZero→NonNull if there needs to be a transmute here regardless?

I also tried going from *const T to NonNull directly without going through the field constructor. It causes slightly worse MIR, because we don't really handle things like cast well in that case (NonZero transmute to raw pointer, cast, transmute back to NonZero).

scottmcm · 2026-02-17T01:26:31Z

tests/codegen-llvm/loads.rs

 #[no_mangle]
 pub fn load_box<'a>(x: Box<Box<i32>>) -> Box<i32> {
-    // CHECK: load ptr, ptr %{{.*}}, align [[PTR_ALIGNMENT]], !nonnull !{{[0-9]+}}, !align ![[ALIGN_4_META]], !noundef !{{[0-9]+}}
+    // CHECK: load ptr, ptr %{{.*}}, align [[PTR_ALIGNMENT]], !nonnull !{{[0-9]+}}, !noundef !{{[0-9]+}}


Hmm, if this is showing up here with just the nonnull change it looks like whatever emits this !align needs an update to see through pattern types, maybe?

This PR doesn't use pattern types yet 😅 so the issue is likely that we lose the information somewhere in the transmute. Lemme try your other suggestion, which maybe LLVM can preserve things if things are simpler

Oh, doh, of course it doesn't use pattern types 🤦

Note that this is a -C no-prepopulate-passes test, though, so it's only testing things that we in cg_llvm actually emit for the loads. So this implies that it's loading as a pointer instead of as a Box, somehow, I guess?

(I was going to say "well if it's just llvm adding this maybe I'm not worried about it", but if it's us in cg_llvm adding it then presumably it's worth trying to keep it working.)

Hmm, thinking more about what you said about the transmute, I wonder if this is something I regressed a long time ago with moving transmute to being primitive instead of always going through memory. Box<Box<i32>> and Box<i32> are both BackendAbi::Scalar, so maybe something is passed through a transmute now instead of being loaded from an alloca as the Box which would be where the !align on the load would be coming from...

I think that pushes me to just r=me for this stuff, as if it's an existing "oh this combination isn't great" that's not this PR's problem, and we can reoptimize it later as needed.

Yea, seems like there is quite some space to expwriment with. I thought about creating an issue, but it seems too vague to be very useful

oli-obk · 2026-02-17T10:52:13Z

tests/mir-opt/dont_reset_cast_kind_without_updating_operand.test.GVN.32bit.panic-abort.diff

+          _24 = copy _25 as *mut u8 (PtrToPtr);
+          StorageDead(_25);
+          _23 = copy _24 as std::ptr::NonNull<u8> (Transmute);
+-         StorageDead(_24);
+         nop;
+          _13 = copy _23 as *mut u8 (Transmute);


I think we need to teach gvn about skipping transmutes that go back and forth 😆

ah... we explicitly skip them with transmute_may_have_niche_of_interest_to_backend. Maybe we could unconditionally fold them, but also emit an assume?

Yeah, that's harder in GVN because GVN doesn't like updating things other than the value. (Also because of the fat-to-thin pointer cast. If fat pointers were just a struct things could be much easier here since part of the complication to folding is the transmute of the fat pointer.)

Musing: Maybe we always fold in GVN, but change the "remove unnecessary statements" code to emit an assume for an "unnecessary" transmute that creates a validity restriction? Dunno where the best place to do this would be.

(Of course, the other thing I would like is just a native "non-null with const-generic alignment" opaque pointer type. Whenever I see MIR needing to worry about *mut u8 vs *const u8 or about *mut u8 vs *mut () I get sad.)

Maybe we always fold in GVN, but change the "remove unnecessary statements" code to emit an assume for an "unnecessary" transmute that creates a validity restriction? Dunno where the best place to do this would be.

Oh that does seem feasible. I'll play with that

oli-obk · 2026-02-19T10:47:49Z

Oh lol no wonder "all" tests passed for me. It's x86 only and wasn't run at all

scottmcm

This looks good. r=me if you're ready; any mir-opt change to clean things up better would probably be better as a different PR.

View changes since this review

scottmcm · 2026-02-19T20:17:55Z

...ir-opt/pre-codegen/drop_boxed_slice.generic_in_place.PreCodegen.after.32bit.panic-unwind.mir

+        StorageDead(_8);
+        StorageLive(_10);
+        _10 = copy _9 as *mut u8 (Transmute);
+        _11 = alloc::alloc::__rust_dealloc(move _10, move _5, move _7) -> [return: bb3, unwind unreachable];


Not this PR: Come to think of it, one way to avoid the extra transmute here might be to just have __rust_dealloc take NonNull. Miri says it's UB to pass a null to it anyway.

EDIT: Opened this as #153251

oh, yea! Putting it on the list

AKA basically what I just did for Alignment in #152605

This would need a bit more work, because all we end up with then is

_2 = copy (((*_1).0: std::ptr::Unique<[u8; 1024]>).0: std::ptr::NonNull<[u8; 1024]>); _3 = copy _2 as std::ptr::NonNull<u8> (Transmute); _4 = alloc::alloc::__rust_dealloc(move _3, const 1024_usize, const std::ptr::Alignment {{ _inner_repr_trick: std::ptr::alignment::AlignmentEnum::_Align1Shl0 }}) -> [return: bb1, unwind unreachable];

oli-obk · 2026-02-19T20:29:03Z

@bors r=scottmcm

rust-bors · 2026-02-19T20:29:06Z

📌 Commit e8c6d13 has been approved by scottmcm

It is now in the queue for this repository.

@scottmcm

Prepare NonNull for pattern types Pull out the changes that affect some tests, but do not require pattern types. See #136006 (comment) for what triggered this PR r? @scottmcm

rust-bors · 2026-02-21T20:19:29Z

💔 Test for bfc9778 failed: CI. Failed job:

auto - x86_64-gnu-llvm-20-3 (web logs, enhanced plaintext logs)

rustbot · 2026-02-24T08:50:14Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

oli-obk · 2026-02-24T09:20:12Z

@bors r=scottmcm

rust-bors · 2026-02-24T09:20:15Z

📌 Commit 241cd7e has been approved by scottmcm

It is now in the queue for this repository.

rust-bors · 2026-02-27T16:15:59Z

☀️ Test successful - CI
Approved by: scottmcm
Duration: 3h 11m 6s
Pushing 3a70d03 to main...

github-actions · 2026-02-27T16:19:28Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 6f54d59 (parent) -> 3a70d03 (this PR)

Test differences

Show 110 test diffs

110 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 3a70d0349fa378a10c3748f1a48742e61505020f --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

dist-aarch64-apple: 2h 23m -> 1h 46m (-25.5%)
dist-x86_64-apple: 1h 56m -> 2h 20m (+20.6%)
dist-apple-various: 1h 56m -> 2h 8m (+10.2%)
x86_64-gnu-llvm-21-1: 1h 13m -> 1h 6m (-9.5%)
aarch64-gnu-debug: 1h 11m -> 1h 5m (-9.2%)
dist-powerpc64-linux-musl: 1h 36m -> 1h 28m (-8.7%)
x86_64-gnu-llvm-20-3: 2h 1m -> 1h 52m (-7.2%)
aarch64-apple: 3h 18m -> 3h 4m (-7.2%)
pr-check-1: 30m 1s -> 27m 57s (-6.9%)
aarch64-gnu-llvm-20-1: 59m 43s -> 55m 39s (-6.8%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2026-02-27T17:33:07Z

Finished benchmarking commit (3a70d03): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

If the regression was expected or you think it can be justified,
please write a comment with sufficient written justification, and add
@rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
If you think that you know of a way to resolve the regression, try to create
a new PR with a fix for the regression.
If you do not understand the regression or you think that it is just noise,
you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.4%, 0.6%]	3
Regressions ❌ (secondary)	0.5%	[0.5%, 0.5%]	1
Improvements ✅ (primary)	-0.3%	[-0.5%, -0.2%]	3
Improvements ✅ (secondary)	-1.1%	[-1.1%, -1.1%]	1
All ❌✅ (primary)	0.1%	[-0.5%, 0.6%]	6

Max RSS (memory usage)

Results (primary 1.1%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	3.5%	[2.0%, 5.3%]	4
Regressions ❌ (secondary)	2.9%	[2.9%, 2.9%]	1
Improvements ✅ (primary)	-2.0%	[-2.6%, -1.4%]	3
Improvements ✅ (secondary)	-2.8%	[-2.8%, -2.8%]	1
All ❌✅ (primary)	1.1%	[-2.6%, 5.3%]	7

Cycles

Results (primary 2.9%, secondary -3.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.9%	[2.6%, 3.1%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.5%	[-4.2%, -2.4%]	4
All ❌✅ (primary)	2.9%	[2.6%, 3.1%]	2

Binary size

Results (primary -0.0%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.0%, 0.5%]	5
Regressions ❌ (secondary)	0.4%	[0.3%, 0.5%]	4
Improvements ✅ (primary)	-0.4%	[-0.9%, -0.0%]	4
Improvements ✅ (secondary)	-0.3%	[-0.7%, -0.0%]	5
All ❌✅ (primary)	-0.0%	[-0.9%, 0.5%]	9

Bootstrap: 481.449s -> 480.016s (-0.30%)
Artifact size: 397.59 MiB -> 397.55 MiB (-0.01%)

rustbot assigned scottmcm Feb 16, 2026

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 16, 2026