deref patterns: lower deref patterns to MIR
This lowers deref patterns to MIR. This is a bit tricky because this is the first kind of pattern that requires storing a value in a temporary. Thanks to https://github.com/rust-lang/rust/pull/123324 false edges are no longer a problem.
The thing I'm not confident about is the handling of fake borrows. This PR ignores any fake borrows inside a deref pattern. We are guaranteed to at least fake borrow the place of the first pointer value, which could be enough, but I'm not certain.
Disable SimplifyToExp in MatchBranchSimplification
Due to the miscompilation mentioned in #124150, We need to disable MatchBranchSimplification temporarily.
To fully resolve this issue, my plan is:
1. Disable SimplifyToExp in MatchBranchSimplification (this PR).
2. Remove all potentially unclear transforms in #124122.
3. Gradually add back the removed transforms (possibly multiple PRs).
r? `@Nilstrieb` or `@oli-obk`
Make `checked` ops emit *unchecked* LLVM operations where feasible
For things with easily pre-checked overflow conditions -- shifts and unsigned subtraction -- write the checked methods in such a way that we stop emitting wrapping versions of them.
For example, today <https://rust.godbolt.org/z/qM9YK8Txb> neither
```rust
a.checked_sub(b).unwrap()
```
nor
```rust
a.checked_sub(b).unwrap_unchecked()
```
actually optimizes to `sub nuw`. After this PR they do.
cc #103299
For things with easily pre-checked overflow conditions -- shifts and unsigned subtraction -- write then checked methods in such a way that we stop emitting wrapping versions of them.
For example, today <https://rust.godbolt.org/z/qM9YK8Txb> neither
```rust
a.checked_sub(b).unwrap()
```
nor
```rust
a.checked_sub(b).unwrap_unchecked()
```
actually optimizes to `sub nuw`. After this PR they do.
Re-enable the early otherwise branch optimization
Closes#95162. Fixes#119014.
This is the first part of #121397.
An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement. This should have a pass to hoist instructions.
r? cjgillot
Pass list of defineable opaque types into canonical queries
This eliminates `DefiningAnchor::Bubble` for good and brings the old solver closer to the new one wrt cycles and nested obligations. At that point the difference between `DefiningAnchor::Bind([])` and `DefiningAnchor::Error` was academic. We only used the difference for some sanity checks, which actually had to be worked around in places, so I just removed `DefiningAnchor` entirely and just stored the list of opaques that may be defined.
fixes#108498
fixes https://github.com/rust-lang/rust/issues/116877
* [x] run crater
- https://github.com/rust-lang/rust/pull/122077#issuecomment-2013293931
match lowering: make false edges more precise
When lowering match expressions, we add false edges to hide details of the lowering from borrowck. Morally we pretend we're testing the patterns (and guards) one after the other in order. See the tests for examples. Problem is, the way we implement this today is too coarse for deref patterns.
In deref patterns, a pattern like `deref [1, x]` matches on a `Vec` by creating a temporary to store the output of the call to `deref()` and then uses that to continue matching. Here the pattern has a binding, which we set up after the pre-binding block. Problem is, currently the false edges tell borrowck that the pre-binding block can be reached from a previous arm as well, so the `deref()` temporary may not be initialized. This triggers an error when we try to use the binding `x`.
We could call `deref()` a second time, but this opens the door to soundness issues if the deref impl is weird. Instead in this PR I rework false edges a little bit.
What we need from false edges is a (fake) path from each candidate to the next, specifically from candidate C's pre-binding block to next candidate D's pre-binding block. Today, we link the pre-binding blocks directly. In this PR, I link them indirectly by choosing an earlier node on D's success path. Specifically, I choose the earliest block on D's success path that doesn't make a loop (if I chose e.g. the start block of the whole match (which is on the success path of all candidates), that would make a loop). This turns out to be rather straightforward to implement.
r? `@matthewjasper` if you have the bandwidth, otherwise let me know
Rename `UninhabitedEnumBranching` to `UnreachableEnumBranching`
Per [#120268](https://github.com/rust-lang/rust/pull/120268#discussion_r1517492060), I rename `UninhabitedEnumBranching` to `UnreachableEnumBranching` .
I solved some nits to add some comments.
I adjusted the workaround restrictions. This should be useful for `a <= b` and `if let Some/Ok(v)`. For enum with few variants, `early-tailduplication` should not cause compile time overhead.
r? RalfJung
rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
De-LLVM the unchecked shifts [MCP#693]
This is just one part of the MCP (https://github.com/rust-lang/compiler-team/issues/693), but it's the one that IMHO removes the most noise from the standard library code.
Seems net simpler this way, since MIR already supported heterogeneous shifts anyway, and thus it's not more work for backends than before.
r? WaffleLapkin
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
This is just one part of the MCP, but it's the one that IMHO removes the most noise from the standard library code.
Seems net simpler this way, since MIR already supported heterogeneous shifts anyway, and thus it's not more work for backends than before.
In `ConstructCoroutineInClosureShim`, pass receiver by mut ref, not mut pointer
The receivers were compatible at codegen time, but did not necessarily have the same layouts due to niches, which was caught by miri.
Fixesrust-lang/miri#3400
r? oli-obk