Commit graph

4915 commits

Author SHA1 Message Date
Cuong Manh Le 0b65b02ba5 cmd/compile: fix clear on slice with zero size elem
Fixed #61127

Change-Id: If07b04ebcc98438c66f273c0c94bea1f230dc2e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/507535
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-07-10 16:31:54 +00:00
Keith Randall c4db811e44 cmd/compile: don't ICE on unaligned offsets for pointer writes
User code is unlikely to be correct, but don't crash the compiler
when the offset of a pointer in an object is not a multiple of the
pointer size.

Fixes #61187

Change-Id: Ie56bfcb38556c5dd6f702ae4ec1d4534c6acd420
Reviewed-on: https://go-review.googlesource.com/c/go/+/508555
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-07-10 16:29:42 +00:00
Meng Zhuo 3fce111535 cmd/compile: fix FMA negative commutativity of riscv64
According to RISCV manual 11.6:

FMADD x,y,z computes x*y+z and
FNMADD x,y,z => -x*y-z
FMSUB x,y,z => x*y-z
FNMSUB x,y,z => -x*y+z respectively

However our implement of SSA convert FMADD -x,y,z to FNMADD x,y,z which
is wrong and should be convert to FNMSUB according to manual.

Change-Id: Ib297bc83824e121fd7dda171ed56ea9694a4e575
Reviewed-on: https://go-review.googlesource.com/c/go/+/506575
Run-TryBot: M Zhuo <mzh@golangcn.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Michael Munday <mike.munday@lowrisc.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-07-05 22:05:44 +00:00
Meng Zhuo 5c1a15df41 test/codegen: enable Mul2 DivPow2 test for riscv64
Change-Id: Ice0bb7a665599b334e927a1b00d1a5b400c15e3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/506035
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2023-07-04 13:33:45 +00:00
Matthew Dempsky 79d4defa75 cmd/compile/internal/ssagen: fix min/max codegen, again
The large-function phi placement algorithm evidently doesn't like the
same pseudo-variable being used to represent expressions of varying
types.

Instead, use the same tactic as used for "valVar" (ssa.go:6585--6587),
which is to just generate a fresh marker node each time.

Maybe we could just use the OMIN/OMAX nodes themselves as the key
(like we do for OANDAND/OOROR), but that just seems needlessly risky
for negligible memory savings. Using fresh marker values each time
seems obviously safe by comparison.

Fixes #61041.

Change-Id: Ie2600c9c37b599c2e26ae01f5f8a433025d7fd08
Reviewed-on: https://go-review.googlesource.com/c/go/+/506679
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-06-28 16:07:47 +00:00
Cuong Manh Le 4ad4128d3c cmd/compile: fix bad order of evaluation for min/max builtin
For float or string, min/max builtin performs a runtime call, so we need
to save its result to temporary variable. Otherwise, the runtime call
will clobber closure's arguments currently on the stack when passing
min/max as argument to closures.

Fixes #60990

Change-Id: I1397800f815ec7853182868678d0f760b22afff2
Reviewed-on: https://go-review.googlesource.com/c/go/+/506115
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-06-27 18:18:23 +00:00
Cuong Manh Le b3ca8d2b3c types2, go/types: record final type for min/max arguments
Fixes #60991

Change-Id: I6130ccecbdc209996dbb376491be9df3b8988327
Reviewed-on: https://go-review.googlesource.com/c/go/+/506055
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-06-26 17:08:05 +00:00
Keith Randall a031f4ef83 cmd/compile: fix min/max builtin code generation
Our large-function phi placement algorithm is incompatible with phi
opcodes already existing in the SSA representation. Instead, use simple
variable assignments and have the phi placement algorithm place the phis
we need for min/max.

Turns out the small-function phi placement algorithm doesn't have this
sensitivity, so this bug only occurs in large functions (>500 basic blocks).

Maybe we should document/check that no phis are present when we start
phi placement (regardless of size).  Leaving for a potential separate CL.

We should probably also fix the placement algorithm to handle existing
phis correctly.  But this CL is probably a lot smaller/safer than
messing with phi placement.

Fixes #60982

Change-Id: I59ba7f506c72b22bc1485099a335d96315ebef67
Reviewed-on: https://go-review.googlesource.com/c/go/+/505756
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-06-24 05:24:25 +00:00
Cuong Manh Le 6dce882b3a cmd/compile: scanning closures body when visiting wrapper function
CL 410344 fixed missing method value wrapper, by visiting body of
wrapper function after applying inlining pass.

CL 492017 allow more inlining of functions that construct closures,
which ends up making the wrapper function now inlineable, but can
contain closure nodes that couldn't be inlined. These closures body may
contain OMETHVALUE nodes that we never seen, thus we need to scan
closures body for finding them.

Fixes #60945

Change-Id: Ia1e31420bb172ff87d7321d2da2989ef23e6ebb6
Reviewed-on: https://go-review.googlesource.com/c/go/+/505255
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-06-23 14:29:16 +00:00
Ian Lance Taylor 633b742ae0 test: add test that caused a gofrontend crash
Change-Id: Idd872c5b90dbca564ed8a37bb3683e642142ae63
Reviewed-on: https://go-review.googlesource.com/c/go/+/505015
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
2023-06-21 21:30:46 +00:00
Michael Anthony Knyszek 2b0ff4b629 reflect: fix ArenaNew to match documentation
Currently ArenaNew expects the type passed in to be a *T and it returns
a *T. This does not match the function's documentation.

Since this is an experiment, change ArenaNew to match the documentation.
This more closely aligns ArenaNew with arena.New. (Takes a type T,
returns a *T value.)

Note that this is a breaking change. However, as far as pkg.go.dev can
tell, there's exactly one package using it in the open source world.

Also, add smoke test for the exported API, which is just a wrapper
around the internal API. Clearly there's enough room for error here that
it should be tested, but we don't need thorough tests at this layer
because that already exists in the runtime. We just need to make sure it
basically works.

Fixes #60528.

Change-Id: I673cc4609378380ef80648b0c2eb2928e73f49c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/501860
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-06-16 17:08:43 +00:00
Cuong Manh Le 60e6afb689 cmd/compile: do not report division by error during typecheck
types2 have already errored about any spec-required overflows, and
division by zero. CL 469595 unintentionally fixed typecheck not to error
about overflows, but zero division is still be checked during tcArith.
This causes unsafe operations with variable size failed to compile,
instead of raising runtime error.

Fixes #60601

Change-Id: I7bea2821099556835c920713397f7c5d8a4025ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/501735
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-06-15 18:41:09 +00:00
Meng Zhuo b7e7467865 test/codegen: add fsqrt test for riscv64
Add FSQRTD FSQRTS codegen tests for riscv64

Change-Id: I16ca3753ad1ba37afbd9d0f887b078e33f98fda0
Reviewed-on: https://go-review.googlesource.com/c/go/+/503275
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Run-TryBot: M Zhuo <mzh@golangcn.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-06-15 15:16:20 +00:00
Keith Randall c643b29381 cmd/compile: use callsite as line number for argument marshaling
Don't use the line number of the argument itself, as that may be from
arbitrarily earlier in the function.

Fixes #60673

Change-Id: Ifc0a2aaae221a256be3a4b0b2e04849bae4b79d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/502656
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-06-12 20:34:37 +00:00
Cuong Manh Le 96d16803c2 cmd/compile: allow ir.OMIN/ir.OMAX in mayCall
CL 496257 adds min/max builtins, which may appear as argument to a
function call, so it will be tested by mayCall. But those ops are not
handled by mayCall, causes the compiler crashes.

Fixes #60582

Change-Id: I729f10bf62b4aad39ffcb1433f576e74d09fdd9a
Reviewed-on: https://go-review.googlesource.com/c/go/+/500575
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-06-05 03:11:36 +00:00
Cherry Mui fdb3dc471d cmd/internal/obj/arm: handle HAUTO etc. in addpool
HAUTO should be handled the same way as other stack offsets for
adding to constant pool. Add the missing cases.

Fixes #57955.

Change-Id: If7fc82cafb2bbf0a6121e73e353b8825cb36b5bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/463138
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
2023-06-01 19:29:08 +00:00
Austin Clements c2e0bf0abf cmd/internal/testdir: pass if GOEXPERIMENT=cgocheck2 is set
Some testdir tests fail if GOEXPERIMENT=cgocheck2 is set. Fix this by
skipping these tests.

Change-Id: I58d4ef0cceb86bcf93220b4a44de9b9dc4879b16
Reviewed-on: https://go-review.googlesource.com/c/go/+/499675
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-06-01 18:30:44 +00:00
Robert Griesemer 1dd24d8216 go/types, types2: don't infer type argument for unused parameter in interfaces
Two interface types that are assignable don't have to be identical;
specifically, if they are defined types, they can be different
defined types. If those defined types specify type parameters which
are never used, do not infer a type argument based on the instantiation
of a matching defined type.

Adjusted three existing tests where we inferred type arguments incorrectly.

Fixes #60377.

Change-Id: I91fb207235424b3cbc42b5fd93eee619e7541cb7
Reviewed-on: https://go-review.googlesource.com/c/go/+/498315
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-25 21:37:01 +00:00
Derek Parker c5ba9d2232 cmd/compile: prioritize non-CALL struct member comparisons
This patch optimizes reflectdata.geneq to pick apart structs in array
equality and prioritize non-CALL comparisons over those which involve
a runtime function call. This is similar to how arrays of strings
operate currently. Instead of looping over the entire array of structs
once, if there are any comparisons which involve a runtime function
call we instead loop twice. The first loop is all simple, quick
comparisons. If no inequality is found in the first loop the second loop
calls runtime functions for larger memory comparison, which is more
expensive.

For the benchmarks added in this change:

Old:

```
goos: linux
goarch: amd64
pkg: cmd/compile/internal/reflectdata
cpu: AMD Ryzen 9 3950X 16-Core Processor
BenchmarkEqArrayOfStructsEq
BenchmarkEqArrayOfStructsEq-32            797196              1497 ns/op
BenchmarkEqArrayOfStructsEq-32            758332              1581 ns/op
BenchmarkEqArrayOfStructsEq-32            764871              1599 ns/op
BenchmarkEqArrayOfStructsEq-32            760706              1558 ns/op
BenchmarkEqArrayOfStructsEq-32            763112              1476 ns/op
BenchmarkEqArrayOfStructsEq-32            747696              1547 ns/op
BenchmarkEqArrayOfStructsEq-32            756526              1562 ns/op
BenchmarkEqArrayOfStructsEq-32            768829              1486 ns/op
BenchmarkEqArrayOfStructsEq-32            764248              1477 ns/op
BenchmarkEqArrayOfStructsEq-32            752767              1545 ns/op
BenchmarkEqArrayOfStructsNotEq
BenchmarkEqArrayOfStructsNotEq-32         757194              1542 ns/op
BenchmarkEqArrayOfStructsNotEq-32         748942              1552 ns/op
BenchmarkEqArrayOfStructsNotEq-32         766687              1554 ns/op
BenchmarkEqArrayOfStructsNotEq-32         732069              1541 ns/op
BenchmarkEqArrayOfStructsNotEq-32         759163              1576 ns/op
BenchmarkEqArrayOfStructsNotEq-32         796402              1629 ns/op
BenchmarkEqArrayOfStructsNotEq-32         726610              1570 ns/op
BenchmarkEqArrayOfStructsNotEq-32         735770              1584 ns/op
BenchmarkEqArrayOfStructsNotEq-32         745255              1610 ns/op
BenchmarkEqArrayOfStructsNotEq-32         743872              1591 ns/op
PASS
ok      cmd/compile/internal/reflectdata        35.446s
```

New:

```
goos: linux
goarch: amd64
pkg: cmd/compile/internal/reflectdata
cpu: AMD Ryzen 9 3950X 16-Core Processor
BenchmarkEqArrayOfStructsEq
BenchmarkEqArrayOfStructsEq-32            618379              1827 ns/op
BenchmarkEqArrayOfStructsEq-32            619368              1922 ns/op
BenchmarkEqArrayOfStructsEq-32            616023              1910 ns/op
BenchmarkEqArrayOfStructsEq-32            617575              1905 ns/op
BenchmarkEqArrayOfStructsEq-32            610399              1889 ns/op
BenchmarkEqArrayOfStructsEq-32            615378              1823 ns/op
BenchmarkEqArrayOfStructsEq-32            613732              1883 ns/op
BenchmarkEqArrayOfStructsEq-32            613924              1894 ns/op
BenchmarkEqArrayOfStructsEq-32            657799              1876 ns/op
BenchmarkEqArrayOfStructsEq-32            665580              1873 ns/op
BenchmarkEqArrayOfStructsNotEq
BenchmarkEqArrayOfStructsNotEq-32        1834915               627.4 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1806370               660.5 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1828075               625.5 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1819741               641.6 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1813128               632.3 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1865250               643.7 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1828617               632.8 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1862748               633.6 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1825432               638.7 ns/op
BenchmarkEqArrayOfStructsNotEq-32        1804382               628.8 ns/op
PASS
ok      cmd/compile/internal/reflectdata        36.571s
```

Benchstat comparison:

```
name                      old time/op  new time/op  delta
EqArrayOfStructsEq-32     1.53µs ± 4%  1.88µs ± 3%  +22.66%  (p=0.000 n=10+10)
EqArrayOfStructsNotEq-32  1.57µs ± 3%  0.64µs ± 4%  -59.59%  (p=0.000 n=10+10)
```

So, the equal case is a bit slower (unrolling the loop helps with that),
but the non-equal case is now much faster.

Change-Id: I05d776456c79c48a3d6d74b18c45246e58ffbea6
GitHub-Last-Rev: f57ee07d05
GitHub-Pull-Request: golang/go#59409
Reviewed-on: https://go-review.googlesource.com/c/go/+/481895
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2023-05-24 21:55:14 +00:00
Bryan Mills 02d234e34d Revert "cmd/compile: sparse conditional constant propagation"
This reverts CL 483875.

Reason for revert: appears to cause internal compiler errors on the ssacheck builder.

Change-Id: I662418384291470c1962c417797a5890dd9aa7a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/497855
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
2023-05-24 14:39:34 +00:00
Junxian Zhu f0d575c266 cmd/compile: optimize math.Float64(32)bits and math.Float64(32)frombits on mips64x
This CL use MFC1/MTC1 instructions to move data between GPR and FPR instead of stores and loads to move float/int values.

goos: linux
goarch: mips64le
pkg: math
                      │   oldmath    │              newmath               │
                      │    sec/op    │   sec/op     vs base               │
Acos-4                   258.2n ± 0%   258.2n ± 0%        ~ (p=0.859 n=8)
Acosh-4                  378.7n ± 0%   323.9n ± 0%  -14.47% (p=0.000 n=8)
Asin-4                   255.1n ± 2%   255.5n ± 0%   +0.16% (p=0.002 n=8)
Asinh-4                  407.1n ± 0%   348.7n ± 0%  -14.35% (p=0.000 n=8)
Atan-4                   189.5n ± 0%   189.9n ± 3%        ~ (p=0.205 n=8)
Atanh-4                  355.6n ± 0%   323.4n ± 2%   -9.03% (p=0.000 n=8)
Atan2-4                  284.1n ± 7%   280.1n ± 4%        ~ (p=0.313 n=8)
Cbrt-4                   314.3n ± 0%   236.4n ± 0%  -24.79% (p=0.000 n=8)
Ceil-4                   144.3n ± 3%   139.6n ± 0%        ~ (p=0.069 n=8)
Compare-4               21.100n ± 0%   7.035n ± 0%  -66.66% (p=0.000 n=8)
Compare32-4             20.100n ± 0%   6.030n ± 0%  -70.00% (p=0.000 n=8)
Copysign-4              34.970n ± 0%   6.221n ± 0%  -82.21% (p=0.000 n=8)
Cos-4                    183.4n ± 3%   184.1n ± 5%        ~ (p=0.159 n=8)
Cosh-4                   487.9n ± 2%   419.6n ± 0%  -14.00% (p=0.000 n=8)
Erf-4                    160.6n ± 0%   157.9n ± 0%   -1.68% (p=0.009 n=8)
Erfc-4                   183.7n ± 4%   169.8n ± 0%   -7.54% (p=0.000 n=8)
Erfinv-4                 191.5n ± 4%   183.6n ± 0%   -4.13% (p=0.023 n=8)
Erfcinv-4                192.0n ± 7%   184.3n ± 0%        ~ (p=0.425 n=8)
Exp-4                    398.2n ± 0%   340.1n ± 4%  -14.58% (p=0.000 n=8)
ExpGo-4                  383.3n ± 0%   327.3n ± 0%  -14.62% (p=0.000 n=8)
Expm1-4                  248.7n ± 5%   216.0n ± 0%  -13.11% (p=0.000 n=8)
Exp2-4                   372.8n ± 0%   316.9n ± 3%  -14.98% (p=0.000 n=8)
Exp2Go-4                 374.1n ± 0%   320.5n ± 0%  -14.33% (p=0.000 n=8)
Abs-4                    3.013n ± 0%   3.016n ± 0%   +0.10% (p=0.020 n=8)
Dim-4                    5.021n ± 0%   5.022n ± 0%        ~ (p=0.270 n=8)
Floor-4                  127.5n ± 4%   126.2n ± 3%        ~ (p=0.186 n=8)
Max-4                    72.32n ± 0%   61.33n ± 0%  -15.20% (p=0.000 n=8)
Min-4                    83.33n ± 1%   61.36n ± 0%  -26.37% (p=0.000 n=8)
Mod-4                    690.7n ± 0%   454.5n ± 0%  -34.20% (p=0.000 n=8)
Frexp-4                 116.30n ± 1%   71.80n ± 1%  -38.26% (p=0.000 n=8)
Gamma-4                  389.0n ± 0%   355.9n ± 1%   -8.48% (p=0.000 n=8)
Hypot-4                 102.40n ± 0%   83.90n ± 0%  -18.07% (p=0.000 n=8)
HypotGo-4               105.45n ± 4%   84.82n ± 2%  -19.56% (p=0.000 n=8)
Ilogb-4                  99.13n ± 4%   63.71n ± 2%  -35.73% (p=0.000 n=8)
J0-4                     859.7n ± 0%   854.8n ± 0%   -0.57% (p=0.000 n=8)
J1-4                     873.9n ± 0%   875.7n ± 0%   +0.21% (p=0.007 n=8)
Jn-4                     1.855µ ± 0%   1.867µ ± 0%   +0.65% (p=0.000 n=8)
Ldexp-4                 130.50n ± 2%   64.35n ± 0%  -50.69% (p=0.000 n=8)
Lgamma-4                 208.8n ± 0%   200.9n ± 0%   -3.78% (p=0.000 n=8)
Log-4                    294.1n ± 0%   255.2n ± 3%  -13.22% (p=0.000 n=8)
Logb-4                  105.45n ± 1%   66.81n ± 1%  -36.64% (p=0.000 n=8)
Log1p-4                  268.2n ± 0%   211.3n ± 0%  -21.21% (p=0.000 n=8)
Log10-4                  295.4n ± 0%   255.2n ± 2%  -13.59% (p=0.000 n=8)
Log2-4                   152.9n ± 1%   127.5n ± 0%  -16.61% (p=0.000 n=8)
Modf-4                  103.40n ± 0%   75.36n ± 0%  -27.12% (p=0.000 n=8)
Nextafter32-4           121.20n ± 1%   78.40n ± 0%  -35.31% (p=0.000 n=8)
Nextafter64-4           110.40n ± 1%   64.91n ± 0%  -41.20% (p=0.000 n=8)
PowInt-4                 509.8n ± 1%   369.3n ± 1%  -27.56% (p=0.000 n=8)
PowFrac-4               1189.0n ± 0%   947.8n ± 0%  -20.29% (p=0.000 n=8)
Pow10Pos-4               15.07n ± 0%   15.07n ± 0%        ~ (p=0.733 n=8)
Pow10Neg-4               20.10n ± 0%   20.10n ± 0%        ~ (p=0.576 n=8)
Round-4                  44.22n ± 0%   26.12n ± 0%  -40.92% (p=0.000 n=8)
RoundToEven-4            46.22n ± 0%   27.12n ± 0%  -41.31% (p=0.000 n=8)
Remainder-4              539.0n ± 1%   417.1n ± 1%  -22.62% (p=0.000 n=8)
Signbit-4               17.985n ± 0%   5.694n ± 0%  -68.34% (p=0.000 n=8)
Sin-4                    185.7n ± 5%   172.9n ± 0%   -6.89% (p=0.001 n=8)
Sincos-4                 176.6n ± 0%   200.9n ± 0%  +13.76% (p=0.000 n=8)
Sinh-4                   495.8n ± 0%   435.9n ± 0%  -12.09% (p=0.000 n=8)
SqrtIndirect-4           5.022n ± 0%   5.024n ± 0%        ~ (p=0.083 n=8)
SqrtLatency-4            8.038n ± 0%   8.044n ± 0%        ~ (p=0.524 n=8)
SqrtIndirectLatency-4    8.035n ± 0%   8.039n ± 0%   +0.06% (p=0.017 n=8)
SqrtGoLatency-4          340.1n ± 0%   278.3n ± 0%  -18.19% (p=0.000 n=8)
SqrtPrime-4              5.381µ ± 0%   5.386µ ± 0%        ~ (p=0.662 n=8)
Tan-4                    198.6n ± 1%   183.1n ± 0%   -7.85% (p=0.000 n=8)
Tanh-4                   491.3n ± 1%   440.8n ± 1%  -10.29% (p=0.000 n=8)
Trunc-4                  121.7n ± 0%   121.7n ± 0%        ~ (p=0.769 n=8)
Y0-4                     855.1n ± 0%   859.8n ± 0%   +0.54% (p=0.007 n=8)
Y1-4                     862.3n ± 0%   865.1n ± 0%   +0.32% (p=0.007 n=8)
Yn-4                     1.830µ ± 0%   1.837µ ± 0%   +0.36% (p=0.011 n=8)
Float64bits-4           13.060n ± 0%   3.016n ± 0%  -76.91% (p=0.000 n=8)
Float64frombits-4       13.060n ± 0%   3.018n ± 0%  -76.90% (p=0.000 n=8)
Float32bits-4           13.060n ± 0%   3.016n ± 0%  -76.91% (p=0.000 n=8)
Float32frombits-4       13.070n ± 0%   3.013n ± 0%  -76.94% (p=0.000 n=8)
FMA-4                    446.0n ± 0%   413.1n ± 1%   -7.38% (p=0.000 n=8)
geomean                  143.4n        108.3n       -24.49%

Change-Id: I2067f7a5ae1126ada7ab3fb2083710e8212535e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/493815
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
2023-05-24 03:36:31 +00:00
Yi Yang fa50248ce6 cmd/compile: sparse conditional constant propagation
sparse conditional constant propagation can discover optimization opportunities that cannot be found by just combining constant folding and constant propagation and dead code elimination separately.

Updates #59399

Change-Id: Ia954e906480654a6f0cc065d75b5912f96f36b2e
GitHub-Last-Rev: 90fc02db99
GitHub-Pull-Request: golang/go#59575
Reviewed-on: https://go-review.googlesource.com/c/go/+/483875
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2023-05-24 02:54:03 +00:00
Cuong Manh Le 35a71dc56d cmd/compile: avoid slicebytetostring call in len(string([]byte))
Change-Id: Ie04503e61400a793a6a29a4b58795254deabe472
Reviewed-on: https://go-review.googlesource.com/c/go/+/497276
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-05-23 19:27:38 +00:00
Cuong Manh Le 4e679e26a3 test: remove *_unified.go variants
CL 415241 and CL 411935 break tests into unified/nounified variants, for
compatibility with old frontend while developing unified IR. Now the old
frontend was gone, so moving those tests back to the original files.

Change-Id: Iecdcd4e6ee33c723f6ac02189b0be26248e15f0f
Reviewed-on: https://go-review.googlesource.com/c/go/+/497275
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-05-23 17:16:35 +00:00
Matthew Dempsky 7f1467ff4d cmd/compile: incorporate inlined function names into closure naming
In Go 1.17, cmd/compile gained the ability to inline calls to
functions that contain function literals (aka "closures"). This was
implemented by duplicating the function literal body and emitting a
second LSym, because in general it might be optimized better than the
original function literal.

However, the second LSym was named simply as any other function
literal appearing literally in the enclosing function would be named.
E.g., if f has a closure "f.funcX", and f is inlined into g, we would
create "g.funcY" (N.B., X and Y need not be the same.). Users then
have no idea this function originally came from f.

With this CL, the inlined call stack is incorporated into the clone
LSym's name: instead of "g.funcY", it's named "g.f.funcY".

In the future, it seems desirable to arrange for the clone's name to
appear exactly as the original name, so stack traces remain the same
as when -l or -d=inlfuncswithclosures are used. But it's unclear
whether the linker supports that today, or whether any downstream
tooling would be confused by this.

Updates #60324.

Change-Id: Ifad0ccef7e959e72005beeecdfffd872f63982f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/497137
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
2023-05-22 22:47:15 +00:00
Keith Randall bd3f44e4ff cmd/compile: constant-fold loads from constant dictionaries and types
Retrying the original CL with a small modification. The original CL
did not handle the case of reading an itab out of a dictionary
correctly.  When we read an itab out of a dictionary, we must treat
the type inside that itab as maybe being put in an interface.

Original CL: 486895
Revert CL: 490156

Change-Id: Id2dc1699d184cd8c63dac83986a70b60b4e6cbd7
Reviewed-on: https://go-review.googlesource.com/c/go/+/491495
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-19 18:10:11 +00:00
Robert Griesemer 956d31ecd5 cmd/compile: enable more lenient type inference for untyped arguments
This enables the implementation for proposal #58671, which is
a likely accept. By enabling it early we get a bit extra soak
time for this feature. The change can be reverted trivially, if
need be.

For #58671.

Change-Id: Id6c27515e45ff79f4f1d2fc1706f3f672ccdd1ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/495955
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-18 00:35:53 +00:00
Matthew Dempsky 7240d7e9e4 cmd/compile/internal/noder: suppress unionType consistency check
In the types1 universe, we only need to represent value types. For
interfaces, this means we only need to worry about pure interfaces. A
pure interface can embed a union type, but the overall union must be
equivalent to "any".

In go.dev/cl/458619, we changed the types1 reader to return "any", but
to incorporate a consistency check to make sure this is valid.
Unfortunately, a pure interface can actually still reference impure
interfaces, and in general this is hard to check precisely without
reimplementing a lot of types2 data structures and logic into types1.

We haven't had any other reports of this check failing since 1.20, so
it seems simplest to just suppress for now.

Fixes #60117.

Change-Id: I5053faafe2d1068c6d438b2193347546bf5330cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/495455
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-05-16 21:34:45 +00:00
Keith Randall 6042a062dc cmd/compile: make memcombine pass a bit more robust to reassociation of exprs
Be more liberal about expanding the OR tree. Handle any tree shape
instead of a fully left or right associative tree.

Also remove tail feature, it isn't ever needed.

Change-Id: If16bebef94b952a604d6069e9be3d9129994cb6f
Reviewed-on: https://go-review.googlesource.com/c/go/+/494056
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ryan Berger <ryanbberger@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2023-05-16 19:13:26 +00:00
eric fang e0ceba8139 cmd/compile: enhance tighten pass for memory values
[This is a roll-forward of CL 458755, which was reverted due to make.bash
being broken on GOAMD64=v3. But it turned out that the problem was caused
by wrong bswap/load rewrite rules, and it was fixed in CL 492616.]

This CL enhances the tighten pass. Previously if a value has memory arg,
then the tighten pass won't move it, actually if the memory state is
consistent among definition and use block, we can move the value. This
CL optimizes this case. This is useful for the following situation:
b1:
  x = load(...mem)
  if(...) goto b2 else b3
b2:
  use(x)
b3:
  some_op_not_use_x

For the micro-benchmark mentioned in #56620, the performance improvement
is about 15%.
There's no noticeable performance change in the go1 benchmark.

Fixes #56620

Change-Id: I36ea68bed384986cd3ae81cb9e6efe84bb213adc
Reviewed-on: https://go-review.googlesource.com/c/go/+/492895
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
2023-05-16 01:01:38 +00:00
Lynn Boger 4481042c43 cmd/compile: update rules to generate more prefixed instructions
This modifies some existing rules to allow more prefixed instructions
to be generated when using GOPPC64=power10. Some rules also check
if PCRel is available, which is currently supported for linux/ppc64le
and linux/ppc64 (internal linking only).

Prior to p10, DS-offset loads and stores had a 16 bit size limit for
the offset field. If the offset of the data for load or store was
beyond this range then an indexed load or store would be selected by
the rules.

In p10 the assembler can generate prefixed instructions in this case,
but does not if an indexed instruction was selected during the lowering
pass.

This allows many more cases to use prefixed loads or stores, reducing
function sizes and improving performance in some cases where the code
change happens in key loops.

For example in strconv BenchmarkAppendQuoteRune before:

  12c5e4:       15 00 10 06     pla     r10,1425660
  12c5e8:       fc c0 40 39
  12c5ec:       00 00 6a e8     ld      r3,0(r10)
  12c5f0:       10 00 aa e8     ld      r5,16(r10)

After this change:

  12a828:       15 00 10 04     pld     r3,1433272
  12a82c:       b8 de 60 e4
  12a830:       15 00 10 04     pld     r5,1433280
  12a834:       c0 de a0 e4

Performs better in the second case.

A testcase was added to verify that the rules correctly select a load or
store based on the offset and whether power10 or earlier.

Change-Id: I4335fed0bd9b8aba8a4f84d69b89f819cc464846
Reviewed-on: https://go-review.googlesource.com/c/go/+/477398
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
2023-05-15 18:20:54 +00:00
Cherry Mui 994eca4883 test: add escape test for reflect.Value operations
With CL 408826 reflect.Value does not always escape. We need to
make sure Value operations does (or does not) escape the Value
correctly. This CL adds a test.

There are still a few unfortunate cases, where some Value
operations escape more than necessary (comparing to a non-reflect
version of the code), but hard to fix. These are mostly that a
Value would escape conditionally (mostly on the type of the Value),
but currently we don't have a good way to express that.

Change-Id: I9fdfc7584670aa09c5a01f6b2803f2043aaddb65
Reviewed-on: https://go-review.googlesource.com/c/go/+/441938
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2023-05-12 23:13:19 +00:00
Cherry Mui be4fe08b57 reflect: do not escape Value.Type
Types are either static (for compiler-created types) or heap
allocated and always reachable (for reflection-created types, held
in the central map). So there is no need to escape types.

With CL 408826 reflect.Value does not always escape. Some functions
that escapes Value.typ would make the Value escape without this CL.

Had to add a special case for the inliner to keep (*Value).Type
still inlineable.

Change-Id: I7c14d35fd26328347b509a06eb5bd1534d40775f
Reviewed-on: https://go-review.googlesource.com/c/go/+/413474
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-12 21:11:51 +00:00
Dmitri Shuralyov 7cc4516ac8 internal/testdir: move to cmd/internal/testdir
The effect and motivation is for the test to be selected when doing
'go test cmd' and not when doing 'go test std' since it's primarily
about testing the Go compiler and linker. Other than that, it's run
by all.bash and 'go test std cmd' as before.

For #56844.
Fixes #60059.

Change-Id: I2d499af013f9d9b8761fdf4573f8d27d80c1fccf
Reviewed-on: https://go-review.googlesource.com/c/go/+/493876
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-05-12 17:18:08 +00:00
Austin Clements b679e31cdb test/bench: delete
Russ added test/bench/go1 in CL 5484071 to have a stable suite of
programs to use as benchmarks. For the compiler and runtime we had
back then, those were reasonable benchmarks, but the compiler and
runtime are now far more sophisticated and these benchmarks no longer
have good coverage. We also now have better benchmark suites
maintained outside the repo (e.g., golang.org/x/benchmarks). Keeping
test/bench/go1 at this point is actively misleading.

Indirectly related to #37486, as this also removes the last package
dist test runs outside of src/.

Change-Id: I2867ef303fe48a02acce58ace4ee682add8acdbf
Reviewed-on: https://go-review.googlesource.com/c/go/+/494193
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-12 12:35:07 +00:00
Austin Clements b6c75c5fb1 test,internal/testdir: don't set GOOS/GOARCH
The test directory driver currently sets the GOOS/GOARCH environment
variables if they aren't set. This appears to be in service of a
single test, test/env.go, which was introduced in September 2008 along
with os.Getenv. It's not entirely clear what that test is even trying
to check, since runtime.GOOS isn't necessarily the same as $GOOS. We
keep the test around because golang.org/x/tools/go/ssa/interp uses it
as a test case, but we simplify the test and eliminate the need for
the driver to set GOOS/GOARCH.

Change-Id: I5acc0093b557c95d1f0a526d031210256a68222d
Reviewed-on: https://go-review.googlesource.com/c/go/+/493601
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2023-05-12 12:34:59 +00:00
Stefan 95c4f320d5 cmd/compile: add De Morgan's rewrite rule
Adds rules that rewrites statements such as ~P&~Q as ~(P|Q) and ~P|~Q as ~(P&Q), removing an extraneous instruction.

Change-Id: Icedb97df741680ddf9799df79df78657173aa500
GitHub-Last-Rev: f22e2350c9
GitHub-Pull-Request: golang/go#60018
Reviewed-on: https://go-review.googlesource.com/c/go/+/493175
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Stefan M <st3f4nm4d4@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-05-10 16:32:25 +00:00
Lynn Boger bc3bdfa977 test: add memcombine testcases for ppc64
Thanks to the recent addition of the memcombine pass, the
ppc64 ports now have the memcombine optimizations. Previously
in PPC64.rules, the memcombine rules were only added for
ppc64le targets due to the significant increase in size of
the rewritePPC64.go file when those rules were added. The
ppc64 and ppc64le rules had to be different because of the
byte order due to endianness differences.

This enables the memcombine tests to be run on ppc64 as well
as ppc64le.

Change-Id: I4081e2d94617a1b66541d536c0c2662e266c9c1e
Reviewed-on: https://go-review.googlesource.com/c/go/+/492615
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-05-08 16:50:23 +00:00
Junxian Zhu 5cad8d41ca math: optimize math.Abs on mipsx
This commit optimized math.Abs function implementation on mipsx.
Tested on loongson 3A2000.

goos: linux
goarch: mipsle
pkg: math
                      │   oldmath    │              newmath               │
                      │    sec/op    │   sec/op     vs base               │
Acos-4                   282.6n ± 0%   282.3n ± 0%        ~ (p=0.140 n=7)
Acosh-4                  506.1n ± 0%   451.8n ± 0%  -10.73% (p=0.001 n=7)
Asin-4                   272.3n ± 0%   272.2n ± 0%        ~ (p=0.808 n=7)
Asinh-4                  529.7n ± 0%   475.3n ± 0%  -10.27% (p=0.001 n=7)
Atan-4                   208.2n ± 0%   207.9n ± 0%        ~ (p=0.134 n=7)
Atanh-4                  503.4n ± 1%   449.7n ± 0%  -10.67% (p=0.001 n=7)
Atan2-4                  310.5n ± 0%   310.5n ± 0%        ~ (p=0.928 n=7)
Cbrt-4                   359.3n ± 0%   358.8n ± 0%        ~ (p=0.121 n=7)
Ceil-4                   203.9n ± 0%   204.0n ± 0%        ~ (p=0.600 n=7)
Compare-4                23.11n ± 0%   23.11n ± 0%        ~ (p=0.702 n=7)
Compare32-4              19.09n ± 0%   19.12n ± 0%        ~ (p=0.070 n=7)
Copysign-4               33.20n ± 0%   34.02n ± 0%   +2.47% (p=0.001 n=7)
Cos-4                    422.5n ± 0%   385.4n ± 1%   -8.78% (p=0.001 n=7)
Cosh-4                   628.0n ± 0%   545.5n ± 0%  -13.14% (p=0.001 n=7)
Erf-4                    193.7n ± 2%   192.7n ± 1%        ~ (p=0.430 n=7)
Erfc-4                   192.8n ± 1%   193.0n ± 0%        ~ (p=0.245 n=7)
Erfinv-4                 220.7n ± 1%   221.5n ± 2%        ~ (p=0.272 n=7)
Erfcinv-4                221.3n ± 1%   220.4n ± 2%        ~ (p=0.738 n=7)
Exp-4                    471.4n ± 0%   435.1n ± 0%   -7.70% (p=0.001 n=7)
ExpGo-4                  470.6n ± 0%   434.0n ± 0%   -7.78% (p=0.001 n=7)
Expm1-4                  243.1n ± 0%   243.4n ± 0%        ~ (p=0.417 n=7)
Exp2-4                   463.1n ± 0%   427.0n ± 0%   -7.80% (p=0.001 n=7)
Exp2Go-4                 462.4n ± 0%   426.2n ± 5%   -7.83% (p=0.001 n=7)
Abs-4                   37.000n ± 0%   8.039n ± 9%  -78.27% (p=0.001 n=7)
Dim-4                    18.09n ± 0%   18.11n ± 0%        ~ (p=0.094 n=7)
Floor-4                  151.9n ± 0%   151.8n ± 0%        ~ (p=0.190 n=7)
Max-4                    116.7n ± 1%   116.7n ± 1%        ~ (p=0.842 n=7)
Min-4                    116.6n ± 1%   116.6n ± 0%        ~ (p=0.464 n=7)
Mod-4                   1244.0n ± 0%   980.9n ± 0%  -21.15% (p=0.001 n=7)
Frexp-4                  199.0n ± 0%   146.7n ± 0%  -26.28% (p=0.001 n=7)
Gamma-4                  516.4n ± 0%   479.3n ± 1%   -7.18% (p=0.001 n=7)
Hypot-4                  169.8n ± 0%   117.8n ± 2%  -30.62% (p=0.001 n=7)
HypotGo-4                170.8n ± 0%   117.5n ± 0%  -31.21% (p=0.001 n=7)
Ilogb-4                  160.8n ± 0%   109.5n ± 0%  -31.90% (p=0.001 n=7)
J0-4                     1.359µ ± 0%   1.305µ ± 0%   -3.97% (p=0.001 n=7)
J1-4                     1.386µ ± 0%   1.334µ ± 0%   -3.75% (p=0.001 n=7)
Jn-4                     2.864µ ± 0%   2.758µ ± 0%   -3.70% (p=0.001 n=7)
Ldexp-4                  202.9n ± 0%   151.7n ± 0%  -25.23% (p=0.001 n=7)
Lgamma-4                 234.0n ± 0%   234.3n ± 0%        ~ (p=0.199 n=7)
Log-4                    444.1n ± 0%   407.9n ± 0%   -8.15% (p=0.001 n=7)
Logb-4                   157.8n ± 0%   121.6n ± 0%  -22.94% (p=0.001 n=7)
Log1p-4                  354.8n ± 0%   315.4n ± 0%  -11.10% (p=0.001 n=7)
Log10-4                  453.9n ± 0%   417.9n ± 0%   -7.93% (p=0.001 n=7)
Log2-4                   245.3n ± 0%   209.1n ± 0%  -14.76% (p=0.001 n=7)
Modf-4                   126.6n ± 0%   126.6n ± 0%        ~ (p=0.126 n=7)
Nextafter32-4            112.5n ± 0%   112.5n ± 0%        ~ (p=0.853 n=7)
Nextafter64-4            141.7n ± 0%   141.6n ± 0%        ~ (p=0.331 n=7)
PowInt-4                 878.8n ± 1%   758.3n ± 1%  -13.71% (p=0.001 n=7)
PowFrac-4                1.809µ ± 0%   1.615µ ± 0%  -10.72% (p=0.001 n=7)
Pow10Pos-4               18.10n ± 0%   18.12n ± 0%        ~ (p=0.464 n=7)
Pow10Neg-4               17.09n ± 0%   17.09n ± 0%        ~ (p=0.263 n=7)
Round-4                  68.36n ± 0%   68.33n ± 0%        ~ (p=0.325 n=7)
RoundToEven-4            78.40n ± 0%   78.40n ± 0%        ~ (p=0.934 n=7)
Remainder-4              894.0n ± 1%   753.4n ± 1%  -15.73% (p=0.001 n=7)
Signbit-4                18.09n ± 0%   18.09n ± 0%        ~ (p=0.761 n=7)
Sin-4                    389.8n ± 1%   389.8n ± 0%        ~ (p=0.995 n=7)
Sincos-4                 416.0n ± 0%   415.9n ± 0%        ~ (p=0.361 n=7)
Sinh-4                   634.6n ± 4%   585.6n ± 1%   -7.72% (p=0.001 n=7)
SqrtIndirect-4           8.035n ± 0%   8.036n ± 0%        ~ (p=0.523 n=7)
SqrtLatency-4            8.039n ± 0%   8.037n ± 0%        ~ (p=0.218 n=7)
SqrtIndirectLatency-4    8.040n ± 0%   8.040n ± 0%        ~ (p=0.652 n=7)
SqrtGoLatency-4          895.7n ± 0%   896.6n ± 0%   +0.10% (p=0.004 n=7)
SqrtPrime-4              5.406µ ± 0%   5.407µ ± 0%        ~ (p=0.592 n=7)
Tan-4                    406.1n ± 0%   405.8n ± 1%        ~ (p=0.435 n=7)
Tanh-4                   627.6n ± 0%   545.5n ± 0%  -13.08% (p=0.001 n=7)
Trunc-4                  146.7n ± 1%   146.7n ± 0%        ~ (p=0.755 n=7)
Y0-4                     1.359µ ± 0%   1.310µ ± 0%   -3.61% (p=0.001 n=7)
Y1-4                     1.351µ ± 0%   1.301µ ± 0%   -3.70% (p=0.001 n=7)
Yn-4                     2.829µ ± 0%   2.729µ ± 0%   -3.53% (p=0.001 n=7)
Float64bits-4            14.08n ± 0%   14.07n ± 0%        ~ (p=0.069 n=7)
Float64frombits-4        19.09n ± 0%   19.10n ± 0%        ~ (p=0.755 n=7)
Float32bits-4            13.06n ± 0%   13.07n ± 1%        ~ (p=0.586 n=7)
Float32frombits-4        13.06n ± 0%   13.06n ± 0%        ~ (p=0.853 n=7)
FMA-4                    606.9n ± 0%   606.8n ± 0%        ~ (p=0.393 n=7)
geomean                  201.1n        185.4n        -7.81%

Change-Id: I6d41a97ad3789ed5731588588859ac0b8b13b664
Reviewed-on: https://go-review.googlesource.com/c/go/+/484675
Reviewed-by: Rong Zhang <rongrong@oss.cipunited.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
2023-05-08 15:53:28 +00:00
Than McIntosh 445e520d49 cmd/compile: allow more inlining of functions that construct closures
[This is a roll-forward of CL 479095, which was reverted due to a bad
interaction between inlining and escape analysis, then later fixed
first with an attempt in CL 482355, then again in CL 484859, and then
one more time with CL 492135.]

Currently, when the inliner is determining if a function is
inlineable, it descends into the bodies of closures constructed by
that function. This has several unfortunate consequences:

- If the closure contains a disallowed operation (e.g., a defer), then
  the outer function can't be inlined. It makes sense that the
  *closure* can't be inlined in this case, but it doesn't make sense
  to punish the function that constructs the closure.

- The hairiness of the closure counts against the inlining budget of
  the outer function. Since we currently copy the closure body when
  inlining the outer function, this makes sense from the perspective
  of export data size and binary size, but ultimately doesn't make
  much sense from the perspective of what should be inlineable.

- Since the inliner walks into every closure created by an outer
  function in addition to starting a walk at every closure, this adds
  an n^2 factor to inlinability analysis.

This CL simply drops this behavior.

In std, this makes 57 more functions inlinable, and disallows inlining
for 10 (due to the basic instability of our bottom-up inlining
approach), for an net increase of 47 inlinable functions (+0.6%).

This will help significantly with the performance of the functions to
be added for #56102, which have a somewhat complicated nesting of
closures with a performance-critical fast path.

The downside of this seems to be a potential increase in export data
and text size, but the practical impact of this seems to be
negligible:

	       │    before    │           after            │
	       │    bytes     │    bytes      vs base      │
Go/binary        15.12Mi ± 0%   15.14Mi ± 0%  +0.16% (n=1)
Go/text          5.220Mi ± 0%   5.237Mi ± 0%  +0.32% (n=1)
Compile/binary   22.92Mi ± 0%   22.94Mi ± 0%  +0.07% (n=1)
Compile/text     8.428Mi ± 0%   8.435Mi ± 0%  +0.08% (n=1)

Change-Id: I5f75fcceb177f05853996b75184a486528eafe96
Reviewed-on: https://go-review.googlesource.com/c/go/+/492017
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-05-05 21:04:48 +00:00
Than McIntosh 89138ce740 cmd/compile: un-hide closure func if parent expr moved to staticinit
If the function referenced by a closure expression is incorporated
into a static init, be sure to mark it as non-hidden, since otherwise
it will be live but no longer reachable from the init func, hence it
will be skipped during escape analysis, which can lead to
miscompilations.

Fixes #59680.

Change-Id: Ib858aee296efcc0b7655d25c23ab8a6a8dbdc5f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/492135
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-05-05 21:04:38 +00:00
Than McIntosh ea69de9b92 cmd/compile: rework marking of dead hidden closure functions
[This is a roll-forward of CL 484859, this time including a fix for
issue #59709. The call to do dead function marking was taking place in
the wrong spot, causing it to run more than once if generics were
instantiated.]

This patch generalizes the code in the inliner that marks unreferenced
hidden closure functions as dead. Rather than doing the marking on the
fly (previous approach), this new approach does a single pass at the
end of inlining, which catches more dead functions.

Change-Id: I0e079ad755c21295477201acbd7e1a732a98fffd
Reviewed-on: https://go-review.googlesource.com/c/go/+/492016
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-05-05 21:04:28 +00:00
Junxian Zhu 574431cfcd math: optimize math.Abs on mips64x
This commit optimized math.Abs function implementation on mips64x.
Tested on loongson 3A2000.

goos: linux
goarch: mips64le
pkg: math
                      │    oldmath    │               newmath               │
                      │    sec/op     │    sec/op     vs base               │
Acos-4                   258.0n ± ∞ ¹   257.1n ± ∞ ¹   -0.35% (p=0.008 n=5)
Acosh-4                  417.0n ± ∞ ¹   377.9n ± ∞ ¹   -9.38% (p=0.008 n=5)
Asin-4                   248.0n ± ∞ ¹   259.9n ± ∞ ¹   +4.80% (p=0.008 n=5)
Asinh-4                  439.6n ± ∞ ¹   408.3n ± ∞ ¹   -7.12% (p=0.008 n=5)
Atan-4                   189.6n ± ∞ ¹   188.8n ± ∞ ¹        ~ (p=0.056 n=5)
Atanh-4                  390.0n ± ∞ ¹   356.4n ± ∞ ¹   -8.62% (p=0.008 n=5)
Atan2-4                  279.0n ± ∞ ¹   263.9n ± ∞ ¹   -5.41% (p=0.008 n=5)
Cbrt-4                   314.2n ± ∞ ¹   322.3n ± ∞ ¹   +2.58% (p=0.008 n=5)
Ceil-4                   139.7n ± ∞ ¹   136.6n ± ∞ ¹   -2.22% (p=0.008 n=5)
Compare-4                21.11n ± ∞ ¹   21.09n ± ∞ ¹        ~ (p=0.405 n=5)
Compare32-4              20.10n ± ∞ ¹   20.12n ± ∞ ¹        ~ (p=0.206 n=5)
Copysign-4               32.17n ± ∞ ¹   35.71n ± ∞ ¹  +11.00% (p=0.008 n=5)
Cos-4                    222.8n ± ∞ ¹   169.8n ± ∞ ¹  -23.79% (p=0.008 n=5)
Cosh-4                   550.2n ± ∞ ¹   477.4n ± ∞ ¹  -13.23% (p=0.008 n=5)
Erf-4                    171.6n ± ∞ ¹   174.5n ± ∞ ¹        ~ (p=0.635 n=5)
Erfc-4                   182.6n ± ∞ ¹   170.2n ± ∞ ¹   -6.79% (p=0.008 n=5)
Erfinv-4                 177.6n ± ∞ ¹   196.6n ± ∞ ¹  +10.70% (p=0.008 n=5)
Erfcinv-4                177.8n ± ∞ ¹   197.8n ± ∞ ¹  +11.25% (p=0.008 n=5)
Exp-4                    422.8n ± ∞ ¹   382.1n ± ∞ ¹   -9.63% (p=0.008 n=5)
ExpGo-4                  416.1n ± ∞ ¹   383.2n ± ∞ ¹   -7.91% (p=0.008 n=5)
Expm1-4                  232.9n ± ∞ ¹   252.2n ± ∞ ¹   +8.29% (p=0.008 n=5)
Exp2-4                   404.8n ± ∞ ¹   389.1n ± ∞ ¹   -3.88% (p=0.008 n=5)
Exp2Go-4                 407.0n ± ∞ ¹   372.3n ± ∞ ¹   -8.53% (p=0.008 n=5)
Abs-4                   30.120n ± ∞ ¹   3.014n ± ∞ ¹  -89.99% (p=0.008 n=5)
Dim-4                    5.021n ± ∞ ¹   5.023n ± ∞ ¹        ~ (p=0.071 n=5)
Floor-4                  127.8n ± ∞ ¹   127.1n ± ∞ ¹   -0.55% (p=0.008 n=5)
Max-4                    77.69n ± ∞ ¹   76.33n ± ∞ ¹   -1.75% (p=0.008 n=5)
Min-4                    83.27n ± ∞ ¹   77.87n ± ∞ ¹   -6.48% (p=0.008 n=5)
Mod-4                    906.2n ± ∞ ¹   692.9n ± ∞ ¹  -23.54% (p=0.008 n=5)
Frexp-4                  150.6n ± ∞ ¹   108.6n ± ∞ ¹  -27.89% (p=0.008 n=5)
Gamma-4                  418.4n ± ∞ ¹   386.1n ± ∞ ¹   -7.72% (p=0.008 n=5)
Hypot-4                 148.20n ± ∞ ¹   93.78n ± ∞ ¹  -36.72% (p=0.008 n=5)
HypotGo-4               148.20n ± ∞ ¹   94.47n ± ∞ ¹  -36.26% (p=0.008 n=5)
Ilogb-4                 135.50n ± ∞ ¹   92.38n ± ∞ ¹  -31.82% (p=0.008 n=5)
J0-4                     937.7n ± ∞ ¹   861.7n ± ∞ ¹   -8.10% (p=0.008 n=5)
J1-4                     915.4n ± ∞ ¹   875.9n ± ∞ ¹   -4.32% (p=0.008 n=5)
Jn-4                     1.974µ ± ∞ ¹   1.863µ ± ∞ ¹   -5.62% (p=0.008 n=5)
Ldexp-4                  158.5n ± ∞ ¹   129.3n ± ∞ ¹  -18.42% (p=0.008 n=5)
Lgamma-4                 209.0n ± ∞ ¹   211.8n ± ∞ ¹        ~ (p=0.095 n=5)
Log-4                    326.4n ± ∞ ¹   295.2n ± ∞ ¹   -9.56% (p=0.008 n=5)
Logb-4                   147.7n ± ∞ ¹   105.0n ± ∞ ¹  -28.91% (p=0.008 n=5)
Log1p-4                  303.4n ± ∞ ¹   266.3n ± ∞ ¹  -12.23% (p=0.008 n=5)
Log10-4                  329.2n ± ∞ ¹   298.3n ± ∞ ¹   -9.39% (p=0.008 n=5)
Log2-4                   187.4n ± ∞ ¹   153.0n ± ∞ ¹  -18.36% (p=0.008 n=5)
Modf-4                   110.5n ± ∞ ¹   103.5n ± ∞ ¹   -6.33% (p=0.008 n=5)
Nextafter32-4            128.4n ± ∞ ¹   121.5n ± ∞ ¹   -5.37% (p=0.016 n=5)
Nextafter64-4            109.5n ± ∞ ¹   110.5n ± ∞ ¹   +0.91% (p=0.008 n=5)
PowInt-4                 603.3n ± ∞ ¹   516.4n ± ∞ ¹  -14.40% (p=0.008 n=5)
PowFrac-4                1.365µ ± ∞ ¹   1.183µ ± ∞ ¹  -13.33% (p=0.008 n=5)
Pow10Pos-4               15.07n ± ∞ ¹   15.07n ± ∞ ¹        ~ (p=0.738 n=5)
Pow10Neg-4               21.11n ± ∞ ¹   21.10n ± ∞ ¹        ~ (p=0.190 n=5)
Round-4                  44.23n ± ∞ ¹   44.22n ± ∞ ¹        ~ (p=0.635 n=5)
RoundToEven-4            50.25n ± ∞ ¹   46.27n ± ∞ ¹   -7.92% (p=0.008 n=5)
Remainder-4              675.6n ± ∞ ¹   530.4n ± ∞ ¹  -21.49% (p=0.008 n=5)
Signbit-4                17.07n ± ∞ ¹   17.95n ± ∞ ¹   +5.16% (p=0.008 n=5)
Sin-4                    171.6n ± ∞ ¹   189.1n ± ∞ ¹  +10.20% (p=0.008 n=5)
Sincos-4                 201.5n ± ∞ ¹   200.5n ± ∞ ¹        ~ (p=0.421 n=5)
Sinh-4                   529.6n ± ∞ ¹   484.6n ± ∞ ¹   -8.50% (p=0.008 n=5)
SqrtIndirect-4           5.021n ± ∞ ¹   5.023n ± ∞ ¹   +0.04% (p=0.048 n=5)
SqrtLatency-4            8.032n ± ∞ ¹   8.039n ± ∞ ¹   +0.09% (p=0.024 n=5)
SqrtIndirectLatency-4    8.036n ± ∞ ¹   8.038n ± ∞ ¹        ~ (p=0.056 n=5)
SqrtGoLatency-4          338.8n ± ∞ ¹   338.7n ± ∞ ¹        ~ (p=0.841 n=5)
SqrtPrime-4              5.379µ ± ∞ ¹   5.382µ ± ∞ ¹   +0.06% (p=0.048 n=5)
Tan-4                    182.7n ± ∞ ¹   191.8n ± ∞ ¹   +4.98% (p=0.008 n=5)
Tanh-4                   558.7n ± ∞ ¹   497.6n ± ∞ ¹  -10.94% (p=0.008 n=5)
Trunc-4                  122.5n ± ∞ ¹   122.6n ± ∞ ¹        ~ (p=0.405 n=5)
Y0-4                     892.8n ± ∞ ¹   851.7n ± ∞ ¹   -4.60% (p=0.008 n=5)
Y1-4                     887.2n ± ∞ ¹   863.2n ± ∞ ¹   -2.71% (p=0.008 n=5)
Yn-4                     1.889µ ± ∞ ¹   1.832µ ± ∞ ¹   -3.02% (p=0.008 n=5)
Float64bits-4            13.05n ± ∞ ¹   13.06n ± ∞ ¹   +0.08% (p=0.040 n=5)
Float64frombits-4        13.05n ± ∞ ¹   13.06n ± ∞ ¹        ~ (p=0.143 n=5)
Float32bits-4            13.05n ± ∞ ¹   13.06n ± ∞ ¹   +0.08% (p=0.008 n=5)
Float32frombits-4        13.05n ± ∞ ¹   13.08n ± ∞ ¹   +0.23% (p=0.016 n=5)
FMA-4                    445.7n ± ∞ ¹   448.1n ± ∞ ¹   +0.54% (p=0.008 n=5)
geomean                  157.2n         142.8n         -9.17%

Change-Id: I9bf104848b588c9ecf79401a81d483d7fcdb0a79
Reviewed-on: https://go-review.googlesource.com/c/go/+/481575
Reviewed-by: M Zhuo <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Rong Zhang <rongrong@oss.cipunited.com>
2023-05-05 14:54:39 +00:00
Matthew Dempsky 767fbe01ae cmd/compile: fix compilation of inferred type arguments
Previously, type arguments could only be inferred for generic
functions in call expressions, whereas with the reverse type inference
proposal they can now be inferred in assignment contexts too. As a
consequence, we now need to check Info.Instances to find the inferred
type for more cases now.

Updates #59338.
Fixes #59955.

Change-Id: I9b6465395869459c2387d0424febe7337b28b90e
Reviewed-on: https://go-review.googlesource.com/c/go/+/492455
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
2023-05-03 22:12:27 +00:00
Daniel Martí aa6e168480 Revert "cmd/compile: enhance tighten pass for memory values"
This reverts CL 458755.

Reason for revert: broke make.bash on GOAMD64=v3:

/workdir/go/src/crypto/sha1/sha1.go:54:35: internal compiler error: '(*digest).MarshalBinary': func (*digest).MarshalBinary, startMem[b13] has different values, old v206, new v338

goroutine 34 [running]:
runtime/debug.Stack()
	/workdir/go/src/runtime/debug/stack.go:24 +0x9f
bootstrap/cmd/compile/internal/base.FatalfAt({0x13, 0xaa0f1}, {0xc000db4440, 0x40}, {0xc0013b0000, 0x5, 0x5})
	/workdir/go/src/cmd/compile/internal/base/print.go:234 +0x2d1
bootstrap/cmd/compile/internal/base.Fatalf(...)
	/workdir/go/src/cmd/compile/internal/base/print.go:203
bootstrap/cmd/compile/internal/ssagen.(*ssafn).Fatalf(0xc000d90000, {0x13, 0xaa0f1}, {0xcb7b91, 0x3a}, {0xc000d99bc0, 0x4, 0x4})
	/workdir/go/src/cmd/compile/internal/ssagen/ssa.go:7896 +0x1f8
bootstrap/cmd/compile/internal/ssa.(*Func).Fatalf(0xc000d82340, {0xcb7b91, 0x3a}, {0xc000d99bc0, 0x4, 0x4})
	/workdir/go/src/cmd/compile/internal/ssa/func.go:716 +0x342
bootstrap/cmd/compile/internal/ssa.memState(0xc000d82340, {0xc000ec6200, 0x22, 0x40}, {0xc001046000, 0x22, 0x40})
	/workdir/go/src/cmd/compile/internal/ssa/tighten.go:240 +0x6c5
bootstrap/cmd/compile/internal/ssa.tighten(0xc000d82340)
[...]

Change-Id: Ic445fb48fe0f2c60ac67abe259b66594f1419152
Reviewed-on: https://go-review.googlesource.com/c/go/+/492335
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2023-05-03 21:28:37 +00:00
erifan01 ea8f037996 cmd/compile: enhance tighten pass for memory values
This CL enhances the tighten pass. Previously if a value has memory arg,
then the tighten pass won't move it, actually if the memory state is
consistent among definition and use block, we can move the value. This
CL optimizes this case. This is useful for the following situation:
b1:
  x = load(...mem)
  if(...) goto b2 else b3
b2:
  use(x)
b3:
  some_op_not_use_x

For the micro-benchmark mentioned in #56620, the performance improvement
is about 15%.
There's no noticeable performance change in the go1 benchmark.

Fixes #56620

Change-Id: I9b152754f27231f583a6995fc7cd8472aa7d390c
Reviewed-on: https://go-review.googlesource.com/c/go/+/458755
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2023-05-03 19:56:09 +00:00
Robert Griesemer 1f570787a8 cmd/compile: enable reverse type inference
For #59338.

Change-Id: I8141d421cdc60e47ee5794fc1ca81246bd8a8a25
Reviewed-on: https://go-review.googlesource.com/c/go/+/491475
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-03 19:36:20 +00:00
Cherry Mui 19fd96512c cmd/link: put zero-sized data symbols at same address as runtime.zerobase
Put zero-sized data symbols at same address as runtime.zerobase,
so zero-sized global variables have the same address as zero-sized
allocations.

Change-Id: Ib3145dc1b663a9794dfabc0e6abd2384960f2c49
Reviewed-on: https://go-review.googlesource.com/c/go/+/490435
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2023-04-28 18:35:43 +00:00
Austin Clements 0f099a4bc5 runtime, cmd: rationalize StackLimit and StackGuard
The current definitions of StackLimit and StackGuard only indirectly
specify the NOSPLIT stack limit and duplicate a literal constant
(928). Currently, they define the stack guard delta, and from there
compute the NOSPLIT limit.

Rationalize these by defining a new constant, abi.StackNosplitBase,
which consolidates and directly specifies the NOSPLIT stack limit (in
the default case). From this we then compute the stack guard delta,
inverting the relationship between these two constants. While we're
here, we rename StackLimit to StackNosplit to make it clearer what's
being limited.

This change does not affect the values of these constants in the
default configuration. It does slightly change how
StackGuardMultiplier values other than 1 affect the constants, but
this multiplier is a pretty rough heuristic anyway.

                    before after
stackNosplit           800   800
_StackGuard            928   928
stackNosplit -race    1728  1600
_StackGuard -race     1856  1728

For #59670.

Change-Id: Ia94094c5e47897e7c088d24b4a5e33f5c2768db5
Reviewed-on: https://go-review.googlesource.com/c/go/+/486976
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-21 19:28:56 +00:00
Austin Clements d11ff3f081 Revert "runtime, cmd: rationalize StackLimit and StackGuard"
This reverts commit CL 486380.

Submitted out of order and breaks bootstrap.

Change-Id: I67bd225094b5c9713b97f70feba04d2c99b7da76
Reviewed-on: https://go-review.googlesource.com/c/go/+/486916
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Austin Clements <austin@google.com>
2023-04-20 16:19:35 +00:00