4991 commits

Author SHA1 Message Date
Matthew Dempsky 7f1467ff4d cmd/compile: incorporate inlined function names into closure naming
In Go 1.17, cmd/compile gained the ability to inline calls to
functions that contain function literals (aka "closures"). This was
implemented by duplicating the function literal body and emitting a
second LSym, because in general it might be optimized better than the
original function literal.

However, the second LSym was named simply as any other function
literal appearing literally in the enclosing function would be named.
E.g., if f has a closure "f.funcX", and f is inlined into g, we would
create "g.funcY" (N.B., X and Y need not be the same). Users then
have no idea this function originally came from f.

With this CL, the inlined call stack is incorporated into the clone
LSym's name: instead of "g.funcY", it's named "g.f.funcY".
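
One way to observe the naming described above is to print the symbol name behind a closure value. The following is a minimal sketch, not part of the CL: the functions f, g, and closureName are hypothetical, and the exact names printed depend on the compiler version and on whether f is actually inlined into g.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// f returns a function literal; the compiler names its symbol after f.
func f() func() int {
	return func() int { return 42 }
}

// g calls f. If f is inlined into g, the compiler may emit a second
// copy of the closure body whose symbol name is derived from g.
func g() func() int {
	return f()
}

// closureName reports the symbol name of the code behind fn.
func closureName(fn func() int) string {
	return runtime.FuncForPC(reflect.ValueOf(fn).Pointer()).Name()
}

func main() {
	fmt.Println(closureName(f())) // closure reached directly through f
	fmt.Println(closureName(g())) // closure reached through g
}
```

Building once normally and once with -gcflags=-l (inlining disabled) and comparing the two printed names should show whether a clone was created.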

In the future, it seems desirable to arrange for the clone's name to
appear exactly as the original name, so stack traces remain the same
as when -l or -d=inlfuncswithclosures are used. But it's unclear
whether the linker supports that today, or whether any downstream
tooling would be confused by this.

Updates #60324.

Change-Id: Ifad0ccef7e959e72005beeecdfffd872f63982f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/497137
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
2023-05-22 22:47:15 +00:00
Keith Randall bd3f44e4ff cmd/compile: constant-fold loads from constant dictionaries and types
Retrying the original CL with a small modification. The original CL
did not handle the case of reading an itab out of a dictionary
correctly.  When we read an itab out of a dictionary, we must treat
the type inside that itab as maybe being put in an interface.

Original CL: 486895
Revert CL: 490156

Change-Id: Id2dc1699d184cd8c63dac83986a70b60b4e6cbd7
Reviewed-on: https://go-review.googlesource.com/c/go/+/491495
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-19 18:10:11 +00:00
Robert Griesemer 956d31ecd5 cmd/compile: enable more lenient type inference for untyped arguments
This enables the implementation for proposal #58671, which is
a likely accept. By enabling it early we get a bit of extra soak
time for this feature. The change can be reverted trivially if
need be.
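
As an illustration of what the more lenient inference permits, here is a small sketch. It assumes a toolchain with the feature enabled (Go 1.21 or later); the generic helper Sum is hypothetical and does not appear in the CL.

```go
package main

import "fmt"

// Sum is a hypothetical generic helper used only for illustration.
func Sum[T int | float64](a, b T) T {
	return a + b
}

func main() {
	// The arguments are an untyped integer constant and an untyped
	// floating-point constant. With the more lenient inference the two
	// are reconciled the way a binary operator would reconcile them,
	// so T is inferred as float64 instead of the call being rejected.
	x := Sum(1, 2.5)
	fmt.Printf("%T %v\n", x, x)
}
```

Without the feature, the same call is expected to fail to compile because the default types of the two constants disagree.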

For #58671.

Change-Id: Id6c27515e45ff79f4f1d2fc1706f3f672ccdd1ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/495955
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-18 00:35:53 +00:00
Matthew Dempsky 7240d7e9e4 cmd/compile/internal/noder: suppress unionType consistency check
In the types1 universe, we only need to represent value types. For
interfaces, this means we only need to worry about pure interfaces. A
pure interface can embed a union type, but the overall union must be
equivalent to "any".

In go.dev/cl/458619, we changed the types1 reader to return "any", but
also incorporated a consistency check to make sure this is valid.
Unfortunately, a pure interface can actually still reference impure
interfaces, and in general this is hard to check precisely without
reimplementing a lot of types2 data structures and logic into types1.

We haven't had any other reports of this check failing since 1.20, so
it seems simplest to just suppress it for now.

Fixes #60117.

Change-Id: I5053faafe2d1068c6d438b2193347546bf5330cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/495455
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-05-16 21:34:45 +00:00
Keith Randall 6042a062dc cmd/compile: make memcombine pass a bit more robust to reassociation of exprs
Be more liberal about expanding the OR tree. Handle any tree shape
instead of a fully left or right associative tree.

Also remove the tail feature; it is never needed.

Change-Id: If16bebef94b952a604d6069e9be3d9129994cb6f
Reviewed-on: https://go-review.googlesource.com/c/go/+/494056
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ryan Berger <ryanbberger@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2023-05-16 19:13:26 +00:00
eric fang e0ceba8139 cmd/compile: enhance tighten pass for memory values
[This is a roll-forward of CL 458755, which was reverted due to make.bash
being broken on GOAMD64=v3. But it turned out that the problem was caused
by wrong bswap/load rewrite rules, and it was fixed in CL 492616.]

This CL enhances the tighten pass. Previously, if a value had a memory
arg, the tighten pass would not move it. But if the memory state is
consistent between the definition block and the use block, the value can
be moved. This CL optimizes that case, which is useful in the following
situation:
b1:
  x = load(...mem)
  if(...) goto b2 else b3
b2:
  use(x)
b3:
  some_op_not_use_x
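
A Go function with roughly this block shape might look like the sketch below. The function useIfSet is hypothetical and is not the micro-benchmark from #56620; the comments map each statement to the blocks above.

```go
package main

import "fmt"

// useIfSet mirrors the b1/b2/b3 shape: the load of *p is written before
// the branch, but only one successor uses the loaded value.
func useIfSet(p *int, cond bool) int {
	x := *p // b1: x = load(p, mem)
	if cond {
		return x + 1 // b2: use(x)
	}
	return 0 // b3: does not use x
}

func main() {
	v := 41
	fmt.Println(useIfSet(&v, true), useIfSet(&v, false))
}
```

No store happens between the load and its use, so the memory state is the same in both blocks, which is the condition under which the load may be sunk into the block that uses it.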

For the micro-benchmark mentioned in #56620, the performance improvement
is about 15%.
There's no noticeable performance change in the go1 benchmark.

Fixes #56620

Change-Id: I36ea68bed384986cd3ae81cb9e6efe84bb213adc
Reviewed-on: https://go-review.googlesource.com/c/go/+/492895
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
2023-05-16 01:01:38 +00:00
Lynn Boger 4481042c43 cmd/compile: update rules to generate more prefixed instructions
This modifies some existing rules to allow more prefixed instructions
to be generated when using GOPPC64=power10. Some rules also check
if PCRel is available, which is currently supported for linux/ppc64le
and linux/ppc64 (internal linking only).

Prior to p10, DS-offset loads and stores had a 16 bit size limit for
the offset field. If the offset of the data for load or store was
beyond this range then an indexed load or store would be selected by
the rules.

In p10 the assembler can generate prefixed instructions in this case,
but does not if an indexed instruction was selected during the lowering
pass.
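
The offset ranges involved can be sketched as two predicates. This is an illustration only: the helper names fitsDS and fitsPrefixed are hypothetical and are not the functions used in the rewrite rules, and the 34-bit prefixed displacement comes from the Power ISA rather than from this CL.

```go
package main

import "fmt"

// fitsDS reports whether off can be encoded in a DS-form load or store:
// a signed 16-bit displacement whose low two bits are zero.
func fitsDS(off int64) bool {
	return off >= -1<<15 && off < 1<<15 && off%4 == 0
}

// fitsPrefixed reports whether off fits the signed 34-bit displacement
// of a prefixed load or store.
func fitsPrefixed(off int64) bool {
	return off >= -1<<33 && off < 1<<33
}

func main() {
	for _, off := range []int64{16, 32764, 32768, 1 << 20} {
		fmt.Printf("offset %d: DS-form %v, prefixed %v\n",
			off, fitsDS(off), fitsPrefixed(off))
	}
}
```

Offsets that fail the first predicate but pass the second are the cases where an indexed instruction was previously selected and a prefixed one can now be used.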

This allows many more cases to use prefixed loads or stores, reducing
function sizes and improving performance in some cases where the code
change happens in key loops.

For example in strconv BenchmarkAppendQuoteRune before:

  12c5e4:       15 00 10 06     pla     r10,1425660
  12c5e8:       fc c0 40 39
  12c5ec:       00 00 6a e8     ld      r3,0(r10)
  12c5f0:       10 00 aa e8     ld      r5,16(r10)

After this change:

  12a828:       15 00 10 04     pld     r3,1433272
  12a82c:       b8 de 60 e4
  12a830:       15 00 10 04     pld     r5,1433280
  12a834:       c0 de a0 e4

The second sequence performs better.

A testcase was added to verify that the rules correctly select a load or
store based on the offset and on whether the target is power10 or earlier.

Change-Id: I4335fed0bd9b8aba8a4f84d69b89f819cc464846
Reviewed-on: https://go-review.googlesource.com/c/go/+/477398
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
2023-05-15 18:20:54 +00:00
Cherry Mui 994eca4883 test: add escape test for reflect.Value operations
With CL 408826 reflect.Value does not always escape. We need to
make sure Value operations do (or do not) escape the Value
correctly. This CL adds a test.

There are still a few unfortunate cases where some Value
operations escape more than necessary (compared to a non-reflect
version of the code) but are hard to fix. These are mostly cases where a
Value would escape conditionally (mostly on the type of the Value),
but currently we don't have a good way to express that.

Change-Id: I9fdfc7584670aa09c5a01f6b2803f2043aaddb65
Reviewed-on: https://go-review.googlesource.com/c/go/+/441938
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2023-05-12 23:13:19 +00:00
Cherry Mui be4fe08b57 reflect: do not escape Value.Type
Types are either static (for compiler-created types) or heap
allocated and always reachable (for reflection-created types, held
in the central map). So there is no need to escape types.

With CL 408826 reflect.Value does not always escape. Without this CL,
some functions that escape Value.typ would make the Value escape.

Had to add a special case for the inliner to keep (*Value).Type
inlineable.

Change-Id: I7c14d35fd26328347b509a06eb5bd1534d40775f
Reviewed-on: https://go-review.googlesource.com/c/go/+/413474
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-12 21:11:51 +00:00
Dmitri Shuralyov 7cc4516ac8 internal/testdir: move to cmd/internal/testdir
The effect and motivation is for the test to be selected when doing
'go test cmd' and not when doing 'go test std' since it's primarily
about testing the Go compiler and linker. Other than that, it's run
by all.bash and 'go test std cmd' as before.

For #56844.
Fixes #60059.

Change-Id: I2d499af013f9d9b8761fdf4573f8d27d80c1fccf
Reviewed-on: https://go-review.googlesource.com/c/go/+/493876
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-05-12 17:18:08 +00:00
Austin Clements b679e31cdb test/bench: delete
Russ added test/bench/go1 in CL 5484071 to have a stable suite of
programs to use as benchmarks. For the compiler and runtime we had
back then, those were reasonable benchmarks, but the compiler and
runtime are now far more sophisticated and these benchmarks no longer
have good coverage. We also now have better benchmark suites
maintained outside the repo (e.g., golang.org/x/benchmarks). Keeping
test/bench/go1 at this point is actively misleading.

Indirectly related to #37486, as this also removes the last package
dist test runs outside of src/.

Change-Id: I2867ef303fe48a02acce58ace4ee682add8acdbf
Reviewed-on: https://go-review.googlesource.com/c/go/+/494193
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-12 12:35:07 +00:00
Austin Clements b6c75c5fb1 test,internal/testdir: don't set GOOS/GOARCH
The test directory driver currently sets the GOOS/GOARCH environment
variables if they aren't set. This appears to be in service of a
single test, test/env.go, which was introduced in September 2008 along
with os.Getenv. It's not entirely clear what that test is even trying
to check, since runtime.GOOS isn't necessarily the same as $GOOS. We
keep the test around because golang.org/x/tools/go/ssa/interp uses it
as a test case, but we simplify the test and eliminate the need for
the driver to set GOOS/GOARCH.

Change-Id: I5acc0093b557c95d1f0a526d031210256a68222d
Reviewed-on: https://go-review.googlesource.com/c/go/+/493601
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2023-05-12 12:34:59 +00:00
Stefan 95c4f320d5 cmd/compile: add De Morgan's rewrite rule
Adds rules that rewrite expressions such as ~P&~Q as ~(P|Q) and ~P|~Q as ~(P&Q), removing an extraneous instruction.
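
The two identities can be checked exhaustively in Go, where bitwise NOT is spelled ^. The sketch below is only a correctness check of the algebra, not the rewrite rules themselves; all function names are hypothetical.

```go
package main

import "fmt"

// Each pair computes the same value; the second form of each pair
// needs one NOT instead of two.
func andOfNots(p, q uint8) uint8 { return ^p & ^q }
func notOfOr(p, q uint8) uint8   { return ^(p | q) }
func orOfNots(p, q uint8) uint8  { return ^p | ^q }
func notOfAnd(p, q uint8) uint8  { return ^(p & q) }

// equivalent checks both identities over every pair of uint8 values.
func equivalent() bool {
	for p := 0; p < 256; p++ {
		for q := 0; q < 256; q++ {
			a, b := uint8(p), uint8(q)
			if andOfNots(a, b) != notOfOr(a, b) || orOfNots(a, b) != notOfAnd(a, b) {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(equivalent())
}
```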

Change-Id: Icedb97df741680ddf9799df79df78657173aa500
GitHub-Last-Rev: f22e2350c9
GitHub-Pull-Request: golang/go#60018
Reviewed-on: https://go-review.googlesource.com/c/go/+/493175
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Stefan M <st3f4nm4d4@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-05-10 16:32:25 +00:00
Lynn Boger bc3bdfa977 test: add memcombine testcases for ppc64
Thanks to the recent addition of the memcombine pass, the
ppc64 ports now have the memcombine optimizations. Previously
in PPC64.rules, the memcombine rules were only added for
ppc64le targets due to the significant increase in size of
the rewritePPC64.go file when those rules were added. The
ppc64 and ppc64le rules had to be different because of the
byte order due to endianness differences.

This enables the memcombine tests to be run on ppc64 as well
as ppc64le.

Change-Id: I4081e2d94617a1b66541d536c0c2662e266c9c1e
Reviewed-on: https://go-review.googlesource.com/c/go/+/492615
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-05-08 16:50:23 +00:00
Junxian Zhu 5cad8d41ca math: optimize math.Abs on mipsx
This commit optimizes the math.Abs implementation on mipsx.
Tested on Loongson 3A2000.

goos: linux
goarch: mipsle
pkg: math
                      │   oldmath    │              newmath               │
                      │    sec/op    │   sec/op     vs base               │
Acos-4                   282.6n ± 0%   282.3n ± 0%        ~ (p=0.140 n=7)
Acosh-4                  506.1n ± 0%   451.8n ± 0%  -10.73% (p=0.001 n=7)
Asin-4                   272.3n ± 0%   272.2n ± 0%        ~ (p=0.808 n=7)
Asinh-4                  529.7n ± 0%   475.3n ± 0%  -10.27% (p=0.001 n=7)
Atan-4                   208.2n ± 0%   207.9n ± 0%        ~ (p=0.134 n=7)
Atanh-4                  503.4n ± 1%   449.7n ± 0%  -10.67% (p=0.001 n=7)
Atan2-4                  310.5n ± 0%   310.5n ± 0%        ~ (p=0.928 n=7)
Cbrt-4                   359.3n ± 0%   358.8n ± 0%        ~ (p=0.121 n=7)
Ceil-4                   203.9n ± 0%   204.0n ± 0%        ~ (p=0.600 n=7)
Compare-4                23.11n ± 0%   23.11n ± 0%        ~ (p=0.702 n=7)
Compare32-4              19.09n ± 0%   19.12n ± 0%        ~ (p=0.070 n=7)
Copysign-4               33.20n ± 0%   34.02n ± 0%   +2.47% (p=0.001 n=7)
Cos-4                    422.5n ± 0%   385.4n ± 1%   -8.78% (p=0.001 n=7)
Cosh-4                   628.0n ± 0%   545.5n ± 0%  -13.14% (p=0.001 n=7)
Erf-4                    193.7n ± 2%   192.7n ± 1%        ~ (p=0.430 n=7)
Erfc-4                   192.8n ± 1%   193.0n ± 0%        ~ (p=0.245 n=7)
Erfinv-4                 220.7n ± 1%   221.5n ± 2%        ~ (p=0.272 n=7)
Erfcinv-4                221.3n ± 1%   220.4n ± 2%        ~ (p=0.738 n=7)
Exp-4                    471.4n ± 0%   435.1n ± 0%   -7.70% (p=0.001 n=7)
ExpGo-4                  470.6n ± 0%   434.0n ± 0%   -7.78% (p=0.001 n=7)
Expm1-4                  243.1n ± 0%   243.4n ± 0%        ~ (p=0.417 n=7)
Exp2-4                   463.1n ± 0%   427.0n ± 0%   -7.80% (p=0.001 n=7)
Exp2Go-4                 462.4n ± 0%   426.2n ± 5%   -7.83% (p=0.001 n=7)
Abs-4                   37.000n ± 0%   8.039n ± 9%  -78.27% (p=0.001 n=7)
Dim-4                    18.09n ± 0%   18.11n ± 0%        ~ (p=0.094 n=7)
Floor-4                  151.9n ± 0%   151.8n ± 0%        ~ (p=0.190 n=7)
Max-4                    116.7n ± 1%   116.7n ± 1%        ~ (p=0.842 n=7)
Min-4                    116.6n ± 1%   116.6n ± 0%        ~ (p=0.464 n=7)
Mod-4                   1244.0n ± 0%   980.9n ± 0%  -21.15% (p=0.001 n=7)
Frexp-4                  199.0n ± 0%   146.7n ± 0%  -26.28% (p=0.001 n=7)
Gamma-4                  516.4n ± 0%   479.3n ± 1%   -7.18% (p=0.001 n=7)
Hypot-4                  169.8n ± 0%   117.8n ± 2%  -30.62% (p=0.001 n=7)
HypotGo-4                170.8n ± 0%   117.5n ± 0%  -31.21% (p=0.001 n=7)
Ilogb-4                  160.8n ± 0%   109.5n ± 0%  -31.90% (p=0.001 n=7)
J0-4                     1.359µ ± 0%   1.305µ ± 0%   -3.97% (p=0.001 n=7)
J1-4                     1.386µ ± 0%   1.334µ ± 0%   -3.75% (p=0.001 n=7)
Jn-4                     2.864µ ± 0%   2.758µ ± 0%   -3.70% (p=0.001 n=7)
Ldexp-4                  202.9n ± 0%   151.7n ± 0%  -25.23% (p=0.001 n=7)
Lgamma-4                 234.0n ± 0%   234.3n ± 0%        ~ (p=0.199 n=7)
Log-4                    444.1n ± 0%   407.9n ± 0%   -8.15% (p=0.001 n=7)
Logb-4                   157.8n ± 0%   121.6n ± 0%  -22.94% (p=0.001 n=7)
Log1p-4                  354.8n ± 0%   315.4n ± 0%  -11.10% (p=0.001 n=7)
Log10-4                  453.9n ± 0%   417.9n ± 0%   -7.93% (p=0.001 n=7)
Log2-4                   245.3n ± 0%   209.1n ± 0%  -14.76% (p=0.001 n=7)
Modf-4                   126.6n ± 0%   126.6n ± 0%        ~ (p=0.126 n=7)
Nextafter32-4            112.5n ± 0%   112.5n ± 0%        ~ (p=0.853 n=7)
Nextafter64-4            141.7n ± 0%   141.6n ± 0%        ~ (p=0.331 n=7)
PowInt-4                 878.8n ± 1%   758.3n ± 1%  -13.71% (p=0.001 n=7)
PowFrac-4                1.809µ ± 0%   1.615µ ± 0%  -10.72% (p=0.001 n=7)
Pow10Pos-4               18.10n ± 0%   18.12n ± 0%        ~ (p=0.464 n=7)
Pow10Neg-4               17.09n ± 0%   17.09n ± 0%        ~ (p=0.263 n=7)
Round-4                  68.36n ± 0%   68.33n ± 0%        ~ (p=0.325 n=7)
RoundToEven-4            78.40n ± 0%   78.40n ± 0%        ~ (p=0.934 n=7)
Remainder-4              894.0n ± 1%   753.4n ± 1%  -15.73% (p=0.001 n=7)
Signbit-4                18.09n ± 0%   18.09n ± 0%        ~ (p=0.761 n=7)
Sin-4                    389.8n ± 1%   389.8n ± 0%        ~ (p=0.995 n=7)
Sincos-4                 416.0n ± 0%   415.9n ± 0%        ~ (p=0.361 n=7)
Sinh-4                   634.6n ± 4%   585.6n ± 1%   -7.72% (p=0.001 n=7)
SqrtIndirect-4           8.035n ± 0%   8.036n ± 0%        ~ (p=0.523 n=7)
SqrtLatency-4            8.039n ± 0%   8.037n ± 0%        ~ (p=0.218 n=7)
SqrtIndirectLatency-4    8.040n ± 0%   8.040n ± 0%        ~ (p=0.652 n=7)
SqrtGoLatency-4          895.7n ± 0%   896.6n ± 0%   +0.10% (p=0.004 n=7)
SqrtPrime-4              5.406µ ± 0%   5.407µ ± 0%        ~ (p=0.592 n=7)
Tan-4                    406.1n ± 0%   405.8n ± 1%        ~ (p=0.435 n=7)
Tanh-4                   627.6n ± 0%   545.5n ± 0%  -13.08% (p=0.001 n=7)
Trunc-4                  146.7n ± 1%   146.7n ± 0%        ~ (p=0.755 n=7)
Y0-4                     1.359µ ± 0%   1.310µ ± 0%   -3.61% (p=0.001 n=7)
Y1-4                     1.351µ ± 0%   1.301µ ± 0%   -3.70% (p=0.001 n=7)
Yn-4                     2.829µ ± 0%   2.729µ ± 0%   -3.53% (p=0.001 n=7)
Float64bits-4            14.08n ± 0%   14.07n ± 0%        ~ (p=0.069 n=7)
Float64frombits-4        19.09n ± 0%   19.10n ± 0%        ~ (p=0.755 n=7)
Float32bits-4            13.06n ± 0%   13.07n ± 1%        ~ (p=0.586 n=7)
Float32frombits-4        13.06n ± 0%   13.06n ± 0%        ~ (p=0.853 n=7)
FMA-4                    606.9n ± 0%   606.8n ± 0%        ~ (p=0.393 n=7)
geomean                  201.1n        185.4n        -7.81%

Change-Id: I6d41a97ad3789ed5731588588859ac0b8b13b664
Reviewed-on: https://go-review.googlesource.com/c/go/+/484675
Reviewed-by: Rong Zhang <rongrong@oss.cipunited.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
2023-05-08 15:53:28 +00:00
Than McIntosh 445e520d49 cmd/compile: allow more inlining of functions that construct closures
[This is a roll-forward of CL 479095, which was reverted due to a bad
interaction between inlining and escape analysis, then later fixed
first with an attempt in CL 482355, then again in CL 484859, and then
one more time with CL 492135.]

Currently, when the inliner is determining if a function is
inlineable, it descends into the bodies of closures constructed by
that function. This has several unfortunate consequences:

- If the closure contains a disallowed operation (e.g., a defer), then
  the outer function can't be inlined. It makes sense that the
  *closure* can't be inlined in this case, but it doesn't make sense
  to punish the function that constructs the closure.

- The hairiness of the closure counts against the inlining budget of
  the outer function. Since we currently copy the closure body when
  inlining the outer function, this makes sense from the perspective
  of export data size and binary size, but ultimately doesn't make
  much sense from the perspective of what should be inlineable.

- Since the inliner walks into every closure created by an outer
  function in addition to starting a walk at every closure, this adds
  an n^2 factor to inlinability analysis.

This CL simply drops this behavior.
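
A function of the kind affected might look like the following sketch. The name newCounter is hypothetical; the point is that the defer lives in the closure body, not in the function that constructs the closure.

```go
package main

import "fmt"

// newCounter constructs a closure whose body contains a defer. The
// closure itself is not an inlining candidate, but the outer function
// only allocates and returns it.
func newCounter(start int) func() int {
	n := start
	return func() int {
		defer func() { n++ }() // runs after the return value is set
		return n
	}
}

func main() {
	next := newCounter(10)
	fmt.Println(next(), next())
}
```

Building with -gcflags=-m prints the compiler's inlining decisions, which is one way to see whether a function like newCounter is considered inlinable on a given toolchain.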

In std, this makes 57 more functions inlinable, and disallows inlining
for 10 (due to the basic instability of our bottom-up inlining
approach), for a net increase of 47 inlinable functions (+0.6%).

This will help significantly with the performance of the functions to
be added for #56102, which have a somewhat complicated nesting of
closures with a performance-critical fast path.

The downside of this seems to be a potential increase in export data
and text size, but the practical impact of this seems to be
negligible:

               │    before    │           after            │
               │    bytes     │    bytes      vs base      │
Go/binary        15.12Mi ± 0%   15.14Mi ± 0%  +0.16% (n=1)
Go/text          5.220Mi ± 0%   5.237Mi ± 0%  +0.32% (n=1)
Compile/binary   22.92Mi ± 0%   22.94Mi ± 0%  +0.07% (n=1)
Compile/text     8.428Mi ± 0%   8.435Mi ± 0%  +0.08% (n=1)

Change-Id: I5f75fcceb177f05853996b75184a486528eafe96
Reviewed-on: https://go-review.googlesource.com/c/go/+/492017
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-05-05 21:04:48 +00:00
Than McIntosh 89138ce740 cmd/compile: un-hide closure func if parent expr moved to staticinit
If the function referenced by a closure expression is incorporated
into a static init, be sure to mark it as non-hidden, since otherwise
it will be live but no longer reachable from the init func, hence it
will be skipped during escape analysis, which can lead to
miscompilations.

Fixes #59680.

Change-Id: Ib858aee296efcc0b7655d25c23ab8a6a8dbdc5f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/492135
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-05-05 21:04:38 +00:00
Than McIntosh ea69de9b92 cmd/compile: rework marking of dead hidden closure functions
[This is a roll-forward of CL 484859, this time including a fix for
issue #59709. The call to do dead function marking was taking place in
the wrong spot, causing it to run more than once if generics were
instantiated.]

This patch generalizes the code in the inliner that marks unreferenced
hidden closure functions as dead. Rather than doing the marking on the
fly (previous approach), this new approach does a single pass at the
end of inlining, which catches more dead functions.

Change-Id: I0e079ad755c21295477201acbd7e1a732a98fffd
Reviewed-on: https://go-review.googlesource.com/c/go/+/492016
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-05-05 21:04:28 +00:00
Junxian Zhu 574431cfcd math: optimize math.Abs on mips64x
This commit optimizes the math.Abs implementation on mips64x.
Tested on Loongson 3A2000.

goos: linux
goarch: mips64le
pkg: math
                      │    oldmath    │               newmath               │
                      │    sec/op     │    sec/op     vs base               │
Acos-4                   258.0n ± ∞ ¹   257.1n ± ∞ ¹   -0.35% (p=0.008 n=5)
Acosh-4                  417.0n ± ∞ ¹   377.9n ± ∞ ¹   -9.38% (p=0.008 n=5)
Asin-4                   248.0n ± ∞ ¹   259.9n ± ∞ ¹   +4.80% (p=0.008 n=5)
Asinh-4                  439.6n ± ∞ ¹   408.3n ± ∞ ¹   -7.12% (p=0.008 n=5)
Atan-4                   189.6n ± ∞ ¹   188.8n ± ∞ ¹        ~ (p=0.056 n=5)
Atanh-4                  390.0n ± ∞ ¹   356.4n ± ∞ ¹   -8.62% (p=0.008 n=5)
Atan2-4                  279.0n ± ∞ ¹   263.9n ± ∞ ¹   -5.41% (p=0.008 n=5)
Cbrt-4                   314.2n ± ∞ ¹   322.3n ± ∞ ¹   +2.58% (p=0.008 n=5)
Ceil-4                   139.7n ± ∞ ¹   136.6n ± ∞ ¹   -2.22% (p=0.008 n=5)
Compare-4                21.11n ± ∞ ¹   21.09n ± ∞ ¹        ~ (p=0.405 n=5)
Compare32-4              20.10n ± ∞ ¹   20.12n ± ∞ ¹        ~ (p=0.206 n=5)
Copysign-4               32.17n ± ∞ ¹   35.71n ± ∞ ¹  +11.00% (p=0.008 n=5)
Cos-4                    222.8n ± ∞ ¹   169.8n ± ∞ ¹  -23.79% (p=0.008 n=5)
Cosh-4                   550.2n ± ∞ ¹   477.4n ± ∞ ¹  -13.23% (p=0.008 n=5)
Erf-4                    171.6n ± ∞ ¹   174.5n ± ∞ ¹        ~ (p=0.635 n=5)
Erfc-4                   182.6n ± ∞ ¹   170.2n ± ∞ ¹   -6.79% (p=0.008 n=5)
Erfinv-4                 177.6n ± ∞ ¹   196.6n ± ∞ ¹  +10.70% (p=0.008 n=5)
Erfcinv-4                177.8n ± ∞ ¹   197.8n ± ∞ ¹  +11.25% (p=0.008 n=5)
Exp-4                    422.8n ± ∞ ¹   382.1n ± ∞ ¹   -9.63% (p=0.008 n=5)
ExpGo-4                  416.1n ± ∞ ¹   383.2n ± ∞ ¹   -7.91% (p=0.008 n=5)
Expm1-4                  232.9n ± ∞ ¹   252.2n ± ∞ ¹   +8.29% (p=0.008 n=5)
Exp2-4                   404.8n ± ∞ ¹   389.1n ± ∞ ¹   -3.88% (p=0.008 n=5)
Exp2Go-4                 407.0n ± ∞ ¹   372.3n ± ∞ ¹   -8.53% (p=0.008 n=5)
Abs-4                   30.120n ± ∞ ¹   3.014n ± ∞ ¹  -89.99% (p=0.008 n=5)
Dim-4                    5.021n ± ∞ ¹   5.023n ± ∞ ¹        ~ (p=0.071 n=5)
Floor-4                  127.8n ± ∞ ¹   127.1n ± ∞ ¹   -0.55% (p=0.008 n=5)
Max-4                    77.69n ± ∞ ¹   76.33n ± ∞ ¹   -1.75% (p=0.008 n=5)
Min-4                    83.27n ± ∞ ¹   77.87n ± ∞ ¹   -6.48% (p=0.008 n=5)
Mod-4                    906.2n ± ∞ ¹   692.9n ± ∞ ¹  -23.54% (p=0.008 n=5)
Frexp-4                  150.6n ± ∞ ¹   108.6n ± ∞ ¹  -27.89% (p=0.008 n=5)
Gamma-4                  418.4n ± ∞ ¹   386.1n ± ∞ ¹   -7.72% (p=0.008 n=5)
Hypot-4                 148.20n ± ∞ ¹   93.78n ± ∞ ¹  -36.72% (p=0.008 n=5)
HypotGo-4               148.20n ± ∞ ¹   94.47n ± ∞ ¹  -36.26% (p=0.008 n=5)
Ilogb-4                 135.50n ± ∞ ¹   92.38n ± ∞ ¹  -31.82% (p=0.008 n=5)
J0-4                     937.7n ± ∞ ¹   861.7n ± ∞ ¹   -8.10% (p=0.008 n=5)
J1-4                     915.4n ± ∞ ¹   875.9n ± ∞ ¹   -4.32% (p=0.008 n=5)
Jn-4                     1.974µ ± ∞ ¹   1.863µ ± ∞ ¹   -5.62% (p=0.008 n=5)
Ldexp-4                  158.5n ± ∞ ¹   129.3n ± ∞ ¹  -18.42% (p=0.008 n=5)
Lgamma-4                 209.0n ± ∞ ¹   211.8n ± ∞ ¹        ~ (p=0.095 n=5)
Log-4                    326.4n ± ∞ ¹   295.2n ± ∞ ¹   -9.56% (p=0.008 n=5)
Logb-4                   147.7n ± ∞ ¹   105.0n ± ∞ ¹  -28.91% (p=0.008 n=5)
Log1p-4                  303.4n ± ∞ ¹   266.3n ± ∞ ¹  -12.23% (p=0.008 n=5)
Log10-4                  329.2n ± ∞ ¹   298.3n ± ∞ ¹   -9.39% (p=0.008 n=5)
Log2-4                   187.4n ± ∞ ¹   153.0n ± ∞ ¹  -18.36% (p=0.008 n=5)
Modf-4                   110.5n ± ∞ ¹   103.5n ± ∞ ¹   -6.33% (p=0.008 n=5)
Nextafter32-4            128.4n ± ∞ ¹   121.5n ± ∞ ¹   -5.37% (p=0.016 n=5)
Nextafter64-4            109.5n ± ∞ ¹   110.5n ± ∞ ¹   +0.91% (p=0.008 n=5)
PowInt-4                 603.3n ± ∞ ¹   516.4n ± ∞ ¹  -14.40% (p=0.008 n=5)
PowFrac-4                1.365µ ± ∞ ¹   1.183µ ± ∞ ¹  -13.33% (p=0.008 n=5)
Pow10Pos-4               15.07n ± ∞ ¹   15.07n ± ∞ ¹        ~ (p=0.738 n=5)
Pow10Neg-4               21.11n ± ∞ ¹   21.10n ± ∞ ¹        ~ (p=0.190 n=5)
Round-4                  44.23n ± ∞ ¹   44.22n ± ∞ ¹        ~ (p=0.635 n=5)
RoundToEven-4            50.25n ± ∞ ¹   46.27n ± ∞ ¹   -7.92% (p=0.008 n=5)
Remainder-4              675.6n ± ∞ ¹   530.4n ± ∞ ¹  -21.49% (p=0.008 n=5)
Signbit-4                17.07n ± ∞ ¹   17.95n ± ∞ ¹   +5.16% (p=0.008 n=5)
Sin-4                    171.6n ± ∞ ¹   189.1n ± ∞ ¹  +10.20% (p=0.008 n=5)
Sincos-4                 201.5n ± ∞ ¹   200.5n ± ∞ ¹        ~ (p=0.421 n=5)
Sinh-4                   529.6n ± ∞ ¹   484.6n ± ∞ ¹   -8.50% (p=0.008 n=5)
SqrtIndirect-4           5.021n ± ∞ ¹   5.023n ± ∞ ¹   +0.04% (p=0.048 n=5)
SqrtLatency-4            8.032n ± ∞ ¹   8.039n ± ∞ ¹   +0.09% (p=0.024 n=5)
SqrtIndirectLatency-4    8.036n ± ∞ ¹   8.038n ± ∞ ¹        ~ (p=0.056 n=5)
SqrtGoLatency-4          338.8n ± ∞ ¹   338.7n ± ∞ ¹        ~ (p=0.841 n=5)
SqrtPrime-4              5.379µ ± ∞ ¹   5.382µ ± ∞ ¹   +0.06% (p=0.048 n=5)
Tan-4                    182.7n ± ∞ ¹   191.8n ± ∞ ¹   +4.98% (p=0.008 n=5)
Tanh-4                   558.7n ± ∞ ¹   497.6n ± ∞ ¹  -10.94% (p=0.008 n=5)
Trunc-4                  122.5n ± ∞ ¹   122.6n ± ∞ ¹        ~ (p=0.405 n=5)
Y0-4                     892.8n ± ∞ ¹   851.7n ± ∞ ¹   -4.60% (p=0.008 n=5)
Y1-4                     887.2n ± ∞ ¹   863.2n ± ∞ ¹   -2.71% (p=0.008 n=5)
Yn-4                     1.889µ ± ∞ ¹   1.832µ ± ∞ ¹   -3.02% (p=0.008 n=5)
Float64bits-4            13.05n ± ∞ ¹   13.06n ± ∞ ¹   +0.08% (p=0.040 n=5)
Float64frombits-4        13.05n ± ∞ ¹   13.06n ± ∞ ¹        ~ (p=0.143 n=5)
Float32bits-4            13.05n ± ∞ ¹   13.06n ± ∞ ¹   +0.08% (p=0.008 n=5)
Float32frombits-4        13.05n ± ∞ ¹   13.08n ± ∞ ¹   +0.23% (p=0.016 n=5)
FMA-4                    445.7n ± ∞ ¹   448.1n ± ∞ ¹   +0.54% (p=0.008 n=5)
geomean                  157.2n         142.8n         -9.17%

Change-Id: I9bf104848b588c9ecf79401a81d483d7fcdb0a79
Reviewed-on: https://go-review.googlesource.com/c/go/+/481575
Reviewed-by: M Zhuo <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Rong Zhang <rongrong@oss.cipunited.com>
2023-05-05 14:54:39 +00:00
Matthew Dempsky 767fbe01ae cmd/compile: fix compilation of inferred type arguments
Previously, type arguments could only be inferred for generic
functions in call expressions, whereas with the reverse type inference
proposal they can now be inferred in assignment contexts too. As a
consequence, we need to check Info.Instances to find the inferred
type in more cases.
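
The two contexts can be contrasted with a short sketch. It assumes a toolchain that implements reverse type inference (Go 1.21 or later); the generic function Double is hypothetical and is not taken from the issue.

```go
package main

import "fmt"

// Double is a hypothetical generic function used only for illustration.
func Double[T int | float64](x T) T { return x * 2 }

func main() {
	// Call context: T is inferred from the argument.
	a := Double(21)

	// Assignment context: there is no call expression, so T is inferred
	// from the type of the variable being assigned to.
	var f func(float64) float64 = Double

	fmt.Println(a, f(1.5))
}
```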

Updates #59338.
Fixes #59955.

Change-Id: I9b6465395869459c2387d0424febe7337b28b90e
Reviewed-on: https://go-review.googlesource.com/c/go/+/492455
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
2023-05-03 22:12:27 +00:00
Daniel Martí aa6e168480 Revert "cmd/compile: enhance tighten pass for memory values"
This reverts CL 458755.

Reason for revert: broke make.bash on GOAMD64=v3:

/workdir/go/src/crypto/sha1/sha1.go:54:35: internal compiler error: '(*digest).MarshalBinary': func (*digest).MarshalBinary, startMem[b13] has different values, old v206, new v338

goroutine 34 [running]:
runtime/debug.Stack()
	/workdir/go/src/runtime/debug/stack.go:24 +0x9f
bootstrap/cmd/compile/internal/base.FatalfAt({0x13, 0xaa0f1}, {0xc000db4440, 0x40}, {0xc0013b0000, 0x5, 0x5})
	/workdir/go/src/cmd/compile/internal/base/print.go:234 +0x2d1
bootstrap/cmd/compile/internal/base.Fatalf(...)
	/workdir/go/src/cmd/compile/internal/base/print.go:203
bootstrap/cmd/compile/internal/ssagen.(*ssafn).Fatalf(0xc000d90000, {0x13, 0xaa0f1}, {0xcb7b91, 0x3a}, {0xc000d99bc0, 0x4, 0x4})
	/workdir/go/src/cmd/compile/internal/ssagen/ssa.go:7896 +0x1f8
bootstrap/cmd/compile/internal/ssa.(*Func).Fatalf(0xc000d82340, {0xcb7b91, 0x3a}, {0xc000d99bc0, 0x4, 0x4})
	/workdir/go/src/cmd/compile/internal/ssa/func.go:716 +0x342
bootstrap/cmd/compile/internal/ssa.memState(0xc000d82340, {0xc000ec6200, 0x22, 0x40}, {0xc001046000, 0x22, 0x40})
	/workdir/go/src/cmd/compile/internal/ssa/tighten.go:240 +0x6c5
bootstrap/cmd/compile/internal/ssa.tighten(0xc000d82340)
[...]

Change-Id: Ic445fb48fe0f2c60ac67abe259b66594f1419152
Reviewed-on: https://go-review.googlesource.com/c/go/+/492335
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2023-05-03 21:28:37 +00:00
erifan01 ea8f037996 cmd/compile: enhance tighten pass for memory values
This CL enhances the tighten pass. Previously, if a value had a memory
arg, the tighten pass would not move it. However, if the memory state is
consistent between the definition block and the use block, the value can
be moved. This CL optimizes that case, which is useful in the following
situation:
b1:
  x = load(...mem)
  if(...) goto b2 else b3
b2:
  use(x)
b3:
  some_op_not_use_x
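
At the Go source level, a block shape like this might arise from a
function such as the following sketch. The function name pick is
hypothetical, and whether the compiler actually sinks this particular
load is an assumption; the code only shows a load whose result is used
on one branch.

```go
package main

import "fmt"

// pick loads *p before the branch but uses the loaded value on only one
// path, mirroring b1 (load), b2 (use) and b3 (no use).
func pick(p *int, cond bool) int {
	x := *p // b1: x = load(...mem)
	if cond {
		return x + 1 // b2: use(x)
	}
	return 0 // b3: x is not used
}

func main() {
	v := 41
	fmt.Println(pick(&v, true))
	fmt.Println(pick(&v, false))
}
```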

For the micro-benchmark mentioned in #56620, the performance improvement
is about 15%.
There's no noticeable performance change in the go1 benchmark.

Fixes #56620

Change-Id: I9b152754f27231f583a6995fc7cd8472aa7d390c
Reviewed-on: https://go-review.googlesource.com/c/go/+/458755
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2023-05-03 19:56:09 +00:00
Robert Griesemer 1f570787a8 cmd/compile: enable reverse type inference
For #59338.

Change-Id: I8141d421cdc60e47ee5794fc1ca81246bd8a8a25
Reviewed-on: https://go-review.googlesource.com/c/go/+/491475
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-03 19:36:20 +00:00
Cherry Mui 19fd96512c cmd/link: put zero-sized data symbols at same address as runtime.zerobase
Put zero-sized data symbols at same address as runtime.zerobase,
so zero-sized global variables have the same address as zero-sized
allocations.

Change-Id: Ib3145dc1b663a9794dfabc0e6abd2384960f2c49
Reviewed-on: https://go-review.googlesource.com/c/go/+/490435
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2023-04-28 18:35:43 +00:00
Austin Clements 0f099a4bc5 runtime, cmd: rationalize StackLimit and StackGuard
The current definitions of StackLimit and StackGuard only indirectly
specify the NOSPLIT stack limit and duplicate a literal constant
(928). Currently, they define the stack guard delta, and from there
compute the NOSPLIT limit.

Rationalize these by defining a new constant, abi.StackNosplitBase,
which consolidates and directly specifies the NOSPLIT stack limit (in
the default case). From this we then compute the stack guard delta,
inverting the relationship between these two constants. While we're
here, we rename StackLimit to StackNosplit to make it clearer what's
being limited.

This change does not affect the values of these constants in the
default configuration. It does slightly change how
StackGuardMultiplier values other than 1 affect the constants, but
this multiplier is a pretty rough heuristic anyway.

                    before after
stackNosplit           800   800
_StackGuard            928   928
stackNosplit -race    1728  1600
_StackGuard -race     1856  1728
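
The table can be reproduced with a little arithmetic. The sketch below
assumes a fixed 128-byte gap between the NOSPLIT limit and the stack
guard (928 - 800 in the default rows) and a multiplier of 2 for -race
(inferred from the table). The function names and the gap constant are
hypothetical and are not the runtime's actual definitions.

```go
package main

import "fmt"

const (
	nosplitBase = 800 // the value given to abi.StackNosplitBase
	oldGuard    = 928 // the literal the old definitions duplicated
	gap         = 128 // assumed: 928 - 800 from the default rows
)

// before derives the NOSPLIT limit from the guard, as the old code did.
func before(mult int) (nosplit, guard int) {
	guard = oldGuard * mult
	return guard - gap, guard
}

// after derives the guard from the NOSPLIT limit.
func after(mult int) (nosplit, guard int) {
	nosplit = nosplitBase * mult
	return nosplit, nosplit + gap
}

func main() {
	for _, mult := range []int{1, 2} {
		bn, bg := before(mult)
		an, ag := after(mult)
		fmt.Printf("mult=%d nosplit %d -> %d, guard %d -> %d\n", mult, bn, an, bg, ag)
	}
}
```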

For #59670.

Change-Id: Ia94094c5e47897e7c088d24b4a5e33f5c2768db5
Reviewed-on: https://go-review.googlesource.com/c/go/+/486976
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-21 19:28:56 +00:00
Austin Clements d11ff3f081 Revert "runtime, cmd: rationalize StackLimit and StackGuard"
This reverts commit CL 486380.

Submitted out of order and breaks bootstrap.

Change-Id: I67bd225094b5c9713b97f70feba04d2c99b7da76
Reviewed-on: https://go-review.googlesource.com/c/go/+/486916
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Austin Clements <austin@google.com>
2023-04-20 16:19:35 +00:00
Austin Clements 921699fe5f runtime, cmd: rationalize StackLimit and StackGuard
The current definitions of StackLimit and StackGuard only indirectly
specify the NOSPLIT stack limit and duplicate a literal constant
(928). Currently, they define the stack guard delta, and from there
compute the NOSPLIT limit.

Rationalize these by defining a new constant, abi.StackNosplitBase,
which consolidates and directly specifies the NOSPLIT stack limit (in
the default case). From this we then compute the stack guard delta,
inverting the relationship between these two constants. While we're
here, we rename StackLimit to StackNosplit to make it clearer what's
being limited.

This change does not affect the values of these constants in the
default configuration. It does slightly change how
StackGuardMultiplier values other than 1 affect the constants, but
this multiplier is a pretty rough heuristic anyway.

                    before after
stackNosplit           800   800
_StackGuard            928   928
stackNosplit -race    1728  1600
_StackGuard -race     1856  1728

For #59670.

Change-Id: Ibe20825ebe0076bbd7b0b7501177b16c9dbcb79e
Reviewed-on: https://go-review.googlesource.com/c/go/+/486380
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-20 16:05:21 +00:00
Robert Griesemer d93f02010c cmd/compile/internal/types2: only mark variables as used if they are
Marking variables in erroneous variable declarations as used is
convenient for tests but doesn't necessarily hide follow-on errors
in real code: either the variable is not supposed to be declared in
the first place and then we should get an error if it is not used,
or it is there because it is intended to be used, and then we expect
an error if it is not used.

This brings types2 closer to go/types.

Change-Id: If7ee1298fc770f7ad0cefe7e968533fd50ec2343
Reviewed-on: https://go-review.googlesource.com/c/go/+/486175
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-19 14:07:00 +00:00
Keith Randall 6b165577fe cmd/compile: remove memequal call from string compares in more cases
Add more rules to ensure that order doesn't matter.

Add memequal 0 rule.

Try to use a constant argument to memequal when one is available.

Fixes #59684

Change-Id: I36e85ffbd949396ed700ed6e8ec2bc3ae013f5d2
Reviewed-on: https://go-review.googlesource.com/c/go/+/485535
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-18 21:31:33 +00:00
Than McIntosh 7c1ed1fa8f Revert "cmd/compile: rework marking of dead hidden closure functions"
This reverts commit http://go.dev/cl//484859

Reason for revert: causes linker errors in a number of google-internal tests.

Change-Id: I322252f784a46d2b1d447ebcdca86ce14bc0cc91
Reviewed-on: https://go-review.googlesource.com/c/go/+/485755
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
2023-04-18 16:03:22 +00:00
Michael Knyszek ce10e9d845 Revert "cmd/compile: allow more inlining of functions that construct closures"
This reverts commit f8162a0e72.

Reason for revert: https://github.com/golang/go/issues/59680

Change-Id: I91821c691a2d019ff0ad5b69509e32f3d56b8f67
Reviewed-on: https://go-review.googlesource.com/c/go/+/485498
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2023-04-17 21:45:00 +00:00
Than McIntosh f8162a0e72 cmd/compile: allow more inlining of functions that construct closures
[This is a roll-forward of CL 479095, which was reverted due to a bad
interaction between inlining and escape analysis, then later fixed
first with an attempt in CL 482355, then again in 484859.]

Currently, when the inliner is determining if a function is
inlineable, it descends into the bodies of closures constructed by
that function. This has several unfortunate consequences:

- If the closure contains a disallowed operation (e.g., a defer), then
  the outer function can't be inlined. It makes sense that the
  *closure* can't be inlined in this case, but it doesn't make sense
  to punish the function that constructs the closure.

- The hairiness of the closure counts against the inlining budget of
  the outer function. Since we currently copy the closure body when
  inlining the outer function, this makes sense from the perspective
  of export data size and binary size, but ultimately doesn't make
  much sense from the perspective of what should be inlineable.

- Since the inliner walks into every closure created by an outer
  function in addition to starting a walk at every closure, this adds
  an n^2 factor to inlinability analysis.

This CL simply drops this behavior.
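
A small sketch of the first case: the outer function below only
constructs a closure, while the closure body contains a defer. The names
tracker and calls are hypothetical, and whether the compiler inlines
tracker at a given call site still depends on its remaining budget.

```go
package main

import "fmt"

// tracker builds a closure whose body contains a defer. The defer
// belongs to the closure, not to tracker itself.
func tracker(calls *int) func(int) int {
	return func(x int) int {
		defer func() { *calls++ }()
		return x * 2
	}
}

func main() {
	n := 0
	double := tracker(&n)
	result := double(21)
	fmt.Println(result)
	fmt.Println(n)
}
```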

In std, this makes 57 more functions inlinable, and disallows inlining
for 10 (due to the basic instability of our bottom-up inlining
approach), for a net increase of 47 inlinable functions (+0.6%).

This will help significantly with the performance of the functions to
be added for #56102, which have a somewhat complicated nesting of
closures with a performance-critical fast path.

The downside of this seems to be a potential increase in export data
and text size, but the practical impact of this seems to be
negligible:

               │    before    │           after            │
               │    bytes     │    bytes      vs base      │
Go/binary        15.12Mi ± 0%   15.14Mi ± 0%  +0.16% (n=1)
Go/text          5.220Mi ± 0%   5.237Mi ± 0%  +0.32% (n=1)
Compile/binary   22.92Mi ± 0%   22.94Mi ± 0%  +0.07% (n=1)
Compile/text     8.428Mi ± 0%   8.435Mi ± 0%  +0.08% (n=1)

Updates #56102.

Change-Id: I6e938d596992ffb473cf51e7e598f372ce08deb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/484860
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-04-17 14:52:41 +00:00
Than McIntosh d240226fe5 cmd/compile: rework marking of dead hidden closure functions
This patch generalizes the code in the inliner that marks unreferenced
hidden closure functions as dead. Rather than doing the marking on the
fly (previous approach), this new approach does a single pass at the
end of inlining, which catches more dead functions.

Fixes #59638.
Updates #59404.
Updates #59547.

Change-Id: I54fd63e9e37c9123b08a3e7def7d1989919bba91
Reviewed-on: https://go-review.googlesource.com/c/go/+/484859
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-17 14:52:32 +00:00
Cuong Manh Le 74b52d9519 cmd/compile: better code generation for constant-fold switch
CL 399694 added constant-fold switch early in compilation. So function:

func f() string {
    switch intSize {
    case 32:
        return "32"
    case 64:
        return "64"
    default:
        panic("unreachable")
    }
}

will be constant-folded to:

func f() string {
    switch intSize {
    case 64:
        return "64"
    }
}

When this function gets inlined, there is a check whether we can delay
declaring the result parameter until the "return" statement. For the
original function, we can't delay the result, because there's more than
one return statement. However, the constant-folded one can, because
there's only one return statement in the body now. The result parameter
~R0 ends up being declared inside the switch statement scope.

Now, when walking the switch statement, it's rewritten into an if-else
statement. Without typecheck.EvalConst, the if condition "if 64 == 64"
is passed as-is to the ssa generation pass. Because "64 == 64" is not a
constant, the ssagen creates normal blocks for branching the results.
This confuses the liveness analysis, because ~R0 is only live inside the
if block. With typecheck.EvalConst, "64 == 64" is evaluated to "true",
so ssagen can branch the result without emitting conditional blocks.

Instead, the constant-folded switch can be rewritten as:

switch {
case true:
    // Body
}

So it does not depend on the delay results check during inlining. Adding
a test, which will fail when typecheck.EvalConst is removed, so we can
do the cleanup without breaking things.

Change-Id: I638730bb147140de84260653741431b807ff2f15
Reviewed-on: https://go-review.googlesource.com/c/go/+/484316
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-14 17:58:01 +00:00
Cuong Manh Le 20c349e534 cmd/compile: reenable inline static init
Updates #58293
Updates #58339
Fixes #58439

Change-Id: I06d2d92f86fa4a672d69515c4066d69d3e0fc75b
Reviewed-on: https://go-review.googlesource.com/c/go/+/467016
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-04-14 17:57:36 +00:00
Cuong Manh Le a141c58c85 cmd/compile: handle string concatenation in static init inliner
The static init inliner uses typecheck.EvalConst to handle string
concatenation expressions. But the static init inliner may reveal constant
expressions after substitution, and the compiler needs to evaluate those
expressions with non-constant semantics. Using typecheck.EvalConst, which
always evaluates expressions with constant semantics, is not the right
choice.

For safety, this CL folds the logic to handle string concatenation into
the static init inliner, so there won't be a regression in handling
constant expressions with non-constant semantics. Also, a future CL can
simplify the typecheck.EvalConst logic.

Updates #58293
Updates #58339
Fixes #58439

Change-Id: I74068d99c245938e576afe9460cbd2b39677bbff
Reviewed-on: https://go-review.googlesource.com/c/go/+/466277
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-04-14 17:57:14 +00:00
Keith Randall 2b92c39fe0 cmd/link: establish dependable package initialization order
(This is a retry of CL 462035 which was reverted at 474976.
The only change from that CL is the aix fix SRODATA->SNOPTRDATA
at inittask.go:141)

As described here:

https://github.com/golang/go/issues/31636#issuecomment-493271830

"Find the lexically earliest package that is not initialized yet,
but has had all its dependencies initialized, initialize that package,
 and repeat."

Simplify the runtime a bit, by just computing the ordering required
in the linker and giving a list to the runtime.
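
The quoted rule can be sketched as a small ordering routine. This is
only an illustration under stated assumptions, not the linker's code:
the function names initOrder and ready are hypothetical, packages are
identified by plain strings, and "lexically earliest" is taken to mean
sorted by that string.

```go
package main

import (
	"fmt"
	"sort"
)

// ready reports whether every imported package has been initialized.
func ready(imports []string, done map[string]bool) bool {
	for _, dep := range imports {
		if !done[dep] {
			return false
		}
	}
	return true
}

// initOrder repeatedly picks the lexically earliest package that is not
// yet initialized but whose dependencies all are. deps maps each package
// to the packages it imports.
func initOrder(deps map[string][]string) []string {
	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names)

	done := make(map[string]bool, len(names))
	order := make([]string, 0, len(names))
	for len(order) < len(names) {
		picked := false
		for _, name := range names {
			if done[name] || !ready(deps[name], done) {
				continue
			}
			done[name] = true
			order = append(order, name)
			picked = true
			break // start again from the lexically earliest candidate
		}
		if !picked {
			break // import cycle or unknown dependency: stop
		}
	}
	return order
}

func main() {
	deps := map[string][]string{"a": {"c"}, "b": nil, "c": nil}
	fmt.Println(initOrder(deps)) // prints [b c a]
}
```

Here "a" sorts first but has to wait for "c", so "b" and "c" are
initialized before it.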

Update #31636
Fixes #57411

RELNOTE=yes

Change-Id: I28c09451d6aa677d7394c179d23c2c02c503fc56
Reviewed-on: https://go-review.googlesource.com/c/go/+/478916
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-14 16:55:22 +00:00
Than McIntosh 8854be4180 Revert "cmd/compile: allow more inlining of functions that construct closures"
This reverts commit http://go.dev/cl/c/482356.

Reason for revert: Reverting this change again, since it is causing additional failures in google-internal testing.

Change-Id: I9234946f62e5bb18c2f873a65e8b298d04af0809
Reviewed-on: https://go-review.googlesource.com/c/go/+/484735
Reviewed-by: Florian Zenker <floriank@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-04-14 14:45:59 +00:00
Junwei Zuo 89567a35c1 cmd/compile: fix ir.StaticValue for ORANGE
Range statement will mutate the key and value, so we should treat them as reassigned.

Fixes #59572

Change-Id: I9c6b67d938760a0c6a1d9739f2737c67af4a3a10
Reviewed-on: https://go-review.googlesource.com/c/go/+/483855
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-04-12 19:28:47 +00:00
Johan Brandhorst-Satzkorn 319b75ed33 all: add wasip1 support
Fixes #58141

Co-authored-by: Richard Musiol <neelance@gmail.com>
Co-authored-by: Achille Roussel <achille.roussel@gmail.com>
Co-authored-by: Julien Fabre <ju.pryz@gmail.com>
Co-authored-by: Evan Phoenix <evan@phx.io>
Change-Id: I49b66946acc90fdf09ed9223096bfec9a1e5b923
Reviewed-on: https://go-review.googlesource.com/c/go/+/479627
Run-TryBot: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Bypass: Ian Lance Taylor <iant@golang.org>
2023-04-11 20:56:32 +00:00
Cuong Manh Le 63a08e61bd cmd/compile: teach prove about bitwise OR operation
For now, only apply the rule if either of the arguments is a constant. That
would catch a lot of real user code, without slowing down the compiler
with code generated for string comparison (experience in CL 410336).
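
One bound that follows from a bitwise OR with a constant is that, for
unsigned values, x | c is never smaller than c. The sketch below checks
that bound exhaustively for 8-bit values. The helper name
orLowerBoundHolds is hypothetical, and this message does not spell out
which facts the prove pass records, so treat the bound as an
illustrative assumption.

```go
package main

import "fmt"

// orLowerBoundHolds reports whether x|c >= c for every uint8 value x.
func orLowerBoundHolds(c uint8) bool {
	for x := 0; x < 256; x++ {
		if uint8(x)|c < c {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(orLowerBoundHolds(1))
	fmt.Println(orLowerBoundHolds(0x80))
}
```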

Updates #57959
Fixes #45928

Change-Id: Ie2e830d6d0d71cda3947818b22c2775bd94f7971
Reviewed-on: https://go-review.googlesource.com/c/go/+/483359
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2023-04-10 17:13:41 +00:00
Cuong Manh Le 231f290e51 runtime: mark map bucket slots as empty during map clear
So iterators that are in progress can know entries have been deleted and
terminate the iterator properly.
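
The situation can be sketched as follows: the map is emptied while a
range loop over it is still in progress, and the loop is expected to
stop producing entries. The function name iterationsAfterClear is
hypothetical, and the inner delete loop stands in for whatever clearing
operation the runtime change covers.

```go
package main

import "fmt"

// iterationsAfterClear empties m during the first iteration of a range
// loop and reports how many iterations the loop performed in total.
func iterationsAfterClear(m map[int]bool) int {
	count := 0
	for range m {
		count++
		for k := range m {
			delete(m, k) // remove every entry, including unvisited ones
		}
	}
	return count
}

func main() {
	m := map[int]bool{1: true, 2: true, 3: true}
	fmt.Println(iterationsAfterClear(m))
	fmt.Println(len(m))
}
```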

Update #55002
Update #56351
Fixes #59411

Change-Id: I924f16a00fe4ed6564f730a677348a6011d3fb67
Reviewed-on: https://go-review.googlesource.com/c/go/+/481935
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-08 05:25:04 +00:00
Keith Randall b3bc8620f8 cmd/compile: use correct type for byteswaps on multi-byte stores
Use the type of the store for the byteswap, not the type of the
store's value argument.

Normally when we're storing a 16-bit value, the value being stored is
also typed as 16 bits. But sometimes it is typed as something smaller,
usually because it is the result of an upcast from a smaller value,
and that upcast needs no instructions.

If the type of the store's arg is thinner than the type being stored,
and the byteswap'd value uses that thinner type, and the byteswap'd
value needs to be spilled & restored, that spill/restore happens using
the thinner type, which causes us to lose some of the top bits of the
value.
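
A source pattern of the kind described might look like the sketch below,
where an 8-bit value is widened and stored as a big-endian 16-bit
quantity. The function name storeWidened is hypothetical; whether this
exact code reached the miscompiled path is an assumption, and the
example only shows the result a correct compiler must produce.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// storeWidened widens an 8-bit value to 16 bits and stores it big-endian.
// The store is 16 bits wide even though the value originated as 8 bits.
func storeWidened(dst []byte, v uint8) {
	binary.BigEndian.PutUint16(dst, uint16(v))
}

func main() {
	buf := make([]byte, 2)
	storeWidened(buf, 0xAB)
	fmt.Println(buf[0], buf[1])
}
```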

Fixes #59367

Change-Id: If6ce1e8a76f18bf8e9d79871b6caa438bc3cce4d
Reviewed-on: https://go-review.googlesource.com/c/go/+/481395
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-07 21:11:29 +00:00
Than McIntosh 39986d28e4 cmd/compile: allow more inlining of functions that construct closures
[This is a roll-forward of CL 479095, which was reverted due to a bad
interaction between inlining and escape analysis since fixed in CL 482355.]

Currently, when the inliner is determining if a function is
inlineable, it descends into the bodies of closures constructed by
that function. This has several unfortunate consequences:

- If the closure contains a disallowed operation (e.g., a defer), then
  the outer function can't be inlined. It makes sense that the
  *closure* can't be inlined in this case, but it doesn't make sense
  to punish the function that constructs the closure.

- The hairiness of the closure counts against the inlining budget of
  the outer function. Since we currently copy the closure body when
  inlining the outer function, this makes sense from the perspective
  of export data size and binary size, but ultimately doesn't make
  much sense from the perspective of what should be inlineable.

- Since the inliner walks into every closure created by an outer
  function in addition to starting a walk at every closure, this adds
  an n^2 factor to inlinability analysis.

This CL simply drops this behavior.

In std, this makes 57 more functions inlinable, and disallows inlining
for 10 (due to the basic instability of our bottom-up inlining
approach), for a net increase of 47 inlinable functions (+0.6%).

This will help significantly with the performance of the functions to
be added for #56102, which have a somewhat complicated nesting of
closures with a performance-critical fast path.

The downside of this seems to be a potential increase in export data
and text size, but the practical impact of this seems to be
negligible:

               │    before    │           after            │
               │    bytes     │    bytes      vs base      │
Go/binary        15.12Mi ± 0%   15.14Mi ± 0%  +0.16% (n=1)
Go/text          5.220Mi ± 0%   5.237Mi ± 0%  +0.32% (n=1)
Compile/binary   22.92Mi ± 0%   22.94Mi ± 0%  +0.07% (n=1)
Compile/text     8.428Mi ± 0%   8.435Mi ± 0%  +0.08% (n=1)

Updates #56102.

Change-Id: I1f4fc96c71609c8feb59fecdb92b69ba7e3b5b41
Reviewed-on: https://go-review.googlesource.com/c/go/+/482356
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-07 15:12:08 +00:00
Than McIntosh f1caf1aa1c cmd/compile: deadcode unreferenced hidden closures during inlining
When a closure is inlined, it may contain other hidden closures, which
the inliner will duplicate, rendering the original nested closures as
unreachable. Because they are unreachable, they don't get processed in
escape analysis, meaning that go/defer statements don't get rewritten,
which can then in turn trigger errors in walk. This patch looks for
nested hidden closures and marks them as dead, so that they can be
skipped later on in the compilation flow.  NB: if during escape
analysis we rediscover a hidden closure (due to an explicit reference)
that was previously marked dead, revive it at that point.

Fixes #59404.

Change-Id: I76db1e9cf1ee38bd1147aeae823f916dbbbf081b
Reviewed-on: https://go-review.googlesource.com/c/go/+/482355
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-04-07 15:07:18 +00:00
ruinan 9be533a8ee cmd/compile: get more bounds info from logic operators in prove pass
Currently, the prove pass can get knowledge from some specific logic
operators only before the CFG is explored, which means that the bounds
information of the branch will be ignored.

This CL updates the facts table by the logic operators in every
branch. Combined with the branch information, this will be helpful for
BCE in some circumstances.

Fixes #57243

Change-Id: I0bd164f1b47804ccfc37879abe9788740b016fd5
Reviewed-on: https://go-review.googlesource.com/c/go/+/419555
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2023-04-07 10:09:11 +00:00
Cuong Manh Le 1e5955aabd cmd/compile: don't set range expr key/value type if already set
Unified IR already records the correct type for them.

Fixes #59378

Change-Id: I275c45b48f67bde55c8e2079d60b5868d0acde7f
Reviewed-on: https://go-review.googlesource.com/c/go/+/481555
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-04-05 17:48:15 +00:00
Than McIntosh f5371581c7 Revert "cmd/compile: allow more inlining of functions that construct closures"
This reverts commit http://go.dev/cl//479095

Reason for revert: causes failures in google-internal testing

Change-Id: If1018b35be0b8627e2959f116179ada24d44d67c
Reviewed-on: https://go-review.googlesource.com/c/go/+/481637
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
2023-04-03 14:51:33 +00:00
Keith Randall 8edcdddb23 crypto/subtle: don't cast to *uintptr when word size is 0
Casting to a *uintptr is not ok if there isn't at least 8 bytes of
data backing that pointer (on 64-bit archs).
So although we end up making a slice of 0 length with that pointer,
the cast itself doesn't know that.
Instead, bail early if the result is going to be 0 length.

Fixes #59334

Change-Id: Id3c0e09d341d838835c0382cccfb0f71dc3dc7e6
Reviewed-on: https://go-review.googlesource.com/c/go/+/480575
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
2023-03-31 23:25:07 +00:00
Austin Clements 2ff684a541 cmd/compile: allow more inlining of functions that construct closures
Currently, when the inliner is determining if a function is
inlineable, it descends into the bodies of closures constructed by
that function. This has several unfortunate consequences:

- If the closure contains a disallowed operation (e.g., a defer), then
  the outer function can't be inlined. It makes sense that the
  *closure* can't be inlined in this case, but it doesn't make sense
  to punish the function that constructs the closure.

- The hairiness of the closure counts against the inlining budget of
  the outer function. Since we currently copy the closure body when
  inlining the outer function, this makes sense from the perspective
  of export data size and binary size, but ultimately doesn't make
  much sense from the perspective of what should be inlineable.

- Since the inliner walks into every closure created by an outer
  function in addition to starting a walk at every closure, this adds
  an n^2 factor to inlinability analysis.

This CL simply drops this behavior.

In std, this makes 57 more functions inlinable, and disallows inlining
for 10 (due to the basic instability of our bottom-up inlining
approach), for a net increase of 47 inlinable functions (+0.6%).

This will help significantly with the performance of the functions to
be added for #56102, which have a somewhat complicated nesting of
closures with a performance-critical fast path.

The downside of this seems to be a potential increase in export data
and text size, but the practical impact of this seems to be
negligible:

               │    before    │           after            │
               │    bytes     │    bytes      vs base      │
Go/binary        15.12Mi ± 0%   15.14Mi ± 0%  +0.16% (n=1)
Go/text          5.220Mi ± 0%   5.237Mi ± 0%  +0.32% (n=1)
Compile/binary   22.92Mi ± 0%   22.94Mi ± 0%  +0.07% (n=1)
Compile/text     8.428Mi ± 0%   8.435Mi ± 0%  +0.08% (n=1)

Change-Id: Ie9e38104fed5689a94c368288653fd7cb4b7a35e
Reviewed-on: https://go-review.googlesource.com/c/go/+/479095
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2023-03-31 20:00:40 +00:00
Keith Randall 0d9eb8bea2 cmd/compile: casts from slices to array pointers are known to be non-nil
The cast is preceded by a bounds check. If the bounds check passes
then we know the pointer in the slice is non-nil.

... except casts to pointers of 0-sized arrays. They are strange, as
the bounds check can pass for a nil input.
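
The zero-sized exception follows from the language rules for converting
a slice to an array pointer, as the sketch below shows. The helper names
toArray4 and toArray0 are hypothetical.

```go
package main

import "fmt"

// toArray4 panics unless len(s) >= 4, so a successful result is non-nil.
func toArray4(s []byte) *[4]byte {
	return (*[4]byte)(s)
}

// toArray0 never fails the length check, even for a nil slice, and a
// nil slice converts to a nil pointer.
func toArray0(s []byte) *[0]byte {
	return (*[0]byte)(s)
}

func main() {
	fmt.Println(toArray4([]byte{1, 2, 3, 4}) != nil)
	fmt.Println(toArray0(nil) == nil)
	fmt.Println(toArray0([]byte{}) != nil)
}
```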

Change-Id: Ic01cf4a82d59fbe3071d4b271c94efca9cafaec1
Reviewed-on: https://go-review.googlesource.com/c/go/+/479335
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-03-29 21:55:11 +00:00
Robert Griesemer 91a40f43b6 go/types, types2: don't report assignment mismatch errors if there are other errors
Change the Checker.use/useLHS functions to report if all "used"
expressions evaluated without error. Use that information to
control whether to report an assignment mismatch error or not.
This will reduce the number of errors reported per assignment,
where the assignment mismatch is only one of the errors.

Change-Id: Ia0fc3203253b002e4e1d5759d8d5644999af6884
Reviewed-on: https://go-review.googlesource.com/c/go/+/478756
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
2023-03-28 22:22:08 +00:00
Ian Lance Taylor a6f564c8e9 test: add test that caused a gofrontend crash
For #55242

Change-Id: I092b1881623ea997b178d038c0afd10cd5bca937
Reviewed-on: https://go-review.googlesource.com/c/go/+/479898
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-28 20:27:13 +00:00
Keith Randall 61bc17f04e cmd/compile: don't assume pointer of a slice is non-nil
unsafe.SliceData can return pointers which are nil. That function gets
lowered to the SSA OpSlicePtr, which the compiler assumes is non-nil.
This used to be the case as OpSlicePtr was only used in situations
where the bounds check already passed. But with unsafe.SliceData that
is no longer the case.

There are situations where we know it is non-nil. Use Bounded() to
indicate that.

I looked through all the uses of OSPTR and added SetBounded where it
made sense. Most OSPTR results are passed directly to runtime calls
(e.g. memmove), so even if we know they are non-nil that info isn't
helpful.
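
As an illustration only (not part of the CL), the following sketch shows the documented behavior that makes the old assumption unsafe: unsafe.SliceData returns nil for a nil slice. The helper name dataPtrIsNil is hypothetical, and the example assumes Go 1.20 or later, where unsafe.SliceData exists.

```go
package main

import (
	"fmt"
	"unsafe"
)

// dataPtrIsNil is a hypothetical helper: it reports whether
// unsafe.SliceData returns a nil pointer for s.
func dataPtrIsNil(s []int) bool {
	return unsafe.SliceData(s) == nil
}

func main() {
	var nilSlice []int
	// A nil slice is documented to yield a nil data pointer.
	fmt.Println(dataPtrIsNil(nilSlice))
	// A slice with cap > 0 yields a pointer to its first element.
	fmt.Println(dataPtrIsNil([]int{1, 2, 3}))
}
```

So a pointer obtained this way cannot be assumed non-nil unless a bounds check has already succeeded.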

Fixes #59293

Change-Id: I437a15330db48e0082acfb1f89caf8c56723fc51
Reviewed-on: https://go-review.googlesource.com/c/go/+/479896
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2023-03-28 19:55:43 +00:00
Robert Griesemer 8c5e8a38df go/types, types2: refactor initVars
As with changes in prior CLs, we don't suppress legitimate
"declared but not used" errors anymore simply because the
respective variables are used in incorrect assignments,
unrelated to the variables in question.
Adjust several (ancient) tests accordingly.

Change-Id: I5826393264d9d8085c64777a330d4efeb735dd2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/478716
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
2023-03-28 18:13:13 +00:00
Robert Griesemer abf9b112fd go/types, types2: more systematic use of Checker.use and useLHS
This CL re-introduces useLHS because we don't want to suppress
correct "declared but not used" errors for variables that only
appear on the LHS of an assignment (using Checker.use would mark
them as used).

This CL also adjusts a couple of places where types2 differed
from go/types (and suppressed valid "declared and not used"
errors). Now those errors are surfaced. Adjusted a handful of
tests accordingly.

Change-Id: Ia555139a05049887aeeec9e5221b1f41432c1a57
Reviewed-on: https://go-review.googlesource.com/c/go/+/478635
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
2023-03-28 14:28:33 +00:00
Robert Griesemer bf9d9b7dba go/types, types2: better error message for some invalid integer array lengths
Don't say "array length must be integer" if it is in fact an integer.

Fixes #59209

Change-Id: If60b93a0418f5837ac334412d3838eec25eeb855
Reviewed-on: https://go-review.googlesource.com/c/go/+/479115
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-27 18:59:51 +00:00
Robert Griesemer 171850f169 cmd/compile: don't panic if unsafe.Sizeof/Offsetof is used with oversize types
In the Sizes API, recognize an overflow (to a negative value) as a
consequence of an oversize value, and specify as such in the API.

Adjust the various size computations to take overflow into account.

Recognize a negative size or offset as an error and report it rather
than panicking.

Use the same protocol for results provided by the default (StdSizes)
and external Sizes implementations.

Add a new error code TypeTooLarge for the new errors.

Fixes #59190.
Fixes #59207.

Change-Id: I8c33a9e69932760275100112dde627289ac7695b
Reviewed-on: https://go-review.googlesource.com/c/go/+/478919
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2023-03-27 16:52:49 +00:00
erifan01 42f99b203d cmd/compile: optimize cmp to cmn under conditions < and >= on arm64
Under the right conditions we can optimize cmp comparisons to cmn
comparisons, such as:
func foo(a, b int) int {
  var c int
  if a + b < 0 {
  	c = 1
  }
  return c
}

Previously it's compiled as:
  ADD     R1, R0, R1
  CMP     $0, R1
  CSET    LT, R0
With this CL it's compiled as:
  CMN     R1, R0
  CSET    MI, R0
Here we need to pay attention to the overflow of a+b. The MI condition
means N==1; it ignores the overflow flag V and depends only on the sign
of the result. That matches the semantics of the Go code, so the
transformation is correct.

Similarly, this CL also optimizes the case of >= comparison
using the PL conditional flag.

Change-Id: I47179faba5b30cca84ea69bafa2ad5241bf6dfba
Reviewed-on: https://go-review.googlesource.com/c/go/+/476116
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-24 01:19:09 +00:00
Ian Lance Taylor 09e9a9eac9 test: add test that caused gofrontend crash
For #59169

Change-Id: Id72ad9fe8b6e1d7cf64f972520ae8858f70c025a
Reviewed-on: https://go-review.googlesource.com/c/go/+/478217
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2023-03-22 18:56:30 +00:00
Cuong Manh Le 07559ceb72 cmd/compile: mark negative size memclr non-inlineable
Fixes #59174

Change-Id: I72b2b068830b90d42a0186addd004fb3175b9126
Reviewed-on: https://go-review.googlesource.com/c/go/+/478375
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Jakub Ciolek <jakub@ciolek.dev>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-22 16:43:10 +00:00
erifan01 91a2e921dd cmd/compile: fix incorrect truncating when converting CMP to TST on arm64
CL 420434 optimized CMP into TST in some situations, but it has a bug.
These four rules are not correct:
(LessThan (CMPWconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (LessThan (TSTconst [c] y))
(LessEqual (CMPWconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (LessEqual (TSTconst [c] y))
(GreaterThan (CMPWconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (GreaterThan (TSTconst [c] y))
(GreaterEqual (CMPWconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (GreaterEqual (TSTconst [c] y))

But due to the existence of this rule
(LessThan (CMPWconst [0] x:(ANDconst [c] y))) && x.Uses == 1 =>
(LessThan (TSTWconst [int32(c)] y)), the above rules have never
fired. This CL corrects them to:
(LessThan (CMPconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (LessThan (TSTconst [c] y))
(LessEqual (CMPconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (LessEqual (TSTconst [c] y))
(GreaterThan (CMPconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (GreaterThan (TSTconst [c] y))
(GreaterEqual (CMPconst [0] x:(ANDconst [c] y))) && x.Uses == 1 => (GreaterEqual (TSTconst [c] y))

Change-Id: I7d60bcc9a266ee58388baeaab9f493b57cf1ad55
Reviewed-on: https://go-review.googlesource.com/c/go/+/473617
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
2023-03-22 08:32:53 +00:00
Yi Yang da4687923b cmd/compile: add rewrite rules for arithmetic operations
Add the following common local transformations

(t + x) - (t + y) == x - y
(t + x) - (y + t) == x - y
(x + t) - (y + t) == x - y
(x + t) - (t + y) == x - y
(x - t) + (t + y) == x + y
(x - t) + (y + t) == x + y

The compiler itself matches such patterns many times. This also aligns with other popular compilers.

Fixes #59111

Change-Id: Ibdfdb414782f8fcaa20b84ac5d43d0d9ae2c7b60
GitHub-Last-Rev: 1aad82e62e
GitHub-Pull-Request: golang/go#59119
Reviewed-on: https://go-review.googlesource.com/c/go/+/477555
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2023-03-20 15:42:09 +00:00
Wayne Zuo cedfcba3e8 cmd/compile: intrinsify TrailingZeros{8,32,64} for 386
This CL adds support for intrinsifying the TrailingZeros{8,32,64}
functions on the 386 architecture. We need to handle the case when the input
is 0, which could lead to undefined output from the BSFL instruction.

Next CL will remove the assembly code in runtime/internal/sys package.

Change-Id: Ic168edf68e81bf69a536102100fdd3f56f0f4a1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/475735
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-14 08:10:32 +00:00
Than McIntosh f5c7416511 cmd/compile: reorder operations in SCCs to enable more inlining
This patch changes the relative order of "CanInline" and "InlineCalls"
operations within the inliner for clumps of functions corresponding to
strongly connected components in the call graph. This helps increase
the amount of inlining within SCCs, particularly in Go's runtime
package, which has a couple of very large SCCs.

For a given SCC of the form { fn1, fn2, ... fnk }, the inliner would
(prior to this point) walk through the list of functions and for each
function first compute inlinability ("CanInline") and then perform
inlining ("InlineCalls"). This meant that if there was an inlinable
call from fn3 to fn4 (for example), this call would never be inlined,
since at the point fn3 was visited, we would not have computed
inlinability for fn4.

We now do inlinability analysis for all functions in an SCC first,
then do actual inlining for everything. This results in 47 additional
inlines in the Go runtime package (a fairly modest increase
percentage-wise of 0.6%).
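
The difference between the two orderings can be sketched with a toy model. This is not the inliner's actual code: the fn type, the "small" inlinability criterion, and both function names are assumptions made for illustration.

```go
package main

import "fmt"

// fn is a hypothetical stand-in for a function in one SCC.
type fn struct {
	name  string
	small bool     // toy criterion standing in for "CanInline"
	calls []string // callees inside the same SCC
}

// interleaved models the old order: analyze a function, then
// immediately inline its calls using what is known so far.
func interleaved(scc []fn) int {
	inlinable := map[string]bool{}
	inlined := 0
	for _, f := range scc {
		inlinable[f.name] = f.small
		for _, callee := range f.calls {
			if inlinable[callee] {
				inlined++
			}
		}
	}
	return inlined
}

// twoPhase models the new order: analyze every function in the SCC
// first, then inline calls everywhere.
func twoPhase(scc []fn) int {
	inlinable := map[string]bool{}
	for _, f := range scc {
		inlinable[f.name] = f.small
	}
	inlined := 0
	for _, f := range scc {
		for _, callee := range f.calls {
			if inlinable[callee] {
				inlined++
			}
		}
	}
	return inlined
}

// exampleSCC builds a four-function cycle in which fn1 is too big to inline.
func exampleSCC() []fn {
	return []fn{
		{"fn1", false, []string{"fn2"}},
		{"fn2", true, []string{"fn3"}},
		{"fn3", true, []string{"fn4"}},
		{"fn4", true, []string{"fn1"}},
	}
}

func main() {
	scc := exampleSCC()
	fmt.Println(interleaved(scc), twoPhase(scc)) // 0 3
}
```

In this model the fn3-to-fn4 call is missed by the interleaved order because fn4 has not been analyzed yet when fn3 is visited, matching the example in the message.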

Updates #58905.

Change-Id: I48dbb1ca16f0b12f256d9eeba8cf7f3e6dd853cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/474955
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2023-03-09 22:13:26 +00:00
Keith Randall 642542cb3c Revert "cmd/link: establish dependable package initialization order"
This reverts commit ce2a609909.
aka CL 462035

Reason for revert: this CL is causing some problems in some internal Google programs.

Change-Id: I4476b8d8d2c3d7b5703d1d85c93baebb4b4e5d26
Reviewed-on: https://go-review.googlesource.com/c/go/+/474976
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2023-03-09 19:19:41 +00:00
Keith Randall 54d05e4e25 test: test for issue 53087
This issue has been fixed with unified IR, so just add a test.

Update #53087

Change-Id: I965d9f27529fa6b7c89e2921c65e5a100daeb9fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/410197
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Keith Randall <khr@google.com>
2023-03-08 16:23:09 +00:00
Cherry Mui b675a75c3d cmd/compile: enable address folding for globals on ARM64, just not -dynlink mode
On ARM64, in -dynlink mode (building a shared library or a plugin),
global variables are accessed through the GOT. Currently, the
GOT access instruction sequence our assembler generates doesn't
handle large offsets well, so we don't fold the offset into loads
and stores in the compiler. Currently, the rewrite rules are
guarded with the -shared flag. However, the GOT access
instructions are only generated in the -dynlink mode (which
implies -shared, but not the other direction).

CL 445535 attempted to remove the guard altogether. But that
causes a build failure in -dynlink mode for the reason above. This
CL changes it to guard specifically on -dynlink mode, allowing
the optimization in more cases (-shared but not -dynlink build
modes).

Updates #58826.

Change-Id: I1391db6a33e8d0455a304e7cae7fcfdeb49bfdab
Reviewed-on: https://go-review.googlesource.com/c/go/+/473999
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-07 21:29:30 +00:00
David Chase c20d959163 cmd/compile: experimental loop iterator capture semantics change
Adds:
GOEXPERIMENT=loopvar (expected way of invoking)
-d=loopvar={-1,0,1,2,11,12} (for per-package control and/or logging)
-d=loopvarhash=... (for hash debugging)

loopvar=11,12 are for testing, benchmarking, and debugging.

If enabled, for loops of the form `for x,y := range thing`, if x and/or
y are addressed or captured by a closure, are transformed by renaming
x/y to a temporary and prepending an assignment to the body of the
loop x := tmp_x.  This changes the loop semantics by making each
iteration's instance of x be distinct from the others (currently they
are all aliased, and when this matters, it is almost always a bug).
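
The described rewrite can be mimicked by hand. The sketch below is an illustration, not compiler output: it renames the range variable to a temporary (tmpX, a hypothetical name) and prepends x := tmpX to the body, so each closure captures its own x regardless of which loop semantics the toolchain uses.

```go
package main

import "fmt"

// capturePerIteration applies the transformation manually: the range
// variable becomes a temporary and the body starts with a fresh copy.
func capturePerIteration(xs []int) []func() int {
	var fs []func() int
	for _, tmpX := range xs {
		x := tmpX // per-iteration instance of x
		fs = append(fs, func() int { return x })
	}
	return fs
}

func main() {
	for _, f := range capturePerIteration([]int{1, 2, 3}) {
		fmt.Println(f()) // 1, then 2, then 3
	}
}
```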

3-clause for loops with captured iteration variables are also transformed,
though it is a more complex transformation.

"Optimized" to do a simpler transformation for
3-clause for where the increment is empty.

(Prior optimization of address-taking under Return disabled, because
it was incorrect; returns can have loops for children.  Restored in
a later CL.)

Includes support for -d=loopvarhash=<binary string> intended for use
with hash search and GOCOMPILEDEBUG=loopvarhash=<binary string>
(use `gossahash -e loopvarhash command-that-fails`).

Minor feature upgrades to hash-triggered features; clients can specify
that file-position hashes use only the most-inline position, and/or that
they use only the basenames of source files (not the full directory path).
Most-inlined is the right choice for debugging loop-iteration change
once the semantics are linked to the package across inlining; basename-only
makes it tractable to write tests (which, otherwise, depend on the full
pathname of the source file and thus vary).

Updates #57969.

Change-Id: I180a51a3f8d4173f6210c861f10de23de8a1b1db
Reviewed-on: https://go-review.googlesource.com/c/go/+/411904
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-06 18:34:24 +00:00
Keith Randall ce2a609909 cmd/link: establish dependable package initialization order
As described here:

https://github.com/golang/go/issues/31636#issuecomment-493271830

"Find the lexically earliest package that is not initialized yet,
but has had all its dependencies initialized, initialize that package,
 and repeat."

Simplify the runtime a bit, by just computing the ordering required
in the linker and giving a list to the runtime.

Update #31636
Fixes #57411

RELNOTE=yes

Change-Id: I1e4d3878ebe6e8953527aedb730824971d722cac
Reviewed-on: https://go-review.googlesource.com/c/go/+/462035
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-03 23:11:37 +00:00
Keith Randall cbb9cd03f8 cmd/compile: ensure FuncForPC works on closures that start with NOPs
A 0-sized no-op shouldn't prevent us from detecting that the first
instruction is from an inlined callee.

Update #58300

Change-Id: Ic5f6ed108c54a32c05e9b2264b516f2cc17e4619
Reviewed-on: https://go-review.googlesource.com/c/go/+/467977
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-03 16:35:00 +00:00
Matthew Dempsky 37a2004b43 cmd/compile: relax overly strict assertion
The assertion here was to make sure the newly constructed and
typechecked expression selected the same receiver-qualified method,
but in the case of anonymous receiver types we can actually end up
with separate types.Field instances corresponding to each types.Type
instance. In that case, the assertion spuriously failed.

The fix here is to relax the assertion and just compare the method's
name and type (including receiver type).

Fixes #58563.

Change-Id: I67d51ddb020e6ed52671473c93fc08f283a40886
Reviewed-on: https://go-review.googlesource.com/c/go/+/471676
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-01 20:26:10 +00:00
ruinan 4d180f71dc cmd/compile: omit redundant sign/unsign extension on arm64
On Arm64, all 32-bit instructions ignore the upper 32 bits of their
inputs and clear them to zero in the result, so there is no need to do
an unsigned extension before a 32-bit op.

This CL removes the redundant unsigned extension only for the existing
32-bit opcodes, and also omits the sign extension when the upper bit of
the result can be predicted.

Fixes #42162

Change-Id: I61e6670bfb8982572430e67a4fa61134a3ea240a
CustomizedGitHooks: yes
Reviewed-on: https://go-review.googlesource.com/c/go/+/427454
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-28 03:16:44 +00:00
Dmitri Shuralyov 7a0799b2c0 cmd/dist, test: convert test/run.go runner to a cmd/go test
As motivated on the issue, we want to move the functionality of the
run.go program to happen via a normal go test. Each .go test case in
the GOROOT/test directory gets a subtest, and cmd/go's support for
parallel test execution replaces run.go's own implementation thereof.

The goal of this change is to have a fairly minimal and readable diff
while making an atomic changeover. The working directory is modified
during the test execution to be GOROOT/test as it was with run.go,
and most of the test struct and its run method are kept unchanged.
The next CL in the stack applies further simplifications and cleanups
that become viable.

There's no noticeable difference in test execution time: it takes around
60-80 seconds both before and after on my machine. Test caching, which
the previous runner lacked, can shorten the time significantly.

For #37486.
Fixes #56844.

Change-Id: I209619dc9d90e7529624e49c01efeadfbeb5c9ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/463276
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-28 01:11:37 +00:00
Matthew Dempsky fa9efd9171 cmd/compile/internal/noder: correct positions for synthetic closures
When inlining functions that contain function literals, we need to be
careful about position information. The OCLOSURE node should use the
inline-adjusted position, but the ODCLFUNC and its body should use the
original positions.

However, the same problem can arise with certain generic constructs,
which require the compiler to synthesize function literals to insert
dictionary arguments.

go.dev/cl/425395 fixed the issue with user-written function literals
in a somewhat kludgy way; this CL extends the same solution to
synthetic function literals.

This is all quite subtle and the solutions aren't terribly robust, so
longer term it's probably desirable to revisit how we track inlining
context for positions. But for now, this seems to be the least bad
solution, esp. for backporting to 1.20.

Updates #54625.
Fixes #58513.

Change-Id: Icc43a70dbb11a0e665cbc9e6a64ef274ad8253d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/468415
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
2023-02-27 23:07:49 +00:00
Michael Munday 85d54a7667 cmd/compile: use zero constants in comparisons where possible
Some integer comparisons with 1 and -1 can be rewritten as comparisons
with 0. For example, x < 1 is equivalent to x <= 0. This is an
advantageous transformation on riscv64 because comparisons with zero
do not require a constant to be loaded into a register. Other
architectures will likely benefit too and the transformation is
relatively benign on architectures that do not benefit.

Change-Id: I2ce9821dd7605a660eb71d76e83a61f9bae1bf25
Reviewed-on: https://go-review.googlesource.com/c/go/+/350831
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Munday <mike.munday@lowrisc.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-27 21:38:30 +00:00
Lynn Boger ebe49f98c8 cmd/compile: inline constant sized memclrNoHeapPointers calls on PPC64
Update the function isInlinableMemclr for ppc64 and ppc64le
to enable inlining for the constant sized cases < 512.

Larger cases can use dcbz which performs better but requires
alignment checking so it is best to continue using memclrNoHeapPointers
for those cases.

Results on p10:

MemclrKnownSize1         2.07ns ± 0%     0.57ns ± 0%   -72.59%
MemclrKnownSize2         2.56ns ± 5%     0.57ns ± 0%   -77.82%
MemclrKnownSize4         5.15ns ± 0%     0.57ns ± 0%   -89.00%
MemclrKnownSize8         2.23ns ± 0%     0.57ns ± 0%   -74.57%
MemclrKnownSize16        2.23ns ± 0%     0.50ns ± 0%   -77.74%
MemclrKnownSize32        2.28ns ± 0%     0.56ns ± 0%   -75.28%
MemclrKnownSize64        2.49ns ± 0%     0.72ns ± 0%   -70.95%
MemclrKnownSize112       2.97ns ± 2%     1.14ns ± 0%   -61.72%
MemclrKnownSize128       4.64ns ± 6%     2.45ns ± 1%   -47.17%
MemclrKnownSize192       5.45ns ± 5%     2.79ns ± 0%   -48.87%
MemclrKnownSize248       4.51ns ± 0%     2.83ns ± 0%   -37.12%
MemclrKnownSize256       6.34ns ± 1%     3.58ns ± 0%   -43.53%
MemclrKnownSize512       3.64ns ± 0%     3.64ns ± 0%    -0.03%
MemclrKnownSize1024      4.73ns ± 0%     4.73ns ± 0%    +0.01%
MemclrKnownSize4096      17.1ns ± 0%     17.1ns ± 0%    +0.07%
MemclrKnownSize512KiB    2.12µs ± 0%     2.12µs ± 0%      ~     (all equal)

Change-Id: If1abf5749f4802c64523a41fe0058bd144d0ea46
Reviewed-on: https://go-review.googlesource.com/c/go/+/464340
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Jakub Ciolek <jakub@ciolek.dev>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2023-02-23 18:57:27 +00:00
Matthew Dempsky 9f834a559c test: add regress test for #58572
Fixes #58572.

Change-Id: I75fa432afefc3e036ed9a6a9002a29d7b23105ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/468880
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2023-02-17 03:59:20 +00:00
Cuong Manh Le 93f10b8829 cmd/compile: fix wrong escape analysis for go/defer generic calls
For go/defer calls like "defer f(x, y)", the compiler rewrites it to:

	x1, y1 := x, y
	defer func() { f(x1, y1) }()

However, if "f" needs runtime type information, the "RType" field will
refer to the outer ".dict" param, causing wrong liveness analysis.

To fix this, if "f" refers to outer ".dict", the dict param will be
copied to an autotmp, and "f" will refer to this autotmp instead.
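
The equivalence of the two forms can be shown at the source level. The sketch below is illustrative only and does not involve generics or the dictionary parameter; the function names are hypothetical. It shows that both forms observe the argument value from the time the defer statement runs.

```go
package main

import "fmt"

// deferDirect uses the plain form: arguments are evaluated at the
// defer statement, not when the deferred call runs.
func deferDirect() []int {
	var log []int
	record := func(v int) { log = append(log, v) }
	func() {
		x := 1
		defer record(x)
		x = 2
	}()
	return log
}

// deferRewritten spells out the rewrite by hand: copy the argument to
// a temporary, then defer a closure that calls with the copy.
func deferRewritten() []int {
	var log []int
	record := func(v int) { log = append(log, v) }
	func() {
		x := 1
		x1 := x
		defer func() { record(x1) }()
		x = 2
		_ = x
	}()
	return log
}

func main() {
	fmt.Println(deferDirect(), deferRewritten()) // [1] [1]
}
```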

Fixes #58341

Change-Id: I238b6e75441442b5540d39bc818205398e80c94d
Reviewed-on: https://go-review.googlesource.com/c/go/+/466035
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-13 21:28:54 +00:00
Cuong Manh Le b7736cbceb cmd/compile: disable inline static init optimization
There are plenty of regressions in 1.20 with this optimization. This CL
disables inline static init, so it's safer to backport to the 1.20 branch.

The optimization will be enabled again during 1.21 cycle.

Updates #58293
Updates #58339
For #58293

Change-Id: If5916008597b46146b4dc7108c6b389d53f35e95
Reviewed-on: https://go-review.googlesource.com/c/go/+/467015
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-02-09 19:49:12 +00:00
Sung Yoon Whang 3161081c12 cmd/compile/internal/staticinit: fix panic in interface conversion
This patch fixes a panic from incorrect interface conversion from
*ir.BasicLit to *ir.ConstExpr. This only occurs when nounified
GOEXPERIMENT is set, so ideally it should be backported to Go
1.20 and removed from master.

Fixes #58339

Change-Id: I357069d7ee1707d5cc6811bd2fbdd7b0456323ae
GitHub-Last-Rev: 641dedb5f9
GitHub-Pull-Request: golang/go#58389
Reviewed-on: https://go-review.googlesource.com/c/go/+/466175
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
2023-02-09 15:21:37 +00:00
Cuong Manh Le fd208c8850 cmd/compile: remove constant arithmetic overflows during typecheck
Since go1.19, these errors are already reported by types2 for any user's
Go code. Compiler-generated code that looks like a constant expression
should be evaluated with non-constant semantics, which allow overflows.

Fixes #58293

Change-Id: I6f0049a69bdb0a8d0d7a0db49c7badaa92598ea2
Reviewed-on: https://go-review.googlesource.com/c/go/+/465096
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
2023-02-09 09:33:47 +00:00
Jorropo 0d8d181bd5 cmd/compile: use MakeResult in empty MakeSlice elimination
This gets eliminated by those rules above:
  // for rewriting results of some late-expanded rewrites (below)
  (SelectN [0] (MakeResult x ___)) => x
  (SelectN [1] (MakeResult x y ___)) => y
  (SelectN [2] (MakeResult x y z ___)) => z

Fixes #58161

Change-Id: I4fbfd52c72c06b6b3db906bd9910b6dbb7fe8975
Reviewed-on: https://go-review.googlesource.com/c/go/+/463846
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-02-08 22:31:12 +00:00
Wayne Zuo d7ac5d1480 cmd/compile: intrinsify math/bits/ReverseBytes{32|64} for 386
The BSWAPL instruction is supported in i486 and newer.
https://github.com/golang/go/wiki/MinimumRequirements#386 says we
support "All Pentium MMX or later". The Pentium is also referred to as
i586, so that we are safe with these instructions.

Change-Id: I6dea1f9d864a45bb07c8f8f35a81cfe16cca216c
Reviewed-on: https://go-review.googlesource.com/c/go/+/465515
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-02-08 03:43:23 +00:00
Cuong Manh Le abd55d8483 cmd/compile: fix inline static init arguments substituted tree
Blank nodes must be ignored when building the arguments substituted tree.
Otherwise, one could be used to replace another blank node on the left-hand
side of an assignment, causing an invalid IR node.

Consider the following code:

	type S1 struct {
		s2 S2
	}

	type S2 struct{}

	func (S2) Make() S2 {
		return S2{}
	}

	func (S1) Make() S1 {
		return S1{s2: S2{}.Make()}
	}

	var _ = S1{}.Make()

After staticAssignInlinedCall, the assignment becomes:

	var _ = S1{s2: S2{}.Make()}

and the arg substituted tree is "map[*ir.Name]ir.Node{_: S1{}}". Now,
when doing static assignment, if there is any assignment to blank node,
for example:

	_ := S2{}

That blank node will be replaced with "S1{}":

	S1{} := S2{}

This constructs invalid IR, which causes the ICE.

Fixes #58325

Change-Id: I21b48357f669a7e02a7eb4325246aadc31f78fb9
Reviewed-on: https://go-review.googlesource.com/c/go/+/465098
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2023-02-08 02:44:20 +00:00
Cuong Manh Le 1bd0405b8f test: add test for issue 58345
CL 458619 fixed the problem unintentionally, so add a test to prevent
regressions.

Updates #58345

Change-Id: I80cf60716ef85e142d769e8621fce19c826be03d
Reviewed-on: https://go-review.googlesource.com/c/go/+/465455
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2023-02-07 20:59:40 +00:00
Keith Randall 103f37497f cmd/compile: ensure first instruction in a function is not inlined
People are using this to get the name of the function from a function type:

runtime.FuncForPC(reflect.ValueOf(fn).Pointer()).Name()

Unfortunately, this technique falls down when the first instruction
of the function is from an inlined callee. Then the expression above
gets you the name of the inlined function instead of the function itself.

To fix this, ensure that the first instruction is never from an inlinee.
Normally functions have prologs so those are already fine. In just the
cases where a function is a leaf with no local variables, and an instruction
from an inlinee appears first in the prog list, add a nop at the start
of the function to hold a non-inlined position.

Consider the nop a "mini-prolog" for leaf functions.
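
The idiom from the message, wrapped in a helper for illustration. The names funcName and hello are hypothetical, and the exact printed name depends on the package the code is built in.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// hello is a small leaf function, the kind of target the fix is about.
func hello() string { return "hi" }

// funcName applies the idiom: take the function's entry PC via reflect
// and ask the runtime which function that PC belongs to.
func funcName(fn any) string {
	return runtime.FuncForPC(reflect.ValueOf(fn).Pointer()).Name()
}

func main() {
	// Prints the package-qualified name, ending in ".hello".
	fmt.Println(funcName(hello))
}
```

The idiom only works if the entry PC is attributed to the function itself rather than to an inlined callee, which is the property this CL restores.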

Fixes #58300

Change-Id: Ie37092f4ac3167fe8e5ef4a2207b14abc1786897
Reviewed-on: https://go-review.googlesource.com/c/go/+/465076
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: David Chase <drchase@google.com>
2023-02-06 20:39:54 +00:00
Archana R a432d89137 cmd/compile: add rules to emit SETBC/R instructions on power10
This CL adds rules that replace instances of ISEL that produce
a boolean result based on a condition register with SETBC/SETBCR
operations. On Power10 these are converted to SETBC/SETBCR
instructions that use one register instead of the 3 registers
conventionally used by ISEL, which reduces register pressure.
On loops written specifically to exercise such instances of ISEL
extensively, a performance improvement of 2.5% is seen on Power10.
Also added verification tests to verify correct generation of
SETBC/SETBCR instructions on Power10.

Change-Id: Ib719897f09d893de40324440a43052dca026e8fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/449795
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-06 12:49:53 +00:00
Archana R cd1fc87156 cmd/compile: intrinsify math/bits/ReverseBytes{16|32|64} for ppc64/power10
This change intrinsifies ReverseBytes{16|32|64} by generating the
corresponding new Power10 instructions (brh, brw and brd) and
adds a verification test for them.
On Power9 and Power8, the existing Go code already performs optimally.
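
For reference, a minimal sketch of what the three math/bits functions compute. The wrapper names swap16, swap32 and swap64 are hypothetical; which instructions the compiler emits for them depends on the target and is not shown here.

```go
package main

import (
	"fmt"
	"math/bits"
)

// swap16, swap32 and swap64 are thin hypothetical wrappers around the
// math/bits functions that this change intrinsifies on Power10.
func swap16(x uint16) uint16 { return bits.ReverseBytes16(x) }
func swap32(x uint32) uint32 { return bits.ReverseBytes32(x) }
func swap64(x uint64) uint64 { return bits.ReverseBytes64(x) }

func main() {
	// Each call reverses the order of the bytes in its argument.
	fmt.Println(swap16(0x0102) == 0x0201)
	fmt.Println(swap32(0x01020304) == 0x04030201)
	fmt.Println(swap64(0x0102030405060708) == 0x0807060504030201)
}
```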

Performance improvement seen on Power10:
name            old time/op  new time/op  delta (%)
ReverseBytes32  1.38ns ± 0%  1.18ns ± 0%  -14.2
ReverseBytes64  1.52ns ± 0%  1.11ns ± 0%  -26.87
ReverseBytes16  1.41ns ± 1%  1.18ns ± 0%  -16.47

Change-Id: I88f127f3ab9ba24a772becc21ad90acfba324b37
Reviewed-on: https://go-review.googlesource.com/c/go/+/446675
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2023-02-03 19:01:06 +00:00
Keith Randall 6224db9b4d cmd/compile: schedule values with no in-block uses later
When scheduling a block, deprioritize values whose results aren't used
until subsequent blocks.

For #58166, this has the effect of pushing the induction variable increment
to the end of the block, past all the other uses of the pre-incremented value.

Do this only with optimizations on. Debuggers have a preference for values
in source code order, which this CL can degrade.

Fixes #58166
Fixes #57976

Change-Id: I40d5885c661b142443c6d4702294c8abe8026c4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/463751
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-01 18:41:07 +00:00
Cuong Manh Le ac7efcb0ca test: enable inlining tests for functions with local type
Updates #57410

Change-Id: Ibe1f5523a4635d2b844b9a5db94514e07eb0bc0f
Reviewed-on: https://go-review.googlesource.com/c/go/+/463998
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-31 20:36:55 +00:00
Cuong Manh Le b89a840d65 cmd/compile: add clear(x) builtin
clear(x) deletes all entries of a map, or zeroes the contents of a slice.
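
A minimal sketch of the builtin's behavior on both kinds of operand, assuming a toolchain that accepts clear (released in Go 1.21). The function name clearDemo is hypothetical.

```go
package main

import "fmt"

// clearDemo applies clear to a map and to a slice and returns the
// resulting map length and the slice, so the effect can be inspected.
func clearDemo() (int, []int) {
	m := map[string]int{"a": 1, "b": 2}
	clear(m) // deletes every entry; m stays a usable, empty map

	s := []int{1, 2, 3}
	clear(s) // sets every element to zero; the length is unchanged

	return len(m), s
}

func main() {
	n, s := clearDemo()
	fmt.Println(n, len(s), s)
}
```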

Updates #56351

Change-Id: I5f81dfbc465500f5acadaf2c6beb9b5f0d2c4045
Reviewed-on: https://go-review.googlesource.com/c/go/+/453395
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-01-31 19:43:07 +00:00
Jakub Ciolek 1fc585dc2f cmd/compile: inline known-size memclrNoHeapPointers calls
This patch rewrites known-size calls to memclrNoHeapPointers into an OpZero.
This significantly improves performance and lets some clears be removed by dead-store elimination (DSE).

One case where this applies is zeroing a known-size array, for example:

	var x [256]int8

	...

	for a := range x {
		x[a] = 0
	}

Other cases can be found in the runtime itself where memclrNoHeapPointers is sometimes directly invoked with a constant.

It seems that for some clear sizes on some architectures (AMD64, maybe others), memclrNoHeapPointers is faster than OpZero.
See issue #56997 for more details.
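
The array-zeroing idiom above can be written as a self-contained program; this is only a sketch of the source pattern, with the hypothetical name zeroBuf. Whether a given compiler lowers the loop to a memclrNoHeapPointers call or to an OpZero is not observable from the program's output.

```go
package main

import "fmt"

// zeroBuf clears a fixed-size array with the range-loop idiom from the
// commit message. The array length is a compile-time constant.
func zeroBuf(x *[256]int8) {
	for i := range x {
		x[i] = 0
	}
}

func main() {
	var buf [256]int8
	for i := range buf {
		buf[i] = int8(i) // fill with arbitrary values first
	}
	zeroBuf(&buf)
	// Compare against the zero value of the array type.
	fmt.Println(buf == [256]int8{})
}
```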

Benches ARM (M1 Pro):

name                      old time/op     new time/op     delta
MemclrKnownSize1-10          2.03ns ± 0%     0.31ns ± 0%    -84.69%  (p=0.000 n=18+19)
MemclrKnownSize2-10          1.97ns ± 0%     0.31ns ± 0%    -84.19%  (p=0.000 n=12+19)
MemclrKnownSize4-10          2.02ns ± 0%     0.31ns ± 0%    -84.56%  (p=0.000 n=12+20)
MemclrKnownSize8-10          2.02ns ± 0%     0.31ns ± 0%    -84.59%  (p=0.000 n=14+19)
MemclrKnownSize16-10         2.15ns ± 0%     0.31ns ± 0%    -85.50%  (p=0.000 n=18+19)
MemclrKnownSize32-10         2.48ns ± 0%     0.31ns ± 0%    -87.48%  (p=0.000 n=20+19)
MemclrKnownSize64-10         1.93ns ± 0%     0.62ns ± 0%    -67.88%  (p=0.000 n=20+19)
MemclrKnownSize112-10        2.48ns ± 0%     1.80ns ± 0%    -27.74%  (p=0.000 n=19+20)
MemclrKnownSize128-10       10.0ns ±112%      2.0ns ± 0%    -79.76%  (p=0.000 n=18+17)
MemclrKnownSize192-10       27.4ns ±103%      2.6ns ± 0%    -90.38%  (p=0.000 n=16+19)
MemclrKnownSize248-10        9.67ns ±43%     3.26ns ± 0%    -66.29%  (p=0.000 n=19+19)
MemclrKnownSize256-10       85.4ns ±148%      3.3ns ± 0%    -96.18%  (p=0.000 n=20+20)
MemclrKnownSize512-10         223ns ±54%        6ns ± 0%    -97.42%  (p=0.000 n=18+20)
MemclrKnownSize1024-10        216ns ±26%       11ns ± 0%    -95.00%  (p=0.000 n=18+15)
MemclrKnownSize4096-10        265ns ± 2%       88ns ± 0%    -66.84%  (p=0.000 n=19+17)
MemclrKnownSize512KiB-10     9.91µs ± 1%    10.23µs ± 2%     +3.14%  (p=0.000 n=19+19)
[Geo mean]                   15.6ns           2.5ns         -83.62%

name                      old speed       new speed       delta
MemclrKnownSize1-10         493MB/s ± 0%   3216MB/s ± 0%   +553.04%  (p=0.000 n=18+19)
MemclrKnownSize2-10        1.02GB/s ± 0%   6.43GB/s ± 0%   +532.33%  (p=0.000 n=16+19)
MemclrKnownSize4-10        1.99GB/s ± 0%  12.86GB/s ± 0%   +547.67%  (p=0.000 n=18+20)
MemclrKnownSize8-10        3.96GB/s ± 0%  25.72GB/s ± 0%   +548.81%  (p=0.000 n=19+19)
MemclrKnownSize16-10       7.46GB/s ± 0%  51.43GB/s ± 0%   +589.72%  (p=0.000 n=20+19)
MemclrKnownSize32-10       12.9GB/s ± 0%  102.9GB/s ± 0%   +698.60%  (p=0.000 n=20+18)
MemclrKnownSize64-10       33.1GB/s ± 0%  103.0GB/s ± 0%   +211.34%  (p=0.000 n=19+19)
MemclrKnownSize112-10      45.1GB/s ± 0%   62.4GB/s ± 0%    +38.38%  (p=0.000 n=19+20)
MemclrKnownSize128-10     13.3GB/s ±107%   63.5GB/s ± 0%   +378.03%  (p=0.000 n=19+18)
MemclrKnownSize192-10     6.97GB/s ±139%  72.72GB/s ± 0%   +943.44%  (p=0.000 n=19+19)
MemclrKnownSize248-10      25.9GB/s ±46%   76.1GB/s ± 0%   +194.16%  (p=0.000 n=20+17)
MemclrKnownSize256-10     8.64GB/s ±196%  78.51GB/s ± 0%   +808.19%  (p=0.000 n=20+20)
MemclrKnownSize512-10      2.33GB/s ±86%  89.13GB/s ± 0%  +3719.50%  (p=0.000 n=17+20)
MemclrKnownSize1024-10     4.85GB/s ±32%  94.93GB/s ± 0%  +1856.74%  (p=0.000 n=18+19)
MemclrKnownSize4096-10     15.4GB/s ± 2%   46.6GB/s ± 0%   +201.55%  (p=0.000 n=19+18)
MemclrKnownSize512KiB-10   52.9GB/s ± 1%   51.3GB/s ± 2%     -3.04%  (p=0.000 n=19+19)
[Geo mean]                 7.54GB/s       42.86GB/s        +468.76%

Intel Alder Lake 12600k:

name                      old time/op    new time/op     delta
MemclrKnownSize1-16         0.59ns ± 3%     0.38ns ± 6%   -36.00%  (p=0.000 n=19+18)
MemclrKnownSize2-16         0.57ns ± 1%     0.19ns ± 5%   -66.27%  (p=0.000 n=19+19)
MemclrKnownSize4-16         0.66ns ± 2%     0.36ns ±21%   -45.12%  (p=0.000 n=19+20)
MemclrKnownSize8-16         0.74ns ± 1%     0.30ns ±26%   -59.81%  (p=0.000 n=18+20)
MemclrKnownSize16-16        1.00ns ± 7%     0.21ns ± 8%   -79.51%  (p=0.000 n=20+19)
MemclrKnownSize32-16        0.95ns ± 1%     0.40ns ± 1%   -57.61%  (p=0.000 n=20+18)
MemclrKnownSize64-16        1.20ns ± 2%     0.41ns ± 0%   -65.82%  (p=0.000 n=20+18)
MemclrKnownSize112-16       1.27ns ± 2%     1.03ns ± 0%   -19.35%  (p=0.000 n=20+18)
MemclrKnownSize128-16       1.34ns ± 2%     1.03ns ± 0%   -23.02%  (p=0.000 n=20+20)
MemclrKnownSize192-16       1.92ns ± 2%     1.44ns ± 0%   -24.89%  (p=0.000 n=20+16)
MemclrKnownSize248-16       2.77ns ± 1%     3.29ns ± 0%   +18.81%  (p=0.000 n=20+16)
MemclrKnownSize256-16       1.92ns ± 1%     1.86ns ± 0%    -3.49%  (p=0.000 n=19+15)
MemclrKnownSize512-16       2.81ns ± 2%     3.49ns ± 0%   +24.15%  (p=0.000 n=20+17)
MemclrKnownSize1024-16      4.02ns ± 1%     6.78ns ± 0%   +68.44%  (p=0.000 n=20+18)
MemclrKnownSize4096-16      17.2ns ± 2%     14.4ns ± 0%   -16.73%  (p=0.000 n=20+17)
MemclrKnownSize512KiB-16    6.71µs ± 1%     6.52µs ± 0%    -2.85%  (p=0.000 n=20+18)
[Geo mean]                  2.60ns          1.71ns        -34.06%

name                      old speed      new speed       delta
MemclrKnownSize1-16       1.71GB/s ± 3%   2.67GB/s ± 6%   +56.39%  (p=0.000 n=19+18)
MemclrKnownSize2-16       3.52GB/s ± 2%  10.43GB/s ± 6%  +196.04%  (p=0.000 n=20+20)
MemclrKnownSize4-16       6.06GB/s ± 1%  10.83GB/s ±11%   +78.63%  (p=0.000 n=19+18)
MemclrKnownSize8-16       10.7GB/s ± 1%   27.0GB/s ±21%  +151.49%  (p=0.000 n=18+20)
MemclrKnownSize16-16      16.0GB/s ± 8%   78.1GB/s ± 7%  +387.24%  (p=0.000 n=20+19)
MemclrKnownSize32-16      33.6GB/s ± 1%   79.4GB/s ± 1%  +135.89%  (p=0.000 n=20+18)
MemclrKnownSize64-16      53.3GB/s ± 2%  155.9GB/s ± 0%  +192.58%  (p=0.000 n=20+18)
MemclrKnownSize112-16     88.0GB/s ± 2%  109.1GB/s ± 0%   +23.97%  (p=0.000 n=20+18)
MemclrKnownSize128-16     95.3GB/s ± 2%  123.8GB/s ± 0%   +29.88%  (p=0.000 n=20+20)
MemclrKnownSize192-16      100GB/s ± 2%    133GB/s ± 0%   +33.12%  (p=0.000 n=20+17)
MemclrKnownSize248-16     89.7GB/s ± 1%   75.5GB/s ± 0%   -15.84%  (p=0.000 n=20+19)
MemclrKnownSize256-16      133GB/s ± 1%    138GB/s ± 0%    +3.61%  (p=0.000 n=19+14)
MemclrKnownSize512-16      182GB/s ± 2%    147GB/s ± 0%   -19.46%  (p=0.000 n=20+17)
MemclrKnownSize1024-16     254GB/s ± 1%    151GB/s ± 0%   -40.64%  (p=0.000 n=20+18)
MemclrKnownSize4096-16     237GB/s ± 2%    285GB/s ± 0%   +20.09%  (p=0.000 n=20+17)
MemclrKnownSize512KiB-16  78.2GB/s ± 1%   80.4GB/s ± 0%    +2.93%  (p=0.000 n=20+18)
[Geo mean]                42.1GB/s        63.8GB/s        +51.53%

compilecmp linux/amd64:

runtime
runtime.(*pallocData).allocAll 85 -> 45  (-47.06%)
runtime.(*pageAlloc).allocRange 942 -> 923  (-2.02%)
runtime.(*pageAlloc).free 798 -> 774  (-3.01%)
runtime.(*pageBits).clearAll 66 -> 20  (-69.70%)
runtime.startCheckmarks 255 -> 246  (-3.53%)
runtime.(*pallocData).freeAll 86 -> 46  (-46.51%)
runtime.(*pallocBits).freeAll 66 -> 20  (-69.70%)
runtime.(*consistentHeapStats).unsafeClear 66 -> 19  (-71.21%)
runtime.newproc1 965 -> 933  (-3.32%)

crypto/rc4
crypto/rc4.(*Cipher).Reset 78 -> 69  (-11.54%)

compress/bzip2
compress/bzip2.(*reader).readBlock 2973 -> 2941  (-1.08%)

image/jpeg
image/jpeg.(*decoder).processDHT 1179 -> 1166  (-1.10%)

index/suffixarray
index/suffixarray.bucketMax_8_32 394 -> 241  (-38.83%)
index/suffixarray.freq_8_32 317 -> 185  (-41.64%)
index/suffixarray.freq_8_64 317 -> 178  (-43.85%)
index/suffixarray.bucketMin_8_32 394 -> 243  (-38.32%)
index/suffixarray.bucketMin_8_64 398 -> 234  (-41.21%)
index/suffixarray.bucketMax_8_64 398 -> 234  (-41.21%)

compress/flate
compress/flate.(*huffmanBitWriter).generateCodegen 965 -> 838  (-13.16%)
compress/flate.(*compressor).reset 429 -> 409  (-4.66%)

cmd/vendor/golang.org/x/sys/unix
cmd/vendor/golang.org/x/sys/unix.(*FdSet).Zero 66 -> 60  (-9.09%)
cmd/vendor/golang.org/x/sys/unix.(*Ifreq).SetInet4Addr 211 -> 129  (-38.86%)
cmd/vendor/golang.org/x/sys/unix.(*Ifreq).SetUint32 98 -> 14  (-85.71%)
cmd/vendor/golang.org/x/sys/unix.(*Ifreq).clear 66 -> 11  (-83.33%)
cmd/vendor/golang.org/x/sys/unix.(*Ifreq).SetUint16 101 -> 15  (-85.15%)
cmd/vendor/golang.org/x/sys/unix.(*CPUSet).Zero 66 -> 60  (-9.09%)

internal/coverage/decodemeta
internal/coverage/decodemeta.(*CoverageMetaFileReader).rdUint64 325 -> 293  (-9.85%)

crypto/tls
crypto/tls.(*halfConn).setTrafficSecret 253 -> 247  (-2.37%)
crypto/tls.(*Conn).readRecordOrCCS 10315 -> 10283  (-0.31%)
crypto/tls.(*halfConn).changeCipherSpec 271 -> 261  (-3.69%)
crypto/tls.(*Conn).writeRecordLocked 1765 -> 1748  (-0.96%)

file                               before   after    Δ       %
runtime.s                          512467   512164   -303    -0.059%
crypto/rc4.s                       955      946      -9      -0.942%
compress/bzip2.s                   9586     9554     -32     -0.334%
image/jpeg.s                       32122    32109    -13     -0.040%
index/suffixarray.s                38547    37644    -903    -2.343%
compress/flate.s                   46668    46521    -147    -0.315%
cmd/vendor/golang.org/x/sys/unix.s 118620   118301   -319    -0.269%
internal/coverage/decodemeta.s     7224     7192     -32     -0.443%
crypto/tls.s                       288762   288697   -65     -0.023%
cmd/compile/internal/ssa.s         3639799  3640727  +928    +0.025%
total                              20790248 20789353 -895    -0.004%

src/runtime benchmarks (Linux Alder Lake 12600k):

name                                             old time/op    new time/op    delta
MakeChan/Byte-16                                   26.2ns ± 2%    25.6ns ± 3%   -2.05%  (p=0.003 n=9+10)
MakeChan/Int-16                                    33.9ns ± 2%    33.3ns ± 4%   -1.99%  (p=0.015 n=10+10)
MakeChan/Ptr-16                                    54.2ns ± 2%    53.7ns ± 1%   -0.90%  (p=0.016 n=10+9)
MakeChan/Struct/0-16                               23.8ns ± 3%    23.4ns ± 1%   -1.72%  (p=0.009 n=10+8)
MakeChan/Struct/32-16                              55.9ns ± 2%    53.9ns ± 1%   -3.48%  (p=0.000 n=10+10)
MakeChan/Struct/40-16                              63.5ns ± 1%    61.1ns ± 2%   -3.79%  (p=0.000 n=10+9)
ChanNonblocking-16                                 0.22ns ± 0%    0.22ns ± 0%   +0.40%  (p=0.011 n=9+8)
SelectUncontended-16                               4.63ns ± 1%    4.62ns ± 0%   -0.35%  (p=0.001 n=10+8)
SelectSyncContended-16                             1.58µs ± 2%    1.59µs ± 1%     ~     (p=0.540 n=10+10)
SelectAsyncContended-16                             290ns ± 0%     291ns ± 0%   +0.14%  (p=0.012 n=8+9)
SelectNonblock-16                                  0.95ns ± 1%    0.95ns ± 1%     ~     (p=0.546 n=9+9)
ChanUncontended-16                                  239ns ± 3%     242ns ± 6%     ~     (p=0.886 n=9+10)
ChanContended-16                                   17.7µs ± 1%    18.2µs ± 1%   +2.87%  (p=0.000 n=10+9)
ChanSync-16                                         109ns ± 2%     109ns ± 1%     ~     (p=0.342 n=10+10)
ChanSyncWork-16                                    6.55µs ± 1%    6.53µs ± 1%     ~     (p=0.101 n=10+10)
ChanProdCons0-16                                    502ns ± 1%     499ns ± 0%   -0.55%  (p=0.001 n=10+9)
ChanProdCons10-16                                   373ns ± 2%     377ns ± 1%     ~     (p=0.095 n=10+9)
ChanProdCons100-16                                  224ns ± 2%     223ns ± 3%     ~     (p=0.150 n=9+10)
ChanProdConsWork0-16                                491ns ± 1%     484ns ± 0%   -1.26%  (p=0.000 n=10+9)
ChanProdConsWork10-16                               451ns ± 2%     448ns ± 2%     ~     (p=0.210 n=8+10)
ChanProdConsWork100-16                              406ns ± 0%     407ns ± 1%     ~     (p=0.138 n=8+8)
SelectProdCons-16                                   509ns ± 0%     509ns ± 0%     ~     (p=0.917 n=9+9)
ReceiveDataFromClosedChan-16                       12.1ns ± 0%    12.1ns ± 0%     ~     (p=0.780 n=10+10)
ChanCreation-16                                    22.6ns ± 1%    22.4ns ± 0%   -0.72%  (p=0.001 n=10+8)
ChanSem-16                                          165ns ± 1%     166ns ± 1%   +0.72%  (p=0.002 n=10+10)
ChanPopular-16                                      500µs ± 2%     498µs ± 1%     ~     (p=0.218 n=10+10)
ChanClosed-16                                      0.29ns ± 0%    0.29ns ± 0%   +0.09%  (p=0.019 n=9+8)
CallClosure-16                                     1.28ns ± 0%    1.27ns ± 0%   -0.51%  (p=0.000 n=9+9)
CallClosure1-16                                    1.50ns ± 0%    1.50ns ± 0%     ~     (p=0.123 n=9+9)
CallClosure2-16                                    8.86ns ± 1%    8.86ns ± 3%     ~     (p=0.590 n=9+10)
CallClosure3-16                                    8.75ns ± 2%    8.69ns ± 2%     ~     (p=0.247 n=10+10)
CallClosure4-16                                    8.65ns ± 2%    8.56ns ± 2%     ~     (p=0.105 n=10+10)
Complex128DivNormal-16                             2.47ns ± 0%    2.47ns ± 0%     ~     (p=0.790 n=10+9)
Complex128DivNisNaN-16                             4.44ns ± 0%    4.43ns ± 0%     ~     (p=0.564 n=10+10)
Complex128DivDisNaN-16                             4.48ns ± 0%    4.48ns ± 0%     ~     (p=0.101 n=10+10)
Complex128DivNisInf-16                             2.58ns ± 0%    2.58ns ± 0%     ~     (p=0.808 n=10+10)
Complex128DivDisInf-16                             6.30ns ± 0%    6.31ns ± 0%     ~     (p=0.305 n=10+10)
SetTypePtr-16                                      0.73ns ± 1%    0.73ns ± 3%     ~     (p=0.644 n=10+10)
SetTypePtr8-16                                     4.12ns ± 0%    4.12ns ± 0%     ~     (p=0.127 n=10+10)
SetTypePtr16-16                                    4.13ns ± 1%    4.12ns ± 0%     ~     (p=0.109 n=10+10)
SetTypePtr32-16                                    4.12ns ± 0%    4.12ns ± 0%     ~     (p=0.203 n=9+10)
SetTypePtr64-16                                    4.12ns ± 0%    4.12ns ± 0%     ~     (p=0.696 n=10+10)
SetTypePtr126-16                                   6.91ns ± 0%    6.91ns ± 0%     ~     (p=0.469 n=10+10)
SetTypePtr128-16                                   6.66ns ± 0%    6.67ns ± 0%     ~     (p=0.246 n=9+10)
SetTypePtrSlice-16                                 54.1ns ± 1%    54.1ns ± 1%     ~     (p=0.509 n=9+10)
SetTypeNode1-16                                    4.13ns ± 1%    4.12ns ± 0%     ~     (p=0.342 n=10+10)
SetTypeNode1Slice-16                               10.1ns ± 1%    10.0ns ± 1%   -1.18%  (p=0.000 n=10+10)
SetTypeNode8-16                                    4.12ns ± 0%    4.12ns ± 0%     ~     (p=0.137 n=8+8)
SetTypeNode8Slice-16                               22.6ns ± 0%    22.6ns ± 0%     ~     (p=0.423 n=10+10)
SetTypeNode64-16                                   6.90ns ± 0%    6.91ns ± 0%     ~     (p=0.275 n=10+10)
SetTypeNode64Slice-16                               173ns ± 0%     173ns ± 0%     ~     (p=0.610 n=9+10)
SetTypeNode64Dead-16                               5.53ns ± 0%    5.52ns ± 0%     ~     (p=0.123 n=10+6)
SetTypeNode64DeadSlice-16                           150ns ± 0%     150ns ± 0%     ~     (p=0.398 n=10+10)
SetTypeNode124-16                                  6.90ns ± 0%    6.90ns ± 0%     ~     (p=0.779 n=10+10)
SetTypeNode124Slice-16                              222ns ± 5%     217ns ± 0%     ~     (p=0.302 n=10+10)
SetTypeNode126-16                                  6.66ns ± 0%    6.66ns ± 0%     ~     (p=0.324 n=10+9)
SetTypeNode126Slice-16                              218ns ± 0%     218ns ± 0%     ~     (p=0.119 n=9+10)
SetTypeNode128-16                                  9.76ns ± 0%    9.73ns ± 0%   -0.31%  (p=0.003 n=9+10)
SetTypeNode128Slice-16                              279ns ± 0%     278ns ± 0%     ~     (p=0.112 n=10+9)
SetTypeNode130-16                                  9.77ns ± 0%    9.73ns ± 0%   -0.33%  (p=0.002 n=10+10)
SetTypeNode130Slice-16                              284ns ± 0%     284ns ± 0%     ~     (p=0.668 n=10+10)
SetTypeNode1024-16                                 51.2ns ± 0%    51.6ns ± 1%     ~     (p=0.080 n=9+9)
SetTypeNode1024Slice-16                            1.83µs ± 0%    1.82µs ± 0%     ~     (p=0.115 n=10+10)
Allocation-16                                      4.64µs ± 1%    4.37µs ± 1%   -5.69%  (p=0.000 n=9+9)
ReadMemStats-16                                    5.62µs ± 2%    5.55µs ± 5%   -1.36%  (p=0.050 n=10+10)
WriteBarrier-16                                    4.95ns ± 3%    4.99ns ± 3%     ~     (p=0.255 n=10+10)
BulkWriteBarrier-16                                1.69ns ± 2%    1.63ns ± 4%   -3.77%  (p=0.001 n=10+10)
ScanStackNoLocals-16                               12.8ms ± 2%    12.9ms ± 1%   +0.72%  (p=0.019 n=10+10)
MSpanCountAlloc/bits=64-16                         1.65ns ± 0%    1.65ns ± 0%     ~     (p=0.124 n=10+10)
MSpanCountAlloc/bits=128-16                        2.08ns ± 1%    2.06ns ± 1%   -0.87%  (p=0.000 n=10+10)
MSpanCountAlloc/bits=256-16                        2.71ns ± 1%    2.69ns ± 1%   -0.74%  (p=0.001 n=10+9)
MSpanCountAlloc/bits=512-16                        4.15ns ± 0%    4.23ns ± 2%   +2.15%  (p=0.000 n=10+10)
MSpanCountAlloc/bits=1024-16                       7.89ns ± 1%    7.89ns ± 1%     ~     (p=0.867 n=10+10)
Hash5-16                                           1.93ns ± 1%    2.01ns ± 0%   +3.99%  (p=0.000 n=10+8)
Hash16-16                                          2.04ns ± 1%    2.21ns ± 1%   +8.61%  (p=0.000 n=10+10)
Hash64-16                                          2.67ns ± 0%    2.67ns ± 0%     ~     (p=0.154 n=9+9)
Hash1024-16                                        16.4ns ± 0%    16.4ns ± 0%   +0.17%  (p=0.020 n=9+10)
Hash65536-16                                        886ns ± 0%     885ns ± 0%     ~     (p=0.725 n=10+10)
AlignedLoad-16                                     0.96ns ± 2%    0.95ns ± 3%     ~     (p=0.123 n=10+10)
UnalignedLoad-16                                   0.95ns ± 2%    1.01ns ± 2%   +6.31%  (p=0.000 n=10+10)
EqEfaceConcrete-16                                 0.31ns ± 3%    0.33ns ± 5%   +8.10%  (p=0.000 n=10+10)
EqIfaceConcrete-16                                 0.31ns ±13%    0.28ns ± 2%   -9.23%  (p=0.001 n=10+10)
NeEfaceConcrete-16                                 0.29ns ± 1%    0.31ns ± 7%   +5.59%  (p=0.010 n=8+8)
NeIfaceConcrete-16                                 0.28ns ± 2%    0.29ns ± 1%   +4.49%  (p=0.000 n=9+8)
ConvT2EByteSized/bool-16                           0.53ns ± 1%    0.52ns ± 1%   -2.18%  (p=0.000 n=10+10)
ConvT2EByteSized/uint8-16                          0.53ns ± 1%    0.53ns ± 0%   +1.22%  (p=0.000 n=10+10)
ConvT2ESmall-16                                    1.13ns ± 0%    1.13ns ± 0%     ~     (p=0.774 n=9+9)
ConvT2EUintptr-16                                  1.03ns ± 0%    1.04ns ± 0%   +0.50%  (p=0.000 n=10+8)
ConvT2ELarge-16                                    14.4ns ± 2%    14.4ns ± 1%     ~     (p=0.726 n=10+10)
ConvT2ISmall-16                                    1.13ns ± 0%    1.13ns ± 0%     ~     (p=0.693 n=9+10)
ConvT2IUintptr-16                                  1.03ns ± 0%    1.03ns ± 0%   +0.44%  (p=0.000 n=10+10)
ConvT2ILarge-16                                    14.2ns ± 1%    14.4ns ± 1%   +0.85%  (p=0.007 n=9+10)
ConvI2E-16                                         0.54ns ± 1%    0.54ns ± 0%   -0.39%  (p=0.037 n=10+8)
ConvI2I-16                                         2.68ns ± 0%    2.70ns ± 1%   +0.73%  (p=0.000 n=9+10)
AssertE2T-16                                       0.28ns ± 1%    0.39ns ± 5%  +37.38%  (p=0.000 n=10+10)
AssertE2TLarge-16                                  0.42ns ± 2%    0.48ns ± 1%  +14.92%  (p=0.000 n=9+10)
AssertE2I-16                                       2.67ns ± 0%    2.67ns ± 0%     ~     (p=0.352 n=9+9)
AssertI2T-16                                       0.37ns ± 3%    0.34ns ± 1%   -6.16%  (p=0.000 n=10+10)
AssertI2I-16                                       2.67ns ± 0%    2.67ns ± 0%     ~     (p=0.286 n=10+10)
AssertI2E-16                                       0.54ns ± 1%    0.54ns ± 0%   -0.94%  (p=0.000 n=10+10)
AssertE2E-16                                       0.41ns ± 0%    0.41ns ± 1%     ~     (p=0.880 n=9+9)
AssertE2T2-16                                      0.41ns ± 1%    0.41ns ± 1%     ~     (p=0.725 n=10+10)
AssertE2T2Blank-16                                 0.24ns ± 5%    0.21ns ± 1%  -14.79%  (p=0.000 n=10+9)
AssertI2E2-16                                      0.69ns ± 0%    0.69ns ± 1%     ~     (p=0.541 n=10+10)
AssertI2E2Blank-16                                 0.26ns ± 9%    0.21ns ± 1%  -18.86%  (p=0.000 n=10+9)
AssertE2E2-16                                      0.53ns ± 1%    0.53ns ± 1%   +0.72%  (p=0.004 n=10+10)
AssertE2E2Blank-16                                 0.23ns ± 4%    0.21ns ± 1%   -8.42%  (p=0.000 n=10+10)
ConvT2Ezero/zero/16-16                             1.13ns ± 0%    1.14ns ± 1%     ~     (p=0.583 n=9+10)
ConvT2Ezero/zero/32-16                             1.13ns ± 0%    1.13ns ± 0%     ~     (p=0.417 n=10+10)
ConvT2Ezero/zero/64-16                             1.03ns ± 1%    1.03ns ± 0%     ~     (p=0.051 n=10+10)
ConvT2Ezero/zero/str-16                            1.03ns ± 0%    1.03ns ± 0%     ~     (p=0.132 n=10+10)
ConvT2Ezero/zero/slice-16                          1.14ns ± 0%    1.15ns ± 0%   +0.49%  (p=0.001 n=10+10)
ConvT2Ezero/zero/big-16                             123ns ± 1%     123ns ± 1%     ~     (p=0.171 n=10+10)
ConvT2Ezero/nonzero/str-16                         19.4ns ± 1%    19.3ns ± 3%     ~     (p=0.548 n=9+10)
ConvT2Ezero/nonzero/slice-16                       22.2ns ± 2%    22.0ns ± 2%     ~     (p=0.109 n=10+10)
ConvT2Ezero/nonzero/big-16                          123ns ± 1%     123ns ± 1%     ~     (p=0.446 n=10+8)
ConvT2Ezero/smallint/16-16                         1.13ns ± 0%    1.14ns ± 1%     ~     (p=0.362 n=10+10)
ConvT2Ezero/smallint/32-16                         1.13ns ± 0%    1.13ns ± 0%     ~     (p=0.907 n=10+9)
ConvT2Ezero/smallint/64-16                         1.04ns ± 0%    1.03ns ± 0%   -0.38%  (p=0.002 n=10+10)
ConvT2Ezero/largeint/16-16                         6.65ns ± 1%    6.65ns ± 2%     ~     (p=0.618 n=10+9)
ConvT2Ezero/largeint/32-16                         6.75ns ± 3%    6.63ns ± 2%   -1.77%  (p=0.015 n=10+10)
ConvT2Ezero/largeint/64-16                         9.19ns ± 1%    9.26ns ± 2%     ~     (p=0.123 n=10+10)
Malloc8-16                                         8.66ns ± 1%    8.89ns ± 2%   +2.74%  (p=0.000 n=10+10)
Malloc16-16                                        13.7ns ± 1%    13.8ns ± 1%   +0.71%  (p=0.022 n=10+8)
MallocTypeInfo8-16                                 11.7ns ± 3%    11.6ns ± 2%     ~     (p=0.469 n=10+10)
MallocTypeInfo16-16                                18.3ns ± 1%    18.2ns ± 2%     ~     (p=0.251 n=9+10)
MallocLargeStruct-16                                195ns ± 1%     198ns ± 1%   +1.65%  (p=0.000 n=9+10)
GoroutineSelect-16                                 1.10ms ± 1%    1.12ms ± 1%   +1.36%  (p=0.000 n=10+8)
GoroutineBlocking-16                                986µs ± 1%     998µs ± 1%   +1.23%  (p=0.002 n=10+10)
GoroutineForRange-16                                985µs ± 1%    1001µs ± 1%   +1.68%  (p=0.000 n=10+10)
GoroutineIdle-16                                    679µs ± 1%     691µs ± 0%   +1.74%  (p=0.000 n=10+9)
HashStringSpeed-16                                 5.33ns ± 5%    5.19ns ± 4%     ~     (p=0.113 n=9+9)
HashBytesSpeed-16                                  8.20ns ± 3%    8.24ns ± 1%     ~     (p=0.497 n=10+9)
HashInt32Speed-16                                  4.01ns ± 2%    3.90ns ± 4%   -2.63%  (p=0.011 n=9+10)
HashInt64Speed-16                                  3.94ns ± 4%    3.79ns ± 1%   -3.74%  (p=0.003 n=10+9)
HashStringArraySpeed-16                            12.5ns ± 4%    12.3ns ± 1%     ~     (p=0.055 n=10+10)
MegMap-16                                          3.72ns ± 1%    3.73ns ± 1%     ~     (p=0.484 n=9+10)
MegOneMap-16                                       2.28ns ± 1%    2.27ns ± 1%     ~     (p=0.287 n=10+10)
MegEqMap-16                                        22.0µs ± 3%    22.3µs ± 2%   +1.48%  (p=0.028 n=10+9)
MegEmptyMap-16                                     0.93ns ± 1%    0.92ns ± 1%   -0.52%  (p=0.030 n=10+10)
SmallStrMap-16                                     3.77ns ± 0%    3.77ns ± 0%     ~     (p=0.324 n=10+10)
MapStringKeysEight_16-16                           3.91ns ± 0%    3.91ns ± 0%     ~     (p=0.088 n=9+9)
MapStringKeysEight_32-16                           3.58ns ± 1%    3.50ns ± 0%   -2.11%  (p=0.000 n=10+10)
MapStringKeysEight_64-16                           3.58ns ± 1%    3.50ns ± 0%   -2.23%  (p=0.000 n=10+10)
MapStringKeysEight_1M-16                           3.57ns ± 1%    3.50ns ± 0%   -1.92%  (p=0.000 n=10+10)
IntMap-16                                          2.89ns ± 1%    2.89ns ± 0%     ~     (p=0.381 n=10+10)
MapFirst/1-16                                      1.60ns ± 1%    1.59ns ± 2%   -0.49%  (p=0.020 n=10+9)
MapFirst/2-16                                      1.61ns ± 0%    1.59ns ± 1%   -1.17%  (p=0.001 n=10+10)
MapFirst/3-16                                      1.61ns ± 1%    1.59ns ± 1%   -1.45%  (p=0.000 n=10+10)
MapFirst/4-16                                      1.60ns ± 1%    1.59ns ± 1%   -1.16%  (p=0.000 n=10+10)
MapFirst/5-16                                      1.60ns ± 1%    1.58ns ± 1%   -0.98%  (p=0.000 n=10+10)
MapFirst/6-16                                      1.60ns ± 1%    1.59ns ± 1%   -0.87%  (p=0.001 n=10+10)
MapFirst/7-16                                      1.60ns ± 1%    1.59ns ± 1%   -0.79%  (p=0.002 n=10+10)
MapFirst/8-16                                      1.60ns ± 1%    1.59ns ± 1%   -0.67%  (p=0.017 n=9+10)
MapFirst/9-16                                      2.83ns ± 0%    2.83ns ± 0%     ~     (p=0.492 n=10+10)
MapFirst/10-16                                     2.83ns ± 0%    2.84ns ± 0%   +0.24%  (p=0.017 n=10+10)
MapFirst/11-16                                     2.83ns ± 0%    2.83ns ± 0%     ~     (p=0.445 n=10+10)
MapFirst/12-16                                     2.83ns ± 0%    2.83ns ± 0%     ~     (p=0.564 n=10+10)
MapFirst/13-16                                     2.83ns ± 0%    2.84ns ± 0%     ~     (p=0.175 n=9+10)
MapFirst/14-16                                     2.83ns ± 0%    2.83ns ± 0%     ~     (p=0.322 n=10+9)
MapFirst/15-16                                     2.83ns ± 0%    2.84ns ± 1%     ~     (p=0.209 n=10+10)
MapFirst/16-16                                     2.83ns ± 1%    2.84ns ± 0%     ~     (p=0.238 n=10+10)
MapMid/1-16                                        1.64ns ± 0%    1.64ns ± 0%     ~     (p=0.453 n=10+9)
MapMid/2-16                                        1.86ns ± 1%    1.86ns ± 0%     ~     (p=0.764 n=10+9)
MapMid/3-16                                        1.86ns ± 0%    1.86ns ± 1%     ~     (p=1.000 n=10+10)
MapMid/4-16                                        2.06ns ± 0%    2.06ns ± 0%   -0.27%  (p=0.014 n=10+9)
MapMid/5-16                                        2.06ns ± 0%    2.06ns ± 0%     ~     (p=0.075 n=9+10)
MapMid/6-16                                        2.27ns ± 0%    2.27ns ± 1%     ~     (p=0.898 n=10+10)
MapMid/7-16                                        2.27ns ± 1%    2.26ns ± 0%   -0.23%  (p=0.049 n=10+10)
MapMid/8-16                                        2.47ns ± 0%    2.47ns ± 1%     ~     (p=0.840 n=10+10)
MapMid/9-16                                        4.21ns ± 7%    4.13ns ±19%     ~     (p=0.315 n=10+10)
MapMid/10-16                                       4.17ns ± 7%    4.31ns ± 5%   +3.37%  (p=0.021 n=10+9)
MapMid/11-16                                       4.18ns ± 7%    4.32ns ± 6%   +3.50%  (p=0.015 n=10+10)
MapMid/12-16                                       4.34ns ± 7%    4.30ns ± 5%     ~     (p=0.858 n=9+10)
MapMid/13-16                                       4.25ns ± 6%    4.28ns ± 6%     ~     (p=0.489 n=9+9)
MapMid/14-16                                       3.75ns ±23%    3.90ns ±16%     ~     (p=0.353 n=10+10)
MapMid/15-16                                       3.87ns ±25%    3.95ns ±26%     ~     (p=0.315 n=10+10)
MapMid/16-16                                       4.06ns ±19%    3.94ns ±16%     ~     (p=0.796 n=10+10)
MapLast/1-16                                       1.65ns ± 0%    1.65ns ± 0%     ~     (p=0.607 n=10+10)
MapLast/2-16                                       1.86ns ± 0%    1.86ns ± 0%   +0.26%  (p=0.029 n=10+10)
MapLast/3-16                                       2.06ns ± 1%    2.06ns ± 0%     ~     (p=0.689 n=8+9)
MapLast/4-16                                       2.27ns ± 1%    2.26ns ± 0%     ~     (p=0.148 n=10+9)
MapLast/5-16                                       2.47ns ± 0%    2.47ns ± 0%     ~     (p=0.385 n=9+10)
MapLast/6-16                                       2.67ns ± 0%    2.68ns ± 0%     ~     (p=0.202 n=9+10)
MapLast/7-16                                       2.88ns ± 0%    2.88ns ± 0%     ~     (p=0.751 n=10+10)
MapLast/8-16                                       3.08ns ± 0%    3.08ns ± 0%     ~     (p=0.826 n=10+9)
MapLast/9-16                                       4.31ns ± 6%    4.54ns ± 5%     ~     (p=0.070 n=9+8)
MapLast/10-16                                      4.25ns ± 5%    4.42ns ± 6%     ~     (p=0.321 n=9+8)
MapLast/11-16                                      4.59ns ±16%    5.42ns ±44%  +17.99%  (p=0.019 n=10+10)
MapLast/12-16                                      5.04ns ±19%    6.11ns ±28%  +21.35%  (p=0.005 n=9+10)
MapLast/13-16                                      6.00ns ±35%    5.76ns ± 3%     ~     (p=0.173 n=10+8)
MapLast/14-16                                      4.27ns ± 5%    4.53ns ± 6%   +6.14%  (p=0.007 n=10+10)
MapLast/15-16                                      4.41ns ± 1%    4.44ns ± 7%     ~     (p=0.515 n=8+10)
MapLast/16-16                                      4.18ns ± 6%    4.99ns ±18%  +19.48%  (p=0.000 n=10+10)
MapCycle-16                                        7.48ns ± 2%    7.46ns ± 1%     ~     (p=0.699 n=10+10)
RepeatedLookupStrMapKey32-16                       6.98ns ± 3%    6.73ns ± 2%   -3.63%  (p=0.000 n=10+10)
RepeatedLookupStrMapKey1M-16                       14.7µs ± 5%    14.7µs ± 4%     ~     (p=0.604 n=9+10)
MakeMap/[Byte]Byte-16                              58.5ns ± 1%    58.5ns ± 1%     ~     (p=0.780 n=10+9)
MakeMap/[Int]Int-16                                 113ns ± 0%     113ns ± 1%     ~     (p=0.100 n=8+10)
NewEmptyMap-16                                     2.47ns ± 0%    2.47ns ± 0%     ~     (p=0.638 n=10+10)
NewSmallMap-16                                     11.5ns ± 1%    11.6ns ± 0%   +1.18%  (p=0.000 n=10+10)
MapIter-16                                         42.2ns ± 0%    42.8ns ± 1%   +1.50%  (p=0.000 n=10+10)
MapIterEmpty-16                                    1.85ns ± 0%    1.85ns ± 0%     ~     (p=0.651 n=10+10)
SameLengthMap-16                                   1.85ns ± 1%    1.85ns ± 0%     ~     (p=0.247 n=10+10)
BigKeyMap-16                                       7.18ns ± 1%    7.42ns ± 4%   +3.33%  (p=0.004 n=10+10)
BigValMap-16                                       7.03ns ± 2%    7.19ns ± 1%   +2.33%  (p=0.000 n=10+9)
SmallKeyMap-16                                     5.32ns ± 1%    5.24ns ± 1%   -1.41%  (p=0.000 n=10+10)
MapPopulate/1-16                                   6.30ns ± 0%    6.41ns ± 1%   +1.81%  (p=0.000 n=8+10)
MapPopulate/10-16                                   239ns ± 2%     234ns ± 2%   -2.05%  (p=0.001 n=9+10)
MapPopulate/100-16                                 4.19µs ± 2%    4.22µs ± 2%     ~     (p=0.171 n=10+10)
MapPopulate/1000-16                                52.3µs ± 1%    52.5µs ± 1%     ~     (p=0.133 n=9+10)
MapPopulate/10000-16                                459µs ± 1%     466µs ± 2%   +1.45%  (p=0.005 n=10+10)
MapPopulate/100000-16                              4.22ms ± 2%    4.25ms ± 2%     ~     (p=0.393 n=10+10)
ComplexAlgMap-16                                   12.5ns ± 1%    12.4ns ± 1%   -0.95%  (p=0.022 n=10+10)
GoMapClear/Reflexive/1-16                          9.61ns ± 1%    9.58ns ± 0%   -0.27%  (p=0.027 n=10+10)
GoMapClear/Reflexive/10-16                         10.0ns ± 1%    10.0ns ± 1%     ~     (p=0.648 n=9+9)
GoMapClear/Reflexive/100-16                        31.4ns ± 0%    31.4ns ± 1%     ~     (p=0.305 n=9+10)
GoMapClear/Reflexive/1000-16                        147ns ± 0%     149ns ± 2%   +1.21%  (p=0.000 n=10+10)
GoMapClear/Reflexive/10000-16                      3.99µs ± 0%    4.00µs ± 0%   +0.21%  (p=0.018 n=9+10)
GoMapClear/NonReflexive/1-16                       41.4ns ± 2%    41.7ns ± 1%   +0.55%  (p=0.043 n=9+10)
GoMapClear/NonReflexive/10-16                      50.3ns ± 1%    50.9ns ± 1%   +1.16%  (p=0.000 n=10+10)
GoMapClear/NonReflexive/100-16                      125ns ± 0%     126ns ± 0%   +0.96%  (p=0.000 n=8+10)
GoMapClear/NonReflexive/1000-16                    1.08µs ± 0%    1.08µs ± 1%     ~     (p=0.097 n=10+10)
GoMapClear/NonReflexive/10000-16                   8.18µs ± 2%    8.10µs ± 0%   -0.91%  (p=0.019 n=10+8)
MapStringConversion/32/simple-16                   4.66ns ± 1%    4.69ns ± 3%     ~     (p=0.905 n=9+10)
MapStringConversion/32/struct-16                   4.65ns ± 3%    4.94ns ± 2%   +6.23%  (p=0.000 n=10+10)
MapStringConversion/32/array-16                    4.69ns ± 3%    4.72ns ± 3%     ~     (p=0.631 n=10+10)
MapStringConversion/64/simple-16                   4.14ns ± 0%    4.14ns ± 1%     ~     (p=0.342 n=10+10)
MapStringConversion/64/struct-16                   4.13ns ± 0%    4.13ns ± 0%     ~     (p=0.809 n=10+10)
MapStringConversion/64/array-16                    4.13ns ± 1%    4.13ns ± 1%     ~     (p=0.752 n=10+10)
MapInterfaceString-16                              7.90ns ±23%    8.51ns ±33%     ~     (p=0.604 n=9+10)
MapInterfacePtr-16                                 7.68ns ±29%    7.10ns ±36%     ~     (p=0.353 n=10+10)
NewEmptyMapHintLessThan8-16                        3.70ns ± 0%    3.70ns ± 0%     ~     (p=0.209 n=10+10)
NewEmptyMapHintGreaterThan8-16                      270ns ± 1%     272ns ± 1%   +0.71%  (p=0.005 n=10+9)
MapPop100-16                                       6.45µs ± 0%    6.50µs ± 1%   +0.77%  (p=0.000 n=10+10)
MapPop1000-16                                       114µs ± 1%     114µs ± 1%     ~     (p=0.190 n=10+10)
MapPop10000-16                                     2.28ms ± 2%    2.28ms ± 2%     ~     (p=0.912 n=10+10)
MapAssign/Int32/256-16                             4.75ns ± 2%    4.82ns ± 4%     ~     (p=0.101 n=10+10)
MapAssign/Int32/65536-16                           16.4ns ± 1%    16.7ns ± 0%   +1.44%  (p=0.000 n=10+9)
MapAssign/Int64/256-16                             4.79ns ± 5%    4.79ns ± 1%     ~     (p=0.616 n=10+8)
MapAssign/Int64/65536-16                           17.1ns ± 1%    16.8ns ± 0%   -1.28%  (p=0.000 n=10+8)
MapAssign/Str/256-16                               6.07ns ± 6%    6.24ns ± 2%   +2.84%  (p=0.035 n=10+9)
MapAssign/Str/65536-16                             21.4ns ± 0%    21.4ns ± 3%     ~     (p=0.300 n=7+10)
MapOperatorAssign/Int32/256-16                     4.82ns ± 3%    4.81ns ± 3%     ~     (p=0.684 n=10+10)
MapOperatorAssign/Int32/65536-16                   16.8ns ± 1%    16.5ns ± 1%   -1.68%  (p=0.000 n=9+10)
MapOperatorAssign/Int64/256-16                     4.74ns ± 1%    4.77ns ± 3%     ~     (p=0.563 n=10+9)
MapOperatorAssign/Int64/65536-16                   16.9ns ± 1%    17.2ns ± 1%   +1.88%  (p=0.000 n=10+10)
MapOperatorAssign/Str/256-16                       1.09µs ± 1%    1.10µs ± 2%     ~     (p=0.210 n=10+10)
MapOperatorAssign/Str/65536-16                      184ns ± 9%     184ns ± 8%     ~     (p=0.922 n=10+9)
MapAppendAssign/Int32/256-16                       13.8ns ±10%    14.4ns ±11%     ~     (p=0.190 n=10+10)
MapAppendAssign/Int32/65536-16                     28.9ns ± 5%    30.7ns ± 6%   +6.13%  (p=0.001 n=9+10)
MapAppendAssign/Int64/256-16                       14.5ns ±12%    13.8ns ± 8%   -5.02%  (p=0.037 n=10+10)
MapAppendAssign/Int64/65536-16                     30.9ns ± 1%    30.4ns ± 2%   -1.56%  (p=0.001 n=10+10)
MapAppendAssign/Str/256-16                         30.2ns ± 6%    30.0ns ±10%     ~     (p=0.645 n=10+10)
MapAppendAssign/Str/65536-16                       44.5ns ± 4%    46.8ns ± 3%   +5.17%  (p=0.001 n=8+9)
MapDelete/Int32/100-16                             18.7ns ± 0%    18.7ns ± 0%   -0.27%  (p=0.017 n=10+10)
MapDelete/Int32/1000-16                            17.6ns ± 1%    17.5ns ± 1%   -0.85%  (p=0.000 n=9+10)
MapDelete/Int32/10000-16                           18.7ns ± 0%    18.3ns ± 1%   -1.92%  (p=0.000 n=10+10)
MapDelete/Int64/100-16                             19.1ns ± 0%    19.2ns ± 0%   +0.68%  (p=0.000 n=10+9)
MapDelete/Int64/1000-16                            17.7ns ± 2%    18.3ns ± 1%   +3.00%  (p=0.000 n=10+10)
MapDelete/Int64/10000-16                           18.8ns ± 1%    19.2ns ± 0%   +2.01%  (p=0.000 n=10+9)
MapDelete/Str/100-16                               26.5ns ± 0%    26.4ns ± 1%   -0.73%  (p=0.000 n=10+10)
MapDelete/Str/1000-16                              23.5ns ± 2%    23.4ns ± 1%     ~     (p=0.425 n=10+10)
MapDelete/Str/10000-16                             25.1ns ± 0%    25.1ns ± 1%   +0.28%  (p=0.037 n=10+10)
MapDelete/Pointer/100-16                           20.6ns ± 1%    20.6ns ± 0%     ~     (p=0.117 n=10+10)
MapDelete/Pointer/1000-16                          19.2ns ± 1%    19.4ns ± 1%   +0.97%  (p=0.004 n=10+10)
MapDelete/Pointer/10000-16                         20.0ns ± 0%    20.1ns ± 1%   +0.52%  (p=0.022 n=10+10)
Memmove/0-16                                       0.21ns ± 2%    0.21ns ± 1%     ~     (p=0.671 n=10+10)
Memmove/1-16                                       0.93ns ± 0%    0.93ns ± 0%   +0.21%  (p=0.034 n=10+10)
Memmove/2-16                                       0.93ns ± 0%    0.93ns ± 0%     ~     (p=0.101 n=10+10)
Memmove/3-16                                       0.93ns ± 1%    0.93ns ± 1%   +0.49%  (p=0.004 n=10+10)
Memmove/4-16                                       1.03ns ± 0%    1.03ns ± 0%     ~     (p=0.260 n=10+10)
Memmove/5-16                                       1.13ns ± 0%    1.13ns ± 0%   +0.20%  (p=0.034 n=10+10)
Memmove/6-16                                       1.13ns ± 0%    1.13ns ± 1%     ~     (p=0.126 n=10+10)
Memmove/7-16                                       1.13ns ± 0%    1.13ns ± 1%   +0.22%  (p=0.028 n=10+10)
Memmove/8-16                                       1.13ns ± 0%    1.13ns ± 0%     ~     (p=0.545 n=9+10)
Memmove/9-16                                       1.25ns ± 0%    1.35ns ± 0%   +7.98%  (p=0.000 n=10+10)
Memmove/10-16                                      1.25ns ± 0%    1.35ns ± 0%   +7.96%  (p=0.000 n=9+9)
Memmove/11-16                                      1.25ns ± 0%    1.35ns ± 0%   +8.53%  (p=0.000 n=10+9)
Memmove/12-16                                      1.25ns ± 0%    1.35ns ± 1%   +8.24%  (p=0.000 n=10+10)
Memmove/13-16                                      1.25ns ± 0%    1.34ns ± 0%   +7.75%  (p=0.000 n=10+10)
Memmove/14-16                                      1.25ns ± 0%    1.35ns ± 1%   +8.28%  (p=0.000 n=10+9)
Memmove/15-16                                      1.25ns ± 0%    1.35ns ± 0%   +8.07%  (p=0.000 n=10+9)
Memmove/16-16                                      1.25ns ± 0%    1.35ns ± 1%   +8.35%  (p=0.000 n=9+10)
Memmove/32-16                                      1.34ns ± 0%    1.36ns ± 1%   +1.22%  (p=0.000 n=10+10)
Memmove/64-16                                      1.45ns ± 0%    1.64ns ± 0%  +13.07%  (p=0.000 n=10+9)
Memmove/128-16                                     1.86ns ± 0%    2.02ns ± 0%   +8.64%  (p=0.000 n=10+10)
Memmove/256-16                                     2.47ns ± 0%    2.49ns ± 1%   +1.14%  (p=0.000 n=10+10)
Memmove/512-16                                     3.96ns ± 1%    3.96ns ± 0%     ~     (p=0.182 n=10+10)
Memmove/1024-16                                    5.90ns ± 1%    5.87ns ± 1%     ~     (p=0.258 n=9+9)
Memmove/2048-16                                    9.62ns ± 1%    9.62ns ± 2%     ~     (p=0.963 n=8+9)
Memmove/4096-16                                    16.4ns ± 0%    17.1ns ± 4%   +4.19%  (p=0.003 n=8+9)
MemmoveOverlap/32-16                               1.62ns ± 1%    1.68ns ± 1%   +3.53%  (p=0.000 n=10+10)
MemmoveOverlap/64-16                               1.64ns ± 0%    1.65ns ± 0%   +0.29%  (p=0.002 n=9+9)
MemmoveOverlap/128-16                              2.06ns ± 0%    2.06ns ± 0%     ~     (p=0.070 n=10+10)
MemmoveOverlap/256-16                              2.67ns ± 0%    2.67ns ± 0%   +0.26%  (p=0.012 n=10+10)
MemmoveOverlap/512-16                              6.20ns ±18%    5.74ns ± 0%     ~     (p=0.645 n=10+8)
MemmoveOverlap/1024-16                             7.28ns ± 0%    7.30ns ± 0%   +0.28%  (p=0.006 n=8+10)
MemmoveOverlap/2048-16                             11.9ns ± 0%    12.0ns ± 1%   +0.37%  (p=0.014 n=9+9)
MemmoveOverlap/4096-16                             23.3ns ± 1%    23.1ns ± 1%   -0.84%  (p=0.000 n=8+10)
MemmoveUnalignedDst/0-16                           1.03ns ± 0%    1.03ns ± 0%   +0.19%  (p=0.007 n=10+10)
MemmoveUnalignedDst/1-16                           1.24ns ± 0%    1.25ns ± 1%   +0.52%  (p=0.022 n=10+10)
MemmoveUnalignedDst/2-16                           1.23ns ± 0%    1.23ns ± 0%     ~     (p=0.051 n=10+10)
MemmoveUnalignedDst/3-16                           1.23ns ± 0%    1.23ns ± 0%   +0.14%  (p=0.006 n=9+9)
MemmoveUnalignedDst/4-16                           1.23ns ± 0%    1.24ns ± 1%   +0.37%  (p=0.004 n=10+10)
MemmoveUnalignedDst/5-16                           1.35ns ± 0%    1.35ns ± 0%     ~     (p=0.075 n=10+10)
MemmoveUnalignedDst/6-16                           1.34ns ± 0%    1.34ns ± 0%     ~     (p=0.779 n=10+10)
MemmoveUnalignedDst/7-16                           1.34ns ± 0%    1.34ns ± 0%     ~     (p=1.000 n=10+10)
MemmoveUnalignedDst/8-16                           1.34ns ± 0%    1.35ns ± 1%   +0.39%  (p=0.024 n=10+10)
MemmoveUnalignedDst/9-16                           1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.849 n=10+10)
MemmoveUnalignedDst/10-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.255 n=10+10)
MemmoveUnalignedDst/11-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.304 n=10+10)
MemmoveUnalignedDst/12-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.672 n=10+10)
MemmoveUnalignedDst/13-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.435 n=10+10)
MemmoveUnalignedDst/14-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.340 n=10+10)
MemmoveUnalignedDst/15-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.911 n=10+9)
MemmoveUnalignedDst/16-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.074 n=10+10)
MemmoveUnalignedDst/32-16                          1.62ns ± 0%    1.63ns ± 0%     ~     (p=0.059 n=10+10)
MemmoveUnalignedDst/64-16                          1.65ns ± 0%    1.65ns ± 0%     ~     (p=0.234 n=10+10)
MemmoveUnalignedDst/128-16                         2.06ns ± 0%    2.06ns ± 0%     ~     (p=0.709 n=10+9)
MemmoveUnalignedDst/256-16                         3.69ns ± 0%    3.70ns ± 0%     ~     (p=0.144 n=10+10)
MemmoveUnalignedDst/512-16                         4.15ns ± 1%    4.14ns ± 0%     ~     (p=0.778 n=10+8)
MemmoveUnalignedDst/1024-16                        7.52ns ± 0%    7.53ns ± 1%     ~     (p=0.650 n=9+9)
MemmoveUnalignedDst/2048-16                        12.9ns ± 0%    12.9ns ± 1%     ~     (p=0.548 n=8+8)
MemmoveUnalignedDst/4096-16                        25.4ns ± 0%    25.4ns ± 0%     ~     (p=0.947 n=9+9)
MemmoveUnalignedDstOverlap/32-16                   4.08ns ± 0%    4.09ns ± 0%     ~     (p=0.360 n=10+10)
MemmoveUnalignedDstOverlap/64-16                   4.56ns ± 0%    4.56ns ± 0%     ~     (p=0.705 n=10+9)
MemmoveUnalignedDstOverlap/128-16                  4.67ns ± 0%    4.67ns ± 0%     ~     (p=0.397 n=10+10)
MemmoveUnalignedDstOverlap/256-16                  5.08ns ± 0%    5.08ns ± 0%     ~     (p=0.159 n=10+9)
MemmoveUnalignedDstOverlap/512-16                  8.45ns ± 5%    8.19ns ± 0%   -3.10%  (p=0.021 n=10+9)
MemmoveUnalignedDstOverlap/1024-16                 9.55ns ± 0%    9.56ns ± 0%     ~     (p=0.221 n=8+8)
MemmoveUnalignedDstOverlap/2048-16                 14.0ns ± 0%    14.0ns ± 1%     ~     (p=0.200 n=10+9)
MemmoveUnalignedDstOverlap/4096-16                 26.5ns ± 0%    26.5ns ± 0%     ~     (p=0.458 n=10+9)
MemmoveUnalignedSrc/0-16                           1.02ns ± 1%    0.99ns ± 1%   -2.67%  (p=0.000 n=10+9)
MemmoveUnalignedSrc/1-16                           1.13ns ± 0%    1.13ns ± 1%   -0.25%  (p=0.027 n=10+9)
MemmoveUnalignedSrc/2-16                           1.13ns ± 1%    1.13ns ± 0%   -0.28%  (p=0.012 n=10+9)
MemmoveUnalignedSrc/3-16                           1.24ns ± 1%    1.23ns ± 0%   -0.25%  (p=0.022 n=9+10)
MemmoveUnalignedSrc/4-16                           1.24ns ± 0%    1.23ns ± 1%     ~     (p=0.118 n=9+10)
MemmoveUnalignedSrc/5-16                           1.34ns ± 0%    1.34ns ± 1%     ~     (p=0.564 n=8+10)
MemmoveUnalignedSrc/6-16                           1.34ns ± 0%    1.34ns ± 0%   -0.39%  (p=0.000 n=10+10)
MemmoveUnalignedSrc/7-16                           1.34ns ± 0%    1.34ns ± 0%     ~     (p=0.235 n=10+10)
MemmoveUnalignedSrc/8-16                           1.34ns ± 0%    1.34ns ± 0%   -0.37%  (p=0.002 n=10+9)
MemmoveUnalignedSrc/9-16                           1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.579 n=10+9)
MemmoveUnalignedSrc/10-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.534 n=10+9)
MemmoveUnalignedSrc/11-16                          1.44ns ± 0%    1.44ns ± 1%     ~     (p=0.415 n=10+10)
MemmoveUnalignedSrc/12-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.218 n=10+10)
MemmoveUnalignedSrc/13-16                          1.44ns ± 0%    1.44ns ± 1%     ~     (p=0.693 n=10+10)
MemmoveUnalignedSrc/14-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.901 n=10+10)
MemmoveUnalignedSrc/15-16                          1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.379 n=10+10)
MemmoveUnalignedSrc/16-16                          1.44ns ± 1%    1.44ns ± 0%     ~     (p=0.538 n=10+10)
MemmoveUnalignedSrc/32-16                          1.60ns ± 1%    1.60ns ± 0%     ~     (p=0.491 n=10+10)
MemmoveUnalignedSrc/64-16                          1.65ns ± 0%    1.65ns ± 0%     ~     (p=0.564 n=10+10)
MemmoveUnalignedSrc/128-16                         2.09ns ± 0%    2.09ns ± 0%     ~     (p=0.497 n=10+9)
MemmoveUnalignedSrc/256-16                         2.70ns ± 0%    2.78ns ± 1%   +2.81%  (p=0.000 n=10+10)
MemmoveUnalignedSrc/512-16                         4.31ns ± 0%    4.30ns ± 0%   -0.26%  (p=0.031 n=8+9)
MemmoveUnalignedSrc/1024-16                        7.28ns ± 0%    7.21ns ± 1%   -1.05%  (p=0.000 n=8+10)
MemmoveUnalignedSrc/2048-16                        13.0ns ± 0%    13.0ns ± 0%     ~     (p=0.180 n=9+8)
MemmoveUnalignedSrc/4096-16                        25.4ns ± 0%    25.3ns ± 1%     ~     (p=0.054 n=10+10)
MemmoveUnalignedSrcOverlap/32-16                   4.04ns ± 0%    4.06ns ± 0%   +0.62%  (p=0.000 n=9+10)
MemmoveUnalignedSrcOverlap/64-16                   4.12ns ± 0%    4.12ns ± 0%     ~     (p=0.421 n=10+10)
MemmoveUnalignedSrcOverlap/128-16                  4.53ns ± 0%    4.52ns ± 0%     ~     (p=0.251 n=10+10)
MemmoveUnalignedSrcOverlap/256-16                  6.17ns ± 0%    6.15ns ± 0%   -0.35%  (p=0.000 n=10+9)
MemmoveUnalignedSrcOverlap/512-16                  7.43ns ± 0%    7.44ns ± 0%     ~     (p=0.524 n=9+8)
MemmoveUnalignedSrcOverlap/1024-16                 8.94ns ± 0%    8.94ns ± 0%     ~     (p=0.419 n=8+8)
MemmoveUnalignedSrcOverlap/2048-16                 13.2ns ± 0%    14.5ns ±21%     ~     (p=0.107 n=8+10)
MemmoveUnalignedSrcOverlap/4096-16                 25.6ns ± 0%    25.6ns ± 1%     ~     (p=0.650 n=9+9)
Memclr/5-16                                        0.86ns ± 1%    0.86ns ± 2%     ~     (p=0.531 n=9+9)
Memclr/16-16                                       1.04ns ± 0%    1.04ns ± 0%   +0.32%  (p=0.013 n=9+10)
Memclr/64-16                                       1.23ns ± 0%    1.26ns ± 0%   +2.28%  (p=0.000 n=10+10)
Memclr/256-16                                      2.27ns ± 0%    2.27ns ± 0%     ~     (p=0.127 n=10+10)
Memclr/4096-16                                     17.1ns ± 1%    17.3ns ± 0%   +0.88%  (p=0.000 n=10+10)
Memclr/65536-16                                     821ns ± 0%     822ns ± 0%     ~     (p=0.516 n=10+10)
Memclr/1M-16                                       14.1µs ± 1%    14.0µs ± 1%     ~     (p=0.516 n=10+10)
Memclr/4M-16                                       86.1µs ± 1%    85.9µs ± 0%     ~     (p=0.123 n=10+10)
Memclr/8M-16                                        174µs ± 2%     173µs ± 0%     ~     (p=0.408 n=10+8)
Memclr/16M-16                                       385µs ± 4%     387µs ± 0%     ~     (p=0.173 n=10+8)
Memclr/64M-16                                      2.18ms ± 0%    2.19ms ± 0%     ~     (p=0.113 n=10+9)
GoMemclr/5-16                                      0.82ns ± 0%    0.82ns ± 0%     ~     (p=0.346 n=9+10)
GoMemclr/16-16                                     1.02ns ± 0%    1.02ns ± 0%   +0.22%  (p=0.003 n=10+8)
GoMemclr/64-16                                     1.14ns ± 0%    1.14ns ± 0%     ~     (p=0.948 n=10+9)
GoMemclr/256-16                                    2.06ns ± 0%    2.06ns ± 0%     ~     (p=0.868 n=10+10)
MemclrRange/1K_2K-16                                457ns ± 0%     428ns ± 1%   -6.38%  (p=0.000 n=10+10)
MemclrRange/2K_8K-16                               1.46µs ± 0%    1.46µs ± 0%     ~     (p=0.700 n=10+10)
MemclrRange/4K_16K-16                              1.16µs ± 0%    1.16µs ± 0%     ~     (p=0.567 n=9+10)
MemclrRange/160K_228K-16                           20.7µs ± 0%    20.7µs ± 0%     ~     (p=0.160 n=10+10)
ClearFat7-16                                       0.38ns ± 5%    0.21ns ± 1%  -45.79%  (p=0.000 n=9+10)
ClearFat8-16                                       0.21ns ± 3%    0.12ns ± 2%  -44.16%  (p=0.000 n=8+9)
ClearFat11-16                                      0.35ns ± 3%    0.21ns ± 1%  -40.46%  (p=0.000 n=9+9)
ClearFat12-16                                      0.23ns ± 9%    0.21ns ± 1%  -10.23%  (p=0.000 n=10+9)
ClearFat13-16                                      0.22ns ± 6%    0.21ns ± 2%   -6.53%  (p=0.000 n=10+10)
ClearFat14-16                                      0.22ns ± 4%    0.21ns ± 1%   -5.97%  (p=0.000 n=10+10)
ClearFat15-16                                      0.22ns ± 4%    0.21ns ± 1%   -6.96%  (p=0.000 n=10+9)
ClearFat16-16                                      0.19ns ± 9%    0.12ns ± 6%  -34.89%  (p=0.000 n=9+10)
ClearFat24-16                                      0.23ns ± 6%    0.21ns ± 1%  -10.26%  (p=0.000 n=10+9)
ClearFat32-16                                      0.22ns ± 5%    0.21ns ± 2%   -5.31%  (p=0.000 n=10+10)
ClearFat40-16                                      0.34ns ± 4%    0.62ns ± 1%  +83.00%  (p=0.000 n=10+10)
ClearFat48-16                                      0.33ns ± 2%    0.41ns ± 0%  +26.71%  (p=0.000 n=10+10)
ClearFat56-16                                      0.41ns ± 1%    0.41ns ± 0%     ~     (p=0.838 n=10+10)
ClearFat64-16                                      0.41ns ± 0%    0.41ns ± 0%     ~     (p=0.178 n=10+8)
ClearFat72-16                                      0.82ns ± 0%    0.82ns ± 0%     ~     (p=0.669 n=10+10)
ClearFat128-16                                     1.04ns ± 0%    1.04ns ± 0%     ~     (p=0.679 n=10+10)
ClearFat256-16                                     1.86ns ± 0%    1.86ns ± 0%     ~     (p=0.066 n=9+10)
ClearFat512-16                                     3.50ns ± 0%    3.50ns ± 0%     ~     (p=0.626 n=10+10)
ClearFat1024-16                                    6.79ns ± 0%    6.79ns ± 0%     ~     (p=0.986 n=10+10)
ClearFat1032-16                                    13.6ns ± 0%    13.6ns ± 0%   +0.13%  (p=0.044 n=10+10)
ClearFat1040-16                                    10.3ns ± 0%    10.3ns ± 0%     ~     (p=0.175 n=10+9)
CopyFat7-16                                        0.37ns ±13%    0.25ns ± 1%  -31.74%  (p=0.000 n=10+9)
CopyFat8-16                                        0.17ns ± 1%    0.17ns ± 2%   +1.35%  (p=0.004 n=9+9)
CopyFat11-16                                       0.26ns ± 1%    0.30ns ± 3%  +12.58%  (p=0.000 n=9+10)
CopyFat12-16                                       0.28ns ± 2%    0.26ns ± 1%   -5.66%  (p=0.000 n=9+9)
CopyFat13-16                                       0.26ns ± 0%    0.28ns ± 4%   +7.35%  (p=0.000 n=8+10)
CopyFat14-16                                       0.29ns ± 6%    0.26ns ± 2%  -10.46%  (p=0.000 n=10+9)
CopyFat15-16                                       0.26ns ± 1%    0.30ns ± 6%  +14.12%  (p=0.000 n=8+10)
CopyFat16-16                                       0.21ns ± 1%    0.21ns ± 0%     ~     (p=0.426 n=8+8)
CopyFat24-16                                       0.29ns ± 3%    0.25ns ± 1%  -12.27%  (p=0.000 n=9+10)
CopyFat32-16                                       0.26ns ± 4%    0.29ns ± 4%  +11.71%  (p=0.000 n=10+10)
CopyFat64-16                                       0.46ns ± 8%    0.42ns ± 1%   -8.37%  (p=0.002 n=10+10)
CopyFat72-16                                       0.82ns ± 0%    0.82ns ± 0%     ~     (p=0.563 n=10+10)
CopyFat128-16                                      1.53ns ± 0%    1.54ns ± 0%   +0.62%  (p=0.000 n=10+10)
CopyFat256-16                                      2.68ns ± 0%    2.65ns ± 1%   -1.23%  (p=0.000 n=10+10)
CopyFat512-16                                      4.93ns ± 1%    5.19ns ± 3%   +5.16%  (p=0.000 n=9+9)
CopyFat520-16                                      6.99ns ± 0%    6.99ns ± 0%     ~     (p=0.539 n=10+10)
CopyFat1024-16                                     11.5ns ± 1%     9.8ns ± 1%  -14.98%  (p=0.000 n=9+10)
CopyFat1032-16                                     13.6ns ± 0%    13.6ns ± 0%     ~     (p=0.728 n=10+10)
CopyFat1040-16                                     11.0ns ± 0%    11.1ns ± 0%   +0.53%  (p=0.000 n=10+10)
Issue18740/2byte-16                                10.1µs ± 0%    10.1µs ± 0%     ~     (p=0.342 n=10+10)
Issue18740/4byte-16                                2.34µs ± 0%    2.35µs ± 0%   +0.30%  (p=0.002 n=10+8)
Issue18740/8byte-16                                1.28µs ± 0%    1.28µs ± 0%   +0.32%  (p=0.000 n=9+10)
Finalizer-16                                        345µs ± 1%     336µs ± 0%   -2.55%  (p=0.000 n=10+9)
FinalizerRun-16                                     450ns ± 3%     420ns ± 1%   -6.65%  (p=0.000 n=10+10)
PallocBitsSummarize/Unpacked00-16                  2.88ns ± 0%    2.88ns ± 0%     ~     (p=0.358 n=10+10)
PallocBitsSummarize/UnpackedFFFFFFFFFFFFFFFF-16    15.2ns ± 0%    15.2ns ± 0%     ~     (p=0.925 n=10+10)
PallocBitsSummarize/UnpackedAA-16                  16.4ns ± 0%    16.3ns ± 0%     ~     (p=0.113 n=9+9)
PallocBitsSummarize/UnpackedAAAAAAAAAAAAAAAA-16    16.5ns ± 0%    16.6ns ± 0%     ~     (p=0.238 n=10+10)
PallocBitsSummarize/Unpacked80000000AAAAAAAA-16    37.8ns ± 1%    36.4ns ± 0%   -3.70%  (p=0.000 n=10+9)
PallocBitsSummarize/UnpackedAAAAAAAA00000001-16    41.8ns ± 1%    39.9ns ± 0%   -4.68%  (p=0.000 n=9+10)
PallocBitsSummarize/UnpackedBBBBBBBBBBBBBBBB-16    18.3ns ± 0%    18.3ns ± 0%     ~     (p=0.781 n=10+10)
PallocBitsSummarize/Unpacked80000000BBBBBBBB-16    38.8ns ± 1%    38.1ns ± 0%   -1.78%  (p=0.000 n=9+10)
PallocBitsSummarize/UnpackedBBBBBBBB00000001-16    37.5ns ± 0%    36.1ns ± 1%   -3.88%  (p=0.000 n=8+10)
PallocBitsSummarize/UnpackedCCCCCCCCCCCCCCCC-16    21.8ns ± 0%    21.9ns ± 0%   +0.20%  (p=0.018 n=10+9)
PallocBitsSummarize/Unpacked4444444444444444-16    21.8ns ± 0%    21.9ns ± 0%   +0.20%  (p=0.029 n=10+9)
PallocBitsSummarize/Unpacked4040404040404040-16    26.5ns ± 0%    26.5ns ± 0%   -0.24%  (p=0.001 n=9+10)
PallocBitsSummarize/Unpacked4000400040004000-16    33.4ns ± 1%    31.3ns ± 0%   -6.20%  (p=0.000 n=9+10)
PallocBitsSummarize/Unpacked1000404044CCAAFF-16    36.4ns ± 1%    35.9ns ± 0%   -1.50%  (p=0.000 n=10+10)
FindBitRange64/Pattern00Size2-16                   0.34ns ± 1%    0.35ns ± 1%   +3.80%  (p=0.000 n=10+9)
FindBitRange64/Pattern00Size8-16                   0.70ns ± 1%    0.70ns ± 0%   -0.68%  (p=0.000 n=10+10)
FindBitRange64/Pattern00Size32-16                  0.70ns ± 1%    0.69ns ± 0%   -0.86%  (p=0.001 n=10+8)
FindBitRange64/PatternFFFFFFFFFFFFFFFFSize2-16     0.34ns ± 1%    0.35ns ± 1%   +4.45%  (p=0.000 n=9+8)
FindBitRange64/PatternFFFFFFFFFFFFFFFFSize8-16     1.54ns ± 0%    1.54ns ± 1%     ~     (p=0.914 n=9+9)
FindBitRange64/PatternFFFFFFFFFFFFFFFFSize32-16    2.78ns ± 0%    2.78ns ± 0%     ~     (p=0.295 n=9+10)
FindBitRange64/PatternAASize2-16                   0.34ns ± 2%    0.35ns ± 2%   +4.61%  (p=0.000 n=10+10)
FindBitRange64/PatternAASize8-16                   0.70ns ± 1%    0.70ns ± 1%   -0.82%  (p=0.005 n=10+10)
FindBitRange64/PatternAASize32-16                  0.70ns ± 1%    0.70ns ± 0%   -0.73%  (p=0.003 n=10+9)
FindBitRange64/PatternAAAAAAAAAAAAAAAASize2-16     0.34ns ± 2%    0.35ns ± 2%   +3.94%  (p=0.000 n=10+10)
FindBitRange64/PatternAAAAAAAAAAAAAAAASize8-16     0.70ns ± 1%    0.70ns ± 1%   -0.67%  (p=0.025 n=10+10)
FindBitRange64/PatternAAAAAAAAAAAAAAAASize32-16    0.70ns ± 1%    0.70ns ± 1%     ~     (p=0.118 n=9+10)
FindBitRange64/Pattern80000000AAAAAAAASize2-16     0.34ns ± 1%    0.35ns ± 2%   +3.72%  (p=0.000 n=10+9)
FindBitRange64/Pattern80000000AAAAAAAASize8-16     0.70ns ± 1%    0.70ns ± 0%     ~     (p=0.102 n=10+10)
FindBitRange64/Pattern80000000AAAAAAAASize32-16    0.70ns ± 1%    0.70ns ± 1%   -0.55%  (p=0.011 n=10+10)
FindBitRange64/PatternAAAAAAAA00000001Size2-16     0.34ns ± 2%    0.35ns ± 1%   +3.83%  (p=0.000 n=10+9)
FindBitRange64/PatternAAAAAAAA00000001Size8-16     0.70ns ± 1%    0.70ns ± 1%     ~     (p=0.065 n=10+10)
FindBitRange64/PatternAAAAAAAA00000001Size32-16    0.70ns ± 1%    0.70ns ± 1%   -0.95%  (p=0.002 n=10+10)
FindBitRange64/PatternBBBBBBBBBBBBBBBBSize2-16     0.34ns ± 0%    0.35ns ± 1%   +4.12%  (p=0.000 n=8+10)
FindBitRange64/PatternBBBBBBBBBBBBBBBBSize8-16     1.24ns ± 0%    1.23ns ± 0%   -0.30%  (p=0.002 n=10+9)
FindBitRange64/PatternBBBBBBBBBBBBBBBBSize32-16    1.24ns ± 0%    1.24ns ± 0%   -0.17%  (p=0.023 n=9+10)
FindBitRange64/Pattern80000000BBBBBBBBSize2-16     0.34ns ± 1%    0.35ns ± 2%   +4.82%  (p=0.000 n=9+10)
FindBitRange64/Pattern80000000BBBBBBBBSize8-16     1.24ns ± 1%    1.24ns ± 0%     ~     (p=0.063 n=10+10)
FindBitRange64/Pattern80000000BBBBBBBBSize32-16    1.24ns ± 0%    1.24ns ± 0%     ~     (p=0.164 n=9+10)
FindBitRange64/PatternBBBBBBBB00000001Size2-16     0.34ns ± 1%    0.35ns ± 1%   +4.38%  (p=0.000 n=8+10)
FindBitRange64/PatternBBBBBBBB00000001Size8-16     1.24ns ± 1%    1.24ns ± 0%     ~     (p=0.052 n=10+10)
FindBitRange64/PatternBBBBBBBB00000001Size32-16    1.24ns ± 0%    1.23ns ± 0%   -0.40%  (p=0.000 n=10+10)
FindBitRange64/PatternCCCCCCCCCCCCCCCCSize2-16     0.34ns ± 0%    0.35ns ± 2%   +3.96%  (p=0.000 n=9+10)
FindBitRange64/PatternCCCCCCCCCCCCCCCCSize8-16     1.24ns ± 0%    1.23ns ± 0%   -0.30%  (p=0.000 n=10+9)
FindBitRange64/PatternCCCCCCCCCCCCCCCCSize32-16    1.24ns ± 0%    1.24ns ± 1%     ~     (p=0.284 n=10+10)
FindBitRange64/Pattern4444444444444444Size2-16     0.34ns ± 1%    0.35ns ± 1%   +3.91%  (p=0.000 n=9+9)
FindBitRange64/Pattern4444444444444444Size8-16     0.70ns ± 1%    0.70ns ± 1%     ~     (p=0.617 n=10+10)
FindBitRange64/Pattern4444444444444444Size32-16    0.70ns ± 1%    0.70ns ± 1%   -0.60%  (p=0.006 n=10+10)
FindBitRange64/Pattern4040404040404040Size2-16     0.34ns ± 2%    0.35ns ± 2%   +3.67%  (p=0.000 n=10+10)
FindBitRange64/Pattern4040404040404040Size8-16     0.70ns ± 2%    0.70ns ± 1%   -0.87%  (p=0.014 n=10+10)
FindBitRange64/Pattern4040404040404040Size32-16    0.70ns ± 1%    0.70ns ± 1%     ~     (p=0.256 n=10+10)
FindBitRange64/Pattern4000400040004000Size2-16     0.34ns ± 2%    0.35ns ± 3%   +4.71%  (p=0.000 n=10+10)
FindBitRange64/Pattern4000400040004000Size8-16     0.70ns ± 1%    0.70ns ± 1%     ~     (p=0.393 n=10+10)
FindBitRange64/Pattern4000400040004000Size32-16    0.70ns ± 1%    0.70ns ± 1%   -0.86%  (p=0.014 n=10+10)
NetpollBreak-16                                    1.49µs ± 1%    1.50µs ± 3%     ~     (p=0.181 n=8+10)
Syscall-16                                         3.68ns ± 1%    3.66ns ± 2%     ~     (p=0.148 n=10+10)
SyscallWork-16                                     5.15ns ± 1%    5.13ns ± 0%     ~     (p=0.188 n=10+9)
SyscallExcess-16                                   3.89ns ± 2%    3.83ns ± 1%   -1.52%  (p=0.001 n=10+10)
SyscallExcessWork-16                               5.34ns ± 1%    5.31ns ± 0%   -0.64%  (p=0.000 n=10+9)
PingPongHog-16                                      397ns ± 7%     394ns ±11%     ~     (p=0.912 n=10+10)
StackGrowth-16                                     67.9ns ± 0%    68.8ns ± 0%   +1.28%  (p=0.000 n=10+8)
StackGrowthDeep-16                                 7.70µs ± 1%    8.48µs ± 2%  +10.06%  (p=0.000 n=9+10)
CreateGoroutines-16                                 124ns ± 1%     124ns ± 1%     ~     (p=0.254 n=10+10)
CreateGoroutinesParallel-16                        25.7ns ± 1%    27.6ns ± 2%   +7.51%  (p=0.000 n=10+10)
CreateGoroutinesCapture-16                          823ns ± 1%     821ns ± 2%     ~     (p=0.699 n=10+10)
CreateGoroutinesSingle-16                           175ns ± 3%     172ns ± 3%   -1.90%  (p=0.011 n=10+10)
ClosureCall-16                                     0.11ns ± 7%    0.12ns ± 3%     ~     (p=0.842 n=9+10)
WakeupParallelSpinning/0s-16                       11.4µs ± 0%    11.4µs ± 0%     ~     (p=0.325 n=9+10)
WakeupParallelSpinning/1µs-16                      15.4µs ± 0%    15.4µs ± 1%     ~     (p=0.955 n=10+10)
WakeupParallelSpinning/2µs-16                      18.7µs ± 2%    18.9µs ± 2%     ~     (p=0.052 n=10+10)
WakeupParallelSpinning/5µs-16                      30.7µs ± 0%    30.7µs ± 0%   -0.03%  (p=0.003 n=10+10)
WakeupParallelSpinning/10µs-16                     48.8µs ± 0%    48.8µs ± 0%     ~     (p=0.670 n=10+10)
WakeupParallelSpinning/20µs-16                     90.8µs ± 0%    90.8µs ± 0%   -0.02%  (p=0.004 n=10+10)
WakeupParallelSpinning/50µs-16                      211µs ± 0%     211µs ± 0%     ~     (p=0.194 n=10+10)
WakeupParallelSpinning/100µs-16                     323µs ± 0%     323µs ± 0%     ~     (p=1.000 n=10+9)
WakeupParallelSyscall/0s-16                         118µs ± 0%     118µs ± 0%     ~     (p=0.447 n=10+9)
WakeupParallelSyscall/1µs-16                        119µs ± 2%     119µs ± 1%     ~     (p=0.604 n=10+9)
WakeupParallelSyscall/2µs-16                        120µs ± 1%     121µs ± 3%     ~     (p=0.263 n=8+10)
WakeupParallelSyscall/5µs-16                        126µs ± 2%     126µs ± 2%     ~     (p=0.510 n=10+9)
WakeupParallelSyscall/10µs-16                       136µs ± 1%     137µs ± 1%     ~     (p=0.095 n=9+10)
WakeupParallelSyscall/20µs-16                       156µs ± 2%     157µs ± 3%     ~     (p=0.604 n=10+9)
WakeupParallelSyscall/50µs-16                       221µs ± 1%     220µs ± 1%     ~     (p=0.063 n=10+10)
WakeupParallelSyscall/100µs-16                      326µs ± 0%     325µs ± 0%   -0.26%  (p=0.003 n=9+10)
Matmult-16                                         0.67ns ± 2%    0.66ns ± 2%     ~     (p=0.256 n=10+10)
Fastrand-16                                        0.08ns ±11%    0.08ns ±13%     ~     (p=0.661 n=9+10)
Fastrand64-16                                      0.08ns ±11%    0.08ns ± 6%     ~     (p=0.631 n=10+10)
FastrandHashiter-16                                1.76ns ± 1%    1.76ns ± 1%     ~     (p=0.854 n=8+8)
Fastrandn/2-16                                     0.86ns ± 1%    0.86ns ± 1%   +1.09%  (p=0.000 n=10+9)
Fastrandn/3-16                                     0.85ns ± 1%    0.86ns ± 1%   +1.23%  (p=0.001 n=10+10)
Fastrandn/4-16                                     0.85ns ± 1%    0.87ns ± 2%   +1.60%  (p=0.000 n=10+10)
Fastrandn/5-16                                     0.85ns ± 1%    0.86ns ± 1%   +1.05%  (p=0.000 n=10+10)
IfaceCmp100-16                                     46.6ns ± 0%    46.1ns ± 0%   -1.18%  (p=0.000 n=10+10)
IfaceCmpNil100-16                                  26.8ns ± 0%    26.8ns ± 0%     ~     (p=0.777 n=10+8)
EfaceCmpDiff-16                                     132ns ± 0%     130ns ± 0%   -0.95%  (p=0.000 n=10+9)
EfaceCmpDiffIndirect-16                             209ns ± 0%     211ns ± 0%   +1.14%  (p=0.000 n=10+9)
Defer-16                                           3.40ns ± 1%    3.04ns ± 0%  -10.67%  (p=0.000 n=10+10)
Defer10-16                                         29.4ns ± 2%    27.2ns ± 3%   -7.26%  (p=0.000 n=10+10)
DeferMany-16                                        110ns ± 6%     113ns ± 2%   +3.45%  (p=0.017 n=9+9)
PanicRecover-16                                    67.6ns ± 0%    67.7ns ± 2%     ~     (p=0.436 n=9+9)
GoroutineProfile/small-nil/idle-16                 3.90µs ± 4%    3.86µs ± 2%     ~     (p=0.305 n=10+9)
GoroutineProfile/small-nil/loaded-16               4.82µs ± 6%    4.82µs ± 4%     ~     (p=0.905 n=10+9)
GoroutineProfile/small/idle-16                      103µs ± 3%     102µs ± 3%     ~     (p=0.113 n=9+9)
GoroutineProfile/small/loaded-16                    432µs ± 5%     440µs ±13%     ~     (p=0.604 n=9+10)
GoroutineProfile/large-nil/idle-16                 3.86µs ± 3%    3.82µs ± 3%     ~     (p=0.210 n=10+10)
GoroutineProfile/large-nil/loaded-16               4.90µs ± 2%    4.90µs ± 5%     ~     (p=0.780 n=10+9)
GoroutineProfile/large/idle-16                     2.58ms ± 1%    2.52ms ± 1%   -2.38%  (p=0.000 n=10+10)
GoroutineProfile/large/loaded-16                   8.62ms ± 9%    8.90ms ±11%     ~     (p=0.400 n=9+10)
GoroutineProfile/sparse-nil/idle-16                3.85µs ± 4%    3.81µs ± 3%     ~     (p=0.470 n=10+10)
GoroutineProfile/sparse-nil/loaded-16              4.82µs ± 4%    4.69µs ± 5%     ~     (p=0.052 n=10+10)
GoroutineProfile/sparse/idle-16                     102µs ± 4%     102µs ± 2%     ~     (p=0.497 n=10+9)
GoroutineProfile/sparse/loaded-16                   438µs ± 7%     437µs ± 6%     ~     (p=0.796 n=10+10)
RWMutexUncontended-16                              6.79ns ± 0%    6.78ns ± 0%     ~     (p=0.228 n=10+8)
RWMutexWrite100-16                                 85.4ns ± 0%    87.1ns ± 0%   +2.00%  (p=0.000 n=10+8)
RWMutexWrite10-16                                   168ns ±25%     152ns ±11%     ~     (p=0.063 n=10+10)
RWMutexWorkWrite100-16                              106ns ± 0%     106ns ± 3%     ~     (p=0.136 n=10+10)
RWMutexWorkWrite10-16                               567ns ± 3%     571ns ± 1%     ~     (p=0.326 n=10+9)
SemTable/OneAddrCollision/n=1000-16                15.9µs ± 1%    16.0µs ± 1%   +0.50%  (p=0.031 n=9+9)
SemTable/ManyAddrCollision/n=1000-16               56.2µs ± 1%    56.8µs ± 1%   +1.06%  (p=0.000 n=10+10)
SemTable/OneAddrCollision/n=2000-16                32.6µs ± 2%    32.9µs ± 4%     ~     (p=0.156 n=10+9)
SemTable/ManyAddrCollision/n=2000-16                118µs ± 0%     119µs ± 0%   +0.75%  (p=0.000 n=9+10)
SemTable/OneAddrCollision/n=4000-16                65.3µs ± 1%    65.6µs ± 3%     ~     (p=0.497 n=9+10)
SemTable/ManyAddrCollision/n=4000-16                245µs ± 0%     248µs ± 2%   +1.36%  (p=0.000 n=9+10)
SemTable/OneAddrCollision/n=8000-16                 131µs ± 1%     130µs ± 1%   -1.01%  (p=0.002 n=9+10)
SemTable/ManyAddrCollision/n=8000-16                503µs ± 1%     508µs ± 0%   +0.97%  (p=0.000 n=10+10)
MakeSliceCopy/mallocmove/Byte-16                   67.6ns ± 1%    64.1ns ± 2%   -5.20%  (p=0.000 n=10+10)
MakeSliceCopy/mallocmove/Int-16                    65.0ns ± 7%    61.7ns ± 4%   -5.08%  (p=0.009 n=10+10)
MakeSliceCopy/mallocmove/Ptr-16                    88.1ns ± 1%    79.9ns ± 1%   -9.29%  (p=0.000 n=10+10)
MakeSliceCopy/makecopy/Byte-16                     65.2ns ± 6%    63.4ns ± 0%     ~     (p=0.500 n=10+8)
MakeSliceCopy/makecopy/Int-16                      63.2ns ± 1%    64.1ns ± 1%   +1.34%  (p=0.001 n=9+9)
MakeSliceCopy/makecopy/Ptr-16                      88.1ns ± 1%    80.1ns ± 1%   -9.09%  (p=0.000 n=10+10)
MakeSliceCopy/nilappend/Byte-16                    69.8ns ± 1%    65.7ns ± 3%   -5.80%  (p=0.000 n=10+10)
MakeSliceCopy/nilappend/Int-16                     69.6ns ± 2%    67.2ns ± 1%   -3.50%  (p=0.000 n=10+9)
MakeSliceCopy/nilappend/Ptr-16                     91.5ns ± 1%    83.8ns ± 1%   -8.42%  (p=0.000 n=9+10)
MakeSlice/Byte-16                                  6.64ns ± 3%    6.58ns ± 2%     ~     (p=0.393 n=10+10)
MakeSlice/Int16-16                                 8.60ns ± 1%    8.38ns ± 3%   -2.48%  (p=0.001 n=9+10)
MakeSlice/Int-16                                   17.7ns ± 3%    16.9ns ± 1%   -4.67%  (p=0.000 n=10+9)
MakeSlice/Ptr-16                                   24.0ns ± 3%    23.3ns ± 2%   -3.25%  (p=0.000 n=10+9)
MakeSlice/Struct/24-16                             34.1ns ± 1%    32.0ns ± 1%   -6.11%  (p=0.000 n=10+10)
MakeSlice/Struct/32-16                             39.1ns ± 4%    38.2ns ± 1%     ~     (p=0.829 n=10+8)
MakeSlice/Struct/40-16                             47.0ns ± 5%    43.0ns ± 2%   -8.55%  (p=0.000 n=10+9)
GrowSlice/Byte-16                                  15.3ns ± 3%    15.0ns ± 2%   -1.75%  (p=0.005 n=9+9)
GrowSlice/Int16-16                                 18.9ns ± 2%    18.4ns ± 2%   -2.71%  (p=0.000 n=10+9)
GrowSlice/Int-16                                   33.9ns ± 1%    32.2ns ± 1%   -4.89%  (p=0.000 n=10+9)
GrowSlice/Ptr-16                                   45.3ns ± 2%    43.5ns ± 1%   -4.12%  (p=0.000 n=10+10)
GrowSlice/Struct/24-16                             61.9ns ± 2%    60.0ns ± 4%   -3.10%  (p=0.002 n=10+10)
GrowSlice/Struct/32-16                             79.9ns ± 2%    72.3ns ± 3%   -9.58%  (p=0.000 n=8+10)
GrowSlice/Struct/40-16                             97.1ns ± 7%    88.8ns ± 5%   -8.49%  (p=0.000 n=10+10)
ExtendSlice/IntSlice-16                            21.1ns ± 2%    20.3ns ± 2%   -3.71%  (p=0.000 n=10+10)
ExtendSlice/PointerSlice-16                        26.8ns ± 2%    26.3ns ± 2%   -1.86%  (p=0.004 n=10+10)
ExtendSlice/NoGrow-16                              1.23ns ± 0%    1.30ns ± 1%   +5.03%  (p=0.000 n=10+10)
Append-16                                          4.58ns ± 1%    4.53ns ± 0%   -1.11%  (p=0.000 n=10+10)
AppendGrowByte-16                                  1.46ms ± 8%    1.42ms ± 7%   -3.24%  (p=0.035 n=10+10)
AppendGrowString-16                                27.8ms ± 4%    27.2ms ± 5%     ~     (p=0.052 n=10+10)
AppendSlice/1Bytes-16                              1.03ns ± 1%    1.04ns ± 1%     ~     (p=0.303 n=10+10)
AppendSlice/4Bytes-16                              1.04ns ± 0%    1.05ns ± 0%   +0.79%  (p=0.000 n=9+10)
AppendSlice/7Bytes-16                              1.23ns ± 1%    1.24ns ± 0%   +0.45%  (p=0.001 n=10+10)
AppendSlice/8Bytes-16                              1.24ns ± 0%    1.24ns ± 0%     ~     (p=0.183 n=10+10)
AppendSlice/15Bytes-16                             1.37ns ± 1%    1.43ns ± 1%   +3.88%  (p=0.000 n=10+10)
AppendSlice/16Bytes-16                             1.37ns ± 1%    1.42ns ± 1%   +3.63%  (p=0.000 n=9+10)
AppendSlice/32Bytes-16                             1.44ns ± 0%    1.47ns ± 1%   +1.83%  (p=0.000 n=10+10)
AppendSliceLarge/1024Bytes-16                       257ns ± 2%     234ns ± 1%   -8.96%  (p=0.000 n=8+9)
AppendSliceLarge/4096Bytes-16                       871ns ± 6%     812ns ± 1%   -6.80%  (p=0.000 n=10+10)
AppendSliceLarge/16384Bytes-16                     3.15µs ± 6%    3.04µs ± 5%     ~     (p=0.052 n=10+10)
AppendSliceLarge/65536Bytes-16                     10.7µs ± 7%    10.8µs ± 2%     ~     (p=0.278 n=10+9)
AppendSliceLarge/262144Bytes-16                    42.9µs ± 1%    39.6µs ± 5%   -7.75%  (p=0.000 n=9+10)
AppendSliceLarge/1048576Bytes-16                    147µs ± 4%     144µs ± 4%   -2.21%  (p=0.035 n=10+10)
AppendStr/1Bytes-16                                1.20ns ± 0%    1.20ns ± 0%     ~     (p=0.755 n=10+10)
AppendStr/4Bytes-16                                1.13ns ± 0%    1.14ns ± 1%   +1.20%  (p=0.000 n=10+10)
AppendStr/8Bytes-16                                1.24ns ± 0%    1.25ns ± 0%   +0.93%  (p=0.000 n=10+10)
AppendStr/16Bytes-16                               1.40ns ± 0%    1.42ns ± 0%   +2.10%  (p=0.000 n=9+10)
AppendStr/32Bytes-16                               1.44ns ± 0%    1.45ns ± 0%   +0.99%  (p=0.000 n=10+10)
AppendSpecialCase-16                               8.64ns ± 1%    8.89ns ± 2%   +2.90%  (p=0.000 n=10+10)
Copy/1Byte-16                                      1.24ns ± 1%    1.24ns ± 0%   -0.28%  (p=0.000 n=10+6)
Copy/1String-16                                    1.24ns ± 0%    1.23ns ± 0%     ~     (p=0.160 n=10+10)
Copy/2Byte-16                                      1.24ns ± 0%    1.24ns ± 0%     ~     (p=0.115 n=10+10)
Copy/2String-16                                    1.24ns ± 0%    1.24ns ± 1%     ~     (p=0.954 n=10+10)
Copy/4Byte-16                                      1.24ns ± 0%    1.24ns ± 0%   -0.44%  (p=0.001 n=10+10)
Copy/4String-16                                    1.23ns ± 0%    1.23ns ± 0%     ~     (p=0.081 n=10+10)
Copy/8Byte-16                                      1.37ns ± 0%    1.34ns ± 0%   -1.79%  (p=0.000 n=9+9)
Copy/8String-16                                    1.34ns ± 0%    1.34ns ± 0%   -0.58%  (p=0.000 n=9+10)
Copy/12Byte-16                                     1.44ns ± 0%    1.44ns ± 0%     ~     (p=0.149 n=9+9)
Copy/12String-16                                   1.44ns ± 0%    1.45ns ± 0%     ~     (p=0.124 n=9+9)
Copy/16Byte-16                                     1.44ns ± 0%    1.44ns ± 0%   -0.19%  (p=0.004 n=10+9)
Copy/16String-16                                   1.44ns ± 0%    1.45ns ± 0%   +0.30%  (p=0.008 n=10+10)
Copy/32Byte-16                                     1.63ns ± 1%    1.62ns ± 1%   -0.72%  (p=0.002 n=10+10)
Copy/32String-16                                   1.60ns ± 1%    1.64ns ± 0%   +2.23%  (p=0.000 n=10+10)
Copy/128Byte-16                                    2.06ns ± 0%    2.06ns ± 0%     ~     (p=0.757 n=9+10)
Copy/128String-16                                  2.07ns ± 0%    2.07ns ± 0%   +0.36%  (p=0.004 n=10+10)
Copy/1024Byte-16                                   6.07ns ± 2%    6.00ns ± 1%   -1.20%  (p=0.000 n=9+10)
Copy/1024String-16                                 6.05ns ± 0%    5.95ns ± 1%   -1.54%  (p=0.000 n=10+9)
AppendInPlace/NoGrow/Byte-16                        288ns ± 1%     284ns ± 1%   -1.58%  (p=0.000 n=10+10)
AppendInPlace/NoGrow/1Ptr-16                        844ns ± 1%     809ns ± 3%   -4.13%  (p=0.000 n=9+10)
AppendInPlace/NoGrow/2Ptr-16                       1.47µs ± 1%    1.46µs ± 1%     ~     (p=0.388 n=9+10)
AppendInPlace/NoGrow/3Ptr-16                       1.87µs ± 7%    1.91µs ± 1%     ~     (p=0.166 n=10+8)
AppendInPlace/NoGrow/4Ptr-16                       2.66µs ± 1%    2.67µs ± 3%     ~     (p=0.968 n=9+10)
AppendInPlace/Grow/Byte-16                          126ns ± 2%     121ns ± 2%   -4.06%  (p=0.000 n=10+10)
AppendInPlace/Grow/1Ptr-16                          132ns ± 2%     127ns ± 2%   -4.28%  (p=0.000 n=10+9)
AppendInPlace/Grow/2Ptr-16                          196ns ± 2%     188ns ± 1%   -4.20%  (p=0.000 n=10+8)
AppendInPlace/Grow/3Ptr-16                          264ns ± 1%     260ns ± 1%   -1.51%  (p=0.000 n=9+10)
AppendInPlace/Grow/4Ptr-16                          297ns ± 2%     294ns ± 2%     ~     (p=0.085 n=10+10)
StackCopyPtr-16                                    36.4ms ± 2%    36.7ms ± 2%     ~     (p=0.481 n=10+10)
StackCopy-16                                       33.9ms ± 3%    32.6ms ± 1%   -3.87%  (p=0.000 n=10+8)
StackCopyNoCache-16                                1.00ms ± 5%    1.01ms ± 5%     ~     (p=0.143 n=10+10)
StackCopyWithStkobj-16                             11.0ms ± 3%    10.9ms ± 4%     ~     (p=0.579 n=10+10)
Issue18138-16                                      49.2µs ± 5%    49.0µs ± 4%     ~     (p=1.000 n=10+9)
CompareStringEqual-16                              1.39ns ± 1%    1.45ns ± 2%   +3.80%  (p=0.000 n=8+10)
CompareStringIdentical-16                          0.55ns ± 1%    0.55ns ± 0%   +0.42%  (p=0.007 n=10+10)
CompareStringSameLength-16                         1.03ns ± 0%    1.03ns ± 0%     ~     (p=0.430 n=9+10)
CompareStringDifferentLength-16                    0.11ns ± 2%    0.11ns ± 3%     ~     (p=0.139 n=9+10)
CompareStringBigUnaligned-16                       23.9µs ± 1%    24.0µs ± 1%     ~     (p=0.370 n=9+8)
CompareStringBig-16                                22.0µs ± 3%    22.2µs ± 3%     ~     (p=0.243 n=9+10)
ConcatStringAndBytes-16                            10.7ns ± 1%    10.0ns ± 2%   -6.33%  (p=0.000 n=10+10)
SliceByteToString/1-16                             1.34ns ± 0%    1.34ns ± 0%     ~     (p=0.057 n=10+10)
SliceByteToString/2-16                             6.67ns ± 2%    6.60ns ± 3%     ~     (p=0.101 n=10+10)
SliceByteToString/4-16                             7.76ns ± 2%    7.56ns ± 3%   -2.59%  (p=0.001 n=10+10)
SliceByteToString/8-16                             9.81ns ± 4%    9.57ns ± 2%   -2.48%  (p=0.005 n=10+10)
SliceByteToString/16-16                            14.0ns ± 3%    13.7ns ± 2%   -2.31%  (p=0.009 n=10+10)
SliceByteToString/32-16                            17.3ns ± 1%    16.7ns ± 2%   -3.41%  (p=0.000 n=10+10)
SliceByteToString/64-16                            25.1ns ± 1%    24.1ns ± 2%   -3.93%  (p=0.000 n=9+10)
SliceByteToString/128-16                           38.6ns ± 1%    36.5ns ± 1%   -5.60%  (p=0.000 n=10+10)
RuneCount/lenruneslice/ASCII-16                    4.12ns ± 0%    4.11ns ± 0%     ~     (p=0.382 n=10+10)
RuneCount/lenruneslice/Japanese-16                 25.4ns ± 2%    25.6ns ± 2%     ~     (p=0.138 n=9+10)
RuneCount/lenruneslice/MixedLength-16              17.1ns ± 0%    17.2ns ± 0%   +0.59%  (p=0.000 n=9+9)
RuneCount/rangeloop/ASCII-16                       3.30ns ± 1%    3.29ns ± 0%     ~     (p=0.267 n=10+10)
RuneCount/rangeloop/Japanese-16                    20.1ns ± 1%    24.9ns ± 1%  +24.31%  (p=0.000 n=9+9)
RuneCount/rangeloop/MixedLength-16                 16.5ns ± 1%    16.7ns ± 1%   +1.34%  (p=0.000 n=10+10)
RuneCount/utf8.RuneCountInString/ASCII-16          5.71ns ± 1%    5.73ns ± 2%     ~     (p=0.579 n=10+10)
RuneCount/utf8.RuneCountInString/Japanese-16       22.0ns ± 6%    18.4ns ± 3%  -16.41%  (p=0.000 n=9+10)
RuneCount/utf8.RuneCountInString/MixedLength-16    15.0ns ± 1%    14.9ns ± 1%   -1.01%  (p=0.004 n=9+10)
RuneIterate/range/ASCII-16                         2.69ns ± 1%    2.72ns ± 0%   +0.94%  (p=0.026 n=10+9)
RuneIterate/range/Japanese-16                      24.5ns ± 2%    25.3ns ± 2%   +3.23%  (p=0.000 n=9+10)
RuneIterate/range/MixedLength-16                   17.0ns ± 1%    17.1ns ± 1%   +0.85%  (p=0.000 n=10+10)
RuneIterate/range1/ASCII-16                        2.70ns ± 1%    2.72ns ± 0%     ~     (p=0.058 n=9+9)
RuneIterate/range1/Japanese-16                     24.1ns ± 2%    25.2ns ± 3%   +4.30%  (p=0.000 n=10+10)
RuneIterate/range1/MixedLength-16                  16.9ns ± 1%    17.7ns ± 0%   +5.04%  (p=0.000 n=10+8)
RuneIterate/range2/ASCII-16                        2.84ns ± 8%    2.72ns ± 1%   -4.28%  (p=0.003 n=10+9)
RuneIterate/range2/Japanese-16                     22.7ns ± 4%    25.2ns ± 3%  +10.97%  (p=0.000 n=10+10)
RuneIterate/range2/MixedLength-16                  17.0ns ± 1%    17.2ns ± 0%   +0.95%  (p=0.000 n=10+10)
ArrayEqual-16                                      0.40ns ± 5%    0.35ns ± 2%  -11.83%  (p=0.000 n=10+10)
Func/Name-16                                       8.05ns ± 1%    8.09ns ± 1%   +0.40%  (p=0.025 n=8+10)
Func/Entry-16                                      1.73ns ± 1%    1.66ns ± 1%   -3.93%  (p=0.000 n=10+10)
Func/FileLine-16                                   27.5ns ± 2%    26.0ns ± 0%   -5.50%  (p=0.000 n=10+10)
[Geo mean]                                         16.7ns         15.7ns        -6.08%

name                                             old speed      new speed      delta
SetTypePtr-16                                    11.0GB/s ± 1%  11.0GB/s ± 3%     ~     (p=0.684 n=10+10)
SetTypePtr8-16                                   15.5GB/s ± 0%  15.5GB/s ± 0%     ~     (p=0.123 n=10+10)
SetTypePtr16-16                                  31.0GB/s ± 1%  31.1GB/s ± 0%     ~     (p=0.123 n=10+10)
SetTypePtr32-16                                  62.1GB/s ± 0%  62.2GB/s ± 0%     ~     (p=0.123 n=10+10)
SetTypePtr64-16                                   124GB/s ± 0%   124GB/s ± 0%     ~     (p=0.684 n=10+10)
SetTypePtr126-16                                  146GB/s ± 0%   146GB/s ± 0%     ~     (p=0.481 n=10+10)
SetTypePtr128-16                                  154GB/s ± 0%   154GB/s ± 0%     ~     (p=0.243 n=9+10)
SetTypePtrSlice-16                                151GB/s ± 1%   151GB/s ± 1%     ~     (p=0.497 n=9+10)
SetTypeNode1-16                                  5.82GB/s ± 1%  5.82GB/s ± 0%     ~     (p=0.353 n=10+10)
SetTypeNode1Slice-16                             76.1GB/s ± 1%  77.0GB/s ± 1%   +1.19%  (p=0.000 n=10+10)
SetTypeNode8-16                                  19.4GB/s ± 0%  19.4GB/s ± 0%     ~     (p=0.130 n=8+8)
SetTypeNode8Slice-16                              113GB/s ± 0%   113GB/s ± 0%     ~     (p=0.604 n=10+9)
SetTypeNode64-16                                 76.5GB/s ± 0%  76.5GB/s ± 0%     ~     (p=0.190 n=10+10)
SetTypeNode64Slice-16                            97.8GB/s ± 0%  97.7GB/s ± 0%     ~     (p=0.549 n=9+10)
SetTypeNode64Dead-16                             95.5GB/s ± 0%  95.7GB/s ± 0%     ~     (p=0.118 n=10+6)
SetTypeNode64DeadSlice-16                         112GB/s ± 0%   112GB/s ± 0%     ~     (p=0.353 n=10+10)
SetTypeNode124-16                                 146GB/s ± 0%   146GB/s ± 0%     ~     (p=0.853 n=10+10)
SetTypeNode124Slice-16                            146GB/s ± 5%   149GB/s ± 0%     ~     (p=0.315 n=10+10)
SetTypeNode126-16                                 154GB/s ± 0%   154GB/s ± 0%     ~     (p=0.356 n=10+9)
SetTypeNode126Slice-16                            150GB/s ± 0%   150GB/s ± 0%     ~     (p=0.095 n=9+10)
SetTypeNode128-16                                 107GB/s ± 0%   107GB/s ± 0%   +0.31%  (p=0.003 n=9+10)
SetTypeNode128Slice-16                            119GB/s ± 0%   120GB/s ± 0%     ~     (p=0.156 n=10+9)
SetTypeNode130-16                                 108GB/s ± 0%   108GB/s ± 0%   +0.33%  (p=0.002 n=10+10)
SetTypeNode130Slice-16                            119GB/s ± 0%   119GB/s ± 0%     ~     (p=0.739 n=10+10)
SetTypeNode1024-16                                160GB/s ± 0%   159GB/s ± 1%     ~     (p=0.113 n=9+9)
SetTypeNode1024Slice-16                           144GB/s ± 0%   144GB/s ± 0%     ~     (p=0.063 n=10+10)
Hash5-16                                         2.59GB/s ± 1%  2.49GB/s ± 0%   -3.90%  (p=0.000 n=10+9)
Hash16-16                                        7.85GB/s ± 1%  7.23GB/s ± 1%   -7.92%  (p=0.000 n=10+10)
Hash64-16                                        24.0GB/s ± 0%  23.9GB/s ± 0%     ~     (p=0.190 n=9+9)
Hash1024-16                                      62.4GB/s ± 0%  62.3GB/s ± 0%   -0.16%  (p=0.017 n=9+10)
Hash65536-16                                     74.0GB/s ± 0%  74.0GB/s ± 0%     ~     (p=0.796 n=10+10)
Memmove/1-16                                     1.08GB/s ± 0%  1.08GB/s ± 0%   -0.21%  (p=0.035 n=10+10)
Memmove/2-16                                     2.16GB/s ± 0%  2.15GB/s ± 0%     ~     (p=0.105 n=10+10)
Memmove/3-16                                     3.24GB/s ± 1%  3.22GB/s ± 1%   -0.49%  (p=0.004 n=10+10)
Memmove/4-16                                     3.89GB/s ± 0%  3.89GB/s ± 0%     ~     (p=0.218 n=10+10)
Memmove/5-16                                     4.42GB/s ± 0%  4.42GB/s ± 0%     ~     (p=0.075 n=10+10)
Memmove/6-16                                     5.31GB/s ± 0%  5.29GB/s ± 1%     ~     (p=0.218 n=10+10)
Memmove/7-16                                     6.19GB/s ± 0%  6.18GB/s ± 0%   -0.15%  (p=0.035 n=10+9)
Memmove/8-16                                     7.07GB/s ± 0%  7.07GB/s ± 0%     ~     (p=0.684 n=10+10)
Memmove/9-16                                     7.22GB/s ± 0%  6.68GB/s ± 0%   -7.37%  (p=0.000 n=10+10)
Memmove/10-16                                    8.02GB/s ± 0%  7.43GB/s ± 0%   -7.38%  (p=0.000 n=9+9)
Memmove/11-16                                    8.83GB/s ± 0%  8.13GB/s ± 0%   -7.87%  (p=0.000 n=10+9)
Memmove/12-16                                    9.62GB/s ± 0%  8.89GB/s ± 1%   -7.61%  (p=0.000 n=10+10)
Memmove/13-16                                    10.4GB/s ± 0%   9.7GB/s ± 0%   -7.20%  (p=0.000 n=10+10)
Memmove/14-16                                    11.2GB/s ± 0%  10.4GB/s ± 1%   -7.64%  (p=0.000 n=10+9)
Memmove/15-16                                    12.0GB/s ± 0%  11.1GB/s ± 0%   -7.46%  (p=0.000 n=10+9)
Memmove/16-16                                    12.8GB/s ± 0%  11.8GB/s ± 1%   -7.67%  (p=0.000 n=10+10)
Memmove/32-16                                    23.8GB/s ± 0%  23.5GB/s ± 1%   -1.20%  (p=0.000 n=10+10)
Memmove/64-16                                    44.2GB/s ± 0%  39.1GB/s ± 0%  -11.56%  (p=0.000 n=10+9)
Memmove/128-16                                   68.7GB/s ± 0%  63.2GB/s ± 0%   -7.95%  (p=0.000 n=10+10)
Memmove/256-16                                    104GB/s ± 0%   103GB/s ± 0%   -1.13%  (p=0.000 n=10+10)
Memmove/512-16                                    129GB/s ± 1%   129GB/s ± 0%     ~     (p=0.165 n=10+10)
Memmove/1024-16                                   174GB/s ± 1%   174GB/s ± 1%     ~     (p=0.258 n=9+9)
Memmove/2048-16                                   213GB/s ± 1%   213GB/s ± 2%     ~     (p=0.963 n=8+9)
Memmove/4096-16                                   250GB/s ± 1%   240GB/s ± 4%   -3.83%  (p=0.006 n=9+9)
MemmoveOverlap/32-16                             19.8GB/s ± 1%  19.1GB/s ± 1%   -3.40%  (p=0.000 n=10+10)
MemmoveOverlap/64-16                             39.0GB/s ± 0%  38.8GB/s ± 0%   -0.28%  (p=0.001 n=9+9)
MemmoveOverlap/128-16                            62.2GB/s ± 0%  62.1GB/s ± 0%     ~     (p=0.063 n=10+10)
MemmoveOverlap/256-16                            96.0GB/s ± 0%  95.8GB/s ± 0%   -0.26%  (p=0.009 n=10+10)
MemmoveOverlap/512-16                            83.6GB/s ±16%  89.2GB/s ± 0%     ~     (p=0.696 n=10+8)
MemmoveOverlap/1024-16                            141GB/s ± 0%   140GB/s ± 0%   -0.28%  (p=0.006 n=8+10)
MemmoveOverlap/2048-16                            172GB/s ± 0%   171GB/s ± 1%   -0.38%  (p=0.008 n=9+9)
MemmoveOverlap/4096-16                            176GB/s ± 1%   177GB/s ± 1%   +0.84%  (p=0.001 n=8+10)
MemmoveUnalignedDst/1-16                          806MB/s ± 0%   802MB/s ± 1%   -0.52%  (p=0.023 n=10+10)
MemmoveUnalignedDst/2-16                         1.62GB/s ± 0%  1.62GB/s ± 0%   -0.11%  (p=0.041 n=10+10)
MemmoveUnalignedDst/3-16                         2.43GB/s ± 0%  2.43GB/s ± 0%   -0.14%  (p=0.006 n=9+9)
MemmoveUnalignedDst/4-16                         3.24GB/s ± 0%  3.23GB/s ± 1%   -0.36%  (p=0.007 n=10+10)
MemmoveUnalignedDst/5-16                         3.71GB/s ± 0%  3.71GB/s ± 0%     ~     (p=0.063 n=10+10)
MemmoveUnalignedDst/6-16                         4.48GB/s ± 0%  4.47GB/s ± 0%     ~     (p=0.912 n=10+10)
MemmoveUnalignedDst/7-16                         5.22GB/s ± 0%  5.22GB/s ± 0%     ~     (p=1.000 n=10+10)
MemmoveUnalignedDst/8-16                         5.95GB/s ± 0%  5.93GB/s ± 1%   -0.40%  (p=0.023 n=10+10)
MemmoveUnalignedDst/9-16                         6.24GB/s ± 0%  6.24GB/s ± 0%     ~     (p=0.912 n=10+10)
MemmoveUnalignedDst/10-16                        6.94GB/s ± 0%  6.94GB/s ± 0%     ~     (p=0.353 n=10+10)
MemmoveUnalignedDst/11-16                        7.64GB/s ± 0%  7.63GB/s ± 0%     ~     (p=0.393 n=10+10)
MemmoveUnalignedDst/12-16                        8.33GB/s ± 0%  8.33GB/s ± 0%     ~     (p=0.971 n=10+10)
MemmoveUnalignedDst/13-16                        9.02GB/s ± 0%  9.01GB/s ± 0%     ~     (p=0.436 n=10+10)
MemmoveUnalignedDst/14-16                        9.71GB/s ± 0%  9.71GB/s ± 0%     ~     (p=0.280 n=10+10)
MemmoveUnalignedDst/15-16                        10.4GB/s ± 0%  10.4GB/s ± 1%     ~     (p=0.853 n=10+10)
MemmoveUnalignedDst/16-16                        11.1GB/s ± 0%  11.1GB/s ± 0%     ~     (p=0.089 n=10+10)
MemmoveUnalignedDst/32-16                        19.7GB/s ± 1%  19.6GB/s ± 0%     ~     (p=0.075 n=10+10)
MemmoveUnalignedDst/64-16                        38.9GB/s ± 0%  38.8GB/s ± 0%     ~     (p=0.218 n=10+10)
MemmoveUnalignedDst/128-16                       62.1GB/s ± 0%  62.1GB/s ± 0%     ~     (p=0.549 n=10+9)
MemmoveUnalignedDst/256-16                       69.4GB/s ± 0%  69.3GB/s ± 0%     ~     (p=0.105 n=10+10)
MemmoveUnalignedDst/512-16                        124GB/s ± 1%   124GB/s ± 0%     ~     (p=0.762 n=10+8)
MemmoveUnalignedDst/1024-16                       136GB/s ± 0%   136GB/s ± 1%     ~     (p=0.666 n=9+9)
MemmoveUnalignedDst/2048-16                       159GB/s ± 0%   159GB/s ± 0%     ~     (p=0.574 n=8+8)
MemmoveUnalignedDst/4096-16                       161GB/s ± 0%   161GB/s ± 0%     ~     (p=1.000 n=9+9)
MemmoveUnalignedDstOverlap/32-16                 7.84GB/s ± 0%  7.83GB/s ± 0%     ~     (p=0.353 n=10+10)
MemmoveUnalignedDstOverlap/64-16                 14.0GB/s ± 0%  14.0GB/s ± 0%     ~     (p=0.661 n=10+9)
MemmoveUnalignedDstOverlap/128-16                27.4GB/s ± 0%  27.4GB/s ± 0%     ~     (p=0.353 n=10+10)
MemmoveUnalignedDstOverlap/256-16                50.4GB/s ± 0%  50.4GB/s ± 0%     ~     (p=0.156 n=10+9)
MemmoveUnalignedDstOverlap/512-16                60.7GB/s ± 4%  62.5GB/s ± 0%   +3.07%  (p=0.022 n=10+9)
MemmoveUnalignedDstOverlap/1024-16                107GB/s ± 0%   107GB/s ± 0%     ~     (p=0.234 n=8+8)
MemmoveUnalignedDstOverlap/2048-16                146GB/s ± 0%   146GB/s ± 1%     ~     (p=0.182 n=10+9)
MemmoveUnalignedDstOverlap/4096-16                155GB/s ± 0%   155GB/s ± 0%     ~     (p=0.400 n=10+9)
MemmoveUnalignedSrc/1-16                          882MB/s ± 0%   884MB/s ± 1%   +0.24%  (p=0.033 n=10+9)
MemmoveUnalignedSrc/2-16                         1.76GB/s ± 1%  1.77GB/s ± 0%   +0.27%  (p=0.028 n=10+9)
MemmoveUnalignedSrc/3-16                         2.43GB/s ± 0%  2.43GB/s ± 0%   +0.26%  (p=0.027 n=9+10)
MemmoveUnalignedSrc/4-16                         3.24GB/s ± 0%  3.24GB/s ± 1%     ~     (p=0.079 n=9+10)
MemmoveUnalignedSrc/5-16                         3.73GB/s ± 0%  3.73GB/s ± 1%     ~     (p=0.829 n=8+10)
MemmoveUnalignedSrc/6-16                         4.47GB/s ± 0%  4.49GB/s ± 0%   +0.39%  (p=0.000 n=10+10)
MemmoveUnalignedSrc/7-16                         5.22GB/s ± 0%  5.23GB/s ± 0%     ~     (p=0.280 n=10+10)
MemmoveUnalignedSrc/8-16                         5.95GB/s ± 0%  5.98GB/s ± 0%   +0.39%  (p=0.001 n=10+9)
MemmoveUnalignedSrc/9-16                         6.24GB/s ± 0%  6.25GB/s ± 0%     ~     (p=0.549 n=10+9)
MemmoveUnalignedSrc/10-16                        6.93GB/s ± 0%  6.94GB/s ± 0%     ~     (p=0.604 n=10+9)
MemmoveUnalignedSrc/11-16                        7.63GB/s ± 0%  7.63GB/s ± 1%     ~     (p=0.353 n=10+10)
MemmoveUnalignedSrc/12-16                        8.32GB/s ± 0%  8.32GB/s ± 0%     ~     (p=0.218 n=10+10)
MemmoveUnalignedSrc/13-16                        9.02GB/s ± 0%  9.00GB/s ± 1%     ~     (p=0.684 n=10+10)
MemmoveUnalignedSrc/14-16                        9.71GB/s ± 0%  9.71GB/s ± 0%     ~     (p=0.739 n=10+10)
MemmoveUnalignedSrc/15-16                        10.4GB/s ± 0%  10.4GB/s ± 0%     ~     (p=0.353 n=10+10)
MemmoveUnalignedSrc/16-16                        11.1GB/s ± 1%  11.1GB/s ± 0%     ~     (p=0.579 n=10+10)
MemmoveUnalignedSrc/32-16                        20.0GB/s ± 1%  20.0GB/s ± 0%     ~     (p=0.631 n=10+10)
MemmoveUnalignedSrc/64-16                        38.8GB/s ± 0%  38.8GB/s ± 0%     ~     (p=0.579 n=10+10)
MemmoveUnalignedSrc/128-16                       61.2GB/s ± 0%  61.2GB/s ± 0%     ~     (p=0.780 n=10+9)
MemmoveUnalignedSrc/256-16                       94.8GB/s ± 0%  92.2GB/s ± 1%   -2.73%  (p=0.000 n=10+10)
MemmoveUnalignedSrc/512-16                        119GB/s ± 0%   119GB/s ± 0%   +0.26%  (p=0.027 n=8+9)
MemmoveUnalignedSrc/1024-16                       141GB/s ± 0%   142GB/s ± 1%   +1.07%  (p=0.000 n=8+10)
MemmoveUnalignedSrc/2048-16                       157GB/s ± 0%   157GB/s ± 0%     ~     (p=0.167 n=9+8)
MemmoveUnalignedSrc/4096-16                       161GB/s ± 0%   162GB/s ± 1%     ~     (p=0.063 n=10+10)
MemmoveUnalignedSrcOverlap/32-16                 7.93GB/s ± 0%  7.88GB/s ± 0%   -0.63%  (p=0.000 n=9+10)
MemmoveUnalignedSrcOverlap/64-16                 15.5GB/s ± 0%  15.5GB/s ± 0%     ~     (p=0.529 n=10+10)
MemmoveUnalignedSrcOverlap/128-16                28.3GB/s ± 0%  28.3GB/s ± 0%     ~     (p=0.218 n=10+10)
MemmoveUnalignedSrcOverlap/256-16                41.5GB/s ± 0%  41.6GB/s ± 0%   +0.35%  (p=0.000 n=10+9)
MemmoveUnalignedSrcOverlap/512-16                68.9GB/s ± 0%  68.8GB/s ± 0%     ~     (p=0.541 n=9+8)
MemmoveUnalignedSrcOverlap/1024-16                115GB/s ± 0%   115GB/s ± 0%     ~     (p=0.382 n=8+8)
MemmoveUnalignedSrcOverlap/2048-16                155GB/s ± 0%   144GB/s ±18%     ~     (p=0.101 n=8+10)
MemmoveUnalignedSrcOverlap/4096-16                160GB/s ± 0%   160GB/s ± 1%     ~     (p=0.605 n=9+9)
Memclr/5-16                                      5.81GB/s ± 1%  5.80GB/s ± 2%     ~     (p=0.546 n=9+9)
Memclr/16-16                                     15.5GB/s ± 0%  15.4GB/s ± 0%   -0.32%  (p=0.008 n=9+10)
Memclr/64-16                                     51.8GB/s ± 0%  50.7GB/s ± 0%   -2.22%  (p=0.000 n=10+10)
Memclr/256-16                                     113GB/s ± 0%   113GB/s ± 0%     ~     (p=0.143 n=10+10)
Memclr/4096-16                                    239GB/s ± 1%   237GB/s ± 0%   -0.87%  (p=0.000 n=10+10)
Memclr/65536-16                                  79.8GB/s ± 0%  79.7GB/s ± 0%     ~     (p=0.529 n=10+10)
Memclr/1M-16                                     74.6GB/s ± 1%  74.7GB/s ± 1%     ~     (p=0.529 n=10+10)
Memclr/4M-16                                     48.7GB/s ± 1%  48.8GB/s ± 0%     ~     (p=0.123 n=10+10)
Memclr/8M-16                                     48.2GB/s ± 2%  48.6GB/s ± 0%     ~     (p=0.408 n=10+8)
Memclr/16M-16                                    43.6GB/s ± 4%  43.3GB/s ± 0%     ~     (p=0.173 n=10+8)
Memclr/64M-16                                    30.7GB/s ± 0%  30.7GB/s ± 0%     ~     (p=0.113 n=10+9)
GoMemclr/5-16                                    6.07GB/s ± 0%  6.08GB/s ± 0%     ~     (p=0.367 n=9+10)
GoMemclr/16-16                                   15.6GB/s ± 0%  15.6GB/s ± 0%   -0.22%  (p=0.004 n=9+9)
GoMemclr/64-16                                   56.1GB/s ± 0%  56.1GB/s ± 0%     ~     (p=0.968 n=10+9)
GoMemclr/256-16                                   125GB/s ± 0%   124GB/s ± 0%     ~     (p=0.912 n=10+10)
MemclrRange/1K_2K-16                              210GB/s ± 0%   224GB/s ± 1%   +6.81%  (p=0.000 n=10+10)
MemclrRange/2K_8K-16                              228GB/s ± 0%   228GB/s ± 0%     ~     (p=0.684 n=10+10)
MemclrRange/4K_16K-16                             279GB/s ± 0%   279GB/s ± 0%     ~     (p=0.780 n=9+10)
MemclrRange/160K_228K-16                         80.3GB/s ± 0%  80.2GB/s ± 0%     ~     (p=0.165 n=10+10)
Copy/1Byte-16                                     808MB/s ± 1%   810MB/s ± 0%   +0.28%  (p=0.000 n=10+8)
Copy/1String-16                                   810MB/s ± 0%   811MB/s ± 0%     ~     (p=0.105 n=10+10)
Copy/2Byte-16                                    1.62GB/s ± 0%  1.62GB/s ± 0%     ~     (p=0.182 n=10+9)
Copy/2String-16                                  1.62GB/s ± 1%  1.62GB/s ± 1%     ~     (p=1.000 n=10+10)
Copy/4Byte-16                                    3.22GB/s ± 0%  3.24GB/s ± 0%   +0.46%  (p=0.000 n=10+10)
Copy/4String-16                                  3.24GB/s ± 0%  3.24GB/s ± 0%     ~     (p=0.075 n=10+10)
Copy/8Byte-16                                    5.86GB/s ± 0%  5.96GB/s ± 0%   +1.82%  (p=0.000 n=9+9)
Copy/8String-16                                  5.95GB/s ± 0%  5.99GB/s ± 0%   +0.59%  (p=0.000 n=9+10)
Copy/12Byte-16                                   8.32GB/s ± 0%  8.32GB/s ± 0%     ~     (p=0.190 n=9+9)
Copy/12String-16                                 8.31GB/s ± 0%  8.29GB/s ± 0%     ~     (p=0.068 n=9+10)
Copy/16Byte-16                                   11.1GB/s ± 0%  11.1GB/s ± 0%   +0.18%  (p=0.003 n=10+9)
Copy/16String-16                                 11.1GB/s ± 0%  11.1GB/s ± 0%   -0.31%  (p=0.009 n=10+10)
Copy/32Byte-16                                   19.6GB/s ± 1%  19.8GB/s ± 1%   +0.72%  (p=0.002 n=10+10)
Copy/32String-16                                 20.0GB/s ± 0%  19.5GB/s ± 0%   -2.19%  (p=0.000 n=10+10)
Copy/128Byte-16                                  62.2GB/s ± 0%  62.1GB/s ± 0%     ~     (p=0.661 n=9+10)
Copy/128String-16                                61.9GB/s ± 0%  61.7GB/s ± 0%   -0.35%  (p=0.005 n=10+10)
Copy/1024Byte-16                                  169GB/s ± 2%   171GB/s ± 1%   +1.21%  (p=0.000 n=9+10)
Copy/1024String-16                                169GB/s ± 0%   172GB/s ± 1%   +1.57%  (p=0.000 n=10+9)
CompareStringBigUnaligned-16                     43.8GB/s ± 1%  43.7GB/s ± 1%     ~     (p=0.370 n=9+8)
CompareStringBig-16                              47.6GB/s ± 3%  47.3GB/s ± 3%     ~     (p=0.243 n=9+10)
[Geo mean]                                       25.3GB/s       28.0GB/s       +10.66%

name                                             old p50-ns     new p50-ns     delta
ReadMemStatsLatency-16                              98.4k ±37%    112.7k ±64%     ~     (p=0.436 n=10+10)
ReadMetricsLatency-16                               1.69k ± 3%     1.70k ± 2%     ~     (p=0.646 n=9+10)
GoroutineProfile/small-nil/idle-16                  3.75k ± 4%     3.72k ± 1%     ~     (p=0.447 n=10+9)
GoroutineProfile/small-nil/loaded-16                4.33k ± 3%     4.31k ± 5%     ~     (p=0.931 n=9+9)
GoroutineProfile/small/idle-16                       102k ± 3%      101k ± 4%     ~     (p=0.113 n=9+9)
GoroutineProfile/small/loaded-16                     214k ± 3%      215k ± 6%     ~     (p=0.842 n=9+10)
GoroutineProfile/large-nil/idle-16                  3.70k ± 2%     3.65k ± 2%     ~     (p=0.075 n=10+10)
GoroutineProfile/large-nil/loaded-16                4.36k ± 3%     4.31k ±10%     ~     (p=0.631 n=10+10)
GoroutineProfile/large/idle-16                      2.56M ± 1%     2.51M ± 1%   -2.28%  (p=0.000 n=10+10)
GoroutineProfile/large/loaded-16                    6.77M ± 4%     6.85M ±19%     ~     (p=0.536 n=7+10)
GoroutineProfile/sparse-nil/idle-16                 3.66k ± 1%     3.64k ± 2%     ~     (p=0.136 n=9+9)
GoroutineProfile/sparse-nil/loaded-16               4.25k ± 5%     4.15k ± 4%     ~     (p=0.190 n=10+10)
GoroutineProfile/sparse/idle-16                      102k ± 4%      101k ± 3%     ~     (p=0.447 n=10+9)
GoroutineProfile/sparse/loaded-16                    216k ± 4%      218k ± 4%     ~     (p=0.549 n=9+10)
[Geo mean]                                          35.8k          35.9k        +0.35%

name                                             old p90-ns     new p90-ns     delta
ReadMemStatsLatency-16                              983k ±310%      200k ±34%  -79.62%  (p=0.034 n=10+8)
ReadMetricsLatency-16                               4.01k ±35%     3.75k ±17%     ~     (p=0.315 n=10+10)
GoroutineProfile/small-nil/idle-16                  4.21k ± 4%     4.27k ± 8%     ~     (p=0.968 n=9+10)
GoroutineProfile/small-nil/loaded-16                5.58k ± 8%     5.35k ±12%     ~     (p=0.190 n=10+10)
GoroutineProfile/small/idle-16                       108k ± 6%      107k ± 7%     ~     (p=0.497 n=9+10)
GoroutineProfile/small/loaded-16                     450k ± 5%      432k ± 3%   -3.92%  (p=0.002 n=9+9)
GoroutineProfile/large-nil/idle-16                  4.13k ± 7%     4.04k ± 2%     ~     (p=0.181 n=10+8)
GoroutineProfile/large-nil/loaded-16                5.76k ± 4%     5.67k ± 5%     ~     (p=0.190 n=10+10)
GoroutineProfile/large/idle-16                      2.63M ± 2%     2.58M ± 1%   -1.97%  (p=0.000 n=10+10)
GoroutineProfile/large/loaded-16                    16.9M ± 4%     17.0M ± 6%     ~     (p=0.661 n=9+10)
GoroutineProfile/sparse-nil/idle-16                 4.21k ±10%     4.07k ± 7%     ~     (p=0.128 n=10+10)
GoroutineProfile/sparse-nil/loaded-16               5.55k ± 8%     5.38k ± 6%     ~     (p=0.089 n=10+10)
GoroutineProfile/sparse/idle-16                      106k ± 4%      106k ± 3%     ~     (p=0.661 n=10+9)
GoroutineProfile/sparse/loaded-16                    454k ± 6%      441k ± 6%   -2.86%  (p=0.043 n=10+10)
[Geo mean]                                          58.4k          51.0k       -12.61%

name                                             old p99-ns     new p99-ns     delta
ReadMemStatsLatency-16                              983k ±310%      200k ±34%  -79.62%  (p=0.034 n=10+8)
ReadMetricsLatency-16                               26.6k ±22%     26.6k ±17%     ~     (p=0.971 n=10+10)
GoroutineProfile/small-nil/idle-16                  5.27k ±12%     5.19k ±12%     ~     (p=0.579 n=10+10)
GoroutineProfile/small-nil/loaded-16                7.28k ± 3%     7.04k ± 9%     ~     (p=0.113 n=9+10)
GoroutineProfile/small/idle-16                       114k ± 6%      113k ± 6%     ~     (p=0.604 n=9+10)
GoroutineProfile/small/loaded-16                    4.84M ±70%    6.06M ±131%     ~     (p=0.842 n=9+10)
GoroutineProfile/large-nil/idle-16                  5.38k ±19%     5.26k ±13%     ~     (p=0.912 n=10+10)
GoroutineProfile/large-nil/loaded-16                7.38k ± 3%     7.23k ± 4%     ~     (p=0.143 n=10+10)
GoroutineProfile/large/idle-16                      2.79M ± 5%     2.72M ± 3%     ~     (p=0.089 n=10+10)
GoroutineProfile/large/loaded-16                    24.0M ±24%     24.9M ±29%     ~     (p=0.684 n=10+10)
GoroutineProfile/sparse-nil/idle-16                 5.32k ±17%     5.49k ±17%     ~     (p=0.684 n=10+10)
GoroutineProfile/sparse-nil/loaded-16               7.25k ± 4%     6.97k ± 5%   -3.90%  (p=0.005 n=9+10)
GoroutineProfile/sparse/idle-16                      113k ± 5%      112k ± 6%     ~     (p=0.631 n=10+10)
GoroutineProfile/sparse/loaded-16                   4.00M ±66%     4.26M ±65%     ~     (p=0.489 n=9+9)
[Geo mean]                                           107k            97k        -9.55%

name                                             old alloc/op   new alloc/op   delta
NewEmptyMap-16                                      0.00B          0.00B          ~     (all equal)
NewSmallMap-16                                      0.00B          0.00B          ~     (all equal)
MapPopulate/1-16                                    0.00B          0.00B          ~     (all equal)
MapPopulate/10-16                                    179B ± 0%      179B ± 0%     ~     (all equal)
MapPopulate/100-16                                 3.35kB ± 0%    3.35kB ± 0%     ~     (p=0.294 n=10+8)
MapPopulate/1000-16                                53.3kB ± 0%    53.3kB ± 0%     ~     (p=1.000 n=8+10)
MapPopulate/10000-16                                428kB ± 0%     428kB ± 0%     ~     (p=0.469 n=10+10)
MapPopulate/100000-16                              3.62MB ± 0%    3.62MB ± 0%     ~     (p=0.888 n=9+10)
MapStringConversion/32/simple-16                    0.00B          0.00B          ~     (all equal)
MapStringConversion/32/struct-16                    0.00B          0.00B          ~     (all equal)
MapStringConversion/32/array-16                     0.00B          0.00B          ~     (all equal)
MapStringConversion/64/simple-16                    0.00B          0.00B          ~     (all equal)
MapStringConversion/64/struct-16                    0.00B          0.00B          ~     (all equal)
MapStringConversion/64/array-16                     0.00B          0.00B          ~     (all equal)
NewEmptyMapHintLessThan8-16                         0.00B          0.00B          ~     (all equal)
NewEmptyMapHintGreaterThan8-16                     1.15kB ± 0%    1.15kB ± 0%     ~     (all equal)
MapAppendAssign/Int32/256-16                        41.7B ±15%     44.3B ±12%     ~     (p=0.106 n=10+10)
MapAppendAssign/Int32/65536-16                      22.6B ± 6%     23.5B ± 6%   +4.19%  (p=0.025 n=9+10)
MapAppendAssign/Int64/256-16                        43.5B ±10%     42.9B ± 7%     ~     (p=0.757 n=10+10)
MapAppendAssign/Int64/65536-16                      24.7B ± 7%     21.8B ± 6%  -11.74%  (p=0.000 n=10+10)
MapAppendAssign/Str/256-16                          87.6B ±10%     89.2B ± 9%     ~     (p=0.379 n=10+10)
MapAppendAssign/Str/65536-16                        45.1B ±14%     47.2B ± 8%     ~     (p=0.150 n=10+9)
CreateGoroutinesCapture-16                           144B ± 0%      144B ± 0%     ~     (all equal)
[Geo mean]                                           769B           770B        +0.21%

name                                             old allocs/op  new allocs/op  delta
NewEmptyMap-16                                       0.00           0.00          ~     (all equal)
NewSmallMap-16                                       0.00           0.00          ~     (all equal)
MapPopulate/1-16                                     0.00           0.00          ~     (all equal)
MapPopulate/10-16                                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
MapPopulate/100-16                                   17.0 ± 0%      17.0 ± 0%     ~     (all equal)
MapPopulate/1000-16                                  73.0 ± 0%      73.0 ± 0%     ~     (all equal)
MapPopulate/10000-16                                  320 ± 0%       320 ± 0%     ~     (p=1.000 n=10+10)
MapPopulate/100000-16                               4.00k ± 0%     4.00k ± 0%     ~     (p=0.753 n=10+10)
MapStringConversion/32/simple-16                     0.00           0.00          ~     (all equal)
MapStringConversion/32/struct-16                     0.00           0.00          ~     (all equal)
MapStringConversion/32/array-16                      0.00           0.00          ~     (all equal)
MapStringConversion/64/simple-16                     0.00           0.00          ~     (all equal)
MapStringConversion/64/struct-16                     0.00           0.00          ~     (all equal)
MapStringConversion/64/array-16                      0.00           0.00          ~     (all equal)
NewEmptyMapHintLessThan8-16                          0.00           0.00          ~     (all equal)
NewEmptyMapHintGreaterThan8-16                       1.00 ± 0%      1.00 ± 0%     ~     (all equal)
MapAppendAssign/Int32/256-16                         0.00           0.00          ~     (all equal)
MapAppendAssign/Int32/65536-16                       0.00           0.00          ~     (all equal)
MapAppendAssign/Int64/256-16                         0.00           0.00          ~     (all equal)
MapAppendAssign/Int64/65536-16                       0.00           0.00          ~     (all equal)
MapAppendAssign/Str/256-16                           0.00           0.00          ~     (all equal)
MapAppendAssign/Str/65536-16                         0.00           0.00          ~     (all equal)
CreateGoroutinesCapture-16                           5.00 ± 0%      5.00 ± 0%     ~     (all equal)
[Geo mean]                                           26.0           26.0        +0.00%

Change-Id: I5fb03e93df8b380e04795afbdcd1c94aeeecacc6
Reviewed-on: https://go-review.googlesource.com/c/go/+/454255
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Jakub Ciolek <jakub@ciolek.dev>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-31 18:11:24 +00:00
Keith Randall 1e12c63aac cmd/compile: fix -m=2 output for recursive function with closures
ir.VisitFuncsBottomUp returns recursive==true for functions which
call themselves. It also returns any closures inside that function.
We don't want to report the closures as recursive, as they really
aren't. Only the containing function is recursive.
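
As an illustration only (this is not the test added by the CL, and countdown and report are made-up names), a function of the following shape is the kind of case described above: the outer function calls itself, while the function literal inside it does not. Building it with -gcflags=-m=2 should show the inliner's remarks for both; the exact wording of those remarks is not reproduced here.

```go
package main

import "fmt"

// countdown calls itself, so it is genuinely recursive.
// The function literal stored in report calls neither itself nor
// countdown; it is merely a closure that lives inside countdown.
func countdown(n int) int {
	report := func(v int) int { return v * 2 }
	if n == 0 {
		return report(0)
	}
	return countdown(n-1) + report(1)
}

func main() {
	fmt.Println(countdown(3))
}
```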

Fixes #54159

Change-Id: I3b4d6710a389ec1d6b250ba8a7065f2e985bdbe1
Reviewed-on: https://go-review.googlesource.com/c/go/+/463233
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2023-01-28 04:29:02 +00:00
Paul E. Murphy 1540531746 test/codegen: merge identical ppc64 and ppc64le tests
Manually consolidate the remaining ppc64/ppc64le tests which
are not so trivial to merge automatically.

The remaining ppc64le tests are limited to cases where load/stores are
merged (this only happens on ppc64le) and the race detector (only
supported on ppc64le).

Change-Id: I1f9c0f3d3ddbb7fbbd8c81fbbd6537394fba63ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/463217
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-01-27 19:03:02 +00:00
Paul E. Murphy 0301c6c351 test/codegen: combine trivial PPC64 tests into ppc64x
Use a small python script to consolidate duplicate
ppc64/ppc64le tests into a single ppc64x codegen test.

This makes the small assumption that any time two tests
for different arch/variant combos exist, those tests
can be combined into a single ppc64x test.

E.g.:

  // ppc64le: foo
  // ppc64le/power9: foo
into
  // ppc64x: foo

or

  // ppc64: foo
  // ppc64le: foo
into
  // ppc64x: foo

import glob
import re
files = glob.glob("codegen/*.go")
for file in files:
    with open(file) as f:
        text = [l for l in f]
    i = 0
    while i < len(text):
        first = re.match(r"\s*// ?ppc64(le)?(/power[89])?:(.*)", text[i])
        if first:
            j = i+1
            while j < len(text):
                second = re.match(r"\s*// ?ppc64(le)?(/power[89])?:(.*)", text[j])
                if not second:
                    break
                if (not first.group(2) or first.group(2) == second.group(2)) and first.group(3) == second.group(3):
                    text[i] = re.sub(" ?ppc64(le|x)?"," ppc64x",text[i])
                    text=text[:j] + (text[j+1:])
                else:
                    j += 1
        i+=1
    with open(file, 'w') as f:
        f.write("".join(text))

Change-Id: Ic6b009b54eacaadc5a23db9c5a3bf7331b595821
Reviewed-on: https://go-review.googlesource.com/c/go/+/463220
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-27 18:24:12 +00:00
Matthew Dempsky a7de684e1b cmd/compile/internal/noder: stop creating TUNION types
In the types1 universe under the unified frontend, we never need to
worry about type parameter constraints, so we only see pure
interfaces. However, we might still see interfaces that contain union
types, because of interfaces like "interface{ any | int }" (equivalent
to just "any").

We can handle these without needing to actually represent type unions
within types1 by simply mapping any union to "any".

Updates #57410.

Change-Id: I5e4efcf0339edbb01f4035c54fb6fb1f9ddc0c65
Reviewed-on: https://go-review.googlesource.com/c/go/+/458619
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2023-01-26 21:43:08 +00:00
Matthew Dempsky 4f467f1082 cmd: remove GOEXPERIMENT=nounified knob
This CL removes the GOEXPERIMENT=nounified knob, and any conditional
statements that depend on that knob. Further CLs to remove unreachable
code follow this one.

Updates #57410.

Change-Id: I39c147e1a83601c73f8316a001705778fee64a91
Reviewed-on: https://go-review.googlesource.com/c/go/+/458615
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-01-25 21:16:32 +00:00
Paul E. Murphy a37672bb7b test/codegen: accept ppc64x as alias for ppc64le and ppc64 arches
This reduces the noise when adding ppc codegen tests. ppc64x
is used in other places to indicate something which runs on either
endian.

This helps clean up existing codegen tests which are mostly
identical between endian variants.

condmove tests are converted as an example.

Change-Id: I2b2d98a9a1859015f62db38d62d9d5d7593435b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/462895
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
2023-01-24 22:55:18 +00:00
Matthew Dempsky ffbd194f5c test: remove TODO in issue20250.go
This has been investigated and explained on the issue tracker.

Fixes #54402.

Change-Id: I4d8b971faa810591983ad028b7db16411f3b3b4a
Reviewed-on: https://go-review.googlesource.com/c/go/+/461456
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
2023-01-24 19:47:54 +00:00
Keith Randall a6ddb15f8f Revert "cmd/compile: teach prove about bitwise OR operation"
This reverts commit 3680b5e9c4.

Reason for revert: causes long compile times on certain functions. See issue #57959

Change-Id: Ie9e881ca8abbc79a46de2bfeaed0b9d6c416ed42
Reviewed-on: https://go-review.googlesource.com/c/go/+/463295
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-01-24 17:58:12 +00:00
Than McIntosh 733ba92187 cmd/compile: flag 'large' functions when -m=2+ in effect
When -m=N (where N > 1) is in effect, include a note in the trace
output if a given function is considered "big" during inlining
analysis, since this causes the inliner to be less aggressive. If a
small change to a large function happens to nudge it over the large
function threshold, it can be confusing for developers, thus it's
probably worth including this info in the remark output.

Change-Id: Id31a1b76371ab1ef9265ba28a377f97b0247d0a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/460317
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-01-24 13:28:54 +00:00
Jorropo 35755d772f cmd/compile: teach prove about unsigned division, modulus and rsh
Fixes: #57077
Change-Id: Icffcac42e28622eadecdba26e3cd7ceca6c4aacc
Reviewed-on: https://go-review.googlesource.com/c/go/+/455095
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2023-01-23 18:35:41 +00:00
Jakub Ciolek bb5ff5342d cmd/compile: make loopbce handle 8, 16 and 32 bit induction variables
Compute limits and increment values for all integer widths.
Resolves 2 TODOs in loopbce.go.
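
A hedged sketch of the kind of loop this concerns, with a made-up function name: the induction variable is a uint8 rather than an int, and its range [0, 16) matches the array length. Whether a bounds check is actually removed depends on the compiler version; building with -gcflags=-d=ssa/check_bce/debug=1 is one way to see which checks remain.

```go
package main

import "fmt"

// sumFirst16 indexes a fixed-size array with a uint8 induction variable.
// The loop bound is the constant 16, so i always stays within [0, 16).
func sumFirst16(a *[16]int) int {
	total := 0
	for i := uint8(0); i < 16; i++ {
		total += a[i]
	}
	return total
}

func main() {
	var a [16]int
	for i := range a {
		a[i] = i + 1
	}
	fmt.Println(sumFirst16(&a))
}
```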

compilecmp linux/amd64:

compress/flate
compress/flate.(*huffmanEncoder).bitCounts 1235 -> 1207  (-2.27%)

cmd/internal/obj/wasm
cmd/internal/obj/wasm.assemble 7443 -> 7303  (-1.88%)
cmd/internal/obj/wasm.assemble.func1 165 -> 138  (-16.36%)

cmd/link/internal/ld
cmd/link/internal/ld.(*Link).findfunctab.func1 1646 -> 1627  (-1.15%)

Change-Id: I2d79b7376eb67d6bcc8fdaf0c197c11e631562d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/435258
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-01-23 18:10:40 +00:00
David Chase e22bd2348c internal/abi,runtime: refactor map constants into one place
Previously TryBot-tested with bucket bits = 4.
Also tested locally with bucket bits = 5.
This makes it much easier to change the size of map
buckets, and hopefully provides pointers to all the
code that in some way depends on details of map layout.

Change-Id: I9f6669d1eadd02f182d0bc3f959dc5f385fa1683
Reviewed-on: https://go-review.googlesource.com/c/go/+/462115
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2023-01-23 15:51:32 +00:00
Keith Randall ba91377454 test: test that we schedule OpArgIntReg early
If OpArgIntReg is incorrectly scheduled, that causes it to be spilled
incorrectly, which causes the argument to not be considered live
at the start of the function.

This is the test for CL 462858

Add a brief mention of why CL 462858 is needed in the scheduling code.

Change-Id: Id199456f88d9ee5ca46d7b0353a3c2049709880e
Reviewed-on: https://go-review.googlesource.com/c/go/+/462899
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2023-01-21 21:08:30 +00:00
Keith Randall 4ff074945a cmd/compile: sort liveness variable reports
Sort variables before display so that when there are multiple variables
to report, they are in a consistent order.

Otherwise they are ordered in the order they appear in the fn.Dcl list,
which can vary. Particularly, they vary depending on regabi.

Change-Id: I0db380f7cbe6911e87177503a4c3b39851ff1b5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/462898
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-01-21 21:08:00 +00:00
Jorropo 5c67ebbb31 cmd/compile: AMD64v3 remove unnecessary TEST comparison in isPowerOfTwo
With GOAMD64=V3 the canonical isPowerOfTwo function:
  func isPowerOfTwo(x uintptr) bool {
    return x&(x-1) == 0
  }

Used to compile to:
  temp := BLSR(x) // x&(x-1)
  flags = TEST(temp, temp)
  return flags.zf

However, the BLSR instruction already sets ZF according to the result,
so we can remove the TEST instruction if we are just checking ZF.
This pattern shows up in multiple pieces of code around memory allocation.

This makes the code smaller and faster.
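
For reference, a small runnable version of the function quoted above. The main function and its sample inputs are additions for illustration. To look at the generated code, something like GOAMD64=v3 go build -gcflags=-S should work; the resulting assembly is not reproduced here.

```go
package main

import "fmt"

// isPowerOfTwo reports whether x has at most one bit set.
// Like the x&(x-1) == 0 test in the commit message, it also
// returns true for x == 0.
func isPowerOfTwo(x uintptr) bool {
	return x&(x-1) == 0
}

func main() {
	for _, x := range []uintptr{0, 1, 6, 8, 1023, 1024} {
		fmt.Println(x, isPowerOfTwo(x))
	}
}
```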

Change-Id: Ia12d5a73aa3cb49188c0b647b1eff7b56c5a7b58
Reviewed-on: https://go-review.googlesource.com/c/go/+/448255
Run-TryBot: Jakub Ciolek <jakub@ciolek.dev>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-01-20 04:58:59 +00:00
Jorropo fc814056aa cmd/compile: rewrite empty makeslice to zerobase pointer
make\(\[\][a-zA-Z0-9]+, 0\) is seen 52 times in the Go source,
and at least 391 times on the internet:
https://grep.app/search?q=make%5C%28%5C%5B%5C%5D%5Ba-zA-Z0-9%5D%2B%2C%200%5C%29&regexp=true
This used to compile to a call to runtime.makeslice.
However, we can copy what we do for []T{} and just use a zerobase pointer.

On my machine this is 10x faster (from 3ns to 0.3ns).
Note that an empty loop also runs in 0.3ns,
so this really is free when you account for superscalar execution.
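
A minimal sketch of the two spellings being compared, with helper names made up for the example. It only checks the observable properties (non-nil, zero length, zero capacity); it does not measure the timing claimed above.

```go
package main

import "fmt"

// emptyViaMake uses the make([]T, 0) spelling discussed above.
func emptyViaMake() []int {
	return make([]int, 0)
}

// emptyViaLiteral uses the []T{} spelling it is compared against.
func emptyViaLiteral() []int {
	return []int{}
}

func main() {
	a, b := emptyViaMake(), emptyViaLiteral()
	// Both are non-nil slices with zero length and zero capacity.
	fmt.Println(a != nil, len(a), cap(a))
	fmt.Println(b != nil, len(b), cap(b))
}
```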

Change-Id: I1cfe7e69f5a7a4dabbc71912ce6a4f8a2d4a7f3c
Reviewed-on: https://go-review.googlesource.com/c/go/+/454036
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Jakub Ciolek <jakub@ciolek.dev>
2023-01-20 04:57:35 +00:00
Keith Randall 12befc3ce3 cmd/compile: improve scheduling pass
Convert the scheduling pass from scheduling backwards to scheduling forwards.

Forward scheduling makes it easier to prioritize scheduling values as
soon as they are ready, which is important for things like nil checks,
select ops, etc.

Forward scheduling is also quite a bit clearer. It was originally
backwards because computing uses is tricky, but I found a way to do it
simply and with n lg n complexity. The new scheme also makes it easy
to add new scheduling edges if needed.
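
The general idea of forward scheduling can be sketched with a toy example. This is not the code in this CL: the value type, the priority numbers, and the heap-based ready list are all assumptions made for illustration. Uses are computed in one pass over the arguments, and a value is emitted as soon as all of its arguments have been emitted, lowest priority number first.

```go
package main

import (
	"container/heap"
	"fmt"
)

// value is a toy stand-in for an SSA value (hypothetical type).
// A lower priority number is emitted earlier among ready values.
// args holds the indices of the values this one depends on.
type value struct {
	name     string
	priority int
	args     []int
}

// readyHeap orders ready value indices by priority, then by index.
type readyHeap struct {
	vals []value
	idx  []int
}

func (h *readyHeap) Len() int {
	return len(h.idx)
}

func (h *readyHeap) Less(i, j int) bool {
	a, b := h.idx[i], h.idx[j]
	if h.vals[a].priority != h.vals[b].priority {
		return h.vals[a].priority < h.vals[b].priority
	}
	return a < b
}

func (h *readyHeap) Swap(i, j int) {
	h.idx[i], h.idx[j] = h.idx[j], h.idx[i]
}

func (h *readyHeap) Push(x any) {
	h.idx = append(h.idx, x.(int))
}

func (h *readyHeap) Pop() any {
	n := len(h.idx)
	x := h.idx[n-1]
	h.idx = h.idx[:n-1]
	return x
}

// schedule emits values front to back in dependency order.
func schedule(vals []value) []string {
	// uses[a] lists the values that consume a.
	uses := make([][]int, len(vals))
	// pending[i] counts arguments of i that are not yet emitted.
	pending := make([]int, len(vals))
	for i, v := range vals {
		pending[i] = len(v.args)
		for _, a := range v.args {
			uses[a] = append(uses[a], i)
		}
	}
	h := &readyHeap{vals: vals}
	for i := range vals {
		if pending[i] == 0 {
			heap.Push(h, i)
		}
	}
	var order []string
	for h.Len() > 0 {
		i := heap.Pop(h).(int)
		order = append(order, vals[i].name)
		for _, u := range uses[i] {
			pending[u]--
			if pending[u] == 0 {
				heap.Push(h, u)
			}
		}
	}
	return order
}

// demoValues gives the nil check the most urgent priority.
func demoValues() []value {
	return []value{
		{name: "load", priority: 1},
		{name: "add", priority: 1, args: []int{0}},
		{name: "nilcheck", priority: 0, args: []int{0}},
		{name: "store", priority: 2, args: []int{1, 2}},
	}
}

func main() {
	fmt.Println(schedule(demoValues()))
}
```

In this sketch each value is pushed and popped once, so the heap work is O(n lg n) in the number of values, and adding a scheduling edge amounts to adding one more entry to args.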

Fixes #42673
Update #56568

Change-Id: Ibbb38c52d191f50ce7a94f8c1cbd3cd9b614ea8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/270940
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2023-01-20 04:54:01 +00:00
Keith Randall f959fb3872 cmd/compile: add anchored version of SP
The SPanchored opcode is identical to SP, except that it takes a memory
argument so that it (and more importantly, anything that uses it)
must be scheduled at or after that memory argument.

This opcode ensures that a LEAQ of a variable gets scheduled after the
corresponding VARDEF for that variable.

This may lead to less CSE of LEAQ operations. The effect is very small.
The go binary is only 80 bytes bigger after this CL. Usually LEAQs get
folded into load/store operations, so the effect is limited to types that
contain pointers, are large enough to need a duffzero, and have their
address passed somewhere. Even then, usually the CSEd LEAQs will be
un-CSEd because the two uses are on different sides of a function call
and the LEAQ ends up being rematerialized at the second use anyway.

Change-Id: Ib893562cd05369b91dd563b48fb83f5250950293
Reviewed-on: https://go-review.googlesource.com/c/go/+/452916
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Martin Möhrmann <martin@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-01-19 22:43:12 +00:00
Russ Cox aa51c40b1c runtime: replace panic(nil) with panic(new(runtime.PanicNilError))
Long ago we decided that panic(nil) was too unlikely to bother
making a special case for purposes of recover. Unfortunately,
it has turned out not to be a special case. There are many examples
of code in the Go ecosystem where an author has written panic(nil)
because they want to panic and don't care about the panic value.

Using panic(nil) in this case has the unfortunate behavior of
making recover behave as though the goroutine isn't panicking.
As a result, code like:

	func f() {
		defer func() {
			if err := recover(); err != nil {
				log.Fatalf("panicked! %v", err)
			}
		}()
		call1()
		call2()
	}

looks like it guarantees that call2 has been run any time f returns,
but that turns out not to be strictly true. If call1 does panic(nil),
then f returns "successfully", having recovered the panic, but
without calling call2.

Instead you have to write something like:

	func f() {
		done := false
		defer func() {
			if err := recover(); !done {
				log.Fatalf("panicked! %v", err)
			}
		}()
		call1()
		call2()
		done = true
	}

which defeats nearly the whole point of recover. No one does this,
with the result that almost all uses of recover are subtly broken.

One specific broken use along these lines is in net/http, which
recovers from panics in handlers and sends back an HTTP error.
Users discovered in the early days of Go that panic(nil) was a
convenient way to jump out of a handler up to the serving loop
without sending back an HTTP error. This was a bug, not a feature.
Go 1.8 added panic(http.ErrAbortHandler) as a better way to access the feature.
Any lingering code that uses panic(nil) to abort an HTTP handler
without a failure message should be changed to use http.ErrAbortHandler.

Programs that need the old, unintended behavior from net/http
or other packages can set GODEBUG=panicnil=1 to stop the run-time error.

Uses of recover that want to detect panic(nil) in new programs
can check for recover returning a value of type *runtime.PanicNilError.
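
As a minimal sketch of that check (the helper name describe and its
result strings are hypothetical, not part of this CL), the following
combines the completion flag from the example above with a type
assertion on *runtime.PanicNilError. It assumes a toolchain that
already provides runtime.PanicNilError; the "legacy" branch is only
expected to be reachable with GODEBUG=panicnil=1.

```go
package main

import (
	"fmt"
	"runtime"
)

// describe runs f and reports how it finished. It relies on a completion
// flag, not on the recovered value, to decide whether f panicked, so the
// decision is the same with or without GODEBUG=panicnil=1.
func describe(f func()) (result string) {
	done := false
	defer func() {
		v := recover()
		if done {
			result = "returned normally"
			return
		}
		if _, ok := v.(*runtime.PanicNilError); ok {
			// New behavior: panic(nil) surfaces as *runtime.PanicNilError.
			result = "panicked with nil"
			return
		}
		if v == nil {
			// Old behavior: recover returned nil even though f panicked.
			result = "panicked with nil (legacy)"
			return
		}
		result = fmt.Sprintf("panicked with %v", v)
	}()
	f()
	done = true
	return
}

func main() {
	fmt.Println(describe(func() {}))
	fmt.Println(describe(func() { panic("boom") }))
	fmt.Println(describe(func() { panic(nil) }))
}
```

Keeping the done flag means the helper still detects the panic when
the old behavior is re-enabled; the type assertion only refines the
message.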

Because the new GODEBUG is used inside the runtime, we can't
import internal/godebug, so there is some new machinery to
cross-connect those in this CL, to allow a mutable GODEBUG setting.
That won't be necessary if we add any other mutable GODEBUG settings
in the future. The CL also corrects the handling of defaulted GODEBUG
values in the runtime, for #56986.

Fixes #25448.

Change-Id: I2b39c7e83e4f7aa308777dabf2edae54773e03f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/461956
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
2023-01-19 22:21:50 +00:00
Cuong Manh Le 198074abd7 cmd/compile: fix unsafe.{SliceData,StringData} escape analysis memory corruption
Fixes #57823

Change-Id: I54654d3ecb20b75afa9052c5c9db2072a86188d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/461759
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-18 01:27:21 +00:00
Matthew Dempsky f773bef9ab cmd/compile: fix static init inlining for hidden node fields
Unified IR added several new IR fields for holding *runtime._type
expressions. To avoid throwing off any frontend semantics
(particularly inlining cost heuristics), they were marked as
`mknode:"-"` so that code wouldn't visit them.

Unfortunately, this has a bad interaction with the static init
inlining optimization, because the latter relies on ir.EditChildren to
substitute all parameters. This potentially includes dictionary
parameters, which can appear within the new RType fields.

This CL adds a new ir.EditChildrenWithHidden function that also edits
these fields, and switches staticinit to use it. Longer term, we
should unhide the RType fields so that ir.EditChildren visits them
normally, but that's scarier so late in the release cycle.

Fixes #57778.

Change-Id: I98c1e8cf366156dc0c81a0cb79029cc5e59c476f
Reviewed-on: https://go-review.googlesource.com/c/go/+/461686
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-01-17 22:13:42 +00:00
Keith Randall 9088c691da cmd/compile: ensure temp register mask isn't empty
We need to avoid nospill registers at this point in regalloc.
Make sure that we don't restrict our register set to avoid registers
desired by other instructions, if the resulting set includes only
nospill registers.

Fixes #57846

Change-Id: I05478e4513c484755dc2e8621d73dac868e45a27
Reviewed-on: https://go-review.googlesource.com/c/go/+/461685
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-17 18:21:06 +00:00
Robert Findley 245e95dfab go/types, types2: don't look up fields or methods when expecting a type
As we have seen many times, the type checker must be careful to avoid
accessing named type information before the type is fully set up. We
need a more systematic solution to this problem, but for now avoid one
case that causes a crash: checking a selector expression on an
incomplete type when a type expression is expected.

For golang/go#57522

Change-Id: I7ed31b859cca263276e3a0647d1f1b49670023a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/461577
Run-TryBot: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
2023-01-11 22:29:34 +00:00
Marcel Meyer 841c3eb166 all: fix typos in go file comments
These typos were found by executing grep, aspell, sort, and uniq in
a pipe and searching the resulting list manually for possible typos.

    grep -r --include '*.go' -E '^// .*$' . | aspell list | sort | uniq

Change-Id: I56281eda3b178968fbf104de1f71316c1feac64f
GitHub-Last-Rev: e91c7cee34
GitHub-Pull-Request: golang/go#57669
Reviewed-on: https://go-review.googlesource.com/c/go/+/460767
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-01-09 15:34:31 +00:00
Robert Griesemer 46e3d9d12a cmd/compile: use "satisfies" (not "implements") for constraint errors
Per the latest spec, we distinguish between interface implementation
and constraint satisfaction. Use the verb "satisfy" when reporting
an error about failing constraint satisfaction.

This CL only changes error messages. It has no impact on correct code.

Fixes #57564.

Change-Id: I6dfb3b2093c2e04fe5566628315fb5f6bd709f17
Reviewed-on: https://go-review.googlesource.com/c/go/+/460396
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-01-04 19:07:27 +00:00
Matthew Dempsky 4f8bc6224b cmd/compile: desugar OCALLMETH->OCALLFUNC within devirtualization
Devirtualization can turn OCALLINTER into OCALLMETH, but then we want
to actually desugar into OCALLFUNC instead for later phases. Just
needs a missing call to typecheck.FixMethodCall.

Fixes #57309.

Change-Id: I331fbd40804e1a370134ef17fa6dd501c0920ed3
Reviewed-on: https://go-review.googlesource.com/c/go/+/457715
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-12-14 20:37:17 +00:00
Keith Randall e8f78cb60c cmd/compile: fix conditional select rule
ARM64 maintains booleans in the low byte of registers. Upper parts
of that register are junk.
This rule is using all 32 bits of a boolean-containing register, which
is wrong. Change the rule to only look at the low bit.
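
The difference can be modeled in plain Go. This is only a simulation
under stated assumptions, not the actual SSA rewrite rule: the
register is represented as a uint32, and the function names
condFromAllBits and condFromLowBit are hypothetical.

```go
package main

import "fmt"

// condFromAllBits models the faulty check: the register counts as true
// if any of its 32 bits is set, including junk above the boolean.
func condFromAllBits(reg uint32) bool { return reg != 0 }

// condFromLowBit models the corrected check: only the low bit decides.
func condFromLowBit(reg uint32) bool { return reg&1 != 0 }

func main() {
	// A register holding the boolean false (low byte 0) with junk above it.
	reg := uint32(0xDEADBE00)
	fmt.Println(condFromAllBits(reg)) // true (wrong)
	fmt.Println(condFromLowBit(reg))  // false (correct)
}
```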

Fixes #57184

Change-Id: Ibbef86b2be859df3d06d993db00e1231c481c428
Reviewed-on: https://go-review.googlesource.com/c/go/+/456556
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2022-12-09 21:38:33 +00:00
Keith Randall 1eb0465fa5 cmd/compile: turn off jump tables when spectre retpolines are on
Fixes #57097

Change-Id: I6ab659abbca1ae0ac8710674d39aec116fab0baa
Reviewed-on: https://go-review.googlesource.com/c/go/+/455336
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2022-12-06 05:12:12 +00:00
cui fliter 3a7a528c2d all: fix some comments for method
Change-Id: I4cff6b2a1fed6acdf754539c3c53a61eaa3b3f84
Reviewed-on: https://go-review.googlesource.com/c/go/+/450176
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-12-03 17:08:51 +00:00
Cuong Manh Le c85848a4a6 cmd/compile: fix inline static init with derived types
CL 450136 added handling for simple calls in staticinit. If the body of
the called generic function contains a conversion involving derived
types, that conversion requires a runtime dictionary, so the
optimization cannot be applied.

Fixes #56923

Change-Id: I498cee9f8ab4397812ef79a6c2ab6c55e0ee4aef
Reviewed-on: https://go-review.googlesource.com/c/go/+/453315
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Gabriel Morency (Amgc63spaming) <morencyvincent8@gmail.com>
2022-11-30 23:25:43 +00:00
Keith Randall c8057d8569 cmd/compile: disallow CMOV optimization with ptr arithmetic as an arg
Consider code like:

    if q != nil {
        p = &q.f
    }

Which gets rewritten to a conditional move:

    tmp := &q.f
    p = Select q!=nil, tmp, p

Unfortunately, we can't compute &q.f before we've checked if q is nil,
because if it is nil, &q.f is an invalid pointer (if f's offset is
nonzero but small).

Normally this is not a problem because the tmp variable above
immediately dies, and is thus not live across any safepoint. However,
if later there is another &q.f computation, those two computations are
CSEd, causing tmp to be used at both use points. That will extend
tmp's lifetime, possibly across a call.

Fixes #56990

Change-Id: I3ea31be93feae04fbe3304cb11323194c5df3879
Reviewed-on: https://go-review.googlesource.com/c/go/+/454155
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-11-30 17:46:51 +00:00
Russ Cox f89b39c0af cmd/compile: reenable inlstaticinit
This was disabled in CL 452676 out of an abundance of caution,
but further analysis has shown that the failures were not being
caused by this optimization. Instead the sequence of commits was:

CL 450136 cmd/compile: handle simple inlined calls in staticinit
...
CL 449937 archive/tar, archive/zip: return ErrInsecurePath for unsafe paths
...
CL 451555 cmd/compile: fix static init for inlined calls

The failures in question became compile failures in the first CL
and started building again after the last CL.
But in the interim the code had been broken by the middle CL.
CL 451555 was just the first time that the tests could run and fail.

For #30820.

Change-Id: I65064032355b56fdb43d9731be2f9f32ef6ee600
Reviewed-on: https://go-review.googlesource.com/c/go/+/452817
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-23 21:54:55 +00:00
Matthew Dempsky 152119990f cmd/compile: add -d=inlstaticinit debug flag
This CL adds -d=inlstaticinit to control whether static initialization
of inlined function calls (added in CL 450136) is allowed.

We've needed to fix it once already (CL 451555) and Google-internal
testing is hitting additional failure cases, so putting this
optimization behind a feature flag seems appropriate regardless.

Also, while we diagnose and fix the remaining cases, this CL also
disables the optimization to avoid miscompilations.

Updates #56894.

Change-Id: If52a358ad1e9d6aad1c74fac5a81ff9cfa5a3793
Reviewed-on: https://go-review.googlesource.com/c/go/+/452676
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
2022-11-22 01:42:49 +00:00
Matthew Dempsky 840b346c5d cmd/compile: reject anonymous interface cycles
This CL changes cmd/compile to reject anonymous interface cycles like:

	type I interface { m() interface { I } }

We don't anticipate any users to be affected by this change in
practice. Nonetheless, this CL also adds a `-d=interfacecycles`
compiler flag to suppress the error. And assuming no issue reports
from users, we'll move the check into go/types and types2 instead.

Updates #56103.

Change-Id: I1f1dce2d7aa19fb388312cc020e99cc354afddcb
Reviewed-on: https://go-review.googlesource.com/c/go/+/445598
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2022-11-21 20:15:23 +00:00
Wayne Zuo 8893da7c72 cmd/compile: fix wrong optimization for eliding Not in Phi
The previous rule could move the phi value into the wrong block.
This CL makes it rewrite only the phi value, not the If block,
so that the phi value stays in its original block.

Fixes #56777

Change-Id: I9479a5c7f28529786968413d35b82a16181bb1f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/451496
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-11-18 13:26:33 +00:00
Cuong Manh Le 81c9b1d65f cmd/compile: fix broken IR for iface -> eface
To implement interface to empty interface conversion, the compiler
generates code like:

	var res *uint8
	res = itab
	if res != nil {
		res = res.type
	}

However, itab has type *uintptr, so the assignment is broken. The
problem did not show up until CL 450215, which calls typecheck on this
broken assignment.

To fix this, just cast itab to *uint8 when doing the conversion.

Fixes #56768

Change-Id: Id42792d18e7f382578b40854d46eecd49673792c
Reviewed-on: https://go-review.googlesource.com/c/go/+/451256
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-11-17 19:55:28 +00:00
Cuong Manh Le 249e51e5d9 cmd/compile: fix static init for inlined calls
CL 450136 made the compiler able to handle simple inlined calls in
staticinit. However, it missed a condition when checking whether an
arg can be substituted for a param. If there are any non-trivial
closures, one of them may have captured a param, so the substitution
cannot happen.

Fixes #56778

Change-Id: I427c9134e333e2f9af136c1a124da4d37d326f10
Reviewed-on: https://go-review.googlesource.com/c/go/+/451555
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-11-17 18:31:28 +00:00
Cuong Manh Le 1daa8e2d52 test: remove optimizationOff
CL 426334 removed its only usage, and now we have gcflags_noopt.

Change-Id: I3b33a8c868669deea00bf6dfcf8d81981504e293
Reviewed-on: https://go-review.googlesource.com/c/go/+/451255
Reviewed-by: Joedian Reid <joedian@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-17 16:06:03 +00:00
Russ Cox bed970b3ff cmd/compile: handle integer conversions in static init inliner
Given code like

	func itou(i int) uint { return uint(i) }
	var x = itou(-1)

the static inliner from CL 450136 was rewriting the code to

	var x = uint(-1)

which is not valid Go code. Fix this by converting the
constants appropriately during inlining.
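
A small runnable sketch of the difference, using the names from the
example above: the call form is a run-time conversion that wraps
around, whereas the constant form is rejected at compile time, so it
is left commented out here.

```go
package main

import "fmt"

// itou converts at run time, where a negative int wraps around.
func itou(i int) uint { return uint(i) }

// x is initialized from a call, so it must hold the wrapped value.
var x = itou(-1)

// var y = uint(-1) // rejected by the compiler: the constant -1 does not fit in uint

func main() {
	fmt.Println(x == ^uint(0)) // true
}
```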

Fixes golang.org/x/image/vector test.

Change-Id: I13448df8504c6a70525b1cdc36e2c947e22cdd33
Reviewed-on: https://go-review.googlesource.com/c/go/+/451376
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-17 13:46:05 +00:00
Russ Cox 5947a07d72 test: fix noinit on noopt builder
Fix noopt build break from CL 450136 by not running test.

I can't reproduce the failure locally, but it's entirely reasonable
for this test to fail when optimizations are disabled, so just don't
run it when optimizations are disabled.

Change-Id: I882760fc7373ba0449379f81d295312a6be49be1
Reviewed-on: https://go-review.googlesource.com/c/go/+/450740
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-16 13:26:16 +00:00
Russ Cox b1678e508b cmd/compile: handle simple inlined calls in staticinit
Global variable initializers like

	var myErr error = &myError{"msg"}

have been converted to statically initialized data
from the earliest days of Go: there is no init-time
execution or allocation for that line of code.

But if the expression is moved into an inlinable function,
the static initialization no longer happens.
That is, this code has always executed and allocated
at init time, even after we added inlining to the compiler,
which should in theory make this code equivalent to
the original:

	func NewError(s string) error { return &myError{s} }
	var myErr2 = NewError("msg")

This CL makes the static initialization rewriter understand
inlined functions consisting of a single return statement,
like in this example, so that myErr2 can be implemented as
statically initialized data too, just like myErr, with no init-time
execution or allocation.
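
The following self-contained sketch puts the two forms side by side,
plus a hypothetical newCounted function (not from this CL) whose body
has more than one statement and so falls outside the single-return
pattern described above. The program only shows that all three forms
behave the same; whether a given initializer is actually laid out as
static data depends on the toolchain and its inlining decisions, which
this program does not check.

```go
package main

import "fmt"

type myError struct{ msg string }

func (e *myError) Error() string { return e.msg }

// NewError consists of a single return statement, the shape described above.
func NewError(s string) error { return &myError{s} }

// created counts calls to newCounted.
var created int

// newCounted (hypothetical) has a side effect before the return, so it is
// not a single-return function and must run at init time.
func newCounted(s string) error {
	created++
	return &myError{s}
}

var myErr error = &myError{"msg"} // composite literal initializer
var myErr2 = NewError("msg")      // single-return function call
var myErr3 = newCounted("msg")    // call with a side effect

func main() {
	fmt.Println(myErr.Error(), myErr2.Error(), myErr3.Error(), created)
}
```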

A real example of code that benefits from this rewrite is
all globally declared errors created with errors.New, like

	package io
	var EOF = errors.New("EOF")

Package io no longer has to allocate and initialize EOF each
time a program starts.

Another example of code that benefits is any globally declared
godebug setting (using the API from CL 449504), like

	package http
	var http2server = godebug.New("http2server")

These are no longer allocated and initialized at program startup either.

The list of functions that are inlined into static initializers when
compiling std and cmd (along with how many times each occurs) is:

	cmd/compile/internal/ssa.StringToAux (3)
	cmd/compile/internal/walk.mkmapnames (4)
	errors.New (360)
	go/ast.NewIdent (1)
	go/constant.MakeBool (4)
	go/constant.MakeInt64 (3)
	image.NewUniform (4)
	image/color.ModelFunc (11)
	internal/godebug.New (12)
	vendor/golang.org/x/text/unicode/bidi.newBidiTrie (1)
	vendor/golang.org/x/text/unicode/norm.newNfcTrie (1)
	vendor/golang.org/x/text/unicode/norm.newNfkcTrie (1)

For the cmd/go binary, this CL cuts the number of init-time
allocations from about 1920 to about 1620 (a 15% reduction).

The total executable code footprint of init functions is reduced
by 24kB, from 137kB to 113kB (an 18% reduction).
The overall binary size is reduced by 45kB,
from 15.335MB to 15.290MB (a 0.3% reduction).
(The binary size savings is larger than the executable code savings
because every byte of executable code also requires corresponding
runtime tables for unwinding, source-line mapping, and so on.)

Also merge test/sinit_run.go, which had stopped testing anything
at all as of CL 161337 (Feb 2019) and initempty.go into a new test
noinit.go.

Fixes #30820.

Change-Id: I52f7275b1ac2a0a32e22c29f9095071c7b1fac20
Reviewed-on: https://go-review.googlesource.com/c/go/+/450136
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
2022-11-16 04:04:52 +00:00
Cuong Manh Le 03a1810473 cmd/compile: fix missing typecheck for static initialization slice
CL 440455 fixed a missing walk pass for static initialization of slices.
However, slicelit may produce un-typechecked nodes, so we need to
typecheck sinit before calling walkStmtList.

Fixes #56727

Change-Id: I40730cebcd09f2be4389d71c5a90eb9a060e4ab7
Reviewed-on: https://go-review.googlesource.com/c/go/+/450215
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-11-15 17:35:03 +00:00
Paul E. Murphy dc6b7c86df cmd/compile: merge zero constant ISEL in PPC64 lateLower pass
Add a new SSA opcode ISELZ, similar to ISELB to represent a select
of value or 0. Then, merge candidate ISEL opcodes inside the late
lower pass.

This avoids complicating rules within the lower pass.

Change-Id: I3b14c94b763863aadc834b0e910a85870c131313
Reviewed-on: https://go-review.googlesource.com/c/go/+/442596
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
2022-11-14 19:44:47 +00:00
Cuong Manh Le 73f987c88b test: add regression test for issue 53439
Fixes #53439

Change-Id: I425af0f78153511034e4a4648f32ef8c9378a325
Reviewed-on: https://go-review.googlesource.com/c/go/+/449756
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
2022-11-11 14:48:29 +00:00
Wayne Zuo 268f4629df cmd/compile: enable branchelim pass on loong64
Change-Id: I4fd1c307901c265ab9865bf8a74460ddc15e5d14
Reviewed-on: https://go-review.googlesource.com/c/go/+/416735
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Auto-Submit: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
2022-11-09 06:10:55 +00:00
Matthew Dempsky 9944ba757b cmd/compile: fix transitive inlining of generic functions
If an imported, non-generic function F transitively calls a generic
function G[T], we may need to call CanInline on G[T].

While here, we can also take advantage of the fact that we know G[T]
was already seen and compiled in an imported package, so we don't need
to call InlineCalls or add it to typecheck.Target.Decls. This saves us
from wasting compile time re-creating DUPOK symbols that we know
already exist in the imported package's link objects.

Fixes #56280.

Change-Id: I3336786bee01616ee9f2b18908738e4ca41c8102
Reviewed-on: https://go-review.googlesource.com/c/go/+/443535
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2022-11-08 21:26:09 +00:00
Paul E. Murphy 390abbbbf1 codegen: check for PPC64 ISEL in condmove tests
ISEL is roughly equivalent to CMOV on PPC64. Verify ISEL generation
in all reasonable cases.

Note "ISEL test x y z" is the same as "ISEL !test y x z". test is
always one of LT (0), GT (1), EQ (2), SO (3). Sometimes x and y are
swapped if GE/LE/NE is desired.

Change-Id: Ie1bf029224064e004d855099731fe5e8d05aa990
Reviewed-on: https://go-review.googlesource.com/c/go/+/445215
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2022-11-07 15:19:20 +00:00
Matthew Dempsky aa6240a445 cmd/compile: allow ineffectual //go:linkname in -lang=go1.17 and older
Prior to Go 1.18, ineffectual //go:linkname directives (i.e.,
directives referring to an undeclared name, or to a declared type or
constant) were treated as noops. In Go 1.18, we changed this into a
compiler error to mitigate accidental misuse.

However, the x/sys repo contained ineffectual //go:linkname directives
up until go.dev/cl/274573, which has caused a lot of user confusion.

It seems a bit late to worry about now, but to at least prevent
further user pain, this CL changes the error message to only apply to
modules using "go 1.18" or newer. (The x/sys repo declared "go 1.12"
at the time go.dev/cl/274573 was submitted.)

Fixes #55889.

Change-Id: Id762fff96fd13ba0f1e696929a9e276dfcba2620
Reviewed-on: https://go-review.googlesource.com/c/go/+/447755
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-11-03 20:35:31 +00:00
Paul E. Murphy d031e9e07a cmd/compile/internal/ssa: re-adjust CarryChainTail scheduling priority
This needs to be as low as possible while not breaking priority
assumptions of other scores to correctly schedule carry chains.

Prior to the arm64 changes, it was set below ReadTuple. At the time,
this prevented the MulHiLo implementation on PPC64 from occluding
the scheduling of a full carry chain.

Memory scores can also prevent better scheduling, as can be observed
with crypto/internal/edwards25519/field.feMulGeneric.

Fixes #56497

Change-Id: Ia4b54e6dffcce584faf46b1b8d7cea18a3913887
Reviewed-on: https://go-review.googlesource.com/c/go/+/447435
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2022-11-03 19:59:19 +00:00
Austin Clements 6a44a3aa9f test/bench/go1: eliminate start-up time
The go1 benchmark suite does a lot of work at package init time, which
makes it take quite a while to run even if you're not running any of
the benchmarks, or if you're only running a subset of them. This leads
to an awkward workaround in dist test to compile but not run the
package, unlike roughly all other packages. It also reduces isolation
between benchmarks by affecting the starting heap size of all
benchmarks.

Fix this by initializing all data required by a benchmark when that
benchmark runs, and keeping it local so it gets freed by the GC and
doesn't leak between benchmarks. Now, none of the benchmarks depend on
global state.

Re-initializing the data on each benchmark run does add overhead to an
actual benchmark run, as each benchmark function is called several
times with different values of b.N. A full run of all benchmarks at
the default -benchtime=1s now takes ~10% longer; higher -benchtimes
would be less. It would be quite difficult to cache this data between
invocations of the same benchmark function without leaking between
different benchmarks and affecting GC overheads, as the testing
package doesn't provide any mechanism for this.

This reduces the time to run the binary with no benchmarks from 1.5
seconds to 10 ms, and also reduces the memory required to do this from
342 MiB to 17 MiB.

To make sure data was not leaking between different benchmarks, I ran
the benchmarks with -shuffle=on. The variance remained low: mostly
under 3%. A few benchmarks had higher variance, but in all cases it
was similar to the variance before this change.

This CL naturally changes the measured performance of several of the
benchmarks because it dramatically changes the heap size and hence GC
overheads. However, going forward the benchmarks should be much better
isolated.

For #37486.

Change-Id: I252ebea703a9560706cc1990dc5ad22d1927c7a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/443336
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2022-11-01 17:07:14 +00:00
Keith Randall 9ce27feaeb cmd/compile: add rule for post-decomposed growslice optimization
The recently added rule only works before decomposing slices.
Add a rule that works after decomposing slices.

The reason we need the latter is because although the length may
be a constant, it can be hidden inside a slice that is not constant
(its pointer or capacity might be changing). By applying this
optimization after decomposing slices, we can find more cases
where it applies.

Fixes #56440

Change-Id: I0094e59eee3065ab4d210defdda8227a6e897420
Reviewed-on: https://go-review.googlesource.com/c/go/+/446277
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-10-31 21:40:49 +00:00
Keith Randall 0156b797e6 cmd/compile: recognize when the result of append has a constant length
Fixes a performance regression due to CL 418554.

Fixes #56440

Change-Id: I6ff152e9b83084756363f49ee6b0844a7a284880
Reviewed-on: https://go-review.googlesource.com/c/go/+/445875
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-10-27 17:09:50 +00:00
Keith Randall 8415ec8c98 cmd/compile: in compiler errors, print more digits for floats close to an int
Error messages currently print floats with %.6g, which means that if
you tried to convert something close to, but not quite, an integer, to
an integer, the error you get looks like "cannot convert 1 to type
int", when really you want "cannot convert 0.9999999 to type int".

Add more digits to floats when printing them, to make it clear that they
aren't quite integers. This helps for errors which are the result of not
being an integer. For other errors, it won't hurt much.
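
The rounding is easy to reproduce outside the compiler. In this
sketch the helper names short and full are hypothetical, and
strconv.FormatFloat with precision -1 is used only as one way to show
enough digits; the exact format the compiler switched to is not
stated here.

```go
package main

import (
	"fmt"
	"strconv"
)

// short formats f the way the old error messages did, with %.6g.
func short(f float64) string { return fmt.Sprintf("%.6g", f) }

// full formats f with the fewest digits that still identify it exactly.
func full(f float64) string { return strconv.FormatFloat(f, 'g', -1, 64) }

func main() {
	f := 0.9999999
	fmt.Println(short(f)) // 1
	fmt.Println(full(f))  // 0.9999999
}
```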

Fixes #56220

Change-Id: I7f5873af5993114a61460ef399d15316925a15a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/442935
Reviewed-by: Rob Pike <r@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-10-20 21:52:09 +00:00
Youlin Feng 7ae652b7c0 runtime: replace all uses of CtzXX with TrailingZerosXX
Replace all uses of Ctz64/32/8 with TrailingZeros64/32/8, because they
do the same thing and are redundant duplicates. Also rename the CtzXX
functions in 386 assembly code.
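
For readers outside the runtime, the public math/bits package offers
functions with the same TrailingZerosXX names. The sketch below uses
them as a stand-in (an assumption for illustration; the runtime has
its own internal versions), with a hypothetical lowestSetBit helper.

```go
package main

import (
	"fmt"
	"math/bits"
)

// lowestSetBit returns the index of the lowest set bit in mask,
// or -1 if mask is zero.
func lowestSetBit(mask uint64) int {
	if mask == 0 {
		return -1
	}
	return bits.TrailingZeros64(mask)
}

func main() {
	fmt.Println(lowestSetBit(0b101000))    // 3
	fmt.Println(bits.TrailingZeros32(0))   // 32
	fmt.Println(bits.TrailingZeros8(0x80)) // 7
}
```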

Change-Id: I19290204858083750f4be589bb0923393950ae6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/438935
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-10-18 18:06:27 +00:00
Michael Matloob 6f445a9db5 test: update test/run.go and some tests to use importcfg
Using importcfg instead of depending on the existence of .a files for
standard library packages will enable us to remove the .a files in a
future cl.

Change-Id: I6108384224508bc37d82fd990fc4a8649222502c
Reviewed-on: https://go-review.googlesource.com/c/go/+/440222
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-10-12 23:16:41 +00:00
Cuong Manh Le 4bcf94b023 all: prevent fakePC overflow on 386 in libfuzzer mode
fakePC uses hash.Sum32, which returns a uint32. However, libfuzzer
trace/hook functions declare the fakePC argument as int, causing overflow on
386 archs.

Fix this by changing the fakePC argument to uint to prevent the overflow.
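
The overflow can be modeled on any platform by using int32 in place
of the 32-bit int. This is a sketch only; asInt32 is a hypothetical
helper and the input value is made up for illustration.

```go
package main

import "fmt"

// asInt32 models passing a 32-bit hash through a parameter declared
// as int on a 32-bit architecture.
func asInt32(v uint32) int32 { return int32(v) }

func main() {
	v := uint32(0x80000001) // any value with the high bit set
	fmt.Println(asInt32(v))              // -2147483647
	fmt.Println(uint32(asInt32(v)) == v) // true: the bits survive, the sign does not
}
```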

Fixes #56141

Change-Id: I3994c461319983ab70065f90bf61539a363e0a2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/441996
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2022-10-12 00:12:53 +00:00
Ian Lance Taylor bb2a96b79d test: add test case that caused a bogus error from gofrontend
For #56109

Change-Id: I999763e463fac57732a92f5e396f8fa8c35bd2e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/440297
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2022-10-10 21:47:48 +00:00