Commit graph

8 commits

Author SHA1 Message Date
Austin Clements 2010189407 runtime: remove legacy eager write barrier
Now that the buffered write barrier is implemented for all
architectures, we can remove the old eager write barrier
implementation. This CL removes the implementation from the runtime,
support in the compiler for calling it, and updates some compiler
tests that relied on the old eager barrier support. It also makes sure
that all of the useful comments from the old write barrier
implementation still have a place to live.

Fixes #22460.

Updates #21640 since this fixes the layering concerns of the write
barrier (but not the other things in that issue).

Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74
Reviewed-on: https://go-review.googlesource.com/92705
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-13 16:34:46 +00:00
Austin Clements 7e343134d3 cmd/compile: compiler support for buffered write barrier
This CL implements the compiler support for calling the buffered write
barrier added by the previous CL.

Since the buffered write barrier is only implemented on amd64 right
now, this still supports the old, eager write barrier as well. There's
little overhead to supporting both and this way a few tests in
test/fixedbugs that expect to have liveness maps at write barrier
calls can easily opt-in to the old, eager barrier.

This significantly improves the performance of the write barrier:

name             old time/op  new time/op  delta
WriteBarrier-12  73.5ns ±20%  19.2ns ±27%  -73.90%  (p=0.000 n=19+18)

It also reduces the size of binaries because the write barrier call is
more compact:

name        old object-bytes  new object-bytes  delta
Template           398k ± 0%         393k ± 0%  -1.14%  (p=0.008 n=5+5)
Unicode            208k ± 0%         206k ± 0%  -1.00%  (p=0.008 n=5+5)
GoTypes           1.18M ± 0%        1.15M ± 0%  -2.00%  (p=0.008 n=5+5)
Compiler          4.05M ± 0%        3.88M ± 0%  -4.26%  (p=0.008 n=5+5)
SSA               8.25M ± 0%        8.11M ± 0%  -1.59%  (p=0.008 n=5+5)
Flate              228k ± 0%         224k ± 0%  -1.83%  (p=0.008 n=5+5)
GoParser           295k ± 0%         284k ± 0%  -3.62%  (p=0.008 n=5+5)
Reflect           1.00M ± 0%        0.99M ± 0%  -0.70%  (p=0.008 n=5+5)
Tar                339k ± 0%         333k ± 0%  -1.67%  (p=0.008 n=5+5)
XML                404k ± 0%         395k ± 0%  -2.10%  (p=0.008 n=5+5)
[Geo mean]         704k              690k       -2.00%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.05M ± 0%        1.04M ± 0%  -1.55%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20171027.1

(Amusingly, this also reduces compiler allocations by 0.75%, which,
combined with the better write barrier, speeds up the compiler overall
by 2.10%. See the perf link.)

It slightly improves the performance of most of the go1 benchmarks and
improves the performance of the x/benchmarks:

name                      old time/op    new time/op    delta
BinaryTree17-12              2.40s ± 1%     2.47s ± 1%  +2.69%  (p=0.000 n=19+19)
Fannkuch11-12                2.95s ± 0%     2.95s ± 0%  +0.21%  (p=0.000 n=20+19)
FmtFprintfEmpty-12          41.8ns ± 4%    41.4ns ± 2%  -1.03%  (p=0.014 n=20+20)
FmtFprintfString-12         68.7ns ± 2%    67.5ns ± 1%  -1.75%  (p=0.000 n=20+17)
FmtFprintfInt-12            79.0ns ± 3%    77.1ns ± 1%  -2.40%  (p=0.000 n=19+17)
FmtFprintfIntInt-12          127ns ± 1%     123ns ± 3%  -3.42%  (p=0.000 n=20+20)
FmtFprintfPrefixedInt-12     152ns ± 1%     150ns ± 1%  -1.02%  (p=0.000 n=18+17)
FmtFprintfFloat-12           211ns ± 1%     209ns ± 0%  -0.99%  (p=0.000 n=20+16)
FmtManyArgs-12               500ns ± 0%     496ns ± 0%  -0.73%  (p=0.000 n=17+20)
GobDecode-12                6.44ms ± 1%    6.53ms ± 0%  +1.28%  (p=0.000 n=20+19)
GobEncode-12                5.46ms ± 0%    5.46ms ± 1%    ~     (p=0.550 n=19+20)
Gzip-12                      220ms ± 1%     216ms ± 0%  -1.75%  (p=0.000 n=19+19)
Gunzip-12                   38.8ms ± 0%    38.6ms ± 0%  -0.30%  (p=0.000 n=18+19)
HTTPClientServer-12         79.0µs ± 1%    78.2µs ± 1%  -1.01%  (p=0.000 n=20+20)
JSONEncode-12               11.9ms ± 0%    11.9ms ± 0%  -0.29%  (p=0.000 n=20+19)
JSONDecode-12               52.6ms ± 0%    52.2ms ± 0%  -0.68%  (p=0.000 n=19+20)
Mandelbrot200-12            3.69ms ± 0%    3.68ms ± 0%  -0.36%  (p=0.000 n=20+20)
GoParse-12                  3.13ms ± 1%    3.18ms ± 1%  +1.67%  (p=0.000 n=19+20)
RegexpMatchEasy0_32-12      73.2ns ± 1%    72.3ns ± 1%  -1.19%  (p=0.000 n=19+18)
RegexpMatchEasy0_1K-12       241ns ± 0%     239ns ± 0%  -0.83%  (p=0.000 n=17+16)
RegexpMatchEasy1_32-12      68.6ns ± 1%    69.0ns ± 1%  +0.47%  (p=0.015 n=18+16)
RegexpMatchEasy1_1K-12       364ns ± 0%     361ns ± 0%  -0.67%  (p=0.000 n=16+17)
RegexpMatchMedium_32-12      104ns ± 1%     103ns ± 1%  -0.79%  (p=0.001 n=20+15)
RegexpMatchMedium_1K-12     33.8µs ± 3%    34.0µs ± 2%    ~     (p=0.267 n=20+19)
RegexpMatchHard_32-12       1.64µs ± 1%    1.62µs ± 2%  -1.25%  (p=0.000 n=19+18)
RegexpMatchHard_1K-12       49.2µs ± 0%    48.7µs ± 1%  -0.93%  (p=0.000 n=19+18)
Revcomp-12                   391ms ± 5%     396ms ± 7%    ~     (p=0.154 n=19+19)
Template-12                 63.1ms ± 0%    59.5ms ± 0%  -5.76%  (p=0.000 n=18+19)
TimeParse-12                 307ns ± 0%     306ns ± 0%  -0.39%  (p=0.000 n=19+17)
TimeFormat-12                325ns ± 0%     323ns ± 0%  -0.50%  (p=0.000 n=19+19)
[Geo mean]                  47.3µs         46.9µs       -0.67%

https://perf.golang.org/search?q=upload:20171026.1

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.25ms ± 1%  2.20ms ± 1%  -2.31%  (p=0.000 n=18+18)
HTTP-12                    12.6µs ± 0%  12.6µs ± 0%  -0.72%  (p=0.000 n=18+17)
JSON-12                    11.0ms ± 0%  11.0ms ± 1%  -0.68%  (p=0.000 n=17+19)

https://perf.golang.org/search?q=upload:20171026.2

Updates #14951.
Updates #22460.

Change-Id: Id4c0932890a1d41020071bec73b8522b1367d3e7
Reviewed-on: https://go-review.googlesource.com/73712
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-10-30 18:12:46 +00:00
David Chase 9c066bab64 cmd/compile: mark temps with new AutoTemp flag, and use it.
This is an extension of
https://go-review.googlesource.com/c/31662/
to mark all the temporaries, not just the ssa-generated ones.

Before-and-after ls -l `go tool -n compile` shows a 3%
reduction in size (or rather, a prior 3% inflation for
failing to filter temps out properly.)

Replaced name-dependent "is it a temp?" tests with calls to
*Node.IsAutoTmp(), which depends on AutoTemp.  Also replace
calls to istemp(n) with n.IsAutoTmp(), to reduce duplication
and clean up function name space.  Generated temporaries
now come with a "." prefix to avoid (apparently harmless)
clashes with legal Go variable names.

Fixes #17644.
Fixes #17240.

Change-Id: If1417f29c79a7275d7303ddf859b51472890fd43
Reviewed-on: https://go-review.googlesource.com/32255
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-31 19:38:50 +00:00
Austin Clements c39918a049 cmd/compile: disable various write barrier optimizations
Several of our current write barrier elision optimizations are invalid
with the hybrid barrier. Eliding the hybrid barrier requires that
*both* the current and new pointer be already shaded and, since we
don't have the flow analysis to figure out anything about the slot's
current value, for now we have to just disable several of these
optimizations.

This has a slight impact on binary size. On linux/amd64, the go tool
binary increases by 0.7% and the compile binary increases by 1.5%.

It also has a slight impact on performance, as one would expect. We'll
win some of this back in subsequent commits.

name                      old time/op    new time/op    delta
BinaryTree17-12              2.38s ± 1%     2.40s ± 1%  +0.82%  (p=0.000 n=18+20)
Fannkuch11-12                2.84s ± 1%     2.70s ± 0%  -4.97%  (p=0.000 n=18+18)
FmtFprintfEmpty-12          44.2ns ± 1%    46.4ns ± 2%  +4.89%  (p=0.000 n=16+18)
FmtFprintfString-12          131ns ± 0%     134ns ± 1%  +2.05%  (p=0.000 n=12+19)
FmtFprintfInt-12             114ns ± 1%     117ns ± 1%  +3.26%  (p=0.000 n=19+20)
FmtFprintfIntInt-12          176ns ± 1%     181ns ± 1%  +3.25%  (p=0.000 n=20+20)
FmtFprintfPrefixedInt-12     185ns ± 1%     190ns ± 1%  +2.77%  (p=0.000 n=19+18)
FmtFprintfFloat-12           249ns ± 1%     254ns ± 1%  +1.71%  (p=0.000 n=18+20)
FmtManyArgs-12               747ns ± 1%     743ns ± 1%  -0.58%  (p=0.000 n=19+18)
GobDecode-12                6.57ms ± 1%    6.61ms ± 0%  +0.73%  (p=0.000 n=19+20)
GobEncode-12                5.58ms ± 1%    5.60ms ± 0%  +0.27%  (p=0.001 n=18+18)
Gzip-12                      223ms ± 1%     223ms ± 1%    ~     (p=0.351 n=19+20)
Gunzip-12                   37.9ms ± 0%    37.9ms ± 1%    ~     (p=0.095 n=16+20)
HTTPClientServer-12         77.8µs ± 1%    78.5µs ± 1%  +0.97%  (p=0.000 n=19+20)
JSONEncode-12               14.8ms ± 1%    14.8ms ± 1%    ~     (p=0.079 n=20+19)
JSONDecode-12               53.7ms ± 1%    54.2ms ± 1%  +0.92%  (p=0.000 n=20+19)
Mandelbrot200-12            3.81ms ± 1%    3.81ms ± 0%    ~     (p=0.916 n=19+18)
GoParse-12                  3.19ms ± 1%    3.19ms ± 1%    ~     (p=0.175 n=20+19)
RegexpMatchEasy0_32-12      71.9ns ± 1%    70.6ns ± 1%  -1.87%  (p=0.000 n=19+20)
RegexpMatchEasy0_1K-12       946ns ± 0%     944ns ± 0%  -0.22%  (p=0.000 n=19+16)
RegexpMatchEasy1_32-12      67.3ns ± 2%    66.8ns ± 1%  -0.72%  (p=0.008 n=20+20)
RegexpMatchEasy1_1K-12       374ns ± 1%     384ns ± 1%  +2.69%  (p=0.000 n=18+20)
RegexpMatchMedium_32-12      107ns ± 1%     107ns ± 1%    ~     (p=1.000 n=20+20)
RegexpMatchMedium_1K-12     34.3µs ± 1%    34.6µs ± 1%  +0.90%  (p=0.000 n=20+20)
RegexpMatchHard_32-12       1.78µs ± 1%    1.80µs ± 1%  +1.45%  (p=0.000 n=20+19)
RegexpMatchHard_1K-12       53.6µs ± 0%    54.5µs ± 1%  +1.52%  (p=0.000 n=19+18)
Revcomp-12                   417ms ± 5%     391ms ± 1%  -6.42%  (p=0.000 n=16+19)
Template-12                 61.1ms ± 1%    64.2ms ± 0%  +5.07%  (p=0.000 n=19+20)
TimeParse-12                 302ns ± 1%     305ns ± 1%  +0.90%  (p=0.000 n=18+18)
TimeFormat-12                319ns ± 1%     315ns ± 1%  -1.25%  (p=0.000 n=18+18)
[Geo mean]                  54.0µs         54.3µs       +0.58%

name         old time/op  new time/op  delta
XGarbage-12  2.24ms ± 2%  2.28ms ± 1%  +1.68%  (p=0.000 n=18+17)
XHTTP-12     11.4µs ± 1%  11.6µs ± 2%  +1.63%  (p=0.000 n=18+18)
XJSON-12     11.6ms ± 0%  12.5ms ± 0%  +7.84%  (p=0.000 n=18+17)

Updates #17503.

Change-Id: I1899f8e35662971e24bf692b517dfbe2b533c00c
Reviewed-on: https://go-review.googlesource.com/31572
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-28 20:05:58 +00:00
Austin Clements ad5fd2872f test: simplify fixedbugs/issue15747.go
The error check patterns in this test are more complex than necessary
because f2 gets inlined into f1. This behavior isn't important to the
test, so disable inlining of f2 and simplify the error check patterns.

Change-Id: Ia8aee92a52f9217ad71b89b2931494047e8d2185
Reviewed-on: https://go-review.googlesource.com/31132
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-15 21:27:45 +00:00
Keith Randall ca4089ad62 cmd/compile: args no longer live until end-of-function
We're dropping this behavior in favor of runtime.KeepAlive.
Implement runtime.KeepAlive as an intrinsic.

Update #15843

Change-Id: Ib60225bd30d6770ece1c3c7d1339a06aa25b1cbc
Reviewed-on: https://go-review.googlesource.com/28310
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-19 16:54:35 +00:00
David Chase 5b9ff11c3d cmd/compile: ppc64le working, not optimized enough
This time with the cherry-pick from the proper patch of
the old CL.

Stack size increased.
Corrected NaN-comparison glitches.
Marked g register as clobbered by calls.
Fixed shared libraries.

live_ssa.go still disabled because of differences.
Presumably turning on more optimization will fix
both the stack size and the live_ssa.go glitches.

Enhanced debugging output for shared libs test.

Rebased onto master.

Updates #16010.

Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4
Reviewed-on: https://go-review.googlesource.com/27159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-08-18 16:34:47 +00:00
Russ Cox b6dc3e6f66 cmd/compile: fix liveness computation for heap-escaped parameters
The liveness computation of parameters generally was never
correct, but forcing all parameters to be live throughout the
function covered up that problem. The new SSA back end is
too clever: even though it currently keeps the parameter values live
throughout the function, it may find optimizations that mean
the current values are not written back to the original parameter
stack slots immediately or ever (for example if a parameter is set
to nil, SSA constant propagation may replace all later uses of the
parameter with a constant nil, eliminating the need to write the nil
value back to the stack slot), so the liveness code must now
track the actual operations on the stack slots, exposing these
problems.

One small problem in the handling of arguments is that nodarg
can return ONAME PPARAM nodes with adjusted offsets, so that
there are actually multiple *Node pointers for the same parameter
in the instruction stream. This might be possible to correct, but
not in this CL. For now, we fix this by using n.Orig instead of n
when considering PPARAM and PPARAMOUT nodes.

The major problem in the handling of arguments is general
confusion in the liveness code about the meaning of PPARAM|PHEAP
and PPARAMOUT|PHEAP nodes, especially as contrasted with PAUTO|PHEAP.
The difference between these two is that when a local variable "moves"
to the heap, it's really just allocated there to start with; in contrast,
when an argument moves to the heap, the actual data has to be copied
there from the stack at the beginning of the function, and when a
result "moves" to the heap the value in the heap has to be copied
back to the stack when the function returns
This general confusion is also present in the SSA back end.

The PHEAP bit worked decently when I first introduced it 7 years ago (!)
in 391425ae. The back end did nothing sophisticated, and in particular
there was no analysis at all: no escape analysis, no liveness analysis,
and certainly no SSA back end. But the complications caused in the
various downstream consumers suggest that this should be a detail
kept mainly in the front end.

This CL therefore eliminates both the PHEAP bit and even the idea of
"heap variables" from the back ends.

First, it replaces the PPARAM|PHEAP, PPARAMOUT|PHEAP, and PAUTO|PHEAP
variable classes with the single PAUTOHEAP, a pseudo-class indicating
a variable maintained on the heap and available by indirecting a
local variable kept on the stack (a plain PAUTO).

Second, walkexpr replaces all references to PAUTOHEAP variables
with indirections of the corresponding PAUTO variable.
The back ends and the liveness code now just see plain indirected
variables. This may actually produce better code, but the real goal
here is to eliminate these little-used and somewhat suspect code
paths in the back end analyses.

The OPARAM node type goes away too.

A followup CL will do the same to PPARAMREF. I'm not sure that
the back ends (SSA in particular) are handling those right either,
and with the framework established in this CL that change is trivial
and the result clearly more correct.

Fixes #15747.

Change-Id: I2770b1ce3cbc93981bfc7166be66a9da12013d74
Reviewed-on: https://go-review.googlesource.com/23393
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-27 03:19:52 +00:00