languages/go - HydraGit

mirror of https://github.com/golang/go synced 2024-11-02 13:42:29 +00:00

Author	SHA1	Message	Date
Ben Shi	b444215116	cmd/compile: optimize ARM64's code with MADD/MSUB MADD does MUL-ADD in a single instruction, and MSUB does the similiar simplification for MUL-SUB. The CL implements the optimization with MADD/MSUB. 1. The total size of pkg/android_arm64/ decreases about 20KB, excluding cmd/compile/. 2. The go1 benchmark shows a little improvement for RegexpMatchHard_32-4 and Template-4, excluding noise. name old time/op new time/op delta BinaryTree17-4 16.3s ± 1% 16.5s ± 1% +1.41% (p=0.000 n=26+28) Fannkuch11-4 8.79s ± 1% 8.76s ± 0% -0.36% (p=0.000 n=26+28) FmtFprintfEmpty-4 172ns ± 0% 172ns ± 0% ~ (all equal) FmtFprintfString-4 362ns ± 1% 364ns ± 0% +0.55% (p=0.000 n=30+30) FmtFprintfInt-4 416ns ± 0% 416ns ± 0% ~ (p=0.099 n=22+30) FmtFprintfIntInt-4 655ns ± 1% 660ns ± 1% +0.76% (p=0.000 n=30+30) FmtFprintfPrefixedInt-4 810ns ± 0% 809ns ± 0% -0.08% (p=0.009 n=29+29) FmtFprintfFloat-4 1.08µs ± 0% 1.09µs ± 0% +0.61% (p=0.000 n=30+29) FmtManyArgs-4 2.70µs ± 0% 2.69µs ± 0% -0.23% (p=0.000 n=29+28) GobDecode-4 32.2ms ± 1% 32.1ms ± 1% -0.39% (p=0.000 n=27+26) GobEncode-4 27.4ms ± 2% 27.4ms ± 1% ~ (p=0.864 n=28+28) Gzip-4 1.53s ± 1% 1.52s ± 1% -0.30% (p=0.031 n=29+29) Gunzip-4 146ms ± 0% 146ms ± 0% -0.14% (p=0.001 n=25+30) HTTPClientServer-4 1.00ms ± 4% 0.98ms ± 6% -1.65% (p=0.001 n=29+30) JSONEncode-4 67.3ms ± 1% 67.2ms ± 1% ~ (p=0.520 n=28+28) JSONDecode-4 329ms ± 5% 330ms ± 4% ~ (p=0.142 n=30+30) Mandelbrot200-4 17.3ms ± 0% 17.3ms ± 0% ~ (p=0.055 n=26+29) GoParse-4 16.9ms ± 1% 17.0ms ± 1% +0.82% (p=0.000 n=30+30) RegexpMatchEasy0_32-4 382ns ± 0% 382ns ± 0% ~ (all equal) RegexpMatchEasy0_1K-4 1.33µs ± 0% 1.33µs ± 0% -0.25% (p=0.000 n=30+27) RegexpMatchEasy1_32-4 361ns ± 0% 361ns ± 0% -0.08% (p=0.002 n=30+28) RegexpMatchEasy1_1K-4 2.11µs ± 0% 2.09µs ± 0% -0.54% (p=0.000 n=30+29) RegexpMatchMedium_32-4 594ns ± 0% 592ns ± 0% -0.32% (p=0.000 n=30+30) RegexpMatchMedium_1K-4 173µs ± 0% 172µs ± 0% -0.77% (p=0.000 n=29+27) RegexpMatchHard_32-4 10.4µs ± 0% 10.1µs ± 0% -3.63% (p=0.000 n=28+27) RegexpMatchHard_1K-4 306µs ± 0% 301µs ± 0% -1.64% (p=0.000 n=29+30) Revcomp-4 2.51s ± 1% 2.52s ± 0% +0.18% (p=0.017 n=26+27) Template-4 394ms ± 3% 382ms ± 3% -3.22% (p=0.000 n=28+28) TimeParse-4 1.67µs ± 0% 1.67µs ± 0% +0.05% (p=0.030 n=27+30) TimeFormat-4 1.72µs ± 0% 1.70µs ± 0% -0.79% (p=0.000 n=28+26) [Geo mean] 259µs 259µs -0.33% name old speed new speed delta GobDecode-4 23.8MB/s ± 1% 23.9MB/s ± 1% +0.40% (p=0.001 n=27+26) GobEncode-4 28.0MB/s ± 2% 28.0MB/s ± 1% ~ (p=0.863 n=28+28) Gzip-4 12.7MB/s ± 1% 12.7MB/s ± 1% +0.32% (p=0.026 n=29+29) Gunzip-4 133MB/s ± 0% 133MB/s ± 0% +0.15% (p=0.001 n=24+30) JSONEncode-4 28.8MB/s ± 1% 28.9MB/s ± 1% ~ (p=0.475 n=28+28) JSONDecode-4 5.89MB/s ± 4% 5.87MB/s ± 5% ~ (p=0.174 n=29+30) GoParse-4 3.43MB/s ± 0% 3.40MB/s ± 1% -0.83% (p=0.000 n=28+30) RegexpMatchEasy0_32-4 83.6MB/s ± 0% 83.6MB/s ± 0% ~ (p=0.848 n=28+29) RegexpMatchEasy0_1K-4 768MB/s ± 0% 770MB/s ± 0% +0.25% (p=0.000 n=30+27) RegexpMatchEasy1_32-4 88.5MB/s ± 0% 88.5MB/s ± 0% ~ (p=0.086 n=29+29) RegexpMatchEasy1_1K-4 486MB/s ± 0% 489MB/s ± 0% +0.54% (p=0.000 n=30+29) RegexpMatchMedium_32-4 1.68MB/s ± 0% 1.69MB/s ± 0% +0.60% (p=0.000 n=30+23) RegexpMatchMedium_1K-4 5.90MB/s ± 0% 5.95MB/s ± 0% +0.85% (p=0.000 n=18+20) RegexpMatchHard_32-4 3.07MB/s ± 0% 3.18MB/s ± 0% +3.72% (p=0.000 n=29+26) RegexpMatchHard_1K-4 3.35MB/s ± 0% 3.40MB/s ± 0% +1.69% (p=0.000 n=30+30) Revcomp-4 101MB/s ± 0% 101MB/s ± 0% -0.18% (p=0.018 n=26+27) Template-4 4.92MB/s ± 4% 5.09MB/s ± 3% +3.31% (p=0.000 n=28+28) [Geo mean] 22.4MB/s 22.6MB/s +0.62% Change-Id: I8f304b272785739f57b3c8f736316f658f8c1b2a Reviewed-on: https://go-review.googlesource.com/129119 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-09-04 20:41:58 +00:00
Matthew Dempsky	f7a633aa79	cmd/compile: use "N variables but M values" error for OAS Makes the error message more consistent between OAS and OAS2. Fixes #26616. Change-Id: I07ab46c5ef8a37efb2cb557632697f5d1bf789f7 Reviewed-on: https://go-review.googlesource.com/131280 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-09-04 19:05:56 +00:00
Alexey Naidonov	669fa8f36a	cmd/compile: remove unnecessary nil-check Removes unnecessary nil-check when referencing offset from an address. Suggested by Keith Randall in golang/go#27180. Updates golang/go#27180 Change-Id: I326ed7fda7cfa98b7e4354c811900707fee26021 Reviewed-on: https://go-review.googlesource.com/131735 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-09-04 17:44:14 +00:00
Iskander Sharipov	67ac554d79	cmd/compile/internal/gc: fix invalid positions for sink nodes in esc.go Make OAS2 and OAS2FUNC sink locations point to the assignment position, not the nth LHS position. Fixes #26987 Change-Id: Ibeb9df2da754da8b6638fe1e49e813f37515c13c Reviewed-on: https://go-review.googlesource.com/129315 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-09-03 16:46:59 +00:00
Michael Munday	6f9b94ab66	cmd/compile: implement OnesCount{8,16,32,64} intrinsics on s390x This CL implements the math/bits.OnesCount{8,16,32,64} functions as intrinsics on s390x using the 'population count' (popcnt) instruction. This instruction was released as the 'population-count' facility which uses the same facility bit (45) as the 'distinct-operands' facility which is a pre-requisite for Go on s390x. We can therefore use it without a feature check. The s390x popcnt instruction treats a 64 bit register as a vector of 8 bytes, summing the number of ones in each byte individually. It then writes the results to the corresponding bytes in the output register. Therefore to implement OnesCount{16,32,64} we need to sum the individual byte counts using some extra instructions. To do this efficiently I've added some additional pseudo operations to the s390x SSA backend. Unlike other architectures the new instruction sequence is faster for OnesCount8, so that is implemented using the intrinsic. name old time/op new time/op delta OnesCount 3.21ns ± 1% 1.35ns ± 0% -58.00% (p=0.000 n=20+20) OnesCount8 0.91ns ± 1% 0.81ns ± 0% -11.43% (p=0.000 n=20+20) OnesCount16 1.51ns ± 3% 1.21ns ± 0% -19.71% (p=0.000 n=20+17) OnesCount32 1.91ns ± 0% 1.12ns ± 1% -41.60% (p=0.000 n=19+20) OnesCount64 3.18ns ± 4% 1.35ns ± 0% -57.52% (p=0.000 n=20+20) Change-Id: Id54f0bd28b6db9a887ad12c0d72fcc168ef9c4e0 Reviewed-on: https://go-review.googlesource.com/114675 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-09-03 14:35:38 +00:00
Iskander Sharipov	ff468a43be	cmd/compile/internal/gc: better handling of self-assignments in esc.go Teach escape analysis to recognize these assignment patterns as not causing the src to leak: val.x = val.y val.x[i] = val.y[j] val.x1.x2 = val.x1.y2 ... etc Helps to avoid "leaking param" with assignments showed above. The implementation is based on somewhat similiar xs=xs[a:b] special case that is ignored by the escape analysis. We may figure out more generalized version of this, but this one looks like a safe step into that direction. Updates #14858 Change-Id: I6fe5bfedec9c03bdc1d7624883324a523bd11fde Reviewed-on: https://go-review.googlesource.com/126395 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-09-03 14:28:51 +00:00
Giovanni Bajo	dd5e9b32ff	cmd/compile: add testcase for #24876 This is still not fixed, the testcase reflects that there are still a few boundchecks. Let's fix the good alternative with an explicit test though. Updates #24876 Change-Id: I4da35eb353e19052bd7b69ea6190a69ced8b9b3d Reviewed-on: https://go-review.googlesource.com/107355 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-09-02 10:34:51 +00:00
Giovanni Bajo	f02cc88f46	test: relax whitespaces matching in codegen tests The codegen testsuite uses regexp to parse the syntax, but it doesn't have a way to tell line comments containing checks from line comments containing English sentences. This means that any syntax error (that is, non-matching regexp) is currently ignored and not reported. There were some tests in memcombine.go that had an extraneous space and were thus effectively disabled. It would be great if we could report it as a syntax error, but for now we just punt and swallow the spaces as a workaround, to avoid the same mistake again. Fixes #25452 Change-Id: Ic7747a2278bc00adffd0c199ce40937acbbc9cf0 Reviewed-on: https://go-review.googlesource.com/113835 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-09-02 10:31:37 +00:00
Giovanni Bajo	09ea3c08e8	cmd/compile: in prove, fix fence-post implications for unsigned domain Fence-post implications of the form "x-1 >= w && x > min ⇒ x > w" were not correctly handling unsigned domain, by always checking signed limits. This bug was uncovered once we taught prove that len(x) is always >= 0 in the signed domain. In the code being miscompiled (s[len(s)-1]), prove checks whether len(s)-1 >= len(s) in the unsigned domain; if it proves that this is always false, it can remove the bound check. Notice that len(s)-1 >= len(s) can be true for len(s) = 0 because of the wrap-around, so this is something prove should not be able to deduce. But because of the bug, the gate condition for the fence-post implication was len(s) > MinInt64 instead of len(s) > 0; that condition would be good in the signed domain but not in the unsigned domain. And since in CL105635 we taught prove that len(s) >= 0, the condition incorrectly triggered (len(s) >= 0 > MinInt64) and things were going downfall. Fixes #27251 Fixes #27289 Change-Id: I3dbcb1955ac5a66a0dcbee500f41e8d219409be5 Reviewed-on: https://go-review.googlesource.com/132495 Reviewed-by: Keith Randall <khr@golang.org>	2018-08-31 08:54:38 +00:00
Andrew Bonventre	5ac2476748	cmd/compile: make math/bits.RotateLeft* an intrinsic on amd64 Previously, pattern matching was good enough to achieve good performance for the RotateLeft* functions, but the inlining cost for them was much too high. Make RotateLeft* intrinsic on amd64 as a stop-gap for now to reduce inlining costs. This should be done (or at least looked at) for other architectures as well. Updates golang/go#17566 Change-Id: I6a106ff00b6c4e3f490650af3e083ed2be00c819 Reviewed-on: https://go-review.googlesource.com/132435 Run-TryBot: Andrew Bonventre <andybons@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-08-30 22:48:28 +00:00
Cherry Zhang	289dce24c1	test: fix nosplit test on 386 The 120->124 change in https://go-review.googlesource.com/c/go/+/61511/21/test/nosplit.go#143 looks accidental. Change back to 120. Change-Id: I1690a8ae2d32756ba05544d2ed1baabfa64e1704 Reviewed-on: https://go-review.googlesource.com/131958 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-08-30 12:06:34 +00:00
Cherry Zhang	54f9c0416a	cmd/compile: count nil check as use in dead auto elim Nil check is special in that it has no use but we must keep it. Count it as a use of the auto. Fixes #27278. Change-Id: I857c3d0db2ebdca1bc342b4993c0dac5c01e067f Reviewed-on: https://go-review.googlesource.com/131955 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-08-30 02:19:37 +00:00
Zheng Xu	8f4fd3f34e	build: support frame-pointer for arm64 Supporting frame-pointer makes Linux's perf and other profilers much more useful because it lets them gather a stack trace efficiently on profiling events. Major changes include: 1. save FP on the word below where RSP is pointing to (proposed by Cherry and Austin) 2. adjust some specific offsets in runtime assembly and wrapper code 3. add support to FP in goroutine scheduler 4. adjust link stack overflow check to take the extra word into account 5. adjust nosplit test cases to enable frame sizes which are 16 bytes aligned Performance impacts on go1 benchmarks: Enable frame-pointer (by default) name old time/op new time/op delta BinaryTree17-46 5.94s ± 0% 6.00s ± 0% +1.03% (p=0.029 n=4+4) Fannkuch11-46 2.84s ± 1% 2.77s ± 0% -2.58% (p=0.008 n=5+5) FmtFprintfEmpty-46 55.0ns ± 1% 58.9ns ± 1% +7.06% (p=0.008 n=5+5) FmtFprintfString-46 102ns ± 0% 105ns ± 0% +2.94% (p=0.008 n=5+5) FmtFprintfInt-46 118ns ± 0% 117ns ± 1% -1.19% (p=0.000 n=4+5) FmtFprintfIntInt-46 181ns ± 0% 182ns ± 1% ~ (p=0.444 n=5+5) FmtFprintfPrefixedInt-46 215ns ± 1% 214ns ± 0% ~ (p=0.254 n=5+4) FmtFprintfFloat-46 292ns ± 0% 296ns ± 0% +1.46% (p=0.029 n=4+4) FmtManyArgs-46 720ns ± 0% 732ns ± 0% +1.72% (p=0.008 n=5+5) GobDecode-46 9.82ms ± 1% 10.03ms ± 2% +2.10% (p=0.008 n=5+5) GobEncode-46 8.14ms ± 0% 8.72ms ± 1% +7.14% (p=0.008 n=5+5) Gzip-46 420ms ± 0% 424ms ± 0% +0.92% (p=0.008 n=5+5) Gunzip-46 48.2ms ± 0% 48.4ms ± 0% +0.41% (p=0.008 n=5+5) HTTPClientServer-46 201µs ± 4% 201µs ± 0% ~ (p=0.730 n=5+4) JSONEncode-46 17.1ms ± 0% 17.7ms ± 1% +3.80% (p=0.008 n=5+5) JSONDecode-46 88.0ms ± 0% 90.1ms ± 0% +2.42% (p=0.008 n=5+5) Mandelbrot200-46 5.06ms ± 0% 5.07ms ± 0% ~ (p=0.310 n=5+5) GoParse-46 5.04ms ± 0% 5.12ms ± 0% +1.53% (p=0.008 n=5+5) RegexpMatchEasy0_32-46 117ns ± 0% 117ns ± 0% ~ (all equal) RegexpMatchEasy0_1K-46 332ns ± 0% 329ns ± 0% -0.78% (p=0.008 n=5+5) RegexpMatchEasy1_32-46 104ns ± 0% 113ns ± 0% +8.65% (p=0.029 n=4+4) RegexpMatchEasy1_1K-46 563ns ± 0% 569ns ± 0% +1.10% (p=0.008 n=5+5) RegexpMatchMedium_32-46 167ns ± 2% 177ns ± 1% +5.74% (p=0.008 n=5+5) RegexpMatchMedium_1K-46 49.5µs ± 0% 53.4µs ± 0% +7.81% (p=0.008 n=5+5) RegexpMatchHard_32-46 2.56µs ± 1% 2.72µs ± 0% +6.01% (p=0.008 n=5+5) RegexpMatchHard_1K-46 77.0µs ± 0% 81.8µs ± 0% +6.24% (p=0.016 n=5+4) Revcomp-46 631ms ± 1% 627ms ± 1% ~ (p=0.095 n=5+5) Template-46 81.8ms ± 0% 86.3ms ± 0% +5.55% (p=0.008 n=5+5) TimeParse-46 423ns ± 0% 432ns ± 0% +2.32% (p=0.008 n=5+5) TimeFormat-46 478ns ± 2% 497ns ± 1% +3.89% (p=0.008 n=5+5) [Geo mean] 71.6µs 73.3µs +2.45% name old speed new speed delta GobDecode-46 78.1MB/s ± 1% 76.6MB/s ± 2% -2.04% (p=0.008 n=5+5) GobEncode-46 94.3MB/s ± 0% 88.0MB/s ± 1% -6.67% (p=0.008 n=5+5) Gzip-46 46.2MB/s ± 0% 45.8MB/s ± 0% -0.91% (p=0.008 n=5+5) Gunzip-46 403MB/s ± 0% 401MB/s ± 0% -0.41% (p=0.008 n=5+5) JSONEncode-46 114MB/s ± 0% 109MB/s ± 1% -3.66% (p=0.008 n=5+5) JSONDecode-46 22.0MB/s ± 0% 21.5MB/s ± 0% -2.35% (p=0.008 n=5+5) GoParse-46 11.5MB/s ± 0% 11.3MB/s ± 0% -1.51% (p=0.008 n=5+5) RegexpMatchEasy0_32-46 272MB/s ± 0% 272MB/s ± 1% ~ (p=0.190 n=4+5) RegexpMatchEasy0_1K-46 3.08GB/s ± 0% 3.11GB/s ± 0% +0.77% (p=0.008 n=5+5) RegexpMatchEasy1_32-46 306MB/s ± 0% 283MB/s ± 0% -7.63% (p=0.029 n=4+4) RegexpMatchEasy1_1K-46 1.82GB/s ± 0% 1.80GB/s ± 0% -1.07% (p=0.008 n=5+5) RegexpMatchMedium_32-46 5.99MB/s ± 0% 5.64MB/s ± 1% -5.77% (p=0.016 n=4+5) RegexpMatchMedium_1K-46 20.7MB/s ± 0% 19.2MB/s ± 0% -7.25% (p=0.008 n=5+5) RegexpMatchHard_32-46 12.5MB/s ± 1% 11.8MB/s ± 0% -5.66% (p=0.008 n=5+5) RegexpMatchHard_1K-46 13.3MB/s ± 0% 12.5MB/s ± 1% -6.01% (p=0.008 n=5+5) Revcomp-46 402MB/s ± 1% 405MB/s ± 1% ~ (p=0.095 n=5+5) Template-46 23.7MB/s ± 0% 22.5MB/s ± 0% -5.25% (p=0.008 n=5+5) [Geo mean] 82.2MB/s 79.6MB/s -3.26% Disable frame-pointer (GOEXPERIMENT=noframepointer) name old time/op new time/op delta BinaryTree17-46 5.94s ± 0% 5.96s ± 0% +0.39% (p=0.029 n=4+4) Fannkuch11-46 2.84s ± 1% 2.79s ± 1% -1.68% (p=0.008 n=5+5) FmtFprintfEmpty-46 55.0ns ± 1% 55.2ns ± 3% ~ (p=0.794 n=5+5) FmtFprintfString-46 102ns ± 0% 103ns ± 0% +0.98% (p=0.016 n=5+4) FmtFprintfInt-46 118ns ± 0% 115ns ± 0% -2.54% (p=0.029 n=4+4) FmtFprintfIntInt-46 181ns ± 0% 179ns ± 0% -1.10% (p=0.000 n=5+4) FmtFprintfPrefixedInt-46 215ns ± 1% 213ns ± 0% ~ (p=0.143 n=5+4) FmtFprintfFloat-46 292ns ± 0% 300ns ± 0% +2.83% (p=0.029 n=4+4) FmtManyArgs-46 720ns ± 0% 739ns ± 0% +2.64% (p=0.008 n=5+5) GobDecode-46 9.82ms ± 1% 9.78ms ± 1% ~ (p=0.151 n=5+5) GobEncode-46 8.14ms ± 0% 8.12ms ± 1% ~ (p=0.690 n=5+5) Gzip-46 420ms ± 0% 420ms ± 0% ~ (p=0.548 n=5+5) Gunzip-46 48.2ms ± 0% 48.0ms ± 0% -0.33% (p=0.032 n=5+5) HTTPClientServer-46 201µs ± 4% 199µs ± 3% ~ (p=0.548 n=5+5) JSONEncode-46 17.1ms ± 0% 17.2ms ± 0% ~ (p=0.056 n=5+5) JSONDecode-46 88.0ms ± 0% 88.6ms ± 0% +0.64% (p=0.008 n=5+5) Mandelbrot200-46 5.06ms ± 0% 5.07ms ± 0% ~ (p=0.548 n=5+5) GoParse-46 5.04ms ± 0% 5.07ms ± 0% +0.65% (p=0.008 n=5+5) RegexpMatchEasy0_32-46 117ns ± 0% 112ns ± 4% -4.27% (p=0.016 n=4+5) RegexpMatchEasy0_1K-46 332ns ± 0% 330ns ± 1% ~ (p=0.095 n=5+5) RegexpMatchEasy1_32-46 104ns ± 0% 110ns ± 1% +5.29% (p=0.029 n=4+4) RegexpMatchEasy1_1K-46 563ns ± 0% 567ns ± 2% ~ (p=0.151 n=5+5) RegexpMatchMedium_32-46 167ns ± 2% 166ns ± 0% ~ (p=0.333 n=5+4) RegexpMatchMedium_1K-46 49.5µs ± 0% 49.6µs ± 0% ~ (p=0.841 n=5+5) RegexpMatchHard_32-46 2.56µs ± 1% 2.49µs ± 0% -2.81% (p=0.008 n=5+5) RegexpMatchHard_1K-46 77.0µs ± 0% 75.8µs ± 0% -1.55% (p=0.008 n=5+5) Revcomp-46 631ms ± 1% 628ms ± 0% ~ (p=0.095 n=5+5) Template-46 81.8ms ± 0% 84.3ms ± 1% +3.05% (p=0.008 n=5+5) TimeParse-46 423ns ± 0% 425ns ± 0% +0.52% (p=0.008 n=5+5) TimeFormat-46 478ns ± 2% 478ns ± 1% ~ (p=1.000 n=5+5) [Geo mean] 71.6µs 71.6µs -0.01% name old speed new speed delta GobDecode-46 78.1MB/s ± 1% 78.5MB/s ± 1% ~ (p=0.151 n=5+5) GobEncode-46 94.3MB/s ± 0% 94.5MB/s ± 1% ~ (p=0.690 n=5+5) Gzip-46 46.2MB/s ± 0% 46.2MB/s ± 0% ~ (p=0.571 n=5+5) Gunzip-46 403MB/s ± 0% 404MB/s ± 0% +0.33% (p=0.032 n=5+5) JSONEncode-46 114MB/s ± 0% 113MB/s ± 0% ~ (p=0.056 n=5+5) JSONDecode-46 22.0MB/s ± 0% 21.9MB/s ± 0% -0.64% (p=0.008 n=5+5) GoParse-46 11.5MB/s ± 0% 11.4MB/s ± 0% -0.64% (p=0.008 n=5+5) RegexpMatchEasy0_32-46 272MB/s ± 0% 285MB/s ± 4% +4.74% (p=0.016 n=4+5) RegexpMatchEasy0_1K-46 3.08GB/s ± 0% 3.10GB/s ± 1% ~ (p=0.151 n=5+5) RegexpMatchEasy1_32-46 306MB/s ± 0% 290MB/s ± 1% -5.21% (p=0.029 n=4+4) RegexpMatchEasy1_1K-46 1.82GB/s ± 0% 1.81GB/s ± 2% ~ (p=0.151 n=5+5) RegexpMatchMedium_32-46 5.99MB/s ± 0% 6.02MB/s ± 1% ~ (p=0.063 n=4+5) RegexpMatchMedium_1K-46 20.7MB/s ± 0% 20.7MB/s ± 0% ~ (p=0.659 n=5+5) RegexpMatchHard_32-46 12.5MB/s ± 1% 12.8MB/s ± 0% +2.88% (p=0.008 n=5+5) RegexpMatchHard_1K-46 13.3MB/s ± 0% 13.5MB/s ± 0% +1.58% (p=0.008 n=5+5) Revcomp-46 402MB/s ± 1% 405MB/s ± 0% ~ (p=0.095 n=5+5) Template-46 23.7MB/s ± 0% 23.0MB/s ± 1% -2.95% (p=0.008 n=5+5) [Geo mean] 82.2MB/s 82.3MB/s +0.04% Frame-pointer is enabled on Linux by default but can be disabled by setting: GOEXPERIMENT=noframepointer. Fixes #10110 Change-Id: I1bfaca6dba29a63009d7c6ab04ed7a1413d9479e Reviewed-on: https://go-review.googlesource.com/61511 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-08-29 18:28:34 +00:00
Dave Brophy	7c7cecc184	fmt: fix incorrect format of whole-number floats when using %#v This fixes the unwanted behaviour where printing a zero float with the #v fmt verb outputs "0" - e.g. missing the trailing decimal. This means that the output would be interpreted as an int rather than a float when parsed as Go source. After this change the the output is "0.0". Fixes #26363 Change-Id: Ic5c060522459cd5ce077675d47c848b22ddc34fa GitHub-Last-Rev: `adfb061363` GitHub-Pull-Request: golang/go#26383 Reviewed-on: https://go-review.googlesource.com/123956 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Rob Pike <r@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-08-28 20:15:15 +00:00
Ben Shi	3ca3e89bb6	cmd/compile: optimize arm64 with indexed FP load/store The FP load/store on arm64 have register indexed forms. And this CL implements this optimization. 1. The total size of pkg/android_arm64 (excluding cmd/compile) decreases about 400 bytes. 2. There is no regression in the go1 benchmark, the test case GobEncode even gets slight improvement, excluding noise. name old time/op new time/op delta BinaryTree17-4 19.0s ± 0% 19.0s ± 1% ~ (p=0.817 n=29+29) Fannkuch11-4 9.94s ± 0% 9.95s ± 0% +0.03% (p=0.010 n=24+30) FmtFprintfEmpty-4 233ns ± 0% 233ns ± 0% ~ (all equal) FmtFprintfString-4 427ns ± 0% 427ns ± 0% ~ (p=0.649 n=30+30) FmtFprintfInt-4 471ns ± 0% 471ns ± 0% ~ (all equal) FmtFprintfIntInt-4 730ns ± 0% 730ns ± 0% ~ (all equal) FmtFprintfPrefixedInt-4 889ns ± 0% 889ns ± 0% ~ (all equal) FmtFprintfFloat-4 1.21µs ± 0% 1.21µs ± 0% +0.04% (p=0.012 n=20+30) FmtManyArgs-4 2.99µs ± 0% 2.99µs ± 0% ~ (p=0.651 n=29+29) GobDecode-4 42.4ms ± 1% 42.3ms ± 1% -0.27% (p=0.001 n=29+28) GobEncode-4 37.8ms ±11% 36.0ms ± 0% -4.67% (p=0.000 n=30+26) Gzip-4 1.98s ± 1% 1.96s ± 1% -1.26% (p=0.000 n=30+30) Gunzip-4 175ms ± 0% 175ms ± 0% ~ (p=0.988 n=29+29) HTTPClientServer-4 854µs ± 5% 860µs ± 5% ~ (p=0.236 n=28+29) JSONEncode-4 88.8ms ± 0% 87.9ms ± 0% -1.00% (p=0.000 n=24+26) JSONDecode-4 390ms ± 1% 392ms ± 2% +0.48% (p=0.025 n=30+30) Mandelbrot200-4 19.5ms ± 0% 19.5ms ± 0% ~ (p=0.894 n=24+29) GoParse-4 20.3ms ± 0% 20.1ms ± 1% -0.94% (p=0.000 n=27+26) RegexpMatchEasy0_32-4 451ns ± 0% 451ns ± 0% ~ (p=0.578 n=30+30) RegexpMatchEasy0_1K-4 1.63µs ± 0% 1.63µs ± 0% ~ (p=0.298 n=30+28) RegexpMatchEasy1_32-4 431ns ± 0% 434ns ± 0% +0.67% (p=0.000 n=30+29) RegexpMatchEasy1_1K-4 2.60µs ± 0% 2.64µs ± 0% +1.36% (p=0.000 n=28+26) RegexpMatchMedium_32-4 744ns ± 0% 744ns ± 0% ~ (p=0.474 n=29+29) RegexpMatchMedium_1K-4 223µs ± 0% 223µs ± 0% -0.08% (p=0.038 n=26+30) RegexpMatchHard_32-4 12.2µs ± 0% 12.3µs ± 0% +0.27% (p=0.000 n=29+30) RegexpMatchHard_1K-4 373µs ± 0% 373µs ± 0% ~ (p=0.219 n=29+28) Revcomp-4 2.84s ± 0% 2.84s ± 0% ~ (p=0.130 n=28+28) Template-4 394ms ± 1% 392ms ± 1% -0.52% (p=0.001 n=30+30) TimeParse-4 1.93µs ± 0% 1.93µs ± 0% ~ (p=0.587 n=29+30) TimeFormat-4 2.00µs ± 0% 2.00µs ± 0% +0.07% (p=0.001 n=28+27) [Geo mean] 306µs 305µs -0.17% name old speed new speed delta GobDecode-4 18.1MB/s ± 1% 18.2MB/s ± 1% +0.27% (p=0.001 n=29+28) GobEncode-4 20.3MB/s ±10% 21.3MB/s ± 0% +4.64% (p=0.000 n=30+26) Gzip-4 9.79MB/s ± 1% 9.91MB/s ± 1% +1.28% (p=0.000 n=30+30) Gunzip-4 111MB/s ± 0% 111MB/s ± 0% ~ (p=0.988 n=29+29) JSONEncode-4 21.8MB/s ± 0% 22.1MB/s ± 0% +1.02% (p=0.000 n=24+26) JSONDecode-4 4.97MB/s ± 1% 4.95MB/s ± 2% -0.45% (p=0.031 n=30+30) GoParse-4 2.85MB/s ± 1% 2.88MB/s ± 1% +1.03% (p=0.000 n=30+26) RegexpMatchEasy0_32-4 70.9MB/s ± 0% 70.9MB/s ± 0% ~ (p=0.904 n=29+28) RegexpMatchEasy0_1K-4 627MB/s ± 0% 627MB/s ± 0% ~ (p=0.156 n=30+30) RegexpMatchEasy1_32-4 74.2MB/s ± 0% 73.7MB/s ± 0% -0.67% (p=0.000 n=30+29) RegexpMatchEasy1_1K-4 393MB/s ± 0% 388MB/s ± 0% -1.34% (p=0.000 n=28+26) RegexpMatchMedium_32-4 1.34MB/s ± 0% 1.34MB/s ± 0% ~ (all equal) RegexpMatchMedium_1K-4 4.59MB/s ± 0% 4.59MB/s ± 0% +0.07% (p=0.035 n=25+30) RegexpMatchHard_32-4 2.61MB/s ± 0% 2.61MB/s ± 0% -0.11% (p=0.002 n=28+30) RegexpMatchHard_1K-4 2.75MB/s ± 0% 2.75MB/s ± 0% +0.15% (p=0.001 n=30+24) Revcomp-4 89.4MB/s ± 0% 89.4MB/s ± 0% ~ (p=0.140 n=28+28) Template-4 4.93MB/s ± 1% 4.95MB/s ± 1% +0.51% (p=0.001 n=30+30) [Geo mean] 18.4MB/s 18.4MB/s +0.37% Change-Id: I9a6b521a971b21cfb51064e8e9b853cef8a1d071 Reviewed-on: https://go-review.googlesource.com/124636 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-08-28 02:37:18 +00:00
Ben Shi	2b69ad0b3c	cmd/compile: optimize arm's comparison The CMP/CMN/TST/TEQ perform similar to SUB/ADD/AND/XOR except the result is abondoned, and only NZCV flags are affected. This CL implements further optimization with them. 1. A micro benchmark test gets more than 9% improvment. TSTTEQ-4 6.99ms ± 0% 6.35ms ± 0% -9.15% (p=0.000 n=33+36) (https://github.com/benshi001/ugo1/blob/master/tstteq2_test.go) 2. The go1 benckmark shows no regression, excluding noise. name old time/op new time/op delta BinaryTree17-4 25.7s ± 1% 25.7s ± 1% ~ (p=0.830 n=40+40) Fannkuch11-4 13.3s ± 0% 13.2s ± 0% -0.65% (p=0.000 n=40+34) FmtFprintfEmpty-4 394ns ± 0% 394ns ± 0% ~ (p=0.819 n=40+40) FmtFprintfString-4 677ns ± 0% 677ns ± 0% +0.06% (p=0.039 n=39+40) FmtFprintfInt-4 707ns ± 0% 706ns ± 0% -0.14% (p=0.000 n=40+39) FmtFprintfIntInt-4 1.04µs ± 0% 1.04µs ± 0% +0.10% (p=0.000 n=29+31) FmtFprintfPrefixedInt-4 1.10µs ± 0% 1.11µs ± 0% +0.65% (p=0.000 n=39+37) FmtFprintfFloat-4 2.27µs ± 0% 2.26µs ± 0% -0.53% (p=0.000 n=39+40) FmtManyArgs-4 3.96µs ± 0% 3.96µs ± 0% +0.10% (p=0.000 n=39+40) GobDecode-4 53.4ms ± 1% 52.8ms ± 2% -1.10% (p=0.000 n=39+39) GobEncode-4 50.3ms ± 3% 50.4ms ± 2% ~ (p=0.089 n=40+39) Gzip-4 2.62s ± 0% 2.64s ± 0% +0.60% (p=0.000 n=40+39) Gunzip-4 312ms ± 0% 312ms ± 0% +0.02% (p=0.030 n=40+39) HTTPClientServer-4 1.01ms ± 7% 0.98ms ± 7% -2.37% (p=0.000 n=40+39) JSONEncode-4 126ms ± 1% 126ms ± 1% -0.38% (p=0.004 n=39+39) JSONDecode-4 423ms ± 0% 426ms ± 2% +0.72% (p=0.001 n=39+40) Mandelbrot200-4 18.4ms ± 0% 18.4ms ± 0% +0.04% (p=0.000 n=38+40) GoParse-4 22.8ms ± 0% 22.6ms ± 0% -0.68% (p=0.000 n=35+40) RegexpMatchEasy0_32-4 699ns ± 0% 704ns ± 0% +0.73% (p=0.000 n=27+40) RegexpMatchEasy0_1K-4 4.27µs ± 0% 4.26µs ± 0% -0.09% (p=0.000 n=35+38) RegexpMatchEasy1_32-4 741ns ± 0% 735ns ± 0% -0.85% (p=0.000 n=40+35) RegexpMatchEasy1_1K-4 5.53µs ± 0% 5.49µs ± 0% -0.69% (p=0.000 n=39+40) RegexpMatchMedium_32-4 1.07µs ± 0% 1.04µs ± 2% -2.34% (p=0.000 n=40+40) RegexpMatchMedium_1K-4 261µs ± 0% 261µs ± 0% -0.16% (p=0.000 n=40+39) RegexpMatchHard_32-4 14.9µs ± 0% 14.9µs ± 0% -0.18% (p=0.000 n=39+40) RegexpMatchHard_1K-4 445µs ± 0% 446µs ± 0% +0.09% (p=0.000 n=36+34) Revcomp-4 41.8ms ± 1% 41.8ms ± 1% ~ (p=0.595 n=39+38) Template-4 530ms ± 1% 528ms ± 1% -0.49% (p=0.000 n=40+40) TimeParse-4 3.39µs ± 0% 3.42µs ± 0% +0.98% (p=0.000 n=36+38) TimeFormat-4 6.12µs ± 0% 6.07µs ± 0% -0.81% (p=0.000 n=34+38) [Geo mean] 384µs 383µs -0.24% name old speed new speed delta GobDecode-4 14.4MB/s ± 1% 14.5MB/s ± 2% +1.11% (p=0.000 n=39+39) GobEncode-4 15.3MB/s ± 3% 15.2MB/s ± 2% ~ (p=0.104 n=40+39) Gzip-4 7.40MB/s ± 1% 7.36MB/s ± 0% -0.60% (p=0.000 n=40+39) Gunzip-4 62.2MB/s ± 0% 62.1MB/s ± 0% -0.02% (p=0.047 n=40+39) JSONEncode-4 15.4MB/s ± 1% 15.4MB/s ± 2% +0.39% (p=0.002 n=39+39) JSONDecode-4 4.59MB/s ± 0% 4.56MB/s ± 2% -0.71% (p=0.000 n=39+40) GoParse-4 2.54MB/s ± 0% 2.56MB/s ± 0% +0.72% (p=0.000 n=26+40) RegexpMatchEasy0_32-4 45.8MB/s ± 0% 45.4MB/s ± 0% -0.75% (p=0.000 n=38+40) RegexpMatchEasy0_1K-4 240MB/s ± 0% 240MB/s ± 0% +0.09% (p=0.000 n=35+38) RegexpMatchEasy1_32-4 43.1MB/s ± 0% 43.5MB/s ± 0% +0.84% (p=0.000 n=40+39) RegexpMatchEasy1_1K-4 185MB/s ± 0% 186MB/s ± 0% +0.69% (p=0.000 n=39+40) RegexpMatchMedium_32-4 936kB/s ± 1% 959kB/s ± 2% +2.38% (p=0.000 n=40+40) RegexpMatchMedium_1K-4 3.92MB/s ± 0% 3.93MB/s ± 0% +0.18% (p=0.000 n=39+40) RegexpMatchHard_32-4 2.15MB/s ± 0% 2.15MB/s ± 0% +0.19% (p=0.000 n=40+40) RegexpMatchHard_1K-4 2.30MB/s ± 0% 2.30MB/s ± 0% ~ (all equal) Revcomp-4 60.8MB/s ± 1% 60.8MB/s ± 1% ~ (p=0.600 n=39+38) Template-4 3.66MB/s ± 1% 3.68MB/s ± 1% +0.46% (p=0.000 n=40+40) [Geo mean] 12.8MB/s 12.8MB/s +0.27% Change-Id: I849161169ecf0876a04b7c1d3990fa8d1435215e Reviewed-on: https://go-review.googlesource.com/122855 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-08-27 15:49:46 +00:00
Alberto Donizetti	42cc4ca30a	cmd/compile: prevent overflow in walkinrange In the compiler frontend, walkinrange indiscriminately calls Int64() on const CTINT nodes, even though Int64's return value is undefined for anything over 2⁶³ (in practise, it'll return a negative number). This causes the introduction of bad constants during rewrites of unsigned expressions, which make the compiler reject valid Go programs. This change introduces a preliminary check that Int64() is safe to call on the consts on hand. If it isn't, walkinrange exits without doing any rewrite. Fixes #27143 Change-Id: I2017073cae65468a521ff3262d4ea8ab0d7098d9 Reviewed-on: https://go-review.googlesource.com/130735 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-08-26 21:52:27 +00:00
Ben Shi	e03220a594	cmd/compile: optimize 386 code with FLDPI FLDPI pushes the constant pi to 387's register stack, which is more efficient than MOVSSconst/MOVSDconst. 1. This optimization reduces 0.3KB of the total size of pkg/linux_386 (exlcuding cmd/compile). 2. There is little regression in the go1 benchmark. name old time/op new time/op delta BinaryTree17-4 3.30s ± 3% 3.30s ± 2% ~ (p=0.759 n=40+39) Fannkuch11-4 3.53s ± 1% 3.54s ± 1% ~ (p=0.168 n=40+40) FmtFprintfEmpty-4 45.5ns ± 3% 45.6ns ± 3% ~ (p=0.553 n=40+40) FmtFprintfString-4 78.4ns ± 3% 78.3ns ± 3% ~ (p=0.593 n=40+40) FmtFprintfInt-4 88.8ns ± 2% 89.9ns ± 2% ~ (p=0.083 n=40+33) FmtFprintfIntInt-4 140ns ± 4% 140ns ± 4% ~ (p=0.656 n=40+40) FmtFprintfPrefixedInt-4 180ns ± 2% 181ns ± 3% +0.53% (p=0.050 n=40+40) FmtFprintfFloat-4 408ns ± 4% 411ns ± 3% ~ (p=0.112 n=40+40) FmtManyArgs-4 599ns ± 3% 602ns ± 3% ~ (p=0.784 n=40+40) GobDecode-4 7.24ms ± 6% 7.30ms ± 5% ~ (p=0.171 n=40+40) GobEncode-4 6.98ms ± 5% 6.89ms ± 8% ~ (p=0.107 n=40+40) Gzip-4 396ms ± 4% 396ms ± 3% ~ (p=0.852 n=40+40) Gunzip-4 41.3ms ± 3% 41.5ms ± 4% ~ (p=0.221 n=40+40) HTTPClientServer-4 63.4µs ± 3% 63.4µs ± 2% ~ (p=0.895 n=39+40) JSONEncode-4 17.5ms ± 2% 17.5ms ± 3% ~ (p=0.090 n=40+40) JSONDecode-4 60.6ms ± 3% 60.1ms ± 4% ~ (p=0.184 n=40+40) Mandelbrot200-4 7.80ms ± 3% 7.78ms ± 2% ~ (p=0.512 n=40+40) GoParse-4 3.30ms ± 3% 3.28ms ± 2% -0.61% (p=0.034 n=40+40) RegexpMatchEasy0_32-4 104ns ± 4% 103ns ± 4% ~ (p=0.118 n=40+40) RegexpMatchEasy0_1K-4 850ns ± 2% 848ns ± 2% ~ (p=0.370 n=40+40) RegexpMatchEasy1_32-4 112ns ± 4% 112ns ± 4% ~ (p=0.848 n=40+40) RegexpMatchEasy1_1K-4 1.04µs ± 4% 1.03µs ± 4% ~ (p=0.333 n=40+40) RegexpMatchMedium_32-4 132ns ± 4% 131ns ± 3% ~ (p=0.527 n=40+40) RegexpMatchMedium_1K-4 43.4µs ± 3% 43.5µs ± 3% ~ (p=0.111 n=40+40) RegexpMatchHard_32-4 2.24µs ± 4% 2.24µs ± 4% ~ (p=0.441 n=40+40) RegexpMatchHard_1K-4 67.9µs ± 3% 68.0µs ± 3% ~ (p=0.095 n=40+40) Revcomp-4 1.84s ± 2% 1.84s ± 2% ~ (p=0.677 n=40+40) Template-4 68.4ms ± 3% 68.6ms ± 3% ~ (p=0.345 n=40+40) TimeParse-4 433ns ± 3% 433ns ± 3% ~ (p=0.403 n=40+40) TimeFormat-4 407ns ± 3% 406ns ± 3% ~ (p=0.900 n=40+40) [Geo mean] 67.1µs 67.2µs +0.04% name old speed new speed delta GobDecode-4 106MB/s ± 5% 105MB/s ± 5% ~ (p=0.173 n=40+40) GobEncode-4 110MB/s ± 5% 112MB/s ± 9% ~ (p=0.104 n=40+40) Gzip-4 49.0MB/s ± 4% 49.1MB/s ± 4% ~ (p=0.836 n=40+40) Gunzip-4 471MB/s ± 3% 468MB/s ± 4% ~ (p=0.218 n=40+40) JSONEncode-4 111MB/s ± 2% 111MB/s ± 3% ~ (p=0.090 n=40+40) JSONDecode-4 32.0MB/s ± 3% 32.3MB/s ± 4% ~ (p=0.194 n=40+40) GoParse-4 17.6MB/s ± 3% 17.7MB/s ± 2% +0.62% (p=0.035 n=40+40) RegexpMatchEasy0_32-4 307MB/s ± 4% 309MB/s ± 4% +0.70% (p=0.041 n=40+40) RegexpMatchEasy0_1K-4 1.20GB/s ± 3% 1.21GB/s ± 2% ~ (p=0.353 n=40+40) RegexpMatchEasy1_32-4 285MB/s ± 3% 284MB/s ± 4% ~ (p=0.384 n=40+40) RegexpMatchEasy1_1K-4 988MB/s ± 4% 992MB/s ± 3% ~ (p=0.335 n=40+40) RegexpMatchMedium_32-4 7.56MB/s ± 4% 7.57MB/s ± 4% ~ (p=0.314 n=40+40) RegexpMatchMedium_1K-4 23.6MB/s ± 3% 23.6MB/s ± 3% ~ (p=0.107 n=40+40) RegexpMatchHard_32-4 14.3MB/s ± 4% 14.3MB/s ± 4% ~ (p=0.429 n=40+40) RegexpMatchHard_1K-4 15.1MB/s ± 3% 15.1MB/s ± 3% ~ (p=0.099 n=40+40) Revcomp-4 138MB/s ± 2% 138MB/s ± 2% ~ (p=0.658 n=40+40) Template-4 28.4MB/s ± 3% 28.3MB/s ± 3% ~ (p=0.331 n=40+40) [Geo mean] 80.8MB/s 80.8MB/s +0.09% Change-Id: I0cb715eead68ade097a302e7fb80ccbd1d1b511e Reviewed-on: https://go-review.googlesource.com/130975 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-08-25 02:39:49 +00:00
Ben Shi	3bc34385fa	cmd/compile: introduce more read-modify-write operations for amd64 Add suport of read-modify-write for AND/SUB/AND/OR/XOR on amd64. 1. The total size of pkg/linux_amd64 decreases about 4KB, excluding cmd/compile. 2. The go1 benchmark shows a little improvement, excluding noise. name old time/op new time/op delta BinaryTree17-4 2.63s ± 3% 2.65s ± 4% +1.01% (p=0.037 n=35+35) Fannkuch11-4 2.33s ± 2% 2.39s ± 2% +2.49% (p=0.000 n=35+35) FmtFprintfEmpty-4 45.4ns ± 5% 40.8ns ± 6% -10.09% (p=0.000 n=35+35) FmtFprintfString-4 73.3ns ± 4% 70.9ns ± 3% -3.23% (p=0.000 n=30+35) FmtFprintfInt-4 79.9ns ± 4% 79.5ns ± 3% ~ (p=0.736 n=34+35) FmtFprintfIntInt-4 126ns ± 4% 125ns ± 4% ~ (p=0.083 n=35+35) FmtFprintfPrefixedInt-4 152ns ± 6% 152ns ± 3% ~ (p=0.855 n=34+35) FmtFprintfFloat-4 215ns ± 4% 213ns ± 4% ~ (p=0.066 n=35+35) FmtManyArgs-4 522ns ± 3% 506ns ± 3% -3.15% (p=0.000 n=35+35) GobDecode-4 6.45ms ± 8% 6.51ms ± 7% +0.96% (p=0.026 n=35+35) GobEncode-4 6.10ms ± 6% 6.02ms ± 8% ~ (p=0.160 n=35+35) Gzip-4 228ms ± 3% 221ms ± 3% -2.92% (p=0.000 n=35+35) Gunzip-4 37.5ms ± 4% 37.2ms ± 3% -0.78% (p=0.036 n=35+35) HTTPClientServer-4 58.7µs ± 2% 59.2µs ± 1% +0.80% (p=0.000 n=33+33) JSONEncode-4 12.0ms ± 3% 12.2ms ± 3% +1.84% (p=0.008 n=35+35) JSONDecode-4 57.0ms ± 4% 56.6ms ± 3% ~ (p=0.320 n=35+35) Mandelbrot200-4 3.82ms ± 3% 3.79ms ± 3% ~ (p=0.074 n=35+35) GoParse-4 3.21ms ± 5% 3.24ms ± 4% ~ (p=0.119 n=35+35) RegexpMatchEasy0_32-4 76.3ns ± 4% 75.4ns ± 4% -1.14% (p=0.014 n=34+33) RegexpMatchEasy0_1K-4 251ns ± 4% 254ns ± 3% +1.28% (p=0.016 n=35+35) RegexpMatchEasy1_32-4 69.6ns ± 3% 70.1ns ± 3% +0.82% (p=0.005 n=35+35) RegexpMatchEasy1_1K-4 367ns ± 4% 376ns ± 4% +2.47% (p=0.000 n=35+35) RegexpMatchMedium_32-4 108ns ± 5% 104ns ± 4% -3.18% (p=0.000 n=35+35) RegexpMatchMedium_1K-4 33.8µs ± 3% 32.7µs ± 3% -3.27% (p=0.000 n=35+35) RegexpMatchHard_32-4 1.55µs ± 3% 1.52µs ± 3% -1.64% (p=0.000 n=35+35) RegexpMatchHard_1K-4 46.6µs ± 3% 46.6µs ± 4% ~ (p=0.149 n=35+35) Revcomp-4 416ms ± 7% 412ms ± 6% -0.95% (p=0.033 n=33+35) Template-4 64.3ms ± 3% 62.4ms ± 7% -2.94% (p=0.000 n=35+35) TimeParse-4 320ns ± 2% 322ns ± 3% ~ (p=0.589 n=35+35) TimeFormat-4 300ns ± 3% 300ns ± 3% ~ (p=0.597 n=35+35) [Geo mean] 47.4µs 47.0µs -0.86% name old speed new speed delta GobDecode-4 119MB/s ± 7% 118MB/s ± 7% -0.96% (p=0.027 n=35+35) GobEncode-4 126MB/s ± 7% 127MB/s ± 6% ~ (p=0.157 n=34+34) Gzip-4 85.3MB/s ± 3% 87.9MB/s ± 3% +3.02% (p=0.000 n=35+35) Gunzip-4 518MB/s ± 4% 522MB/s ± 3% +0.79% (p=0.037 n=35+35) JSONEncode-4 162MB/s ± 3% 159MB/s ± 3% -1.81% (p=0.009 n=35+35) JSONDecode-4 34.1MB/s ± 4% 34.3MB/s ± 3% ~ (p=0.318 n=35+35) GoParse-4 18.0MB/s ± 5% 17.9MB/s ± 4% ~ (p=0.117 n=35+35) RegexpMatchEasy0_32-4 419MB/s ± 3% 425MB/s ± 4% +1.46% (p=0.003 n=32+33) RegexpMatchEasy0_1K-4 4.07GB/s ± 4% 4.02GB/s ± 3% -1.28% (p=0.014 n=35+35) RegexpMatchEasy1_32-4 460MB/s ± 3% 456MB/s ± 4% -0.82% (p=0.004 n=35+35) RegexpMatchEasy1_1K-4 2.79GB/s ± 4% 2.72GB/s ± 4% -2.39% (p=0.000 n=35+35) RegexpMatchMedium_32-4 9.23MB/s ± 4% 9.53MB/s ± 4% +3.16% (p=0.000 n=35+35) RegexpMatchMedium_1K-4 30.3MB/s ± 3% 31.3MB/s ± 3% +3.38% (p=0.000 n=35+35) RegexpMatchHard_32-4 20.7MB/s ± 3% 21.0MB/s ± 3% +1.67% (p=0.000 n=35+35) RegexpMatchHard_1K-4 22.0MB/s ± 3% 21.9MB/s ± 4% ~ (p=0.277 n=35+33) Revcomp-4 612MB/s ± 7% 618MB/s ± 6% +0.96% (p=0.034 n=33+35) Template-4 30.2MB/s ± 3% 31.1MB/s ± 6% +3.05% (p=0.000 n=35+35) [Geo mean] 123MB/s 124MB/s +0.64% Change-Id: Ia025da272e07d0069413824bfff3471b106d6280 Reviewed-on: https://go-review.googlesource.com/121535 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-08-24 23:38:25 +00:00
Kazuhiro Sera	ad644d2e86	all: fix typos detected by github.com/client9/misspell Change-Id: Iadb3c5de8ae9ea45855013997ed70f7929a88661 GitHub-Last-Rev: `ae85bcf82b` GitHub-Pull-Request: golang/go#26920 Reviewed-on: https://go-review.googlesource.com/128955 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-08-23 15:54:07 +00:00
Yury Smolsky	1484270aec	test: restore tests for the reject unsafe code option Tests in test/safe were neglected after moving to the run.go framework. This change restores them. These tests are skipped for go/types via -+ option. Fixes #25668 Change-Id: I8fe26574a76fa7afa8664c467d7c2e6334f1bba9 Reviewed-on: https://go-review.googlesource.com/124660 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-08-22 17:54:09 +00:00
Yury Smolsky	02fecd33f6	test: remove errchk, the perl script gc tests do not depend on errchk. Fixes #25669 Change-Id: I99eb87bb9677897b9167d4fc9a6321fa66cd9116 Reviewed-on: https://go-review.googlesource.com/115955 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-08-22 16:28:36 +00:00
Ben Shi	a0a7e9fc0c	cmd/compile: implement "OPC $imm, (mem)" for 386 New read-modify-write operations are introduced in this CL for 386. 1. The total size of pkg/linux_386 decreases about 10KB (excluding cmd/compile). 2. The go1 benchmark shows little regression. name old time/op new time/op delta BinaryTree17-4 3.32s ± 4% 3.29s ± 2% ~ (p=0.059 n=30+30) Fannkuch11-4 3.49s ± 1% 3.46s ± 1% -0.92% (p=0.001 n=30+30) FmtFprintfEmpty-4 47.7ns ± 2% 46.8ns ± 5% -1.93% (p=0.011 n=25+30) FmtFprintfString-4 79.5ns ± 7% 80.2ns ± 3% +0.89% (p=0.001 n=28+29) FmtFprintfInt-4 90.5ns ± 2% 92.1ns ± 2% +1.82% (p=0.014 n=22+30) FmtFprintfIntInt-4 141ns ± 1% 144ns ± 3% +2.23% (p=0.013 n=22+30) FmtFprintfPrefixedInt-4 183ns ± 2% 184ns ± 3% ~ (p=0.080 n=21+30) FmtFprintfFloat-4 409ns ± 3% 412ns ± 3% +0.83% (p=0.040 n=30+30) FmtManyArgs-4 597ns ± 6% 607ns ± 4% +1.71% (p=0.006 n=30+30) GobDecode-4 7.21ms ± 5% 7.18ms ± 6% ~ (p=0.665 n=30+30) GobEncode-4 7.17ms ± 6% 7.09ms ± 7% ~ (p=0.117 n=29+30) Gzip-4 413ms ± 4% 399ms ± 4% -3.48% (p=0.000 n=30+30) Gunzip-4 41.3ms ± 4% 41.7ms ± 3% +1.05% (p=0.011 n=30+30) HTTPClientServer-4 63.5µs ± 3% 62.9µs ± 2% -0.97% (p=0.017 n=30+27) JSONEncode-4 20.3ms ± 5% 20.1ms ± 5% -1.16% (p=0.004 n=30+30) JSONDecode-4 66.2ms ± 4% 67.7ms ± 4% +2.21% (p=0.000 n=30+30) Mandelbrot200-4 5.16ms ± 3% 5.18ms ± 3% ~ (p=0.123 n=30+30) GoParse-4 3.23ms ± 2% 3.27ms ± 2% +1.08% (p=0.006 n=30+30) RegexpMatchEasy0_32-4 98.9ns ± 5% 97.1ns ± 4% -1.83% (p=0.006 n=30+30) RegexpMatchEasy0_1K-4 842ns ± 3% 842ns ± 3% ~ (p=0.550 n=30+30) RegexpMatchEasy1_32-4 107ns ± 4% 105ns ± 4% -1.93% (p=0.012 n=30+30) RegexpMatchEasy1_1K-4 1.03µs ± 4% 1.04µs ± 4% ~ (p=0.304 n=30+30) RegexpMatchMedium_32-4 132ns ± 2% 129ns ± 4% -2.02% (p=0.000 n=21+30) RegexpMatchMedium_1K-4 44.1µs ± 4% 43.8µs ± 3% ~ (p=0.641 n=30+30) RegexpMatchHard_32-4 2.26µs ± 4% 2.23µs ± 4% -1.28% (p=0.023 n=30+30) RegexpMatchHard_1K-4 68.1µs ± 3% 68.6µs ± 4% ~ (p=0.089 n=30+30) Revcomp-4 1.85s ± 2% 1.84s ± 2% ~ (p=0.072 n=30+30) Template-4 69.2ms ± 3% 68.5ms ± 3% -1.04% (p=0.012 n=30+30) TimeParse-4 441ns ± 3% 446ns ± 4% +1.21% (p=0.001 n=30+30) TimeFormat-4 415ns ± 3% 415ns ± 3% ~ (p=0.436 n=30+30) [Geo mean] 67.0µs 66.9µs -0.17% name old speed new speed delta GobDecode-4 107MB/s ± 5% 107MB/s ± 6% ~ (p=0.663 n=30+30) GobEncode-4 107MB/s ± 6% 108MB/s ± 7% ~ (p=0.117 n=29+30) Gzip-4 47.0MB/s ± 4% 48.7MB/s ± 4% +3.61% (p=0.000 n=30+30) Gunzip-4 470MB/s ± 4% 466MB/s ± 4% -1.05% (p=0.011 n=30+30) JSONEncode-4 95.6MB/s ± 5% 96.7MB/s ± 5% +1.16% (p=0.005 n=30+30) JSONDecode-4 29.3MB/s ± 4% 28.7MB/s ± 4% -2.17% (p=0.000 n=30+30) GoParse-4 17.9MB/s ± 2% 17.7MB/s ± 2% -1.06% (p=0.007 n=30+30) RegexpMatchEasy0_32-4 323MB/s ± 5% 329MB/s ± 4% +1.93% (p=0.006 n=30+30) RegexpMatchEasy0_1K-4 1.22GB/s ± 3% 1.22GB/s ± 3% ~ (p=0.496 n=30+30) RegexpMatchEasy1_32-4 298MB/s ± 4% 303MB/s ± 4% +1.84% (p=0.017 n=30+30) RegexpMatchEasy1_1K-4 995MB/s ± 4% 989MB/s ± 4% ~ (p=0.307 n=30+30) RegexpMatchMedium_32-4 7.56MB/s ± 4% 7.74MB/s ± 4% +2.46% (p=0.000 n=22+30) RegexpMatchMedium_1K-4 23.2MB/s ± 4% 23.4MB/s ± 3% ~ (p=0.651 n=30+30) RegexpMatchHard_32-4 14.2MB/s ± 4% 14.3MB/s ± 4% +1.29% (p=0.021 n=30+30) RegexpMatchHard_1K-4 15.0MB/s ± 3% 14.9MB/s ± 4% ~ (p=0.069 n=30+29) Revcomp-4 138MB/s ± 2% 138MB/s ± 2% ~ (p=0.072 n=30+30) Template-4 28.1MB/s ± 3% 28.4MB/s ± 3% +1.05% (p=0.012 n=30+30) [Geo mean] 79.7MB/s 80.2MB/s +0.60% Change-Id: I44a1dfc942c9a385904553c4fe1fa8e509c8aa31 Reviewed-on: https://go-review.googlesource.com/120916 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-08-22 04:12:42 +00:00
Ben Shi	90f2fa0037	cmd/compile: optimize 386 code with MULLload/DIVSSload/DIVSDload IMULL/DIVSS/DIVSD all can take the source operand from memory directly. And this CL implement that optimization. 1. The total size of pkg/linux_386 decreases about 84KB (excluding cmd/compile). 2. The go1 benchmark shows little regression in total (excluding noise). name old time/op new time/op delta BinaryTree17-4 3.29s ± 2% 3.27s ± 4% ~ (p=0.192 n=30+30) Fannkuch11-4 3.49s ± 2% 3.54s ± 1% +1.48% (p=0.000 n=30+30) FmtFprintfEmpty-4 45.9ns ± 3% 46.3ns ± 4% +0.89% (p=0.037 n=30+30) FmtFprintfString-4 78.8ns ± 3% 78.7ns ± 4% ~ (p=0.209 n=30+27) FmtFprintfInt-4 91.0ns ± 2% 90.3ns ± 2% -0.82% (p=0.031 n=30+27) FmtFprintfIntInt-4 142ns ± 4% 143ns ± 4% ~ (p=0.136 n=30+30) FmtFprintfPrefixedInt-4 181ns ± 3% 183ns ± 4% +1.40% (p=0.005 n=30+30) FmtFprintfFloat-4 404ns ± 4% 408ns ± 3% ~ (p=0.397 n=30+30) FmtManyArgs-4 601ns ± 3% 609ns ± 5% ~ (p=0.059 n=30+30) GobDecode-4 7.21ms ± 5% 7.24ms ± 5% ~ (p=0.612 n=30+30) GobEncode-4 6.91ms ± 6% 6.91ms ± 6% ~ (p=0.797 n=30+30) Gzip-4 398ms ± 6% 399ms ± 4% ~ (p=0.173 n=30+30) Gunzip-4 41.7ms ± 3% 41.8ms ± 3% ~ (p=0.423 n=30+30) HTTPClientServer-4 62.3µs ± 2% 62.7µs ± 3% ~ (p=0.085 n=29+30) JSONEncode-4 21.0ms ± 4% 20.7ms ± 5% -1.39% (p=0.014 n=30+30) JSONDecode-4 66.3ms ± 3% 67.4ms ± 1% +1.71% (p=0.003 n=30+24) Mandelbrot200-4 5.15ms ± 3% 5.16ms ± 3% ~ (p=0.697 n=30+30) GoParse-4 3.24ms ± 3% 3.27ms ± 4% +0.91% (p=0.032 n=30+30) RegexpMatchEasy0_32-4 101ns ± 5% 99ns ± 4% -1.82% (p=0.008 n=29+30) RegexpMatchEasy0_1K-4 848ns ± 4% 841ns ± 2% -0.77% (p=0.043 n=30+30) RegexpMatchEasy1_32-4 106ns ± 6% 106ns ± 3% ~ (p=0.939 n=29+30) RegexpMatchEasy1_1K-4 1.02µs ± 3% 1.03µs ± 4% ~ (p=0.297 n=28+30) RegexpMatchMedium_32-4 129ns ± 4% 127ns ± 4% ~ (p=0.073 n=30+30) RegexpMatchMedium_1K-4 43.9µs ± 3% 43.8µs ± 3% ~ (p=0.186 n=30+30) RegexpMatchHard_32-4 2.24µs ± 4% 2.22µs ± 4% ~ (p=0.332 n=30+29) RegexpMatchHard_1K-4 68.0µs ± 4% 67.5µs ± 3% ~ (p=0.290 n=30+30) Revcomp-4 1.85s ± 3% 1.85s ± 3% ~ (p=0.358 n=30+30) Template-4 69.6ms ± 3% 70.0ms ± 4% ~ (p=0.273 n=30+30) TimeParse-4 445ns ± 3% 441ns ± 3% ~ (p=0.494 n=30+30) TimeFormat-4 412ns ± 3% 412ns ± 6% ~ (p=0.841 n=30+30) [Geo mean] 66.7µs 66.8µs +0.13% name old speed new speed delta GobDecode-4 107MB/s ± 5% 106MB/s ± 5% ~ (p=0.615 n=30+30) GobEncode-4 111MB/s ± 6% 111MB/s ± 6% ~ (p=0.790 n=30+30) Gzip-4 48.8MB/s ± 6% 48.7MB/s ± 4% ~ (p=0.167 n=30+30) Gunzip-4 465MB/s ± 3% 465MB/s ± 3% ~ (p=0.420 n=30+30) JSONEncode-4 92.4MB/s ± 4% 93.7MB/s ± 5% +1.42% (p=0.015 n=30+30) JSONDecode-4 29.3MB/s ± 3% 28.8MB/s ± 1% -1.72% (p=0.003 n=30+24) GoParse-4 17.9MB/s ± 3% 17.7MB/s ± 4% -0.89% (p=0.037 n=30+30) RegexpMatchEasy0_32-4 317MB/s ± 8% 324MB/s ± 4% +2.14% (p=0.006 n=30+30) RegexpMatchEasy0_1K-4 1.21GB/s ± 4% 1.22GB/s ± 2% +0.77% (p=0.036 n=30+30) RegexpMatchEasy1_32-4 298MB/s ± 7% 299MB/s ± 4% ~ (p=0.511 n=30+30) RegexpMatchEasy1_1K-4 1.00GB/s ± 3% 1.00GB/s ± 4% ~ (p=0.304 n=28+30) RegexpMatchMedium_32-4 7.75MB/s ± 4% 7.82MB/s ± 4% ~ (p=0.089 n=30+30) RegexpMatchMedium_1K-4 23.3MB/s ± 3% 23.4MB/s ± 3% ~ (p=0.181 n=30+30) RegexpMatchHard_32-4 14.3MB/s ± 4% 14.4MB/s ± 4% ~ (p=0.320 n=30+29) RegexpMatchHard_1K-4 15.1MB/s ± 4% 15.2MB/s ± 3% ~ (p=0.273 n=30+30) Revcomp-4 137MB/s ± 3% 137MB/s ± 3% ~ (p=0.352 n=30+30) Template-4 27.9MB/s ± 3% 27.7MB/s ± 4% ~ (p=0.277 n=30+30) [Geo mean] 79.9MB/s 80.1MB/s +0.15% Change-Id: I97333cd8ddabb3c7c88ca5aa9e14a005b74d306d Reviewed-on: https://go-review.googlesource.com/120695 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-08-22 03:38:38 +00:00
Ben Shi	705f3c74e6	cmd/compile: optimize AMD64 with DIVSSload and DIVSDload DIVSSload & DIVSDload directly operate on a memory operand. And binary size can be reduced by them, while the performance is not affected. The total size of pkg/linux_amd64 (excluding cmd/compile) decreases about 6KB. There is little regression in the go1 benchmark test (excluding noise). name old time/op new time/op delta BinaryTree17-4 2.63s ± 4% 2.62s ± 4% ~ (p=0.809 n=30+30) Fannkuch11-4 2.40s ± 2% 2.40s ± 2% ~ (p=0.109 n=30+30) FmtFprintfEmpty-4 43.1ns ± 4% 43.2ns ± 9% ~ (p=0.168 n=30+30) FmtFprintfString-4 73.6ns ± 4% 74.1ns ± 4% ~ (p=0.069 n=30+30) FmtFprintfInt-4 81.0ns ± 3% 81.4ns ± 5% ~ (p=0.350 n=30+30) FmtFprintfIntInt-4 127ns ± 4% 129ns ± 4% +0.99% (p=0.021 n=30+30) FmtFprintfPrefixedInt-4 156ns ± 4% 155ns ± 4% ~ (p=0.415 n=30+30) FmtFprintfFloat-4 219ns ± 4% 218ns ± 4% ~ (p=0.071 n=30+30) FmtManyArgs-4 522ns ± 3% 518ns ± 3% -0.68% (p=0.034 n=30+30) GobDecode-4 6.49ms ± 6% 6.52ms ± 6% ~ (p=0.832 n=30+30) GobEncode-4 6.10ms ± 9% 6.14ms ± 7% ~ (p=0.485 n=30+30) Gzip-4 227ms ± 1% 224ms ± 4% ~ (p=0.484 n=24+30) Gunzip-4 37.2ms ± 3% 36.8ms ± 4% ~ (p=0.889 n=30+30) HTTPClientServer-4 58.9µs ± 1% 58.7µs ± 2% -0.42% (p=0.003 n=28+28) JSONEncode-4 12.0ms ± 3% 12.0ms ± 4% ~ (p=0.523 n=30+30) JSONDecode-4 54.6ms ± 4% 54.5ms ± 4% ~ (p=0.708 n=30+30) Mandelbrot200-4 3.78ms ± 4% 3.81ms ± 3% +0.99% (p=0.016 n=30+30) GoParse-4 3.20ms ± 4% 3.20ms ± 5% ~ (p=0.994 n=30+30) RegexpMatchEasy0_32-4 77.0ns ± 4% 75.9ns ± 3% -1.39% (p=0.006 n=29+30) RegexpMatchEasy0_1K-4 255ns ± 4% 253ns ± 4% ~ (p=0.091 n=30+30) RegexpMatchEasy1_32-4 69.7ns ± 3% 70.3ns ± 4% ~ (p=0.120 n=30+30) RegexpMatchEasy1_1K-4 373ns ± 2% 378ns ± 3% +1.43% (p=0.000 n=21+26) RegexpMatchMedium_32-4 107ns ± 2% 108ns ± 4% +1.50% (p=0.012 n=22+30) RegexpMatchMedium_1K-4 34.0µs ± 1% 34.3µs ± 3% +1.08% (p=0.008 n=24+30) RegexpMatchHard_32-4 1.53µs ± 3% 1.54µs ± 3% ~ (p=0.234 n=30+30) RegexpMatchHard_1K-4 46.7µs ± 4% 47.0µs ± 4% ~ (p=0.420 n=30+30) Revcomp-4 411ms ± 7% 415ms ± 6% ~ (p=0.059 n=30+30) Template-4 65.5ms ± 5% 66.9ms ± 4% +2.21% (p=0.001 n=30+30) TimeParse-4 317ns ± 3% 311ns ± 3% -1.97% (p=0.000 n=30+30) TimeFormat-4 293ns ± 3% 294ns ± 3% ~ (p=0.243 n=30+30) [Geo mean] 47.4µs 47.5µs +0.17% name old speed new speed delta GobDecode-4 118MB/s ± 5% 118MB/s ± 6% ~ (p=0.832 n=30+30) GobEncode-4 125MB/s ± 7% 125MB/s ± 7% ~ (p=0.625 n=29+30) Gzip-4 85.3MB/s ± 1% 86.6MB/s ± 4% ~ (p=0.486 n=24+30) Gunzip-4 522MB/s ± 3% 527MB/s ± 4% ~ (p=0.889 n=30+30) JSONEncode-4 162MB/s ± 3% 162MB/s ± 4% ~ (p=0.520 n=30+30) JSONDecode-4 35.5MB/s ± 4% 35.6MB/s ± 4% ~ (p=0.701 n=30+30) GoParse-4 18.1MB/s ± 4% 18.1MB/s ± 4% ~ (p=0.891 n=29+30) RegexpMatchEasy0_32-4 416MB/s ± 4% 422MB/s ± 3% +1.43% (p=0.005 n=29+30) RegexpMatchEasy0_1K-4 4.01GB/s ± 4% 4.04GB/s ± 4% ~ (p=0.091 n=30+30) RegexpMatchEasy1_32-4 460MB/s ± 3% 456MB/s ± 5% ~ (p=0.123 n=30+30) RegexpMatchEasy1_1K-4 2.74GB/s ± 2% 2.70GB/s ± 3% -1.33% (p=0.000 n=22+26) RegexpMatchMedium_32-4 9.39MB/s ± 3% 9.19MB/s ± 4% -2.06% (p=0.001 n=28+30) RegexpMatchMedium_1K-4 30.1MB/s ± 1% 29.8MB/s ± 3% -1.04% (p=0.008 n=24+30) RegexpMatchHard_32-4 20.9MB/s ± 3% 20.8MB/s ± 3% ~ (p=0.234 n=30+30) RegexpMatchHard_1K-4 21.9MB/s ± 4% 21.8MB/s ± 4% ~ (p=0.420 n=30+30) Revcomp-4 619MB/s ± 7% 612MB/s ± 7% ~ (p=0.059 n=30+30) Template-4 29.6MB/s ± 4% 29.0MB/s ± 4% -2.16% (p=0.002 n=30+30) [Geo mean] 123MB/s 123MB/s -0.33% Change-Id: Ia59e077feae4f2824df79059daea4d0f678e3e4c Reviewed-on: https://go-review.googlesource.com/120275 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-08-22 02:47:49 +00:00
Ian Lance Taylor	7e64377903	cmd/compile: only support -race and -msan where they work Consolidate decision about whether -race and -msan options are supported in cmd/internal/sys. Use consolidated functions in cmd/compile and cmd/go. Use a copy of them in cmd/dist; cmd/dist can't import cmd/internal/sys because Go 1.4 doesn't have it. Fixes #24315 Change-Id: I9cecaed4895eb1a2a49379b4848db40de66d32a9 Reviewed-on: https://go-review.googlesource.com/121816 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-08-21 03:38:27 +00:00
Ilya Tocar	a3593685cf	cmd/compile/internal/ssa: remove useless zero extension We generate MOVBLZX for byte-sized LoadReg, so (MOVBQZX (LoadReg (Arg))) is the same as (LoadReg (Arg)). Remove those zero extension where possible. Triggers several times during all.bash. Fixes #25378 Updates #15300 Change-Id: If50656e66f217832a13ee8f49c47997f4fcc093a Reviewed-on: https://go-review.googlesource.com/115617 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-08-20 21:38:20 +00:00
Ilya Tocar	c292b32f33	cmd/compile: enable disjoint memmove inlining on amd64 Memmove can use AVX/prefetches/other optional instructions, so only do it for small sizes, when call overhead dominates. Change-Id: Ice5e93deb11462217f7fb5fc350b703109bb4090 Reviewed-on: https://go-review.googlesource.com/112517 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <mike.munday@ibm.com>	2018-08-20 21:10:12 +00:00
Ben Shi	0a382e0b7f	cmd/compile: optimize ARMv7 code "AND $0xffff0000, Rx" will be encoded to 12 bytes. 1. MOVWload from the constant pool to Rtmp 2. AND Rtmp, Rx 3. a 4-byte item in the constant pool It can be simplified to 8 bytes on ARMv7, since ARMv7 has "MOVW $imm-16, Rx". 1. MOVW $0xffff, Rtmp 2. BIC Rtmp, Rx The above optimization also applies to BICconst, ADDconst and SUBconst. 1. The total size of pkg/android_arm (excluding cmd/compile) decreases about 2KB. 2. The go1 benchmark shows no regression, exlcuding noise. name old time/op new time/op delta BinaryTree17-4 25.5s ± 1% 25.2s ± 1% -0.85% (p=0.000 n=30+30) Fannkuch11-4 13.3s ± 0% 13.3s ± 0% +0.16% (p=0.000 n=24+25) FmtFprintfEmpty-4 397ns ± 0% 394ns ± 0% -0.64% (p=0.000 n=30+30) FmtFprintfString-4 679ns ± 0% 678ns ± 0% ~ (p=0.093 n=30+29) FmtFprintfInt-4 708ns ± 0% 707ns ± 0% -0.19% (p=0.000 n=27+28) FmtFprintfIntInt-4 1.05µs ± 0% 1.05µs ± 0% -0.07% (p=0.001 n=18+30) FmtFprintfPrefixedInt-4 1.16µs ± 0% 1.15µs ± 0% -0.41% (p=0.000 n=29+30) FmtFprintfFloat-4 2.26µs ± 0% 2.23µs ± 1% -1.40% (p=0.000 n=30+30) FmtManyArgs-4 3.96µs ± 0% 3.95µs ± 0% -0.29% (p=0.000 n=29+30) GobDecode-4 52.9ms ± 2% 53.4ms ± 2% +0.92% (p=0.004 n=28+30) GobEncode-4 49.7ms ± 2% 49.8ms ± 2% ~ (p=0.890 n=30+26) Gzip-4 2.61s ± 0% 2.60s ± 0% -0.36% (p=0.000 n=29+29) Gunzip-4 312ms ± 0% 311ms ± 0% -0.13% (p=0.000 n=30+28) HTTPClientServer-4 1.02ms ± 8% 1.00ms ± 7% ~ (p=0.224 n=29+26) JSONEncode-4 125ms ± 1% 124ms ± 3% -1.05% (p=0.000 n=25+30) JSONDecode-4 432ms ± 1% 436ms ± 2% ~ (p=0.277 n=26+30) Mandelbrot200-4 18.4ms ± 0% 18.4ms ± 0% +0.02% (p=0.001 n=28+25) GoParse-4 22.4ms ± 1% 22.3ms ± 1% -0.41% (p=0.000 n=28+28) RegexpMatchEasy0_32-4 697ns ± 0% 706ns ± 0% +1.23% (p=0.000 n=19+30) RegexpMatchEasy0_1K-4 4.27µs ± 0% 4.26µs ± 0% -0.06% (p=0.000 n=30+30) RegexpMatchEasy1_32-4 741ns ± 0% 735ns ± 0% -0.86% (p=0.000 n=26+30) RegexpMatchEasy1_1K-4 5.49µs ± 0% 5.49µs ± 0% -0.03% (p=0.023 n=25+30) RegexpMatchMedium_32-4 1.05µs ± 2% 1.04µs ± 2% ~ (p=0.893 n=30+30) RegexpMatchMedium_1K-4 261µs ± 0% 261µs ± 0% -0.11% (p=0.000 n=29+30) RegexpMatchHard_32-4 14.9µs ± 0% 14.9µs ± 0% -0.36% (p=0.000 n=23+29) RegexpMatchHard_1K-4 446µs ± 0% 445µs ± 0% -0.17% (p=0.000 n=30+29) Revcomp-4 41.6ms ± 1% 41.7ms ± 1% +0.27% (p=0.040 n=28+30) Template-4 531ms ± 0% 532ms ± 1% ~ (p=0.059 n=30+30) TimeParse-4 3.40µs ± 0% 3.33µs ± 0% -2.02% (p=0.000 n=30+30) TimeFormat-4 6.14µs ± 0% 6.11µs ± 0% -0.45% (p=0.000 n=27+29) [Geo mean] 384µs 383µs -0.27% name old speed new speed delta GobDecode-4 14.5MB/s ± 2% 14.4MB/s ± 2% -0.90% (p=0.005 n=28+30) GobEncode-4 15.4MB/s ± 2% 15.4MB/s ± 2% ~ (p=0.741 n=30+25) Gzip-4 7.44MB/s ± 0% 7.47MB/s ± 1% +0.37% (p=0.000 n=25+30) Gunzip-4 62.3MB/s ± 0% 62.4MB/s ± 0% +0.13% (p=0.000 n=30+28) JSONEncode-4 15.5MB/s ± 1% 15.6MB/s ± 3% +1.07% (p=0.000 n=25+30) JSONDecode-4 4.48MB/s ± 0% 4.46MB/s ± 2% ~ (p=0.655 n=23+30) GoParse-4 2.58MB/s ± 1% 2.59MB/s ± 1% +0.42% (p=0.000 n=28+29) RegexpMatchEasy0_32-4 45.9MB/s ± 0% 45.3MB/s ± 0% -1.23% (p=0.000 n=28+30) RegexpMatchEasy0_1K-4 240MB/s ± 0% 240MB/s ± 0% +0.07% (p=0.000 n=30+30) RegexpMatchEasy1_32-4 43.2MB/s ± 0% 43.5MB/s ± 0% +0.85% (p=0.000 n=30+28) RegexpMatchEasy1_1K-4 186MB/s ± 0% 186MB/s ± 0% +0.03% (p=0.026 n=25+30) RegexpMatchMedium_32-4 955kB/s ± 2% 960kB/s ± 2% ~ (p=0.084 n=30+30) RegexpMatchMedium_1K-4 3.92MB/s ± 0% 3.93MB/s ± 0% +0.14% (p=0.000 n=29+30) RegexpMatchHard_32-4 2.14MB/s ± 0% 2.15MB/s ± 0% +0.31% (p=0.000 n=30+26) RegexpMatchHard_1K-4 2.30MB/s ± 0% 2.30MB/s ± 0% ~ (all equal) Revcomp-4 61.1MB/s ± 1% 60.9MB/s ± 1% -0.27% (p=0.039 n=28+30) Template-4 3.66MB/s ± 0% 3.65MB/s ± 1% -0.14% (p=0.045 n=30+30) [Geo mean] 12.8MB/s 12.8MB/s +0.04% Change-Id: I02370e2584b4c041fddd324c97628fd6f0c12183 Reviewed-on: https://go-review.googlesource.com/123179 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-08-20 15:13:23 +00:00
Ben Shi	556971316f	cmd/compile: optimize 386's comparison CMPL/CMPW/CMPB can take a memory operand on 386, and this CL implements that optimization. 1. The total size of pkg/linux_386 decreases about 45KB, excluding cmd/compile. 2. The go1 benchmark shows a little improvement. name old time/op new time/op delta BinaryTree17-4 3.36s ± 2% 3.37s ± 3% ~ (p=0.537 n=40+40) Fannkuch11-4 3.59s ± 1% 3.53s ± 2% -1.58% (p=0.000 n=40+40) FmtFprintfEmpty-4 46.0ns ± 3% 45.8ns ± 3% ~ (p=0.249 n=40+40) FmtFprintfString-4 80.0ns ± 4% 78.8ns ± 3% -1.49% (p=0.001 n=40+40) FmtFprintfInt-4 89.7ns ± 2% 90.3ns ± 2% +0.74% (p=0.003 n=40+40) FmtFprintfIntInt-4 144ns ± 3% 143ns ± 3% -0.95% (p=0.003 n=40+40) FmtFprintfPrefixedInt-4 181ns ± 4% 180ns ± 2% ~ (p=0.103 n=40+40) FmtFprintfFloat-4 412ns ± 3% 408ns ± 4% -0.97% (p=0.018 n=40+40) FmtManyArgs-4 607ns ± 4% 605ns ± 4% ~ (p=0.148 n=40+40) GobDecode-4 7.19ms ± 4% 7.24ms ± 5% ~ (p=0.340 n=40+40) GobEncode-4 7.04ms ± 9% 6.99ms ± 9% ~ (p=0.289 n=40+40) Gzip-4 400ms ± 6% 398ms ± 5% ~ (p=0.168 n=40+40) Gunzip-4 41.2ms ± 3% 41.7ms ± 3% +1.40% (p=0.001 n=40+40) HTTPClientServer-4 62.5µs ± 1% 62.1µs ± 2% -0.61% (p=0.000 n=37+37) JSONEncode-4 20.7ms ± 4% 20.4ms ± 3% -1.60% (p=0.000 n=40+40) JSONDecode-4 69.4ms ± 4% 69.2ms ± 6% ~ (p=0.177 n=40+40) Mandelbrot200-4 5.22ms ± 6% 5.21ms ± 3% ~ (p=0.531 n=40+40) GoParse-4 3.29ms ± 3% 3.28ms ± 3% ~ (p=0.321 n=40+39) RegexpMatchEasy0_32-4 104ns ± 4% 103ns ± 7% -0.89% (p=0.040 n=40+40) RegexpMatchEasy0_1K-4 852ns ± 3% 853ns ± 2% ~ (p=0.357 n=40+40) RegexpMatchEasy1_32-4 113ns ± 8% 113ns ± 3% ~ (p=0.906 n=40+40) RegexpMatchEasy1_1K-4 1.03µs ± 4% 1.03µs ± 5% ~ (p=0.326 n=40+40) RegexpMatchMedium_32-4 136ns ± 3% 133ns ± 3% -2.31% (p=0.000 n=40+40) RegexpMatchMedium_1K-4 44.0µs ± 3% 43.7µs ± 3% ~ (p=0.053 n=40+40) RegexpMatchHard_32-4 2.27µs ± 3% 2.26µs ± 4% ~ (p=0.391 n=40+40) RegexpMatchHard_1K-4 68.0µs ± 3% 68.9µs ± 3% +1.28% (p=0.000 n=40+40) Revcomp-4 1.86s ± 5% 1.86s ± 2% ~ (p=0.950 n=40+40) Template-4 73.4ms ± 4% 69.9ms ± 7% -4.78% (p=0.000 n=40+40) TimeParse-4 449ns ± 4% 441ns ± 5% -1.76% (p=0.000 n=40+40) TimeFormat-4 416ns ± 3% 417ns ± 4% ~ (p=0.304 n=40+40) [Geo mean] 67.7µs 67.3µs -0.55% name old speed new speed delta GobDecode-4 107MB/s ± 4% 106MB/s ± 5% ~ (p=0.336 n=40+40) GobEncode-4 109MB/s ± 5% 110MB/s ± 9% ~ (p=0.142 n=38+40) Gzip-4 48.5MB/s ± 5% 48.8MB/s ± 5% ~ (p=0.172 n=40+40) Gunzip-4 472MB/s ± 3% 465MB/s ± 3% -1.39% (p=0.001 n=40+40) JSONEncode-4 93.6MB/s ± 4% 95.1MB/s ± 3% +1.61% (p=0.000 n=40+40) JSONDecode-4 28.0MB/s ± 3% 28.1MB/s ± 6% ~ (p=0.181 n=40+40) GoParse-4 17.6MB/s ± 3% 17.7MB/s ± 3% ~ (p=0.350 n=40+39) RegexpMatchEasy0_32-4 308MB/s ± 4% 311MB/s ± 6% +0.96% (p=0.025 n=40+40) RegexpMatchEasy0_1K-4 1.20GB/s ± 3% 1.20GB/s ± 2% ~ (p=0.317 n=40+40) RegexpMatchEasy1_32-4 282MB/s ± 7% 282MB/s ± 3% ~ (p=0.516 n=40+40) RegexpMatchEasy1_1K-4 994MB/s ± 4% 991MB/s ± 5% ~ (p=0.319 n=40+40) RegexpMatchMedium_32-4 7.31MB/s ± 3% 7.49MB/s ± 3% +2.46% (p=0.000 n=40+40) RegexpMatchMedium_1K-4 23.3MB/s ± 3% 23.4MB/s ± 3% ~ (p=0.052 n=40+40) RegexpMatchHard_32-4 14.1MB/s ± 3% 14.1MB/s ± 4% ~ (p=0.391 n=40+40) RegexpMatchHard_1K-4 15.1MB/s ± 3% 14.9MB/s ± 3% -1.27% (p=0.000 n=40+40) Revcomp-4 137MB/s ± 5% 137MB/s ± 2% ~ (p=0.942 n=40+40) Template-4 26.5MB/s ± 4% 27.8MB/s ± 7% +5.03% (p=0.000 n=40+40) [Geo mean] 78.6MB/s 79.0MB/s +0.57% Change-Id: Idcacc6881ef57cd7dc33aa87b711282842b72a53 Reviewed-on: https://go-review.googlesource.com/126618 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-08-20 14:23:22 +00:00
Ben Shi	b75c5c5992	cmd/compile: optimize AMD64 with more read-modify-write operations 6 more operations which do read-modify-write with a constant source operand are added. 1. The total size of pkg/linux_amd64 decreases about 3KB, excluding cmd/compile. 2. The go1 benckmark shows a slight improvement. name old time/op new time/op delta BinaryTree17-4 2.61s ± 4% 2.67s ± 2% +2.26% (p=0.000 n=30+29) Fannkuch11-4 2.39s ± 2% 2.32s ± 2% -2.67% (p=0.000 n=30+30) FmtFprintfEmpty-4 44.0ns ± 4% 41.7ns ± 4% -5.15% (p=0.000 n=30+30) FmtFprintfString-4 74.2ns ± 4% 72.3ns ± 4% -2.59% (p=0.000 n=30+30) FmtFprintfInt-4 81.7ns ± 3% 78.8ns ± 4% -3.54% (p=0.000 n=27+30) FmtFprintfIntInt-4 130ns ± 4% 124ns ± 5% -4.60% (p=0.000 n=30+30) FmtFprintfPrefixedInt-4 154ns ± 3% 152ns ± 3% -1.13% (p=0.012 n=30+30) FmtFprintfFloat-4 215ns ± 4% 212ns ± 5% -1.56% (p=0.002 n=30+30) FmtManyArgs-4 522ns ± 3% 512ns ± 3% -1.84% (p=0.001 n=30+30) GobDecode-4 6.42ms ± 5% 6.49ms ± 7% ~ (p=0.070 n=30+30) GobEncode-4 6.07ms ± 8% 5.98ms ± 8% ~ (p=0.150 n=30+30) Gzip-4 236ms ± 4% 223ms ± 4% -5.57% (p=0.000 n=30+30) Gunzip-4 37.4ms ± 3% 36.7ms ± 4% -2.03% (p=0.000 n=30+30) HTTPClientServer-4 58.7µs ± 1% 58.5µs ± 2% -0.37% (p=0.018 n=30+29) JSONEncode-4 12.0ms ± 4% 12.1ms ± 3% ~ (p=0.112 n=30+30) JSONDecode-4 54.5ms ± 3% 55.5ms ± 4% +1.80% (p=0.006 n=30+30) Mandelbrot200-4 3.78ms ± 4% 3.78ms ± 4% ~ (p=0.173 n=30+30) GoParse-4 3.16ms ± 5% 3.22ms ± 5% +1.75% (p=0.010 n=30+30) RegexpMatchEasy0_32-4 76.6ns ± 1% 75.9ns ± 3% ~ (p=0.672 n=25+30) RegexpMatchEasy0_1K-4 252ns ± 3% 253ns ± 3% +0.57% (p=0.027 n=30+30) RegexpMatchEasy1_32-4 69.8ns ± 4% 70.2ns ± 6% ~ (p=0.539 n=30+30) RegexpMatchEasy1_1K-4 374ns ± 3% 373ns ± 5% ~ (p=0.263 n=30+30) RegexpMatchMedium_32-4 107ns ± 4% 109ns ± 3% ~ (p=0.067 n=30+30) RegexpMatchMedium_1K-4 33.9µs ± 5% 34.1µs ± 4% ~ (p=0.297 n=30+30) RegexpMatchHard_32-4 1.54µs ± 3% 1.56µs ± 4% +1.43% (p=0.002 n=30+30) RegexpMatchHard_1K-4 46.6µs ± 3% 47.0µs ± 3% ~ (p=0.055 n=30+30) Revcomp-4 411ms ± 6% 407ms ± 6% ~ (p=0.219 n=30+30) Template-4 66.8ms ± 3% 64.8ms ± 5% -3.01% (p=0.000 n=30+30) TimeParse-4 312ns ± 2% 319ns ± 3% +2.50% (p=0.000 n=30+30) TimeFormat-4 296ns ± 5% 299ns ± 3% +0.93% (p=0.005 n=30+30) [Geo mean] 47.5µs 47.1µs -0.75% name old speed new speed delta GobDecode-4 120MB/s ± 5% 118MB/s ± 6% ~ (p=0.072 n=30+30) GobEncode-4 127MB/s ± 8% 129MB/s ± 8% ~ (p=0.150 n=30+30) Gzip-4 82.1MB/s ± 4% 87.0MB/s ± 4% +5.90% (p=0.000 n=30+30) Gunzip-4 519MB/s ± 4% 529MB/s ± 4% +2.07% (p=0.001 n=30+30) JSONEncode-4 162MB/s ± 4% 161MB/s ± 3% ~ (p=0.110 n=30+30) JSONDecode-4 35.6MB/s ± 3% 35.0MB/s ± 4% -1.77% (p=0.007 n=30+30) GoParse-4 18.3MB/s ± 4% 18.0MB/s ± 4% -1.72% (p=0.009 n=30+30) RegexpMatchEasy0_32-4 418MB/s ± 1% 422MB/s ± 3% ~ (p=0.645 n=25+30) RegexpMatchEasy0_1K-4 4.06GB/s ± 3% 4.04GB/s ± 3% -0.57% (p=0.033 n=30+30) RegexpMatchEasy1_32-4 459MB/s ± 4% 456MB/s ± 6% ~ (p=0.530 n=30+30) RegexpMatchEasy1_1K-4 2.73GB/s ± 3% 2.75GB/s ± 5% ~ (p=0.279 n=30+30) RegexpMatchMedium_32-4 9.28MB/s ± 5% 9.18MB/s ± 4% ~ (p=0.086 n=30+30) RegexpMatchMedium_1K-4 30.2MB/s ± 4% 30.0MB/s ± 4% ~ (p=0.300 n=30+30) RegexpMatchHard_32-4 20.8MB/s ± 3% 20.5MB/s ± 4% -1.41% (p=0.002 n=30+30) RegexpMatchHard_1K-4 22.0MB/s ± 3% 21.8MB/s ± 3% ~ (p=0.051 n=30+30) Revcomp-4 619MB/s ± 7% 625MB/s ± 7% ~ (p=0.219 n=30+30) Template-4 29.0MB/s ± 3% 29.9MB/s ± 4% +3.11% (p=0.000 n=30+30) [Geo mean] 123MB/s 123MB/s +0.28% Change-Id: I850652cfd53329c1af804b7f57f4393d8097bb0d Reviewed-on: https://go-review.googlesource.com/121135 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-08-20 14:18:39 +00:00
Richard Musiol	81555cb4f3	cmd/compile/internal/gc: add nil check for closure call on wasm This commit adds an explicit nil check for closure calls on wasm, so calling a nil func causes a proper panic instead of crashing on the WebAssembly level. Change-Id: I6246844f316677976cdd420618be5664444c25ae Reviewed-on: https://go-review.googlesource.com/127759 Run-TryBot: Richard Musiol <neelance@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-08-14 09:19:38 +00:00
Ben Shi	de0e72610b	test/codegen: add more combined store tests for arm64 Some combined store optimization was already implemented in go-1.11, but there is no corresponding test cases. Change-Id: Iebdad186e92047942e53a74f2c20b390922e1e9c Reviewed-on: https://go-review.googlesource.com/122915 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-08-02 04:23:45 +00:00
Keith Randall	5fc70b6fac	cmd/compile: set stricter inlining threshold in large functions If we're compiling a large function, be more picky about how big the function we're inlining is. If the function is >5000 nodes, we lower the inlining threshold from a cost of 80 to 20. Turns out reflect.Value's cost is exactly 80. That's the function at issue in #26546. 20 was chosen as a proxy for "inlined body is smaller than the call would be". Simple functions still get inlined, like this one at cost 7: func ifaceIndir(t rtype) bool { return t.kind&kindDirectIface == 0 } 5000 nodes was chosen as the big function size. Here are all the 5000+ node (~~1000+ lines) functions in the stdlib: 5187 cmd/internal/obj/arm (ctxt5).asmout 6879 cmd/internal/obj/s390x (ctxtz).asmout 6567 cmd/internal/obj/ppc64 (ctxt9).asmout 9643 cmd/internal/obj/arm64 (ctxt7).asmout 5042 cmd/internal/obj/x86 (AsmBuf).doasm 8768 cmd/compile/internal/ssa rewriteBlockAMD64 8878 cmd/compile/internal/ssa rewriteBlockARM 8344 cmd/compile/internal/ssa rewriteValueARM64_OpARM64OR_20 7916 cmd/compile/internal/ssa rewriteValueARM64_OpARM64OR_30 5427 cmd/compile/internal/ssa rewriteBlockARM64 5126 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_50 6152 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_60 6412 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_70 6486 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_80 6534 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_90 6534 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_100 6534 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_110 6675 cmd/compile/internal/gc typecheck1 5433 cmd/compile/internal/gc walkexpr 14070 cmd/vendor/golang.org/x/arch/arm64/arm64asm decodeArg There are a lot more smaller (~1000 node) functions in the stdlib. The function in #26546 has 12477 nodes. At some point it might be nice to have a better heuristic for "inlined body is smaller than the call", a non-cliff way to scale down the cost as the function gets bigger, doing cheaper inlined calls first, etc. All that can wait for another release. I'd like to do this CL for 1.11. Fixes #26546 Update #17566 Change-Id: Idda13020e46ec2b28d79a17217f44b189f8139ac Reviewed-on: https://go-review.googlesource.com/125516 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-07-24 16:11:08 +00:00
Cherry Zhang	48c79734ff	test: add test for gccgo bug #26495 Gccgo produced incorrect order of evaluation for expressions involving &&, \|\| subexpressions. The fix is CL 125299. Updates #26495. Change-Id: I18d873281709f3160b3e09f0b2e46f5c120e1cab Reviewed-on: https://go-review.googlesource.com/125301 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-07-20 20:08:15 +00:00
Keith Randall	59d0baeb33	cmd/compile: add test for OPmodify ops clobbering flags Code fix was in CL 122556. This is a corresponding test case. Fixes #26426 Change-Id: Ib8769f367aed8bead029da0a8d2ddccee1d1dccb Reviewed-on: https://go-review.googlesource.com/124535 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-07-19 16:24:53 +00:00
Daniel Martí	ba6974fdc3	cmd/compile: fix crash on invalid struct literal If one tries to use promoted fields in a struct literal, the compiler errors correctly. However, if the embedded fields are of struct pointer type, the field.Type.Sym.Name expression below panics. This is because field.Type.Sym is nil in that case. We can simply use field.Sym.Name in this piece of code though, as it only concerns embedded fields, in which case what we are after is the field name. Added a test mirroring fixedbugs/issue23609.go, but with pointer types. Fixes #26416. Change-Id: Ia46ce62995c9e1653f315accb99d592aff2f285e Reviewed-on: https://go-review.googlesource.com/124395 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-07-18 20:32:04 +00:00
Ben Shi	f6ce1e2aa5	cmd/compile: fix an arm64's comparison bug The arm64 backend generates "TST" for "if uint32(a)&uint32(b) == 0", which should be "TSTW". fixes #26438 Change-Id: I7d64c30e3a840b43486bcd10eea2e3e75aaa4857 Reviewed-on: https://go-review.googlesource.com/124637 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-07-18 14:15:05 +00:00
Michael Munday	1546ab5a39	cmd/compile: keep autos if their address reaches a control value Autos must be kept if their address reaches the control value of a block. We didn't see this before because it is rare for an auto's address to reach a control value without also reaching a phi or being written to memory. We can probably optimize away the comparisons that lead to this scenario since autos cannot alias with pointers from elsewhere, however for now we take the conservative approach and just ensure the auto is properly initialised if its address reaches a control value. Fixes #26407. Change-Id: I02265793f010a9e001c3e1a5397c290c6769d4de Reviewed-on: https://go-review.googlesource.com/124335 Reviewed-by: David Chase <drchase@google.com>	2018-07-17 14:58:54 +00:00
Keith Randall	85cfa4d55e	cmd/compile: handle degenerate write barrier case If both branches of a write barrier test go to the same block, then there's no unsafe points. This can only happen if the resulting memory state is somehow dead, which can only occur in degenerate cases, like infinite loops. No point in cleaning up the useless branch in these situations. Fixes #26024. Change-Id: I93a7df9fdf2fc94c6c4b1fe61180dc4fd4a0871f Reviewed-on: https://go-review.googlesource.com/123655 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-07-12 19:06:09 +00:00
David Chase	0029cd479e	cmd/compile: add LocalAddr that takes SP,mem operands Lack of a well-defined order between VarDef and related address operations sometimes causes problems with store order and write barrier transformations; glitches in the order are made irreparable (by later optimizations) if the two parts of the glitch straddle a split in the original block caused by insertion of a write barrier diamond. Fix this by creating a LocalAddr for addresses of locals (what VarDef matters for) that takes a memory input to help make the order explicit. Addr is modified to only be legal for SB operand, so there is no overlap between Addr and LocalAddr uses (there may be some downstream cleanup from this). Changes to generic.rules and rewrite.go ensure that codegen tests continue to pass; CSE of LocalAddr is impaired, not quite sure of the cost. Fixes #26105. Change-Id: Id4192b4440aa4e9d7ba54a465c456df9b530b515 Reviewed-on: https://go-review.googlesource.com/122483 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-07-12 18:45:31 +00:00
Ian Lance Taylor	964c15f360	test: add test of valid code that gccgo failed to compile Updates #26340 Change-Id: I3bc7cd544ea77df660bbda7de99a009b63d5be1b Reviewed-on: https://go-review.googlesource.com/123477 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-07-12 05:29:12 +00:00
Matthew Dempsky	cc422e64d0	cmd/compile: fix ICE due to missing inline function body For golang.org/cl/74110, I forgot that you can use range-based for loops to extract key values from a map value. This wasn't a problem for the binary format importer, because it was more tolerant about missing inline function bodies. However, the indexed importer is more particular about this. We could potentially just make it more lenient like the binary importer, but tweaking the logic here is easy enough and seems like the preferable solution. Fixes #26341. Change-Id: I54564dcd0be60ea393f8a0f6954b7d3d61e96ee5 Reviewed-on: https://go-review.googlesource.com/123475 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-07-12 00:13:37 +00:00
Ian Lance Taylor	cda1947fd1	runtime: don't say "different packages" if they may not be different Fix the panic message produced for an interface conversion error to only say "types from different packages" if they are definitely from different packges. If they may be from the same package, say "types from different scopes." Updates #18911 Fixes #26094 Change-Id: I0cea50ba31007d88e70c067b4680009ede69bab9 Reviewed-on: https://go-review.googlesource.com/123395 Reviewed-by: Austin Clements <austin@google.com>	2018-07-11 21:34:05 +00:00
Ian Lance Taylor	787ff17dea	test: add test case that failed with gccgo Updates #26335 Change-Id: Ibfb1e232a0c66fa699842c8908ae5ff0f5d2177d Reviewed-on: https://go-review.googlesource.com/123316 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-07-11 18:14:35 +00:00
Ian Lance Taylor	d49572eaf3	test: add order of evaluation test case that gccgo got wrong Updates #23188 Change-Id: Idc5567546d1c4c592f997a4cebbbf483b85331e0 Reviewed-on: https://go-review.googlesource.com/123115 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-07-10 21:30:26 +00:00
Keith Randall	be9c994609	internal/bytealg: specify argmaps for exported functions Functions exported on behalf of other packages need to have their argument stack maps specified explicitly. They don't get an implicit map because they are not in the local package, and if they get defer'd they need argument maps. Fixes #24419 Change-Id: I35b7d8b4a03d4770ba88699e1007cb3fcb5397a9 Reviewed-on: https://go-review.googlesource.com/122676 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-07-10 17:13:53 +00:00
Ben Shi	f9800a9473	test/codegen: add more test cases for arm64 More test cases of combined load for arm64. Change-Id: I7a9f4dcec6930f161cbded1f47dbf7fcef1db4f1 Reviewed-on: https://go-review.googlesource.com/122582 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-07-10 15:41:15 +00:00
Cherry Zhang	c801232525	cmd/compile: make sure alg functions are generated when we call them When DWARF is disabled, some alg functions were not generated. Make sure they are generated when we about to generate calls to them. Fixes #23546. Change-Id: Iecfa0eea830e42ee92e55268167cefb1540980b2 Reviewed-on: https://go-review.googlesource.com/122403 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-07-10 01:20:45 +00:00
Cherry Zhang	5f256dc8e6	test: add test for gccgo bug #26248 The fix is CL 122756. Updates #26248. Change-Id: Ic4250ab5d01da9f65d0bc033e2306343d9c87a99 Reviewed-on: https://go-review.googlesource.com/122757 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-07-10 01:19:37 +00:00
Keith Randall	58d287e5e8	cmd/compile: ensure that loop condition is detected correctly We need to make sure that the terminating comparison has the right sense given the increment direction. If the increment is positive, the terminating comparsion must be < or <=. If the increment is negative, the terminating comparison must be > or >=. Do a few cleanups, like constant-folding entry==0, adding comments, removing unused "exported" fields. Fixes #26116 Change-Id: I14230ee8126054b750e2a1f2b18eb8f09873dbd5 Reviewed-on: https://go-review.googlesource.com/121940 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-07-09 18:23:39 +00:00
Matthew Dempsky	8d5fd871d7	cmd/compile: fix "width not calculated" ICE Expanding interface method sets is handled during width calculation, which can't be performed concurrently. Make sure that we eagerly expand interfaces in the frontend when importing them, even if they're not actually used by code, because we might need to generate a type description of them. Fixes #25055. Change-Id: I6fd2756de2c7d5dbc33056f70b3028ca3aebab41 Reviewed-on: https://go-review.googlesource.com/122517 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-07-06 20:25:52 +00:00
Cherry Zhang	3f54e8537a	cmd/compile: run generic deadcode in -N mode Late opt pass may generate dead stores, which messes up store chain calculation in later passes. Run generic deadcode even in -N mode to remove them. Fixes #26163. Change-Id: I8276101717bb978d5980e6c7998f53fd8d0ae10f Reviewed-on: https://go-review.googlesource.com/121856 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-07-02 20:53:23 +00:00
Michael Munday	adfa8b8691	cmd/compile: keep autos whose address reaches a phi If the address of an auto reaches a phi then any further stores to the pointer represented by the phi probably need to be kept. This is because stores to the other arguments to the phi may be visible to the program. Fixes #26153. Change-Id: Ic506c6c543bf70d792e5b1a64bdde1e5fdf1126a Reviewed-on: https://go-review.googlesource.com/121796 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-07-02 19:43:07 +00:00
David Chase	d4d8237a5b	Revert "cmd/compile: make OpAddr depend on VarDef in storeOrder" This reverts commit `1a27f048ad`. Reason for revert: Broke the ssacheck and -N-l builders, and the -N-l fix looks like it will take some time and take a different route entirely. Change-Id: Ie0ac5e86ab7d72a303dfbbc48dfdf1e092d4f61a Reviewed-on: https://go-review.googlesource.com/121715 Reviewed-by: David Chase <drchase@google.com>	2018-06-29 19:48:48 +00:00
David Chase	1a27f048ad	cmd/compile: make OpAddr depend on VarDef in storeOrder Given a carefully constructed input, writebarrier would split a block with the OpAddr in the first half and the VarDef in the second half which ultimately leads to a compiler crash because the scheduler is no longer able to put them in the proper order. To fix, recognize the implicit dependence of OpAddr on the VarDef of the same symbol if any exists. This fix was chosen over making OpAddr take a memory operand to make the dependence explicit, because this change is less invasive at this late part of the 1.11 release cycle. Fixes #26105. Change-Id: I9b65460673af3af41740ef877d2fca91acd336bc Reviewed-on: https://go-review.googlesource.com/121436 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-06-29 15:20:52 +00:00
Cherry Zhang	d21bdf125c	cmd/compile: check SSAability in handling of INDEX of 1-element array SSA can handle 1-element array, but only when the element type is SSAable. When building SSA for INDEX of 1-element array, we did not check the element type is SSAable. And when it's not, it resulted in an unhandled SSA op. Fixes #26120. Change-Id: Id709996b5d9d90212f6c56d3f27eed320a4d8360 Reviewed-on: https://go-review.googlesource.com/121496 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-06-29 12:05:05 +00:00
Robert Griesemer	b410ce750e	cmd/compile: don't crash in untyped expr to interface conversion Fixes #24763. Change-Id: Ibe534271d75b6961d00ebfd7d42c43a3ac650d79 Reviewed-on: https://go-review.googlesource.com/121335 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-06-28 23:01:22 +00:00
Ilya Tocar	b71ea0b7dd	cmd/compile: mark CMOVLEQF, CMOVWEQF as cloberring AX Code generation for OpAMD64CMOV[WLQ]EQF uses AX as a scratch register, but only CMOVQEQF, correctly lets compiler know. Mark other 2 as clobbering AX. Fixes #26097 Change-Id: I2a65bd67bf18a540898b4a0ae6c8766e0b767b19 Reviewed-on: https://go-review.googlesource.com/121336 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-06-28 18:39:56 +00:00
Cherry Zhang	1d303a0086	cmd/compile: fold offset into address on Wasm On Wasm, the offset was not folded into LoweredAddr, so it was not rematerializeable. This led to the address-taken operation in some cases generated too early, before the local variable becoming live. The liveness code thinks the variable live when the address is taken, then backs it up to live at function entry, then complains about it, because nothing other than arguments should be live on entry. This CL folds the offset into the address operation, so it is rematerializeable and so generated right before use, after the variable actually becomes live. It might be possible to relax the liveness code not to think a variable live when its address being taken, but until the address actually being used. But it would be quite complicated. As we're late in Go 1.11 freeze, it would be better not to do it. Also, I think the address operation is rematerializeable now on all architectures, so this is probably less necessary. This may also be a slight optimization, as the address+offset is now rematerializeable, which can be generated on the Wasm stack, without using any "registers" which are emulated by local variables on Wasm. I don't know how to do benchmarks on Wasm. At least, cmd/go binary size shrinks 9K. Fixes #25966. Change-Id: I01e5869515d6a3942fccdcb857f924a866876e57 Reviewed-on: https://go-review.googlesource.com/120599 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Richard Musiol <neelance@gmail.com>	2018-06-27 14:30:00 +00:00
David Chase	7e9d55eeeb	cmd/compile: avoid remainder in loopbce when increment=0 For non-unit increment, loopbce checks to see if the increment evenly divides the difference between (constant) loop start and end. This test panics when the increment is zero. Fix: check for zero, if found, don't optimize the loop. Also added missing copyright notice to loopbce.go. Fixes #26043. Change-Id: I5f460104879cacc94481949234c9ce8c519d6380 Reviewed-on: https://go-review.googlesource.com/120759 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-06-25 20:36:42 +00:00
Matthew Dempsky	f422bea498	cmd/compile: fix compile failure for lazily resolved shadowed types If expanding an inline function body required lazily expanding a package-scoped type whose identifier was shadowed within the function body, the lazy expansion would instead overwrite the local symbol definition instead of the package-scoped symbol. This was due to importsym using s.Def instead of s.PkgDef. Unfortunately, this is yet another consequence of the current awkward scope handling code. Passes toolstash-check. Fixes #25984. Change-Id: Ia7033e1749a883e6e979c854d4b12b0b28083dd8 Reviewed-on: https://go-review.googlesource.com/120456 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-06-22 17:57:09 +00:00
Cherry Zhang	78a579316b	cmd/compile: convert uint32 to int32 in ARM constant folding rules MOVWconst's AuxInt is Int32. SSA check complains if the AuxInt does not fit in int32. Convert uint32 to int32 to make it happy. The generated code is unchanged. MOVW only cares low 32 bits. Passes "toolstash -cmp" std cmd for ARM. Fixes #25993. Change-Id: I2b6532c9c285ea6d89652505fb7c553f85a98864 Reviewed-on: https://go-review.googlesource.com/120335 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-06-22 16:25:32 +00:00
David Chase	c6e455bb11	cmd/compile: conditional on -race, disable inline of go:norace Adds the appropriate check to inl.go. Includes tests of both -race+go:norace and plain go:norace. Fixes #24651. Change-Id: Id806342430c20baf4679a985d12eea3b677092e0 Reviewed-on: https://go-review.googlesource.com/119195 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-06-19 20:59:10 +00:00
Robert Griesemer	707ca18d97	cmd/compile: more accurate position for select case error message Fixes #25958. Change-Id: I1f4808a70c20334ecfc4eb1789f5389d94dcf00e Reviewed-on: https://go-review.googlesource.com/119755 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-06-19 17:52:23 +00:00
David Chase	c359d759a7	cmd/compile: ensure that operand of ORETURN is not double-walked Inlining of switch statements into a RETURNed expression can sometimes lead to the switch being walked twice, which results in a miscompiled switch statement. The bug depends on: 1) multiple results 2) named results 3) a return statement whose expression includes a call to a function containing a switch statement that is inlined. It may also be significant that the default case of that switch is a panic(), though that's not proven. Rearranged the walk case for ORETURN so that double walks are not possible. Added a test, because this is so fiddly. Added a check against double walks, verified that it fires w/o other fix. Fixes #25776. Change-Id: I2d594351fa082632512ef989af67eb887059729b Reviewed-on: https://go-review.googlesource.com/118318 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-06-14 20:08:10 +00:00
Emmanuel T Odeke	7bc99ffa05	cmd/compile: make case insensitive suggestions aware of package Ensure that compiler error suggestions after case insensitive field lookups don't mistakenly reported unexported fields if those fields aren't in the local package being processed. Fixes #25727 Change-Id: Icae84388c2a82c8cb539f3d43ad348f50a644caa Reviewed-on: https://go-review.googlesource.com/117755 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-06-13 23:12:54 +00:00
Brad Fitzpatrick	39ad208c13	test: add test to verify that string copies don't get optimized away Fixes #25834 Change-Id: I33e58dabfd04b84dfee1a9a3796796b5d19862e7 Reviewed-on: https://go-review.googlesource.com/118295 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-06-12 19:10:34 +00:00
Robert Griesemer	48987baa09	cmd/compile: correct alias cycle detection The original fix (https://go-review.googlesource.com/c/go/+/35831) for this issue was incorrect as it reported cycles in cases where it shouldn't. Instead, use a different approach: A type cycle containing aliases is only a cycle if there are no type definitions. As soon as there is a type definition, alias expansion terminates and there is no cycle. Approach: Split sprint_depchain into two non-recursive and more easily understandable functions (cycleFor and cycleTrace), and use those instead for cycle reporting. Analyze the cycle returned by cycleFor before issueing an alias cycle error. Also: Removed original fix (main.go) which introduced a separate crash (#23823). Fixes #18640. Fixes #23823. Fixes #24939. Change-Id: Ic3707a9dec40a71dc928a3e49b4868c5fac3d3b7 Reviewed-on: https://go-review.googlesource.com/118078 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-06-12 18:55:39 +00:00
David Chase	c08b01ecb4	cmd/compile: fix panic-okay-to-inline change; adjust tests This line of the inlining tuning experiment https://go-review.googlesource.com/c/go/+/109918/1/src/cmd/compile/internal/gc/inl.go#347 was incorrectly rewritten in a later patch to use the call cost, not the panic cost, and thus the inlining of panic didn't occur when it should. I discovered this when I realized that tests should have failed, but didn't. Fix is to make the correct change, and also to modify the tests that this causes to fail. One test now asserts the new normal, the other calls "ppanic" instead which is designed to behave like panic but not be inlined. Change-Id: I423bb7f08bd66a70d999826dd9b87027abf34cdf Reviewed-on: https://go-review.googlesource.com/116656 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-06-06 20:35:23 +00:00
Yury Smolsky	7c1f361e25	test: add comments for all test actions Every action has a short annotation. The errorCheck function has a comment adapted from errchk script. Removed redundant assigments to tmpDir. Change-Id: Ifdd1284de046a0ce2aad26bd8da8a8e6a7707a8e Reviewed-on: https://go-review.googlesource.com/115856 Run-TryBot: Yury Smolsky <yury@smolsky.by> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-06-06 15:12:32 +00:00
Richard Musiol	88dc4aee7c	cmd/compile: fix OffPtr with negative offset on wasm The wasm archtecture was missing a rule to handle OffPtr with a negative offset. This commit makes it so OffPtr always gets lowered to I64AddConst. Fixes #25741 Change-Id: I1d48e2954e3ff31deb8cba9a9bf0cab7c4bab71a Reviewed-on: https://go-review.googlesource.com/116595 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com>	2018-06-06 14:40:25 +00:00
Robert Griesemer	9be4f312bf	cmd/compile: revert internal parameter rename (from ".anonX" to "") before export In the old binary export format, parameter names for parameter lists which contained only types where never written, so this problem didn't come up. Fixes #25101. Change-Id: Ia8b817f7f467570b05f88d584e86b6ef4acdccc6 Reviewed-on: https://go-review.googlesource.com/116376 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-06-05 20:22:07 +00:00
Robert Griesemer	4a2bec9726	cmd/compile: fix printing of array types in error messages Fixes #23094. Change-Id: I9aa36046488baa5f55cf2099e10cfb39ecd17b06 Reviewed-on: https://go-review.googlesource.com/116256 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-06-05 00:43:22 +00:00
Robert Griesemer	75c1aed345	runtime: slightly better error message for assertion panics with identical looking types Fixes #18911. Change-Id: Ice10f37460a4f0a66cddeacfe26c28045f5e60fe Reviewed-on: https://go-review.googlesource.com/116255 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-06-05 00:29:50 +00:00
Keith Randall	06b326054d	cmd/compile: include callee args section when checking frame too large The stack frame includes the callee args section. At the point where we were checking the frame size, that part of the frame had not been computed yet. Move the check later so we can include the callee args size. Fixes #20780 Update #25507 Change-Id: Iab97cb89b3a24f8ca19b9123ef2a111d6850c3fe Reviewed-on: https://go-review.googlesource.com/115195 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-06-02 18:00:44 +00:00
Yury Smolsky	7cb1810fe8	test: skip test/fixedbugs/bug345.go on windows Before the CL 115277 we did not run the test on Windows, so let's just go back to not running the test on Windows. There is nothing OS-specific about this test, so skipping it on Windows doesn't seem like a big deal. Updates #25693 Fixes #25586 Change-Id: I1eb3e158b322d73e271ef388f8c6e2f2af0a0729 Reviewed-on: https://go-review.googlesource.com/115857 Run-TryBot: Yury Smolsky <yury@smolsky.by> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-06-01 21:46:07 +00:00
Yury Smolsky	57d40f1b27	test: remove rundircmpout and cmpout actions This CL removes the rundircmpout action completely because it is not used anywhere. The run case already looks for output files. Rename the cmpout action mentioned in tests to the run action and remove "cmpout" from run.go. Change-Id: I835ceb70082927f8e9360e0ea0ba74f296363ab3 Reviewed-on: https://go-review.googlesource.com/115575 Run-TryBot: Yury Smolsky <yury@smolsky.by> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-31 17:36:45 +00:00
Yury Smolsky	caf968616a	test: eliminate use of Perl in fixedbugs/bug345.go To allow testing of fixedbugs/bug345.go in Go, a new flag -n is introduced. This flag disables setting of relative path for local imports and imports search path to current dir, namely -D . -I . are not passed to the compiler. Error regexps are fixed to allow running the test in temp directory. This change eliminates the last place where Perl script "errchk" was used. Fixes #25586. Change-Id: If085f466e6955312d77315f96d3ef1cb68495aef Reviewed-on: https://go-review.googlesource.com/115277 Run-TryBot: Yury Smolsky <yury@smolsky.by> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-05-31 13:29:50 +00:00
Keith Randall	db9341a024	cmd/compile: update WBLoads during deadcode When we deadcode-remove a block which is a write barrier test, remove that block from the list of write barrier test blocks. Fixes #25516 Change-Id: I1efe732d5476003eab4ad6bf67d0340d7874ff0c Reviewed-on: https://go-review.googlesource.com/115037 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-05-29 17:45:36 +00:00
Keith Randall	d15d055054	cmd/compile: reject large argument areas Extend stack frame limit of 1GB to include large argument/return areas. Argument/return areas are part of the parent frame, not the frame itself, so they need to be handled separately. Fixes #25507. Change-Id: I309298a58faee3e7c1dac80bd2f1166c82460087 Reviewed-on: https://go-review.googlesource.com/115036 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-05-29 16:58:05 +00:00
Josh Bleecher Snyder	002c764533	test: gofmt bounds.go Change-Id: I8b462e20064658120afc8eb1cbac926254d1e24e Reviewed-on: https://go-review.googlesource.com/114937 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-05-29 02:39:16 +00:00
Yury Smolsky	cb80c28961	test: eliminate use of Perl in test/fixedbugs/bug248.go This change enables bug248 to be tested with Go code. For that, it adds a flag -1 to error check and run directory with one package failing compilation prior the last package which should be run. Specifically, the "p" package in bug1.go file was renamed into "q" to compile them in separate steps, bug2.go and bug3.go files were reordered, bug2.go was changed into non-main package. Updates #25586. Change-Id: Ie47aacd56ebb2ce4eac66c792d1a53e1e30e637c Reviewed-on: https://go-review.googlesource.com/114818 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-26 15:20:42 +00:00
Ben Shi	8a85bce215	test/codegen: improve test cases for arm64 1. Some incorrect test cases are disabled. 2. Some wrong test cases are corrected. 3. Some new test cases are added. Change-Id: Ib5d0473d55159f233ddab79f96967eaec7b08597 Reviewed-on: https://go-review.googlesource.com/113736 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-05-22 14:50:41 +00:00
Austin Clements	a367f44c18	cmd/compile: enable stack maps everywhere except unsafe points This modifies issafepoint in liveness analysis to report almost every operation as a safe point. There are four things we don't mark as safe-points: 1. Runtime code (other than at calls). 2. go:nosplit functions (other than at calls). 3. Instructions between the load of the write barrier-enabled flag and the write. 4. Instructions leading up to a uintptr -> unsafe.Pointer conversion. We'll optimize this in later CLs: name old time/op new time/op delta Template 185ms ± 2% 190ms ± 2% +2.95% (p=0.000 n=10+10) Unicode 96.3ms ± 3% 96.4ms ± 1% ~ (p=0.905 n=10+9) GoTypes 658ms ± 0% 669ms ± 1% +1.72% (p=0.000 n=10+9) Compiler 3.14s ± 1% 3.18s ± 1% +1.56% (p=0.000 n=9+10) SSA 7.41s ± 2% 7.59s ± 1% +2.48% (p=0.000 n=9+10) Flate 126ms ± 1% 128ms ± 1% +2.08% (p=0.000 n=10+10) GoParser 153ms ± 1% 157ms ± 2% +2.38% (p=0.000 n=10+10) Reflect 437ms ± 1% 442ms ± 1% +0.98% (p=0.001 n=10+10) Tar 178ms ± 1% 179ms ± 1% +0.67% (p=0.035 n=10+9) XML 223ms ± 1% 229ms ± 1% +2.58% (p=0.000 n=10+10) [Geo mean] 394ms 401ms +1.75% No effect on binary size because we're not yet emitting these extra safe points. For #24543. Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859 Reviewed-on: https://go-review.googlesource.com/109348 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-05-22 14:43:37 +00:00
Austin Clements	837ed98d63	cmd/compile: don't produce a past-the-end pointer in range loops Currently, range loops over slices and arrays are compiled roughly like: for i, x := range s { b } ⇓ for i, _n, _p := 0, len(s), &s[0]; i < _n; i, _p = i+1, _p + unsafe.Sizeof(s[0]) { b } ⇓ i, _n, _p := 0, len(s), &s[0] goto cond body: { b } i, _p = i+1, _p + unsafe.Sizeof(s[0]) cond: if i < _n { goto body } else { goto end } end: The problem with this lowering is that _p may temporarily point past the end of the allocation the moment before the loop terminates. Right now this isn't a problem because there's never a safe-point during this brief moment. We're about to introduce safe-points everywhere, so this bad pointer is going to be a problem. We could mark the increment as an unsafe block, but this inhibits reordering opportunities and could result in infrequent safe-points if the body is short. Instead, this CL fixes this by changing how we compile range loops to never produce this past-the-end pointer. It changes the lowering to roughly: i, _n, _p := 0, len(s), &s[0] if i < _n { goto body } else { goto end } top: _p += unsafe.Sizeof(s[0]) body: { b } i++ if i < _n { goto top } else { goto end } end: Notably, the increment is split into two parts: we increment the index before checking the condition, but increment the pointer only after the condition check has succeeded. The implementation builds on the OFORUNTIL construct that was introduced during the loop preemption experiments, since OFORUNTIL places the increment and condition after the loop body. To support the extra "late increment" step, we further define OFORUNTIL's "List" field to contain the late increment statements. This makes all of this a relatively small change. This depends on the improvements to the prove pass in CL 102603. With the current lowering, bounds-check elimination knows that i < _n in the body because the body block is dominated by the cond block. In the new lowering, deriving this fact requires detecting that i < _n on both paths into body and hence is true in body. CL 102603 made prove able to detect this. The code size effect of this is minimal. The cmd/go binary on linux/amd64 increases by 0.17%. Performance-wise, this actually appears to be a net win, though it's mostly noise: name old time/op new time/op delta BinaryTree17-12 2.80s ± 0% 2.61s ± 1% -6.88% (p=0.000 n=20+18) Fannkuch11-12 2.41s ± 0% 2.42s ± 0% +0.05% (p=0.005 n=20+20) FmtFprintfEmpty-12 41.6ns ± 5% 41.4ns ± 6% ~ (p=0.765 n=20+19) FmtFprintfString-12 69.4ns ± 3% 69.3ns ± 1% ~ (p=0.084 n=19+17) FmtFprintfInt-12 76.1ns ± 1% 77.3ns ± 1% +1.57% (p=0.000 n=19+19) FmtFprintfIntInt-12 122ns ± 2% 123ns ± 3% +0.95% (p=0.015 n=20+20) FmtFprintfPrefixedInt-12 153ns ± 2% 151ns ± 3% -1.27% (p=0.013 n=20+20) FmtFprintfFloat-12 215ns ± 0% 216ns ± 0% +0.47% (p=0.000 n=20+16) FmtManyArgs-12 486ns ± 1% 498ns ± 0% +2.40% (p=0.000 n=20+17) GobDecode-12 6.43ms ± 0% 6.50ms ± 0% +1.08% (p=0.000 n=18+19) GobEncode-12 5.43ms ± 1% 5.47ms ± 0% +0.76% (p=0.000 n=20+20) Gzip-12 218ms ± 1% 218ms ± 1% ~ (p=0.883 n=20+20) Gunzip-12 38.8ms ± 0% 38.9ms ± 0% ~ (p=0.644 n=19+19) HTTPClientServer-12 76.2µs ± 1% 76.4µs ± 2% ~ (p=0.218 n=20+20) JSONEncode-12 12.2ms ± 0% 12.3ms ± 1% +0.45% (p=0.000 n=19+19) JSONDecode-12 54.2ms ± 1% 53.3ms ± 0% -1.67% (p=0.000 n=20+20) Mandelbrot200-12 3.71ms ± 0% 3.71ms ± 0% ~ (p=0.143 n=19+20) GoParse-12 3.22ms ± 0% 3.19ms ± 1% -0.72% (p=0.000 n=20+20) RegexpMatchEasy0_32-12 76.7ns ± 1% 75.8ns ± 1% -1.19% (p=0.000 n=20+17) RegexpMatchEasy0_1K-12 245ns ± 1% 243ns ± 0% -0.72% (p=0.000 n=18+17) RegexpMatchEasy1_32-12 71.9ns ± 0% 71.7ns ± 1% -0.39% (p=0.006 n=12+18) RegexpMatchEasy1_1K-12 358ns ± 1% 354ns ± 1% -1.13% (p=0.000 n=20+19) RegexpMatchMedium_32-12 105ns ± 2% 105ns ± 1% -0.63% (p=0.007 n=19+20) RegexpMatchMedium_1K-12 31.9µs ± 1% 31.9µs ± 1% ~ (p=1.000 n=17+17) RegexpMatchHard_32-12 1.51µs ± 1% 1.52µs ± 2% +0.46% (p=0.042 n=18+18) RegexpMatchHard_1K-12 45.3µs ± 1% 45.5µs ± 2% +0.44% (p=0.029 n=18+19) Revcomp-12 388ms ± 1% 385ms ± 0% -0.57% (p=0.000 n=19+18) Template-12 63.0ms ± 1% 63.3ms ± 0% +0.50% (p=0.000 n=19+20) TimeParse-12 309ns ± 1% 307ns ± 0% -0.62% (p=0.000 n=20+20) TimeFormat-12 328ns ± 0% 333ns ± 0% +1.35% (p=0.000 n=19+19) [Geo mean] 47.0µs 46.9µs -0.20% (https://perf.golang.org/search?q=upload:20180326.1) For #10958. For #24543. Change-Id: Icbd52e711fdbe7938a1fea3e6baca1104b53ac3a Reviewed-on: https://go-review.googlesource.com/102604 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-05-22 14:15:46 +00:00
Austin Clements	b812eec928	cmd/compile: detect OFORUNTIL inductive facts in prove Currently, we compile range loops into for loops with the obvious initialization and update of the index variable. In this form, the prove pass can see that the body is dominated by an i < len condition, and findIndVar can detect that i is an induction variable and that 0 <= i < len. GOEXPERIMENT=preemptibleloops compiles range loops to OFORUNTIL and we're preparing to unconditionally switch to a variation of this for #24543. OFORUNTIL moves the increment and condition after the body, which makes the bounds on the index variable much less obvious. With OFORUNTIL, proving anything about the index variable requires understanding the phi that joins the index values at the top of the loop body block. This interferes with both prove's ability to see that i < len (this is true on both paths that enter the body, but from two different conditional checks) and with findIndVar's ability to detect the induction pattern. Fix this by teaching prove to detect that the index in the pattern constructed by OFORUNTIL is an induction variable and add both bounds to the facts table. Currently this is done separately from findIndVar because it depends on prove's factsTable, while findIndVar runs before visiting blocks and building the factsTable. Without any GOEXPERIMENT, this has no effect on std or cmd. However, with GOEXPERIMENT=preemptibleloops, this change becomes necessary to prove 90 conditions in std and cmd. Change-Id: Ic025d669f81b53426309da5a6e8010e5ccaf4f49 Reviewed-on: https://go-review.googlesource.com/102603 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-05-22 14:15:44 +00:00
David Chase	eeff8fa453	cmd/compile: remove now-irrelevant test This test measures "line churn" which was minimized to help improve the debugger experience. With proper is_stmt markers, this is no longer necessary, and it is more accurate (for profiling) to allow line numbers to vary willy-nilly. "Debugger experience" is now better measured by cmd/compile/internal/ssa/debug_test.go This CL made the obsoleting change: https://go-review.googlesource.com/c/go/+/102435 Change-Id: I874ab89f3b243b905aaeba7836118f632225a667 Reviewed-on: https://go-review.googlesource.com/113155 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-14 20:33:00 +00:00
David Chase	c2c1822b12	cmd/compile: assign and preserve statement boundaries. A new pass run after ssa building (before any other optimization) identifies the "first" ssa node for each statement. Other "noise" nodes are tagged as being never appropriate for a statement boundary (e.g., VarKill, VarDef, Phi). Rewrite, deadcode, cse, and nilcheck are modified to move the statement boundaries forward whenever possible if a boundary-tagged ssa value is removed; never-boundary nodes are ignored in this search (some operations involving constants are also tagged as never-boundary and also ignored because they are likely to be moved or removed during optimization). Code generation treats all nodes except those explicitly marked as statement boundaries as "not statement" nodes, and floats statement boundaries to the beginning of each same-line run of instructions found within a basic block. Line number html conversion was modified to make statement boundary nodes a bit more obvious by prepending a "+". The code in fuse.go that glued together the value slices of two blocks produced a result that depended on the former capacities (not lengths) of the two slices. This causes differences in the 386 bootstrap, and also can sometimes put values into an order that does a worse job of preserving statement boundaries when values are removed. Portions of two delve tests that had caught problems were incorporated into ssa/debug_test.go. There are some opportunities to do better with optimized code, but the next-ing is not lying or overly jumpy. Over 4 CLs, compilebench geomean measured binary size increase of 3.5% and compile user time increase of 3.8% (this is after optimization to reuse a sparse map instead of creating multiple maps.) This CL worsens the optimized-debugging experience with Delve; we need to work with the delve team so that they can use the is_stmt marks that we're emitting now. The reference output changes from time to time depending on other changes in the compiler, sometimes better, sometimes worse. This CL now includes a test ensuring that 99+% of the lines in the Go command itself (a handy optimized binary) include is_stmt markers. Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a Reviewed-on: https://go-review.googlesource.com/102435 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-05-14 14:09:49 +00:00
Keith Randall	599f56dc05	cmd/compile: fix zero extend after float->int conversion Don't do direct loads from argument slots if the sizes don't match. This prevents us from loading from a float32 using a uint64 load during expressions like uint64(math.float32Bits(f)) where f is a float32 arg. Fixes #25322 Change-Id: I3887d76f78c844ba546243e7721d811c3d4a9700 Reviewed-on: https://go-review.googlesource.com/112637 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-05-10 17:32:43 +00:00
Lynn Boger	e4172e5f5e	cmd/compile/internal/ssa: initialize t.UInt in SetTypPtrs() Initialization of t.UInt is missing from SetTypPtrs in config.go, preventing rules that use it from matching when they should. This adds the initialization to allow those rules to work. Updated test/codegen/rotate.go to test for this case, which appears in math/bits RotateLeft32 and RotateLeft64. There had been a testcase for this in go 1.10 but that went away when asm_test.go was removed. Change-Id: I82fc825ad8364df6fc36a69a1e448214d2e24ed5 Reviewed-on: https://go-review.googlesource.com/112518 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-10 14:50:30 +00:00
Michael Munday	6d00e8c478	cmd/compile: convert memmove call into Move when arguments are disjoint Move ops can be faster than memmove calls because the number of bytes to be moved is fixed and they don't incur the overhead of a call. This change allows memmove to be converted into a Move op when the arguments are disjoint. The optimization is only enabled on s390x at the moment, however other architectures may also benefit from it in the future. The memmove inlining rule triggers an extra 12 times when compiling the standard library. It will most likely make more of a difference as the disjoint function is improved over time (to recognize fresh heap allocations for example). Change-Id: I9af570dcfff28257b8e59e0ff584a46d8e248310 Reviewed-on: https://go-review.googlesource.com/110064 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-05-09 11:20:40 +00:00
Matt Juran	e0adc35c47	test: fast GC+concurrency+types verification This test runs independent goroutines modifying a comprehensive variety of local vars to look for garbage collector regressions. This test has been verified to trigger issue 22781 on the go1.9.2 tag. This test expands on test/fixedbugs/issue22781.go. Tests #22781 Change-Id: Id32f8dde7ef650aea1b1b4cf518e6d045537bfdc Reviewed-on: https://go-review.googlesource.com/93715 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-08 21:15:48 +00:00
Martin Möhrmann	aee71dd70b	cmd/compile: optimize map-clearing range idiom replace map clears of the form: for k := range m { delete(m, k) } (where m is map with key type that is reflexive for ==) with a new runtime function that clears the maps backing array with a memclr and reinitializes the hmap struct. Map key types that for example contain floats are not replaced by this optimization since NaN keys cannot be deleted from maps using delete. name old time/op new time/op delta GoMapClear/Reflexive/1 92.2ns ± 1% 47.1ns ± 2% -48.89% (p=0.000 n=9+9) GoMapClear/Reflexive/10 108ns ± 1% 48ns ± 2% -55.68% (p=0.000 n=10+10) GoMapClear/Reflexive/100 303ns ± 2% 110ns ± 3% -63.56% (p=0.000 n=10+10) GoMapClear/Reflexive/1000 3.58µs ± 3% 1.23µs ± 2% -65.49% (p=0.000 n=9+10) GoMapClear/Reflexive/10000 28.2µs ± 3% 10.3µs ± 2% -63.55% (p=0.000 n=9+10) GoMapClear/NonReflexive/1 121ns ± 2% 124ns ± 7% ~ (p=0.097 n=10+10) GoMapClear/NonReflexive/10 137ns ± 2% 139ns ± 3% +1.53% (p=0.033 n=10+10) GoMapClear/NonReflexive/100 331ns ± 3% 334ns ± 2% ~ (p=0.342 n=10+10) GoMapClear/NonReflexive/1000 3.64µs ± 3% 3.64µs ± 2% ~ (p=0.887 n=9+10) GoMapClear/NonReflexive/10000 28.1µs ± 2% 28.4µs ± 3% ~ (p=0.247 n=10+10) Fixes #20138 Change-Id: I181332a8ef434a4f0d89659f492d8711db3f3213 Reviewed-on: https://go-review.googlesource.com/110055 Reviewed-by: Keith Randall <khr@golang.org>	2018-05-08 21:15:16 +00:00
Michael Munday	8af0c77df3	cmd/compile: simplify shift lowering on s390x Use conditional moves instead of subtractions with borrow to handle saturation cases. This allows us to delete the SUBE/SUBEW ops and associated rules from the SSA backend. Using conditional moves also means we can detect when shift values are masked so I've added some new rules to constant fold the relevant comparisons and masking ops. Also use the new shiftIsBounded() function to avoid generating code to handle saturation cases where possible. Updates #25167 for s390x. Change-Id: Ief9991c91267c9151ce4c5ec07642abb4dcc1c0d Reviewed-on: https://go-review.googlesource.com/110070 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-08 16:19:56 +00:00
Lynn Boger	28edaf4584	cmd/compile,test: combine byte loads and stores on ppc64le CL 74410 added rules to combine consecutive byte loads and stores when the byte order was little endian for ppc64le. This is the corresponding change for bytes that are in big endian order. These rules are all intended for a little endian target arch. This adds new testcases in test/codegen/memcombine.go Fixes #22496 Updates #24242 Benchmark improvement for encoding/binary: name old time/op new time/op delta ReadSlice1000Int32s-16 11.0µs ± 0% 9.0µs ± 0% -17.47% (p=0.029 n=4+4) ReadStruct-16 2.47µs ± 1% 2.48µs ± 0% +0.67% (p=0.114 n=4+4) ReadInts-16 642ns ± 1% 630ns ± 1% -2.02% (p=0.029 n=4+4) WriteInts-16 654ns ± 0% 653ns ± 1% -0.08% (p=0.629 n=4+4) WriteSlice1000Int32s-16 8.75µs ± 0% 8.20µs ± 0% -6.19% (p=0.029 n=4+4) PutUint16-16 1.16ns ± 0% 0.93ns ± 0% -19.83% (p=0.029 n=4+4) PutUint32-16 1.16ns ± 0% 0.93ns ± 0% -19.83% (p=0.029 n=4+4) PutUint64-16 1.85ns ± 0% 0.93ns ± 0% -49.73% (p=0.029 n=4+4) LittleEndianPutUint16-16 1.03ns ± 0% 0.93ns ± 0% -9.71% (p=0.029 n=4+4) LittleEndianPutUint32-16 0.93ns ± 0% 0.93ns ± 0% ~ (all equal) LittleEndianPutUint64-16 0.93ns ± 0% 0.93ns ± 0% ~ (all equal) PutUvarint32-16 43.0ns ± 0% 43.1ns ± 0% +0.12% (p=0.429 n=4+4) PutUvarint64-16 174ns ± 0% 175ns ± 0% +0.29% (p=0.429 n=4+4) Updates made to functions in gcm.go to enable their matching. An existing testcase prevents these functions from being replaced by those in encoding/binary due to import dependencies. Change-Id: Idb3bd1e6e7b12d86cd828fb29cb095848a3e485a Reviewed-on: https://go-review.googlesource.com/98136 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-05-08 13:15:39 +00:00
Michael Munday	f31a18ded4	cmd/compile: add some generic composite type optimizations Propagate values through some wide Zero/Move operations. Among other things this allows us to optimize some kinds of array initialization. For example, the following code no longer requires a temporary be allocated on the stack. Instead it writes the values directly into the return value. func f(i uint32) [4]uint32 { return [4]uint32{i, i+1, i+2, i+3} } The return value is unnecessarily cleared but removing that is probably a task for dead store analysis (I think it needs to be able to match multiple Store ops to wide Zero ops). In order to reliably remove stack variables that are rendered unnecessary by these new rules I've added a new generic version of the unread autos elimination pass. These rules are triggered more than 5000 times when building and testing the standard library. Updates #15925 (fixes for arrays of up to 4 elements). Updates #24386 (fixes for up to 4 kept elements). Updates #24416. compilebench results: name old time/op new time/op delta Template 353ms ± 5% 359ms ± 3% ~ (p=0.143 n=10+10) Unicode 219ms ± 1% 217ms ± 4% ~ (p=0.740 n=7+10) GoTypes 1.26s ± 1% 1.26s ± 2% ~ (p=0.549 n=9+10) Compiler 6.00s ± 1% 6.08s ± 1% +1.42% (p=0.000 n=9+8) SSA 15.3s ± 2% 15.6s ± 1% +2.43% (p=0.000 n=10+10) Flate 237ms ± 2% 240ms ± 2% +1.31% (p=0.015 n=10+10) GoParser 285ms ± 1% 285ms ± 1% ~ (p=0.878 n=8+8) Reflect 797ms ± 3% 807ms ± 2% ~ (p=0.065 n=9+10) Tar 334ms ± 0% 335ms ± 4% ~ (p=0.460 n=8+10) XML 419ms ± 0% 423ms ± 1% +0.91% (p=0.001 n=7+9) StdCmd 46.0s ± 0% 46.4s ± 0% +0.85% (p=0.000 n=9+9) name old user-time/op new user-time/op delta Template 337ms ± 3% 346ms ± 5% ~ (p=0.053 n=9+10) Unicode 205ms ±10% 205ms ± 8% ~ (p=1.000 n=10+10) GoTypes 1.22s ± 2% 1.21s ± 3% ~ (p=0.436 n=10+10) Compiler 5.85s ± 1% 5.93s ± 0% +1.46% (p=0.000 n=10+8) SSA 14.9s ± 1% 15.3s ± 1% +2.62% (p=0.000 n=10+10) Flate 229ms ± 4% 228ms ± 6% ~ (p=0.796 n=10+10) GoParser 271ms ± 3% 275ms ± 4% ~ (p=0.165 n=10+10) Reflect 779ms ± 5% 775ms ± 2% ~ (p=0.971 n=10+10) Tar 317ms ± 4% 319ms ± 5% ~ (p=0.853 n=10+10) XML 404ms ± 4% 409ms ± 5% ~ (p=0.436 n=10+10) name old alloc/op new alloc/op delta Template 34.9MB ± 0% 35.0MB ± 0% +0.26% (p=0.000 n=10+10) Unicode 29.3MB ± 0% 29.3MB ± 0% +0.02% (p=0.000 n=10+10) GoTypes 115MB ± 0% 115MB ± 0% +0.30% (p=0.000 n=10+10) Compiler 519MB ± 0% 521MB ± 0% +0.30% (p=0.000 n=10+10) SSA 1.55GB ± 0% 1.57GB ± 0% +1.34% (p=0.000 n=10+9) Flate 24.1MB ± 0% 24.2MB ± 0% +0.10% (p=0.000 n=10+10) GoParser 28.1MB ± 0% 28.1MB ± 0% +0.07% (p=0.000 n=10+10) Reflect 78.7MB ± 0% 78.7MB ± 0% +0.03% (p=0.000 n=8+10) Tar 34.4MB ± 0% 34.5MB ± 0% +0.12% (p=0.000 n=10+10) XML 43.2MB ± 0% 43.2MB ± 0% +0.13% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Template 330k ± 0% 330k ± 0% -0.01% (p=0.017 n=10+10) Unicode 337k ± 0% 337k ± 0% +0.01% (p=0.000 n=9+10) GoTypes 1.15M ± 0% 1.15M ± 0% +0.03% (p=0.000 n=10+10) Compiler 4.77M ± 0% 4.77M ± 0% +0.03% (p=0.000 n=9+10) SSA 12.5M ± 0% 12.6M ± 0% +1.16% (p=0.000 n=10+10) Flate 221k ± 0% 221k ± 0% +0.05% (p=0.000 n=9+10) GoParser 275k ± 0% 275k ± 0% +0.01% (p=0.014 n=10+9) Reflect 944k ± 0% 944k ± 0% -0.02% (p=0.000 n=10+10) Tar 324k ± 0% 323k ± 0% -0.12% (p=0.000 n=10+10) XML 384k ± 0% 384k ± 0% -0.01% (p=0.001 n=10+10) name old object-bytes new object-bytes delta Template 476kB ± 0% 476kB ± 0% -0.04% (p=0.000 n=10+10) Unicode 218kB ± 0% 218kB ± 0% ~ (all equal) GoTypes 1.58MB ± 0% 1.58MB ± 0% -0.04% (p=0.000 n=10+10) Compiler 6.25MB ± 0% 6.24MB ± 0% -0.09% (p=0.000 n=10+10) SSA 15.9MB ± 0% 16.1MB ± 0% +1.22% (p=0.000 n=10+10) Flate 304kB ± 0% 304kB ± 0% -0.13% (p=0.000 n=10+10) GoParser 370kB ± 0% 370kB ± 0% -0.00% (p=0.000 n=10+10) Reflect 1.27MB ± 0% 1.27MB ± 0% -0.12% (p=0.000 n=10+10) Tar 421kB ± 0% 419kB ± 0% -0.64% (p=0.000 n=10+10) XML 518kB ± 0% 517kB ± 0% -0.12% (p=0.000 n=10+10) name old export-bytes new export-bytes delta Template 16.7kB ± 0% 16.7kB ± 0% ~ (all equal) Unicode 6.52kB ± 0% 6.52kB ± 0% ~ (all equal) GoTypes 29.2kB ± 0% 29.2kB ± 0% ~ (all equal) Compiler 88.0kB ± 0% 88.0kB ± 0% ~ (all equal) SSA 109kB ± 0% 109kB ± 0% ~ (all equal) Flate 4.49kB ± 0% 4.49kB ± 0% ~ (all equal) GoParser 8.10kB ± 0% 8.10kB ± 0% ~ (all equal) Reflect 7.71kB ± 0% 7.71kB ± 0% ~ (all equal) Tar 9.15kB ± 0% 9.15kB ± 0% ~ (all equal) XML 12.3kB ± 0% 12.3kB ± 0% ~ (all equal) name old text-bytes new text-bytes delta HelloSize 676kB ± 0% 672kB ± 0% -0.59% (p=0.000 n=10+10) CmdGoSize 7.26MB ± 0% 7.24MB ± 0% -0.18% (p=0.000 n=10+10) name old data-bytes new data-bytes delta HelloSize 10.2kB ± 0% 10.2kB ± 0% ~ (all equal) CmdGoSize 248kB ± 0% 248kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal) CmdGoSize 145kB ± 0% 145kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.46MB ± 0% 1.45MB ± 0% -0.31% (p=0.000 n=10+10) CmdGoSize 14.7MB ± 0% 14.7MB ± 0% -0.17% (p=0.000 n=10+10) Change-Id: Ic72b0c189dd542f391e1c9ab88a76e9148dc4285 Reviewed-on: https://go-review.googlesource.com/106495 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-05-08 10:31:21 +00:00
Ben Shi	098ca846c7	cmd/compile: emit more compact 386 instructions ADDL/SUBL/ANDL/ORL/XORL can have a memory operand as destination, and this CL optimize the compiler to emit such instructions on 386 for more compact binary. Here is test report: 1. The total size of pkg/linux_386/ and pkg/tool/linux_386/ decreases about 14KB. (pkg/linux_386/cmd/compile/ and pkg/tool/linux_386/compile are excluded) 2. The go1 benchmark shows little change, excluding ±2% noise. name old time/op new time/op delta BinaryTree17-4 3.34s ± 2% 3.38s ± 2% +1.27% (p=0.000 n=40+39) Fannkuch11-4 3.55s ± 1% 3.51s ± 1% -1.33% (p=0.000 n=40+40) FmtFprintfEmpty-4 46.3ns ± 3% 46.9ns ± 4% +1.41% (p=0.002 n=40+40) FmtFprintfString-4 80.8ns ± 3% 80.4ns ± 6% -0.54% (p=0.044 n=40+40) FmtFprintfInt-4 93.0ns ± 3% 92.2ns ± 4% -0.88% (p=0.007 n=39+40) FmtFprintfIntInt-4 144ns ± 5% 145ns ± 2% +0.78% (p=0.015 n=40+40) FmtFprintfPrefixedInt-4 184ns ± 2% 182ns ± 2% -1.06% (p=0.004 n=40+40) FmtFprintfFloat-4 415ns ± 4% 419ns ± 4% ~ (p=0.434 n=40+40) FmtManyArgs-4 615ns ± 3% 619ns ± 3% ~ (p=0.100 n=40+40) GobDecode-4 7.30ms ± 6% 7.36ms ± 6% ~ (p=0.074 n=40+40) GobEncode-4 7.10ms ± 6% 7.21ms ± 5% ~ (p=0.082 n=40+39) Gzip-4 364ms ± 3% 362ms ± 6% -0.71% (p=0.020 n=40+40) Gunzip-4 42.4ms ± 3% 42.2ms ± 3% ~ (p=0.303 n=40+40) HTTPClientServer-4 62.9µs ± 1% 62.9µs ± 1% ~ (p=0.768 n=38+39) JSONEncode-4 21.4ms ± 4% 21.5ms ± 5% ~ (p=0.210 n=40+40) JSONDecode-4 67.7ms ± 3% 67.9ms ± 4% ~ (p=0.713 n=40+40) Mandelbrot200-4 5.18ms ± 3% 5.21ms ± 3% +0.59% (p=0.021 n=40+40) GoParse-4 3.35ms ± 3% 3.34ms ± 2% ~ (p=0.996 n=40+40) RegexpMatchEasy0_32-4 98.5ns ± 5% 96.3ns ± 4% -2.15% (p=0.001 n=40+40) RegexpMatchEasy0_1K-4 851ns ± 4% 850ns ± 5% ~ (p=0.700 n=40+40) RegexpMatchEasy1_32-4 105ns ± 7% 107ns ± 4% +1.50% (p=0.017 n=40+40) RegexpMatchEasy1_1K-4 1.03µs ± 5% 1.03µs ± 4% ~ (p=0.992 n=40+40) RegexpMatchMedium_32-4 130ns ± 6% 128ns ± 4% -1.66% (p=0.012 n=40+40) RegexpMatchMedium_1K-4 44.0µs ± 5% 43.6µs ± 3% ~ (p=0.704 n=40+40) RegexpMatchHard_32-4 2.29µs ± 3% 2.23µs ± 4% -2.38% (p=0.000 n=40+40) RegexpMatchHard_1K-4 69.0µs ± 3% 68.1µs ± 3% -1.28% (p=0.003 n=40+40) Revcomp-4 1.85s ± 2% 1.87s ± 3% +1.11% (p=0.000 n=40+40) Template-4 69.8ms ± 3% 69.6ms ± 3% ~ (p=0.125 n=40+40) TimeParse-4 442ns ± 5% 440ns ± 3% ~ (p=0.585 n=40+40) TimeFormat-4 419ns ± 3% 420ns ± 3% ~ (p=0.824 n=40+40) [Geo mean] 67.3µs 67.2µs -0.11% name old speed new speed delta GobDecode-4 105MB/s ± 6% 104MB/s ± 6% ~ (p=0.074 n=40+40) GobEncode-4 108MB/s ± 7% 107MB/s ± 5% ~ (p=0.080 n=40+39) Gzip-4 53.3MB/s ± 3% 53.7MB/s ± 6% +0.73% (p=0.021 n=40+40) Gunzip-4 458MB/s ± 3% 460MB/s ± 3% ~ (p=0.301 n=40+40) JSONEncode-4 90.8MB/s ± 4% 90.3MB/s ± 4% ~ (p=0.213 n=40+40) JSONDecode-4 28.7MB/s ± 3% 28.6MB/s ± 4% ~ (p=0.679 n=40+40) GoParse-4 17.3MB/s ± 3% 17.3MB/s ± 2% ~ (p=1.000 n=40+40) RegexpMatchEasy0_32-4 325MB/s ± 5% 333MB/s ± 4% +2.44% (p=0.000 n=40+38) RegexpMatchEasy0_1K-4 1.20GB/s ± 4% 1.21GB/s ± 5% ~ (p=0.684 n=40+40) RegexpMatchEasy1_32-4 303MB/s ± 7% 298MB/s ± 4% -1.52% (p=0.022 n=40+40) RegexpMatchEasy1_1K-4 995MB/s ± 5% 996MB/s ± 4% ~ (p=0.996 n=40+40) RegexpMatchMedium_32-4 7.67MB/s ± 6% 7.80MB/s ± 4% +1.68% (p=0.011 n=40+40) RegexpMatchMedium_1K-4 23.3MB/s ± 5% 23.5MB/s ± 3% ~ (p=0.697 n=40+40) RegexpMatchHard_32-4 14.0MB/s ± 3% 14.3MB/s ± 4% +2.43% (p=0.000 n=40+40) RegexpMatchHard_1K-4 14.8MB/s ± 3% 15.0MB/s ± 3% +1.30% (p=0.003 n=40+40) Revcomp-4 137MB/s ± 2% 136MB/s ± 3% -1.10% (p=0.000 n=40+40) Template-4 27.8MB/s ± 3% 27.9MB/s ± 3% ~ (p=0.128 n=40+40) [Geo mean] 79.6MB/s 79.9MB/s +0.28% Change-Id: I02a3efc125dc81e18fc8495eb2bf1bba59ab8733 Reviewed-on: https://go-review.googlesource.com/110157 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-05-08 06:44:54 +00:00
Martin Möhrmann	b9a59d9f2e	cmd/compile: optimize len([]rune(string)) Adds a new runtime function to count runes in a string. Modifies the compiler to detect the pattern len([]rune(string)) and replaces it with the new rune counting runtime function. RuneCount/lenruneslice/ASCII 27.8ns ± 2% 14.5ns ± 3% -47.70% (p=0.000 n=10+10) RuneCount/lenruneslice/Japanese 126ns ± 2% 60ns ± 2% -52.03% (p=0.000 n=10+10) RuneCount/lenruneslice/MixedLength 104ns ± 2% 50ns ± 1% -51.71% (p=0.000 n=10+9) Fixes #24923 Change-Id: Ie9c7e7391a4e2cca675c5cdcc1e5ce7d523948b9 Reviewed-on: https://go-review.googlesource.com/108985 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-05-06 05:31:01 +00:00
Martin Möhrmann	a8a60ac2a7	cmd/compile: optimize append(x, make([]T, y)...) slice extension Changes the compiler to recognize the slice extension pattern append(x, make([]T, y)...) and replace it with growslice and an optional memclr to avoid an allocation for make([]T, y). Memclr is not called in case growslice already allocated a new cleared backing array when T contains pointers. amd64: name old time/op new time/op delta ExtendSlice/IntSlice 103ns ± 4% 57ns ± 4% -44.55% (p=0.000 n=18+18) ExtendSlice/PointerSlice 155ns ± 3% 77ns ± 3% -49.93% (p=0.000 n=20+20) ExtendSlice/NoGrow 50.2ns ± 3% 5.2ns ± 2% -89.67% (p=0.000 n=18+18) name old alloc/op new alloc/op delta ExtendSlice/IntSlice 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/PointerSlice 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/NoGrow 32.0B ± 0% 0.0B -100.00% (p=0.000 n=20+20) name old allocs/op new allocs/op delta ExtendSlice/IntSlice 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/PointerSlice 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/NoGrow 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Fixes #21266 Change-Id: Idc3077665f63cbe89762b590c5967a864fd1c07f Reviewed-on: https://go-review.googlesource.com/109517 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-05-06 04:28:23 +00:00
Martin Möhrmann	500d79c410	cmd/compile: refactor memclrrange for arrays and slices Rename memclrrange to signify that it does not handle all types of range clears. Simplify checks to detect the range clear idiom for arrays and slices. Add tests to verify the optimization for the slice range clear idiom is being applied by the compiler. Change-Id: I5c3b7c9a479699ebdb4c407fde692f30f377860c Reviewed-on: https://go-review.googlesource.com/110477 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-02 04:20:25 +00:00
Matthew Dempsky	004260afde	cmd/compile: open code select{send,recv,default} Registration now looks like: var cases [4]runtime.scases var order [8]uint16 cases[0].kind = caseSend cases[0].c = c1 cases[0].elem = &v1 if raceenabled \|\| msanenabled { selectsetpc(&cases[0]) } cases[1].kind = caseRecv cases[1].c = c2 cases[1].elem = &v2 if raceenabled \|\| msanenabled { selectsetpc(&cases[1]) } ... Change-Id: Ib9bcf426a4797fe4bfd8152ca9e6e08e39a70b48 Reviewed-on: https://go-review.googlesource.com/37934 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-05-01 03:17:44 +00:00
Matthew Dempsky	3aa53b3135	runtime: eliminate runtime.hselect Now the registration phase looks like: var cases [4]runtime.scases var order [8]uint16 selectsend(&cases[0], c1, &v1) selectrecv(&cases[1], c2, &v2, nil) selectrecv(&cases[2], c3, &v3, &ok) selectdefault(&cases[3]) chosen := selectgo(&cases[0], &order[0], 4) Primarily, this is just preparation for having the compiler open-code selectsend, selectrecv, and selectdefault. As a minor benefit, order can now be layed out separately on the stack in the pointer-free segment, so it won't take up space in the function's stack pointer maps. Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45 Reviewed-on: https://go-review.googlesource.com/37933 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-05-01 03:17:31 +00:00
Richard Musiol	e3c684777a	all: skip unsupported tests for js/wasm The general policy for the current state of js/wasm is that it only has to support tests that are also supported by nacl. The test nilptr3.go makes assumptions about which nil checks can be removed. Since WebAssembly does not signal on reading a null pointer, all nil checks have to be explicit. Updates #18892 Change-Id: I06a687860b8d22ae26b1c391499c0f5183e4c485 Reviewed-on: https://go-review.googlesource.com/110096 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-30 19:39:18 +00:00
Giovanni Bajo	e0d37a33ab	cmd/compile: teach prove to handle expressions like len(s)-delta When a loop has bound len(s)-delta, findIndVar detected it and returned len(s) as (conservative) upper bound. This little lie allowed loopbce to drop bound checks. It is obviously more generic to teach prove about relations like x+d<w for non-constant "w"; we already handled the case for constant "w", so we just want to learn that if d<0, then x+d<w proves that x<w. To be able to remove the code from findIndVar, we also need to teach prove that len() and cap() are always non-negative. This CL allows to prove 633 more checks in cmd+std. Most of them are cases where the code was already testing before accessing a slice but the compiler didn't know it. For instance, take strings.HasSuffix: func HasSuffix(s, suffix string) bool { return len(s) >= len(suffix) && s[len(s)-len(suffix):] == suffix } When suffix is a literal string, the compiler now understands that the explicit check is enough to not emit a slice check. I also found a loopbce test that was incorrectly written to detect an overflow but had a off-by-one (on the conservative side), so it unexpectly passed with this CL; I changed it to really trigger the overflow as intended. Change-Id: Ib5abade337db46b8811425afebad4719b6e46c4a Reviewed-on: https://go-review.googlesource.com/105635 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-04-29 09:38:32 +00:00
Giovanni Bajo	6d379add0f	cmd/compile: in prove, detect loops with negative increments To be effective, this also requires being able to relax constraints on min/max bound inclusiveness; they are now exposed through a flags, and prove has been updated to handle it correctly. Change-Id: I3490e54461b7b9de8bc4ae40d3b5e2fa2d9f0556 Reviewed-on: https://go-review.googlesource.com/104041 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-04-29 09:38:18 +00:00
Giovanni Bajo	980fdb8dd5	cmd/compile: improve testing of induction variables Test both minimum and maximum bound, and prepare formatting for more advanced tests (inclusive / esclusive bounds). Change-Id: Ibe432916d9c938343bc07943798bc9709ad71845 Reviewed-on: https://go-review.googlesource.com/104040 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-29 09:38:09 +00:00
Giovanni Bajo	7ec25d0acf	cmd/compile: implement loop BCE in prove Reuse findIndVar to discover induction variables, and then register the facts we know about them into the facts table when entering the loop block. Moreover, handle "x+delta > w" while updating the facts table, to be able to prove accesses to slices with constant offsets such as slice[i-10]. Change-Id: I2a63d050ed58258136d54712ac7015b25c893d71 Reviewed-on: https://go-review.googlesource.com/104038 Run-TryBot: Giovanni Bajo <rasky@develer.com> Reviewed-by: David Chase <drchase@google.com>	2018-04-29 09:37:35 +00:00
Giovanni Bajo	29162ec9a7	cmd/compile: in prove, infer unsigned relations while branching When a branch is followed, we apply the relation as described in the domain relation table. In case the relation is in the positive domain, we can also infer an unsigned relation if, by that point, we know that both operands are non-negative. Fixes #20393 Change-Id: Ieaf0c81558b36d96616abae3eb834c788dd278d5 Reviewed-on: https://go-review.googlesource.com/100278 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: David Chase <drchase@google.com>	2018-04-29 09:37:15 +00:00
Giovanni Bajo	5c40210987	cmd/compile: in prove, add transitive closure of relations Implement it through a partial order datastructure, which keeps the relations between SSA values in a forest of DAGs and is able to discover contradictions. In make.bash, this patch is able to prove hundreds of conditions which were not proved before. Compilebench: name old time/op new time/op delta Template 371ms ± 2% 368ms ± 1% ~ (p=0.222 n=5+5) Unicode 203ms ± 6% 199ms ± 3% ~ (p=0.421 n=5+5) GoTypes 1.17s ± 4% 1.18s ± 1% ~ (p=0.151 n=5+5) Compiler 5.54s ± 2% 5.59s ± 1% ~ (p=0.548 n=5+5) SSA 12.9s ± 2% 13.2s ± 1% +2.96% (p=0.032 n=5+5) Flate 245ms ± 2% 247ms ± 3% ~ (p=0.690 n=5+5) GoParser 302ms ± 6% 302ms ± 4% ~ (p=0.548 n=5+5) Reflect 764ms ± 4% 773ms ± 3% ~ (p=0.095 n=5+5) Tar 354ms ± 6% 361ms ± 3% ~ (p=0.222 n=5+5) XML 434ms ± 3% 429ms ± 1% ~ (p=0.421 n=5+5) StdCmd 22.6s ± 1% 22.9s ± 1% +1.40% (p=0.032 n=5+5) name old user-time/op new user-time/op delta Template 436ms ± 8% 426ms ± 5% ~ (p=0.579 n=5+5) Unicode 219ms ±15% 219ms ±12% ~ (p=1.000 n=5+5) GoTypes 1.47s ± 6% 1.53s ± 6% ~ (p=0.222 n=5+5) Compiler 7.26s ± 4% 7.40s ± 2% ~ (p=0.389 n=5+5) SSA 17.7s ± 4% 18.5s ± 4% +4.13% (p=0.032 n=5+5) Flate 257ms ± 5% 268ms ± 9% ~ (p=0.333 n=5+5) GoParser 354ms ± 6% 348ms ± 6% ~ (p=0.913 n=5+5) Reflect 904ms ± 2% 944ms ± 4% ~ (p=0.056 n=5+5) Tar 398ms ±11% 430ms ± 7% ~ (p=0.079 n=5+5) XML 501ms ± 7% 489ms ± 5% ~ (p=0.444 n=5+5) name old text-bytes new text-bytes delta HelloSize 670kB ± 0% 670kB ± 0% +0.00% (p=0.008 n=5+5) CmdGoSize 7.22MB ± 0% 7.21MB ± 0% -0.07% (p=0.008 n=5+5) name old data-bytes new data-bytes delta HelloSize 9.88kB ± 0% 9.88kB ± 0% ~ (all equal) CmdGoSize 248kB ± 0% 248kB ± 0% -0.06% (p=0.008 n=5+5) name old bss-bytes new bss-bytes delta HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal) CmdGoSize 145kB ± 0% 144kB ± 0% -0.20% (p=0.008 n=5+5) name old exe-bytes new exe-bytes delta HelloSize 1.43MB ± 0% 1.43MB ± 0% ~ (all equal) CmdGoSize 14.5MB ± 0% 14.5MB ± 0% -0.06% (p=0.008 n=5+5) Fixes #19714 Updates #20393 Change-Id: Ia090f5b5dc1bcd274ba8a39b233c1e1ace1b330e Reviewed-on: https://go-review.googlesource.com/100277 Run-TryBot: Giovanni Bajo <rasky@develer.com> Reviewed-by: David Chase <drchase@google.com>	2018-04-29 09:35:39 +00:00
Ben Shi	aaf73c6d1e	cmd/compile: optimize ARM64 with shifted register indexed load/store ARM64 supports efficient instructions which combine shift, addition, load/store together. Such as "MOVD (R0)(R1<<3), R2" and "MOVWU R6, (R4)(R1<<2)". This CL optimizes the compiler to emit such efficient instuctions. And below is some test data. 1. binary size before/after binary size change pkg/linux_arm64 +80.1KB pkg/tool/linux_arm64 +121.9KB go -4.3KB gofmt -64KB 2. go1 benchmark There is big improvement for the test case Fannkuch11, and slight improvement for sme others, excluding noise. name old time/op new time/op delta BinaryTree17-4 43.9s ± 2% 44.0s ± 2% ~ (p=0.820 n=30+30) Fannkuch11-4 30.6s ± 2% 24.5s ± 3% -19.93% (p=0.000 n=25+30) FmtFprintfEmpty-4 500ns ± 0% 499ns ± 0% -0.11% (p=0.000 n=23+25) FmtFprintfString-4 1.03µs ± 0% 1.04µs ± 3% ~ (p=0.065 n=29+30) FmtFprintfInt-4 1.15µs ± 3% 1.15µs ± 4% -0.56% (p=0.000 n=30+30) FmtFprintfIntInt-4 1.80µs ± 5% 1.82µs ± 0% ~ (p=0.094 n=30+24) FmtFprintfPrefixedInt-4 2.17µs ± 5% 2.20µs ± 0% ~ (p=0.100 n=30+23) FmtFprintfFloat-4 3.08µs ± 3% 3.09µs ± 4% ~ (p=0.123 n=30+30) FmtManyArgs-4 7.41µs ± 4% 7.17µs ± 1% -3.26% (p=0.000 n=30+23) GobDecode-4 93.7ms ± 0% 94.7ms ± 4% ~ (p=0.685 n=24+30) GobEncode-4 78.7ms ± 7% 77.1ms ± 0% ~ (p=0.729 n=30+23) Gzip-4 4.01s ± 0% 3.97s ± 5% -1.11% (p=0.037 n=24+30) Gunzip-4 389ms ± 4% 384ms ± 0% ~ (p=0.155 n=30+23) HTTPClientServer-4 536µs ± 1% 537µs ± 1% ~ (p=0.236 n=30+30) JSONEncode-4 179ms ± 1% 182ms ± 6% ~ (p=0.763 n=24+30) JSONDecode-4 843ms ± 0% 839ms ± 6% -0.42% (p=0.003 n=25+30) Mandelbrot200-4 46.5ms ± 0% 46.5ms ± 0% +0.02% (p=0.000 n=26+26) GoParse-4 44.3ms ± 6% 43.3ms ± 0% ~ (p=0.067 n=30+27) RegexpMatchEasy0_32-4 1.07µs ± 7% 1.07µs ± 4% ~ (p=0.835 n=30+30) RegexpMatchEasy0_1K-4 5.51µs ± 0% 5.49µs ± 0% -0.35% (p=0.000 n=23+26) RegexpMatchEasy1_32-4 1.01µs ± 0% 1.02µs ± 4% +0.96% (p=0.014 n=24+30) RegexpMatchEasy1_1K-4 7.43µs ± 0% 7.18µs ± 0% -3.41% (p=0.000 n=23+24) RegexpMatchMedium_32-4 1.78µs ± 0% 1.81µs ± 4% +1.47% (p=0.012 n=23+30) RegexpMatchMedium_1K-4 547µs ± 1% 542µs ± 3% -0.90% (p=0.003 n=24+30) RegexpMatchHard_32-4 30.4µs ± 0% 29.7µs ± 0% -2.15% (p=0.000 n=19+23) RegexpMatchHard_1K-4 913µs ± 0% 915µs ± 6% +0.25% (p=0.012 n=24+30) Revcomp-4 6.32s ± 1% 6.42s ± 4% ~ (p=0.342 n=25+30) Template-4 868ms ± 6% 878ms ± 6% +1.15% (p=0.000 n=30+30) TimeParse-4 4.57µs ± 4% 4.59µs ± 3% +0.65% (p=0.010 n=29+30) TimeFormat-4 4.51µs ± 0% 4.50µs ± 0% -0.27% (p=0.000 n=27+24) [Geo mean] 695µs 689µs -0.92% name old speed new speed delta GobDecode-4 8.19MB/s ± 0% 8.12MB/s ± 4% ~ (p=0.680 n=24+30) GobEncode-4 9.76MB/s ± 7% 9.96MB/s ± 0% ~ (p=0.616 n=30+23) Gzip-4 4.84MB/s ± 0% 4.89MB/s ± 4% +1.16% (p=0.030 n=24+30) Gunzip-4 49.9MB/s ± 4% 50.6MB/s ± 0% ~ (p=0.162 n=30+23) JSONEncode-4 10.9MB/s ± 1% 10.7MB/s ± 6% ~ (p=0.575 n=24+30) JSONDecode-4 2.30MB/s ± 0% 2.32MB/s ± 5% +0.72% (p=0.003 n=22+30) GoParse-4 1.31MB/s ± 6% 1.34MB/s ± 0% +2.26% (p=0.002 n=30+27) RegexpMatchEasy0_32-4 30.0MB/s ± 6% 30.0MB/s ± 4% ~ (p=1.000 n=30+30) RegexpMatchEasy0_1K-4 186MB/s ± 0% 187MB/s ± 0% +0.35% (p=0.000 n=23+26) RegexpMatchEasy1_32-4 31.8MB/s ± 0% 31.5MB/s ± 4% -0.92% (p=0.012 n=25+30) RegexpMatchEasy1_1K-4 138MB/s ± 0% 143MB/s ± 0% +3.53% (p=0.000 n=23+24) RegexpMatchMedium_32-4 560kB/s ± 0% 553kB/s ± 4% -1.19% (p=0.005 n=23+30) RegexpMatchMedium_1K-4 1.87MB/s ± 0% 1.89MB/s ± 3% +1.04% (p=0.002 n=24+30) RegexpMatchHard_32-4 1.05MB/s ± 0% 1.08MB/s ± 0% +2.40% (p=0.000 n=19+23) RegexpMatchHard_1K-4 1.12MB/s ± 0% 1.12MB/s ± 5% +0.12% (p=0.006 n=25+30) Revcomp-4 40.2MB/s ± 1% 39.6MB/s ± 4% ~ (p=0.242 n=25+30) Template-4 2.24MB/s ± 6% 2.21MB/s ± 6% -1.15% (p=0.000 n=30+30) [Geo mean] 7.87MB/s 7.91MB/s +0.44% Change-Id: If374cb7abf83537aa0a176f73c0f736f7800db03 Reviewed-on: https://go-review.googlesource.com/108735 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-27 20:02:05 +00:00
Milan Knezevic	2959128dc5	cmd/compile: add softfloat support to mips64{,le} mips64 softfloat support is based on mips implementation and introduces new enviroment variable GOMIPS64. GOMIPS64 is a GOARCH=mips64{,le} specific option, for a choice between hard-float and soft-float. Valid values are 'hardfloat' (default) and 'softfloat'. It is passed to the assembler as 'GOMIPS64_{hardfloat,softfloat}'. Change-Id: I7f73078627f7cb37c588a38fb5c997fe09c56134 Reviewed-on: https://go-review.googlesource.com/108475 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-27 14:50:17 +00:00
Josh Bleecher Snyder	d9a50a6531	cmd/compile: use prove pass to detect Ctz of non-zero values On amd64, Ctz must include special handling of zeros. But the prove pass has enough information to detect whether the input is non-zero, allowing a more efficient lowering. Introduce new CtzNonZero ops to capture and use this information. Benchmark code: func BenchmarkVisitBits(b testing.B) { b.Run("8", func(b testing.B) { for i := 0; i < b.N; i++ { x := uint8(0xff) for x != 0 { sink = bits.TrailingZeros8(x) x &= x - 1 } } }) // and similarly so for 16, 32, 64 } name old time/op new time/op delta VisitBits/8-8 7.27ns ± 4% 5.58ns ± 4% -23.35% (p=0.000 n=28+26) VisitBits/16-8 14.7ns ± 7% 10.5ns ± 4% -28.43% (p=0.000 n=30+28) VisitBits/32-8 27.6ns ± 8% 19.3ns ± 3% -30.14% (p=0.000 n=30+26) VisitBits/64-8 44.0ns ±11% 38.0ns ± 5% -13.48% (p=0.000 n=30+30) Fixes #25077 Change-Id: Ie6e5bd86baf39ee8a4ca7cadcf56d934e047f957 Reviewed-on: https://go-review.googlesource.com/109358 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-26 18:22:28 +00:00
Carlos Eduardo Seo	ebb67d993a	cmd/compile, cmd/internal/obj/ppc64: make math.Round an intrinsic on ppc64x This change implements math.Round as an intrinsic on ppc64x so it can be done using a single instruction. benchmark old ns/op new ns/op delta BenchmarkRound-16 2.60 0.69 -73.46% Change-Id: I9408363e96201abdfc73ced7bcd5f0c29db006a8 Reviewed-on: https://go-review.googlesource.com/109395 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2018-04-26 14:12:09 +00:00
Josh Bleecher Snyder	c5f0104daf	cmd/compile: use intrinsic for LeadingZeros8 on amd64 The previous change sped up the pure computation form of LeadingZeros8. This places it somewhat close to the table lookup form. Depending on something that varies from toolchain to toolchain (alignment, perhaps?), the slowdown from ditching the table lookup is either 20% or 5%. This benchmark is the best case scenario for the table lookup: It is in the L1 cache already. I think we're close enough that we can switch to the computational version, and trust that the memory effects and binary size savings will be worth it. Code: func f8(x uint8) { z = bits.LeadingZeros8(x) } Before: "".f8 STEXT nosplit size=34 args=0x8 locals=0x0 0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX 0x0005 00005 (x.go:7) MOVBLZX AL, AX 0x0008 00008 (x.go:7) LEAQ math/bits.len8tab(SB), CX 0x000f 00015 (x.go:7) MOVBLZX (CX)(AX1), AX 0x0013 00019 (x.go:7) ADDQ $-8, AX 0x0017 00023 (x.go:7) NEGQ AX 0x001a 00026 (x.go:7) MOVQ AX, "".z(SB) 0x0021 00033 (x.go:7) RET After: "".f8 STEXT nosplit size=30 args=0x8 locals=0x0 0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX 0x0005 00005 (x.go:7) MOVBLZX AL, AX 0x0008 00008 (x.go:7) LEAL 1(AX)(AX1), AX 0x000c 00012 (x.go:7) BSRL AX, AX 0x000f 00015 (x.go:7) ADDQ $-8, AX 0x0013 00019 (x.go:7) NEGQ AX 0x0016 00022 (x.go:7) MOVQ AX, "".z(SB) 0x001d 00029 (x.go:7) RET Change-Id: Icc7db50a7820fb9a3da8a816d6b6940d7f8e193e Reviewed-on: https://go-review.googlesource.com/108942 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-25 21:34:15 +00:00
Josh Bleecher Snyder	1d321ada73	cmd/compile: optimize LeadingZeros(16\|32) on amd64 Introduce Len8 and Len16 ops and provide optimized lowerings for them. amd64 only for this CL, although it wouldn't surprise me if other architectures also admit of optimized lowerings. Also use and optimize the Len32 lowering, along the same lines. Leave Len8 unused for the moment; a subsequent CL will enable it. For 16 and 32 bits, this leads to a speed-up. name old time/op new time/op delta LeadingZeros16-8 1.42ns ± 5% 1.23ns ± 5% -13.42% (p=0.000 n=20+20) LeadingZeros32-8 1.25ns ± 5% 1.03ns ± 5% -17.63% (p=0.000 n=20+16) Code: func f16(x uint16) { z = bits.LeadingZeros16(x) } func f32(x uint32) { z = bits.LeadingZeros32(x) } Before: "".f16 STEXT nosplit size=38 args=0x8 locals=0x0 0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX 0x0005 00005 (x.go:8) MOVWLZX AX, AX 0x0008 00008 (x.go:8) BSRQ AX, AX 0x000c 00012 (x.go:8) MOVQ $-1, CX 0x0013 00019 (x.go:8) CMOVQEQ CX, AX 0x0017 00023 (x.go:8) ADDQ $-15, AX 0x001b 00027 (x.go:8) NEGQ AX 0x001e 00030 (x.go:8) MOVQ AX, "".z(SB) 0x0025 00037 (x.go:8) RET "".f32 STEXT nosplit size=34 args=0x8 locals=0x0 0x0000 00000 (x.go:9) TEXT "".f32(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:9) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:9) MOVL "".x+8(SP), AX 0x0004 00004 (x.go:9) BSRQ AX, AX 0x0008 00008 (x.go:9) MOVQ $-1, CX 0x000f 00015 (x.go:9) CMOVQEQ CX, AX 0x0013 00019 (x.go:9) ADDQ $-31, AX 0x0017 00023 (x.go:9) NEGQ AX 0x001a 00026 (x.go:9) MOVQ AX, "".z(SB) 0x0021 00033 (x.go:9) RET After: "".f16 STEXT nosplit size=30 args=0x8 locals=0x0 0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX 0x0005 00005 (x.go:8) MOVWLZX AX, AX 0x0008 00008 (x.go:8) LEAL 1(AX)(AX1), AX 0x000c 00012 (x.go:8) BSRL AX, AX 0x000f 00015 (x.go:8) ADDQ $-16, AX 0x0013 00019 (x.go:8) NEGQ AX 0x0016 00022 (x.go:8) MOVQ AX, "".z(SB) 0x001d 00029 (x.go:8) RET "".f32 STEXT nosplit size=28 args=0x8 locals=0x0 0x0000 00000 (x.go:9) TEXT "".f32(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:9) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:9) MOVL "".x+8(SP), AX 0x0004 00004 (x.go:9) LEAQ 1(AX)(AX1), AX 0x0009 00009 (x.go:9) BSRQ AX, AX 0x000d 00013 (x.go:9) ADDQ $-32, AX 0x0011 00017 (x.go:9) NEGQ AX 0x0014 00020 (x.go:9) MOVQ AX, "".z(SB) 0x001b 00027 (x.go:9) RET Change-Id: I6c93c173752a7bfdeab8be30777ae05a736e1f4b Reviewed-on: https://go-review.googlesource.com/108941 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-25 21:34:04 +00:00
Josh Bleecher Snyder	54dbab5221	cmd/compile: optimize TrailingZeros(8\|16) on amd64 Introduce Ctz8 and Ctz16 ops and provide optimized lowerings for them. amd64 only for this CL, although it wouldn't surprise me if other architectures also admit of optimized lowerings. name old time/op new time/op delta TrailingZeros8-8 1.33ns ± 6% 0.84ns ± 3% -36.90% (p=0.000 n=20+20) TrailingZeros16-8 1.26ns ± 5% 0.84ns ± 5% -33.50% (p=0.000 n=20+18) Code: func f8(x uint8) { z = bits.TrailingZeros8(x) } func f16(x uint16) { z = bits.TrailingZeros16(x) } Before: "".f8 STEXT nosplit size=34 args=0x8 locals=0x0 0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX 0x0005 00005 (x.go:7) MOVBLZX AL, AX 0x0008 00008 (x.go:7) BTSQ $8, AX 0x000d 00013 (x.go:7) BSFQ AX, AX 0x0011 00017 (x.go:7) MOVL $64, CX 0x0016 00022 (x.go:7) CMOVQEQ CX, AX 0x001a 00026 (x.go:7) MOVQ AX, "".z(SB) 0x0021 00033 (x.go:7) RET "".f16 STEXT nosplit size=34 args=0x8 locals=0x0 0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX 0x0005 00005 (x.go:8) MOVWLZX AX, AX 0x0008 00008 (x.go:8) BTSQ $16, AX 0x000d 00013 (x.go:8) BSFQ AX, AX 0x0011 00017 (x.go:8) MOVL $64, CX 0x0016 00022 (x.go:8) CMOVQEQ CX, AX 0x001a 00026 (x.go:8) MOVQ AX, "".z(SB) 0x0021 00033 (x.go:8) RET After: "".f8 STEXT nosplit size=20 args=0x8 locals=0x0 0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX 0x0005 00005 (x.go:7) BTSL $8, AX 0x0009 00009 (x.go:7) BSFL AX, AX 0x000c 00012 (x.go:7) MOVQ AX, "".z(SB) 0x0013 00019 (x.go:7) RET "".f16 STEXT nosplit size=20 args=0x8 locals=0x0 0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8 0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) 0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX 0x0005 00005 (x.go:8) BTSL $16, AX 0x0009 00009 (x.go:8) BSFL AX, AX 0x000c 00012 (x.go:8) MOVQ AX, "".z(SB) 0x0013 00019 (x.go:8) RET Change-Id: I0551e357348de2b724737d569afd6ac9f5c3aa11 Reviewed-on: https://go-review.googlesource.com/108940 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-25 21:33:52 +00:00
Ilya Tocar	fb017c60bc	cmd/compile/internal/ssa: fix endless compile loop on AMD64 We currently rewrite (TESTQ (MOVQconst [c] x)) into (TESTQconst [c] x) and (TESTQconst [-1] x) into (TESTQ x x) if x is a (MOVQconst [-1]) we will be stuck in the endless rewrite loop. Don't perform the rewrite in such cases. Fixes #25006 Change-Id: I77f561ba2605fc104f1e5d5c57f32e9d67a2c000 Reviewed-on: https://go-review.googlesource.com/108879 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-24 16:20:41 +00:00
Josh Bleecher Snyder	d292f77e95	cmd/compile: rewrite 2*x+c into LEAx1 on amd64 Rewrite x<<1+c into x+x+c, which can be expressed as a single LEAQ/LEAL. Bit of a special case, but the single-instruction LEA is both shorter and faster than SHL then ADD. Triggers 293 times during make.bash. Change-Id: I3f09c8e9a8f3859d1eeed336f095fc3ada79c2c1 Reviewed-on: https://go-review.googlesource.com/108938 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-23 22:40:10 +00:00
Josh Bleecher Snyder	566e3e074c	cmd/compile: avoid runtime call during switch string(byteslice) This triggers three times while building std, once in image/png and twice in go/internal/gccgoimporter. There are no instances in std in which a more aggressive optimization would have triggered. This doesn't necessarily avoid an allocation, because escape analysis is already able in many cases to use a temporary backing for the string, but it does at a minimum avoid the runtime call and copy. Fixes #24937 Change-Id: I7019e85638ba8cd7e2f03890e672558b858579bc Reviewed-on: https://go-review.googlesource.com/108035 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-04-21 00:50:50 +00:00
Lynn Boger	30311e8860	cmd/compile: generate load without DS relocation for go.string on ppc64le Due to some recent optimizations related to the compare instruction, DS-form load instructions started to be used to load 8-byte go.strings. This can cause link time errors if the go.string is not aligned to 4 bytes. For DS-form instructions, the value in the offset field must be a multiple of 4. If the offset is known at the time the rules are processed, a DS-form load will not be chosen. But for go.strings, the offset is not known at that time, but a relocation is generated indicating that the linker should fill in the DS relocation. When the linker tries to fill in the relocation, if the offset is not aligned properly, a link error will occur. To fix this, when loading a go.string using MOVDload, the full address of the go.string is generated and loaded into the base register. Then the go.string is loaded with a 0 offset field. Added a testcase that reproduces this problem. Fixes #24799 Change-Id: I6a154e8e1cba64eae290be0fbcb608b75884ecdd Reviewed-on: https://go-review.googlesource.com/107855 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-04-20 16:16:47 +00:00
Ben Shi	34f5f8a580	cmd/compile: optimize ARM64 with register indexed load/store ARM64 supports load/store instructions with a memory operand that the address is calculated by base register + index register. In this CL, 1. Some rules are added to the compile's ARM64 backend to emit such efficient instructions. 2. A wrong rule of load combination is fixed. The go1 benchmark does show improvement. name old time/op new time/op delta BinaryTree17-4 44.5s ± 2% 44.1s ± 1% -0.81% (p=0.000 n=28+29) Fannkuch11-4 32.7s ± 3% 30.5s ± 0% -6.79% (p=0.000 n=30+26) FmtFprintfEmpty-4 499ns ± 0% 506ns ± 5% +1.39% (p=0.003 n=25+30) FmtFprintfString-4 1.07µs ± 0% 1.04µs ± 4% -3.17% (p=0.000 n=23+30) FmtFprintfInt-4 1.15µs ± 4% 1.13µs ± 0% -1.55% (p=0.000 n=30+23) FmtFprintfIntInt-4 1.77µs ± 4% 1.74µs ± 0% -1.71% (p=0.000 n=30+24) FmtFprintfPrefixedInt-4 2.37µs ± 5% 2.12µs ± 0% -10.56% (p=0.000 n=30+23) FmtFprintfFloat-4 3.03µs ± 1% 3.03µs ± 4% -0.13% (p=0.003 n=25+30) FmtManyArgs-4 7.38µs ± 1% 7.43µs ± 4% +0.59% (p=0.003 n=25+30) GobDecode-4 101ms ± 6% 95ms ± 5% -5.55% (p=0.000 n=30+30) GobEncode-4 78.0ms ± 4% 78.8ms ± 6% +1.05% (p=0.000 n=30+30) Gzip-4 4.25s ± 0% 4.27s ± 4% +0.45% (p=0.003 n=24+30) Gunzip-4 428ms ± 1% 420ms ± 0% -1.88% (p=0.000 n=23+23) HTTPClientServer-4 549µs ± 1% 541µs ± 1% -1.56% (p=0.000 n=29+29) JSONEncode-4 194ms ± 0% 188ms ± 4% ~ (p=0.417 n=23+30) JSONDecode-4 890ms ± 5% 831ms ± 0% -6.55% (p=0.000 n=30+23) Mandelbrot200-4 47.3ms ± 2% 46.5ms ± 0% ~ (p=0.980 n=30+26) GoParse-4 43.1ms ± 6% 43.8ms ± 6% +1.65% (p=0.000 n=30+30) RegexpMatchEasy0_32-4 1.06µs ± 0% 1.07µs ± 3% ~ (p=0.092 n=23+30) RegexpMatchEasy0_1K-4 5.53µs ± 0% 5.51µs ± 0% -0.24% (p=0.000 n=25+25) RegexpMatchEasy1_32-4 1.02µs ± 3% 1.01µs ± 0% -1.27% (p=0.000 n=30+24) RegexpMatchEasy1_1K-4 7.26µs ± 0% 7.33µs ± 0% +0.95% (p=0.000 n=23+26) RegexpMatchMedium_32-4 1.84µs ± 7% 1.79µs ± 1% ~ (p=0.333 n=30+23) RegexpMatchMedium_1K-4 553µs ± 0% 547µs ± 0% -1.14% (p=0.000 n=24+22) RegexpMatchHard_32-4 30.8µs ± 1% 30.3µs ± 0% -1.40% (p=0.000 n=24+24) RegexpMatchHard_1K-4 928µs ± 0% 929µs ± 5% +0.12% (p=0.013 n=23+30) Revcomp-4 8.13s ± 4% 6.32s ± 1% -22.23% (p=0.000 n=30+23) Template-4 899ms ± 6% 854ms ± 1% -5.01% (p=0.000 n=30+24) TimeParse-4 4.66µs ± 4% 4.59µs ± 1% -1.57% (p=0.000 n=30+23) TimeFormat-4 4.58µs ± 0% 4.61µs ± 0% +0.57% (p=0.000 n=26+24) [Geo mean] 717µs 698µs -2.55% name old speed new speed delta GobDecode-4 7.63MB/s ± 6% 8.08MB/s ± 5% +5.88% (p=0.000 n=30+30) GobEncode-4 9.85MB/s ± 4% 9.75MB/s ± 6% -1.04% (p=0.000 n=30+30) Gzip-4 4.56MB/s ± 0% 4.55MB/s ± 4% -0.36% (p=0.003 n=24+30) Gunzip-4 45.3MB/s ± 1% 46.2MB/s ± 0% +1.92% (p=0.000 n=23+23) JSONEncode-4 10.0MB/s ± 0% 10.4MB/s ± 4% ~ (p=0.403 n=23+30) JSONDecode-4 2.18MB/s ± 5% 2.33MB/s ± 0% +6.91% (p=0.000 n=30+23) GoParse-4 1.34MB/s ± 5% 1.32MB/s ± 5% -1.66% (p=0.000 n=30+30) RegexpMatchEasy0_32-4 30.2MB/s ± 0% 29.8MB/s ± 3% ~ (p=0.099 n=23+30) RegexpMatchEasy0_1K-4 185MB/s ± 0% 186MB/s ± 0% +0.24% (p=0.000 n=25+25) RegexpMatchEasy1_32-4 31.4MB/s ± 3% 31.8MB/s ± 0% +1.24% (p=0.000 n=30+24) RegexpMatchEasy1_1K-4 141MB/s ± 0% 140MB/s ± 0% -0.94% (p=0.000 n=23+26) RegexpMatchMedium_32-4 541kB/s ± 6% 560kB/s ± 0% +3.45% (p=0.000 n=30+23) RegexpMatchMedium_1K-4 1.85MB/s ± 0% 1.87MB/s ± 0% +1.08% (p=0.000 n=24+23) RegexpMatchHard_32-4 1.04MB/s ± 1% 1.06MB/s ± 1% +1.48% (p=0.000 n=24+24) RegexpMatchHard_1K-4 1.10MB/s ± 0% 1.10MB/s ± 5% +0.15% (p=0.004 n=23+30) Revcomp-4 31.3MB/s ± 4% 40.2MB/s ± 1% +28.52% (p=0.000 n=30+23) Template-4 2.16MB/s ± 6% 2.27MB/s ± 1% +5.18% (p=0.000 n=30+24) [Geo mean] 7.57MB/s 7.79MB/s +2.98% fixes #24907 Change-Id: I94afd0e3f53d62a1cf5e452f3dd6daf61be21785 Reviewed-on: https://go-review.googlesource.com/107376 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-04-19 15:08:10 +00:00
Cherry Zhang	3042463d61	cmd/compile: in escape analysis, use element type for OIND of slice The escape analysis models the flow of "content" of X with a level of "indirection" (OIND node) of X. This content can be pointer dereference, or slice/string element. For the latter case, the type of the OIND node should be the element type of the slice/string. This CL fixes this. In particular, this matters when the element type is pointerless, where the data flow should not cause any escape. Fixes #15730. Change-Id: Iba9f92898681625e7e3ddef76ae65d7cd61c41e0 Reviewed-on: https://go-review.googlesource.com/107597 Reviewed-by: David Chase <drchase@google.com>	2018-04-18 02:59:37 +00:00
Michael Munday	58cdecb9c8	cmd/compile: generate constants for NeqPtr, EqPtr and IsNonNil ops If both inputs are constant offsets from the same pointer then we can evaluate NeqPtr and EqPtr at compile time. Triggers a few times during all.bash. Removes a conditional branch in the following code: copy(x[1:], x[:]) This branch was recently added as an optimization in CL 94596. We now skip the memmove if the pointers are equal. However, in the above code we know at compile time that they are never equal. Also, when the offset is variable, check if the offset is zero rather than if the pointers are equal. For example: copy(x[a:], x[:]) This would now skip the copy if a == 0, rather than if x + a == x. Finally I've also added a rule to make IsNonNil true for pointers to values on the stack. The nil check elimination pass will catch these anyway, but eliminating them here might eliminate branches earlier. Change-Id: If72f436fef0a96ad0f4e296d3a1f8b6c3e712085 Reviewed-on: https://go-review.googlesource.com/106635 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-16 20:43:57 +00:00
Ben Shi	cd65bbc01b	cmd/compile/internal/ssa: optimize 386's subtraction The SUBL instruction can take a memory operand, and this CL implements this optimization. The go1 benchmark shows a little improvement. name old time/op new time/op delta BinaryTree17-4 3.27s ± 2% 3.29s ± 3% ~ (p=0.322 n=37+40) Fannkuch11-4 3.49s ± 0% 3.53s ± 1% +1.21% (p=0.000 n=31+40) FmtFprintfEmpty-4 46.2ns ± 3% 46.3ns ± 2% ~ (p=0.351 n=40+28) FmtFprintfString-4 82.0ns ± 3% 81.5ns ± 2% -0.69% (p=0.002 n=40+30) FmtFprintfInt-4 94.6ns ± 3% 94.6ns ± 6% ~ (p=0.913 n=39+37) FmtFprintfIntInt-4 147ns ± 3% 150ns ± 2% +1.72% (p=0.000 n=40+25) FmtFprintfPrefixedInt-4 186ns ± 3% 186ns ± 0% -0.33% (p=0.006 n=40+25) FmtFprintfFloat-4 388ns ± 4% 388ns ± 4% ~ (p=0.162 n=40+40) FmtManyArgs-4 612ns ± 3% 616ns ± 4% ~ (p=0.223 n=40+40) GobDecode-4 7.35ms ± 5% 7.42ms ± 5% ~ (p=0.095 n=40+40) GobEncode-4 7.21ms ± 8% 7.23ms ± 4% ~ (p=0.294 n=40+40) Gzip-4 360ms ± 4% 359ms ± 4% ~ (p=0.097 n=40+40) Gunzip-4 46.1ms ± 3% 45.6ms ± 3% -1.20% (p=0.000 n=40+40) HTTPClientServer-4 64.0µs ± 2% 64.1µs ± 2% ~ (p=0.648 n=39+40) JSONEncode-4 21.9ms ± 4% 22.1ms ± 5% ~ (p=0.086 n=40+40) JSONDecode-4 67.9ms ± 4% 66.7ms ± 4% -1.63% (p=0.000 n=40+40) Mandelbrot200-4 5.19ms ± 3% 5.17ms ± 3% ~ (p=0.881 n=40+40) GoParse-4 3.34ms ± 3% 3.28ms ± 2% -1.78% (p=0.000 n=40+40) RegexpMatchEasy0_32-4 101ns ± 5% 99ns ± 3% -2.40% (p=0.000 n=40+40) RegexpMatchEasy0_1K-4 851ns ± 1% 848ns ± 3% -0.36% (p=0.004 n=33+40) RegexpMatchEasy1_32-4 109ns ± 5% 105ns ± 3% -3.53% (p=0.000 n=39+40) RegexpMatchEasy1_1K-4 1.03µs ± 4% 1.03µs ± 3% ~ (p=0.638 n=40+38) RegexpMatchMedium_32-4 131ns ± 5% 127ns ± 4% -3.36% (p=0.000 n=38+40) RegexpMatchMedium_1K-4 43.4µs ± 4% 43.2µs ± 3% -0.46% (p=0.008 n=40+40) RegexpMatchHard_32-4 2.21µs ± 4% 2.23µs ± 1% +0.77% (p=0.014 n=40+28) RegexpMatchHard_1K-4 67.6µs ± 4% 67.7µs ± 3% +0.11% (p=0.016 n=40+40) Revcomp-4 1.86s ± 3% 1.77s ± 2% -4.81% (p=0.000 n=40+40) Template-4 71.7ms ± 3% 71.6ms ± 4% ~ (p=0.200 n=40+40) TimeParse-4 436ns ± 4% 433ns ± 3% ~ (p=0.358 n=40+40) TimeFormat-4 413ns ± 4% 412ns ± 3% ~ (p=0.415 n=40+40) [Geo mean] 63.9µs 63.6µs -0.49% name old speed new speed delta GobDecode-4 105MB/s ± 5% 104MB/s ± 5% ~ (p=0.096 n=40+40) GobEncode-4 106MB/s ± 7% 106MB/s ± 3% ~ (p=0.385 n=39+40) Gzip-4 54.0MB/s ± 4% 54.0MB/s ± 4% ~ (p=0.100 n=40+40) Gunzip-4 421MB/s ± 3% 426MB/s ± 3% +1.21% (p=0.000 n=40+40) JSONEncode-4 88.5MB/s ± 5% 88.0MB/s ± 5% ~ (p=0.083 n=40+40) JSONDecode-4 28.6MB/s ± 4% 29.1MB/s ± 4% +1.65% (p=0.000 n=40+40) GoParse-4 17.3MB/s ± 3% 17.7MB/s ± 2% +1.82% (p=0.000 n=40+40) RegexpMatchEasy0_32-4 316MB/s ± 5% 323MB/s ± 4% +2.44% (p=0.000 n=40+40) RegexpMatchEasy0_1K-4 1.20GB/s ± 1% 1.21GB/s ± 3% +0.40% (p=0.004 n=33+40) RegexpMatchEasy1_32-4 291MB/s ± 7% 302MB/s ± 4% +3.82% (p=0.000 n=40+40) RegexpMatchEasy1_1K-4 993MB/s ± 4% 990MB/s ± 3% ~ (p=0.623 n=40+38) RegexpMatchMedium_32-4 7.61MB/s ± 5% 7.87MB/s ± 4% +3.36% (p=0.000 n=38+40) RegexpMatchMedium_1K-4 23.6MB/s ± 4% 23.7MB/s ± 4% +0.46% (p=0.007 n=40+40) RegexpMatchHard_32-4 14.5MB/s ± 4% 14.3MB/s ± 1% -0.79% (p=0.017 n=40+28) RegexpMatchHard_1K-4 15.1MB/s ± 4% 15.1MB/s ± 3% -0.11% (p=0.015 n=40+40) Revcomp-4 137MB/s ± 3% 144MB/s ± 3% +5.06% (p=0.000 n=40+40) Template-4 27.1MB/s ± 3% 27.1MB/s ± 4% ~ (p=0.211 n=40+40) [Geo mean] 78.9MB/s 79.7MB/s +1.01% Change-Id: I638fa4fef85833e8605919d693f9570cc3cf7334 Reviewed-on: https://go-review.googlesource.com/107275 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-16 04:41:20 +00:00
Giovanni Bajo	e7b1d0a9cf	test: add missing copyright header Change-Id: Ia64535492515f725fe3c4b59ea300363a0c4ce10 Reviewed-on: https://go-review.googlesource.com/107136 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-15 21:17:54 +00:00
Giovanni Bajo	2954ef20bb	test: small cleanup of code and comments in run.go While writing CL 107315, I went back and forth for the syntax used for constraints of build environments in which the architecture did not support varitants ("plan9/amd64" vs "plan9/amd64/"). I eventually settled for the latter because the code required less heuristics (think parsing "plan9/386" vs "386/sse2") but there were a few leftovers in code and comments. Change-Id: I9d9a008f3814f9a1642609650eb571e7f1a675cf Reviewed-on: https://go-review.googlesource.com/107338 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-15 21:17:43 +00:00
Giovanni Bajo	284ba47b49	test: run codegen tests on all supported architecture variants This CL makes the codegen testsuite automatically test all architecture variants for architecture specified in tests. For instance, if a test file specifies a "arm" test, it will be automatically run on all GOARM variants (5,6,7), to increase the coverage. The CL also introduces a syntax to specify only a specific variant (eg: "arm/7") in case the test makes sense only there. The same syntax also allows to specify the operating system in case it matters (eg: "plan9/386/sse2"). Fixes #24658 Change-Id: I2eba8b918f51bb6a77a8431a309f8b71af07ea22 Reviewed-on: https://go-review.googlesource.com/107315 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-15 20:02:43 +00:00
Giovanni Bajo	01aa1d7dbe	test: migrate plan9 tests to codegen And remove it from asmtest. Next CL will remove the whole asmtest infrastructure. Change-Id: I5851bf7c617456d62a3c6cffacf70252df7b056b Reviewed-on: https://go-review.googlesource.com/107335 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-15 20:02:30 +00:00
Cherry Zhang	b08a9b7ecc	all: use new softfloat on GOARM=5 Use the new softfloat support in the compiler, originally added for softfloat on MIPS. This support is portable, so we can just use it for softfloat on ARM. In the old softfloat support on ARM, the compiler generates floating point instructions, then the assembler inserts calls to _sfloat before FP instructions. _sfloat decodes the following FP instructions and simulates them. In the new scheme, the compiler generates runtime calls to do FP operations at a higher level. It doesn't generate FP instructions, and therefore the assembler won't insert _sfloat calls, i.e. the old mechanism is automatically suppressed. The old method may be still be triggered with assembly code using FP instructions. In the standard library, the only occurance is math/sqrt_arm.s, which is rewritten to call to the Go implementation instead. Some significant speedups for code using floating points: name old time/op new time/op delta BinaryTree17-4 37.1s ± 2% 37.3s ± 1% ~ (p=0.105 n=10+10) Fannkuch11-4 13.0s ± 0% 13.1s ± 0% +0.46% (p=0.000 n=10+10) FmtFprintfEmpty-4 700ns ± 4% 734ns ± 6% +4.84% (p=0.009 n=10+10) FmtFprintfString-4 1.22µs ± 3% 1.22µs ± 4% ~ (p=0.897 n=10+10) FmtFprintfInt-4 1.27µs ± 2% 1.30µs ± 1% +1.91% (p=0.001 n=10+9) FmtFprintfIntInt-4 1.83µs ± 2% 1.81µs ± 3% ~ (p=0.149 n=10+10) FmtFprintfPrefixedInt-4 1.80µs ± 3% 1.81µs ± 2% ~ (p=0.421 n=10+8) FmtFprintfFloat-4 6.89µs ± 3% 3.59µs ± 2% -47.93% (p=0.000 n=10+10) FmtManyArgs-4 6.39µs ± 1% 6.09µs ± 1% -4.61% (p=0.000 n=10+9) GobDecode-4 109ms ± 2% 81ms ± 2% -25.99% (p=0.000 n=9+10) GobEncode-4 109ms ± 2% 76ms ± 2% -29.88% (p=0.000 n=10+9) Gzip-4 3.61s ± 1% 3.59s ± 1% ~ (p=0.247 n=10+10) Gunzip-4 449ms ± 4% 450ms ± 1% ~ (p=0.230 n=10+7) HTTPClientServer-4 1.55ms ± 3% 1.53ms ± 2% ~ (p=0.400 n=9+10) JSONEncode-4 356ms ± 1% 183ms ± 1% -48.73% (p=0.000 n=10+10) JSONDecode-4 1.12s ± 2% 0.87s ± 1% -21.88% (p=0.000 n=10+10) Mandelbrot200-4 5.49s ± 1% 2.55s ± 1% -53.45% (p=0.000 n=9+10) GoParse-4 49.6ms ± 2% 47.5ms ± 1% -4.08% (p=0.000 n=10+9) RegexpMatchEasy0_32-4 1.13µs ± 4% 1.20µs ± 4% +6.42% (p=0.000 n=10+10) RegexpMatchEasy0_1K-4 4.41µs ± 2% 4.44µs ± 2% ~ (p=0.128 n=10+10) RegexpMatchEasy1_32-4 1.15µs ± 5% 1.20µs ± 5% +4.85% (p=0.002 n=10+10) RegexpMatchEasy1_1K-4 6.21µs ± 2% 6.37µs ± 4% +2.62% (p=0.001 n=9+10) RegexpMatchMedium_32-4 1.58µs ± 5% 1.65µs ± 3% +4.85% (p=0.000 n=10+10) RegexpMatchMedium_1K-4 341µs ± 3% 351µs ± 7% ~ (p=0.573 n=8+10) RegexpMatchHard_32-4 21.4µs ± 3% 21.5µs ± 5% ~ (p=0.931 n=9+9) RegexpMatchHard_1K-4 626µs ± 2% 626µs ± 1% ~ (p=0.645 n=8+8) Revcomp-4 46.4ms ± 2% 47.4ms ± 2% +2.07% (p=0.000 n=10+10) Template-4 1.31s ± 3% 1.23s ± 4% -6.13% (p=0.000 n=10+10) TimeParse-4 4.49µs ± 1% 4.41µs ± 2% -1.81% (p=0.000 n=10+9) TimeFormat-4 9.31µs ± 1% 9.32µs ± 2% ~ (p=0.561 n=9+9) Change-Id: Iaeeff6c9a09c1b2c064d06e09dd88101dc02bfa4 Reviewed-on: https://go-review.googlesource.com/106735 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-13 16:39:39 +00:00
Cherry Zhang	5a91c83ce8	cmd/compile: in escape analysis, propagate loop depth to field The escape analysis models "loop depth". If the address of an expression is assigned to something defined at a lower (outer) loop depth, the escape analysis decides it escapes. However, it uses the loop depth of the address operator instead of where the RHS is defined. This causes an unnecessary escape if there is an assignment inside a loop but the RHS is defined outside the loop. This CL propagates the loop depth. Fixes #24730. Change-Id: I5ff1530688bdfd90561a7b39c8be9bfc009a9dae Reviewed-on: https://go-review.googlesource.com/105257 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-04-13 14:48:23 +00:00
Josh Bleecher Snyder	c1ed1f3c80	cmd/compile: fix evaluation of "" < s Fixes #24817 Change-Id: Ifa79ab3dfe69297eeef85f7193cd5f85e5982bc5 Reviewed-on: https://go-review.googlesource.com/106655 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-04-12 19:38:37 +00:00
Josh Bleecher Snyder	2dfb423e6e	cmd/compile: loop to ensure all autogenerated functions are compiled I was wrong. There was a need to loop here. Fixes #24761 Change-Id: If13b3ab72febde930bdaebdddd1c05e0d0446020 Reviewed-on: https://go-review.googlesource.com/105615 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-04-11 23:46:30 +00:00
Alberto Donizetti	467eca6076	test/codegen: port last stack and memcombining tests And delete them from asm_test. Also delete an arm64 cmov test has been already ported to the new test harness. Change-Id: I4458721e1f512bc9ecbbe1c22a2c9c7109ad68fe Reviewed-on: https://go-review.googlesource.com/106335 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-04-11 16:08:04 +00:00
Robert Griesemer	3d501df441	cmd/compile: better error message when referring to ambiguous method/field Fixes #14321. Change-Id: I9c92c767b01cf7938c4808a8fef9f2936fc667bc Reviewed-on: https://go-review.googlesource.com/106119 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-04-10 23:39:13 +00:00
Matthew Dempsky	535ad8efb8	cmd/compile: fix check that ensures main.main is a function The check was previously disallowing package main from even importing a non-function symbol named "main". Fixes #24801. Change-Id: I849b9713890429f0a16860ef16b5dc7e970d04a4 Reviewed-on: https://go-review.googlesource.com/106120 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-04-10 23:34:12 +00:00
Alberto Donizetti	188e2bf897	test/codegen: port arm64 BIC/EON/ORN and masking tests And delete them from asm_test. Change-Id: I24f421b87e8cb4770c887a6dfd58eacd0088947d Reviewed-on: https://go-review.googlesource.com/106056 Reviewed-by: Keith Randall <khr@golang.org>	2018-04-10 10:57:50 +00:00
Alberto Donizetti	d5ff631e6b	test/codegen: port last remaining misc bit/arithmetic tests And delete them from asm_test. Change-Id: I9a75efe9858ef9d7ac86065f860c2ae3f25b0941 Reviewed-on: https://go-review.googlesource.com/105597 Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2018-04-10 07:58:35 +00:00
Matthew Dempsky	49ed4cbe85	cmd/compile: sort method sets using package height Also, when statically building itabs, compare *types.Sym instead of name alone so that method sets with duplicate non-exported methods are handled correctly. Fixes #24693. Change-Id: I2db8a3d6e80991a71fef5586a15134b6de116269 Reviewed-on: https://go-review.googlesource.com/105039 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-04-10 00:06:06 +00:00
Matthew Dempsky	fe77a5413e	cmd/compile: fix constant pointer comparison failure Previously, constant pointer-typed expressions could use either Mpint or NilVal as their Val depending on their construction, but const.go expects each type to have a single corresponding Val kind. This CL changes pointer-typed expressions to exclusively use Mpint. Fixes #21221. Change-Id: I6ba36c9b11eb19a68306f0b296acb11a8c254c41 Reviewed-on: https://go-review.googlesource.com/105315 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-04-09 23:19:45 +00:00
Michael Munday	b65122f99a	cmd/compile: optimize comparisons using load merging where available Multi-byte comparison operations were used on amd64, arm64, i386 and s390x for comparisons with constant arrays, but only amd64 and i386 for comparisons with string constants. This CL combines the check for platform capability, since they have the same requirements, and also enables both on ppc64le which also supports load merging. Note that these optimizations currently use little endian byte order which results in byte reversal instructions on s390x. This should be fixed at some point. Change-Id: Ie612d13359b50c77f4d7c6e73fea4a59fa11f322 Reviewed-on: https://go-review.googlesource.com/102558 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-09 21:16:47 +00:00
Keith Randall	0de0ed369f	test: check that unaligned load-add opcodes work. A test for CL 102036. Change-Id: Ief6dcb4f478670813fbe22ea75a06815a4b201a3 Reviewed-on: https://go-review.googlesource.com/105875 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-09 18:57:37 +00:00
Alberto Donizetti	54c3f56ee0	test/codegen: port various mem-combining tests And delete them from asm_test. Change-Id: I0e33d58274951ab5acb67b0117b60ef617ea887a Reviewed-on: https://go-review.googlesource.com/105735 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2018-04-09 12:00:06 +00:00
Alberto Donizetti	3e31eb6b84	test/codegen: port arm64 slice zeroing tests Finish porting arm64 slice zeroing codegen tests; delete them from asm_test. Change-Id: Id2532df8ba1c340fa662a6b5238daa3de30548be Reviewed-on: https://go-review.googlesource.com/105136 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-04-07 09:55:51 +00:00
Matthew Dempsky	950a56899a	cmd/compile: fix method expressions with anonymous receivers Method expressions with anonymous receiver types like "struct { T }.m" require wrapper functions, which we weren't always creating. This in turn resulted in linker errors. This CL ensures that we generate wrapper functions for any anonymous receiver types used in a method expression. Fixes #22444. Change-Id: Ia8ac27f238c2898965e57b82a91d959792d2ddd4 Reviewed-on: https://go-review.googlesource.com/105044 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-06 15:39:11 +00:00
Matthew Dempsky	638f112d69	cmd/compile: cleanup method symbol creation There were multiple ad hoc ways to create method symbols, with subtle and confusing differences between them. This CL unifies them into a single well-documented encoding and implementation. This introduces some inconsequential changes to symbol format for the sake of simplicity and consistency. Two notable changes: 1) Symbol construction is now insensitive to the package currently being compiled. Previously, non-exported methods on anonymous types received different method symbols depending on whether the method was local or imported. 2) Symbols for method values parenthesized non-pointer receiver types and non-exported method names, and also always package-qualified non-exported method names. Now they use the same rules as normal method symbols. The methodSym function is also now stricter about rejecting non-sensical method/receiver combinations. Notably, this means that typecheckfunc needs to call addmethod to validate the method before calling declare, which also means we no longer emit errors about redeclaring bogus methods. Change-Id: I9501c7a53dd70ef60e5c74603974e5ecc06e2003 Reviewed-on: https://go-review.googlesource.com/104876 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-04-05 22:01:17 +00:00
Daniel Martí	9767727353	test: skip locklinear's lockmany test for now Since it's been reliably failing on one of the linux-arm builders (arm5spacemonkey) for a long time. Updates #24221. Change-Id: I8fccc7e16631de497ccc2c285e510a110a93ad95 Reviewed-on: https://go-review.googlesource.com/104535 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-05 10:53:40 +00:00
Alberto Donizetti	f2abca90a2	test/codegen: port arm64 byte slice zeroing tests And delete them from asm_test. Change-Id: Id533130470da9176a401cb94972f626f43a62148 Reviewed-on: https://go-review.googlesource.com/103656 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-04-04 13:18:15 +00:00
Robert Griesemer	4637699e92	cmd/compile/internal/syntax: better error message for incorrect if/switch header Fixes #23664. Change-Id: Ic0637e9f896b2fc6502dfbab2d1c4de3c62c0bd2 Reviewed-on: https://go-review.googlesource.com/104616 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Robert Griesemer <gri@golang.org>	2018-04-03 21:57:37 +00:00
Giovanni Bajo	ac43de3ae5	cmd/compile: in prove, complete support for OpIsInBounds/OpIsSliceInBounds The logic in addBranchRestrictions didn't allow to correctly model OpIs(Slice)Bound for signed domain, and it was also partly implemented within addRestrictions. Thanks to the previous changes, it is now possible to handle the negative conditions correctly, so that we can learn both signed/LT + unsigned/LT on the positive side, and signed/GE + unsigned/GE on the negative side (but only if the index can be proved to be non-negative). This is able to prove ~50 more slice accesses in std+cmd. Change-Id: I9858080dc03b16f85993a55983dbc4b00f8491b0 Reviewed-on: https://go-review.googlesource.com/104037 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-04-03 20:25:34 +00:00
isharipo	dcaf3fb134	cmd/compile: make DCE remove nodes after terminating if This change makes compiler frontend dead code elimination of const expr if statements introduced in https://golang.org/cl/38773 treat both if constCondTrue { ...; returnStmt } toBeRemoved... if constCondFalse { ...; } else { returnStmt } toBeRemoved... identically to: if constCondTrue { ...; returnStmt } else { toBeRemoved... } Where "constCondTrue" is a an expression that can be evaluated to "true" during compile time. The additional checks are only triggered for const expr if conditions that evaluate to true. name old time/op new time/op delta Template 431ms ± 2% 429ms ± 1% ~ (p=0.491 n=8+6) Unicode 198ms ± 4% 201ms ± 2% ~ (p=0.234 n=7+6) GoTypes 1.40s ± 1% 1.41s ± 2% ~ (p=0.053 n=7+7) Compiler 6.72s ± 2% 6.81s ± 1% +1.35% (p=0.011 n=7+7) SSA 17.3s ± 1% 17.3s ± 2% ~ (p=0.731 n=6+7) Flate 275ms ± 2% 275ms ± 2% ~ (p=0.902 n=7+7) GoParser 340ms ± 2% 339ms ± 2% ~ (p=0.902 n=7+7) Reflect 910ms ± 2% 905ms ± 1% ~ (p=0.310 n=6+6) Tar 403ms ± 1% 403ms ± 2% ~ (p=0.366 n=7+6) XML 486ms ± 1% 490ms ± 1% ~ (p=0.065 n=6+6) StdCmd 56.2s ± 1% 56.6s ± 2% ~ (p=0.620 n=7+7) name old user-time/op new user-time/op delta Template 559ms ± 8% 557ms ± 7% ~ (p=0.713 n=8+7) Unicode 266ms ±13% 277ms ± 9% ~ (p=0.157 n=8+7) GoTypes 1.83s ± 2% 1.84s ± 1% ~ (p=0.522 n=8+7) Compiler 8.67s ± 4% 8.89s ± 4% ~ (p=0.077 n=7+7) SSA 23.9s ± 1% 24.2s ± 1% +1.31% (p=0.005 n=7+7) Flate 351ms ± 4% 342ms ± 5% ~ (p=0.105 n=7+7) GoParser 437ms ± 2% 423ms ± 5% -3.14% (p=0.016 n=7+7) Reflect 1.16s ± 3% 1.15s ± 2% ~ (p=0.362 n=7+7) Tar 517ms ± 4% 511ms ± 3% ~ (p=0.538 n=7+7) XML 619ms ± 3% 617ms ± 4% ~ (p=0.483 n=7+7) Fixes #23521 Change-Id: I165a7827d869aeb93ce6047d026ff873d039a4f3 Reviewed-on: https://go-review.googlesource.com/91056 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-04-03 15:19:41 +00:00
Robert Griesemer	c65a2781be	cmd/compile: better handling of incorrect type switches Don't report errors if we don't have a correct type switch guard; instead ignore it and leave it to the type-checker to report the error. This leads to better error messages concentrating on the type switch guard rather than errors around (confusing) syntactic details. Also clean up some code setting up AssertExpr (they never have a nil Type field) and remove some incorrect TODOs. Fixes #24470. Change-Id: I69512f36e0417e3b5ea9c8856768e04b19d654a8 Reviewed-on: https://go-review.googlesource.com/103615 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-04-03 05:34:20 +00:00
isharipo	b44e73eacb	test/fixedbugs: fix bug248 and bug345 When test/run script was removed, these two tests were changed to be executed by test/run.go. Because errchk does not exit with non-zero status on errors, they were silently failing for a while. This change makes 2 things: 1. Compile tested packages in GOROOT/test to match older runner script behavior (strictly required only in bug345, optional in bug248) 2. Check command output with "(?m)^BUG" regexp. It approximates older `grep -q '^BUG' that was used before. See referenced issue for detailed explanation. Fixes #24629 Change-Id: Ie888dcdb4e25cdbb19d434bbc5cb03eb633e9ee8 Reviewed-on: https://go-review.googlesource.com/104095 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-02 20:08:27 +00:00
Bryan Chan	625f2dccd4	cmd/compile/internal/ssa: handle symbol address comparisons consistently CL 38338 introduced SSA rules to optimize two types of pointer equality tests: a pointer compared with itself, and comparison of addresses taken of two symbols which may have the same base. This patch adds rules to apply the same optimization to pointer inequality tests, which also ensures that two pointers to zero-width types cannot be both equal and unequal at the same time. Fixes #24503. Change-Id: Ic828aeb86ae2e680caf66c35f4c247674768a9ba Reviewed-on: https://go-review.googlesource.com/102275 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-31 21:37:13 +00:00
Alberto Donizetti	3b0b8bcd68	test/codegen: port stack-related tests to codegen And delete them from asm_test. Change-Id: Idfe1249052d82d15b9c30b292c78656a0bf7b48d Reviewed-on: https://go-review.googlesource.com/103315 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-30 08:08:06 +00:00
Alberto Donizetti	56eaf574a1	test/codegen: match 387 ops too for GOARCH=386 Change-Id: I99407e27e340689009af798989b33cef7cb92070 Reviewed-on: https://go-review.googlesource.com/103376 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-29 20:05:40 +00:00
Alberto Donizetti	aacf7a1846	test: avoid touching GOOS/GOARCH in codegen driver This change modifies the codegen test harness driver so that it no longer modifies the environment GOOS/GOARCH, since that seems to cause flakiness in other concurrently-running tests. The change also enables the codegen tests in run.go. Fixes #24538 Change-Id: I997ac1eb38eb92054efff67fe5c4d3cccc86412b Reviewed-on: https://go-review.googlesource.com/103455 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-29 19:00:10 +00:00
Alberto Donizetti	04e993f953	test: update list of escape reasons The escape_because.go test file (which tests the "because" escape explainations printed by `-m -m`) cointains a machine-generated list of all the escape reasons seen in the escape tests. The list appears to be outdated; moreove a new escape reason was added in CL 102895. This change re-generates the list. Change-Id: Idc721c6bbfe9516895b5cf1e6d09b77deda5a3dd Reviewed-on: https://go-review.googlesource.com/103375 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-29 14:17:40 +00:00
Alberto Donizetti	360c19157a	cmd/compile: print accurate escape reason for non-const-length slices This change makes `-m -m` print a better explanation for the case where a slice is marked as escaping and heap-allocated because it has a non-constant len/cap. Fixes #24578 Change-Id: I0ebafb77c758a99857d72b365817bdba7b446cc0 Reviewed-on: https://go-review.googlesource.com/102895 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-03-28 16:56:03 +00:00
Matthew Dempsky	7b177b1a03	cmd/compile: fix method set computation for shadowed methods In expandmeth, we call expand1/expand0 to build a list of all candidate methods to promote, and then we use dotpath to prune down which names actually resolve to a promoted method and how. However, previously we still computed "followsptr" based on the expand1/expand0 traversal (which is depth-first), rather than dotpath (which is breadth-first). The result is that we could sometimes end up miscomputing whether a particular promoted method involves a pointer traversal, which could result in bad code generation for method trampolines. Fixes #24547. Change-Id: I57dc014466d81c165b05d78b98610dc3765b7a90 Reviewed-on: https://go-review.googlesource.com/102618 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-27 18:56:36 +00:00
Alberto Donizetti	a27cd4fd31	test/codegen: port tbz/tbnz arm64 tests And delete them from asm_test. Change-Id: I34fcf85ae8ce09cd146fe4ce6a0ae7616bd97e2d Reviewed-on: https://go-review.googlesource.com/102296 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-24 09:35:53 +00:00
Giovanni Bajo	79112707bb	cmd/compile: add patterns for bit set/clear/complement on amd64 This patch completes implementation of BT(Q\|L), and adds support for BT(S\|R\|C)(Q\|L). Example of code changes from time.(*Time).addSec: if t.wall&hasMonotonic != 0 { 0x1073465 488b08 MOVQ 0(AX), CX 0x1073468 4889ca MOVQ CX, DX 0x107346b 48c1e93f SHRQ $0x3f, CX 0x107346f 48c1e13f SHLQ $0x3f, CX 0x1073473 48f7c1ffffffff TESTQ $-0x1, CX 0x107347a 746b JE 0x10734e7 if t.wall&hasMonotonic != 0 { 0x1073435 488b08 MOVQ 0(AX), CX 0x1073438 480fbae13f BTQ $0x3f, CX 0x107343d 7363 JAE 0x10734a2 Another example: t.wall = t.wall&nsecMask \| uint64(dsec)<<nsecShift \| hasMonotonic 0x10734c8 4881e1ffffff3f ANDQ $0x3fffffff, CX 0x10734cf 48c1e61e SHLQ $0x1e, SI 0x10734d3 4809ce ORQ CX, SI 0x10734d6 48b90000000000000080 MOVQ $0x8000000000000000, CX 0x10734e0 4809f1 ORQ SI, CX 0x10734e3 488908 MOVQ CX, 0(AX) t.wall = t.wall&nsecMask \| uint64(dsec)<<nsecShift \| hasMonotonic 0x107348b 4881e2ffffff3f ANDQ $0x3fffffff, DX 0x1073492 48c1e61e SHLQ $0x1e, SI 0x1073496 4809f2 ORQ SI, DX 0x1073499 480fbaea3f BTSQ $0x3f, DX 0x107349e 488910 MOVQ DX, 0(AX) Go1 benchmarks seem unaffected, and I would be surprised otherwise: name old time/op new time/op delta BinaryTree17-4 2.64s ± 4% 2.56s ± 9% -2.92% (p=0.008 n=9+9) Fannkuch11-4 2.90s ± 1% 2.95s ± 3% +1.76% (p=0.010 n=10+9) FmtFprintfEmpty-4 35.3ns ± 1% 34.5ns ± 2% -2.34% (p=0.004 n=9+8) FmtFprintfString-4 57.0ns ± 1% 58.4ns ± 5% +2.52% (p=0.029 n=9+10) FmtFprintfInt-4 59.8ns ± 3% 59.8ns ± 6% ~ (p=0.565 n=10+10) FmtFprintfIntInt-4 93.9ns ± 3% 91.2ns ± 5% -2.94% (p=0.014 n=10+9) FmtFprintfPrefixedInt-4 107ns ± 6% 104ns ± 6% ~ (p=0.099 n=10+10) FmtFprintfFloat-4 187ns ± 3% 188ns ± 3% ~ (p=0.505 n=10+9) FmtManyArgs-4 410ns ± 1% 415ns ± 6% ~ (p=0.649 n=8+10) GobDecode-4 5.30ms ± 3% 5.27ms ± 3% ~ (p=0.436 n=10+10) GobEncode-4 4.62ms ± 5% 4.47ms ± 2% -3.24% (p=0.001 n=9+10) Gzip-4 197ms ± 4% 193ms ± 3% ~ (p=0.123 n=10+10) Gunzip-4 30.4ms ± 3% 30.1ms ± 3% ~ (p=0.481 n=10+10) HTTPClientServer-4 76.3µs ± 1% 76.0µs ± 1% ~ (p=0.236 n=8+9) JSONEncode-4 10.5ms ± 9% 10.3ms ± 3% ~ (p=0.280 n=10+10) JSONDecode-4 42.3ms ±10% 41.3ms ± 2% ~ (p=0.053 n=9+10) Mandelbrot200-4 3.80ms ± 2% 3.72ms ± 2% -2.15% (p=0.001 n=9+10) GoParse-4 2.88ms ±10% 2.81ms ± 2% ~ (p=0.247 n=10+10) RegexpMatchEasy0_32-4 69.5ns ± 4% 68.6ns ± 2% ~ (p=0.171 n=10+10) RegexpMatchEasy0_1K-4 165ns ± 3% 162ns ± 3% ~ (p=0.137 n=10+10) RegexpMatchEasy1_32-4 65.7ns ± 6% 64.4ns ± 2% -2.02% (p=0.037 n=10+10) RegexpMatchEasy1_1K-4 278ns ± 2% 279ns ± 3% ~ (p=0.991 n=8+9) RegexpMatchMedium_32-4 99.3ns ± 3% 98.5ns ± 4% ~ (p=0.457 n=10+9) RegexpMatchMedium_1K-4 30.1µs ± 1% 30.4µs ± 2% ~ (p=0.173 n=8+10) RegexpMatchHard_32-4 1.40µs ± 2% 1.41µs ± 4% ~ (p=0.565 n=10+10) RegexpMatchHard_1K-4 42.5µs ± 1% 41.5µs ± 3% -2.13% (p=0.002 n=8+9) Revcomp-4 332ms ± 4% 328ms ± 5% ~ (p=0.720 n=9+10) Template-4 48.3ms ± 2% 49.6ms ± 3% +2.56% (p=0.002 n=8+10) TimeParse-4 252ns ± 2% 249ns ± 3% ~ (p=0.116 n=9+10) TimeFormat-4 262ns ± 4% 252ns ± 3% -4.01% (p=0.000 n=9+10) name old speed new speed delta GobDecode-4 145MB/s ± 3% 146MB/s ± 3% ~ (p=0.436 n=10+10) GobEncode-4 166MB/s ± 5% 172MB/s ± 2% +3.28% (p=0.001 n=9+10) Gzip-4 98.6MB/s ± 4% 100.4MB/s ± 3% ~ (p=0.123 n=10+10) Gunzip-4 639MB/s ± 3% 645MB/s ± 3% ~ (p=0.481 n=10+10) JSONEncode-4 185MB/s ± 8% 189MB/s ± 3% ~ (p=0.280 n=10+10) JSONDecode-4 46.0MB/s ± 9% 47.0MB/s ± 2% +2.21% (p=0.046 n=9+10) GoParse-4 20.1MB/s ± 9% 20.6MB/s ± 2% ~ (p=0.239 n=10+10) RegexpMatchEasy0_32-4 460MB/s ± 4% 467MB/s ± 2% ~ (p=0.165 n=10+10) RegexpMatchEasy0_1K-4 6.19GB/s ± 3% 6.28GB/s ± 3% ~ (p=0.165 n=10+10) RegexpMatchEasy1_32-4 487MB/s ± 5% 497MB/s ± 2% +2.00% (p=0.043 n=10+10) RegexpMatchEasy1_1K-4 3.67GB/s ± 2% 3.67GB/s ± 3% ~ (p=0.963 n=8+9) RegexpMatchMedium_32-4 10.1MB/s ± 3% 10.1MB/s ± 4% ~ (p=0.435 n=10+9) RegexpMatchMedium_1K-4 34.0MB/s ± 1% 33.7MB/s ± 2% ~ (p=0.173 n=8+10) RegexpMatchHard_32-4 22.9MB/s ± 2% 22.7MB/s ± 4% ~ (p=0.565 n=10+10) RegexpMatchHard_1K-4 24.0MB/s ± 3% 24.7MB/s ± 3% +2.64% (p=0.001 n=9+9) Revcomp-4 766MB/s ± 4% 775MB/s ± 5% ~ (p=0.720 n=9+10) Template-4 40.2MB/s ± 2% 39.2MB/s ± 3% -2.47% (p=0.002 n=8+10) The rules match ~1800 times during all.bash. Fixes #18943 Change-Id: I64be1ada34e89c486dfd935bf429b35652117ed4 Reviewed-on: https://go-review.googlesource.com/94766 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-24 02:38:50 +00:00
Alberto Donizetti	fc6280d4b0	test/codegen: port direct comparisons with memory tests And remove them from asm_test. Change-Id: I1ca29b40546d6de06f20bfd550ed8ff87f495454 Reviewed-on: https://go-review.googlesource.com/102115 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-22 17:20:09 +00:00
Alberto Donizetti	be371edd67	test/codegen: port comparisons tests to codegen And delete them from asm_test. Change-Id: I64c512bfef3b3da6db5c5d29277675dade28b8ab Reviewed-on: https://go-review.googlesource.com/101595 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-20 19:38:06 +00:00
Michael Munday	ae10914e67	cmd/compile: mark LAA and LAAG as clobbering flags on s390x The atomic add instructions modify the condition code and so need to be marked as clobbering flags. Fixes #24449. Change-Id: Ic69c8d775fbdbfb2a56c5e0cfca7a49c0d7f6897 Reviewed-on: https://go-review.googlesource.com/101455 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-20 09:44:50 +00:00
Vladimir Kuzmin	c12b185a6e	cmd/compile: avoid mapaccess at m[k]=append(m[k].. Currently rvalue m[k] is transformed during walk into: tmp1 := mapaccess(m, k) tmp2 := append(tmp1, ...) mapassign(m, k) = tmp2 However, this is suboptimal, as we could instead produce just: tmp := mapassign(m, k) tmp := append(tmp, ...) Optimization is possible only if during Order it may tell that m[k] is exactly the same at left and right part of assignment. It doesn't work: 1) m[f(k)] = append(m[f(k)], ...) 2) sink, m[k] = sink, append(m[k]...) 3) m[k] = append(..., m[k],...) Benchmark: name old time/op new time/op delta MapAppendAssign/Int32/256-8 33.5ns ± 3% 22.4ns ±10% -33.24% (p=0.000 n=16+18) MapAppendAssign/Int32/65536-8 68.2ns ± 6% 48.5ns ±29% -28.90% (p=0.000 n=20+20) MapAppendAssign/Int64/256-8 34.3ns ± 4% 23.3ns ± 5% -32.23% (p=0.000 n=17+18) MapAppendAssign/Int64/65536-8 65.9ns ± 7% 61.2ns ±19% -7.06% (p=0.002 n=18+20) MapAppendAssign/Str/256-8 116ns ±12% 79ns ±16% -31.70% (p=0.000 n=20+19) MapAppendAssign/Str/65536-8 134ns ±15% 111ns ±45% -16.95% (p=0.000 n=19+20) name old alloc/op new alloc/op delta MapAppendAssign/Int32/256-8 47.0B ± 0% 46.0B ± 0% -2.13% (p=0.000 n=19+18) MapAppendAssign/Int32/65536-8 27.0B ± 0% 20.7B ±30% -23.33% (p=0.000 n=20+20) MapAppendAssign/Int64/256-8 47.0B ± 0% 46.0B ± 0% -2.13% (p=0.000 n=20+17) MapAppendAssign/Int64/65536-8 27.0B ± 0% 27.0B ± 0% ~ (all equal) MapAppendAssign/Str/256-8 94.0B ± 0% 78.0B ± 0% -17.02% (p=0.000 n=20+16) MapAppendAssign/Str/65536-8 54.0B ± 0% 54.0B ± 0% ~ (all equal) Fixes #24364 Updates #5147 Change-Id: Id257d052b75b9a445b4885dc571bf06ce6f6b409 Reviewed-on: https://go-review.googlesource.com/100838 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-20 01:47:07 +00:00
Alberto Donizetti	5a4e09837c	test/codegen: port maps test to codegen And delete them from asm_test. Change-Id: I3cf0934706a640136cb0f646509174f8c1bf3363 Reviewed-on: https://go-review.googlesource.com/101395 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-19 13:39:34 +00:00
Alberto Donizetti	b61b1d2c57	test/codegen: port structs test to codegen And delete them from asm_test. Change-Id: Ia286239a3d8f3915f2ca25dbcb39f3354a4f8aea Reviewed-on: https://go-review.googlesource.com/101138 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-18 16:53:53 +00:00
Alberto Donizetti	cceee685be	test/codegen: port floats tests to codegen And delete them from asm_test. Change-Id: Ibdaca3496eefc73c731b511ddb9636a1f3dff68c Reviewed-on: https://go-review.googlesource.com/100915 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-15 18:05:59 +00:00
Giovanni Bajo	a35ec9a59e	cmd/compile: implement CMOV on amd64 This builds upon the branchelim pass, activating it for amd64 and lowering CondSelect. Special care is made to FPU instructions for NaN handling. Benchmark results on Xeon E5630 (Westmere EP): name old time/op new time/op delta BinaryTree17-16 4.99s ± 9% 4.66s ± 2% ~ (p=0.095 n=5+5) Fannkuch11-16 4.93s ± 3% 5.04s ± 2% ~ (p=0.548 n=5+5) FmtFprintfEmpty-16 58.8ns ± 7% 61.4ns ±14% ~ (p=0.579 n=5+5) FmtFprintfString-16 114ns ± 2% 114ns ± 4% ~ (p=0.603 n=5+5) FmtFprintfInt-16 181ns ± 4% 125ns ± 3% -30.90% (p=0.008 n=5+5) FmtFprintfIntInt-16 263ns ± 2% 217ns ± 2% -17.34% (p=0.008 n=5+5) FmtFprintfPrefixedInt-16 230ns ± 1% 212ns ± 1% -7.99% (p=0.008 n=5+5) FmtFprintfFloat-16 411ns ± 3% 344ns ± 5% -16.43% (p=0.008 n=5+5) FmtManyArgs-16 828ns ± 4% 790ns ± 2% -4.59% (p=0.032 n=5+5) GobDecode-16 10.9ms ± 4% 10.8ms ± 5% ~ (p=0.548 n=5+5) GobEncode-16 9.52ms ± 5% 9.46ms ± 2% ~ (p=1.000 n=5+5) Gzip-16 334ms ± 2% 337ms ± 2% ~ (p=0.548 n=5+5) Gunzip-16 64.4ms ± 1% 65.0ms ± 1% +1.00% (p=0.008 n=5+5) HTTPClientServer-16 156µs ± 3% 155µs ± 3% ~ (p=0.690 n=5+5) JSONEncode-16 21.0ms ± 1% 21.8ms ± 0% +3.76% (p=0.016 n=5+4) JSONDecode-16 95.1ms ± 0% 95.7ms ± 1% ~ (p=0.151 n=5+5) Mandelbrot200-16 6.38ms ± 1% 6.42ms ± 1% ~ (p=0.095 n=5+5) GoParse-16 5.47ms ± 2% 5.36ms ± 1% -1.95% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 111ns ± 1% 111ns ± 1% ~ (p=0.635 n=5+4) RegexpMatchEasy0_1K-16 408ns ± 1% 411ns ± 2% ~ (p=0.087 n=5+5) RegexpMatchEasy1_32-16 103ns ± 1% 104ns ± 1% ~ (p=0.484 n=5+5) RegexpMatchEasy1_1K-16 659ns ± 2% 652ns ± 1% ~ (p=0.571 n=5+5) RegexpMatchMedium_32-16 176ns ± 2% 174ns ± 1% ~ (p=0.476 n=5+5) RegexpMatchMedium_1K-16 58.6µs ± 4% 57.7µs ± 4% ~ (p=0.548 n=5+5) RegexpMatchHard_32-16 3.07µs ± 3% 3.04µs ± 4% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-16 89.2µs ± 1% 87.9µs ± 2% -1.52% (p=0.032 n=5+5) Revcomp-16 575ms ± 0% 587ms ± 2% +2.12% (p=0.032 n=4+5) Template-16 110ms ± 1% 107ms ± 3% -3.00% (p=0.032 n=5+5) TimeParse-16 463ns ± 0% 462ns ± 0% ~ (p=0.810 n=5+4) TimeFormat-16 538ns ± 0% 535ns ± 0% -0.63% (p=0.024 n=5+5) name old speed new speed delta GobDecode-16 70.7MB/s ± 4% 71.4MB/s ± 5% ~ (p=0.452 n=5+5) GobEncode-16 80.7MB/s ± 5% 81.2MB/s ± 2% ~ (p=1.000 n=5+5) Gzip-16 58.2MB/s ± 2% 57.7MB/s ± 2% ~ (p=0.452 n=5+5) Gunzip-16 302MB/s ± 1% 299MB/s ± 1% -0.99% (p=0.008 n=5+5) JSONEncode-16 92.4MB/s ± 1% 89.1MB/s ± 0% -3.63% (p=0.016 n=5+4) JSONDecode-16 20.4MB/s ± 0% 20.3MB/s ± 1% ~ (p=0.135 n=5+5) GoParse-16 10.6MB/s ± 2% 10.8MB/s ± 1% +2.00% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 286MB/s ± 1% 285MB/s ± 3% ~ (p=1.000 n=5+5) RegexpMatchEasy0_1K-16 2.51GB/s ± 1% 2.49GB/s ± 2% ~ (p=0.095 n=5+5) RegexpMatchEasy1_32-16 309MB/s ± 1% 307MB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_1K-16 1.55GB/s ± 2% 1.57GB/s ± 1% ~ (p=0.690 n=5+5) RegexpMatchMedium_32-16 5.68MB/s ± 2% 5.73MB/s ± 1% ~ (p=0.579 n=5+5) RegexpMatchMedium_1K-16 17.5MB/s ± 4% 17.8MB/s ± 4% ~ (p=0.500 n=5+5) RegexpMatchHard_32-16 10.4MB/s ± 3% 10.5MB/s ± 4% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-16 11.5MB/s ± 1% 11.7MB/s ± 2% +1.57% (p=0.032 n=5+5) Revcomp-16 442MB/s ± 0% 433MB/s ± 2% -2.05% (p=0.032 n=4+5) Template-16 17.7MB/s ± 1% 18.2MB/s ± 3% +3.12% (p=0.032 n=5+5) Change-Id: I6972e8f35f2b31f9a42ac473a6bf419a18022558 Reviewed-on: https://go-review.googlesource.com/100935 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-15 16:41:59 +00:00
Geoff Berry	e244a7a7d3	cmd/compile/internal/ssa: add patterns for arm64 bitfield opcodes Add patterns to match common idioms for EXTR, BFI, BFXIL, SBFIZ, SBFX, UBFIZ and UBFX opcodes. go1 benchmarks results on Amberwing: name old time/op new time/op delta FmtManyArgs 786ns ± 2% 714ns ± 1% -9.20% (p=0.000 n=10+10) Gzip 437ms ± 0% 402ms ± 0% -7.99% (p=0.000 n=10+10) FmtFprintfIntInt 196ns ± 0% 182ns ± 0% -7.28% (p=0.000 n=10+9) FmtFprintfPrefixedInt 207ns ± 0% 199ns ± 0% -3.86% (p=0.000 n=10+10) FmtFprintfFloat 324ns ± 0% 316ns ± 0% -2.47% (p=0.000 n=10+8) FmtFprintfInt 119ns ± 0% 117ns ± 0% -1.68% (p=0.000 n=10+9) GobDecode 12.8ms ± 2% 12.6ms ± 1% -1.62% (p=0.002 n=10+10) JSONDecode 94.4ms ± 1% 93.4ms ± 0% -1.10% (p=0.000 n=10+10) RegexpMatchEasy0_32 247ns ± 0% 245ns ± 0% -0.65% (p=0.000 n=10+10) RegexpMatchMedium_32 314ns ± 0% 312ns ± 0% -0.64% (p=0.000 n=10+10) RegexpMatchEasy0_1K 541ns ± 0% 538ns ± 0% -0.55% (p=0.000 n=10+9) TimeParse 450ns ± 1% 448ns ± 1% -0.42% (p=0.035 n=9+9) RegexpMatchEasy1_32 244ns ± 0% 243ns ± 0% -0.41% (p=0.000 n=10+10) GoParse 6.03ms ± 0% 6.00ms ± 0% -0.40% (p=0.002 n=10+10) RegexpMatchEasy1_1K 779ns ± 0% 777ns ± 0% -0.26% (p=0.000 n=10+10) RegexpMatchHard_32 2.75µs ± 0% 2.74µs ± 1% -0.06% (p=0.026 n=9+9) BinaryTree17 11.7s ± 0% 11.6s ± 0% ~ (p=0.089 n=10+10) HTTPClientServer 89.1µs ± 1% 89.5µs ± 2% ~ (p=0.436 n=10+10) RegexpMatchHard_1K 78.9µs ± 0% 79.5µs ± 2% ~ (p=0.469 n=10+10) FmtFprintfEmpty 58.5ns ± 0% 58.5ns ± 0% ~ (all equal) GobEncode 12.0ms ± 1% 12.1ms ± 0% ~ (p=0.075 n=10+10) Revcomp 669ms ± 0% 668ms ± 0% ~ (p=0.091 n=7+9) Mandelbrot200 5.35ms ± 0% 5.36ms ± 0% +0.07% (p=0.000 n=9+9) RegexpMatchMedium_1K 52.1µs ± 0% 52.1µs ± 0% +0.10% (p=0.000 n=9+9) Fannkuch11 3.25s ± 0% 3.26s ± 0% +0.36% (p=0.000 n=9+10) FmtFprintfString 114ns ± 1% 115ns ± 0% +0.52% (p=0.011 n=10+10) JSONEncode 20.2ms ± 0% 20.3ms ± 0% +0.65% (p=0.000 n=10+10) Template 91.3ms ± 0% 92.3ms ± 0% +1.08% (p=0.000 n=10+10) TimeFormat 484ns ± 0% 495ns ± 1% +2.30% (p=0.000 n=9+10) There are some opportunities to improve this change further by adding patterns to match the "extended register" versions of ADD/SUB/CMP, but I think that should be evaluated on its own. The regressions in Template and TimeFormat would likely be recovered by this, as they seem to be due to generating: ubfiz x0, x0, #3, #8 add x1, x2, x0 instead of add x1, x2, x0, lsl #3 Change-Id: I5644a8d70ac7a98e784a377a2b76ab47a3415a4b Reviewed-on: https://go-review.googlesource.com/88355 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-15 14:10:41 +00:00
Alberto Donizetti	ded9a1b372	test/codegen: port len/cap pow2 div tests to codegen And delete them from asm_test. Change-Id: I29c8d098a8893e6b669b6272a2f508985ac9d618 Reviewed-on: https://go-review.googlesource.com/100876 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-15 13:34:01 +00:00
Ilya Tocar	644d14ea0f	Revert "cmd/compile: implement CMOV on amd64" This reverts commit `080187f4f7`. It broke build of golang.org/x/exp/shiny/iconvg See issue 24395 for details Change-Id: Ifd6134f6214e6cee40bd3c63c32941d5fc96ae8b Reviewed-on: https://go-review.googlesource.com/100755 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-14 21:21:23 +00:00
Alberto Donizetti	cd3aae9b81	test/codegen: port all small memmove tests to codegen This change ports all the remaining tests checking that small memmoves are replaced with MOVs to the new codegen test harness, and deletes them from the asm_test file. Change-Id: I01c94b441e27a5d61518035af62d62779dafeb56 Reviewed-on: https://go-review.googlesource.com/100476 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-14 15:57:07 +00:00
Alberto Donizetti	858042b8fd	test/codegen: add codegen tests for div Change-Id: I6ce8981e85fd55ade6078b0946e54a9215d9deca Reviewed-on: https://go-review.googlesource.com/100575 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-14 15:56:46 +00:00
Tobias Klauser	d32018a500	test: check that size argument errors are emitted at call site Add tests for the "negative size argument in make." and "size argument too large in make." error messages to appear at call sites in case the size is a const defined on another line. As suggested by Matthew in a comment on CL 69910. Change-Id: I5c33d4bec4e3d20bb21fe8019df27999997ddff3 Reviewed-on: https://go-review.googlesource.com/100395 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-14 08:36:15 +00:00
Matthew Dempsky	e601c07908	cmd/compile: reject type switch with guarded declaration and no cases Fixes #23116. Change-Id: I5db5c5c39bbb50148ffa18c9393b045f255f80a3 Reviewed-on: https://go-review.googlesource.com/100459 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-13 22:02:46 +00:00
Robert Griesemer	363bcd7b4f	cmd/compile: use key position for key:val elements in composite literals Fixes #24339. Change-Id: Ie47764fed27f76b480834b1fdbed0512c94831d9 Reviewed-on: https://go-review.googlesource.com/100457 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-13 21:44:12 +00:00
Matthew Dempsky	09d4455f45	cmd/compile: enable inlining variadic functions As a side effect of working on mid-stack inlining, we've fixed support for inlining variadic functions. Might as well enable it. Change-Id: I7f555f8b941969791db7eb598c0b49f6dc0820aa Reviewed-on: https://go-review.googlesource.com/100456 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-13 20:34:03 +00:00
Vladimir Kuzmin	7395083136	cmd/compile: avoid extra mapaccess in "m[k] op= r" Currently, order desugars map assignment operations like m[k] op= r into m[k] = m[k] op r which in turn is transformed during walk into: tmp := mapaccess(m, k) tmp = tmp op r mapassign(m, k) = tmp However, this is suboptimal, as we could instead produce just: mapassign(m, k) op= r One complication though is if "r == 0", then "m[k] /= r" and "m[k] %= r" will panic, and they need to do so before* calling mapassign, otherwise we may insert a new zero-value element into the map. It would be spec compliant to just emit the "r != 0" check before calling mapassign (see #23735), but currently these checks aren't generated until SSA construction. For now, it's simpler to continue desugaring /= and %= into two map indexing operations. Fixes #23661. Change-Id: I46e3739d9adef10e92b46fdd78b88d5aabe68952 Reviewed-on: https://go-review.googlesource.com/91557 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-12 19:27:44 +00:00
isharipo	85a8d25d53	cmd/compile/internal/ssa: emit IMUL3{L/Q} for MUL{L/Q}const on x86 cmd/asm now supports three-operand form of IMUL, so instead of using IMUL with resultInArg0, emit IMUL3 instruction. This results in less redundant MOVs where SSA assigns different registers to input[0] and dst arguments. Note: these have exactly the same encoding when reg0=reg1: IMUL3x $const, reg0, reg1 IMULx $const, reg Two-operand IMULx is like a crippled IMUL3x, with dst fixed to input[0]. This is why we don't bother to generate IMULx for the case where dst is the same as input[0]. Change-Id: I4becda475b3dffdd07b6fdf1c75bacc82af654e4 Reviewed-on: https://go-review.googlesource.com/99656 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-12 19:02:36 +00:00
Giovanni Bajo	080187f4f7	cmd/compile: implement CMOV on amd64 This builds upon the branchelim pass, activating it for amd64 and lowering CondSelect. Special care is made to FPU instructions for NaN handling. Benchmark results on Xeon E5630 (Westmere EP): name old time/op new time/op delta BinaryTree17-16 4.99s ± 9% 4.66s ± 2% ~ (p=0.095 n=5+5) Fannkuch11-16 4.93s ± 3% 5.04s ± 2% ~ (p=0.548 n=5+5) FmtFprintfEmpty-16 58.8ns ± 7% 61.4ns ±14% ~ (p=0.579 n=5+5) FmtFprintfString-16 114ns ± 2% 114ns ± 4% ~ (p=0.603 n=5+5) FmtFprintfInt-16 181ns ± 4% 125ns ± 3% -30.90% (p=0.008 n=5+5) FmtFprintfIntInt-16 263ns ± 2% 217ns ± 2% -17.34% (p=0.008 n=5+5) FmtFprintfPrefixedInt-16 230ns ± 1% 212ns ± 1% -7.99% (p=0.008 n=5+5) FmtFprintfFloat-16 411ns ± 3% 344ns ± 5% -16.43% (p=0.008 n=5+5) FmtManyArgs-16 828ns ± 4% 790ns ± 2% -4.59% (p=0.032 n=5+5) GobDecode-16 10.9ms ± 4% 10.8ms ± 5% ~ (p=0.548 n=5+5) GobEncode-16 9.52ms ± 5% 9.46ms ± 2% ~ (p=1.000 n=5+5) Gzip-16 334ms ± 2% 337ms ± 2% ~ (p=0.548 n=5+5) Gunzip-16 64.4ms ± 1% 65.0ms ± 1% +1.00% (p=0.008 n=5+5) HTTPClientServer-16 156µs ± 3% 155µs ± 3% ~ (p=0.690 n=5+5) JSONEncode-16 21.0ms ± 1% 21.8ms ± 0% +3.76% (p=0.016 n=5+4) JSONDecode-16 95.1ms ± 0% 95.7ms ± 1% ~ (p=0.151 n=5+5) Mandelbrot200-16 6.38ms ± 1% 6.42ms ± 1% ~ (p=0.095 n=5+5) GoParse-16 5.47ms ± 2% 5.36ms ± 1% -1.95% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 111ns ± 1% 111ns ± 1% ~ (p=0.635 n=5+4) RegexpMatchEasy0_1K-16 408ns ± 1% 411ns ± 2% ~ (p=0.087 n=5+5) RegexpMatchEasy1_32-16 103ns ± 1% 104ns ± 1% ~ (p=0.484 n=5+5) RegexpMatchEasy1_1K-16 659ns ± 2% 652ns ± 1% ~ (p=0.571 n=5+5) RegexpMatchMedium_32-16 176ns ± 2% 174ns ± 1% ~ (p=0.476 n=5+5) RegexpMatchMedium_1K-16 58.6µs ± 4% 57.7µs ± 4% ~ (p=0.548 n=5+5) RegexpMatchHard_32-16 3.07µs ± 3% 3.04µs ± 4% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-16 89.2µs ± 1% 87.9µs ± 2% -1.52% (p=0.032 n=5+5) Revcomp-16 575ms ± 0% 587ms ± 2% +2.12% (p=0.032 n=4+5) Template-16 110ms ± 1% 107ms ± 3% -3.00% (p=0.032 n=5+5) TimeParse-16 463ns ± 0% 462ns ± 0% ~ (p=0.810 n=5+4) TimeFormat-16 538ns ± 0% 535ns ± 0% -0.63% (p=0.024 n=5+5) name old speed new speed delta GobDecode-16 70.7MB/s ± 4% 71.4MB/s ± 5% ~ (p=0.452 n=5+5) GobEncode-16 80.7MB/s ± 5% 81.2MB/s ± 2% ~ (p=1.000 n=5+5) Gzip-16 58.2MB/s ± 2% 57.7MB/s ± 2% ~ (p=0.452 n=5+5) Gunzip-16 302MB/s ± 1% 299MB/s ± 1% -0.99% (p=0.008 n=5+5) JSONEncode-16 92.4MB/s ± 1% 89.1MB/s ± 0% -3.63% (p=0.016 n=5+4) JSONDecode-16 20.4MB/s ± 0% 20.3MB/s ± 1% ~ (p=0.135 n=5+5) GoParse-16 10.6MB/s ± 2% 10.8MB/s ± 1% +2.00% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 286MB/s ± 1% 285MB/s ± 3% ~ (p=1.000 n=5+5) RegexpMatchEasy0_1K-16 2.51GB/s ± 1% 2.49GB/s ± 2% ~ (p=0.095 n=5+5) RegexpMatchEasy1_32-16 309MB/s ± 1% 307MB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_1K-16 1.55GB/s ± 2% 1.57GB/s ± 1% ~ (p=0.690 n=5+5) RegexpMatchMedium_32-16 5.68MB/s ± 2% 5.73MB/s ± 1% ~ (p=0.579 n=5+5) RegexpMatchMedium_1K-16 17.5MB/s ± 4% 17.8MB/s ± 4% ~ (p=0.500 n=5+5) RegexpMatchHard_32-16 10.4MB/s ± 3% 10.5MB/s ± 4% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-16 11.5MB/s ± 1% 11.7MB/s ± 2% +1.57% (p=0.032 n=5+5) Revcomp-16 442MB/s ± 0% 433MB/s ± 2% -2.05% (p=0.032 n=4+5) Template-16 17.7MB/s ± 1% 18.2MB/s ± 3% +3.12% (p=0.032 n=5+5) Change-Id: Ic7cb7374d07da031e771bdcbfdd832fd1b17159c Reviewed-on: https://go-review.googlesource.com/98695 Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-03-12 18:01:33 +00:00
Giovanni Bajo	f7ac70a566	test: move rotate tests to top-level testsuite. Remove old tests from asm_test. Change-Id: Ib408ec7faa60068bddecf709b93ce308e0ef665a Reviewed-on: https://go-review.googlesource.com/100075 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>	2018-03-11 10:08:18 +00:00
Alberto Donizetti	8e427a3878	test/codegen: add README file for the codegen test harness This change adds a README file inside the test/codegen directory, explaining how to run the codegen tests and the syntax of the regexps comments used to match assembly instructions. Change-Id: Ica4eb3ffa9c6975371538cc8ae0ac3c1a3a03baf Reviewed-on: https://go-review.googlesource.com/99156 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-09 18:38:53 +00:00
Alberto Donizetti	5f541b11aa	test/codegen: port MULs merging tests to codegen And delete them from asm_go. Change-Id: I0057cbd90ca55fa51c596e32406e190f3866f93e Reviewed-on: https://go-review.googlesource.com/99815 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-09 17:01:56 +00:00
Alberto Donizetti	cde34780b7	test/codegen: port math/bits.RotateLeft tests to codegen Only RotateLeft{64,32} were tested, and just for ppc64. This CL adds tests for RotateLeft{64,32,16,8} on arm64 and amd64/386, for the cases where the calls are actually instrinsified. RotateLeft tests (the last ones for math/bits functions) are deleted from asm_test. This CL also adds a space between the "//" and the arch name in the comments, to uniform this file to the style used in all the other files. Change-Id: Ifc2a27261d70bcc294b4ec64490d8367f62d2b89 Reviewed-on: https://go-review.googlesource.com/99596 Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-09 10:53:38 +00:00
Austin Clements	6436270dad	cmd/compile: add fence-post implications to prove This adds four new deductions to the prove pass, all related to adding or subtracting one from a value. This is the first hint of actual arithmetic relations in the prove pass. The most effective of these is x-1 >= w && x > min ⇒ x > w This helps eliminate bounds checks in code like if x > 0 { // do something with s[x-1] } Altogether, these deductions prove an additional 260 branches in std and cmd. Furthermore, they will let us eliminate some tricky compiler-inserted panics in the runtime that are interfering with static analysis. Fixes #23354. Change-Id: I7088223e0e0cd6ff062a75c127eb4bb60e6dce02 Reviewed-on: https://go-review.googlesource.com/87480 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:28 +00:00
Austin Clements	941fc129e2	cmd/compile: derive unsigned limits from signed limits in prove This adds a few simple deductions to the prove pass' fact table to derive unsigned concrete limits from signed concrete limits where possible. This tweak lets the pass prove 70 additional branch conditions in std and cmd. This is based on a comment from the recently-deleted factsTable.get: "// TODO: also use signed data if lim.min >= 0". Change-Id: Ib4340249e7733070f004a0aa31254adf5df8a392 Reviewed-on: https://go-review.googlesource.com/87479 Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:27 +00:00
Austin Clements	669db2cef5	cmd/compile: make prove pass use unsatisfiability Currently the prove pass uses implication queries. For each block, it collects the set of branch conditions leading to that block, and queries this fact table for whether any of these facts imply the block's own branch condition (or its inverse). This works remarkably well considering it doesn't do any deduction on these facts, but it has various downsides: 1. It requires an implementation both of adding facts to the table and determining implications. These are very nearly duals of each other, but require separate implementations. Likewise, the process of asserting facts of dominating branch conditions is very nearly the dual of the process of querying implied branch conditions. 2. It leads to less effective use of derived facts. For example, the prove pass currently derives facts about the relations between len and cap, but can't make use of these unless a branch condition is in the exact form of a derived fact. If one of these derived facts contradicts another fact, it won't notice or make use of this. This CL changes the approach of the prove pass to instead use contradiction instead of implication. Rather than ever querying a branch condition, it simply adds branch conditions to the fact table. If this leads to a contradiction (specifically, it makes the fact set unsatisfiable), that branch is impossible and can be cut. As a result, 1. We can eliminate the code for determining implications (factsTable.get disappears entirely). Also, there is now a single implementation of visiting and asserting branch conditions, since we don't have to flip them around to treat them as facts in one place and queries in another. 2. Derived facts can be used effectively. It doesn't matter why the fact table is unsatisfiable; a contradiction in any of the facts is enough. 3. As an added benefit, it's now quite easy to avoid traversing beyond provably-unreachable blocks. In contrast, the current implementation always visits all blocks. The prove pass already has nearly all of the mechanism necessary to compute unsatisfiability, which means this both simplifies the code and makes it more powerful. The only complication is that the current implication procedure has a hack for dealing with the 0 <= Args[0] condition of OpIsInBounds and OpIsSliceInBounds. We replace this with asserting the appropriate fact when we process one of these conditions. This seems much cleaner anyway, and works because we can now take advantage of derived facts. This has no measurable effect on compiler performance. Effectiveness: There is exactly one condition in all of std and cmd that this fails to prove that the old implementation could: (int64(^uint(0)>>1) < x) in encoding/gob. This can never be true because x is an int, and it's basically coincidence that the old code gets this. (For example, it fails to prove the similar (x < ^int64(^uint(0)>>1)) condition that immediately precedes it, and even though the conditions are logically unrelated, it wouldn't get the second one if it hadn't first processed the first!) It does, however, prove a few dozen additional branches. These come from facts that are added to the fact table about the relations between len and cap. These were almost never queried directly before, but could lead to contradictions, which the unsat-based approach is able to use. There are exactly two branches in std and cmd that this implementation proves in the other direction. This sounds scary, but is okay because both occur in already-unreachable blocks, so it doesn't matter what we chose. Because the fact table logic is sound but incomplete, it fails to prove that the block isn't reachable, even though it is able to prove that both outgoing branches are impossible. We could turn these blocks into BlockExit blocks, but it doesn't seem worth the trouble of the extra proof effort for something that happens twice in all of std and cmd. Tests: This CL updates test/prove.go to change the expected messages because it can no longer give a "reason" why it proved or disproved a condition. It also adds a new test of a branch it couldn't prove before. It mostly guts test/sliceopt.go, removing everything related to slice bounds optimizations and moving a few relevant tests to test/prove.go. Much of this test is actually unreachable. The new prove pass figures this out and doesn't try to prove anything about the unreachable parts. The output on the unreachable parts is already suspect because anything can be proved at that point, so it's really just a regression test for an algorithm the compiler no longer uses. This is a step toward fixing #23354. That issue is quite easy to fix once we can use derived facts effectively. Change-Id: Ia48a1b9ee081310579fe474e4a61857424ff8ce8 Reviewed-on: https://go-review.googlesource.com/87478 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:25 +00:00
Alberto Donizetti	3772b2e1d5	test/codegen: port 2^n muls tests to codegen harness And delete them from the asm_test.go file. Change-Id: I124c8c352299646ec7db0968cdb0fe59a3b5d83d Reviewed-on: https://go-review.googlesource.com/99475 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-08 16:30:14 +00:00
Matthew Dempsky	88466e93a4	cmd/compile: mark anonymous receiver parameters as non-escaping This was already done for normal parameters, and the same logic applies for receiver parameters too. Updates #24305. Change-Id: Ia2a46f68d14e8fb62004ff0da1db0f065a95a1b7 Reviewed-on: https://go-review.googlesource.com/99335 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 00:20:01 +00:00
Alberto Donizetti	c028958393	test/codegen: fix issue with arm64 memmove codegen test This recently added arm64 memmove codegen check: func movesmall() { // arm64:-"memmove" x := [...]byte{1, 2, 3, 4, 5, 6, 7} copy(x[1:], x[:]) } is not correct, for two reasons: 1. regexps are matched from the start of the disasm line (excluding line information). This mean that a negative -"memmove" check will pass against a 'CALL runtime.memmove' line because the line does not start with 'memmove' (its starts with CALL...). The way to specify no 'memmove' match whatsoever on the line is -".memmove" 2. AFAIK comments on their own line are matched against the first subsequent non-comment line. So the code above only verifies that the x := ... line does not generate a memmove. The comment should be moved near the copy() line, if it's that one we want to not generate a memmove call. The fact that the test above is not effective can be checked by running `go run run.go -v codegen` in the toplevel test directory with a go1.10 toolchain (that does not have the memmove-elision optimization). The test will still pass (it shouldn't). This change changes the regexp to -".memmove" and moves it near the line it needs to (not)match. Change-Id: Ie01ef4d775e77d92dc8d8b7856b89b200f5e5ef2 Reviewed-on: https://go-review.googlesource.com/98977 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-07 16:41:24 +00:00
Kunpei Sakai	b75e8a2a3b	cmd/compile: prevent detection of wrong duplicates by including *types.Type in typeVal. Updates #21866 Fixes #24159 Change-Id: I2f8cac252d88d43e723124f2867b1410b7abab7b Reviewed-on: https://go-review.googlesource.com/98476 Run-TryBot: Kunpei Sakai <namusyaka@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-07 01:26:00 +00:00
ChrisALiles	42ecf39e85	cmd/compile: improve compiler error on embedded structs Fixes #23609 Change-Id: I751aae3d849de7fce1306324fcb1a4c3842d873e Reviewed-on: https://go-review.googlesource.com/97076 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-06 21:06:46 +00:00
Alberto Donizetti	8516ecd05f	test/codegen: port math/bits.ReverseBytes tests to codegen And remove them from ssa_test. Change-Id: If767af662801219774d1bdb787c77edfa6067770 Reviewed-on: https://go-review.googlesource.com/98976 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-06 20:34:33 +00:00
Wei Xiao	05962561ae	cmd/compile/internal/ssa: improve store combine optimization on arm64 Current implementation doesn't consider MOVDreg type operand and fail to combine it into larger store. This patch fixes the issue. Fixes #24242 Change-Id: I7d68697f80e76f48c3528ece01a602bf513248ec Reviewed-on: https://go-review.googlesource.com/98397 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-06 20:29:04 +00:00
Balaram Makam	0e8b7110f6	cmd/compile/internal/ssa: inline small memmove for arm64 This patch enables the optimization for arm64 target. Performance results on Amberwing for strconv benchmark: name old time/op new time/op delta Quote 721ns ± 0% 617ns ± 0% -14.40% (p=0.016 n=5+4) QuoteRune 118ns ± 0% 117ns ± 0% -0.85% (p=0.008 n=5+5) AppendQuote 436ns ± 2% 321ns ± 0% -26.31% (p=0.008 n=5+5) AppendQuoteRune 34.7ns ± 0% 28.4ns ± 0% -18.16% (p=0.000 n=5+4) [Geo mean] 189ns 160ns -15.41% Change-Id: I5714c474e7483d07ca338fbaf49beb4bbcc11c44 Reviewed-on: https://go-review.googlesource.com/98735 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-06 18:37:19 +00:00
Alberto Donizetti	18ae5eca3b	test/codegen: port math/bits.OnesCount tests to codegen And remove them from ssa_test. Change-Id: I3efac5fea529bb0efa2dae32124530482ba5058e Reviewed-on: https://go-review.googlesource.com/98815 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-06 17:53:00 +00:00
Alberto Donizetti	85dcc709a8	test/codegen: port math/bits.TrailingZeros tests to codegen And remove them from ssa_test. Change-Id: Ib5de5c0d908f23915e0847eca338cacf2fa5325b Reviewed-on: https://go-review.googlesource.com/98795 Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-06 11:48:37 +00:00
Alberto Donizetti	83e41b3e76	test/codegen: port math/bits.Leadingzero tests to codegen Change-Id: Ic21d25db5d56ce77516c53082dfbc010e5875b81 Reviewed-on: https://go-review.googlesource.com/98655 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-05 19:52:04 +00:00

... 2 3 4 5 6 ...

3244 commits