Plan 9 versions for amd64 have 2 megabyte pages.
This also fixes the logic for 32-bit vs 64-bit Plan 9,
making 64-bit the default, and adds logic to generate
a symbols table.
R=golang-dev, rsc, rminnich, ality, 0intro
CC=golang-dev, john
https://golang.org/cl/6218046
The old code generated for a bounds check was
CMP
JLT ok
CALL panicindex
ok:
...
The new code is (once the linker finishes with it):
CMP
JGE panic
...
panic:
CALL panicindex
which moves the calls out of line, putting more useful
code in each cache line. This matters especially in tight
loops, such as in Fannkuch. The benefit is more modest
elsewhere, but real.
From test/bench/go1, amd64:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 6096092000 6088808000 -0.12%
BenchmarkFannkuch11 6151404000 4020463000 -34.64%
BenchmarkGobDecode 28990050 28894630 -0.33%
BenchmarkGobEncode 12406310 12136730 -2.17%
BenchmarkGzip 179923 179903 -0.01%
BenchmarkGunzip 11219 11130 -0.79%
BenchmarkJSONEncode 86429350 86515900 +0.10%
BenchmarkJSONDecode 334593800 315728400 -5.64%
BenchmarkRevcomp25M 1219763000 1180767000 -3.20%
BenchmarkTemplate 492947600 483646800 -1.89%
And 386:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 6354902000 6243000000 -1.76%
BenchmarkFannkuch11 8043769000 7326965000 -8.91%
BenchmarkGobDecode 19010800 18941230 -0.37%
BenchmarkGobEncode 14077500 13792460 -2.02%
BenchmarkGzip 194087 193619 -0.24%
BenchmarkGunzip 12495 12457 -0.30%
BenchmarkJSONEncode 125636400 125451400 -0.15%
BenchmarkJSONDecode 696648600 685032800 -1.67%
BenchmarkRevcomp25M 2058088000 2052545000 -0.27%
BenchmarkTemplate 602140000 589876800 -2.04%
To implement this, two new instruction forms:
JLT target // same as always
JLT $0, target // branch expected not taken
JLT $1, target // branch expected taken
The linker could also emit the prediction prefixes, but it
does not: expected taken branches are reversed so that the
expected case is not taken (as in example above), and
the default expectaton for such a jump is not taken
already.
R=golang-dev, gri, r, dave
CC=golang-dev
https://golang.org/cl/6248049
Implement the (3-per-family) Noah's Ark clause (i.e. don't put
more than three identical elements on the list of active formatting
elements.
Also, when running tests, sort attributes by name before dumping
them.
Pass 4 additional tests with Noah's Ark clause (including one
that needs attributes to be sorted).
Pass 5 additional, unrelated tests because of sorting attributes.
R=nigeltao, rsc
CC=golang-dev
https://golang.org/cl/6247056
CanonicalHeaderKey didn't allocate, but it did use unnecessary
CPU in the hot path, deciding it didn't need to allocate.
I considered using constants for all these common header keys
but I didn't think it would be prettier. "Content-Length" looks
better than contentLength or hdrContentLength, etc.
R=golang-dev, dave
CC=golang-dev
https://golang.org/cl/6255053
Comment on cache keys above connectMethod says "http to proxy, http
anywhere after that", however in reality target address was always
included, which prevented http requests to different target
addresses to reuse the same http proxy connection.
R=golang-dev, r, rsc, bradfitz
CC=golang-dev
https://golang.org/cl/5901064
CL 5956051 introduced too many call != nil checks, so
attempt to improve this by splitting logic into three
distinct parts.
R=r
CC=golang-dev
https://golang.org/cl/6248048
for expr1, expr2 = range slice
was assigning to expr1 and expr2 in sequence
instead of in parallel. Now it assigns in parallel,
as it should. This matters for things like
for i, x[i] = range slice.
Fixes#3464.
R=ken2
CC=golang-dev
https://golang.org/cl/6252048
This is from CL 5451105 but was dropped from that CL.
See also CL 6137051.
The only change compared to 5451105 is to check for
h != nil in reflect·mapiterinit; allowing use of nil maps
must have happened after that original CL.
Fixes#3573.
R=golang-dev, dave, r
CC=golang-dev
https://golang.org/cl/6215078
Remove redundant checks for integration points.
Ignore null bytes in text.
Don't break out of foreign content for a <font> tag unless it
has a color, face, or size attribute.
Check for MathML text integration points when breaking out of
foreign content.
Pass two new tests.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6256045
The bulk of the gains come from hoisting the modulo ops outside of
the inner loop.
Reducing the digest type from 8 bytes to 4 bytes gains another 1% on
the hash/adler32 micro-benchmark.
Benchmarks for $GOOS,$GOARCH = linux,amd64 below.
hash/adler32 benchmark:
benchmark old ns/op new ns/op delta
BenchmarkAdler32KB 1660 1364 -17.83%
image/png benchmark:
benchmark old ns/op new ns/op delta
BenchmarkDecodeGray 2466909 2425539 -1.68%
BenchmarkDecodeNRGBAGradient 9884500 9751705 -1.34%
BenchmarkDecodeNRGBAOpaque 8511615 8379800 -1.55%
BenchmarkDecodePaletted 1366683 1330677 -2.63%
BenchmarkDecodeRGB 6987496 6884974 -1.47%
BenchmarkEncodePaletted 6292408 6040052 -4.01%
BenchmarkEncodeRGBOpaque 19780680 19178440 -3.04%
BenchmarkEncodeRGBA 80738600 79076800 -2.06%
Wall time for Denis Cheremisov's PNG-decoding program given in
https://groups.google.com/group/golang-nuts/browse_thread/thread/22aa8a05040fdd49
Before: 2.44s
After: 2.26s
Delta: -7%
R=rsc
CC=golang-dev
https://golang.org/cl/6251044
When client fails to write a request is sends caller that error,
however server might have failed to read that request in the mean
time and replied with that error. When client then reads the
response the call would no longer be pending, so call will be nil
Handle this gracefully by discarding such server responses
R=golang-dev, r
CC=golang-dev, rsc
https://golang.org/cl/5956051
* Eliminate bounds check on known small shifts.
* Rewrite x<<s | x>>(32-s) as a rotate (constant s).
* More aggressive (but still minimal) range analysis.
R=ken, dave, iant
CC=golang-dev
https://golang.org/cl/6209077
The previous attempt to explain this got it backwards (all the more reason to be
sad we couldn't make the two functions behave the same).
Fixes#3669.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6249051
There's no need for the 16-bit arithmetic here,
and it tickles a long-standing compiler bug.
Fix the exp code not to use 16-bit math and
create an explicit test for the compiler bug.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6256048
- interface methods appeared under VarDecl in search results
(long-standing TODO)
- don't walk parts of AST which contain no indexable material
(minor performance tuning)
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6228047
The documentation says so, but in the case of a normalized
integral Rat, the denominator was a new value. Changed the
internal representation to use an Int to represent the
denominator (with the sign ignored), so a reference to it
can always be returned.
Clarified documentation and added test cases.
Fixes#3521.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6237045
* Shift/rotate by constant doesn't have to stop subprop. (also in 8g)
* Remove redundant MOVLQZX instructions.
* An attempt at issuing loads early.
Good for 0.5% on a good day, might not be worth keeping.
Need to understand more about whether the x86
looks ahead to what loads might be coming up.
R=ken2, ken
CC=golang-dev
https://golang.org/cl/6203091
Detect HTML integration points and MathML text integration points.
At these points, process tokens as HTML, not as foreign content.
Pass 33 more tests.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6249044
Import updated test data from the WebKit Subversion repository (SVN revision 118111).
Some of the old tests were failing because we were HTML5 compliant, but the tests weren't.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6228049
I needed this to explore per-GOOS/GOARCH differences in pkg
syscall for a recent CL. Others may find it useful too.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6236046
- there is no label scope at package level
- open/close all scopes symmetrically now
that there is only one parse entry point
(parseFile)
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6230047