mirror of
https://github.com/golang/go
synced 2024-11-02 11:50:30 +00:00
cmd/compile: reorganise and improve ssa/README.md
Since the initial version was written, I've gotten help writing cmd/compile/README.md and I've also learned some more on my own, so it's time to organise this document better and expand it. First, split up the document in sections, starting from the simplest ideas that can be explained on their own. From there, build all the way up into SSA functions and how they are compiled. Each of the sections also gets more detail now; most ideas that were a paragraph are now a section with several paragraphs. No new major sections have been added in this CL. While at it, add a copyright notice and make better use of markdown, just like in the other README.md. Also fix a file path in value.go, which I noticed to be stale while reading godocs to write the document. Finally, leave a few TODO comments for areas that would benefit from extra input from people familiar with the SSA package. They will be taken care of in future CLs. Change-Id: I85e7a69a0b3260e72139991a625d926099624f71 Reviewed-on: https://go-review.googlesource.com/110067 Reviewed-by: Keith Randall <khr@golang.org>
This commit is contained in:
parent
2ee6bfbd7e
commit
aad71d3163
2 changed files with 192 additions and 42 deletions
|
@ -1,59 +1,209 @@
|
|||
This package contains the compiler's Static Single Assignment form
|
||||
component. If you're not familiar with SSA, Wikipedia is a good starting
|
||||
point:
|
||||
<!---
|
||||
// Copyright 2018 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
-->
|
||||
|
||||
https://en.wikipedia.org/wiki/Static_single_assignment_form
|
||||
## Introduction to the Go compiler's SSA backend
|
||||
|
||||
SSA is useful to perform transformations and optimizations, which can be
|
||||
found in this package in the form of compiler passes and rewrite rules.
|
||||
The former can be found in the "passes" array in compile.go, while the
|
||||
latter are generated from gen/*.rules.
|
||||
This package contains the compiler's Static Single Assignment form component. If
|
||||
you're not familiar with SSA, its [Wikipedia
|
||||
article](https://en.wikipedia.org/wiki/Static_single_assignment_form) is a good
|
||||
starting point.
|
||||
|
||||
Like most other SSA forms, funcs consist of blocks and values. Values
|
||||
perform an operation, which is encoded in the form of an operator and a
|
||||
number of arguments. The semantics of each Op can be found in
|
||||
gen/*Ops.go.
|
||||
It is recommended that you first read [cmd/compile/README.md](../../README.md)
|
||||
if you are not familiar with the Go compiler already. That document gives an
|
||||
overview of the compiler, and explains what is SSA's part and purpose in it.
|
||||
|
||||
gen/* is used to generate code in the ssa package. This includes
|
||||
opGen.go from gen/*Ops.go, and the rewrite*.go files from gen/*.rules.
|
||||
To regenerate these files, see gen/README.
|
||||
### Key concepts
|
||||
|
||||
Blocks can have multiple forms. For example, BlockPlain will always hand
|
||||
the control flow to another block, and BlockIf will flow to one of two
|
||||
blocks depending on a value. See block.go for more details.
|
||||
The names described below may be loosely related to their Go counterparts, but
|
||||
note that they are not equivalent. For example, a Go block statement has a
|
||||
variable scope, yet SSA has no notion of variables nor variable scopes.
|
||||
|
||||
Values also have types. For example, a constant boolean value will have
|
||||
a Bool type, and a variable definition value will have a memory type.
|
||||
It may also be surprising that values and blocks are named after their unique
|
||||
sequential IDs. They rarely correspond to named entities in the original code,
|
||||
such as variables or function parameters. The sequential IDs also allow the
|
||||
compiler to avoid maps, and it is always possible to track back the values to Go
|
||||
code using debug and position information.
|
||||
|
||||
The memory type is special - it represents the global memory state. For
|
||||
example, an Op that takes a memory argument depends on that memory
|
||||
state, and an Op which has the memory type impacts the state of memory.
|
||||
This is important so that memory operations are kept in the right order.
|
||||
#### Values
|
||||
|
||||
For example, take this program:
|
||||
Values are the basic building blocks of SSA. Per SSA's very definition, a
|
||||
value is defined exactly once, but it may be used any number of times. A value
|
||||
mainly consists of a unique identifier, an operator, a type, and some arguments.
|
||||
|
||||
func f(a, b *int) {
|
||||
*a = 3
|
||||
*b = *a
|
||||
}
|
||||
An operator or `Op` describes the operation that computes the value. The
|
||||
semantics of each operator can be found in `gen/*Ops.go`. For example, `OpAdd8`
|
||||
takes two value arguments holding 8-bit integers and results in their addition.
|
||||
Here is a possible SSA representation of the addition of two `uint8` values:
|
||||
|
||||
The two generated stores may show up as follows:
|
||||
// var c uint8 = a + b
|
||||
v4 = Add8 <uint8> v2 v3
|
||||
|
||||
v10 (4) = Store <mem> {int} v6 v8 v1
|
||||
v14 (5) = Store <mem> {int} v7 v8 v10
|
||||
A value's type will usually be a Go type. For example, the value in the example
|
||||
above has a `uint8` type, and a constant boolean value will have a `bool` type.
|
||||
However, certain types don't come from Go and are special; below we will cover
|
||||
`memory`, the most common of them.
|
||||
|
||||
Since the second store has a memory argument v10, it cannot be reordered
|
||||
before the first store, which sets that global memory state. And the
|
||||
logic translates to the code; reordering the two assignments would
|
||||
result in a different program.
|
||||
See [value.go](value.go) for more information.
|
||||
|
||||
#### Memory types
|
||||
|
||||
`memory` represents the global memory state. An `Op` that takes a memory
|
||||
argument depends on that memory state, and an `Op` which has the memory type
|
||||
impacts the state of memory. This ensures that memory operations are kept in the
|
||||
right order. For example:
|
||||
|
||||
// *a = 3
|
||||
// *b = *a
|
||||
v10 = Store <mem> {int} v6 v8 v1
|
||||
v14 = Store <mem> {int} v7 v8 v10
|
||||
|
||||
Here, `Store` stores its second argument (of type `int`) into the first argument
|
||||
(of type `*int`). The last argument is the memory state; since the second store
|
||||
depends on the memory value defined by the first store, the two stores cannot be
|
||||
reordered.
|
||||
|
||||
See [cmd/compile/internal/types/type.go](../types/type.go) for more information.
|
||||
|
||||
#### Blocks
|
||||
|
||||
A block represents a basic block in the control flow graph of a function. It is,
|
||||
essentially, a list of values that define the operation of this block. Besides
|
||||
the list of values, blocks mainly consist of a unique identifier, a kind, and a
|
||||
list of successor blocks.
|
||||
|
||||
The simplest kind is a `plain` block; it simply hands the control flow to
|
||||
another block, thus its successors list contains one block.
|
||||
|
||||
Another common block kind is the `exit` block. These have a final value, called
|
||||
control value, which must return a memory state. This is necessary for functions
|
||||
to return some values, for example - the caller needs some memory state to
|
||||
depend on, to ensure that it receives those return values correctly.
|
||||
|
||||
The last important block kind we will mention is the `if` block. Its control
|
||||
value must be a boolean value, and it has exactly two successor blocks. The
|
||||
control flow is handed to the first successor if the bool is true, and to the
|
||||
second otherwise.
|
||||
|
||||
Here is a sample if-else control flow represented with basic blocks:
|
||||
|
||||
// func(b bool) int {
|
||||
// if b {
|
||||
// return 2
|
||||
// }
|
||||
// return 3
|
||||
// }
|
||||
b1:
|
||||
v1 = InitMem <mem>
|
||||
v2 = SP <uintptr>
|
||||
v5 = Addr <*int> {~r1} v2
|
||||
v6 = Arg <bool> {b}
|
||||
v8 = Const64 <int> [2]
|
||||
v12 = Const64 <int> [3]
|
||||
If v6 -> b2 b3
|
||||
b2: <- b1
|
||||
v10 = VarDef <mem> {~r1} v1
|
||||
v11 = Store <mem> {int} v5 v8 v10
|
||||
Ret v11
|
||||
b3: <- b1
|
||||
v14 = VarDef <mem> {~r1} v1
|
||||
v15 = Store <mem> {int} v5 v12 v14
|
||||
Ret v15
|
||||
|
||||
<!---
|
||||
TODO: can we come up with a shorter example that still shows the control flow?
|
||||
-->
|
||||
|
||||
See [block.go](block.go) for more information.
|
||||
|
||||
#### Functions
|
||||
|
||||
A function represents a function declaration along with its body. It mainly
|
||||
consists of a name, a type (its signature), a list of blocks that form its body,
|
||||
and the entry block within said list.
|
||||
|
||||
When a function is called, the control flow is handed to its entry block. If the
|
||||
function terminates, the control flow will eventually reach an exit block, thus
|
||||
ending the function call.
|
||||
|
||||
Note that a function may have zero or multiple exit blocks, just like a Go
|
||||
function can have any number of return points, but it must have exactly one
|
||||
entry point block.
|
||||
|
||||
Also note that some SSA functions are autogenerated, such as the hash functions
|
||||
for each type used as a map key.
|
||||
|
||||
For example, this is what an empty function can look like in SSA, with a single
|
||||
exit block that returns an uninteresting memory state:
|
||||
|
||||
foo func()
|
||||
b1:
|
||||
v1 = InitMem <mem>
|
||||
Ret v1
|
||||
|
||||
See [func.go](func.go) for more information.
|
||||
|
||||
### Compiler passes
|
||||
|
||||
Having a program in SSA form is not very useful on its own. Its advantage lies
|
||||
in how easy it is to write optimizations that modify the program to make it
|
||||
better. The way the Go compiler accomplishes this is via a list of passes.
|
||||
|
||||
Each pass transforms a SSA function in some way. For example, a dead code
|
||||
elimination pass will remove blocks and values that it can prove will never be
|
||||
executed, and a nil check elimination pass will remove nil checks which it can
|
||||
prove to be redundant.
|
||||
|
||||
Compiler passes work on one function at a time, and by default run sequentially
|
||||
and exactly once.
|
||||
|
||||
The `lower` pass is special; it converts the SSA representation from being
|
||||
machine-independent to being machine-dependent. That is, some abstract operators
|
||||
are replaced with their non-generic counterparts, potentially reducing or
|
||||
increasing the final number of values.
|
||||
|
||||
<!---
|
||||
TODO: Probably explain here why the ordering of the passes matters, and why some
|
||||
passes like deadstore have multiple variants at different stages.
|
||||
-->
|
||||
|
||||
See the `passes` list defined in [compile.go](compile.go) for more information.
|
||||
|
||||
### Playing with SSA
|
||||
|
||||
A good way to see and get used to the compiler's SSA in action is via
|
||||
GOSSAFUNC. For example, to see func Foo's initial SSA form and final
|
||||
`GOSSAFUNC`. For example, to see func `Foo`'s initial SSA form and final
|
||||
generated assembly, one can run:
|
||||
|
||||
GOSSAFUNC=Foo go build
|
||||
|
||||
The generated ssa.html file will also contain the SSA func at each of
|
||||
the compile passes, making it easy to see what each pass does to a
|
||||
particular program. You can also click on values and blocks to highlight
|
||||
them, to help follow the control flow and values.
|
||||
The generated `ssa.html` file will also contain the SSA func at each of the
|
||||
compile passes, making it easy to see what each pass does to a particular
|
||||
program. You can also click on values and blocks to highlight them, to help
|
||||
follow the control flow and values.
|
||||
|
||||
<!---
|
||||
TODO: need more ideas for this section
|
||||
-->
|
||||
|
||||
### Hacking on SSA
|
||||
|
||||
While most compiler passes are implemented directly in Go code, some others are
|
||||
code generated. This is currently done via rewrite rules, which have their own
|
||||
syntax and are maintained in `gen/*.rules`. Simpler optimizations can be written
|
||||
easily and quickly this way, but rewrite rules are not suitable for more complex
|
||||
optimizations.
|
||||
|
||||
To read more on rewrite rules, have a look at the top comments in
|
||||
[gen/generic.rules](gen/generic.rules) and [gen/rulegen.go](gen/rulegen.go).
|
||||
|
||||
Similarly, the code to manage operators is also code generated from
|
||||
`gen/*Ops.go`, as it is easier to maintain a few tables than a lot of code.
|
||||
After changing the rules or operators, see [gen/README](gen/README) for
|
||||
instructions on how to generate the Go code again.
|
||||
|
||||
<!---
|
||||
TODO: more tips and info could likely go here
|
||||
-->
|
||||
|
|
|
@ -25,7 +25,7 @@ type Value struct {
|
|||
Op Op
|
||||
|
||||
// The type of this value. Normally this will be a Go type, but there
|
||||
// are a few other pseudo-types, see type.go.
|
||||
// are a few other pseudo-types, see ../types/type.go.
|
||||
Type *types.Type
|
||||
|
||||
// Auxiliary info for this value. The type of this information depends on the opcode and type.
|
||||
|
|
Loading…
Reference in a new issue