mirror of
https://github.com/golang/go
synced 2024-10-06 08:00:07 +00:00
[dev.typeparams] cmd/compile/abi-internal.md: specify ARM64 register-based ABI
The ABI is similar to the AMD64 ABI, just uses different registers and
stack layout. The stack layout is compatible with the current
stack-based ABI0.

To be implemented in Go 1.18.

Change-Id: If9c5e664574947f959d3427e3bed769e05d2d673
Reviewed-on: https://go-review.googlesource.com/c/go/+/319829
Trust: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
This commit is contained in:
parent
b18b2d372e
commit
ae26b45113
@ -505,6 +505,128 @@ control bits specified by the ELF AMD64 ABI.

The x87 floating-point control word is not used by Go on amd64.

### arm64 architecture

The arm64 architecture uses R0 – R15 for integer arguments and results.

It uses F0 – F15 for floating-point arguments and results.

*Rationale*: 16 integer registers and 16 floating-point registers are
more than enough for passing arguments and results for practically all
functions (see Appendix). While there are more registers available,
using more registers provides little benefit. Additionally, it would add
overhead on code paths where the number of arguments is not statically
known (e.g. reflect call), and would consume more stack space when there
is only limited stack space available to fit in the nosplit limit.

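The register banks described above can be illustrated with a minimal sketch. It assumes every argument is a single scalar that is either integer-shaped or floating-point; the `Kind` type and the `assign` function are hypothetical names invented for this example, and the real assignment rules also cover aggregates and multi-word values, which are not modeled here.

```go
package main

import "fmt"

// Kind is a hypothetical classification of a scalar argument.
type Kind int

const (
	Int   Kind = iota // integer- or pointer-shaped value
	Float             // floating-point value
)

const (
	maxIntRegs   = 16 // R0 – R15
	maxFloatRegs = 16 // F0 – F15
)

// assign returns a location for each argument: an integer register,
// a floating-point register, or "stack" once the matching bank is full.
// Simplification (assumption): aggregates and multi-word values are ignored.
func assign(args []Kind) []string {
	locs := make([]string, len(args))
	nextInt, nextFloat := 0, 0
	for i, k := range args {
		switch {
		case k == Int && nextInt < maxIntRegs:
			locs[i] = fmt.Sprintf("R%d", nextInt)
			nextInt++
		case k == Float && nextFloat < maxFloatRegs:
			locs[i] = fmt.Sprintf("F%d", nextFloat)
			nextFloat++
		default:
			locs[i] = "stack"
		}
	}
	return locs
}

func main() {
	fmt.Println(assign([]Kind{Int, Float, Int})) // prints [R0 F0 R1]
}
```

The two counters advance independently, so a function mixing integer and floating-point arguments can use up to 32 registers in this simplified model before anything is placed on the stack.
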
Registers R16 and R17 are permanent scratch registers. They are also
used as scratch registers by the linker (Go linker and external
linker) in trampolines.

Register R18 is reserved and never used. It is reserved for the OS
on some platforms (e.g. macOS).

Registers R19 – R25 are permanent scratch registers. In addition,
R27 is a permanent scratch register used by the assembler when
expanding instructions.

Floating-point registers F16 – F31 are also permanent scratch
registers.

Special-purpose registers are as follows:

| Register | Call meaning | Return meaning | Body meaning |
| --- | --- | --- | --- |
| RSP | Stack pointer | Same | Same |
| R30 | Link register | Same | Scratch (non-leaf functions) |
| R29 | Frame pointer | Same | Same |
| R28 | Current goroutine | Same | Same |
| R27 | Scratch | Scratch | Scratch |
| R26 | Closure context pointer | Scratch | Scratch |
| R18 | Reserved (not used) | Same | Same |
| ZR | Zero value | Same | Same |

*Rationale*: These register meanings are compatible with Go’s
stack-based calling convention.

*Rationale*: The link register, R30, holds the function return
address at function entry. For functions that have frames
(including most non-leaf functions), R30 is saved to the stack in the
function prologue and restored in the epilogue. Within the function
body, R30 can be used as a scratch register.

*Implementation note*: Registers with fixed meaning at calls but not
in function bodies must be initialized by "injected" calls such as
signal-based panics.

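As a compact restatement of the integer register roles listed in this section, the following sketch maps a register number to its meaning at a call. The `role` function and its description strings are hypothetical and exist only for illustration; they are not part of the Go toolchain.

```go
package main

import "fmt"

// role returns the meaning of integer register Rn (0 <= n <= 30) at a call,
// following the text and table above. Names and strings are illustrative.
func role(n int) string {
	switch {
	case n >= 0 && n <= 15:
		return "argument/result"
	case n == 16 || n == 17:
		return "scratch (also used by linker trampolines)"
	case n == 18:
		return "reserved (not used)"
	case n >= 19 && n <= 25:
		return "scratch"
	case n == 26:
		return "closure context pointer"
	case n == 27:
		return "scratch (used by the assembler)"
	case n == 28:
		return "current goroutine"
	case n == 29:
		return "frame pointer"
	case n == 30:
		return "link register"
	}
	return "invalid"
}

func main() {
	for _, n := range []int{0, 18, 26, 30} {
		fmt.Printf("R%d: %s\n", n, role(n))
	}
}
```
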
#### Stack layout

The stack pointer, RSP, grows down and is always aligned to 16 bytes.

*Rationale*: The arm64 architecture requires the stack pointer to be
16-byte aligned.

A function's stack frame, after the frame is created, is laid out as
follows:

    +------------------------------+
    | ... locals ...               |
    | ... outgoing arguments ...   |
    | return PC                    | ← RSP points to
    | frame pointer on entry       |
    +------------------------------+ ↓ lower addresses

The "return PC" is loaded to the link register, R30, as part of the
arm64 `CALL` operation.

On entry, a function subtracts from RSP to open its stack frame, and
saves the values of R30 and R29 at the bottom of the frame.
Specifically, R30 is saved at 0(RSP) and R29 is saved at -8(RSP),
after RSP is updated.

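The prologue arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the compiler's actual frame-size computation: it assumes the frame holds the locals and outgoing arguments plus one 8-byte slot for the return PC, rounded up to the 16-byte alignment. The `Frame` type, `alignUp`, and `openFrame` are hypothetical names.

```go
package main

import "fmt"

const stackAlign = 16 // RSP must stay 16-byte aligned

// alignUp rounds n up to the next multiple of align (a power of two).
func alignUp(n, align uint64) uint64 {
	return (n + align - 1) &^ (align - 1)
}

// Frame records where the prologue saves R30 and R29 (hypothetical type).
type Frame struct {
	Size   uint64 // bytes subtracted from RSP
	RSP    uint64 // RSP after the frame is opened
	LRSlot uint64 // address of the saved R30, i.e. 0(RSP)
	FPSlot uint64 // address of the saved R29, i.e. -8(RSP)
}

// openFrame models the prologue for a function that needs `need` bytes of
// locals and outgoing arguments. Assumption: one extra 8-byte slot holds
// the return PC, and the total is rounded up to 16 bytes.
func openFrame(entryRSP, need uint64) Frame {
	size := alignUp(need+8, stackAlign)
	rsp := entryRSP - size
	return Frame{Size: size, RSP: rsp, LRSlot: rsp, FPSlot: rsp - 8}
}

func main() {
	f := openFrame(0x4000, 40)
	fmt.Printf("size=%d RSP=%#x LR@%#x FP@%#x\n", f.Size, f.RSP, f.LRSlot, f.FPSlot)
}
```

Because the entry RSP is 16-byte aligned and the frame size is a multiple of 16, the updated RSP stays aligned, which is the invariant the layout depends on.
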
A leaf function that does not require any stack space may omit the
saved R30 and R29.

The Go ABI's use of R29 as a frame pointer register is compatible with
the arm64 architecture requirement, so that Go can inter-operate with
platform debuggers and profilers.

This stack layout is used by both register-based (ABIInternal) and
stack-based (ABI0) calling conventions.

#### Flags

The arithmetic status flags (NZCV) are treated like scratch registers
and not preserved across calls.
All other bits in PSTATE are system flags and are not modified by Go.

The floating-point status register (FPSR) is treated like a scratch
register and is not preserved across calls.

At calls, the floating-point control register (FPCR) bits are always
set as follows:

| Flag | Bit | Value | Meaning |
| --- | --- | --- | --- |
| DN | 25 | 0 | Propagate NaN operands |
| FZ | 24 | 0 | Do not flush to zero |
| RC | 23/22 | 0 (RN) | Round to nearest, choose even if tied |
| IDE | 15 | 0 | Denormal operations trap disabled |
| IXE | 12 | 0 | Inexact trap disabled |
| UFE | 11 | 0 | Underflow trap disabled |
| OFE | 10 | 0 | Overflow trap disabled |
| DZE | 9 | 0 | Divide-by-zero trap disabled |
| IOE | 8 | 0 | Invalid operations trap disabled |
| NEP | 2 | 0 | Scalar operations do not affect higher elements in vector registers |
| AH | 1 | 0 | No alternate handling of de-normal inputs |
| FIZ | 0 | 0 | Do not zero de-normals |

*Rationale*: Having a fixed FPCR control configuration allows Go
functions to use floating-point and vector (SIMD) operations without
modifying or saving the FPCR.
Functions are allowed to modify it between calls (as long as they
restore it), but as of this writing Go code never does.

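The FPCR requirement at calls amounts to "all of the listed bits are zero", which can be expressed as a single mask test. The sketch below takes the FPCR value as an ordinary integer, since reading the register itself requires assembly; `fpcrCallMask` and `fpcrValidAtCall` are hypothetical names for illustration only.

```go
package main

import "fmt"

// fpcrCallMask has a 1 in every FPCR bit position that the table above
// requires to be 0 at calls: DN(25), FZ(24), RC(23/22), IDE(15), IXE(12),
// UFE(11), OFE(10), DZE(9), IOE(8), NEP(2), AH(1), FIZ(0).
const fpcrCallMask uint64 = 1<<25 | 1<<24 | 1<<23 | 1<<22 |
	1<<15 | 1<<12 | 1<<11 | 1<<10 | 1<<9 | 1<<8 |
	1<<2 | 1<<1 | 1<<0

// fpcrValidAtCall reports whether an FPCR value satisfies the table.
// Bits not listed in the table are ignored by this check (an assumption
// of the sketch; the document does not say how they are treated).
func fpcrValidAtCall(fpcr uint64) bool {
	return fpcr&fpcrCallMask == 0
}

func main() {
	fmt.Println(fpcrValidAtCall(0))       // all listed bits clear
	fmt.Println(fpcrValidAtCall(1 << 24)) // FZ set: flush-to-zero enabled
}
```
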
## Future directions

### Spill path improvements