From 865911424d509184d95d3f9fc6a8301927117fdc Mon Sep 17 00:00:00 2001 From: Russ Cox Date: Wed, 26 Jan 2022 16:42:05 -0500 Subject: [PATCH] doc: update Go memory model Following discussion on #47141, make the following changes: - Document Go's overall approach. - Document that multiword races can cause crashes. - Document happens-before for runtime.SetFinalizer. - Document (or link to) happens-before for more sync types. - Document happens-before for sync/atomic. - Document disallowed compiler optimizations. See also https://research.swtch.com/gomm for background. Fixes #50859. Change-Id: I17d837756a77f4d8569f263489c2c45de20a8778 Reviewed-on: https://go-review.googlesource.com/c/go/+/381315 Reviewed-by: Ian Lance Taylor --- doc/go_mem.html | 587 ++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 492 insertions(+), 95 deletions(-) diff --git a/doc/go_mem.html b/doc/go_mem.html index 5f1eb68af3..9bcf06707a 100644 --- a/doc/go_mem.html +++ b/doc/go_mem.html @@ -8,12 +8,9 @@ p.rule { font-style: italic; } -span.event { - font-style: italic; -} -

Introduction

+

Introduction

The Go memory model specifies the conditions under which @@ -22,7 +19,7 @@ observe values produced by writes to the same variable in a different goroutine.

-

Advice

+

Advice

Programs that modify data being simultaneously accessed by multiple goroutines @@ -44,90 +41,237 @@ you are being too clever. Don't be clever.

-

Happens Before

+

Informal Overview

-Within a single goroutine, reads and writes must behave -as if they executed in the order specified by the program. -That is, compilers and processors may reorder the reads and writes -executed within a single goroutine only when the reordering -does not change the behavior within that goroutine -as defined by the language specification. -Because of this reordering, the execution order observed -by one goroutine may differ from the order perceived -by another. For example, if one goroutine -executes a = 1; b = 2;, another might observe -the updated value of b before the updated value of a. +Go approaches its memory model in much the same way as the rest of the language, +aiming to keep the semantics simple, understandable, and useful. +This section gives a general overview of the approach and should suffice for most programmers. +The memory model is specified more formally in the next section.

-To specify the requirements of reads and writes, we define -happens before, a partial order on the execution -of memory operations in a Go program. If event e1 happens -before event e2, then we say that e2 happens after e1. -Also, if e1 does not happen before e2 and does not happen -after e2, then we say that e1 and e2 happen concurrently. -

- -

-Within a single goroutine, the happens-before order is the -order expressed by the program. +A data race is defined as +a write to a memory location happening concurrently with another read or write to that same location, +unless all the accesses involved are atomic data accesses as provided by the sync/atomic package. +As noted already, programmers are strongly encouraged to use appropriate synchronization +to avoid data races. +In the absence of data races, Go programs behave as if all the goroutines +were multiplexed onto a single processor. +This property is sometimes referred to as DRF-SC: data-race-free programs +execute in a sequentially consistent manner.

-A read r of a variable v is allowed to observe a write w to v -if both of the following hold: +While programmers should write Go programs without data races, +there are limitations to what a Go implementation can do in response to a data race. +An implementation may always react to a data race by reporting the race and terminating the program. +Otherwise, each read of a single-word-sized or sub-word-sized memory location +must observe a value actually written to that location (perhaps by a concurrent executing goroutine) +and not yet overwritten. +These implementation constraints make Go more like Java or JavaScript, +in that most races have a limited number of outcomes, +and less like C and C++, where the meaning of any program with a race +is entirely undefined, and the compiler may do anything at all. +Go's approach aims to make errant programs more reliable and easier to debug, +while still insisting that races are errors and that tools can diagnose and report them.

+

Memory Model

+ +

+The following formal definition of Go's memory model closely follows +the approach presented by Hans-J. Boehm and Sarita V. Adve in +“Foundations of the C++ Concurrency Memory Model”, +published in PLDI 2008. +The definition of data-race-free programs and the guarantee of sequential consistency +for race-free programs are equivalent to the ones in that work.

+ +

+The memory model describes the requirements on program executions, +which are made up of goroutine executions, +which in turn are made up of memory operations. +

+ +

+A memory operation is modeled by four details: +its kind, indicating whether it is an ordinary data read, an ordinary data write, +or a synchronizing operation such as an atomic data access, a mutex operation, or a channel operation, +its location in the program, +the memory location or variable being accessed, +and the values read and written by the operation.

+ +

+Some memory operations are read-like, including read, atomic read, mutex lock, and channel receive. +Other memory operations are write-like, including write, atomic write, mutex unlock, channel send, and channel close. +Some, such as atomic compare-and-swap, are both read-like and write-like. +

+ +

+A goroutine execution is modeled as a set of memory operations executed by a single goroutine. +

+ +

+Requirement 1: +The memory operations in each goroutine must correspond to a correct sequential execution of that goroutine, +given the values read from and written to memory. +That execution must be consistent with the sequenced before relation, +defined as the partial order requirements set out by the Go language specification +for Go's control flow constructs as well as the order of evaluation for expressions. +

+ +

+A Go program execution is modeled as a set of goroutine executions, +together with a mapping W that specifies the write-like operation that each read-like operation reads from. +(Multiple executions of the same program can have different program executions.) +

+ +

+Requirement 2: +For a given program execution, the mapping W, when limited to synchronizing operations, +must be explainable by some implicit total order of the synchronizing operations +that is consistent with sequencing and the values read and written by those operations. +

+ +

+The synchronized before relation is a partial order on synchronizing memory operations, +derived from W. +If a synchronizing read-like memory operation r +observes a synchronizing write-like memory operation w +(that is, if W(r) = w), +then w is synchronized before r. +Informally, the synchronized before relation is a subset of the implied total order +mentioned in the previous paragraph, +limited to the information that W directly observes. +

+ +

+The happens before relation is defined as the transitive closure of the +union of the sequenced before and synchronized before relations. +

+ +

+Requirement 3: +For an ordinary (non-synchronizing) data read r on a memory location x, +W(r) must be a write w that is visible to r, +where visible means that both of the following hold: +

    -
  1. r does not happen before w.
  2. -
  3. There is no other write w' to v that happens - after w but before r.
  4. +
  5. w happens before r. +
  6. w does not happen before any other write w' (to x) that happens before r.

-To guarantee that a read r of a variable v observes a -particular write w to v, ensure that w is the only -write r is allowed to observe. -That is, r is guaranteed to observe w if both of the following hold: -

- -
    -
  1. w happens before r.
  2. -
  3. Any other write to the shared variable v -either happens before w or after r.
  4. -
- -

-This pair of conditions is stronger than the first pair; -it requires that there are no other writes happening -concurrently with w or r. +A read-write data race on memory location x +consists of a read-like memory operation r on x +and a write-like memory operation w on x, +at least one of which is non-synchronizing, +which are unordered by happens before +(that is, neither r happens before w +nor w happens before r).

-Within a single goroutine, -there is no concurrency, so the two definitions are equivalent: -a read r observes the value written by the most recent write w to v. -When multiple goroutines access a shared variable v, -they must use synchronization events to establish -happens-before conditions that ensure reads observe the -desired writes. +A write-write data race on memory location x +consists of two write-like memory operations w and w' on x, +at least one of which is non-synchronizing, +which are unordered by happens before.

-The initialization of variable v with the zero value -for v's type behaves as a write in the memory model. +Note that if there are no read-write or write-write data races on memory location x, +then any read r on x has only one possible W(r): +the single w that immediately precedes it in the happens before order.

-Reads and writes of values larger than a single machine word -behave as multiple machine-word-sized operations in an -unspecified order. +More generally, it can be shown that any Go program that is data-race-free, +meaning it has no program executions with read-write or write-write data races, +can only have outcomes explained by some sequentially consistent interleaving +of the goroutine executions. +(The proof is the same as Section 7 of Boehm and Adve's paper cited above.) +This property is called DRF-SC.

-

Synchronization

+

+The intent of the formal definition is to match +the DRF-SC guarantee provided to race-free programs +by other languages, including C, C++, Java, JavaScript, Rust, and Swift. +

-

Initialization

+

+Certain Go language operations such as goroutine creation and memory allocation +act as synchronization operations. +The effect of these operations on the synchronized-before partial order +is documented in the “Synchronization” section below. +Individual packages are responsible for providing similar documentation +for their own operations.

+ +

Implementation Restrictions for Programs Containing Data Races

+ +

+The preceding section gave a formal definition of data-race-free program execution. +This section informally describes the semantics that implementations must provide +for programs that do contain races. +

+ +

+First, any implementation can, upon detecting a data race, +report the race and halt execution of the program. +Implementations using ThreadSanitizer +(accessed with “go build -race”) +do exactly this. +

+ +

+Otherwise, a read r of a memory location x +that is not larger than a machine word must observe +some write w such that r does not happen before w +and there is no write w' such that w happens before w' +and w' happens before r. +That is, each read must observe a value written by a preceding or concurrent write. +

+ +

+Additionally, observation of acausal and “out of thin air” writes is disallowed. +

+ +

+Reads of memory locations larger than a single machine word +are encouraged but not required to meet the same semantics +as word-sized memory locations, +observing a single allowed write w. +For performance reasons, +implementations may instead treat larger operations +as a set of individual machine-word-sized operations +in an unspecified order. +This means that races on multiword data structures +can lead to inconsistent values not corresponding to a single write. +When the values depend on the consistency +of internal (pointer, length) or (pointer, type) pairs, +as can be the case for interface values, maps, +slices, and strings in most Go implementations, +such races can in turn lead to arbitrary memory corruption. +

+ +

+Examples of incorrect synchronization are given in the +“Incorrect synchronization” section below. +

+ +

+Examples of the limitations on implementations are given in the +“Incorrect compilation” section below. +

+ +

Synchronization

+ +

Initialization

Program initialization runs in a single goroutine, @@ -141,15 +285,15 @@ If a package p imports package q, the completion of

-The start of the function main.main happens after -all init functions have finished. +The completion of all init functions is synchronized before +the start of the function main.main.

-

Goroutine creation

+

Goroutine creation

The go statement that starts a new goroutine -happens before the goroutine's execution begins. +is synchronized before the start of the goroutine's execution.

@@ -174,11 +318,12 @@ calling hello will print "hello, world" at some point in the future (perhaps after hello has returned).

-

Goroutine destruction

+

Goroutine destruction

-The exit of a goroutine is not guaranteed to happen before -any event in the program. For example, in this program: +The exit of a goroutine is not guaranteed to be synchronized before +any event in the program. +For example, in this program:

@@ -203,7 +348,7 @@ use a synchronization mechanism such as a lock or channel
 communication to establish a relative ordering.
 

-

Channel communication

+

Channel communication

Channel communication is the main method of synchronization @@ -213,8 +358,8 @@ usually in a different goroutine.

-A send on a channel happens before the corresponding -receive from that channel completes. +A send on a channel is synchronized before the completion of the +corresponding receive from that channel.

@@ -239,13 +384,13 @@ func main() {

is guaranteed to print "hello, world". The write to a -happens before the send on c, which happens before -the corresponding receive on c completes, which happens before +is sequenced before the send on c, which is synchronized before +the corresponding receive on c completes, which is sequenced before the print.

-The closing of a channel happens before a receive that returns a zero value +The closing of a channel is synchronized before a receive that returns a zero value because the channel is closed.

@@ -256,8 +401,8 @@ yields a program with the same guaranteed behavior.

-A receive from an unbuffered channel happens before -the send on that channel completes. +A receive from an unbuffered channel is synchronized before the completion of +the corresponding send on that channel.

@@ -283,8 +428,8 @@ func main() {

is also guaranteed to print "hello, world". The write to a -happens before the receive on c, which happens before -the corresponding send on c completes, which happens +is sequenced before the receive on c, which is synchronized before +the corresponding send on c completes, which is sequenced before the print.

@@ -296,7 +441,7 @@ crash, or do something else.)

-The kth receive on a channel with capacity C happens before the k+Cth send from that channel completes. +The kth receive on a channel with capacity C is synchronized before the completion of the k+Cth send from that channel.

@@ -330,7 +475,7 @@ func main() { }

-

Locks

+

Locks

The sync package implements two lock data types, @@ -339,7 +484,7 @@ The sync package implements two lock data types,

For any sync.Mutex or sync.RWMutex variable l and n < m, -call n of l.Unlock() happens before call m of l.Lock() returns. +call n of l.Unlock() is synchronized before call m of l.Lock() returns.

@@ -365,19 +510,29 @@ func main() {

is guaranteed to print "hello, world". -The first call to l.Unlock() (in f) happens +The first call to l.Unlock() (in f) is synchronized before the second call to l.Lock() (in main) returns, -which happens before the print. +which is sequenced before the print.

For any call to l.RLock on a sync.RWMutex variable l, -there is an n such that the l.RLock happens (returns) after call n to -l.Unlock and the matching l.RUnlock happens -before call n+1 to l.Lock. +there is an n such that the nth call to l.Unlock +is synchronized before the return from l.RLock, +and the matching call to l.RUnlock is synchronized before the return from call n+1 to l.Lock.
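A minimal sketch of relying on this guarantee follows; the publish and read helper names are illustrative, not part of the sync API. The writer's l.Unlock is synchronized before the return of a later l.RLock, so a reader that acquires the read lock afterward observes the write:

```go
package main

import (
	"fmt"
	"sync"
)

var (
	l    sync.RWMutex
	data int
)

// publish writes under the write lock; its l.Unlock is synchronized
// before the return of any later l.RLock.
func publish(v int) {
	l.Lock()
	data = v
	l.Unlock()
}

// read observes data under the read lock; any number of readers may
// hold the read lock at the same time.
func read() int {
	l.RLock()
	defer l.RUnlock()
	return data
}

func main() {
	done := make(chan bool)
	go func() {
		publish(42)
		done <- true
	}()
	<-done // the channel receive orders publish before read here
	fmt.Println(read()) // 42
}
```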

-

Once

+

+A successful call to l.TryLock (or l.TryRLock) +is equivalent to a call to l.Lock (or l.RLock). +An unsuccessful call has no synchronizing effect at all. +As far as the memory model is concerned, +l.TryLock (or l.TryRLock) +may be considered to be able to return false +even when the mutex l is unlocked. +

+ +

Once

The sync package provides a safe mechanism for @@ -389,7 +544,8 @@ until f() has returned.

-A single call of f() from once.Do(f) happens (returns) before any call of once.Do(f) returns. +The completion of a single call of f() from once.Do(f) +is synchronized before the return of any call of once.Do(f).

@@ -424,13 +580,60 @@ The result will be that "hello, world" will be printed twice.

-

Incorrect synchronization

+

Atomic Values

-Note that a read r may observe the value written by a write w -that happens concurrently with r. -Even if this occurs, it does not imply that reads happening after r -will observe writes that happened before w. +The APIs in the sync/atomic +package are collectively “atomic operations” +that can be used to synchronize the execution of different goroutines. +If the effect of an atomic operation A is observed by atomic operation B, +then A is synchronized before B. +All the atomic operations executed in a program behave as though executed +in some sequentially consistent order. +

+ +

+The preceding definition has the same semantics as C++’s sequentially consistent atomics +and Java’s volatile variables. +

+ +

Finalizers

+ +

+The runtime package provides +a SetFinalizer function that adds a finalizer to be called when +a particular object is no longer reachable by the program. +A call to SetFinalizer(x, f) is synchronized before the finalization call f(x). +

+ +

Additional Mechanisms

+ +

+The sync package provides additional synchronization abstractions, +including condition variables, +lock-free maps, +allocation pools, +and +wait groups. +The documentation for each of these specifies the guarantees it +makes concerning synchronization. +
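As one example of those documented guarantees, each call to a wait group's Done is synchronized before the corresponding Wait returns, so the caller may read the goroutines' results without further locking once Wait has returned. A sketch, with an illustrative helper name:

```go
package main

import (
	"fmt"
	"sync"
)

// sumParts splits the summation across parts goroutines.
// Each wg.Done is synchronized before wg.Wait returns, so the
// final loop over results is race-free without any extra locking.
func sumParts(nums []int, parts int) int {
	var wg sync.WaitGroup
	results := make([]int, parts)
	n := len(nums)
	for p := 0; p < parts; p++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			for i := p; i < n; i += parts {
				results[p] += nums[i]
			}
		}(p)
	}
	wg.Wait()
	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumParts([]int{1, 2, 3, 4, 5}, 2)) // 15
}
```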

+ +

+Other packages that provide synchronization abstractions +should document the guarantees they make too. +

+ + +

Incorrect synchronization

+ +

+Programs with races are incorrect and +can exhibit non-sequentially consistent executions. +In particular, note that a read r may observe the value written by any write w +that executes concurrently with r. +Even if this occurs, it does not imply that reads happening after r +will observe writes that happened before w.

@@ -566,3 +769,197 @@ value for g.msg. In all these examples, the solution is the same: use explicit synchronization.

+ +

Incorrect compilation

+ +

+The Go memory model restricts compiler optimizations as much as it does Go programs. +Some compiler optimizations that would be valid in single-threaded programs are not valid in all Go programs. +In particular, a compiler must not introduce writes that do not exist in the original program, +it must not allow a single read to observe multiple values, +and it must not allow a single write to write multiple values. +

+ +

+All the following examples assume that *p and *q refer to +memory locations accessible to multiple goroutines.

+ +

+Not introducing data races into race-free programs means not moving +writes out of conditional statements in which they appear. +For example, a compiler must not invert the conditional in this program: +

+ +
+*p = 1
+if cond {
+	*p = 2
+}
+
+ +

+That is, the compiler must not rewrite the program into this one: +

+ +
+*p = 2
+if !cond {
+	*p = 1
+}
+
+ +

+If cond is false and another goroutine is reading *p, +then in the original program, the other goroutine can only observe any prior value of *p and 1. +In the rewritten program, the other goroutine can observe 2, which was previously impossible. +

+ +

+Not introducing data races also means not assuming that loops terminate. +For example, a compiler must in general not move the accesses to *p or *q +ahead of the loop in this program: +

+ +
+n := 0
+for e := list; e != nil; e = e.next {
+	n++
+}
+i := *p
+*q = 1
+
+ +

+If list pointed to a cyclic list, +then the original program would never access *p or *q, +but the rewritten program would. +(Moving *p ahead would be safe if the compiler can prove *p will not panic; +moving *q ahead would also require the compiler to prove that no other +goroutine can access *q.)

+ +

+Not introducing data races also means not assuming that called functions +always return or are free of synchronization operations. +For example, a compiler must not move the accesses to *p or *q +ahead of the function call in this program +(at least not without direct knowledge of the precise behavior of f): +

+ +
+f()
+i := *p
+*q = 1
+
+ +

+If the call never returned, then once again the original program +would never access *p or *q, but the rewritten program would. +And if the call contained synchronizing operations, then the original program +could establish happens before edges preceding the accesses +to *p and *q, but the rewritten program would not. +

+ +

+Not allowing a single read to observe multiple values means +not reloading local variables from shared memory. +For example, a compiler must not discard i and reload it +a second time from *p in this program: +

+ +
+i := *p
+if i < 0 || i >= len(funcs) {
+	panic("invalid function index")
+}
+... complex code ...
+// compiler must NOT reload i = *p here
+funcs[i]()
+
+ +

+If the complex code needs many registers, a compiler for single-threaded programs +could discard i without saving a copy and then reload +i = *p just before +funcs[i](). +A Go compiler must not, because the value of *p may have changed. +(Instead, the compiler could spill i to the stack.) +

+ +

+Not allowing a single write to write multiple values also means not using +the memory where a local variable will be written as temporary storage before the write. +For example, a compiler must not use *p as temporary storage in this program: +

+ +
+*p = i + *p/2
+
+ +

+That is, it must not rewrite the program into this one: +

+ +
+*p /= 2
+*p += i
+
+ +

+If i and *p start equal to 2, +the original code does *p = 3, +so a racing thread can read only 2 or 3 from *p. +The rewritten code does *p = 1 and then *p = 3, +allowing a racing thread to read 1 as well. +

+ +

+Note that all these optimizations are permitted in C/C++ compilers: +a Go compiler sharing a back end with a C/C++ compiler must take care +to disable optimizations that are invalid for Go. +

+ +

+Note that the prohibition on introducing data races +does not apply if the compiler can prove that the races +do not affect correct execution on the target platform. +For example, on essentially all CPUs, it is valid to rewrite +

+ +
+n := 0
+for i := 0; i < m; i++ {
+	n += *shared
+}
+
+ +into: + +
+n := 0
+local := *shared
+for i := 0; i < m; i++ {
+	n += local
+}
+
+ +

+provided it can be proved that *shared will not fault on access, +because the potential added read will not affect any existing concurrent reads or writes. +On the other hand, the rewrite would not be valid in a source-to-source translator. +

+ +

Conclusion

+ +

+Go programmers writing data-race-free programs can rely on +sequentially consistent execution of those programs, +just as in essentially all other modern programming languages. +

+ +

+When it comes to programs with races, +both programmers and compilers should remember the advice: +don't be clever. +