runtime: fix handling of SPWRITE functions in traceback

It is valid to see SPWRITE functions at the top of a GC stack traceback,
in the case where they self-preempted during the stack growth check
and haven't actually modified SP in a traceback-unfriendly manner yet.
The current check is therefore too aggressive.

isAsyncSafePoint is taking care of not async-preempting SPWRITE functions
because it doesn't async-preempt any assembly functions at all.
But perhaps it will in the future.

To keep a check that SPWRITE assembly functions are not async-preempted,
add one in preemptPark. Then relax the check in traceback to avoid
triggering on self-preempted SPWRITE functions.

The long and short of this is that the assembly we corrected in x/crypto
issue #44269 was incredibly dodgy but not technically incompatible with
the Go runtime. After this change, the original x/crypto assembly no longer
causes GC traceback crashes during "GOGC=1 go test -count=1000".
But we'll still leave the corrected assembly.

This also means that we don't need to worry about diagnosing SPWRITE
assembly functions that may exist in the wild. They will be skipped for
async preemption and no harm no foul.

Fixes #44269, which was open pending some kind of check for
bad SPWRITE functions in the wild. (No longer needed.)

Change-Id: I6000197b62812bbd2cd92da28eab422634cf75a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/317669
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This commit is contained in:
Russ Cox 2021-05-06 11:38:46 -04:00
parent e03383a2e2
commit 03886707f9
3 changed files with 29 additions and 5 deletions

View file

@ -413,6 +413,8 @@ func isAsyncSafePoint(gp *g, pc, sp, lr uintptr) (bool, uintptr) {
//
// TODO: Are there cases that are safe but don't have a
// locals pointer map, like empty frame functions?
// It might be possible to preempt any assembly functions
// except the ones that have funcFlag_SPWRITE set in f.flag.
return false, 0
}
name := funcname(f)

View file

@ -3570,6 +3570,21 @@ func preemptPark(gp *g) {
throw("bad g status")
}
gp.waitreason = waitReasonPreempted
if gp.asyncSafePoint {
// Double-check that async preemption does not
// happen in SPWRITE assembly functions.
// isAsyncSafePoint must exclude this case.
f := findfunc(gp.sched.pc)
if !f.valid() {
throw("preempt at unknown pc")
}
if f.flag&funcFlag_SPWRITE != 0 {
println("runtime: unexpected SPWRITE function", funcname(f), "in async preempt")
throw("preempt SPWRITE")
}
}
// Transition from _Grunning to _Gscan|_Gpreempted. We can't
// be in _Grunning when we dropg because then we'd be running
// without an M, but the moment we're in _Gpreempted,

View file

@ -219,18 +219,25 @@ func gentraceback(pc0, sp0, lr0 uintptr, gp *g, skip int, pcbuf *uintptr, max in
// This function marks the top of the stack. Stop the traceback.
frame.lr = 0
flr = funcInfo{}
} else if flag&funcFlag_SPWRITE != 0 {
} else if flag&funcFlag_SPWRITE != 0 && (callback == nil || n > 0) {
// The function we are in does a write to SP that we don't know
// how to encode in the spdelta table. Examples include context
// switch routines like runtime.gogo but also any code that switches
// to the g0 stack to run host C code. Since we can't reliably unwind
// the SP (we might not even be on the stack we think we are),
// we stop the traceback here.
// This only applies for profiling signals (callback == nil).
//
// For a GC stack traversal (callback != nil), we should only see
// a function when it has voluntarily preempted itself on entry
// during the stack growth check. In that case, the function has
// not yet had a chance to do any writes to SP and is safe to unwind.
// isAsyncSafePoint does not allow assembly functions to be async preempted,
// and preemptPark double-checks that SPWRITE functions are not async preempted.
// So for GC stack traversal we leave things alone (this if body does not execute for n == 0)
// at the bottom frame of the stack. But farther up the stack we'd better not
// find any.
if callback != nil {
// Finding an SPWRITE should only happen for a profiling signal, which can
// arrive at any time. For a GC stack traversal (callback != nil),
// we shouldn't see this case, and we must be sure to walk the
// entire stack or the GC is invalid. So crash.
println("traceback: unexpected SPWRITE function", funcname(f))
throw("traceback")
}