freebsd-src/sys
Konstantin Belousov d929ad7f91 Ensure memory consistency on COW.
From the submitter description:
The process is forked transitioning a map entry to COW
Thread A writes to a page on the map entry, faults, updates the pmap to
  writable at a new phys addr, and starts TLB invalidations...
Thread B acquires a lock, writes to a location on the new phys addr, and
  releases the lock
Thread C acquires the lock, reads from the location on the old phys addr...
Thread A ...continues the TLB invalidations which are completed
Thread C ...reads from the location on the new phys addr, and releases
  the lock

In this example Thread B and C [lock, use and unlock] properly and
neither own the lock at the same time.  Thread A was writing somewhere
else on the page and so never had/needed the lock. Thread C sees a
location that is only ever read|modified under a lock change beneath
it while it is the lock owner.

To fix this, perform the two-stage update of the copied PTE.  First,
the PTE is updated with the address of the new physical page with
copied content, but in read-only mode.  The pmap locking and the page
busy state during PTE update and TLB invalidation IPIs ensure that any
writer to the page cannot upgrade the PTE to the writable state until
all CPUs updated their TLB to not cache old mapping.  Then, after the
busy state of the page is lifted, the faults for write can proceed and
do not violate the consistency of the reads.

The change is done in vm_fault because most architectures do need IPIs
to invalidate remote TLBs.  More, I think that hardware guarantees of
atomicity of the remote TLB invalidation are not enough to prevent the
inconsistent reads of non-atomic reads, like multi-word accesses
protected by a lock.  So instead of modifying each pmap invalidation
code, I did it there.

Discovered and analyzed by: Elliott.Rabe@dell.com
Reviewed by:	markj
PR:	225584 (appeared to have the same cause)
Tested by:	Elliott.Rabe@dell.com, emaste, Mike Tancsa <mike@sentex.net>, truckman
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14347
2018-02-14 00:31:45 +00:00
..
amd64 amd64/pmap: Move Foundation copyright to the 2-clause section 2018-02-13 19:19:26 +00:00
arm Make v_wire_count a per-cpu counter(9) counter. This eliminates a 2018-02-12 22:53:00 +00:00
arm64 Make v_wire_count a per-cpu counter(9) counter. This eliminates a 2018-02-12 22:53:00 +00:00
bsm sys: further adoption of SPDX licensing ID tags. 2017-11-20 19:43:44 +00:00
cam Fix cut and pasted comments to reflect differences in code from the 2018-02-07 18:33:46 +00:00
cddl Add sysctls for dnode block and indirect block shifts. 2018-02-09 23:29:50 +00:00
compat linuxkpi: Do not leak pages on put. 2018-02-13 15:44:35 +00:00
conf Add support for zstd-compressed user and kernel core dumps. 2018-02-13 19:28:02 +00:00
contrib Move zstd malloc()/free()/calloc() macros to stdlib.h. 2018-02-13 19:18:00 +00:00
crypto ccp(4): Store IV in output buffer in GCM software fallback when requested 2018-01-27 07:41:31 +00:00
ddb Implement 'domainset', a cpuset based NUMA policy mechanism. This allows 2018-01-12 22:48:23 +00:00
dev bwn(4): Conditionalize "RX decryption attempted" message on a new 2018-02-13 20:07:40 +00:00
dts Add a skeleton Clock Manager for RPi2/3, and use that from pwm 2018-01-22 07:10:30 +00:00
fs {ext2|ufs}_readdir: Avoid setting negative ncookies. 2018-02-06 22:38:19 +00:00
gdb sys/gdb: further adoption of SPDX licensing ID tags. 2017-11-27 15:16:59 +00:00
geom Narrow a race, and fix a leak, in g_part_wither 2018-02-13 17:40:09 +00:00
gnu bwn(4): txpid2g/txpid5g[lh] are not defined after sromrev 7; the default 2018-02-13 17:43:54 +00:00
i386 Import the mthca kernel side infiniband driver from Linux 4.9 and fix 2018-02-13 17:04:34 +00:00
isa Add ISA PNP tables to ISA drivers. Fix a few incidental comments. 2018-01-29 00:22:30 +00:00
kern Add support for zstd-compressed user and kernel core dumps. 2018-02-13 19:28:02 +00:00
kgssapi kgssapi: Remove trivial deadcode 2018-02-14 00:12:03 +00:00
libkern libkern: use nul for terminating char rather than 0 2018-02-13 19:17:48 +00:00
mips Make v_wire_count a per-cpu counter(9) counter. This eliminates a 2018-02-12 22:53:00 +00:00
modules Import the mthca kernel side infiniband driver from Linux 4.9 and fix 2018-02-13 17:04:34 +00:00
net BPF: Switch to 32 bit compatible mode only when thread is 32 bit 2018-01-25 12:13:41 +00:00
net80211 net80211: sanitize input for ieee80211_output() 2017-12-30 00:40:34 +00:00
netgraph Revert r327828, r327949, r327953, r328016-r328026, r328041: 2018-01-21 15:42:36 +00:00
netinet Reinitialize IP header length after checksum calculation. It is used 2018-02-10 10:13:17 +00:00
netinet6 Update the MTU in affected routes when IPv6 RA changes the MTU 2018-02-12 19:49:20 +00:00
netipsec Adopt revision 1.76 and 1.77 from NetBSD: 2018-01-24 19:48:25 +00:00
netpfil Remove duplicate #include <netinet/ip_var.h>. 2018-02-07 19:12:05 +00:00
netsmb Unsign some values related to allocation. 2018-01-22 02:08:10 +00:00
nfs Modernize nfssvc(2) registartion. 2018-02-08 20:09:42 +00:00
nfsclient style: Remove remaining deprecated MALLOC/FREE macros 2018-01-25 22:25:13 +00:00
nfsserver sys: general adoption of SPDX licensing ID tags. 2017-11-27 15:23:17 +00:00
nlm Use syscall_helper_register() to register syscalls and initialize though 2018-02-10 01:09:22 +00:00
ofed Import the mthca kernel side infiniband driver from Linux 4.9 and fix 2018-02-13 17:04:34 +00:00
opencrypto Move per-operation data out of the csession structure. 2018-01-26 23:21:50 +00:00
powerpc Make v_wire_count a per-cpu counter(9) counter. This eliminates a 2018-02-12 22:53:00 +00:00
riscv Make v_wire_count a per-cpu counter(9) counter. This eliminates a 2018-02-12 22:53:00 +00:00
rpc Do pass removing some write-only variables from the kernel. 2017-12-25 04:48:39 +00:00
security Do pass removing some write-only variables from the kernel. 2017-12-25 04:48:39 +00:00
sparc64 Make v_wire_count a per-cpu counter(9) counter. This eliminates a 2018-02-12 22:53:00 +00:00
sys Add support for zstd-compressed user and kernel core dumps. 2018-02-13 19:28:02 +00:00
teken sys: general adoption of SPDX licensing ID tags. 2017-11-27 15:23:17 +00:00
tests
tools Avoid using \$. It's an unknown escape sequence. Some awks warn about 2018-01-28 05:13:08 +00:00
ufs The goal of this change is to prevent accidental foot shooting by 2018-02-08 23:06:58 +00:00
vm Ensure memory consistency on COW. 2018-02-14 00:31:45 +00:00
x86 Fix build with gas. 2018-02-13 15:30:31 +00:00
xdr sys: general adoption of SPDX licensing ID tags. 2017-11-27 15:23:17 +00:00
xen sys: general adoption of SPDX licensing ID tags. 2017-11-27 15:23:17 +00:00
Makefile Move sys/boot to stand. Fix all references to new location 2017-11-14 23:02:19 +00:00