Find a file
Sean Christopherson 58cd407ca4 KVM: Fix multiple races in gfn=>pfn cache refresh
Rework the gfn=>pfn cache (gpc) refresh logic to address multiple races
between the cache itself, and between the cache and mmu_notifier events.

The existing refresh code attempts to guard against races with the
mmu_notifier by speculatively marking the cache valid, and then marking
it invalid if a mmu_notifier invalidation occurs.  That handles the case
where an invalidation occurs between dropping and re-acquiring gpc->lock,
but it doesn't handle the scenario where the cache is refreshed after the
cache was invalidated by the notifier, but before the notifier elevates
mmu_notifier_count.  The gpc refresh can't use the "retry" helper as its
invalidation occurs _before_ mmu_notifier_count is elevated and before
mmu_notifier_range_start is set/updated.

  CPU0                                    CPU1
  ----                                    ----

  gfn_to_pfn_cache_invalidate_start()
  |
  -> gpc->valid = false;
                                          kvm_gfn_to_pfn_cache_refresh()
                                          |
                                          |-> gpc->valid = true;

                                          hva_to_pfn_retry()
                                          |
                                          -> acquire kvm->mmu_lock
                                             kvm->mmu_notifier_count == 0
                                             mmu_seq == kvm->mmu_notifier_seq
                                             drop kvm->mmu_lock
                                             return pfn 'X'
  acquire kvm->mmu_lock
  kvm_inc_notifier_count()
  drop kvm->mmu_lock()
  kernel frees pfn 'X'
                                          kvm_gfn_to_pfn_cache_check()
                                          |
                                          |-> gpc->valid == true

                                          caller accesses freed pfn 'X'

Key off of mn_active_invalidate_count to detect that a pfncache refresh
needs to wait for an in-progress mmu_notifier invalidation.  While
mn_active_invalidate_count is not guaranteed to be stable, it is
guaranteed to be elevated prior to an invalidation acquiring gpc->lock,
so either the refresh will see an active invalidation and wait, or the
invalidation will run after the refresh completes.

Speculatively marking the cache valid is itself flawed, as a concurrent
kvm_gfn_to_pfn_cache_check() would see a valid cache with stale pfn/khva
values.  The KVM Xen use case explicitly allows/wants multiple users;
even though the caches are allocated per vCPU, __kvm_xen_has_interrupt()
can read a different vCPU (or vCPUs).  Address this race by invalidating
the cache prior to dropping gpc->lock (this is made possible by fixing
the above mmu_notifier race).

Complicating all of this is the fact that both the hva=>pfn resolution
and mapping of the kernel address can sleep, i.e. must be done outside
of gpc->lock.

Fix the above races in one fell swoop, trying to fix each individual race
is largely pointless and essentially impossible to test, e.g. closing one
hole just shifts the focus to the other hole.

Fixes: 982ed0de47 ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
Cc: stable@vger.kernel.org
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220429210025.3293691-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25 05:23:44 -04:00
arch KVM: set_msr_mce: Permit guests to ignore single-bit ECC errors 2022-05-25 05:23:40 -04:00
block Revert "block: release rq qos structures for queue without disk" 2022-05-02 10:06:40 -06:00
certs Kbuild updates for v5.18 2022-03-31 11:59:03 -07:00
crypto for-5.18/64bit-pi-2022-03-25 2022-03-26 12:01:35 -07:00
Documentation Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND 2022-05-25 05:15:43 -04:00
drivers s390/uv_uapi: depend on CONFIG_S390 2022-05-25 05:11:36 -04:00
fs gfs2 fixes 2022-05-13 14:32:53 -07:00
include KVM: Fully serialize gfn=>pfn cache refresh via mutex 2022-05-25 05:23:43 -04:00
init Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug 2022-04-13 13:37:47 -04:00
ipc fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
kernel KVM/riscv changes for 5.19 2022-05-25 05:09:49 -04:00
lib dim: initialize all struct fields 2022-05-09 17:20:37 -07:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm hotfixes for 5.18-rc7 2022-05-13 10:22:37 -07:00
net NFS client bugfixes for Linux 5.18 2022-05-13 11:04:37 -07:00
samples sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
scripts Merge branch kvm-arm64/hyp-stack-guard into kvmarm-master/next 2022-05-04 09:42:37 +01:00
security hardening updates for v5.18-rc1-fix1 2022-03-31 11:43:01 -07:00
sound ASoC: Fixes for v5.18 2022-05-08 10:49:25 +02:00
tools KVM: selftests: x86: Sync the new name of the test case to .gitignore 2022-05-25 05:15:48 -04:00
usr Kbuild updates for v5.18 2022-03-31 11:59:03 -07:00
virt KVM: Fix multiple races in gfn=>pfn cache refresh 2022-05-25 05:23:44 -04:00
.clang-format genirq/msi: Make interrupt allocation less convoluted 2021-12-16 22:22:20 +01:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap hotfixes for 5.18-rc7 2022-05-13 10:22:37 -07:00
COPYING
CREDITS MAINTAINERS: replace a Microchip AT91 maintainer 2022-02-09 11:30:01 +01:00
Kbuild
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS KVM: s390: Fix and feature for 5.19 2022-05-25 05:11:21 -04:00
Makefile Linux 5.18-rc7 2022-05-15 18:08:58 -07:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.