linux/mm
Linus Torvalds feb889fb40 mm: don't put pinned pages into the swap cache
So technically there is nothing wrong with adding a pinned page to the
swap cache, but the pinning obviously means that the page can't actually
be free'd right now anyway, so it's a bit pointless.

However, the real problem is not with it being a bit pointless: the real
issue is that after we've added it to the swap cache, we'll try to unmap
the page.  That will succeed, because the code in mm/rmap.c doesn't know
or care about pinned pages.

Even the unmapping isn't fatal per se, since the page will stay around
in memory due to the pinning, and we do hold the connection to it using
the swap cache.  But when we then touch it next and take a page fault,
the logic in do_swap_page() will map it back into the process as a
possibly read-only page, and we'll then break the page association on
the next COW fault.

Honestly, this issue could have been fixed in any of those other places:
(a) we could refuse to unmap a pinned page (which makes conceptual
sense), or (b) we could make sure to re-map a pinned page writably in
do_swap_page(), or (c) we could just make do_wp_page() not COW the
pinned page (which was what we historically did before that "mm:
do_wp_page() simplification" commit).

But while all of them are equally valid models for breaking this chain,
not putting pinned pages into the swap cache in the first place is the
simplest one by far.

It's also the safest one: the reason why do_wp_page() was changed in the
first place was that getting the "can I re-use this page" wrong is so
fraught with errors.  If you do it wrong, you end up with an incorrectly
shared page.

As a result, using "page_maybe_dma_pinned()" in either do_wp_page() or
do_swap_page() would be a serious bug since it is only a (very good)
heuristic.  Re-using the page requires a hard black-and-white rule with
no room for ambiguity.

In contrast, saying "this page is very likely dma pinned, so let's not
add it to the swap cache and try to unmap it" is an obviously safe thing
to do, and if the heuristic might very rarely be a false positive, no
harm is done.

Fixes: 09854ba94c ("mm: do_wp_page() simplification")
Reported-and-tested-by: Martin Raiber <martin@urbackup.org>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-17 12:08:04 -08:00
..
kasan arm/kasan: fix the array size of kasan_early_shadow_pte[] 2021-01-12 18:12:54 -08:00
backing-dev.c
balloon_compaction.c
cleancache.c
cma.c
cma.h
cma_debug.c
compaction.c
debug.c
debug_page_ref.c
debug_vm_pgtable.c
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c
frame_vector.c
frontswap.c
gup.c
gup_test.c
gup_test.h
highmem.c
hmm.c
huge_memory.c
hugetlb.c mm/hugetlb: fix potential missing huge page size info 2021-01-12 18:12:54 -08:00
hugetlb_cgroup.c
hwpoison-inject.c
init-mm.c
internal.h
interval_tree.c
ioremap.c
Kconfig mm/Kconfig: fix spelling mistake "whats" -> "what's" 2020-12-19 11:25:41 -08:00
Kconfig.debug
khugepaged.c
kmemleak.c
ksm.c
list_lru.c
maccess.c
madvise.c
Makefile
mapping_dirty_helpers.c
memblock.c
memcontrol.c mm/memcontrol:rewrite mem_cgroup_page_lruvec() 2020-12-19 11:18:37 -08:00
memfd.c
memory-failure.c mm,hwpoison: fix printing of page flags 2021-01-12 18:12:54 -08:00
memory.c mm: generalise COW SMC TLB flushing race comment 2020-12-29 15:36:49 -08:00
memory_hotplug.c mm: memmap defer init doesn't work as expected 2020-12-29 15:36:49 -08:00
mempolicy.c mm: migrate: initialize err in do_migrate_pages 2021-01-12 18:12:54 -08:00
mempool.c kasan, mm: rename kasan_poison_kfree 2020-12-22 12:55:09 -08:00
memremap.c
memtest.c
migrate.c
mincore.c
mlock.c
mm_init.c
mmap.c
mmap_lock.c
mmu_gather.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c mm/mremap.c: fix extent calculation 2020-12-29 15:36:49 -08:00
msync.c
nommu.c
oom_kill.c
page-writeback.c mm: make wait_on_page_writeback() wait for multiple pending writebacks 2021-01-05 11:33:00 -08:00
page_alloc.c mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint 2021-01-12 18:12:54 -08:00
page_counter.c
page_ext.c
page_idle.c
page_io.c
page_isolation.c
page_owner.c
page_poison.c kasan, mm: reset tags when accessing metadata 2020-12-22 12:55:08 -08:00
page_reporting.c
page_reporting.h
page_vma_mapped.c
pagewalk.c
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c
pgalloc-track.h
pgtable-generic.c
process_vm_access.c mm/process_vm_access.c: include compat.h 2021-01-12 18:12:54 -08:00
ptdump.c kasan, arm64: expand CONFIG_KASAN checks 2020-12-22 12:55:08 -08:00
readahead.c
rmap.c
rodata_test.c
shmem.c
shuffle.c
shuffle.h
slab.c
slab.h
slab_common.c kasan, mm: allow cache merging with no metadata 2020-12-22 12:55:09 -08:00
slob.c
slub.c mm, slub: consider rest of partial list if acquire_slab() fails 2021-01-12 18:12:54 -08:00
sparse-vmemmap.c
sparse.c
swap.c
swap_cgroup.c
swap_slots.c
swap_state.c
swapfile.c
truncate.c
usercopy.c
userfaultfd.c
util.c
vmacache.c
vmalloc.c mm/vmalloc.c: fix potential memory leak 2021-01-12 18:12:54 -08:00
vmpressure.c
vmscan.c mm: don't put pinned pages into the swap cache 2021-01-17 12:08:04 -08:00
vmstat.c
workingset.c
z3fold.c
zbud.c
zpool.c
zsmalloc.c
zswap.c