linux/mm
Johannes Weiner f78dfc7b77 workingset: fix confusion around eviction vs refault container
Refault decisions are made based on the lruvec where the page was evicted,
as that determined its LRU order while it was alive.  Stats and workingset
aging must then occur on the lruvec of the new page, as that's the node
and cgroup that experience the refault and that's the lruvec whose
nonresident info is aged out by the new resident page.  Those lruvecs can
differ when a page is shared between cgroups, or when the refaulting page
is allocated on a different node.

There are currently two mix-ups:

1. When swap is available, the resident anon set must be considered
   when comparing the refault distance. The comparison is made against
   the right anon set, but the check for swap is not. When pages get
   evicted from a cgroup with swap, and refault in one without, this
   can incorrectly consider a hot refault as cold - and vice
   versa. Fix that by using the eviction cgroup for the swap check.

2. The stats and workingset age are updated against the wrong lruvec
   altogether: the right cgroup but the wrong NUMA node. When a page
   refaults on a different NUMA node, this produces confusing stats
   and distorts the workingset age of a different lruvec - again
   possibly resulting in hot/cold misclassifications down the line.

Fix the swap check and the refault pgdat to address both concerns.
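
The split described above can be illustrated with a small user-space
model.  This is only a sketch under stated assumptions, not the code in
mm/workingset.c: struct lruvec_model, its fields, and refault_model() are
hypothetical names, the refault distance is passed in rather than derived
from a shadow entry, and only the file-page case is modelled.  It shows
the hot/cold decision and the swap check reading from the eviction
lruvec, while stats and aging are charged to the refault lruvec.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one lruvec (one cgroup on one NUMA node). */
struct lruvec_model {
	unsigned long active_file;
	unsigned long active_anon;
	unsigned long inactive_anon;
	bool swap_available;           /* can the owning cgroup swap? */
	unsigned long nonresident_age; /* workingset age of this lruvec */
	unsigned long refaults;        /* stat: refaults seen here */
	unsigned long activations;     /* stat: refaults judged hot */
};

/*
 * Model of a file-page refault. Returns true if the refault is
 * considered hot (the page would be activated).
 */
static bool refault_model(const struct lruvec_model *evicted_from,
			  struct lruvec_model *refaulted_on,
			  unsigned long refault_distance)
{
	unsigned long workingset_size;

	assert(evicted_from && refaulted_on);

	/* The decision uses the lruvec the page was evicted from ... */
	workingset_size = evicted_from->active_file;

	/* ... including the swap check (mix-up 1 in the list above). */
	if (evicted_from->swap_available)
		workingset_size += evicted_from->active_anon +
				   evicted_from->inactive_anon;

	/* Stats go to the lruvec that experiences the refault (mix-up 2). */
	refaulted_on->refaults++;

	if (refault_distance > workingset_size)
		return false; /* cold: leave the page inactive */

	/* Hot: the new resident page ages the refault lruvec. */
	refaulted_on->activations++;
	refaulted_on->nonresident_age++;
	return true;
}
```

In this model, a page evicted from a lruvec with swap and refaulting in
one without is still compared against the evicting side's file plus anon
set, and the evicting lruvec's counters are left untouched.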

This was found during code review.  It hasn't caused notable issues in
production, suggesting that those refault-migrations are relatively rare
in practice.

Link: https://lkml.kernel.org/r/20230104222944.2380117-1-nphamcs@gmail.com
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Co-developed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:53 -08:00
damon mm/damon/vaddr: convert hugetlb related functions to use a folio 2023-01-18 17:12:53 -08:00
kasan kasan: allow sampling page_alloc allocations for HW_TAGS 2023-01-18 17:12:45 -08:00
kfence hardening updates for v6.2-rc1 2022-12-14 12:20:00 -08:00
kmsan kmsan: export kmsan_handle_urb 2022-12-21 14:31:52 -08:00
backing-dev.c
balloon_compaction.c
bootmem_info.c
cma.c cma: tracing: print alloc result in trace_cma_alloc_finish 2023-01-18 17:12:41 -08:00
cma.h
cma_debug.c
cma_sysfs.c
compaction.c
debug.c
debug_page_ref.c
debug_vm_pgtable.c
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c
folio-compat.c
frontswap.c
gup.c New Feature: 2022-12-17 14:06:53 -06:00
gup_test.c mm/gup_test: free memory allocated via kvcalloc() using kvfree() 2022-12-15 16:37:48 -08:00
gup_test.h
highmem.c
hmm.c mm/hugetlb: make walk_hugetlb_range() safe to pmd unshare 2023-01-18 17:12:39 -08:00
huge_memory.c mm: huge_memory: convert split_huge_pages_all() to use a folio 2023-01-18 17:12:51 -08:00
hugetlb.c mm/uffd: detect pgtable allocation failures 2023-01-18 17:12:53 -08:00
hugetlb_cgroup.c
hugetlb_vmemmap.c
hugetlb_vmemmap.h
hwpoison-inject.c
init-mm.c
internal.h mm: move folio_set_compound_order() to mm/internal.h 2023-01-18 17:12:36 -08:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig New Feature: 2022-12-17 14:06:53 -06:00
Kconfig.debug
khugepaged.c mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end 2023-01-11 16:14:20 -08:00
kmemleak.c mm/kmemleak: use %pK to display kernel pointers in backtrace 2022-12-15 16:37:49 -08:00
ksm.c
list_lru.c
maccess.c
madvise.c mm/swap: convert deactivate_page() to folio_deactivate() 2023-01-18 17:12:47 -08:00
Makefile
mapping_dirty_helpers.c mm: Rename pmd_read_atomic() 2022-12-15 10:37:27 -08:00
memblock.c mm: Always release pages to the buddy allocator in memblock_free_late(). 2023-01-08 18:49:33 +02:00
memcontrol.c mm: memcg: add folio_memcg_check() 2023-01-18 17:12:52 -08:00
memfd.c mm/memfd: add write seals when apply SEAL_EXEC to executable memfd 2023-01-18 17:12:37 -08:00
memory-failure.c
memory-tiers.c
memory.c mm/memory: add vm_normal_folio() 2023-01-18 17:12:47 -08:00
memory_hotplug.c
mempolicy.c mm/uffd: detect pgtable allocation failures 2023-01-18 17:12:53 -08:00
mempool.c
memremap.c
memtest.c
migrate.c mm/hugetlb: move swap entry handling into vma lock when faulted 2023-01-18 17:12:38 -08:00
migrate_device.c
mincore.c
mlock.c
mm_init.c
mm_slot.h
mmap.c mm: update mmap_sem comments to refer to mmap_lock 2023-01-11 16:14:22 -08:00
mmap_lock.c
mmu_gather.c
mmu_notifier.c
mmzone.c
mprotect.c mm/uffd: detect pgtable allocation failures 2023-01-18 17:12:53 -08:00
mremap.c mm, mremap: fix mremap() expanding vma with addr inside vma 2022-12-21 14:31:51 -08:00
msync.c
nommu.c nommu: fix split_vma() map_count error 2023-01-11 16:14:23 -08:00
oom_kill.c
page-writeback.c mm: remove generic_writepages 2023-01-18 17:12:51 -08:00
page_alloc.c mm: multi-gen LRU: per-node lru_gen_folio lists 2023-01-18 17:12:49 -08:00
page_counter.c
page_ext.c
page_idle.c mm: page_idle: convert page idle to use a folio 2023-01-18 17:12:52 -08:00
page_io.c page_io: remove buffer_head include 2023-01-18 17:12:40 -08:00
page_isolation.c
page_owner.c
page_poison.c
page_reporting.c mm/page_reporting: replace rcu_access_pointer() with rcu_dereference_protected() 2023-01-18 17:12:50 -08:00
page_reporting.h
page_table_check.c
page_vma_mapped.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
pagewalk.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c
pgalloc-track.h
pgtable-generic.c
process_vm_access.c
ptdump.c
readahead.c
rmap.c mm: rmap: remove lock_page_memcg() 2023-01-18 17:12:42 -08:00
rodata_test.c
secretmem.c
shmem.c swap: avoid holding swap reference in swap_cache_get_folio 2023-01-18 17:12:45 -08:00
shrinker_debug.c
shuffle.c
shuffle.h
slab.c
slab.h
slab_common.c hardening updates for v6.2-rc1 2022-12-14 12:20:00 -08:00
slob.c
slub.c
sparse-vmemmap.c
sparse.c
swap.c mm/swap: convert deactivate_page() to folio_deactivate() 2023-01-18 17:12:47 -08:00
swap.h
swap_cgroup.c
swap_slots.c
swap_state.c swap: avoid holding swap reference in swap_cache_get_folio 2023-01-18 17:12:45 -08:00
swapfile.c swapfile: get rid of volatile and avoid redundant read 2023-01-18 17:12:44 -08:00
truncate.c
usercopy.c
userfaultfd.c mm/uffd: detect pgtable allocation failures 2023-01-18 17:12:53 -08:00
util.c mm: new primitive kvmemdup() 2023-01-18 17:12:47 -08:00
vmalloc.c mm: vmalloc: replace BUG_ON() by WARN_ON_ONCE() 2023-01-18 17:12:48 -08:00
vmpressure.c
vmscan.c mm: multi-gen LRU: simplify arch_has_hw_pte_young() check 2023-01-18 17:12:49 -08:00
vmstat.c
workingset.c workingset: fix confusion around eviction vs refault container 2023-01-18 17:12:53 -08:00
z3fold.c
zbud.c
zpool.c
zsmalloc.c
zswap.c