linux/Documentation/mm
David Hildenbrand 05c5323b2a mm: track mapcount of large folios in single value
Let's track the mapcount of large folios in a single value.  The mapcount
of a large folio currently corresponds to the sum of the entire mapcount
and all page mapcounts.

This sum is what we actually want to know in folio_mapcount() and it is
also sufficient for implementing folio_mapped().

With PTE-mapped THP becoming more important and more widely used, we want
to avoid looping over all pages of a folio just to obtain the mapcount of
large folios.  The comment "In the common case, avoid the loop when no
pages mapped by PTE" in folio_total_mapcount() no longer holds for
mTHP, which are always mapped by PTE.

Further, we are planning on using folio_mapcount() more frequently, and
might even want to remove page mapcounts for large folios in some kernel
configs.  Therefore, allow for reading the mapcount of large folios
efficiently and atomically without looping over any pages.
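
As a rough illustration of the idea (a userspace toy model, not the
kernel implementation), the sketch below keeps one extra counter that
always equals the sum of the entire mapcount and all page mapcounts, so
that reading it needs no loop.  All names (toy_folio, toy_map_pte(),
TOY_NR_PAGES, ...) are hypothetical, and the counters start at zero
here, which is a simplification.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_NR_PAGES 4 /* hypothetical folio size, for illustration only */

/* Toy stand-in for a large folio. This is NOT the kernel's struct folio. */
struct toy_folio {
	atomic_int entire_mapcount;             /* mappings of the whole folio */
	atomic_int page_mapcount[TOY_NR_PAGES]; /* per-page (PTE) mappings */
	atomic_int large_mapcount;              /* sum of all counters above */
};

/* Old way: walk every page to compute the sum. */
static int toy_mapcount_loop(struct toy_folio *f)
{
	int sum = atomic_load(&f->entire_mapcount);

	for (int i = 0; i < TOY_NR_PAGES; i++)
		sum += atomic_load(&f->page_mapcount[i]);
	return sum;
}

/* New way: a single atomic read, independent of the folio size. */
static int toy_mapcount(struct toy_folio *f)
{
	return atomic_load(&f->large_mapcount);
}

/* Map one page by PTE: the second add is the extra atomic operation. */
static void toy_map_pte(struct toy_folio *f, int idx)
{
	atomic_fetch_add(&f->page_mapcount[idx], 1);
	atomic_fetch_add(&f->large_mapcount, 1);
}

/* Map the whole folio at once (e.g., by PMD). */
static void toy_map_entire(struct toy_folio *f)
{
	atomic_fetch_add(&f->entire_mapcount, 1);
	atomic_fetch_add(&f->large_mapcount, 1);
}

/* Both readers must agree after any sequence of map operations. */
static void toy_demo(void)
{
	struct toy_folio f = {0};

	toy_map_entire(&f);
	toy_map_pte(&f, 0);
	toy_map_pte(&f, 3);
	assert(toy_mapcount_loop(&f) == 3);
	assert(toy_mapcount(&f) == toy_mapcount_loop(&f));
}
```

In this model the cost of toy_mapcount() does not depend on
TOY_NR_PAGES, whereas toy_mapcount_loop() grows with the folio size.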

Maintain the mapcount also for hugetlb pages for simplicity.  Use the new
mapcount to implement folio_mapcount() and folio_mapped().  Make
page_mapped() simply call folio_mapped().  We can now get rid of
folio_large_is_mapped().

_nr_pages_mapped is now only used in rmap code and for debugging purposes.
Keep folio_nr_pages_mapped() around, but document that its use should be
limited to rmap internals and debugging purposes.

This change implies one additional atomic add/sub whenever
mapping/unmapping (parts of) a large folio.

As we now batch RMAP operations for PTE-mapped THP during fork(), during
unmap/zap, and when PTE-remapping a PMD-mapped THP, and we adjust the
large mapcount for a PTE batch only once, the added overhead in the common
case is small.  Only when unmapping individual pages of a large folio
(e.g., during COW), the overhead might be bigger in comparison, but it's
essentially one additional atomic operation.
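
The batching argument above can be sketched with another userspace toy
model (again, not kernel code; toy_batch_folio, toy_map_ptes() and the
large_updates instrumentation field are hypothetical names).  A batch of
nr PTEs adjusts the large mapcount once by nr, while unmapping a single
page still costs one adjustment of its own.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_BATCH_NR_PAGES 16 /* hypothetical folio size */

/* Toy model only; NOT the kernel's struct folio. */
struct toy_batch_folio {
	atomic_int page_mapcount[TOY_BATCH_NR_PAGES];
	atomic_int large_mapcount;
	int large_updates; /* instrumentation: writes to large_mapcount */
};

/* Map nr consecutive pages by PTE; large_mapcount is touched once. */
static void toy_map_ptes(struct toy_batch_folio *f, int first, int nr)
{
	for (int i = first; i < first + nr; i++)
		atomic_fetch_add(&f->page_mapcount[i], 1);
	atomic_fetch_add(&f->large_mapcount, nr);
	f->large_updates++;
}

/* Unmap nr consecutive pages; again a single large_mapcount update. */
static void toy_unmap_ptes(struct toy_batch_folio *f, int first, int nr)
{
	for (int i = first; i < first + nr; i++)
		atomic_fetch_sub(&f->page_mapcount[i], 1);
	atomic_fetch_sub(&f->large_mapcount, nr);
	f->large_updates++;
}

static void toy_batch_demo(void)
{
	struct toy_batch_folio f = {0};

	/* Batched case (e.g., fork or zap): 16 pages, one extra atomic op. */
	toy_map_ptes(&f, 0, TOY_BATCH_NR_PAGES);
	assert(atomic_load(&f.large_mapcount) == 16);
	assert(f.large_updates == 1);

	/* Single-page case (e.g., COW): one page, still one extra atomic op. */
	toy_unmap_ptes(&f, 5, 1);
	assert(atomic_load(&f.large_mapcount) == 15);
	assert(f.large_updates == 2);
}
```

Relative to the per-page work, the extra update is amortized over the
whole batch in the first case and paid in full in the second.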

Note that the refcount would overflow before the new mapcount could:
each mapping requires a folio reference.  Extend the
documentation of folio_mapcount().

Link: https://lkml.kernel.org/r/20240409192301.907377-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Richard Chang <richardycc@google.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:53:28 -07:00
..
damon Docs/mm/damon/design: remove the details for pageout as paddr doesn't use MADV_PAGEOUT 2024-03-04 17:01:17 -08:00
active_mm.rst lazy tlb: allow lazy tlb mm refcounting to be configurable 2023-03-28 16:20:08 -07:00
allocation-profiling.rst memprofiling: documentation 2024-04-25 20:55:58 -07:00
arch_pgtable_helpers.rst Documentation/mm: drop pte_bad() descriptions from arch page table helpers 2023-12-10 16:51:40 -08:00
balance.rst - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
bootmem.rst
free_page_reporting.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
highmem.rst Documentation work keeps chugging along; stuff for 6.6 includes: 2023-08-30 20:05:42 -07:00
hmm.rst docs/mm: remove references to hmm_mirror ops and clean typos 2023-08-28 12:41:17 -06:00
hugetlbfs_reserv.rst mm: convert free_huge_page() to free_huge_folio() 2023-08-21 14:28:43 -07:00
hwpoison.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
index.rst memprofiling: documentation 2024-04-25 20:55:58 -07:00
ksm.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
memory-model.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
mmu_notifier.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
multigen_lru.rst mm: multi-gen LRU: improve design doc 2023-03-28 16:20:07 -07:00
numa.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
oom.rst
overcommit-accounting.rst docs: mm: fix vm overcommit documentation for OVERCOMMIT_GUESS 2023-10-10 13:35:55 -06:00
page_allocation.rst
page_cache.rst ubifs: Convert ubifs_vm_page_mkwrite() to use a folio 2024-02-25 21:08:00 +01:00
page_frags.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
page_migration.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
page_owner.rst mm,page_owner: fix refcount imbalance 2024-04-16 15:39:49 -07:00
page_reclaim.rst docs/mm: Physical Memory: remove useless markup 2023-02-02 10:18:04 -07:00
page_table_check.rst mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM 2023-05-29 16:14:28 +01:00
page_tables.rst Documentation/page_tables: Add info about MMU/TLB and Page Faults 2023-10-10 13:35:55 -06:00
physical_memory.rst docs/mm: Physical Memory: Fix grammar 2023-04-11 16:16:50 -06:00
process_addrs.rst
remap_file_pages.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
shmfs.rst
slab.rst
slub.rst mm/slub: make the description of slab_min_objects helpful in doc 2024-01-22 10:31:08 +01:00
split_page_table_lock.rst mm: remove pgtable_{pmd, pte}_page_{ctor, dtor}() wrappers 2023-08-21 13:37:58 -07:00
swap.rst
transhuge.rst mm: track mapcount of large folios in single value 2024-05-05 17:53:28 -07:00
unevictable-lru.rst Documentation: stop referring to page_remove_rmap() 2023-12-29 11:58:54 -08:00
vmalloc.rst
vmalloced-kernel-stacks.rst
vmemmap_dedup.rst remove references to page->flags in documentation 2024-04-25 20:56:15 -07:00
z3fold.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
zsmalloc.rst mm: add orphaned kernel-doc to the rst files. 2023-08-24 16:20:31 -07:00