linux/mm
Aaron Lu 85ccc8fa81 mm/page_alloc: make sure __rmqueue() etc are always inline
__rmqueue(), __rmqueue_fallback(), __rmqueue_smallest() and
__rmqueue_cma_fallback() are all in page allocator's hot path and better
be finished as soon as possible.  One way to make them faster is by making
them inline.  But as Andrew Morton and Andi Kleen pointed out:

  https://lkml.org/lkml/2017/10/10/1252
  https://lkml.org/lkml/2017/10/10/1279

To make sure they are inlined, we should use __always_inline for them.

With the will-it-scale/page_fault1/process benchmark, when using nr_cpu
processes to stress buddy, the results for will-it-scale.processes with
and without the patch are:

On a 2-sockets Intel-Skylake machine:

   compiler          base        head
  gcc-4.4.7       6496131     6911823 +6.4%
  gcc-4.9.4       7225110     7731072 +7.0%
  gcc-5.4.1       7054224     7688146 +9.0%
  gcc-6.2.0       7059794     7651675 +8.4%

On a 4-sockets Intel-Skylake machine:

   compiler          base        head
  gcc-4.4.7      13162890    13508193 +2.6%
  gcc-4.9.4      14997463    15484353 +3.2%
  gcc-5.4.1      14708711    15449805 +5.0%
  gcc-6.2.0      14574099    15349204 +5.3%

The above 4 compilers are used because I've done the tests through
Intel's Linux Kernel Performance(LKP) infrastructure and they are the
available compilers there.

The benefit being less on 4 sockets machine is due to the lock
contention there(perf-profile/native_queued_spin_lock_slowpath=81%) is
less severe than on the 2 sockets machine(85%).

What the benchmark does is: it forks nr_cpu processes and then each
process does the following:
    1 mmap() 128M anonymous space;
    2 writes to each page there to trigger actual page allocation;
    3 munmap() it.
in a loop.

  https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c

Binary size wise, I have locally built them with different compilers:

[aaron@aaronlu obj]$ size */*/mm/page_alloc.o
   text    data     bss     dec     hex filename
  37409    9904    8524   55837    da1d gcc-4.9.4/base/mm/page_alloc.o
  38273    9904    8524   56701    dd7d gcc-4.9.4/head/mm/page_alloc.o
  37465    9840    8428   55733    d9b5 gcc-5.5.0/base/mm/page_alloc.o
  38169    9840    8428   56437    dc75 gcc-5.5.0/head/mm/page_alloc.o
  37573    9840    8428   55841    da21 gcc-6.4.0/base/mm/page_alloc.o
  38261    9840    8428   56529    dcd1 gcc-6.4.0/head/mm/page_alloc.o
  36863    9840    8428   55131    d75b gcc-7.2.0/base/mm/page_alloc.o
  37711    9840    8428   55979    daab gcc-7.2.0/head/mm/page_alloc.o

Text size increased about 800 bytes for mm/page_alloc.o.

[aaron@aaronlu obj]$ size */*/vmlinux
   text    data     bss     dec       hex     filename
10342757   5903208 17723392 33969357  20654cd gcc-4.9.4/base/vmlinux
10342757   5903208 17723392 33969357  20654cd gcc-4.9.4/head/vmlinux
10332448   5836608 17715200 33884256  2050860 gcc-5.5.0/base/vmlinux
10332448   5836608 17715200 33884256  2050860 gcc-5.5.0/head/vmlinux
10094546   5836696 17715200 33646442  201676a gcc-6.4.0/base/vmlinux
10094546   5836696 17715200 33646442  201676a gcc-6.4.0/head/vmlinux
10018775   5828732 17715200 33562707  2002053 gcc-7.2.0/base/vmlinux
10018775   5828732 17715200 33562707  2002053 gcc-7.2.0/head/vmlinux

Text size for vmlinux has no change though, probably due to function
alignment.

Link: http://lkml.kernel.org/r/20171013063111.GA26032@intel.com
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Kemi Wang <kemi.wang@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
..
kasan slab, slub, slob: add slab_flags_t 2017-11-15 18:21:01 -08:00
backing-dev.c backing-dev: kill unused pdflush_proc_obsolete() 2017-10-06 08:15:15 -06:00
balloon_compaction.c mm/migrate: new migrate mode MIGRATE_SYNC_NO_COPY 2017-09-08 18:26:46 -07:00
bootmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cleancache.c fs: switch ->s_uuid to uuid_t 2017-06-05 16:59:12 +02:00
cma.c mm/cma.c: change pr_info to pr_err for cma_alloc fail log 2017-11-15 18:21:03 -08:00
cma.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cma_debug.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compaction.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
debug.c mm: consolidate page table accounting 2017-11-15 18:21:04 -08:00
debug_page_ref.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dmapool.c lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
early_ioremap.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
fadvise.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
failslab.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
filemap.c mm: remove nr_pages argument from pagevec_lookup_{,range}_tag() 2017-11-15 18:21:04 -08:00
frame_vector.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
frontswap.c mm, frontswap: convert frontswap_enabled to static key 2016-07-26 16:19:19 -07:00
gup.c Merge branch 'x86/urgent' into x86/mm, to pick up fixes 2017-10-20 13:06:52 +02:00
highmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
hmm.c mm/hmm: avoid bloating arch that do not make use of HMM 2017-09-08 18:26:46 -07:00
huge_memory.c mm: consolidate page table accounting 2017-11-15 18:21:04 -08:00
hugetlb.c mm/mmu_notifier: avoid double notification when it is useless 2017-11-15 18:21:03 -08:00
hugetlb_cgroup.c mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size 2016-05-20 17:58:30 -07:00
hwpoison-inject.c mm: hwpoison: call shake_page() unconditionally 2017-05-03 15:52:12 -07:00
init-mm.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
internal.h mm, oom: do not rely on TIF_MEMDIE for memory reserves access 2017-09-06 17:27:30 -07:00
interval_tree.c lib/interval_tree: fast overlap detection 2017-09-08 18:26:49 -07:00
Kconfig mm/hmm: avoid bloating arch that do not make use of HMM 2017-09-08 18:26:46 -07:00
Kconfig.debug kmemcheck: rip it out 2017-11-15 18:21:05 -08:00
khugepaged.c mm: introduce wrappers to access mm->nr_ptes 2017-11-15 18:21:04 -08:00
kmemcheck.c kmemcheck: rip it out 2017-11-15 18:21:05 -08:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c kmemcheck: remove annotations 2017-11-15 18:21:04 -08:00
ksm.c mm/mmu_notifier: avoid double notification when it is useless 2017-11-15 18:21:03 -08:00
list_lru.c mm: memcontrol: use vmalloc fallback for large kmem memcg arrays 2017-10-03 17:54:25 -07:00
maccess.c x86: remove more uaccess_32.h complexity 2016-05-22 17:21:27 -07:00
madvise.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Makefile kmemcheck: rip it out 2017-11-15 18:21:05 -08:00
memblock.c mm: define memblock_virt_alloc_try_nid_raw 2017-11-15 18:21:05 -08:00
memcontrol.c mm: slabinfo: remove CONFIG_SLABINFO 2017-11-15 18:21:01 -08:00
memory-failure.c x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages 2017-08-17 10:30:49 +02:00
memory.c mm: introduce wrappers to access mm->nr_ptes 2017-11-15 18:21:04 -08:00
memory_hotplug.c mm, memory_hotplug: remove timeout from __offline_memory 2017-11-15 18:21:02 -08:00
mempolicy.c mm/mempolicy: fix NUMA_INTERLEAVE_HIT counter 2017-10-13 16:18:32 -07:00
mempool.c mm/mempool.c: use kmalloc_array_node() 2017-11-15 18:21:02 -08:00
memtest.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
migrate.c mm/mmu_notifier: avoid call to invalidate_range() in range_end() 2017-11-15 18:21:03 -08:00
mincore.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mlock.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c lib/interval_tree: fast overlap detection 2017-09-08 18:26:49 -07:00
mmu_context.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
mmu_notifier.c mm/mmu_notifier: avoid call to invalidate_range() in range_end() 2017-11-15 18:21:03 -08:00
mmzone.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mprotect.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mremap.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
msync.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nobootmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nommu.c Merge branch 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 18:13:32 -07:00
oom_kill.c mm: consolidate page table accounting 2017-11-15 18:21:04 -08:00
page-writeback.c mm: remove nr_pages argument from pagevec_lookup_{,range}_tag() 2017-11-15 18:21:04 -08:00
page_alloc.c mm/page_alloc: make sure __rmqueue() etc are always inline 2017-11-15 18:21:05 -08:00
page_counter.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_ext.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_idle.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_io.c mm, swap: skip swapcache for swapin of synchronous device 2017-11-15 18:21:02 -08:00
page_isolation.c mm: distinguish CMA and MOVABLE isolation in has_unmovable_pages() 2017-11-15 18:21:02 -08:00
page_owner.c mm/page_owner.c: reduce page_owner structure size 2017-11-15 18:21:03 -08:00
page_poison.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
page_vma_mapped.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
pagewalk.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
percpu-internal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
percpu-km.c percpu: replace area map allocator with bitmap 2017-07-26 17:41:05 -04:00
percpu-stats.c percpu: fix starting offset for chunk statistics traversal 2017-09-27 14:45:57 -07:00
percpu-vm.c percpu: fix static checker warnings in pcpu_destroy_chunk 2017-06-29 11:23:38 -04:00
percpu.c mm, percpu: add support for __GFP_NOWARN flag 2017-10-19 13:13:49 +01:00
pgtable-generic.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
process_vm_access.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> 2017-03-02 08:42:28 +01:00
quicklist.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
readahead.c mm: don't cap request size based on read-ahead setting 2016-12-12 18:55:08 -08:00
rmap.c mm/rmap.c: remove redundant variable cend 2017-11-15 18:21:04 -08:00
rodata_test.c mm: fix RODATA_TEST failure "rodata_test: test data was not read only" 2017-10-03 17:54:24 -07:00
shmem.c mm: treewide: remove GFP_TEMPORARY allocation flag 2017-09-13 18:53:16 -07:00
slab.c kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2017-11-15 18:21:04 -08:00
slab.h kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2017-11-15 18:21:04 -08:00
slab_common.c kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2017-11-15 18:21:04 -08:00
slob.c slab, slub, slob: add slab_flags_t 2017-11-15 18:21:01 -08:00
slub.c kmemcheck: rip it out 2017-11-15 18:21:05 -08:00
sparse-vmemmap.c mm: stop zeroing memory during allocation in vmemmap 2017-11-15 18:21:05 -08:00
sparse.c mm: stop zeroing memory during allocation in vmemmap 2017-11-15 18:21:05 -08:00
swap.c mm: remove nr_pages argument from pagevec_lookup_{,range}_tag() 2017-11-15 18:21:04 -08:00
swap_cgroup.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swap_slots.c mm/swap_slots.c: fix race conditions in swap_slots cache init 2017-11-15 18:21:03 -08:00
swap_state.c mm/swap_state.c: declare a few variables as __read_mostly 2017-11-15 18:21:05 -08:00
swapfile.c mm: swap: SWP_SYNCHRONOUS_IO: skip swapcache only if swapped page has no other reference 2017-11-15 18:21:02 -08:00
truncate.c mm/truncate.c: fix THP handling in invalidate_mapping_pages() 2017-07-10 16:32:32 -07:00
usercopy.c mm/usercopy: Drop extra is_vmalloc_or_module() check 2017-04-05 12:30:18 -07:00
userfaultfd.c userfaultfd: shmem: wire up shmem_mfill_zeropage_pte 2017-09-06 17:27:28 -07:00
util.c mm: rename global_page_state to global_zone_page_state 2017-09-06 17:27:29 -07:00
vmacache.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
vmalloc.c Revert "vmalloc: back off when the current task is killed" 2017-10-13 16:18:32 -07:00
vmpressure.c mm, vmpressure: pass-through notification support 2017-07-10 16:32:31 -07:00
vmscan.c mm: remove unused pgdat->inactive_ratio 2017-11-15 18:21:03 -08:00
vmstat.c mm: remove unused pgdat->inactive_ratio 2017-11-15 18:21:03 -08:00
workingset.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
z3fold.c z3fold: fix stale list handling 2017-10-03 17:54:24 -07:00
zbud.c mm/zbud.c: use list_last_entry() instead of list_tail_entry() 2016-01-15 11:40:52 -08:00
zpool.c
zsmalloc.c zsmalloc: calling zs_map_object() from irq is a bug 2017-11-15 18:21:03 -08:00
zswap.c mm/zswap.c: delete an error message for a failed memory allocation in zswap_dstmem_prepare() 2017-07-06 16:24:35 -07:00