Merge branch 'akpm' (patches from Andrew)

Merge misc updates from Andrew Morton:
 "191 patches.

  Subsystems affected by this patch series: kthread, ia64, scripts,
  ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab,
  slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap,
  mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization,
  pagealloc, and memory-failure)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits)
  mm,hwpoison: make get_hwpoison_page() call get_any_page()
  mm,hwpoison: send SIGBUS with error virutal address
  mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes
  mm/page_alloc: allow high-order pages to be stored on the per-cpu lists
  mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM
  mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
  docs: remove description of DISCONTIGMEM
  arch, mm: remove stale mentions of DISCONIGMEM
  mm: remove CONFIG_DISCONTIGMEM
  m68k: remove support for DISCONTIGMEM
  arc: remove support for DISCONTIGMEM
  arc: update comment about HIGHMEM implementation
  alpha: remove DISCONTIGMEM and NUMA
  mm/page_alloc: move free_the_page
  mm/page_alloc: fix counting of managed_pages
  mm/page_alloc: improve memmap_pages dbg msg
  mm: drop SECTION_SHIFT in code comments
  mm/page_alloc: introduce vm.percpu_pagelist_high_fraction
  mm/page_alloc: limit the number of pages on PCP lists when reclaim is active
  mm/page_alloc: scale the number of pages that are batch freed
  ...
Linus Torvalds 2021-06-29 17:29:11 -07:00
commit 65090f30ab
259 changed files with 3775 additions and 2822 deletions

View file

@ -3591,6 +3591,12 @@
off: turn off poisoning (default)
on: turn on poisoning
page_reporting.page_reporting_order=
[KNL] Minimal page reporting order
Format: <integer>
Adjust the minimal page reporting order. The page
reporting is disabled when it exceeds (MAX_ORDER-1).
panic= [KNL] Kernel behaviour on panic: delay <timeout>
timeout > 0: seconds before rebooting
timeout = 0: wait forever

View file

@ -39,7 +39,7 @@ in principle, they should work in any architecture where these
subsystems are present.
A periodic hrtimer runs to generate interrupts and kick the watchdog
task. An NMI perf event is generated every "watchdog_thresh"
job. An NMI perf event is generated every "watchdog_thresh"
(compile-time initialized to 10 and configurable through sysctl of the
same name) seconds to check for hardlockups. If any CPU in the system
does not receive any hrtimer interrupt during that time the
@ -47,7 +47,7 @@ does not receive any hrtimer interrupt during that time the
generate a kernel warning or call panic, depending on the
configuration.
The watchdog task is a high priority kernel thread that updates a
The watchdog job runs in a stop scheduling thread that updates a
timestamp every time it is scheduled. If that timestamp is not updated
for 2*watchdog_thresh seconds (the softlockup threshold) the
'softlockup detector' (coded inside the hrtimer callback function)
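
Note: the softlockup check described in this hunk can be sketched in a few lines. This is a minimal user-space model, not the kernel's implementation. The names `softlockup_threshold()` and `is_softlockup()` are hypothetical, time is assumed to be in whole seconds, and the comparison is assumed to be strict (the timestamp must be older than the threshold).

```c
#include <assert.h>
#include <stdbool.h>

/* Softlockup threshold in seconds: twice watchdog_thresh, per the text. */
static unsigned long softlockup_threshold(unsigned long watchdog_thresh)
{
	return 2 * watchdog_thresh;
}

/*
 * Return true if the timestamp last written by the watchdog job is
 * older than the softlockup threshold. In the real detector this test
 * runs inside the hrtimer callback; here it is a plain function.
 */
static bool is_softlockup(unsigned long touch_ts, unsigned long now,
			  unsigned long watchdog_thresh)
{
	assert(now >= touch_ts);	/* model assumption: time does not go backwards */
	return now - touch_ts > softlockup_threshold(watchdog_thresh);
}
```

With the default `watchdog_thresh` of 10, a timestamp that is 20 seconds old is still acceptable in this model, while one that is 21 seconds old is reported.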

View file

@ -1297,11 +1297,11 @@ This parameter can be used to control the soft lockup detector.
= =================================
The soft lockup detector monitors CPUs for threads that are hogging the CPUs
without rescheduling voluntarily, and thus prevent the 'watchdog/N' threads
from running. The mechanism depends on the CPUs ability to respond to timer
interrupts which are needed for the 'watchdog/N' threads to be woken up by
the watchdog timer function, otherwise the NMI watchdog — if enabled — can
detect a hard lockup condition.
without rescheduling voluntarily, and thus prevent the 'migration/N' threads
from running, causing the watchdog work to fail to execute. The mechanism depends
on the CPU's ability to respond to timer interrupts which are needed for the
watchdog work to be queued by the watchdog timer function, otherwise the NMI
watchdog — if enabled — can detect a hard lockup condition.
stack_erasing

View file

@ -64,7 +64,7 @@ Currently, these files are in /proc/sys/vm:
- overcommit_ratio
- page-cluster
- panic_on_oom
- percpu_pagelist_fraction
- percpu_pagelist_high_fraction
- stat_interval
- stat_refresh
- numa_stat
@ -790,22 +790,24 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.
percpu_pagelist_fraction
========================
percpu_pagelist_high_fraction
=============================
This is the fraction of pages at most (high mark pcp->high) in each zone that
are allocated for each per cpu page list. The min value for this is 8. It
means that we don't allow more than 1/8th of pages in each zone to be
allocated in any single per_cpu_pagelist. This entry only changes the value
of hot per cpu pagelists. User can specify a number like 100 to allocate
1/100th of each zone to each per cpu page list.
This is the fraction of pages in each zone that can be stored on
per-cpu page lists. It is an upper boundary that is divided depending
on the number of online CPUs. The min value for this is 8 which means
that we do not allow more than 1/8th of pages in each zone to be stored
on per-cpu page lists. This entry only changes the value of hot per-cpu
page lists. A user can specify a number like 100 to allocate 1/100th of
each zone between per-cpu lists.
The batch value of each per cpu pagelist is also updated as a result. It is
set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8)
The batch value of each per-cpu page list remains the same regardless of
the value of the high fraction so allocation latencies are unaffected.
The initial value is zero. Kernel does not use this value at boot time to set
the high water marks for each per cpu page list. If the user writes '0' to this
sysctl, it will revert to this default behavior.
The initial value is zero. Kernel uses this value to set the high pcp->high
mark based on the low watermark for the zone and the number of local
online CPUs. If the user writes '0' to this sysctl, it will revert to
this default behavior.
stat_interval
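
Note: the new percpu_pagelist_high_fraction text describes an upper bound that is split across online CPUs. The arithmetic can be sketched as below. This is an illustrative model under stated assumptions, not the kernel's code: `pcp_high_model()` and its parameters are hypothetical names, and any additional floor or clamp the kernel may apply to the result is deliberately left out.

```c
#include <assert.h>

/*
 * Model of the per-CPU list upper bound described above.
 *
 * managed_pages: pages managed by the zone
 * low_wmark:     the zone's low watermark, in pages
 * fraction:      value of vm.percpu_pagelist_high_fraction (0 = default)
 * nr_cpus:       online CPUs local to the zone's node
 */
static unsigned long pcp_high_model(unsigned long managed_pages,
				    unsigned long low_wmark,
				    unsigned int fraction,
				    unsigned int nr_cpus)
{
	unsigned long total;

	assert(nr_cpus > 0);
	assert(fraction == 0 || fraction >= 8);	/* documented minimum is 8 */

	if (fraction == 0)
		total = low_wmark;		/* default: based on the low watermark */
	else
		total = managed_pages / fraction;	/* 1/fraction of the zone */

	/* The bound is shared between the CPUs rather than given to each. */
	return total / nr_cpus;
}
```

For a zone of 1,000,000 pages and 4 local CPUs, writing 100 to the sysctl gives each CPU a limit of 2,500 pages in this model, that is, 1/100th of the zone divided four ways.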
@ -936,12 +938,12 @@ allocations, THP and hugetlbfs pages.
To make it sensible with respect to the watermark_scale_factor
parameter, the unit is in fractions of 10,000. The default value of
15,000 on !DISCONTIGMEM configurations means that up to 150% of the high
watermark will be reclaimed in the event of a pageblock being mixed due
to fragmentation. The level of reclaim is determined by the number of
fragmentation events that occurred in the recent past. If this value is
smaller than a pageblock then a pageblocks worth of pages will be reclaimed
(e.g. 2MB on 64-bit x86). A boost factor of 0 will disable the feature.
15,000 means that up to 150% of the high watermark will be reclaimed in the
event of a pageblock being mixed due to fragmentation. The level of reclaim
is determined by the number of fragmentation events that occurred in the
recent past. If this value is smaller than a pageblock then a pageblock's
worth of pages will be reclaimed (e.g. 2MB on 64-bit x86). A boost factor
of 0 will disable the feature.
watermark_scale_factor
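
Note: the boost factor arithmetic in the hunk above (units of 10,000, a floor of one pageblock, 0 disables) can be sketched as follows. This is a minimal model for illustration. `max_boost_pages()` and `BOOST_FACTOR_UNIT` are hypothetical names, and the split multiply is only there to avoid intermediate overflow in this sketch.

```c
#include <assert.h>

#define BOOST_FACTOR_UNIT 10000UL	/* the sysctl is in fractions of 10,000 */

/*
 * Upper bound, in pages, on how far the watermark may be boosted.
 *
 * high_wmark:      the zone's high watermark, in pages
 * boost_factor:    value of the boost factor sysctl
 * pageblock_pages: number of pages in one pageblock
 */
static unsigned long max_boost_pages(unsigned long high_wmark,
				     unsigned long boost_factor,
				     unsigned long pageblock_pages)
{
	unsigned long quot, rem, boost;

	assert(pageblock_pages > 0);

	if (boost_factor == 0)
		return 0;	/* a factor of 0 disables the feature */

	/* high_wmark * boost_factor / 10000 without overflowing early */
	quot = high_wmark / BOOST_FACTOR_UNIT;
	rem = high_wmark % BOOST_FACTOR_UNIT;
	boost = quot * boost_factor + rem * boost_factor / BOOST_FACTOR_UNIT;

	/* Never reclaim less than one pageblock's worth of pages. */
	return boost < pageblock_pages ? pageblock_pages : boost;
}
```

With the default of 15,000 and a high watermark of 1,000 pages, the bound is 1,500 pages (150%). With a high watermark of only 100 pages, the result is raised to one pageblock, for example 512 pages when a pageblock is 2MB and pages are 4KB.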

View file

@ -447,11 +447,10 @@ When a test fails due to a failed ``kmalloc``::
When a test fails due to a missing KASAN report::
# kmalloc_double_kzfree: EXPECTATION FAILED at lib/test_kasan.c:629
Expected kasan_data->report_expected == kasan_data->report_found, but
kasan_data->report_expected == 1
kasan_data->report_found == 0
not ok 28 - kmalloc_double_kzfree
# kmalloc_double_kzfree: EXPECTATION FAILED at lib/test_kasan.c:974
KASAN failure expected in "kfree_sensitive(ptr)", but none occurred
not ok 44 - kmalloc_double_kzfree
At the end the cumulative status of all KASAN tests is printed. On success::

View file

@ -14,15 +14,11 @@ for the CPU. Then there could be several contiguous ranges at
completely distinct addresses. And, don't forget about NUMA, where
different memory banks are attached to different CPUs.
Linux abstracts this diversity using one of the three memory models:
FLATMEM, DISCONTIGMEM and SPARSEMEM. Each architecture defines what
Linux abstracts this diversity using one of the two memory models:
FLATMEM and SPARSEMEM. Each architecture defines what
memory models it supports, what the default memory model is and
whether it is possible to manually override that default.
.. note::
At time of this writing, DISCONTIGMEM is considered deprecated,
although it is still in use by several architectures.
All the memory models track the status of physical page frames using
struct page arranged in one or more arrays.
@ -63,43 +59,6 @@ straightforward: `PFN - ARCH_PFN_OFFSET` is an index to the
The `ARCH_PFN_OFFSET` defines the first page frame number for
systems with physical memory starting at address different from 0.
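
Note: the FLATMEM conversion mentioned in the context lines above, where `PFN - ARCH_PFN_OFFSET` indexes a single array, can be sketched as below. This is a self-contained user-space model. The `struct page` layout, the `ARCH_PFN_OFFSET` value and the array size are placeholders chosen for illustration, not values from any real architecture.

```c
#include <assert.h>

struct page {
	unsigned long flags;		/* placeholder field for the model */
};

#define ARCH_PFN_OFFSET	0x80000UL	/* hypothetical first valid PFN */
#define NR_PAGES	16UL		/* hypothetical amount of memory */

/* One flat array covering all physical page frames. */
static struct page mem_map[NR_PAGES];

/* PFN to struct page: subtract the offset and index the array. */
static struct page *pfn_to_page(unsigned long pfn)
{
	assert(pfn >= ARCH_PFN_OFFSET && pfn - ARCH_PFN_OFFSET < NR_PAGES);
	return &mem_map[pfn - ARCH_PFN_OFFSET];
}

/* struct page to PFN: the array index plus the offset. */
static unsigned long page_to_pfn(const struct page *page)
{
	return (unsigned long)(page - mem_map) + ARCH_PFN_OFFSET;
}
```

Both directions are constant-time pointer arithmetic, which is why FLATMEM is the simplest of the memory models.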
DISCONTIGMEM
============
The DISCONTIGMEM model treats the physical memory as a collection of
`nodes` similarly to how Linux NUMA support does. For each node Linux
constructs an independent memory management subsystem represented by
`struct pglist_data` (or `pg_data_t` for short). Among other
things, `pg_data_t` holds the `node_mem_map` array that maps
physical pages belonging to that node. The `node_start_pfn` field of
`pg_data_t` is the number of the first page frame belonging to that
node.
The architecture setup code should call :c:func:`free_area_init_node` for
each node in the system to initialize the `pg_data_t` object and its
`node_mem_map`.
Every `node_mem_map` behaves exactly as FLATMEM's `mem_map` -
every physical page frame in a node has a `struct page` entry in the
`node_mem_map` array. When DISCONTIGMEM is enabled, a portion of the
`flags` field of the `struct page` encodes the node number of the
node hosting that page.
The conversion between a PFN and the `struct page` in the
DISCONTIGMEM model became slightly more complex as it has to determine
which node hosts the physical page and which `pg_data_t` object
holds the `struct page`.
Architectures that support DISCONTIGMEM provide :c:func:`pfn_to_nid`
to convert PFN to the node number. The opposite conversion helper
:c:func:`page_to_nid` is generic as it uses the node number encoded in
page->flags.
Once the node number is known, the PFN can be used to index
appropriate `node_mem_map` array to access the `struct page` and
the offset of the `struct page` from the `node_mem_map` plus
`node_start_pfn` is the PFN of that page.
SPARSEMEM
=========

View file

@ -549,29 +549,12 @@ config NR_CPUS
MARVEL support can handle a maximum of 32 CPUs, all the others
with working support have a maximum of 4 CPUs.
config ARCH_DISCONTIGMEM_ENABLE
bool "Discontiguous Memory Support"
depends on BROKEN
help
Say Y to support efficient handling of discontiguous physical memory,
for architectures which are either NUMA (Non-Uniform Memory Access)
or have huge holes in the physical address space for other reasons.
See <file:Documentation/vm/numa.rst> for more.
config ARCH_SPARSEMEM_ENABLE
bool "Sparse Memory Support"
help
Say Y to support efficient handling of discontiguous physical memory,
for systems that have huge holes in the physical address space.
config NUMA
bool "NUMA Support (EXPERIMENTAL)"
depends on DISCONTIGMEM && BROKEN
help
Say Y to compile the kernel to support NUMA (Non-Uniform Memory
Access). This option is for configuring high-end multiprocessor
server machines. If in doubt, say N.
config ALPHA_WTINT
bool "Use WTINT" if ALPHA_SRM || ALPHA_GENERIC
default y if ALPHA_QEMU
@ -596,11 +579,6 @@ config ALPHA_WTINT
If unsure, say N.
config NODES_SHIFT
int
default "7"
depends on NEED_MULTIPLE_NODES
# LARGE_VMALLOC is racy, if you *really* need it then fix it first
config ALPHA_LARGE_VMALLOC
bool

View file

@ -99,12 +99,6 @@ struct alpha_machine_vector
const char *vector_name;
/* NUMA information */
int (*pa_to_nid)(unsigned long);
int (*cpuid_to_nid)(int);
unsigned long (*node_mem_start)(int);
unsigned long (*node_mem_size)(int);
/* System specific parameters. */
union {
struct {

View file

@ -1,100 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Written by Kanoj Sarcar (kanoj@sgi.com) Aug 99
* Adapted for the alpha wildfire architecture Jan 2001.
*/
#ifndef _ASM_MMZONE_H_
#define _ASM_MMZONE_H_
#ifdef CONFIG_DISCONTIGMEM
#include <asm/smp.h>
/*
* Following are macros that are specific to this numa platform.
*/
extern pg_data_t node_data[];
#define alpha_pa_to_nid(pa) \
(alpha_mv.pa_to_nid \
? alpha_mv.pa_to_nid(pa) \
: (0))
#define node_mem_start(nid) \
(alpha_mv.node_mem_start \
? alpha_mv.node_mem_start(nid) \
: (0UL))
#define node_mem_size(nid) \
(alpha_mv.node_mem_size \
? alpha_mv.node_mem_size(nid) \
: ((nid) ? (0UL) : (~0UL)))
#define pa_to_nid(pa) alpha_pa_to_nid(pa)
#define NODE_DATA(nid) (&node_data[(nid)])
#define node_localnr(pfn, nid) ((pfn) - NODE_DATA(nid)->node_start_pfn)
#if 1
#define PLAT_NODE_DATA_LOCALNR(p, n) \
(((p) >> PAGE_SHIFT) - PLAT_NODE_DATA(n)->gendata.node_start_pfn)
#else
static inline unsigned long
PLAT_NODE_DATA_LOCALNR(unsigned long p, int n)
{
unsigned long temp;
temp = p >> PAGE_SHIFT;
return temp - PLAT_NODE_DATA(n)->gendata.node_start_pfn;
}
#endif
/*
* Following are macros that each numa implementation must define.
*/
/*
* Given a kernel address, find the home node of the underlying memory.
*/
#define kvaddr_to_nid(kaddr) pa_to_nid(__pa(kaddr))
/*
* Given a kaddr, LOCAL_BASE_ADDR finds the owning node of the memory
* and returns the kaddr corresponding to first physical page in the
* node's mem_map.
*/
#define LOCAL_BASE_ADDR(kaddr) \
((unsigned long)__va(NODE_DATA(kvaddr_to_nid(kaddr))->node_start_pfn \
<< PAGE_SHIFT))
/* XXX: FIXME -- nyc */
#define kern_addr_valid(kaddr) (0)
#define mk_pte(page, pgprot) \
({ \
pte_t pte; \
unsigned long pfn; \
\
pfn = page_to_pfn(page) << 32; \
pte_val(pte) = pfn | pgprot_val(pgprot); \
\
pte; \
})
#define pte_page(x) \
({ \
unsigned long kvirt; \
struct page * __xx; \
\
kvirt = (unsigned long)__va(pte_val(x) >> (32-PAGE_SHIFT)); \
__xx = virt_to_page(kvirt); \
\
__xx; \
})
#define pfn_to_nid(pfn) pa_to_nid(((u64)(pfn) << PAGE_SHIFT))
#define pfn_valid(pfn) \
(((pfn) - node_start_pfn(pfn_to_nid(pfn))) < \
node_spanned_pages(pfn_to_nid(pfn))) \
#endif /* CONFIG_DISCONTIGMEM */
#endif /* _ASM_MMZONE_H_ */

View file

@ -206,7 +206,6 @@ extern unsigned long __zero_page(void);
#define page_to_pa(page) (page_to_pfn(page) << PAGE_SHIFT)
#define pte_pfn(pte) (pte_val(pte) >> 32)
#ifndef CONFIG_DISCONTIGMEM
#define pte_page(pte) pfn_to_page(pte_pfn(pte))
#define mk_pte(page, pgprot) \
({ \
@ -215,7 +214,6 @@ extern unsigned long __zero_page(void);
pte_val(pte) = (page_to_pfn(page) << 32) | pgprot_val(pgprot); \
pte; \
})
#endif
extern inline pte_t pfn_pte(unsigned long physpfn, pgprot_t pgprot)
{ pte_t pte; pte_val(pte) = (PHYS_TWIDDLE(physpfn) << 32) | pgprot_val(pgprot); return pte; }
@ -330,9 +328,7 @@ extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
#define __swp_entry_to_pte(x) ((pte_t) { (x).val })
#ifndef CONFIG_DISCONTIGMEM
#define kern_addr_valid(addr) (1)
#endif
#define pte_ERROR(e) \
printk("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))

View file

@ -7,45 +7,6 @@
#include <linux/numa.h>
#include <asm/machvec.h>
#ifdef CONFIG_NUMA
static inline int cpu_to_node(int cpu)
{
int node;
if (!alpha_mv.cpuid_to_nid)
return 0;
node = alpha_mv.cpuid_to_nid(cpu);
#ifdef DEBUG_NUMA
BUG_ON(node < 0);
#endif
return node;
}
extern struct cpumask node_to_cpumask_map[];
/* FIXME: This is dumb, recalculating every time. But simple. */
static const struct cpumask *cpumask_of_node(int node)
{
int cpu;
if (node == NUMA_NO_NODE)
return cpu_all_mask;
cpumask_clear(&node_to_cpumask_map[node]);
for_each_online_cpu(cpu) {
if (cpu_to_node(cpu) == node)
cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
}
return &node_to_cpumask_map[node];
}
#define cpumask_of_pcibus(bus) (cpu_online_mask)
#endif /* !CONFIG_NUMA */
# include <asm-generic/topology.h>
#endif /* _ASM_ALPHA_TOPOLOGY_H */

View file

@ -287,8 +287,7 @@ io7_init_hose(struct io7 *io7, int port)
/*
* Set up window 0 for scatter-gather 8MB at 8MB.
*/
hose->sg_isa = iommu_arena_new_node(marvel_cpuid_to_nid(io7->pe),
hose, 0x00800000, 0x00800000, 0);
hose->sg_isa = iommu_arena_new_node(0, hose, 0x00800000, 0x00800000, 0);
hose->sg_isa->align_entry = 8; /* cache line boundary */
csrs->POx_WBASE[0].csr =
hose->sg_isa->dma_base | wbase_m_ena | wbase_m_sg;
@ -305,8 +304,7 @@ io7_init_hose(struct io7 *io7, int port)
/*
* Set up window 2 for scatter-gather (up-to) 1GB at 3GB.
*/
hose->sg_pci = iommu_arena_new_node(marvel_cpuid_to_nid(io7->pe),
hose, 0xc0000000, 0x40000000, 0);
hose->sg_pci = iommu_arena_new_node(0, hose, 0xc0000000, 0x40000000, 0);
hose->sg_pci->align_entry = 8; /* cache line boundary */
csrs->POx_WBASE[2].csr =
hose->sg_pci->dma_base | wbase_m_ena | wbase_m_sg;
@ -843,53 +841,8 @@ EXPORT_SYMBOL(marvel_ioportmap);
EXPORT_SYMBOL(marvel_ioread8);
EXPORT_SYMBOL(marvel_iowrite8);
#endif
/*
* NUMA Support
*/
/**********
* FIXME - for now each cpu is a node by itself
* -- no real support for striped mode
**********
*/
int
marvel_pa_to_nid(unsigned long pa)
{
int cpuid;
if ((pa >> 43) & 1) /* I/O */
cpuid = (~(pa >> 35) & 0xff);
else /* mem */
cpuid = ((pa >> 34) & 0x3) | ((pa >> (37 - 2)) & (0x1f << 2));
return marvel_cpuid_to_nid(cpuid);
}
int
marvel_cpuid_to_nid(int cpuid)
{
return cpuid;
}
unsigned long
marvel_node_mem_start(int nid)
{
unsigned long pa;
pa = (nid & 0x3) | ((nid & (0x1f << 2)) << 1);
pa <<= 34;
return pa;
}
unsigned long
marvel_node_mem_size(int nid)
{
return 16UL * 1024 * 1024 * 1024; /* 16GB */
}
/*
* AGP GART Support.
*/
#include <linux/agp_backend.h>

View file

@ -434,39 +434,12 @@ wildfire_write_config(struct pci_bus *bus, unsigned int devfn, int where,
return PCIBIOS_SUCCESSFUL;
}
struct pci_ops wildfire_pci_ops =
struct pci_ops wildfire_pci_ops =
{
.read = wildfire_read_config,
.write = wildfire_write_config,
};
/*
* NUMA Support
*/
int wildfire_pa_to_nid(unsigned long pa)
{
return pa >> 36;
}
int wildfire_cpuid_to_nid(int cpuid)
{
/* assume 4 CPUs per node */
return cpuid >> 2;
}
unsigned long wildfire_node_mem_start(int nid)
{
/* 64GB per node */
return (unsigned long)nid * (64UL * 1024 * 1024 * 1024);
}
unsigned long wildfire_node_mem_size(int nid)
{
/* 64GB per node */
return 64UL * 1024 * 1024 * 1024;
}
#if DEBUG_DUMP_REGS
static void __init

View file

@ -71,33 +71,6 @@ iommu_arena_new_node(int nid, struct pci_controller *hose, dma_addr_t base,
if (align < mem_size)
align = mem_size;
#ifdef CONFIG_DISCONTIGMEM
arena = memblock_alloc_node(sizeof(*arena), align, nid);
if (!NODE_DATA(nid) || !arena) {
printk("%s: couldn't allocate arena from node %d\n"
" falling back to system-wide allocation\n",
__func__, nid);
arena = memblock_alloc(sizeof(*arena), SMP_CACHE_BYTES);
if (!arena)
panic("%s: Failed to allocate %zu bytes\n", __func__,
sizeof(*arena));
}
arena->ptes = memblock_alloc_node(sizeof(*arena), align, nid);
if (!NODE_DATA(nid) || !arena->ptes) {
printk("%s: couldn't allocate arena ptes from node %d\n"
" falling back to system-wide allocation\n",
__func__, nid);
arena->ptes = memblock_alloc(mem_size, align);
if (!arena->ptes)
panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
__func__, mem_size, align);
}
#else /* CONFIG_DISCONTIGMEM */
arena = memblock_alloc(sizeof(*arena), SMP_CACHE_BYTES);
if (!arena)
panic("%s: Failed to allocate %zu bytes\n", __func__,
@ -107,8 +80,6 @@ iommu_arena_new_node(int nid, struct pci_controller *hose, dma_addr_t base,
panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
__func__, mem_size, align);
#endif /* CONFIG_DISCONTIGMEM */
spin_lock_init(&arena->lock);
arena->hose = hose;
arena->dma_base = base;

View file

@ -49,10 +49,6 @@ extern void marvel_init_arch(void);
extern void marvel_kill_arch(int);
extern void marvel_machine_check(unsigned long, unsigned long);
extern void marvel_pci_tbi(struct pci_controller *, dma_addr_t, dma_addr_t);
extern int marvel_pa_to_nid(unsigned long);
extern int marvel_cpuid_to_nid(int);
extern unsigned long marvel_node_mem_start(int);
extern unsigned long marvel_node_mem_size(int);
extern struct _alpha_agp_info *marvel_agp_info(void);
struct io7 *marvel_find_io7(int pe);
struct io7 *marvel_next_io7(struct io7 *prev);
@ -101,10 +97,6 @@ extern void wildfire_init_arch(void);
extern void wildfire_kill_arch(int);
extern void wildfire_machine_check(unsigned long vector, unsigned long la_ptr);
extern void wildfire_pci_tbi(struct pci_controller *, dma_addr_t, dma_addr_t);
extern int wildfire_pa_to_nid(unsigned long);
extern int wildfire_cpuid_to_nid(int);
extern unsigned long wildfire_node_mem_start(int);
extern unsigned long wildfire_node_mem_size(int);
/* console.c */
#ifdef CONFIG_VGA_HOSE

View file

@ -79,11 +79,6 @@ int alpha_l3_cacheshape;
unsigned long alpha_verbose_mcheck = CONFIG_VERBOSE_MCHECK_ON;
#endif
#ifdef CONFIG_NUMA
struct cpumask node_to_cpumask_map[MAX_NUMNODES] __read_mostly;
EXPORT_SYMBOL(node_to_cpumask_map);
#endif
/* Which processor we booted from. */
int boot_cpuid;
@ -305,7 +300,6 @@ move_initrd(unsigned long mem_limit)
}
#endif
#ifndef CONFIG_DISCONTIGMEM
static void __init
setup_memory(void *kernel_end)
{
@ -389,9 +383,6 @@ setup_memory(void *kernel_end)
}
#endif /* CONFIG_BLK_DEV_INITRD */
}
#else
extern void setup_memory(void *);
#endif /* !CONFIG_DISCONTIGMEM */
int __init
page_is_ram(unsigned long pfn)
@ -618,13 +609,6 @@ setup_arch(char **cmdline_p)
"VERBOSE_MCHECK "
#endif
#ifdef CONFIG_DISCONTIGMEM
"DISCONTIGMEM "
#ifdef CONFIG_NUMA
"NUMA "
#endif
#endif
#ifdef CONFIG_DEBUG_SPINLOCK
"DEBUG_SPINLOCK "
#endif

View file

@ -461,10 +461,5 @@ struct alpha_machine_vector marvel_ev7_mv __initmv = {
.kill_arch = marvel_kill_arch,
.pci_map_irq = marvel_map_irq,
.pci_swizzle = common_swizzle,
.pa_to_nid = marvel_pa_to_nid,
.cpuid_to_nid = marvel_cpuid_to_nid,
.node_mem_start = marvel_node_mem_start,
.node_mem_size = marvel_node_mem_size,
};
ALIAS_MV(marvel_ev7)

View file

@ -337,10 +337,5 @@ struct alpha_machine_vector wildfire_mv __initmv = {
.kill_arch = wildfire_kill_arch,
.pci_map_irq = wildfire_map_irq,
.pci_swizzle = common_swizzle,
.pa_to_nid = wildfire_pa_to_nid,
.cpuid_to_nid = wildfire_cpuid_to_nid,
.node_mem_start = wildfire_node_mem_start,
.node_mem_size = wildfire_node_mem_size,
};
ALIAS_MV(wildfire)

View file

@ -6,5 +6,3 @@
ccflags-y := -Werror
obj-y := init.o fault.o
obj-$(CONFIG_DISCONTIGMEM) += numa.o

View file

@ -235,8 +235,6 @@ callback_init(void * kernel_end)
return kernel_end;
}
#ifndef CONFIG_DISCONTIGMEM
/*
* paging_init() sets up the memory map.
*/
@ -257,7 +255,6 @@ void __init paging_init(void)
/* Initialize the kernel's ZERO_PGE. */
memset((void *)ZERO_PGE, 0, PAGE_SIZE);
}
#endif /* CONFIG_DISCONTIGMEM */
#if defined(CONFIG_ALPHA_GENERIC) || defined(CONFIG_ALPHA_SRM)
void

View file

@ -1,223 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
/*
* linux/arch/alpha/mm/numa.c
*
* DISCONTIGMEM NUMA alpha support.
*
* Copyright (C) 2001 Andrea Arcangeli <andrea@suse.de> SuSE
*/
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/memblock.h>
#include <linux/swap.h>
#include <linux/initrd.h>
#include <linux/pfn.h>
#include <linux/module.h>
#include <asm/hwrpb.h>
#include <asm/sections.h>
pg_data_t node_data[MAX_NUMNODES];
EXPORT_SYMBOL(node_data);
#undef DEBUG_DISCONTIG
#ifdef DEBUG_DISCONTIG
#define DBGDCONT(args...) printk(args)
#else
#define DBGDCONT(args...)
#endif
#define for_each_mem_cluster(memdesc, _cluster, i) \
for ((_cluster) = (memdesc)->cluster, (i) = 0; \
(i) < (memdesc)->numclusters; (i)++, (_cluster)++)
static void __init show_mem_layout(void)
{
struct memclust_struct * cluster;
struct memdesc_struct * memdesc;
int i;
/* Find free clusters, and init and free the bootmem accordingly. */
memdesc = (struct memdesc_struct *)
(hwrpb->mddt_offset + (unsigned long) hwrpb);
printk("Raw memory layout:\n");
for_each_mem_cluster(memdesc, cluster, i) {
printk(" memcluster %2d, usage %1lx, start %8lu, end %8lu\n",
i, cluster->usage, cluster->start_pfn,
cluster->start_pfn + cluster->numpages);
}
}
static void __init
setup_memory_node(int nid, void *kernel_end)
{
extern unsigned long mem_size_limit;
struct memclust_struct * cluster;
struct memdesc_struct * memdesc;
unsigned long start_kernel_pfn, end_kernel_pfn;
unsigned long start, end;
unsigned long node_pfn_start, node_pfn_end;
unsigned long node_min_pfn, node_max_pfn;
int i;
int show_init = 0;
/* Find the bounds of current node */
node_pfn_start = (node_mem_start(nid)) >> PAGE_SHIFT;
node_pfn_end = node_pfn_start + (node_mem_size(nid) >> PAGE_SHIFT);
/* Find free clusters, and init and free the bootmem accordingly. */
memdesc = (struct memdesc_struct *)
(hwrpb->mddt_offset + (unsigned long) hwrpb);
/* find the bounds of this node (node_min_pfn/node_max_pfn) */
node_min_pfn = ~0UL;
node_max_pfn = 0UL;
for_each_mem_cluster(memdesc, cluster, i) {
/* Bit 0 is console/PALcode reserved. Bit 1 is
non-volatile memory -- we might want to mark
this for later. */
if (cluster->usage & 3)
continue;
start = cluster->start_pfn;
end = start + cluster->numpages;
if (start >= node_pfn_end || end <= node_pfn_start)
continue;
if (!show_init) {
show_init = 1;
printk("Initializing bootmem allocator on Node ID %d\n", nid);
}
printk(" memcluster %2d, usage %1lx, start %8lu, end %8lu\n",
i, cluster->usage, cluster->start_pfn,
cluster->start_pfn + cluster->numpages);
if (start < node_pfn_start)
start = node_pfn_start;
if (end > node_pfn_end)
end = node_pfn_end;
if (start < node_min_pfn)
node_min_pfn = start;
if (end > node_max_pfn)
node_max_pfn = end;
}
if (mem_size_limit && node_max_pfn > mem_size_limit) {
static int msg_shown = 0;
if (!msg_shown) {
msg_shown = 1;
printk("setup: forcing memory size to %ldK (from %ldK).\n",
mem_size_limit << (PAGE_SHIFT - 10),
node_max_pfn << (PAGE_SHIFT - 10));
}
node_max_pfn = mem_size_limit;
}
if (node_min_pfn >= node_max_pfn)
return;
/* Update global {min,max}_low_pfn from node information. */
if (node_min_pfn < min_low_pfn)
min_low_pfn = node_min_pfn;
if (node_max_pfn > max_low_pfn)
max_pfn = max_low_pfn = node_max_pfn;
#if 0 /* we'll try this one again in a little while */
/* Cute trick to make sure our local node data is on local memory */
node_data[nid] = (pg_data_t *)(__va(node_min_pfn << PAGE_SHIFT));
#endif
printk(" Detected node memory: start %8lu, end %8lu\n",
node_min_pfn, node_max_pfn);
DBGDCONT(" DISCONTIG: node_data[%d] is at 0x%p\n", nid, NODE_DATA(nid));
/* Find the bounds of kernel memory. */
start_kernel_pfn = PFN_DOWN(KERNEL_START_PHYS);
end_kernel_pfn = PFN_UP(virt_to_phys(kernel_end));
if (!nid && (node_max_pfn < end_kernel_pfn || node_min_pfn > start_kernel_pfn))
panic("kernel loaded out of ram");
memblock_add_node(PFN_PHYS(node_min_pfn),
(node_max_pfn - node_min_pfn) << PAGE_SHIFT, nid);
/* Zone start phys-addr must be 2^(MAX_ORDER-1) aligned.
Note that we round this down, not up - node memory
has much larger alignment than 8Mb, so it's safe. */
node_min_pfn &= ~((1UL << (MAX_ORDER-1))-1);
NODE_DATA(nid)->node_start_pfn = node_min_pfn;
NODE_DATA(nid)->node_present_pages = node_max_pfn - node_min_pfn;
node_set_online(nid);
}
void __init
setup_memory(void *kernel_end)
{
unsigned long kernel_size;
int nid;
show_mem_layout();
nodes_clear(node_online_map);
min_low_pfn = ~0UL;
max_low_pfn = 0UL;
for (nid = 0; nid < MAX_NUMNODES; nid++)
setup_memory_node(nid, kernel_end);
kernel_size = virt_to_phys(kernel_end) - KERNEL_START_PHYS;
memblock_reserve(KERNEL_START_PHYS, kernel_size);
#ifdef CONFIG_BLK_DEV_INITRD
initrd_start = INITRD_START;
if (initrd_start) {
extern void *move_initrd(unsigned long);
initrd_end = initrd_start+INITRD_SIZE;
printk("Initial ramdisk at: 0x%p (%lu bytes)\n",
(void *) initrd_start, INITRD_SIZE);
if ((void *)initrd_end > phys_to_virt(PFN_PHYS(max_low_pfn))) {
if (!move_initrd(PFN_PHYS(max_low_pfn)))
printk("initrd extends beyond end of memory "
"(0x%08lx > 0x%p)\ndisabling initrd\n",
initrd_end,
phys_to_virt(PFN_PHYS(max_low_pfn)));
} else {
nid = kvaddr_to_nid(initrd_start);
memblock_reserve(virt_to_phys((void *)initrd_start),
INITRD_SIZE);
}
}
#endif /* CONFIG_BLK_DEV_INITRD */
}
void __init paging_init(void)
{
unsigned long max_zone_pfn[MAX_NR_ZONES] = {0, };
unsigned long dma_local_pfn;
/*
* The old global MAX_DMA_ADDRESS per-arch API doesn't fit
* in the NUMA model, for now we convert it to a pfn and
* we interpret this pfn as a local per-node information.
* This issue isn't very important since none of these machines
* have legacy ISA slots anyways.
*/
dma_local_pfn = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;
max_zone_pfn[ZONE_DMA] = dma_local_pfn;
max_zone_pfn[ZONE_NORMAL] = max_pfn;
free_area_init(max_zone_pfn);
/* Initialize the kernel's ZERO_PGE. */
memset((void *)ZERO_PGE, 0, PAGE_SIZE);
}

View file

@ -62,10 +62,6 @@ config SCHED_OMIT_FRAME_POINTER
config GENERIC_CSUM
def_bool y
config ARCH_DISCONTIGMEM_ENABLE
def_bool n
depends on BROKEN
config ARCH_FLATMEM_ENABLE
def_bool y
@ -344,15 +340,6 @@ config ARC_HUGEPAGE_16M
endchoice
config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)"
default "0" if !DISCONTIGMEM
default "1" if DISCONTIGMEM
depends on NEED_MULTIPLE_NODES
help
Accessing memory beyond 1GB (with or w/o PAE) requires 2 memory
zones.
config ARC_COMPACT_IRQ_LEVELS
depends on ISA_ARCOMPACT
bool "Setup Timer IRQ as high Priority"

View file

@ -1,40 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2016 Synopsys, Inc. (www.synopsys.com)
*/
#ifndef _ASM_ARC_MMZONE_H
#define _ASM_ARC_MMZONE_H
#ifdef CONFIG_DISCONTIGMEM
extern struct pglist_data node_data[];
#define NODE_DATA(nid) (&node_data[nid])
static inline int pfn_to_nid(unsigned long pfn)
{
int is_end_low = 1;
if (IS_ENABLED(CONFIG_ARC_HAS_PAE40))
is_end_low = pfn <= virt_to_pfn(0xFFFFFFFFUL);
/*
* node 0: lowmem: 0x8000_0000 to 0xFFFF_FFFF
* node 1: HIGHMEM w/o PAE40: 0x0 to 0x7FFF_FFFF
* HIGHMEM with PAE40: 0x1_0000_0000 to ...
*/
if (pfn >= ARCH_PFN_OFFSET && is_end_low)
return 0;
return 1;
}
static inline int pfn_valid(unsigned long pfn)
{
int nid = pfn_to_nid(pfn);
return (pfn <= node_end_pfn(nid));
}
#endif /* CONFIG_DISCONTIGMEM */
#endif

View file

@ -83,12 +83,12 @@ static void show_faulting_vma(unsigned long address)
* non-inclusive vma
*/
mmap_read_lock(active_mm);
vma = find_vma(active_mm, address);
vma = vma_lookup(active_mm, address);
/* check against the find_vma( ) behaviour which returns the next VMA
* if the container VMA is not found
/* Lookup the vma at the address and report if the container VMA is not
* found
*/
if (vma && (vma->vm_start <= address)) {
if (vma) {
char buf[ARC_PATH_MAX];
char *nm = "?";
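
Note: this hunk drops the `vma->vm_start <= address` test because `vma_lookup()` only returns a VMA that actually contains the address, whereas `find_vma()` may return the next VMA above it. The difference can be modelled in user space as below. The `struct vma` type, the `_model` function names and the sample array are hypothetical stand-ins, and the linear search is only for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified VMA covering the half-open range [vm_start, vm_end). */
struct vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Sample address space: two mappings with a hole between them. */
static struct vma sample_vmas[2] = {
	{ 0x1000, 0x2000 },
	{ 0x4000, 0x5000 },
};

/*
 * find_vma() semantics: the first VMA whose end lies above addr.
 * The result may start above addr, i.e. it need not contain it.
 * The array is assumed sorted and non-overlapping.
 */
static struct vma *find_vma_model(struct vma *vmas, size_t n,
				  unsigned long addr)
{
	assert(vmas != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/* vma_lookup() semantics: only a VMA that contains addr, else NULL. */
static struct vma *vma_lookup_model(struct vma *vmas, size_t n,
				    unsigned long addr)
{
	struct vma *vma = find_vma_model(vmas, n, addr);

	if (vma && addr < vma->vm_start)
		vma = NULL;
	return vma;
}
```

For an address in the hole, such as 0x3000, the first helper returns the mapping at 0x4000 while the second returns NULL, so callers of the second form need only a NULL check.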

View file

@ -32,11 +32,6 @@ unsigned long arch_pfn_offset;
EXPORT_SYMBOL(arch_pfn_offset);
#endif
#ifdef CONFIG_DISCONTIGMEM
struct pglist_data node_data[MAX_NUMNODES] __read_mostly;
EXPORT_SYMBOL(node_data);
#endif
long __init arc_get_mem_sz(void)
{
return low_mem_sz;
@ -139,20 +134,14 @@ void __init setup_arch_memory(void)
#ifdef CONFIG_HIGHMEM
/*
* Populate a new node with highmem
*
* On ARC (w/o PAE) HIGHMEM addresses are actually smaller (0 based)
* than addresses in normal ala low memory (0x8000_0000 based).
* than addresses in normal aka low memory (0x8000_0000 based).
* Even with PAE, the huge peripheral space hole would waste a lot of
* mem with single mem_map[]. This warrants a mem_map per region design.
* Thus HIGHMEM on ARC is imlemented with DISCONTIGMEM.
*
* DISCONTIGMEM in turns requires multiple nodes. node 0 above is
* populated with normal memory zone while node 1 only has highmem
* mem with single contiguous mem_map[].
* Thus when HIGHMEM on ARC is enabled the memory map corresponding
* to the hole is freed and ARC specific version of pfn_valid()
* handles the hole in the memory map.
*/
#ifdef CONFIG_DISCONTIGMEM
node_set_online(1);
#endif
min_high_pfn = PFN_DOWN(high_mem_start);
max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);

View file

@ -253,7 +253,7 @@ extern struct cpu_tlb_fns cpu_tlb;
* space.
* - mm - mm_struct describing address space
*
* flush_tlb_range(mm,start,end)
* flush_tlb_range(vma,start,end)
*
* Invalidate a range of TLB entries in the specified
* address space.
@ -261,18 +261,11 @@ extern struct cpu_tlb_fns cpu_tlb;
* - start - start address (may not be aligned)
* - end - end address (exclusive, may not be aligned)
*
* flush_tlb_page(vaddr,vma)
* flush_tlb_page(vma, uaddr)
*
* Invalidate the specified page in the specified address range.
* - vma - vm_area_struct describing address range
* - vaddr - virtual address (may not be aligned)
* - vma - vma_struct describing address range
*
* flush_kern_tlb_page(kaddr)
*
* Invalidate the TLB entry for the specified page. The address
* will be in the kernels virtual memory space. Current uses
* only require the D-TLB to be invalidated.
* - kaddr - Kernel virtual memory address
*/
/*


@ -24,7 +24,7 @@
*
* - start - start address (may not be aligned)
* - end - end address (exclusive, may not be aligned)
* - vma - vma_struct describing address range
* - vma - vm_area_struct describing address range
*
* It is assumed that:
* - the "Invalidate single entry" instruction will invalidate


@ -23,7 +23,7 @@
*
* - start - start address (may not be aligned)
* - end - end address (exclusive, may not be aligned)
* - vma - vma_struct describing address range
* - vma - vm_area_struct describing address range
*
* It is assumed that:
* - the "Invalidate single entry" instruction will invalidate


@ -1035,7 +1035,7 @@ config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)"
range 1 10
default "4"
depends on NEED_MULTIPLE_NODES
depends on NUMA
help
Specify the maximum number of NUMA Nodes available on the target
system. Increases memory reserved to accommodate various tables.


@ -929,7 +929,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
* get block mapping for device MMIO region.
*/
mmap_read_lock(current->mm);
vma = find_vma_intersection(current->mm, hva, hva + 1);
vma = vma_lookup(current->mm, hva);
if (unlikely(!vma)) {
kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
mmap_read_unlock(current->mm);


@ -69,8 +69,6 @@ void __init h8300_fdt_init(void *fdt, char *bootargs)
static void __init bootmem_init(void)
{
struct memblock_region *region;
memory_end = memory_start = 0;
/* Find main memory where is the kernel */


@ -302,7 +302,7 @@ config NODES_SHIFT
int "Max num nodes shift(3-10)"
range 3 10
default "10"
depends on NEED_MULTIPLE_NODES
depends on NUMA
help
This option specifies the maximum number of nodes in your SSI system.
MAX_NUMNODES will be 2^(This value).


@ -1086,7 +1086,7 @@ static inline long ia64_pal_freq_base(unsigned long *platform_base_freq)
/*
* Get the ratios for processor frequency, bus frequency and interval timer to
* to base frequency of the platform
* the base frequency of the platform
*/
static inline s64
ia64_pal_freq_ratios (struct pal_freq_ratio *proc_ratio, struct pal_freq_ratio *bus_ratio,


@ -26,7 +26,7 @@
* the queue, and the other indicating the current tail. The lock is acquired
* by atomically noting the tail and incrementing it by one (thus adding
* ourself to the queue and noting our position), then waiting until the head
* becomes equal to the the initial value of the tail.
* becomes equal to the initial value of the tail.
* The pad bits in the middle are used to prevent the next_ticket number
* overflowing into the now_serving number.
*


@ -257,7 +257,7 @@ static inline int uv_numa_blade_id(void)
return 0;
}
/* Convert a cpu number to the the UV blade number */
/* Convert a cpu number to the UV blade number */
static inline int uv_cpu_to_blade_id(int cpu)
{
return 0;


@ -7,7 +7,7 @@
*
* This stub allows us to make EFI calls in physical mode with interrupts
* turned off. We need this because we can't call SetVirtualMap() until
* the kernel has booted far enough to allow allocation of struct vma_struct
* the kernel has booted far enough to allow allocation of struct vm_area_struct
* entries (which we would need to map stuff with memory attributes other
* than uncached or writeback...). Since the GetTime() service gets called
* earlier than that, we need to be able to make physical mode EFI calls from


@ -343,7 +343,7 @@ init_record_index_pools(void)
/* - 2 - */
sect_min_size = sal_log_sect_min_sizes[0];
for (i = 1; i < sizeof sal_log_sect_min_sizes/sizeof(size_t); i++)
for (i = 1; i < ARRAY_SIZE(sal_log_sect_min_sizes); i++)
if (sect_min_size > sal_log_sect_min_sizes[i])
sect_min_size = sal_log_sect_min_sizes[i];


@ -3,9 +3,8 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* This file contains NUMA specific variables and functions which can
* be split away from DISCONTIGMEM and are used on NUMA machines with
* contiguous memory.
* This file contains NUMA specific variables and functions which are used on
* NUMA machines with contiguous memory.
* 2002/08/07 Erich Focht <efocht@ess.nec.de>
* Populate cpu entries in sysfs for non-numa systems as well
* Intel Corporation - Ashok Raj


@ -3,9 +3,8 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* This file contains NUMA specific variables and functions which can
* be split away from DISCONTIGMEM and are used on NUMA machines with
* contiguous memory.
* This file contains NUMA specific variables and functions which are used on
* NUMA machines with contiguous memory.
*
* 2002/08/07 Erich Focht <efocht@ess.nec.de>
*/


@ -408,10 +408,6 @@ config SINGLE_MEMORY_CHUNK
order" to save memory that could be wasted for unused memory map.
Say N if not sure.
config ARCH_DISCONTIGMEM_ENABLE
depends on BROKEN
def_bool MMU && !SINGLE_MEMORY_CHUNK
config FORCE_MAX_ZONEORDER
int "Maximum zone order" if ADVANCED
depends on !SINGLE_MEMORY_CHUNK
@ -451,11 +447,6 @@ config M68K_L2_CACHE
depends on MAC
default y
config NODES_SHIFT
int
default "3"
depends on DISCONTIGMEM
config CPU_HAS_NO_BITFIELDS
bool
@ -553,4 +544,3 @@ config CACHE_COPYBACK
The ColdFire CPU cache is set into Copy-back mode.
endchoice
endif


@ -1,10 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_M68K_MMZONE_H_
#define _ASM_M68K_MMZONE_H_
extern pg_data_t pg_data_map[];
#define NODE_DATA(nid) (&pg_data_map[nid])
#define NODE_MEM_MAP(nid) (NODE_DATA(nid)->node_mem_map)
#endif /* _ASM_M68K_MMZONE_H_ */


@ -62,7 +62,7 @@ extern unsigned long _ramend;
#include <asm/page_no.h>
#endif
#if !defined(CONFIG_MMU) || defined(CONFIG_DISCONTIGMEM)
#ifndef CONFIG_MMU
#define __phys_to_pfn(paddr) ((unsigned long)((paddr) >> PAGE_SHIFT))
#define __pfn_to_phys(pfn) PFN_PHYS(pfn)
#endif


@ -126,26 +126,6 @@ static inline void *__va(unsigned long x)
extern int m68k_virt_to_node_shift;
#ifndef CONFIG_DISCONTIGMEM
#define __virt_to_node(addr) (&pg_data_map[0])
#else
extern struct pglist_data *pg_data_table[];
static inline __attribute_const__ int __virt_to_node_shift(void)
{
int shift;
asm (
"1: moveq #0,%0\n"
m68k_fixup(%c1, 1b)
: "=d" (shift)
: "i" (m68k_fixup_vnode_shift));
return shift;
}
#define __virt_to_node(addr) (pg_data_table[(unsigned long)(addr) >> __virt_to_node_shift()])
#endif
#define virt_to_page(addr) ({ \
pfn_to_page(virt_to_pfn(addr)); \
})
@ -153,23 +133,8 @@ static inline __attribute_const__ int __virt_to_node_shift(void)
pfn_to_virt(page_to_pfn(page)); \
})
#ifdef CONFIG_DISCONTIGMEM
#define pfn_to_page(pfn) ({ \
unsigned long __pfn = (pfn); \
struct pglist_data *pgdat; \
pgdat = __virt_to_node((unsigned long)pfn_to_virt(__pfn)); \
pgdat->node_mem_map + (__pfn - pgdat->node_start_pfn); \
})
#define page_to_pfn(_page) ({ \
const struct page *__p = (_page); \
struct pglist_data *pgdat; \
pgdat = &pg_data_map[page_to_nid(__p)]; \
((__p) - pgdat->node_mem_map) + pgdat->node_start_pfn; \
})
#else
#define ARCH_PFN_OFFSET (m68k_memory[0].addr >> PAGE_SHIFT)
#include <asm-generic/memory_model.h>
#endif
#define virt_addr_valid(kaddr) ((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory)
#define pfn_valid(pfn) virt_addr_valid(pfn_to_virt(pfn))


@ -263,7 +263,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr
BUG();
}
static inline void flush_tlb_range(struct mm_struct *mm,
static inline void flush_tlb_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
BUG();


@ -402,8 +402,8 @@ sys_cacheflush (unsigned long addr, int scope, int cache, unsigned long len)
* to this process.
*/
mmap_read_lock(current->mm);
vma = find_vma(current->mm, addr);
if (!vma || addr < vma->vm_start || addr + len > vma->vm_end)
vma = vma_lookup(current->mm, addr);
if (!vma || addr + len > vma->vm_end)
goto out_unlock;
}


@ -44,28 +44,8 @@ EXPORT_SYMBOL(empty_zero_page);
int m68k_virt_to_node_shift;
#ifdef CONFIG_DISCONTIGMEM
pg_data_t pg_data_map[MAX_NUMNODES];
EXPORT_SYMBOL(pg_data_map);
pg_data_t *pg_data_table[65];
EXPORT_SYMBOL(pg_data_table);
#endif
void __init m68k_setup_node(int node)
{
#ifdef CONFIG_DISCONTIGMEM
struct m68k_mem_info *info = m68k_memory + node;
int i, end;
i = (unsigned long)phys_to_virt(info->addr) >> __virt_to_node_shift();
end = (unsigned long)phys_to_virt(info->addr + info->size - 1) >> __virt_to_node_shift();
for (; i <= end; i++) {
if (pg_data_table[i])
pr_warn("overlap at %u for chunk %u\n", i, node);
pg_data_table[i] = pg_data_map + node;
}
#endif
node_set_online(node);
}


@ -2867,7 +2867,7 @@ config RANDOMIZE_BASE_MAX_OFFSET
config NODES_SHIFT
int
default "6"
depends on NEED_MULTIPLE_NODES
depends on NUMA
config HW_PERF_EVENTS
bool "Enable hardware performance counter support for perf events"


@ -8,7 +8,7 @@
#include <asm/page.h>
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
# include <mmzone.h>
#endif
@ -20,10 +20,4 @@
#define nid_to_addrbase(nid) 0
#endif
#ifdef CONFIG_DISCONTIGMEM
#define pfn_to_nid(pfn) pa_to_nid((pfn) << PAGE_SHIFT)
#endif /* CONFIG_DISCONTIGMEM */
#endif /* _ASM_MMZONE_H_ */


@ -239,7 +239,7 @@ static inline int pfn_valid(unsigned long pfn)
/* pfn_valid is defined in linux/mmzone.h */
#elif defined(CONFIG_NEED_MULTIPLE_NODES)
#elif defined(CONFIG_NUMA)
#define pfn_valid(pfn) \
({ \


@ -784,7 +784,6 @@ void force_fcr31_sig(unsigned long fcr31, void __user *fault_addr,
int process_fpemu_return(int sig, void __user *fault_addr, unsigned long fcr31)
{
int si_code;
struct vm_area_struct *vma;
switch (sig) {
case 0:
@ -800,8 +799,7 @@ int process_fpemu_return(int sig, void __user *fault_addr, unsigned long fcr31)
case SIGSEGV:
mmap_read_lock(current->mm);
vma = find_vma(current->mm, (unsigned long)fault_addr);
if (vma && (vma->vm_start <= (unsigned long)fault_addr))
if (vma_lookup(current->mm, (unsigned long)fault_addr))
si_code = SEGV_ACCERR;
else
si_code = SEGV_MAPERR;


@ -394,7 +394,7 @@ void maar_init(void)
}
}
#ifndef CONFIG_NEED_MULTIPLE_NODES
#ifndef CONFIG_NUMA
void __init paging_init(void)
{
unsigned long max_zone_pfns[MAX_NR_ZONES];
@ -454,9 +454,6 @@ void __init mem_init(void)
BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT) && (_PFN_SHIFT > PAGE_SHIFT));
#ifdef CONFIG_HIGHMEM
#ifdef CONFIG_DISCONTIGMEM
#error "CONFIG_HIGHMEM and CONFIG_DISCONTIGMEM dont work together yet"
#endif
max_mapnr = highend_pfn ? highend_pfn : max_low_pfn;
#else
max_mapnr = max_low_pfn;
@ -476,7 +473,7 @@ void __init mem_init(void)
0x80000000 - 4, KCORE_TEXT);
#endif
}
#endif /* !CONFIG_NEED_MULTIPLE_NODES */
#endif /* !CONFIG_NUMA */
void free_init_pages(const char *what, unsigned long begin, unsigned long end)
{


@ -76,18 +76,12 @@
* virt_to_page(k) convert a _valid_ virtual address to struct page *
* virt_addr_valid(k) indicates whether a virtual address is valid
*/
#ifndef CONFIG_DISCONTIGMEM
#define ARCH_PFN_OFFSET PHYS_PFN_OFFSET
#define pfn_valid(pfn) ((pfn) >= PHYS_PFN_OFFSET && (pfn) < (PHYS_PFN_OFFSET + max_mapnr))
#define virt_to_page(kaddr) (pfn_to_page(__pa(kaddr) >> PAGE_SHIFT))
#define virt_addr_valid(kaddr) ((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory)
#else /* CONFIG_DISCONTIGMEM */
#error CONFIG_DISCONTIGMEM is not supported yet.
#endif /* !CONFIG_DISCONTIGMEM */
#define page_to_phys(page) (page_to_pfn(page) << PAGE_SHIFT)
#endif


@ -25,7 +25,7 @@
* - flush_tlb_all() flushes all processes TLBs
* - flush_tlb_mm(mm) flushes the specified mm context TLB's
* - flush_tlb_page(vma, vmaddr) flushes one page
* - flush_tlb_range(mm, start, end) flushes a range of pages
* - flush_tlb_range(vma, start, end) flushes a range of pages
*/
extern void local_flush_tlb_all(void);
extern void local_flush_tlb_mm(struct mm_struct *mm);


@ -671,7 +671,7 @@ config NODES_SHIFT
int
default "8" if PPC64
default "4"
depends on NEED_MULTIPLE_NODES
depends on NUMA
config USE_PERCPU_NUMA_NODE_ID
def_bool y


@ -18,7 +18,7 @@
* flags field of the struct page
*/
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
extern struct pglist_data *node_data[];
/*
@ -41,7 +41,7 @@ u64 memory_hotplug_max(void);
#else
#define memory_hotplug_max() memblock_end_of_DRAM()
#endif /* CONFIG_NEED_MULTIPLE_NODES */
#endif /* CONFIG_NUMA */
#ifdef CONFIG_FA_DUMP
#define __HAVE_ARCH_RESERVED_KERNEL_PAGES
#endif


@ -788,7 +788,7 @@ static void * __init pcpu_alloc_bootmem(unsigned int cpu, size_t size,
size_t align)
{
const unsigned long goal = __pa(MAX_DMA_ADDRESS);
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
int node = early_cpu_to_node(cpu);
void *ptr;


@ -1047,7 +1047,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
zalloc_cpumask_var_node(&per_cpu(cpu_coregroup_map, cpu),
GFP_KERNEL, cpu_to_node(cpu));
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
/*
* numa_node_id() works after this.
*/


@ -68,11 +68,11 @@ void machine_kexec_cleanup(struct kimage *image)
void arch_crash_save_vmcoreinfo(void)
{
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
#endif
#ifndef CONFIG_NEED_MULTIPLE_NODES
#ifndef CONFIG_NUMA
VMCOREINFO_SYMBOL(contig_page_data);
#endif
#if defined(CONFIG_PPC64) && defined(CONFIG_SPARSEMEM_VMEMMAP)


@ -4924,8 +4924,8 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
/* Look up the VMA for the start of this memory slot */
hva = memslot->userspace_addr;
mmap_read_lock(kvm->mm);
vma = find_vma(kvm->mm, hva);
if (!vma || vma->vm_start > hva || (vma->vm_flags & VM_IO))
vma = vma_lookup(kvm->mm, hva);
if (!vma || (vma->vm_flags & VM_IO))
goto up_out;
psize = vma_kernel_pagesize(vma);


@ -615,7 +615,7 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
/* Fetch the VMA if addr is not in the latest fetched one */
if (!vma || addr >= vma->vm_end) {
vma = find_vma_intersection(kvm->mm, addr, addr+1);
vma = vma_lookup(kvm->mm, addr);
if (!vma) {
pr_err("Can't find VMA for gfn:0x%lx\n", gfn);
break;


@ -13,7 +13,7 @@ obj-y := fault.o mem.o pgtable.o mmap.o maccess.o \
obj-$(CONFIG_PPC_MMU_NOHASH) += nohash/
obj-$(CONFIG_PPC_BOOK3S_32) += book3s32/
obj-$(CONFIG_PPC_BOOK3S_64) += book3s64/
obj-$(CONFIG_NEED_MULTIPLE_NODES) += numa.o
obj-$(CONFIG_NUMA) += numa.o
obj-$(CONFIG_PPC_MM_SLICES) += slice.o
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o


@ -127,7 +127,7 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
}
#endif
#ifndef CONFIG_NEED_MULTIPLE_NODES
#ifndef CONFIG_NUMA
void __init mem_topology_setup(void)
{
max_low_pfn = max_pfn = memblock_end_of_DRAM() >> PAGE_SHIFT;
@ -162,7 +162,7 @@ static int __init mark_nonram_nosave(void)
return 0;
}
#else /* CONFIG_NEED_MULTIPLE_NODES */
#else /* CONFIG_NUMA */
static int __init mark_nonram_nosave(void)
{
return 0;


@ -332,7 +332,7 @@ config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)"
range 1 10
default "2"
depends on NEED_MULTIPLE_NODES
depends on NUMA
help
Specify the maximum number of NUMA Nodes available on the target
system. Increases memory reserved to accommodate various tables.


@ -475,7 +475,7 @@ config NUMA
config NODES_SHIFT
int
depends on NEED_MULTIPLE_NODES
depends on NUMA
default "1"
config SCHED_SMT


@ -344,8 +344,6 @@ static inline int is_module_addr(void *addr)
#define PTRS_PER_P4D _CRST_ENTRIES
#define PTRS_PER_PGD _CRST_ENTRIES
#define MAX_PTRS_PER_P4D PTRS_PER_P4D
/*
* Segment table and region3 table entry encoding
* (R = read-only, I = invalid, y = young bit):


@ -2,7 +2,7 @@
#ifndef __ASM_SH_MMZONE_H
#define __ASM_SH_MMZONE_H
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
#include <linux/numa.h>
extern struct pglist_data *node_data[];
@ -31,7 +31,7 @@ static inline void
setup_bootmem_node(int nid, unsigned long start, unsigned long end)
{
}
#endif /* CONFIG_NEED_MULTIPLE_NODES */
#endif /* CONFIG_NUMA */
/* Platform specific mem init */
void __init plat_mem_setup(void);


@ -46,7 +46,7 @@ static int __init topology_init(void)
{
int i, ret;
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
for_each_online_node(i)
register_one_node(i);
#endif


@ -120,7 +120,7 @@ config NODES_SHIFT
int
default "3" if CPU_SUBTYPE_SHX3
default "1"
depends on NEED_MULTIPLE_NODES
depends on NUMA
config ARCH_FLATMEM_ENABLE
def_bool y


@ -211,7 +211,7 @@ void __init allocate_pgdat(unsigned int nid)
get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
NODE_DATA(nid) = memblock_alloc_try_nid(
sizeof(struct pglist_data),
SMP_CACHE_BYTES, MEMBLOCK_LOW_LIMIT,


@ -265,7 +265,7 @@ config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)"
range 4 5 if SPARC64
default "5"
depends on NEED_MULTIPLE_NODES
depends on NUMA
help
Specify the maximum number of NUMA Nodes available on the target
system. Increases memory reserved to accommodate various tables.


@ -2,7 +2,7 @@
#ifndef _SPARC64_MMZONE_H
#define _SPARC64_MMZONE_H
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
#include <linux/cpumask.h>
@ -13,6 +13,6 @@ extern struct pglist_data *node_data[];
extern int numa_cpu_lookup_table[];
extern cpumask_t numa_cpumask_lookup_table[];
#endif /* CONFIG_NEED_MULTIPLE_NODES */
#endif /* CONFIG_NUMA */
#endif /* _SPARC64_MMZONE_H */


@ -1543,7 +1543,7 @@ static void * __init pcpu_alloc_bootmem(unsigned int cpu, size_t size,
size_t align)
{
const unsigned long goal = __pa(MAX_DMA_ADDRESS);
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
int node = cpu_to_node(cpu);
void *ptr;


@ -903,7 +903,7 @@ struct node_mem_mask {
static struct node_mem_mask node_masks[MAX_NUMNODES];
static int num_node_masks;
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
struct mdesc_mlgroup {
u64 node;
@ -1059,7 +1059,7 @@ static void __init allocate_node_data(int nid)
{
struct pglist_data *p;
unsigned long start_pfn, end_pfn;
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
NODE_DATA(nid) = memblock_alloc_node(sizeof(struct pglist_data),
SMP_CACHE_BYTES, nid);
@ -1080,7 +1080,7 @@ static void __init allocate_node_data(int nid)
static void init_node_masks_nonnuma(void)
{
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
int i;
#endif
@ -1090,7 +1090,7 @@ static void init_node_masks_nonnuma(void)
node_masks[0].match = 0;
num_node_masks = 1;
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
for (i = 0; i < NR_CPUS; i++)
numa_cpu_lookup_table[i] = 0;
@ -1098,7 +1098,7 @@ static void init_node_masks_nonnuma(void)
#endif
}
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
struct pglist_data *node_data[MAX_NUMNODES];
EXPORT_SYMBOL(numa_cpu_lookup_table);
@ -2487,7 +2487,7 @@ int page_in_phys_avail(unsigned long paddr)
static void __init register_page_bootmem_info(void)
{
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
int i;
for_each_online_node(i)


@ -1597,7 +1597,7 @@ config NODES_SHIFT
default "10" if MAXSMP
default "6" if X86_64
default "3"
depends on NEED_MULTIPLE_NODES
depends on NUMA
help
Specify the maximum number of NUMA Nodes available on the target
system. Increases memory reserved to accommodate various tables.


@ -203,7 +203,7 @@ static int load_aout_binary(struct linux_binprm *bprm)
error = vm_mmap(bprm->file, N_TXTADDR(ex), ex.a_text,
PROT_READ | PROT_EXEC,
MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE |
MAP_EXECUTABLE | MAP_32BIT,
MAP_32BIT,
fd_offset);
if (error != N_TXTADDR(ex))
@ -212,7 +212,7 @@ static int load_aout_binary(struct linux_binprm *bprm)
error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE |
MAP_EXECUTABLE | MAP_32BIT,
MAP_32BIT,
fd_offset + ex.a_text);
if (error != N_DATADDR(ex))
return error;


@ -1257,19 +1257,28 @@ static void kill_me_maybe(struct callback_head *cb)
{
struct task_struct *p = container_of(cb, struct task_struct, mce_kill_me);
int flags = MF_ACTION_REQUIRED;
int ret;
pr_err("Uncorrected hardware memory error in user-access at %llx", p->mce_addr);
if (!p->mce_ripv)
flags |= MF_MUST_KILL;
if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
!(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
ret = memory_failure(p->mce_addr >> PAGE_SHIFT, flags);
if (!ret && !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
sync_core();
return;
}
/*
* -EHWPOISON from memory_failure() means that it already sent SIGBUS
* to the current process with the proper error info, so no need to
* send SIGBUS here again.
*/
if (ret == -EHWPOISON)
return;
if (p->mce_vaddr != (void __user *)-1l) {
force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT);
} else {


@ -91,8 +91,8 @@ static inline int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
{
struct vm_area_struct *result;
result = find_vma(mm, addr);
if (!result || result->vm_ops != &sgx_vm_ops || addr < result->vm_start)
result = vma_lookup(mm, addr);
if (!result || result->vm_ops != &sgx_vm_ops)
return -EINVAL;
*vma = result;


@ -66,7 +66,7 @@ EXPORT_SYMBOL(__per_cpu_offset);
*/
static bool __init pcpu_need_numa(void)
{
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
pg_data_t *last = NULL;
unsigned int cpu;
@ -101,7 +101,7 @@ static void * __init pcpu_alloc_bootmem(unsigned int cpu, unsigned long size,
unsigned long align)
{
const unsigned long goal = __pa(MAX_DMA_ADDRESS);
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
int node = early_cpu_to_node(cpu);
void *ptr;
@ -140,7 +140,7 @@ static void __init pcpu_fc_free(void *ptr, size_t size)
static int __init pcpu_cpu_distance(unsigned int from, unsigned int to)
{
#ifdef CONFIG_NEED_MULTIPLE_NODES
#ifdef CONFIG_NUMA
if (early_cpu_to_node(from) == early_cpu_to_node(to))
return LOCAL_DISTANCE;
else


@ -651,7 +651,7 @@ void __init find_low_pfn_range(void)
highmem_pfn_init();
}
#ifndef CONFIG_NEED_MULTIPLE_NODES
#ifndef CONFIG_NUMA
void __init initmem_init(void)
{
#ifdef CONFIG_HIGHMEM
@ -677,7 +677,7 @@ void __init initmem_init(void)
setup_bootmem_allocator();
}
#endif /* !CONFIG_NEED_MULTIPLE_NODES */
#endif /* !CONFIG_NUMA */
void __init setup_bootmem_allocator(void)
{


@ -192,10 +192,6 @@ static inline unsigned long ___pa(unsigned long va)
#define pfn_valid(pfn) \
((pfn) >= ARCH_PFN_OFFSET && ((pfn) - ARCH_PFN_OFFSET) < max_mapnr)
#ifdef CONFIG_DISCONTIGMEM
# error CONFIG_DISCONTIGMEM not supported
#endif
#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
#define page_to_virt(page) __va(page_to_pfn(page) << PAGE_SHIFT)
#define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)


@ -26,8 +26,8 @@
*
* - flush_tlb_all() flushes all processes TLB entries
* - flush_tlb_mm(mm) flushes the specified mm context TLB entries
* - flush_tlb_page(mm, vmaddr) flushes a single page
* - flush_tlb_range(mm, start, end) flushes a range of pages
* - flush_tlb_page(vma, page) flushes a single page
* - flush_tlb_range(vma, vmaddr, end) flushes a range of pages
*/
void local_flush_tlb_all(void);


@ -482,6 +482,7 @@ static DEVICE_ATTR(meminfo, 0444, node_read_meminfo, NULL);
static ssize_t node_read_numastat(struct device *dev,
struct device_attribute *attr, char *buf)
{
fold_vm_numa_events();
return sysfs_emit(buf,
"numa_hit %lu\n"
"numa_miss %lu\n"
@ -489,12 +490,12 @@ static ssize_t node_read_numastat(struct device *dev,
"interleave_hit %lu\n"
"local_node %lu\n"
"other_node %lu\n",
sum_zone_numa_state(dev->id, NUMA_HIT),
sum_zone_numa_state(dev->id, NUMA_MISS),
sum_zone_numa_state(dev->id, NUMA_FOREIGN),
sum_zone_numa_state(dev->id, NUMA_INTERLEAVE_HIT),
sum_zone_numa_state(dev->id, NUMA_LOCAL),
sum_zone_numa_state(dev->id, NUMA_OTHER));
sum_zone_numa_event_state(dev->id, NUMA_HIT),
sum_zone_numa_event_state(dev->id, NUMA_MISS),
sum_zone_numa_event_state(dev->id, NUMA_FOREIGN),
sum_zone_numa_event_state(dev->id, NUMA_INTERLEAVE_HIT),
sum_zone_numa_event_state(dev->id, NUMA_LOCAL),
sum_zone_numa_event_state(dev->id, NUMA_OTHER));
}
static DEVICE_ATTR(numastat, 0444, node_read_numastat, NULL);
@ -512,10 +513,11 @@ static ssize_t node_read_vmstat(struct device *dev,
sum_zone_node_page_state(nid, i));
#ifdef CONFIG_NUMA
for (i = 0; i < NR_VM_NUMA_STAT_ITEMS; i++)
fold_vm_numa_events();
for (i = 0; i < NR_VM_NUMA_EVENT_ITEMS; i++)
len += sysfs_emit_at(buf, len, "%s %lu\n",
numa_stat_name(i),
sum_zone_numa_state(nid, i));
sum_zone_numa_event_state(nid, i));
#endif
for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {


@ -71,7 +71,6 @@
#include <linux/writeback.h>
#include <linux/completion.h>
#include <linux/highmem.h>
#include <linux/kthread.h>
#include <linux/splice.h>
#include <linux/sysfs.h>
#include <linux/miscdevice.h>
@ -79,11 +78,14 @@
#include <linux/uio.h>
#include <linux/ioprio.h>
#include <linux/blk-cgroup.h>
#include <linux/sched/mm.h>
#include "loop.h"
#include <linux/uaccess.h>
#define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
static DEFINE_IDR(loop_index_idr);
static DEFINE_MUTEX(loop_ctl_mutex);
@ -515,8 +517,6 @@ static void lo_rw_aio_complete(struct kiocb *iocb, long ret, long ret2)
{
struct loop_cmd *cmd = container_of(iocb, struct loop_cmd, iocb);
if (cmd->css)
css_put(cmd->css);
cmd->ret = ret;
lo_rw_aio_do_completion(cmd);
}
@ -577,8 +577,6 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
cmd->iocb.ki_complete = lo_rw_aio_complete;
cmd->iocb.ki_flags = IOCB_DIRECT;
cmd->iocb.ki_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0);
if (cmd->css)
kthread_associate_blkcg(cmd->css);
if (rw == WRITE)
ret = call_write_iter(file, &cmd->iocb, &iter);
@ -586,7 +584,6 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
ret = call_read_iter(file, &cmd->iocb, &iter);
lo_rw_aio_do_completion(cmd);
kthread_associate_blkcg(NULL);
if (ret != -EIOCBQUEUED)
cmd->iocb.ki_complete(&cmd->iocb, ret, 0);
@ -921,27 +918,100 @@ static void loop_config_discard(struct loop_device *lo)
q->limits.discard_alignment = 0;
}
static void loop_unprepare_queue(struct loop_device *lo)
{
kthread_flush_worker(&lo->worker);
kthread_stop(lo->worker_task);
}
struct loop_worker {
struct rb_node rb_node;
struct work_struct work;
struct list_head cmd_list;
struct list_head idle_list;
struct loop_device *lo;
struct cgroup_subsys_state *blkcg_css;
unsigned long last_ran_at;
};
static int loop_kthread_worker_fn(void *worker_ptr)
{
current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO;
return kthread_worker_fn(worker_ptr);
}
static void loop_workfn(struct work_struct *work);
static void loop_rootcg_workfn(struct work_struct *work);
static void loop_free_idle_workers(struct timer_list *timer);
static int loop_prepare_queue(struct loop_device *lo)
#ifdef CONFIG_BLK_CGROUP
static inline int queue_on_root_worker(struct cgroup_subsys_state *css)
{
kthread_init_worker(&lo->worker);
lo->worker_task = kthread_run(loop_kthread_worker_fn,
&lo->worker, "loop%d", lo->lo_number);
if (IS_ERR(lo->worker_task))
return -ENOMEM;
set_user_nice(lo->worker_task, MIN_NICE);
return 0;
return !css || css == blkcg_root_css;
}
#else
static inline int queue_on_root_worker(struct cgroup_subsys_state *css)
{
return !css;
}
#endif
static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
{
struct rb_node **node = &(lo->worker_tree.rb_node), *parent = NULL;
struct loop_worker *cur_worker, *worker = NULL;
struct work_struct *work;
struct list_head *cmd_list;
spin_lock_irq(&lo->lo_work_lock);
if (queue_on_root_worker(cmd->blkcg_css))
goto queue_work;
node = &lo->worker_tree.rb_node;
while (*node) {
parent = *node;
cur_worker = container_of(*node, struct loop_worker, rb_node);
if (cur_worker->blkcg_css == cmd->blkcg_css) {
worker = cur_worker;
break;
} else if ((long)cur_worker->blkcg_css < (long)cmd->blkcg_css) {
node = &(*node)->rb_left;
} else {
node = &(*node)->rb_right;
}
}
if (worker)
goto queue_work;
worker = kzalloc(sizeof(struct loop_worker), GFP_NOWAIT | __GFP_NOWARN);
/*
* In the event we cannot allocate a worker, just queue on the
* rootcg worker and issue the I/O as the rootcg
*/
if (!worker) {
cmd->blkcg_css = NULL;
if (cmd->memcg_css)
css_put(cmd->memcg_css);
cmd->memcg_css = NULL;
goto queue_work;
}
worker->blkcg_css = cmd->blkcg_css;
css_get(worker->blkcg_css);
INIT_WORK(&worker->work, loop_workfn);
INIT_LIST_HEAD(&worker->cmd_list);
INIT_LIST_HEAD(&worker->idle_list);
worker->lo = lo;
rb_link_node(&worker->rb_node, parent, node);
rb_insert_color(&worker->rb_node, &lo->worker_tree);
queue_work:
if (worker) {
/*
* We need to remove from the idle list here while
* holding the lock so that the idle timer doesn't
* free the worker
*/
if (!list_empty(&worker->idle_list))
list_del_init(&worker->idle_list);
work = &worker->work;
cmd_list = &worker->cmd_list;
} else {
work = &lo->rootcg_work;
cmd_list = &lo->rootcg_cmd_list;
}
list_add_tail(&cmd->list_entry, cmd_list);
queue_work(lo->workqueue, work);
spin_unlock_irq(&lo->lo_work_lock);
}
static void loop_update_rotational(struct loop_device *lo)
@ -1127,12 +1197,23 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
!file->f_op->write_iter)
lo->lo_flags |= LO_FLAGS_READ_ONLY;
error = loop_prepare_queue(lo);
if (error)
lo->workqueue = alloc_workqueue("loop%d",
WQ_UNBOUND | WQ_FREEZABLE,
0,
lo->lo_number);
if (!lo->workqueue) {
error = -ENOMEM;
goto out_unlock;
}
set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
INIT_WORK(&lo->rootcg_work, loop_rootcg_workfn);
INIT_LIST_HEAD(&lo->rootcg_cmd_list);
INIT_LIST_HEAD(&lo->idle_worker_list);
lo->worker_tree = RB_ROOT;
timer_setup(&lo->timer, loop_free_idle_workers,
TIMER_DEFERRABLE);
lo->use_dio = lo->lo_flags & LO_FLAGS_DIRECT_IO;
lo->lo_device = bdev;
lo->lo_backing_file = file;
@ -1200,6 +1281,7 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
int err = 0;
bool partscan = false;
int lo_number;
struct loop_worker *pos, *worker;
mutex_lock(&lo->lo_mutex);
if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) {
@ -1219,6 +1301,18 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
/* freeze request queue during the transition */
blk_mq_freeze_queue(lo->lo_queue);
destroy_workqueue(lo->workqueue);
spin_lock_irq(&lo->lo_work_lock);
list_for_each_entry_safe(worker, pos, &lo->idle_worker_list,
idle_list) {
list_del(&worker->idle_list);
rb_erase(&worker->rb_node, &lo->worker_tree);
css_put(worker->blkcg_css);
kfree(worker);
}
spin_unlock_irq(&lo->lo_work_lock);
del_timer_sync(&lo->timer);
spin_lock_irq(&lo->lo_lock);
lo->lo_backing_file = NULL;
spin_unlock_irq(&lo->lo_lock);
@ -1255,7 +1349,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev;
lo_number = lo->lo_number;
loop_unprepare_queue(lo);
out_unlock:
mutex_unlock(&lo->lo_mutex);
if (partscan) {
@ -2008,14 +2101,19 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
}
/* always use the first bio's css */
cmd->blkcg_css = NULL;
cmd->memcg_css = NULL;
#ifdef CONFIG_BLK_CGROUP
if (cmd->use_aio && rq->bio && rq->bio->bi_blkg) {
cmd->css = &bio_blkcg(rq->bio)->css;
css_get(cmd->css);
} else
if (rq->bio && rq->bio->bi_blkg) {
cmd->blkcg_css = &bio_blkcg(rq->bio)->css;
#ifdef CONFIG_MEMCG
cmd->memcg_css =
cgroup_get_e_css(cmd->blkcg_css->cgroup,
&memory_cgrp_subsys);
#endif
cmd->css = NULL;
kthread_queue_work(&lo->worker, &cmd->work);
}
#endif
loop_queue_work(lo, cmd);
return BLK_STS_OK;
}
@ -2026,13 +2124,28 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
const bool write = op_is_write(req_op(rq));
struct loop_device *lo = rq->q->queuedata;
int ret = 0;
struct mem_cgroup *old_memcg = NULL;
if (write && (lo->lo_flags & LO_FLAGS_READ_ONLY)) {
ret = -EIO;
goto failed;
}
if (cmd->blkcg_css)
kthread_associate_blkcg(cmd->blkcg_css);
if (cmd->memcg_css)
old_memcg = set_active_memcg(
mem_cgroup_from_css(cmd->memcg_css));
ret = do_req_filebacked(lo, rq);
if (cmd->blkcg_css)
kthread_associate_blkcg(NULL);
if (cmd->memcg_css) {
set_active_memcg(old_memcg);
css_put(cmd->memcg_css);
}
failed:
/* complete non-aio request */
if (!cmd->use_aio || ret) {
@ -2045,26 +2158,82 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
}
}
static void loop_queue_work(struct kthread_work *work)
static void loop_set_timer(struct loop_device *lo)
{
struct loop_cmd *cmd =
container_of(work, struct loop_cmd, work);
loop_handle_cmd(cmd);
timer_reduce(&lo->timer, jiffies + LOOP_IDLE_WORKER_TIMEOUT);
}
static int loop_init_request(struct blk_mq_tag_set *set, struct request *rq,
unsigned int hctx_idx, unsigned int numa_node)
static void loop_process_work(struct loop_worker *worker,
struct list_head *cmd_list, struct loop_device *lo)
{
struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
int orig_flags = current->flags;
struct loop_cmd *cmd;
kthread_init_work(&cmd->work, loop_queue_work);
return 0;
current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO;
spin_lock_irq(&lo->lo_work_lock);
while (!list_empty(cmd_list)) {
cmd = container_of(
cmd_list->next, struct loop_cmd, list_entry);
list_del(cmd_list->next);
spin_unlock_irq(&lo->lo_work_lock);
loop_handle_cmd(cmd);
cond_resched();
spin_lock_irq(&lo->lo_work_lock);
}
/*
* We only add to the idle list if there are no pending cmds
* *and* the worker will not run again which ensures that it
* is safe to free any worker on the idle list
*/
if (worker && !work_pending(&worker->work)) {
worker->last_ran_at = jiffies;
list_add_tail(&worker->idle_list, &lo->idle_worker_list);
loop_set_timer(lo);
}
spin_unlock_irq(&lo->lo_work_lock);
current->flags = orig_flags;
}
static void loop_workfn(struct work_struct *work)
{
struct loop_worker *worker =
container_of(work, struct loop_worker, work);
loop_process_work(worker, &worker->cmd_list, worker->lo);
}
static void loop_rootcg_workfn(struct work_struct *work)
{
struct loop_device *lo =
container_of(work, struct loop_device, rootcg_work);
loop_process_work(NULL, &lo->rootcg_cmd_list, lo);
}
static void loop_free_idle_workers(struct timer_list *timer)
{
struct loop_device *lo = container_of(timer, struct loop_device, timer);
struct loop_worker *pos, *worker;
spin_lock_irq(&lo->lo_work_lock);
list_for_each_entry_safe(worker, pos, &lo->idle_worker_list,
idle_list) {
if (time_is_after_jiffies(worker->last_ran_at +
LOOP_IDLE_WORKER_TIMEOUT))
break;
list_del(&worker->idle_list);
rb_erase(&worker->rb_node, &lo->worker_tree);
css_put(worker->blkcg_css);
kfree(worker);
}
if (!list_empty(&lo->idle_worker_list))
loop_set_timer(lo);
spin_unlock_irq(&lo->lo_work_lock);
}
static const struct blk_mq_ops loop_mq_ops = {
.queue_rq = loop_queue_rq,
.init_request = loop_init_request,
.complete = lo_complete_rq,
};
@ -2153,6 +2322,7 @@ static int loop_add(struct loop_device **l, int i)
mutex_init(&lo->lo_mutex);
lo->lo_number = i;
spin_lock_init(&lo->lo_lock);
spin_lock_init(&lo->lo_work_lock);
disk->major = LOOP_MAJOR;
disk->first_minor = i << part_shift;
disk->fops = &lo_fops;


@ -14,7 +14,6 @@
#include <linux/blk-mq.h>
#include <linux/spinlock.h>
#include <linux/mutex.h>
#include <linux/kthread.h>
#include <uapi/linux/loop.h>
/* Possible states of device */
@ -55,8 +54,13 @@ struct loop_device {
spinlock_t lo_lock;
int lo_state;
struct kthread_worker worker;
struct task_struct *worker_task;
spinlock_t lo_work_lock;
struct workqueue_struct *workqueue;
struct work_struct rootcg_work;
struct list_head rootcg_cmd_list;
struct list_head idle_worker_list;
struct rb_root worker_tree;
struct timer_list timer;
bool use_dio;
bool sysfs_inited;
@ -67,13 +71,14 @@ struct loop_device {
};
struct loop_cmd {
struct kthread_work work;
struct list_head list_entry;
bool use_aio; /* use AIO interface to handle I/O */
atomic_t ref; /* only for aio */
long ret;
struct kiocb iocb;
struct bio_vec *bvec;
struct cgroup_subsys_state *css;
struct cgroup_subsys_state *blkcg_css;
struct cgroup_subsys_state *memcg_css;
};
/* Support for loadable transfer modules */


@ -337,7 +337,7 @@ static unsigned long dax_get_unmapped_area(struct file *filp,
}
static const struct address_space_operations dev_dax_aops = {
.set_page_dirty = noop_set_page_dirty,
.set_page_dirty = __set_page_dirty_no_writeback,
.invalidatepage = noop_invalidatepage,
};


@ -709,8 +709,8 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
}
mmap_read_lock(mm);
vma = find_vma(mm, start);
if (unlikely(!vma || start < vma->vm_start)) {
vma = vma_lookup(mm, start);
if (unlikely(!vma)) {
r = -EFAULT;
goto out_unlock;
}


@ -871,7 +871,7 @@ static int __igt_mmap(struct drm_i915_private *i915,
pr_debug("igt_mmap(%s, %d) @ %lx\n", obj->mm.region->name, type, addr);
area = find_vma(current->mm, addr);
area = vma_lookup(current->mm, addr);
if (!area) {
pr_err("%s: Did not create a vm_area_struct for the mmap\n",
obj->mm.region->name);


@ -64,7 +64,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
do {
unsigned long *nums = frame_vector_pfns(vec);
vma = find_vma_intersection(mm, start, start + 1);
vma = vma_lookup(mm, start);
if (!vma)
break;


@ -49,8 +49,8 @@ struct vm_area_struct *gru_find_vma(unsigned long vaddr)
{
struct vm_area_struct *vma;
vma = find_vma(current->mm, vaddr);
if (vma && vma->vm_start <= vaddr && vma->vm_ops == &gru_vm_ops)
vma = vma_lookup(current->mm, vaddr);
if (vma && vma->vm_ops == &gru_vm_ops)
return vma;
return NULL;
}
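
Note on the find_vma() to vma_lookup() conversions in these hunks: find_vma() returns the first VMA that ends above the address, so the result may start above the address too, which is why the old callers added a vm_start check. vma_lookup() folds that check in and returns NULL unless the address lies inside the VMA. A minimal userspace sketch of the difference follows; the toy_* names and the sample ranges are hypothetical and only model the kernel helpers, they are not kernel code.

```c
#include <stddef.h>

/* Toy stand-in for vm_area_struct: a half-open range [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/* Sorted, non-overlapping ranges (made-up values for illustration). */
static const struct toy_vma toy_vmas[] = {
	{ 0x1000, 0x2000 },
	{ 0x5000, 0x6000 },
};

/*
 * Models find_vma(): return the first range whose end is above addr.
 * The result may start above addr when addr falls in a gap.
 */
static const struct toy_vma *toy_find_vma(unsigned long addr)
{
	size_t i;

	for (i = 0; i < sizeof(toy_vmas) / sizeof(toy_vmas[0]); i++)
		if (addr < toy_vmas[i].end)
			return &toy_vmas[i];
	return NULL;
}

/*
 * Models vma_lookup(): same search, but reject a result that does
 * not actually contain addr.
 */
static const struct toy_vma *toy_vma_lookup(unsigned long addr)
{
	const struct toy_vma *vma = toy_find_vma(addr);

	if (vma && addr < vma->start)
		vma = NULL;
	return vma;
}
```

In this model, an address in the gap between the two ranges (for example 0x3000) makes toy_find_vma() return the second range, while toy_vma_lookup() returns NULL.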


@ -567,7 +567,7 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
vaddr = untagged_addr(vaddr);
retry:
vma = find_vma_intersection(mm, vaddr, vaddr + 1);
vma = vma_lookup(mm, vaddr);
if (vma && vma->vm_flags & VM_PFNMAP) {
ret = follow_fault_pfn(vma, mm, vaddr, pfn, prot & IOMMU_WRITE);


@ -993,6 +993,23 @@ static int virtballoon_probe(struct virtio_device *vdev)
goto out_unregister_oom;
}
/*
* The default page reporting order is @pageblock_order, which
* corresponds to 512MB in size on ARM64 when 64KB base page
* size is used. The page reporting won't be triggered if the
* freeing page can't come up with a free area like that huge.
* So we specify the page reporting order to 5, corresponding
* to 2MB. It helps to avoid THP splitting if 4KB base page
* size is used by host.
*
* Ideally, the page reporting order is selected based on the
* host's base page size. However, it needs more work to report
* that value. The hard-coded order would be fine currently.
*/
#if defined(CONFIG_ARM64) && defined(CONFIG_ARM64_64K_PAGES)
vb->pr_dev_info.order = 5;
#endif
err = page_reporting_register(&vb->pr_dev_info);
if (err)
goto out_unregister_oom;
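
Note on the sizes quoted in the comment above: a free area of a given order covers (base page size << order) bytes. The sketch below only checks that arithmetic. order_to_bytes() is a hypothetical helper, not a kernel function, and the order of 13 for the 512MB case is inferred from the comment's figures rather than taken from this diff.

```c
#include <stdint.h>

/*
 * Bytes covered by one free area of the given order.
 * page_shift is log2 of the base page size (16 for 64KB, 12 for 4KB).
 * Hypothetical helper for illustration only.
 */
static uint64_t order_to_bytes(unsigned int page_shift, unsigned int order)
{
	return (uint64_t)1 << (page_shift + order);
}
```

With 64KB base pages, order 5 gives 2MB and order 13 gives 512MB. With 4KB base pages, order 9 gives 2MB, which is the host THP size the comment appears to have in mind.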


@ -73,6 +73,7 @@ static sector_t _adfs_bmap(struct address_space *mapping, sector_t block)
}
static const struct address_space_operations adfs_aops = {
.set_page_dirty = __set_page_dirty_buffers,
.readpage = adfs_readpage,
.writepage = adfs_writepage,
.write_begin = adfs_write_begin,


@ -453,6 +453,7 @@ static sector_t _affs_bmap(struct address_space *mapping, sector_t block)
}
const struct address_space_operations affs_aops = {
.set_page_dirty = __set_page_dirty_buffers,
.readpage = affs_readpage,
.writepage = affs_writepage,
.write_begin = affs_write_begin,
@ -833,6 +834,7 @@ static int affs_write_end_ofs(struct file *file, struct address_space *mapping,
}
const struct address_space_operations affs_aops_ofs = {
.set_page_dirty = __set_page_dirty_buffers,
.readpage = affs_readpage_ofs,
//.writepage = affs_writepage_ofs,
.write_begin = affs_write_begin_ofs,


@ -188,6 +188,7 @@ static sector_t bfs_bmap(struct address_space *mapping, sector_t block)
}
const struct address_space_operations bfs_aops = {
.set_page_dirty = __set_page_dirty_buffers,
.readpage = bfs_readpage,
.writepage = bfs_writepage,
.write_begin = bfs_write_begin,


@ -222,7 +222,7 @@ static int load_aout_binary(struct linux_binprm * bprm)
error = vm_mmap(bprm->file, N_TXTADDR(ex), ex.a_text,
PROT_READ | PROT_EXEC,
MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE,
MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
fd_offset);
if (error != N_TXTADDR(ex))
@ -230,7 +230,7 @@ static int load_aout_binary(struct linux_binprm * bprm)
error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE,
MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
fd_offset + ex.a_text);
if (error != N_DATADDR(ex))
return error;


@ -1070,7 +1070,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
elf_prot = make_prot(elf_ppnt->p_flags, &arch_state,
!!interpreter, false);
elf_flags = MAP_PRIVATE | MAP_DENYWRITE | MAP_EXECUTABLE;
elf_flags = MAP_PRIVATE | MAP_DENYWRITE;
vaddr = elf_ppnt->p_vaddr;
/*


@ -928,7 +928,7 @@ static int elf_fdpic_map_file_constdisp_on_uclinux(
{
struct elf32_fdpic_loadseg *seg;
struct elf32_phdr *phdr;
unsigned long load_addr, base = ULONG_MAX, top = 0, maddr = 0, mflags;
unsigned long load_addr, base = ULONG_MAX, top = 0, maddr = 0;
int loop, ret;
load_addr = params->load_addr;
@ -948,12 +948,8 @@ static int elf_fdpic_map_file_constdisp_on_uclinux(
}
/* allocate one big anon block for everything */
mflags = MAP_PRIVATE;
if (params->flags & ELF_FDPIC_FLAG_EXECUTABLE)
mflags |= MAP_EXECUTABLE;
maddr = vm_mmap(NULL, load_addr, top - base,
PROT_READ | PROT_WRITE | PROT_EXEC, mflags, 0);
PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE, 0);
if (IS_ERR_VALUE(maddr))
return (int) maddr;
@ -1046,9 +1042,6 @@ static int elf_fdpic_map_file_by_direct_mmap(struct elf_fdpic_params *params,
if (phdr->p_flags & PF_X) prot |= PROT_EXEC;
flags = MAP_PRIVATE | MAP_DENYWRITE;
if (params->flags & ELF_FDPIC_FLAG_EXECUTABLE)
flags |= MAP_EXECUTABLE;
maddr = 0;
switch (params->flags & ELF_FDPIC_FLAG_ARRANGEMENT) {


@ -573,7 +573,7 @@ static int load_flat_file(struct linux_binprm *bprm,
pr_debug("ROM mapping of file (we hope)\n");
textpos = vm_mmap(bprm->file, 0, text_len, PROT_READ|PROT_EXEC,
MAP_PRIVATE|MAP_EXECUTABLE, 0);
MAP_PRIVATE, 0);
if (!textpos || IS_ERR_VALUE(textpos)) {
ret = textpos;
if (!textpos)


@ -1754,6 +1754,7 @@ static int blkdev_writepages(struct address_space *mapping,
}
static const struct address_space_operations def_blk_aops = {
.set_page_dirty = __set_page_dirty_buffers,
.readpage = blkdev_readpage,
.readahead = blkdev_readahead,
.writepage = blkdev_writepage,


@ -588,31 +588,6 @@ void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode)
}
EXPORT_SYMBOL(mark_buffer_dirty_inode);
/*
* Mark the page dirty, and set it dirty in the page cache, and mark the inode
* dirty.
*
* If warn is true, then emit a warning if the page is not uptodate and has
* not been truncated.
*
* The caller must hold lock_page_memcg().
*/
void __set_page_dirty(struct page *page, struct address_space *mapping,
int warn)
{
unsigned long flags;
xa_lock_irqsave(&mapping->i_pages, flags);
if (page->mapping) { /* Race with truncate? */
WARN_ON_ONCE(warn && !PageUptodate(page));
account_page_dirtied(page, mapping);
__xa_set_mark(&mapping->i_pages, page_index(page),
PAGECACHE_TAG_DIRTY);
}
xa_unlock_irqrestore(&mapping->i_pages, flags);
}
EXPORT_SYMBOL_GPL(__set_page_dirty);
/*
* Add a page to the dirty page list.
*

Some files were not shown because too many files have changed in this diff