linux/Documentation/sysctl
David Rientjes def5efe037 mm, madvise: fail with ENOMEM when splitting vma will hit max_map_count
If madvise(2) advice will result in the underlying vma being split and
the number of areas mapped by the process will exceed
/proc/sys/vm/max_map_count as a result, return ENOMEM instead of EAGAIN.

EAGAIN is returned by madvise(2) when a kernel resource, such as slab,
is temporarily unavailable.  It indicates that userspace should retry
the advice in the near future.  This is important for advice such as
MADV_DONTNEED which is often used by malloc implementations to free
memory back to the system: we really do want to free memory back when
madvise(2) returns EAGAIN because slab allocations (for vmas, anon_vmas,
or mempolicies) cannot be allocated.

Encountering /proc/sys/vm/max_map_count is not a temporary failure,
however, so return ENOMEM to indicate this is a more serious issue.  A
followup patch to the man page will specify this behavior.

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701241431120.42507@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
..
00-INDEX sysctl: remove obsolete comments 2011-01-13 08:03:18 -08:00
abi.txt
fs.txt mnt: Add a per mount namespace limit on the number of mounts 2016-09-30 12:46:48 -05:00
kernel.txt These are the documentation changes for 4.10. 2016-12-12 21:58:13 -08:00
net.txt bpf: make jited programs visible in traces 2017-02-17 13:40:05 -05:00
README userns; Document per user per user namespace limits. 2016-09-22 12:52:03 -05:00
sunrpc.txt
user.txt userns; Document per user per user namespace limits. 2016-09-22 12:52:03 -05:00
vm.txt mm, madvise: fail with ENOMEM when splitting vma will hit max_map_count 2017-02-24 17:46:55 -08:00

Documentation for /proc/sys/		kernel version 2.2.10
	(c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>

'Why', I hear you ask, 'would anyone even _want_ documentation
for them sysctl files? If anybody really needs it, it's all in
the source...'

Well, this documentation is written because some people either
don't know they need to tweak something, or because they don't
have the time or knowledge to read the source code.

Furthermore, the programmers who built sysctl have built it to
be actually used, not just for the fun of programming it :-)

==============================================================

Legal blurb:

As usual, there are two main things to consider:
1. you get what you pay for
2. it's free

The consequences are that I won't guarantee the correctness of
this document, and if you come to me complaining about how you
screwed up your system because of wrong documentation, I won't
feel sorry for you. I might even laugh at you...

But of course, if you _do_ manage to screw up your system using
only the sysctl options used in this file, I'd like to hear of
it. Not only to have a great laugh, but also to make sure that
you're the last RTFMing person to screw up.

In short, e-mail your suggestions, corrections and / or horror
stories to: <riel@nl.linux.org>

Rik van Riel.

==============================================================

Introduction:

Sysctl is a means of configuring certain aspects of the kernel
at run-time, and the /proc/sys/ directory is there so that you
don't even need special tools to do it!
In fact, there are only four things needed to use these config
facilities:
- a running Linux system
- root access
- common sense (this is especially hard to come by these days)
- knowledge of what all those values mean

As a quick 'ls /proc/sys' will show, the directory consists of
several (arch-dependent?) subdirs. Each subdir is mainly about
one part of the kernel, so you can do configuration on a piece
by piece basis, or just some 'thematic frobbing'.

The subdirs are about:
abi/		execution domains & personalities
debug/		<empty>
dev/		device specific information (eg dev/cdrom/info)
fs/		specific filesystems
		filehandle, inode, dentry and quota tuning
		binfmt_misc <Documentation/binfmt_misc.txt>
kernel/		global kernel info / tuning
		miscellaneous stuff
net/		networking stuff, for documentation look in:
		<Documentation/networking/>
proc/		<empty>
sunrpc/		SUN Remote Procedure Call (NFS)
vm/		memory management tuning
		buffer and cache management
user/		Per user per user namespace limits

These are the subdirs I have on my system. There might be more
or other subdirs in another setup. If you see another dir, I'd
really like to hear about it :-)