Roman Penyaev a218cc4914 epoll: use rwlock in order to reduce ep_poll_callback() contention
The goal of this patch is to reduce contention in ep_poll_callback(),
which can be called concurrently from different CPUs when event rates
are high and an epoll instance has many fds.  The problem is easy to
reproduce by generating events (writes to a pipe or eventfd) from many
threads while a consumer thread polls.  In other words, this patch
increases the bandwidth of events that can be delivered from sources to
the poller by adding poll items to the list in a lockless way.

The main change is the replacement of the spinlock with a rwlock, which
is taken for read in ep_poll_callback(); poll items are then added to
the tail of the list using the xchg atomic instruction.  The write lock
is taken everywhere else in order to stop list modifications and
guarantee that list updates are fully completed (I assume that the write
side of a rwlock does not starve; the qrwlock implementation seems to
provide this guarantee).

The following are some microbenchmark results based on the test [1],
which starts threads that each generate N events.  The test ends when
all events are successfully fetched by the poller thread:
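The shape of such a test can be sketched as follows.  This is not the
stress-epoll.c program from [1], only a minimal illustration that
assumes one eventfd per producer thread and a level-triggered epoll
instance; the names (struct producer, run_stress(), STRESS_THREADS,
EVENTS_PER_THREAD) are hypothetical, and timing is left out.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

#define STRESS_THREADS    8
#define EVENTS_PER_THREAD 10000

struct producer {
    int efd;
    pthread_t tid;
};

/* Each producer signals its own eventfd EVENTS_PER_THREAD times. */
static void *produce(void *arg)
{
    struct producer *p = arg;
    uint64_t one = 1;

    for (int i = 0; i < EVENTS_PER_THREAD; i++)
        if (write(p->efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
            abort();
    return NULL;
}

/* Runs the producers and polls until every event has been fetched. */
static uint64_t run_stress(void)
{
    const uint64_t total = (uint64_t)STRESS_THREADS * EVENTS_PER_THREAD;
    struct producer prod[STRESS_THREADS];
    uint64_t fetched = 0;
    int epfd = epoll_create1(0);

    if (epfd < 0)
        abort();

    for (int i = 0; i < STRESS_THREADS; i++) {
        struct epoll_event ev = { .events = EPOLLIN };

        prod[i].efd = eventfd(0, EFD_NONBLOCK);
        ev.data.fd = prod[i].efd;
        if (prod[i].efd < 0 ||
            epoll_ctl(epfd, EPOLL_CTL_ADD, prod[i].efd, &ev) < 0)
            abort();
    }
    for (int i = 0; i < STRESS_THREADS; i++)
        if (pthread_create(&prod[i].tid, NULL, produce, &prod[i]) != 0)
            abort();

    /* Poller: an eventfd read returns the number of writes since the last read. */
    while (fetched < total) {
        struct epoll_event evs[STRESS_THREADS];
        int n = epoll_wait(epfd, evs, STRESS_THREADS, -1);

        for (int i = 0; i < n; i++) {
            uint64_t cnt;

            if (read(evs[i].data.fd, &cnt, sizeof(cnt)) == (ssize_t)sizeof(cnt))
                fetched += cnt;
        }
    }

    for (int i = 0; i < STRESS_THREADS; i++) {
        pthread_join(prod[i].tid, NULL);
        close(prod[i].efd);
    }
    close(epfd);
    return fetched;
}
```

run_stress() returns once the poller has accounted for every write, so
wrapping the call in clock_gettime() measurements would give an
events/ms figure comparable in spirit to the tables below.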

 spinlock
 ========

 threads  events/ms  run-time ms
       8       6402        12495
      16       7045        22709
      32       7395        43268

 rwlock + xchg
 =============

 threads  events/ms  run-time ms
       8      10038         7969
      16      12178        13138
      32      13223        24199

According to the results, the bandwidth of delivered events increases by
roughly 1.6x to 1.8x, and run time drops accordingly (by about 36% to
44%).

This patch was tested with different sorts of microbenchmarks and with
artificial delays (e.g.  "udelay(get_random_int() & 0xff)") introduced
in the kernel on paths where items are added to lists.

[1] https://github.com/rouming/test-tools/blob/master/stress-epoll.c

Link: http://lkml.kernel.org/r/20190103150104.17128-5-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
arch Driver core patches for 5.1-rc1 2019-03-06 14:52:48 -08:00
block for-linus-20190215 2019-02-15 09:12:28 -08:00
certs kbuild: remove redundant target cleaning on failure 2019-01-06 09:46:51 +09:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-03-05 09:09:55 -08:00
Documentation USB/PHY patches for 5.1-rc1 2019-03-06 16:48:27 -08:00
drivers USB/PHY patches for 5.1-rc1 2019-03-06 16:48:27 -08:00
fs epoll: use rwlock in order to reduce ep_poll_callback() contention 2019-03-07 18:32:01 -08:00
include include/linux/bitops.h: set_mask_bits() to return old value 2019-03-07 18:32:00 -08:00
init Merge branch 'akpm' (patches from Andrew) 2019-03-06 10:31:36 -08:00
ipc y2038: syscalls: rename y2038 compat syscalls 2019-02-07 00:13:27 +01:00
kernel dynamic_debug: add static inline stub for ddebug_add_module 2019-03-07 18:32:00 -08:00
lib lib/test_firmware.c: remove some dead code 2019-03-07 18:32:00 -08:00
LICENSES This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
mm mm,mremap: bail out earlier in mremap_to under map pressure 2019-03-05 21:07:21 -08:00
net Merge branch 'akpm' (patches from Andrew) 2019-03-06 10:31:36 -08:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2019-03-05 08:26:13 -08:00
scripts checkpatch: add test for SPDX-License-Identifier on wrong line # 2019-03-07 18:32:01 -08:00
security get rid of legacy 'get_ds()' function 2019-03-04 10:50:14 -08:00
sound Char/Misc driver patches for 5.1-rc1 2019-03-06 14:18:59 -08:00
tools Staging/IIO patches for 5.1-rc1 2019-03-06 16:29:27 -08:00
usr user/Makefile: Fix typo and capitalization in comment section 2018-12-11 00:18:03 +09:00
virt ACPI updates for 5.1-rc1 2019-03-06 13:33:11 -08:00
.clang-format clang-format: Update .clang-format with the latest for_each macro list 2019-01-19 19:26:06 +01:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore kbuild: Add support for DT binding schema checks 2018-12-13 09:41:32 -06:00
.mailmap .mailmap: Add Mathieu Othacehe 2019-02-21 11:41:19 +00:00
COPYING
CREDITS Char/Misc driver patches for 5.1-rc1 2019-03-06 14:18:59 -08:00
Kbuild Merge branch 'locking/atomics' into locking/core, to pick up WIP commits 2019-02-11 14:27:05 +01:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS USB/PHY patches for 5.1-rc1 2019-03-06 16:48:27 -08:00
Makefile Driver core patches for 5.1-rc1 2019-03-06 14:52:48 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.