linux/net
Marcelo Ricardo Leitner 311b21774f sctp: simplify sk_receive_queue locking
SCTP already serializes access to rcvbuf through its sock lock:
sctp_recvmsg takes it right in the start and release at the end, while
rx path will also take the lock before doing any socket processing. On
sctp_rcv() it will check if there is an user using the socket and, if
there is, it will queue incoming packets to the backlog. The backlog
processing will do the same. Even timers will do such check and
re-schedule if an user is using the socket.

Simplifying this will allow us to remove sctp_skb_list_tail and get ride
of some expensive lockings.  The lists that it is used on are also
mangled with functions like __skb_queue_tail and __skb_unlink in the
same context, like on sctp_ulpq_tail_event() and sctp_clear_pd().
sctp_close() will also purge those while using only the sock lock.

Therefore the lockings performed by sctp_skb_list_tail() are not
necessary. This patch removes this function and replaces its calls with
just skb_queue_splice_tail_init() instead.

The biggest gain is at sctp_ulpq_tail_event(), because the events always
contain a list, even if it's queueing a single skb and this was
triggering expensive calls to spin_lock_irqsave/_irqrestore for every
data chunk received.

As SCTP will deliver each data chunk on a corresponding recvmsg, the
more effective the change will be.
Before this patch, with chunks with 30 bytes:
netperf -t SCTP_STREAM -H 192.168.1.2 -cC -l 60 -- -m 30 -S 400000
400000 -s 400000 400000
on a 10Gbit link with 1500 MTU:

SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

425984 425984     30    60.00       137.45   7.34     7.36     52.504  52.608

With it:

SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

425984 425984     30    60.00       179.10   7.97     6.70     43.740  36.788

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:22:20 -04:00
..
6lowpan 6lowpan: iphc: fix handling of link-local compression 2016-04-08 19:28:13 +02:00
9p net/9p: convert to new CQ API 2016-03-10 20:54:09 -05:00
802
8021q vlan: propagate gso_max_segs 2016-03-17 21:05:01 -04:00
appletalk appletalk: fix erroneous return value 2016-02-18 14:59:34 -05:00
atm net: Generalise wq_has_sleeper helper 2015-11-30 14:47:33 -05:00
ax25 ax25: add link layer header validation function 2016-03-09 22:13:01 -05:00
batman-adv batman-adv: clarify CFG80211 dependency 2016-03-02 13:45:47 -05:00
bluetooth sock: tigthen lockdep checks for sock_owned_by_user 2016-04-13 22:37:20 -04:00
bridge bridge: a netlink notification should be sent when those attributes are changed by ioctl 2016-04-13 22:42:33 -04:00
caif net: caif: fix misleading indentation 2016-03-14 13:09:50 -04:00
can sock: enable timestamping using control messages 2016-04-04 15:50:30 -04:00
ceph mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
core bpf: convert relevant helper args to ARG_PTR_TO_RAW_STACK 2016-04-14 21:40:41 -04:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp net: introduce lockdep_is_held and update various places to use it 2016-04-07 16:44:14 -04:00
decnet net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa dsa: Rename phys_port_mask to enabled_port_mask 2016-04-13 18:15:23 -04:00
ethernet eth: Pull header from first fragment via eth_get_headlen 2016-02-24 13:58:05 -05:00
hsr net/hsr: Added support for HSR v1 2016-04-15 17:06:48 -04:00
ieee802154 ieee802154: 6lowpan: fix return of netdev notifier 2016-02-23 20:29:40 +01:00
ipv4 tcp: remove false sharing in tcp_rcv_state_process() 2016-04-15 16:45:44 -04:00
ipv6 tcp: do not mess with listener sk_wmem_alloc 2016-04-15 16:45:44 -04:00
ipx
irda Merge 4.5-rc4 into tty-next 2016-02-14 14:36:04 -08:00
iucv af_iucv: Validate socket address length in iucv_sock_bind() 2016-01-19 14:21:08 -05:00
kcm kcm: Add receive message timeout 2016-03-09 16:36:15 -05:00
key af_key: fix two typos 2015-10-23 03:05:19 -07:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-04-09 17:41:41 -04:00
l3mdev net: l3mdev: address selection should only consider devices in L3 domain 2016-02-26 14:22:26 -05:00
lapb
llc sock: tigthen lockdep checks for sock_owned_by_user 2016-04-13 22:37:20 -04:00
mac80211 cfg80211: remove enum ieee80211_band 2016-04-12 15:56:15 +02:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
mpls GSO: Add GSO type for fixed IPv4 ID 2016-04-14 16:23:40 -04:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2016-04-12 22:34:56 -04:00
netlabel netlabel: do not initialise statics to NULL 2016-03-07 11:08:26 -05:00
netlink rhashtable: accept GFP flags in rhashtable_walk_init 2016-04-05 10:56:32 +02:00
netrom netfilter: Remove spurios included of netfilter.h 2015-06-18 21:14:32 +02:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2016-03-28 15:38:59 -04:00
packet packet: uses kfree_skb() for errors. 2016-04-14 17:50:44 -04:00
phonet sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
rds RDS: fix congestion map corruption for PAGE_SIZE > 4k 2016-04-07 16:58:28 -04:00
rfkill rfkill: Use switch to demux userspace operations 2016-04-05 10:48:53 +02:00
rose Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-24 02:58:51 -07:00
rxrpc rxrpc: Create a null security type and get rid of conditional calls 2016-04-11 15:34:41 -04:00
sched qdisc: constify meta_type_ops structures 2016-04-14 00:35:30 -04:00
sctp sctp: simplify sk_receive_queue locking 2016-04-15 17:22:20 -04:00
sunrpc sock: tigthen lockdep checks for sock_owned_by_user 2016-04-13 22:37:20 -04:00
switchdev switchdev: fix typo in comments/doc 2016-03-24 14:51:24 -04:00
tipc tipc: let first message on link be a state message 2016-04-15 16:09:06 -04:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
vmw_vsock VSOCK: Detach QP check should filter out non matching QPs. 2016-04-06 16:39:09 -04:00
wimax net:wimax: Fix doucble word "the the" in networking.xml 2015-08-09 22:43:52 -07:00
wireless cfg80211: remove enum ieee80211_band 2016-04-12 15:56:15 +02:00
x25
xfrm xfrm: Fix crash observed during device unregistration and decryption 2016-03-24 14:29:36 -04:00
compat.c
Kconfig Make DST_CACHE a silent config option 2016-03-21 22:56:38 -04:00
Makefile kcm: Kernel Connection Multiplexor module 2016-03-09 16:36:14 -05:00
socket.c Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-04-14 00:39:15 -04:00
sysctl_net.c net: sysctl: fix a kmemleak warning 2015-10-23 06:22:08 -07:00