linux/net
Raghavendra K T a3a773726c net: Optimize snmp stat aggregation by walking all the percpu data at once
Docker container creation linearly increased from around 1.6 sec to 7.5 sec
(at 1000 containers) and perf data showed 50% ovehead in snmp_fold_field.

reason: currently __snmp6_fill_stats64 calls snmp_fold_field that walks
through per cpu data of an item (iteratively for around 36 items).

idea: This patch tries to aggregate the statistics by going through
all the items of each cpu sequentially which is reducing cache
misses.

Docker creation got faster by more than 2x after the patch.

Result:
                       Before           After
Docker creation time   6.836s           3.25s
cache miss             2.7%             1.41%

perf before:
    50.73%  docker           [kernel.kallsyms]       [k] snmp_fold_field
     9.07%  swapper          [kernel.kallsyms]       [k] snooze_loop
     3.49%  docker           [kernel.kallsyms]       [k] veth_stats_one
     2.85%  swapper          [kernel.kallsyms]       [k] _raw_spin_lock

perf after:
    10.57%  docker           docker                [.] scanblock
     8.37%  swapper          [kernel.kallsyms]     [k] snooze_loop
     6.91%  docker           [kernel.kallsyms]     [k] snmp_get_cpu_field
     6.67%  docker           [kernel.kallsyms]     [k] veth_stats_one

changes/ideas suggested:
Using buffer in stack (Eric), Usage of memset (David), Using memcpy in
place of unaligned_put (Joe).

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-30 21:48:58 -07:00
..
6lowpan
9p 9p: ensure err is initialized to 0 in p9_client_read/write 2015-08-22 21:35:02 -04:00
802
8021q net: 8021q: convert to using IFF_NO_QUEUE 2015-08-18 11:55:06 -07:00
appletalk
atm
ax25
batman-adv batman-adv: turn batadv_neigh_node_get() into local function 2015-08-27 20:15:34 +02:00
bluetooth Bluetooth: Fix SCO link type handling on connection complete 2015-08-28 21:03:00 +02:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2015-08-28 16:29:59 -07:00
caif net: caif: convert to using IFF_NO_QUEUE 2015-08-18 11:55:07 -07:00
can
ceph
core ip_tunnels: record IP version in tunnel info 2015-08-29 13:07:54 -07:00
dcb
dccp
decnet
dns_resolver
dsa net: dsa: Allow multi hop routes to be expressed 2015-08-18 14:17:21 -07:00
ethernet
hsr net: hsr: convert to using IFF_NO_QUEUE 2015-08-18 11:55:07 -07:00
ieee802154 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2015-08-29 13:15:03 -07:00
ipv4 net: Introduce helper functions to get the per cpu data 2015-08-30 21:48:58 -07:00
ipv6 net: Optimize snmp stat aggregation by walking all the percpu data at once 2015-08-30 21:48:58 -07:00
ipx
irda
iucv
key net: Fix RCU splat in af_key 2015-08-24 14:48:10 -07:00
l2tp
lapb
llc
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-21 11:44:04 -07:00
mac802154
mpls lwt: Add cfg argument to build_state 2015-08-24 10:34:40 -07:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2015-08-28 16:29:59 -07:00
netlabel
netlink netlink: mmap: fix lookup frame position 2015-08-28 22:25:42 -07:00
netrom
nfc nfc: netlink: Add capability to reply to vendor_cmd with data 2015-08-20 22:00:11 +02:00
openvswitch openvswitch: Remove vport-net 2015-08-29 19:07:15 -07:00
packet
phonet
rds RDS: remove superfluous from rds_ib_alloc_fmr() 2015-08-25 16:28:11 -07:00
rfkill Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2015-08-17 15:41:21 -07:00
rose
rxrpc
sched net: sched: don't break line in tc_classify loop notification 2015-08-28 14:01:15 -07:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-30 21:45:01 -07:00
sunrpc
switchdev
tipc tipc: fix stale link problem during synchronization 2015-08-23 16:14:45 -07:00
unix
vmw_vsock
wimax
wireless
x25
xfrm
compat.c
Kconfig
Makefile
socket.c
sysctl_net.c