linux/net/netfilter
Pablo Neira e687ad60af netfilter: add netfilter ingress hook after handle_ing() under unique static key
This patch adds the Netfilter ingress hook just after the existing tc ingress
hook, which seems to be the consensus solution for this.

Note that the Netfilter hook resides under the global static key that enables
ingress filtering. Netfilter also has its own static key, checked only once
the global key is on, to keep the impact on the existing handle_ing() minimal.

* Without this patch:

Result: OK: 6216490(c6216338+d152) usec, 100000000 (60byte,0frags)
  16086246pps 7721Mb/sec (7721398080bps) errors: 100000000

    42.46%  kpktgend_0   [kernel.kallsyms]   [k] __netif_receive_skb_core
    25.92%  kpktgend_0   [kernel.kallsyms]   [k] kfree_skb
     7.81%  kpktgend_0   [pktgen]            [k] pktgen_thread_worker
     5.62%  kpktgend_0   [kernel.kallsyms]   [k] ip_rcv
     2.70%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_internal
     2.34%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_sk
     1.44%  kpktgend_0   [kernel.kallsyms]   [k] __build_skb

* With this patch:

Result: OK: 6214833(c6214731+d101) usec, 100000000 (60byte,0frags)
  16090536pps 7723Mb/sec (7723457280bps) errors: 100000000

    41.23%  kpktgend_0      [kernel.kallsyms]  [k] __netif_receive_skb_core
    26.57%  kpktgend_0      [kernel.kallsyms]  [k] kfree_skb
     7.72%  kpktgend_0      [pktgen]           [k] pktgen_thread_worker
     5.55%  kpktgend_0      [kernel.kallsyms]  [k] ip_rcv
     2.78%  kpktgend_0      [kernel.kallsyms]  [k] netif_receive_skb_internal
     2.06%  kpktgend_0      [kernel.kallsyms]  [k] netif_receive_skb_sk
     1.43%  kpktgend_0      [kernel.kallsyms]  [k] __build_skb

* Without this patch + tc ingress:

        tc filter add dev eth4 parent ffff: protocol ip prio 1 \
                u32 match ip dst 4.3.2.1/32

Result: OK: 9269001(c9268821+d179) usec, 100000000 (60byte,0frags)
  10788648pps 5178Mb/sec (5178551040bps) errors: 100000000

    40.99%  kpktgend_0   [kernel.kallsyms]  [k] __netif_receive_skb_core
    17.50%  kpktgend_0   [kernel.kallsyms]  [k] kfree_skb
    11.77%  kpktgend_0   [cls_u32]          [k] u32_classify
     5.62%  kpktgend_0   [kernel.kallsyms]  [k] tc_classify_compat
     5.18%  kpktgend_0   [pktgen]           [k] pktgen_thread_worker
     3.23%  kpktgend_0   [kernel.kallsyms]  [k] tc_classify
     2.97%  kpktgend_0   [kernel.kallsyms]  [k] ip_rcv
     1.83%  kpktgend_0   [kernel.kallsyms]  [k] netif_receive_skb_internal
     1.50%  kpktgend_0   [kernel.kallsyms]  [k] netif_receive_skb_sk
     0.99%  kpktgend_0   [kernel.kallsyms]  [k] __build_skb

* With this patch + tc ingress:

        tc filter add dev eth4 parent ffff: protocol ip prio 1 \
                u32 match ip dst 4.3.2.1/32

Result: OK: 9308218(c9308091+d126) usec, 100000000 (60byte,0frags)
  10743194pps 5156Mb/sec (5156733120bps) errors: 100000000

    42.01%  kpktgend_0   [kernel.kallsyms]   [k] __netif_receive_skb_core
    17.78%  kpktgend_0   [kernel.kallsyms]   [k] kfree_skb
    11.70%  kpktgend_0   [cls_u32]           [k] u32_classify
     5.46%  kpktgend_0   [kernel.kallsyms]   [k] tc_classify_compat
     5.16%  kpktgend_0   [pktgen]            [k] pktgen_thread_worker
     2.98%  kpktgend_0   [kernel.kallsyms]   [k] ip_rcv
     2.84%  kpktgend_0   [kernel.kallsyms]   [k] tc_classify
     1.96%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_internal
     1.57%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_sk

Note that the results are very similar before and after: the packet rate
changes by about +0.03% without tc ingress and about -0.4% with it.

I can see gcc gets the code under the ingress static key out of the hot path.
Then, on that cold branch, it generates the code to accommodate the netfilter
ingress static key. My explanation for this is that this reduces the pressure
on the instruction cache for non-users as the new code is out of the hot path,
and it comes with minimal impact for tc ingress users.

Using gcc version 4.8.4 on:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
[...]
L1d cache:             16K
L1i cache:             64K
L2 cache:              2048K
L3 cache:              8192K

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-14 01:10:05 -04:00
..
ipset netfilter: bridge: add helpers for fetching physin/outdev 2015-04-08 16:49:08 +02:00
ipvs net: Modify sk_alloc to not reference count the netns of kernel sockets. 2015-05-11 10:50:18 -04:00
core.c netfilter: add netfilter ingress hook after handle_ing() under unique static key 2015-05-14 01:10:05 -04:00
Kconfig netfilter: add netfilter ingress hook after handle_ing() under unique static key 2015-05-14 01:10:05 -04:00
Makefile netfilter: nf_tables: add support for dynamic set updates 2015-04-08 16:58:27 +02:00
nf_conntrack_acct.c netfilter: Remove uses of seq_<foo> return values 2015-03-18 10:51:35 +01:00
nf_conntrack_amanda.c net: Remove state argument from skb_find_text() 2015-02-22 15:59:54 -05:00
nf_conntrack_broadcast.c
nf_conntrack_core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2015-01-15 01:50:25 -05:00
nf_conntrack_ecache.c netfilter: conntrack: remove timer from ecache extension 2014-06-25 19:15:38 +02:00
nf_conntrack_expect.c netfilter: Remove uses of seq_<foo> return values 2015-03-18 10:51:35 +01:00
nf_conntrack_extend.c netfilter: nf_ct_ext: support variable length extensions 2012-06-16 15:08:49 +02:00
nf_conntrack_ftp.c netfilter: replace strnicmp with strncasecmp 2014-10-14 02:18:24 +02:00
nf_conntrack_h323_asn1.c
nf_conntrack_h323_main.c netfilter: nf_conntrack_h323: lookup route from proper net namespace 2014-11-17 12:47:14 +01:00
nf_conntrack_h323_types.c
nf_conntrack_helper.c netfilter: fix spelling errors 2014-10-30 17:35:30 +01:00
nf_conntrack_irc.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_l3proto_generic.c netfilter: Convert print_tuple functions to return void 2014-11-05 14:10:33 -05:00
nf_conntrack_labels.c netfilter: connlabels: remove unneeded includes 2013-07-31 16:39:18 +02:00
nf_conntrack_netbios_ns.c
nf_conntrack_netlink.c netfilter: conntrack: Flush connections with a given mark 2015-01-08 12:14:20 +01:00
nf_conntrack_pptp.c netfilter: nf_conntrack: flush net_gre->keymap_list only from gre helper 2014-04-08 10:56:12 +02:00
nf_conntrack_proto.c netfilter: nf_conntrack: remove dead code 2014-01-03 23:41:37 +01:00
nf_conntrack_proto_dccp.c netfilter: Convert print_tuple functions to return void 2014-11-05 14:10:33 -05:00
nf_conntrack_proto_generic.c netfilter: Convert print_tuple functions to return void 2014-11-05 14:10:33 -05:00
nf_conntrack_proto_gre.c netfilter: Convert print_tuple functions to return void 2014-11-05 14:10:33 -05:00
nf_conntrack_proto_sctp.c netfilter: Convert print_tuple functions to return void 2014-11-05 14:10:33 -05:00
nf_conntrack_proto_tcp.c Merge branch 'iov_iter' into for-next 2014-12-08 20:39:29 -05:00
nf_conntrack_proto_udp.c netfilter: Convert print_tuple functions to return void 2014-11-05 14:10:33 -05:00
nf_conntrack_proto_udplite.c netfilter: Convert print_tuple functions to return void 2014-11-05 14:10:33 -05:00
nf_conntrack_sane.c netfilter: nf_ct_helper: better logging for dropped packets 2013-02-19 02:48:05 +01:00
nf_conntrack_seqadj.c netfilter: nf_ct_seqadj: print ack seq in the right host byte order 2015-01-05 13:52:20 +01:00
nf_conntrack_sip.c netfilter: replace strnicmp with strncasecmp 2014-10-14 02:18:24 +02:00
nf_conntrack_snmp.c netfilter: nf_ct_snmp: add include file 2013-01-18 00:28:18 +01:00
nf_conntrack_standalone.c netfilter: Remove checks of seq_printf() return values 2014-11-05 14:11:02 -05:00
nf_conntrack_tftp.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_timeout.c netfilter: nf_ct_timeout: move initialization out of pernet_operations 2013-01-23 12:56:02 +01:00
nf_conntrack_timestamp.c netfilter: nf_ct_timestamp: Fix BUG_ON after netns deletion 2013-12-20 14:58:29 +01:00
nf_internals.h netfilter: Create and use nf_hook_state. 2015-04-04 12:17:40 -04:00
nf_log.c netfilter: restore rule tracing via nfnetlink_log 2015-03-19 11:14:48 +01:00
nf_log_common.c netfilter: bridge: add helpers for fetching physin/outdev 2015-04-08 16:49:08 +02:00
nf_nat_amanda.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_nat_core.c net: use reciprocal_scale() helper 2014-08-23 12:21:21 -07:00
nf_nat_ftp.c netfilter: nf_ct_helper: better logging for dropped packets 2013-02-19 02:48:05 +01:00
nf_nat_helper.c netfilter: nf_conntrack: make sequence number adjustments usuable without NAT 2013-08-28 00:26:48 +02:00
nf_nat_irc.c netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper 2014-01-06 14:17:17 +01:00
nf_nat_proto_common.c netfilter: use IS_ENABLED() macro 2014-06-30 11:38:03 +02:00
nf_nat_proto_dccp.c netfilter: use IS_ENABLED() macro 2014-06-30 11:38:03 +02:00
nf_nat_proto_sctp.c netfilter: use IS_ENABLED() macro 2014-06-30 11:38:03 +02:00
nf_nat_proto_tcp.c netfilter: use IS_ENABLED() macro 2014-06-30 11:38:03 +02:00
nf_nat_proto_udp.c netfilter: use IS_ENABLED() macro 2014-06-30 11:38:03 +02:00
nf_nat_proto_udplite.c netfilter: use IS_ENABLED() macro 2014-06-30 11:38:03 +02:00
nf_nat_proto_unknown.c netfilter: add protocol independent NAT core 2012-08-30 03:00:14 +02:00
nf_nat_redirect.c netfilter: combine IPv4 and IPv6 nf_nat_redirect code in one module 2014-11-27 13:08:42 +01:00
nf_nat_sip.c netfilter: replace strnicmp with strncasecmp 2014-10-14 02:18:24 +02:00
nf_nat_tftp.c netfilter: nf_ct_helper: better logging for dropped packets 2013-02-19 02:48:05 +01:00
nf_queue.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-04-08 18:30:21 +02:00
nf_sockopt.c netfilter: don't use mutex_lock_interruptible() 2014-08-08 16:47:23 +02:00
nf_synproxy_core.c netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt 2014-02-05 17:46:06 +01:00
nf_tables_api.c netfilter: nf_tables: fix wrong length for jump/goto verdicts 2015-04-24 20:51:23 +02:00
nf_tables_core.c netfilter: nf_tables: switch registers to 32 bit addressing 2015-04-13 17:17:29 +02:00
nf_tables_inet.c netfilter: nf_tables: fix error path in the init functions 2014-01-09 23:25:48 +01:00
nfnetlink.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2015-01-15 01:50:25 -05:00
nfnetlink_acct.c netfilter: nfnetlink_acct: add filter support to nfacct counter list/reset 2014-08-26 21:36:19 +02:00
nfnetlink_cthelper.c netfilter: Zero the tuple in nfnl_cthelper_parse_tuple() 2015-03-12 13:07:36 +01:00
nfnetlink_cttimeout.c netfilter: cttimeout: allow to set/get default protocol timeouts 2013-10-01 13:17:39 +02:00
nfnetlink_log.c netfilter: Fix format string of nfnetlink_log proc file 2015-04-13 16:35:17 -04:00
nfnetlink_queue_core.c netfilter: Fix format string of nfnetlink_queue proc file 2015-04-13 16:35:16 -04:00
nfnetlink_queue_ct.c netfilter: nf_conntrack: make sequence number adjustments usuable without NAT 2013-08-28 00:26:48 +02:00
nft_bitwise.c netfilter: nf_tables: support variable sized data in nft_data_init() 2015-04-13 17:17:30 +02:00
nft_byteorder.c netfilter: nf_tables: switch registers to 32 bit addressing 2015-04-13 17:17:29 +02:00
nft_cmp.c netfilter: nf_tables: support variable sized data in nft_data_init() 2015-04-13 17:17:30 +02:00
nft_compat.c netfilter: nf_tables: get rid of NFT_REG_VERDICT usage 2015-04-13 17:17:07 +02:00
nft_counter.c netfilter: nf_tables: mark stateful expressions 2015-04-13 20:12:31 +02:00
nft_ct.c netfilter: nf_tables: switch registers to 32 bit addressing 2015-04-13 17:17:29 +02:00
nft_dynset.c netfilter: nft_dynset: dynamic stateful expression instantiation 2015-04-13 20:19:55 +02:00
nft_exthdr.c netfilter: nf_tables: switch registers to 32 bit addressing 2015-04-13 17:17:29 +02:00
nft_hash.c netfilter: nf_tables: variable sized set element keys / data 2015-04-13 17:17:31 +02:00
nft_immediate.c netfilter: nf_tables: support variable sized data in nft_data_init() 2015-04-13 17:17:30 +02:00
nft_limit.c netfilter: nf_tables: mark stateful expressions 2015-04-13 20:12:31 +02:00
nft_log.c netfilter: nf_tables: get rid of NFT_REG_VERDICT usage 2015-04-13 17:17:07 +02:00
nft_lookup.c netfilter: nf_tables: add flag to indicate set contains expressions 2015-04-13 20:12:32 +02:00
nft_masq.c netfilter: nf_tables: validate hooks in NAT expressions 2015-01-19 14:52:39 +01:00
nft_meta.c netfilter: nf_tables: switch registers to 32 bit addressing 2015-04-13 17:17:29 +02:00
nft_nat.c netfilter: nf_tables: switch registers to 32 bit addressing 2015-04-13 17:17:29 +02:00
nft_payload.c netfilter: nf_tables: switch registers to 32 bit addressing 2015-04-13 17:17:29 +02:00
nft_queue.c netfilter: nf_tables: get rid of NFT_REG_VERDICT usage 2015-04-13 17:17:07 +02:00
nft_rbtree.c netfilter: nf_tables: variable sized set element keys / data 2015-04-13 17:17:31 +02:00
nft_redir.c netfilter: nf_tables: add register parsing/dumping helpers 2015-04-13 17:17:28 +02:00
nft_reject.c netfilter; Add some missing default cases to switch statements in nft_reject. 2015-04-27 13:20:34 -04:00
nft_reject_inet.c netfilter; Add some missing default cases to switch statements in nft_reject. 2015-04-27 13:20:34 -04:00
x_tables.c netfilter: Remove checks of seq_printf() return values 2014-11-05 14:11:02 -05:00
xt_addrtype.c netfilter: xt_addrtype: fix trivial typo 2013-07-31 16:36:25 +02:00
xt_AUDIT.c netfilter: Convert uses of __constant_<foo> to <foo> 2014-03-13 14:13:19 +01:00
xt_bpf.c net: filter: split 'struct sk_filter' into socket and bpf parts 2014-08-02 15:03:58 -07:00
xt_cgroup.c netfilter: x_tables: fix cgroup matching on non-full sks 2015-04-01 11:26:42 +02:00
xt_CHECKSUM.c
xt_CLASSIFY.c
xt_cluster.c net: use reciprocal_scale() helper 2014-08-23 12:21:21 -07:00
xt_comment.c
xt_connbytes.c netfilter: Convert pr_warning to pr_warn 2014-09-10 12:40:10 -07:00
xt_connlabel.c netfilter: add connlabel conntrack extension 2013-01-18 00:28:15 +01:00
xt_connlimit.c netfilter: xt_connlimit: honor conntrack zone if available 2014-11-17 12:44:20 +01:00
xt_connmark.c netfilter: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
xt_CONNSECMARK.c
xt_conntrack.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
xt_cpu.c
xt_CT.c netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt 2014-02-05 17:46:06 +01:00
xt_dccp.c
xt_devgroup.c
xt_dscp.c
xt_DSCP.c netfilter: fix various sparse warnings 2014-11-13 12:14:42 +01:00
xt_ecn.c netfilter: xtables: collapse conditions in xt_ecn 2011-12-27 20:45:25 +01:00
xt_esp.c
xt_hashlimit.c netfilter: Remove checks of seq_printf() return values 2014-11-05 14:11:02 -05:00
xt_helper.c
xt_hl.c netfilter: Reduce switch/case indent 2011-07-01 16:11:15 -07:00
xt_HL.c netfilter: Reduce switch/case indent 2011-07-01 16:11:15 -07:00
xt_HMARK.c net: use reciprocal_scale() helper 2014-08-23 12:21:21 -07:00
xt_IDLETIMER.c netfilter: Remove unnecessary OOM logging messages 2011-11-01 09:19:49 +01:00
xt_ipcomp.c netfilter: xt_ipcomp: Use ntohs to ease sparse warning 2014-02-19 11:41:25 +01:00
xt_iprange.c
xt_ipvs.c ipvs: API change to avoid rescan of IPv6 exthdr 2012-09-28 11:34:33 +09:00
xt_l2tp.c netfilter: introduce l2tp match extension 2014-01-09 21:36:39 +01:00
xt_LED.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-08-05 18:46:26 -07:00
xt_length.c
xt_limit.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
xt_LOG.c netfilter: xt_LOG: add missing string format in nf_log_packet() 2014-06-28 18:50:35 +02:00
xt_mac.c netfilter: Convert compare_ether_addr to ether_addr_equal 2012-05-09 20:49:18 -04:00
xt_mark.c
xt_multiport.c
xt_nat.c netfilter: xt_nat: fix incorrect hooks for SNAT and DNAT targets 2012-10-15 13:39:12 +02:00
xt_NETMAP.c netfilter: combine ipt_NETMAP and ip6t_NETMAP 2012-09-21 12:11:08 +02:00
xt_nfacct.c netfilter: nfnetlink_acct: Adding quota support to accounting framework 2014-04-29 18:25:14 +02:00
xt_NFLOG.c netfilter: log: netns NULL ptr bug when calling from conntrack 2013-05-15 14:11:07 +02:00
xt_NFQUEUE.c netfilter: xt_NFQUEUE: separate reusable code 2013-12-07 23:20:45 +01:00
xt_osf.c netfilter: xt_osf: Use continue to reduce indentation 2014-12-23 14:20:10 +01:00
xt_owner.c userns: xt_owner: Add basic user namespace support. 2012-08-14 21:55:30 -07:00
xt_physdev.c netfilter: physdev: use helpers 2015-04-08 16:49:09 +02:00
xt_pkttype.c
xt_policy.c
xt_quota.c net: Fix files explicitly needing to include module.h 2011-10-31 19:30:28 -04:00
xt_RATEEST.c net: sched: make bstats per cpu and estimator RCU safe 2014-09-30 01:02:26 -04:00
xt_rateest.c net_sched: add 64bit rate estimators 2013-06-11 02:51:03 -07:00
xt_realm.c
xt_recent.c netfilter: xt_recent: don't reject rule if new hitcount exceeds table max 2015-02-16 17:00:47 +01:00
xt_REDIRECT.c netfilter: combine IPv4 and IPv6 nf_nat_redirect code in one module 2014-11-27 13:08:42 +01:00
xt_repldata.h net: netfilter: LLVMLinux: vlais-netfilter 2014-06-07 11:44:39 -07:00
xt_sctp.c
xt_SECMARK.c
xt_set.c netfilter: ipset: fix boolreturn.cocci warnings 2015-02-11 16:13:30 +01:00
xt_socket.c netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match 2015-04-08 16:47:49 +02:00
xt_state.c
xt_statistic.c net: replace macros net_random and net_srandom with direct calls to prandom 2014-01-14 15:15:25 -08:00
xt_string.c net: Remove state argument from skb_find_text() 2015-02-22 15:59:54 -05:00
xt_TCPMSS.c netfilter: xt_TCPMSS: lookup route from proper net namespace 2013-09-27 16:18:23 +02:00
xt_tcpmss.c
xt_TCPOPTSTRIP.c netfilter: xt_TCPOPTSTRIP: fix possible off by one access 2013-08-01 11:45:15 +02:00
xt_tcpudp.c
xt_TEE.c net: pass info struct via netdevice notifier 2013-05-28 13:11:01 -07:00
xt_time.c netfilter: xt_time: add support to ignore day transition 2012-09-24 14:29:01 +02:00
xt_TPROXY.c tcp/dccp: get rid of central timewait timer 2015-04-13 16:40:05 -04:00
xt_TRACE.c
xt_u32.c