linux/net/ipv6
Florian Westphal f7b3bec6f5 net: allow setting ecn via routing table
This patch allows to set ECN on a per-route basis in case the sysctl
tcp_ecn is not set to 1. In other words, when ECN is set for specific
routes, it provides a tcp_ecn=1 behaviour for that route while the rest
of the stack acts according to the global settings.

One can use 'ip route change dev $dev $net features ecn' to toggle this.

Having a more fine-grained per-route setting can be beneficial for various
reasons, for example, 1) within data centers, or 2) local ISPs may deploy
ECN support for their own video/streaming services [1], etc.

There was a recent measurement study/paper [2] which scanned the Alexa's
publicly available top million websites list from a vantage point in US,
Europe and Asia:

Half of the Alexa list will now happily use ECN (tcp_ecn=2, most likely
blamed to commit 255cac91c3 ("tcp: extend ECN sysctl to allow server-side
only ECN") ;)); the break in connectivity on-path was found is about
1 in 10,000 cases. Timeouts rather than receiving back RSTs were much
more common in the negotiation phase (and mostly seen in the Alexa
middle band, ranks around 50k-150k): from 12-thousand hosts on which
there _may_ be ECN-linked connection failures, only 79 failed with RST
when _not_ failing with RST when ECN is not requested.

It's unclear though, how much equipment in the wild actually marks CE
when buffers start to fill up.

We thought about a fallback to non-ECN for retransmitted SYNs as another
global option (which could perhaps one day be made default), but as Eric
points out, there's much more work needed to detect broken middleboxes.

Two examples Eric mentioned are buggy firewalls that accept only a single
SYN per flow, and middleboxes that successfully let an ECN flow establish,
but later mark CE for all packets (so cwnd converges to 1).

 [1] http://www.ietf.org/proceedings/89/slides/slides-89-tsvarea-1.pdf, p.15
 [2] http://ecn.ethz.ch/

Joint work with Daniel Borkmann.

Reference: http://thread.gmane.org/gmane.linux.network/335797
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04 16:06:09 -05:00
..
netfilter netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions 2014-10-31 12:49:57 +01:00
addrconf.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-11-01 14:53:27 -04:00
addrconf_core.c ipv6: remove rt6i_genid 2014-09-30 14:00:48 -04:00
addrlabel.c list: fix order of arguments for hlist_add_after(_rcu) 2014-08-06 18:01:24 -07:00
af_inet6.c ipv6: make fib6 serial number per namespace 2014-10-07 00:02:30 -04:00
ah6.c ipsec: Remove obsolete MAX_AH_AUTH_LEN 2014-09-18 10:54:36 +02:00
anycast.c ipv6: remove aca_lock spinlock from struct ifacaddr6 2014-10-14 13:15:15 -04:00
datagram.c sock: deduplicate errqueue dequeue 2014-09-01 21:49:08 -07:00
esp6.c ipv6: White-space cleansing : Structure layouts 2014-08-24 22:37:52 -07:00
exthdrs.c ipv6: include linux/uaccess.h instead of asm/uaccess.h 2014-10-27 16:03:52 -04:00
exthdrs_core.c ipv6: ipv6_find_hdr restore prev functionality 2014-02-27 18:27:26 -05:00
exthdrs_offload.c ipv6: Fix exthdrs offload registration. 2014-03-06 16:35:55 -05:00
fib6_rules.c ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helper 2014-01-15 15:53:18 -08:00
icmp.c net: Remove trailing whitespace in tcp.h icmp.c syncookies.c 2014-10-24 00:13:10 -04:00
inet6_connection_sock.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
inet6_hashtables.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
ip6_checksum.c udp: Generic functions to set checksum 2014-06-04 22:46:38 -07:00
ip6_fib.c ipv6: don't walk node's leaf during serial number update 2014-10-07 00:02:30 -04:00
ip6_flowlabel.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
ip6_gre.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-10-08 16:22:22 -04:00
ip6_icmp.c ipv6: White-space cleansing : Line Layouts 2014-08-24 22:37:52 -07:00
ip6_input.c ipv6: White-space cleansing : Line Layouts 2014-08-24 22:37:52 -07:00
ip6_offload.c net: gso: use feature flag argument in all protocol gso handlers 2014-10-20 12:38:12 -04:00
ip6_offload.h
ip6_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-09-23 12:09:27 -04:00
ip6_tunnel.c ip6_tunnel: allow to change mode for the ip6tnl0 2014-10-30 16:09:20 -04:00
ip6_udp_tunnel.c udp: Need to make ip6_udp_tunnel.c have GPL license 2014-09-22 15:08:25 -04:00
ip6_vti.c net: better IFF_XMIT_DST_RELEASE support 2014-10-07 13:22:11 -04:00
ip6mr.c ipv6: spelling s/incomming/incoming 2014-10-30 15:51:43 -04:00
ipcomp6.c ipv6: White-space cleansing : Structure layouts 2014-08-24 22:37:52 -07:00
ipv6_sockglue.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
Kconfig ip6_vti: Fix build when NET_IP_TUNNEL is not set. 2014-02-20 14:29:49 +01:00
Makefile udp_tunnel: Only build ip6_udp_tunnel.c when IPV6 is selected 2014-09-19 22:05:28 -04:00
mcast.c ipv6: mld: answer mldv2 queries with mldv1 reports in mldv1 fallback 2014-09-22 16:23:15 -04:00
mip6.c ipv6: White-space cleansing : Structure layouts 2014-08-24 22:37:52 -07:00
ndisc.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
netfilter.c netfilter: Fix potential use after free in ip6_route_me_harder() 2014-05-09 02:36:39 +02:00
output_core.c drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets 2014-10-30 20:01:18 -04:00
ping.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
proc.c ipv6: White-space cleansing : Line Layouts 2014-08-24 22:37:52 -07:00
protocol.c net: Export inet_offloads and inet6_offloads 2014-09-19 17:15:31 -04:00
raw.c ipv6: remove assignment in if condition 2014-10-30 15:51:43 -04:00
reassembly.c ipv6: remove inline on static in c file 2014-10-30 15:51:43 -04:00
route.c ipv6: Avoid redoing fib6_lookup() with reachable = 0 by saving fn 2014-10-24 00:14:39 -04:00
sit.c ipv6: fix a potential use after free in sit.c 2014-10-18 13:04:09 -04:00
syncookies.c net: allow setting ecn via routing table 2014-11-04 16:06:09 -05:00
sysctl_net_ipv6.c ipv6: add sysctl_mld_qrv to configure query robustness variable 2014-09-04 22:26:14 -07:00
tcp_ipv6.c net: fix saving TX flow hash in sock for outgoing connections 2014-10-22 16:14:29 -04:00
tcpv6_offload.c net: Remove gso_send_check as an offload callback 2014-09-26 00:22:47 -04:00
tunnel6.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
udp.c udp: Add support for doing checksum unnecessary conversion 2014-09-01 21:36:28 -07:00
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c fou: eliminate IPv4,v6 specific GRO functions 2014-10-03 16:53:32 -07:00
udplite.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
xfrm6_input.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
xfrm6_mode_beet.c
xfrm6_mode_ro.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
xfrm6_mode_transport.c
xfrm6_mode_tunnel.c xfrm6: Remove xfrm_tunnel_notifier 2014-03-14 07:28:08 +01:00
xfrm6_output.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00
xfrm6_policy.c xfrm6: fix a potential use after free in xfrm6_policy.c 2014-10-22 15:38:48 -04:00
xfrm6_protocol.c xfrm6: Properly handle unsupported protocols 2014-05-06 07:08:38 +02:00
xfrm6_state.c ipv6: White-space cleansing : Line Layouts 2014-08-24 22:37:52 -07:00
xfrm6_tunnel.c ipv6: White-space cleansing : gaps between function and symbol export 2014-08-24 22:37:52 -07:00