linux/net
Pablo Neira Ayuso e5eaac2beb netfilter: flowtable: fix TCP flow teardown
This patch addresses three possible problems:

1. ct gc may race to undo the timeout adjustment of the packet path, leaving
   the conntrack entry in place with the internal offload timeout (one day).

2. ct gc removes the ct because the IPS_OFFLOAD_BIT is not set and the CLOSE
   timeout is reached before the flow offload del.

3. tcp ct is always set to ESTABLISHED with a very long timeout
   in flow offload teardown/delete even though the state might be already
   CLOSED. Also as a remark we cannot assume that the FIN or RST packet
   is hitting flow table teardown as the packet might get bumped to the
   slow path in nftables.

This patch resets IPS_OFFLOAD_BIT from flow_offload_teardown(), so
conntrack handles the tcp rst/fin packet which triggers the CLOSE/FIN
state transition.

Moreover, teturn the connection's ownership to conntrack upon teardown
by clearing the offload flag and fixing the established timeout value.
The flow table GC thread will asynchonrnously free the flow table and
hardware offload entries.

Before this patch, the IPS_OFFLOAD_BIT remained set for expired flows on
which is also misleading since the flow is back to classic conntrack
path.

If nf_ct_delete() removes the entry from the conntrack table, then it
calls nf_ct_put() which decrements the refcnt. This is not a problem
because the flowtable holds a reference to the conntrack object from
flow_offload_alloc() path which is released via flow_offload_free().

This patch also updates nft_flow_offload to skip packets in SYN_RECV
state. Since we might miss or bump packets to slow path, we do not know
what will happen there while we are still in SYN_RECV, this patch
postpones offload up to the next packet which also aligns to the
existing behaviour in tc-ct.

flow_offload_teardown() does not reset the existing tcp state from
flow_offload_fixup_tcp() to ESTABLISHED anymore, packets bump to slow
path might have already update the state to CLOSE/FIN.

Joint work with Oz and Sven.

Fixes: 1e5b2471bc ("netfilter: nf_flow_table: teardown flow timeout race")
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-18 17:34:26 +02:00
..
6lowpan net: don't include ndisc.h from ipv6.h 2022-02-04 14:15:11 -08:00
9p xen/grant-table: remove readonly parameter from functions 2022-03-15 20:34:40 -05:00
802
8021q vlan: use correct format characters 2022-03-17 16:34:49 -07:00
appletalk
atm
ax25 ax25: Fix UAF bugs in ax25 timers 2022-03-29 10:24:34 +02:00
batman-adv batman-adv: Don't skb_split skbuffs with frag_list 2022-04-17 23:41:44 +02:00
bluetooth Bluetooth: Fix the creation of hdev->name 2022-05-11 17:18:42 -07:00
bpf bpf: Fix release of page_pool in BPF_PROG_RUN in test runner 2022-04-11 17:30:15 +02:00
bpfilter uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
bridge net: bridge: switchdev: check br_vlan_group() return value 2022-04-22 15:12:18 -07:00
caif net: caif: Use netif_rx(). 2022-03-04 12:02:19 +00:00
can can: isotp: remove re-binding of bound socket 2022-04-29 11:02:47 +02:00
ceph libceph: disambiguate cluster/pool full log message 2022-04-25 10:45:15 +02:00
core net: fix dev_fill_forward_path with pppoe + bridge 2022-05-16 12:58:55 +02:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-03 08:01:55 -08:00
dccp
decnet decnet: Use container_of() for struct dn_neigh casts 2022-05-10 12:21:51 +02:00
dns_resolver
dsa net: dsa: flush switchdev workqueue on bridge join error path 2022-05-09 18:08:04 -07:00
ethernet
ethtool ethtool: add support to set/get completion queue event size 2022-02-23 20:33:05 -08:00
hsr net: add per-cpu storage and net->core_stats 2022-03-11 23:17:24 -08:00
ieee802154 net: ipv6: Handle delivery_time in ipv6 defrag 2022-03-03 14:38:48 +00:00
ife
ipv4 ipv4: drop dst in multicast routing path 2022-05-06 12:46:38 -07:00
ipv6 secure_seq: use the 64 bits of the siphash for port offset calculation 2022-05-04 19:22:20 -07:00
iucv s390/iucv: sort out physical vs virtual pointers usage 2022-02-22 16:09:13 -08:00
kcm
key af_key: add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register 2022-03-10 07:39:47 +01:00
l2tp
l3mdev l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu 2022-04-15 14:27:24 -07:00
lapb
llc llc: only change llc->dev when bind() succeeds 2022-03-25 16:55:41 -07:00
mac80211 mac80211: Reset MBSSID parameters upon connection 2022-05-04 11:37:46 +02:00
mac802154
mctp mctp: defer the kfree of object mdev->addrs 2022-04-26 09:14:47 +02:00
mpls net: mpls: Fix GCC 12 warning 2022-02-10 15:29:39 +00:00
mptcp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-23 10:53:49 -07:00
ncsi
netfilter netfilter: flowtable: fix TCP flow teardown 2022-05-18 17:34:26 +02:00
netlabel netlabel: fix out-of-bounds memory accesses 2022-03-21 10:59:11 +00:00
netlink netlink: do not reset transport header in netlink_recvmsg() 2022-05-06 15:37:36 -07:00
netrom
nfc NFC: netlink: fix sleep in atomic bug when firmware download timeout 2022-05-05 10:18:15 +02:00
nsh
openvswitch openvswitch: fix OOB access in reserve_sfa_size() 2022-04-15 11:50:02 +01:00
packet net/packet: fix packet_sock xmit return value checking 2022-04-15 11:17:30 +01:00
phonet phonet: Use netif_rx(). 2022-03-07 11:40:41 +00:00
psample
qrtr
rds net: rds: use maybe_get_net() when acquiring refcount on TCP sockets 2022-05-05 16:44:49 -07:00
rfkill rfkill: make new event layout opt-in 2022-03-18 13:09:17 +02:00
rose
rxrpc rxrpc: Enable IPv6 checksums on transport socket 2022-04-30 13:59:34 +01:00
sched net/sched: act_pedit: really ensure the skb is writable 2022-05-11 15:06:42 -07:00
sctp sctp: check asoc strreset_chunk in sctp_generate_reconf_event 2022-04-23 22:34:17 +01:00
smc net/smc: non blocking recvmsg() return -EAGAIN when no data and signal_pending 2022-05-12 10:01:36 -07:00
strparser
sunrpc Revert "SUNRPC: attempt AF_LOCAL connect on setup" 2022-04-29 20:38:27 -04:00
switchdev net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device 2022-02-24 21:31:43 -08:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-23 10:53:49 -07:00
tls tls: Fix context leak on tls_device_down 2022-05-12 10:01:36 -07:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-23 10:53:49 -07:00
vmw_vsock vsock/virtio: enable VQs early on probe 2022-03-24 18:36:36 -07:00
wireless nl80211: fix locking in nl80211_set_tx_bitrate_mask() 2022-05-09 14:00:07 +02:00
x25 net/x25: Fix null-ptr-deref caused by x25_disconnect 2022-03-26 11:48:16 -07:00
xdp xsk: Fix possible crash when multiple sockets are created 2022-04-26 16:19:54 +02:00
xfrm xfrm: Pass flowi_oif or l3mdev as oif to xfrm_dst_lookup 2022-04-04 13:32:56 +02:00
compat.c
devres.c
Kconfig page_pool: Add allocation stats 2022-03-03 09:55:28 +00:00
Kconfig.debug
Makefile
socket.c fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
sysctl_net.c