linux/net
luoxuanqiang ff46e3b442 Fix race for duplicate reqsk on identical SYN
When bonding is configured in BOND_MODE_BROADCAST mode, if two identical
SYN packets are received at the same time and processed on different CPUs,
tcp_conn_request() can end up creating two different reqsk (request_sock)
for the same sk (sock).

These two different reqsk will respond with two SYNACK packets, and since
the generation of the seq (ISN) incorporates a timestamp, the final two
SYNACK packets will have different seq values.

The consequence is that when the client replies with an ACK to the
earlier SYNACK packet, the server resets (RST) the connection.

========================================================================

This behavior is consistently reproducible in my local setup,
which comprises:

                  | NETA1 ------ NETB1 |
PC_A --- bond --- |                    | --- bond --- PC_B
                  | NETA2 ------ NETB2 |

- PC_A is the server and has two network cards, NETA1 and NETA2. I have
  bonded these two cards using BOND_MODE_BROADCAST mode and configured
  them to be handled by different CPUs.

- PC_B is the Client, also equipped with two network cards, NETB1 and
  NETB2, which are also bonded and configured in BOND_MODE_BROADCAST mode.

If the client attempts a TCP connection to the server, it might encounter
a failure. Capturing packets from the server side reveals:

10.10.10.10.45182 > localhost: Flags [S], seq 320236027,
10.10.10.10.45182 > localhost: Flags [S], seq 320236027,
localhost > 10.10.10.10.45182: Flags [S.], seq 2967855116,
localhost > 10.10.10.10.45182: Flags [S.], seq 2967855123, <==
10.10.10.10.45182 > localhost: Flags [.], ack 4294967290,
10.10.10.10.45182 > localhost: Flags [.], ack 4294967290,
localhost > 10.10.10.10.45182: Flags [R], seq 2967855117, <==
localhost > 10.10.10.10.45182: Flags [R], seq 2967855117,

Two SYNACKs with different seq numbers are sent by localhost, and the
client's ACK of the first one is then answered with an RST.
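
The ack value 4294967290 in the capture can be explained with modulo 2^32
sequence arithmetic. The sketch below assumes that tcpdump printed the
acknowledgment relative to the ISN of the second SYNACK (an assumption, the
capture options are not stated above). The helper names are hypothetical and
are not kernel or tcpdump functions.

```c
#include <assert.h>
#include <stdint.h>

/* TCP sequence numbers live modulo 2^32; uint32_t subtraction wraps the same way. */
static uint32_t seq_delta(uint32_t value, uint32_t base)
{
    return value - base;
}

/* Acknowledgment number that acknowledges a SYNACK with the given ISN.
 * The SYN flag consumes one sequence number. */
static uint32_t ack_for_synack(uint32_t isn)
{
    return isn + 1u;
}

/* Cross-check the numbers against the capture shown above. */
static void check_capture_values(void)
{
    uint32_t first_isn = 2967855116u;   /* first SYNACK in the capture */
    uint32_t second_isn = 2967855123u;  /* second SYNACK in the capture */
    uint32_t ack = ack_for_synack(first_isn);

    /* Matches the seq carried by the RST packets. */
    assert(ack == 2967855117u);
    /* Assumed relative display: ack minus the second ISN, i.e. -6 mod 2^32. */
    assert(seq_delta(ack, second_isn) == 4294967290u);
}
```

Under that assumption, the client acknowledged the first SYNACK, which is
6 below the ISN of the second one.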

========================================================================

The attempted solution is as follows:
Add a return value to inet_csk_reqsk_queue_hash_add() to report whether
the ehash insertion succeeded (so far, the reason an insertion fails is
that a reqsk for the same connection has already been inserted). If the
insertion fails, release the reqsk.

Because of the refcnt, Kuniyuki suggested adding the same return value
check to the DCCP module: if the ehash insertion fails, meaning a reqsk
for the same connection has already been inserted, simply release the
reqsk as well.

In addition, in reqsk_queue_hash_req(), req->rsk_timer is now started
only after a successful insertion.
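
The pattern described above can be sketched as a small userspace toy model.
This is not the kernel code: every toy_* name, the table layout, and the
port 80 used in the usage note are hypothetical. The model is also
single-threaded, whereas the kernel relies on its own ehash locking to make
the lookup and the insertion atomic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_BUCKETS 16

/* Hypothetical stand-in for a request sock: the connection 4-tuple only. */
struct toy_reqsk {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    bool timer_armed;
    struct toy_reqsk *next;
};

static struct toy_reqsk *toy_table[TOY_BUCKETS];

static bool toy_same_conn(const struct toy_reqsk *a, const struct toy_reqsk *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Insert unless an entry for the same connection exists; report the outcome. */
static bool toy_ehash_insert(struct toy_reqsk *req)
{
    assert(req != NULL);
    size_t b = (req->saddr ^ req->daddr ^ req->sport ^ req->dport) % TOY_BUCKETS;

    for (struct toy_reqsk *it = toy_table[b]; it; it = it->next)
        if (toy_same_conn(it, req))
            return false;           /* duplicate: caller must release req */
    req->next = toy_table[b];
    toy_table[b] = req;
    return true;
}

/* Arm the (modelled) timer only after the insertion succeeded. */
static bool toy_queue_hash_add(struct toy_reqsk *req)
{
    if (!toy_ehash_insert(req))
        return false;
    req->timer_armed = true;
    return true;
}

/* Handle one SYN; returns true if a SYNACK would be sent for it. */
static bool toy_conn_request(uint32_t saddr, uint16_t sport,
                             uint32_t daddr, uint16_t dport)
{
    struct toy_reqsk *req = calloc(1, sizeof(*req));

    if (!req)
        return false;
    req->saddr = saddr;
    req->sport = sport;
    req->daddr = daddr;
    req->dport = dport;

    if (!toy_queue_hash_add(req)) {
        free(req);                  /* same connection already queued */
        return false;
    }
    return true;                    /* entry stays in the table (toy: never freed) */
}
```

In this model, calling toy_conn_request(0x0a0a0a0a, 45182, 0x7f000001, 80)
twice returns true the first time and false the second time, so only one
SYNACK would be generated for the duplicated SYN.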

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: luoxuanqiang <luoxuanqiang@kylinos.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240621013929.1386815-1-luoxuanqiang@kylinos.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25 11:37:45 +02:00
6lowpan net: fill in MODULE_DESCRIPTION()s for 6LoWPAN 2024-02-09 14:12:01 -08:00
9p Two fixes headed to stable trees: 2024-05-29 09:25:15 -07:00
802
8021q net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
appletalk Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-05-09 10:01:01 -07:00
atm net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
ax25 ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put() 2024-06-01 15:49:42 -07:00
batman-adv Revert "batman-adv: prefer kfree_rcu() over call_rcu() with free-only callbacks" 2024-06-12 20:18:00 +02:00
bluetooth Bluetooth: fix connection setup in l2cap_connect 2024-06-10 09:48:30 -04:00
bpf bpf: Set run context for rawtp test_run callback 2024-06-05 09:41:33 +02:00
bridge net: bridge: mst: fix suspicious rcu usage in br_mst_set_state 2024-06-12 18:24:24 -07:00
caif caif: Use UTILITY_NAME_LENGTH instead of hard-coding 16 2024-04-02 18:20:00 -07:00
can net: can: j1939: recover socket queue on CAN bus error during BAM transmission 2024-06-21 10:50:17 +02:00
ceph libceph: init the cursor when preparing sparse read in msgr2 2024-03-06 12:43:01 +01:00
core bpf-for-netdev 2024-06-24 18:15:22 -07:00
dcb
dccp Fix race for duplicate reqsk on identical SYN 2024-06-25 11:37:45 +02:00
devlink devlink: extend devlink_param *set pointer 2024-04-22 13:05:19 -07:00
dns_resolver
dsa tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
ethernet netkit: Fix pkt_type override upon netkit pass verdict 2024-05-25 10:48:57 -07:00
ethtool net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() 2024-06-06 13:34:33 +02:00
handshake net/handshake: remove redundant assignment to variable ret 2024-04-16 17:14:55 -07:00
hsr Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-05-09 10:01:01 -07:00
ieee802154 tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
ife
ipv4 Fix race for duplicate reqsk on identical SYN 2024-06-25 11:37:45 +02:00
ipv6 netfilter pull request 24-06-19 2024-06-20 11:21:53 +02:00
iucv more s390 updates for 6.10 merge window 2024-05-21 12:09:36 -07:00
kcm net: kcm: fix incorrect parameter validation in the kcm_getsockopt) function 2024-03-11 09:53:22 +00:00
key net: fill in MODULE_DESCRIPTION()s for af_key 2024-02-09 14:12:01 -08:00
l2tp l2tp: fix ICMP error handling for UDP-encap sockets 2024-05-17 12:15:22 -07:00
l3mdev
lapb
llc net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
mac80211 wifi: mac80211: fix monitor channel with chanctx emulation 2024-06-14 09:14:08 +02:00
mac802154 mac802154: fix llsec key resources release in mac802154_llsec_key_del 2024-03-06 21:01:26 +01:00
mctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-29 14:24:56 -08:00
mpls net: Remove the now superfluous sentinel elements from ctl_table array 2024-05-03 13:29:41 +01:00
mptcp mptcp: pm: update add_addr counters after connect 2024-06-10 19:49:10 -07:00
ncsi net/ncsi: Fix the multi thread manner of NCSI driver 2024-06-01 16:21:44 -07:00
netfilter netfilter: move the sysctl nf_hooks_lwtunnel into the netfilter core 2024-06-19 18:41:59 +02:00
netlabel netlabel: fix RCU annotation for IPv4 options on socket creation 2024-05-13 14:58:12 -07:00
netlink netlink: support all extack types in dumps 2024-04-23 10:09:49 -07:00
netrom netrom: Fix a memory leak in nr_heartbeat_expiry() 2024-06-17 13:06:23 +01:00
nfc Quite smaller than usual. Notably it includes the fix for the unix 2024-05-23 12:49:37 -07:00
nsh nsh: Restore skb->{protocol,data,mac_header} for outer header in nsh_gso_segment(). 2024-04-26 12:20:01 +02:00
openvswitch openvswitch: get related ct labels from its master if it is not confirmed 2024-06-21 10:17:30 +01:00
packet af_packet: do not call packet_read_pending() from tpacket_destruct_skb() 2024-05-16 19:38:05 -07:00
phonet net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
psample ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
qrtr net: qrtr: ns: Fix module refcnt 2024-05-16 09:47:45 +01:00
rds net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
rfkill net: rfkill: gpio: Convert to platform remove callback returning void 2024-03-25 15:40:22 +01:00
rose net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-05-09 10:01:01 -07:00
sched sched: act_ct: add netns into the key of tcf_ct_flow_table 2024-06-18 15:24:24 +02:00
sctp net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
smc net/smc: avoid overwriting when adjusting sock bufsizes 2024-06-05 09:42:57 +01:00
strparser
sunrpc NFS client bugfixes for Linux 6.10 2024-06-13 11:07:32 -07:00
switchdev net: bridge: switchdev: Improve error message for port_obj_add/del functions 2024-05-08 12:19:12 +01:00
tipc tipc: force a dst refcount before doing decryption 2024-06-18 15:08:57 +02:00
tls tls: fix missing memory barrier in tls_init 2024-05-23 12:03:26 +02:00
unix af_unix: Read with MSG_PEEK loops if the first unread byte is OOB 2024-06-13 08:03:55 -07:00
vmw_vsock virtio: features, fixes, cleanups 2024-05-23 12:04:36 -07:00
wireless wifi: cfg80211: wext: add extra SIOCSIWSCAN data check 2024-06-12 10:07:56 +02:00
x25 net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
xdp Revert "xsk: Support redirect to any socket bound to the same umem" 2024-06-05 09:42:30 +02:00
xfrm net: fix __dst_negative_advice() race 2024-05-29 17:34:49 -07:00
compat.c
devres.c
Kconfig net: add IEEE 802.1q specific helpers 2024-05-08 10:35:09 +01:00
Kconfig.debug
Makefile af_unix: Remove CONFIG_UNIX_SCM. 2024-01-31 16:41:16 -08:00
socket.c net: have do_accept() take a struct proto_accept_arg argument 2024-05-13 18:19:19 -06:00
sysctl_net.c sysctl: treewide: constify argument ctl_table_root::permissions(table) 2024-04-24 09:43:54 +02:00