Commit graph

65123 commits

Author SHA1 Message Date
Zhang Xiaoxu fcb170a9d8 SUNRPC: Fix the batch tasks count wraparound.
The 'queue->nr' will wraparound from 0 to 255 when only current
priority queue has tasks. This maybe lead a deadlock same as commit
dfe1fe75e0 ("NFSv4: Fix deadlock between nfs4_evict_inode()
and nfs4_opendata_get_inode()"):

Privileged delegreturn task is queued to privileged list because all
the slots are assigned. When non-privileged task complete and release
the slot, a non-privileged maybe picked out. It maybe allocate slot
failed when the session on draining.

If the 'queue->nr' has wraparound to 255, and no enough slot to
service it, then the privileged delegreturn will lost to wake up.

So we should avoid the wraparound on 'queue->nr'.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 5fcdfacc01 ("NFSv4: Return delegations synchronously in evict_inode")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-28 09:34:39 -04:00
NeilBrown bc1c56e9bb SUNRPC: prevent port reuse on transports which don't request it.
If an RPC client is created without RPC_CLNT_CREATE_REUSEPORT, it should
not reuse the source port when a TCP connection is re-established.
This is currently implemented by preventing the source port being
recorded after a successful connection (the call to xs_set_srcport()).

However the source port is also recorded after a successful bind in xs_bind().
This may not be needed at all and certainly is not wanted when
RPC_CLNT_CREATE_REUSEPORT wasn't requested.

So avoid that assignment when xprt.reuseport is not set.

With this change, NFSv4.1 and later mounts use a different port number on
each connection.  This is helpful with some firewalls which don't cope
well with port reuse.

Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: e6237b6feb ("NFSv4.1: Don't rebind to the same source port when reconnecting to the server")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-21 12:06:16 -04:00
Colin Ian King bb24cc0f37 rpc: remove redundant initialization of variable status
The variable status is being initialized with a value that is never
read, the assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-21 12:06:16 -04:00
Anna Schumaker 6d1c0f3d28 sunrpc: Avoid a KASAN slab-out-of-bounds bug in xdr_set_page_base()
This seems to happen fairly easily during READ_PLUS testing on NFS v4.2.
I found that we could end up accessing xdr->buf->pages[pgnr] with a pgnr
greater than the number of pages in the array. So let's just return
early if we're setting base to a point at the end of the page data and
let xdr_set_tail_base() handle setting up the buffer pointers instead.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Fixes: 8d86e373b0 ("SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-13 19:36:49 -04:00
Linus Torvalds 9d32fa5d74 Networking fixes for 5.13-rc5, including fixes from bpf, wireless,
netfilter and wireguard trees.
 
 The bpf vs lockdown+audit fix is the most notable.
 
 Current release - regressions:
 
  - virtio-net: fix page faults and crashes when XDP is enabled
 
  - mlx5e: fix HW timestamping with CQE compression, and make sure they
           are only allowed to coexist with capable devices
 
  - stmmac:
         - fix kernel panic due to NULL pointer dereference of mdio_bus_data
         - fix double clk unprepare when no PHY device is connected
 
 Current release - new code bugs:
 
  - mt76: a few fixes for the recent MT7921 devices and runtime
          power management
 
 Previous releases - regressions:
 
  - ice: - track AF_XDP ZC enabled queues in bitmap to fix copy mode Tx
         - fix allowing VF to request more/less queues via virtchnl
 	- correct supported and advertised autoneg by using PHY capabilities
         - allow all LLDP packets from PF to Tx
 
  - kbuild: quote OBJCOPY var to avoid a pahole call break the build
 
 Previous releases - always broken:
 
  - bpf, lockdown, audit: fix buggy SELinux lockdown permission checks
 
  - mt76: address the recent FragAttack vulnerabilities not covered
          by generic fixes
 
  - ipv6: fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions
 
  - Bluetooth:
  	 - fix the erroneous flush_work() order, to avoid double free
          - use correct lock to prevent UAF of hdev object
 
  - nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect
 
  - ieee802154: multiple fixes to error checking and return values
 
  - igb: fix XDP with PTP enabled
 
  - intel: add correct exception tracing for XDP
 
  - tls: fix use-after-free when TLS offload device goes down and back up
 
  - ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
 
  - netfilter: nft_ct: skip expectations for confirmed conntrack
 
  - mptcp: fix falling back to TCP in presence of out of order packets
           early in connection lifetime
 
  - wireguard: switch from O(n) to a O(1) algorithm for maintaining peers,
           fixing stalls and a large memory leak in the process
 
 Misc:
 
  - devlink: correct VIRTUAL port to not have phys_port attributes
 
  - Bluetooth: fix VIRTIO_ID_BT assigned number
 
  - net: return the correct errno code ENOBUF -> ENOMEM
 
  - wireguard:
          - peer: allocate in kmem_cache saving 25% on peer memory
          - do not use -O3
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmC6yGMACgkQMUZtbf5S
 Irv67w//ZpT4+KHETUIS+CgeUIgjAQD0FTmO4iboHFGG7BadWEZpEVswUU0xBfY/
 RJrSWAEqTga8zbjWqRaLRx5Qii99F2hHPZ502VR6x6NbPu1mNdS5rUOa61YbtGCv
 v4sC45eOvG7T/y5mceq4rQaPsQKEUUAIgYzIOpjSiDoMfgFCT3UUF/UrBhgLzybj
 aMXd12rg17dN+RJeNOZjQKligNENX9A0tBtSGXxs9hhYYbY25O+uECOsESrA1RKt
 uHeh003iqApT5x8hmJsdMDtis05n7S/Bq1/4RZfAdbTcgJngepw570bQ999tbXqE
 HeB3Ls9k3Vi9W6svfUkYjFGt3GYygsVGPjFAVhC+g0TZXAgdsh5w2SPQAgcIrzIr
 WOfDL9hu7OJp/XRsPiB9pg8cul7a4Q5Yhp29bvN33u43AMij2TWD0CpKCQt9UQdi
 8V0KOLAGC8bzXx35VTP/pbbwAI21PIYxVKfe/0cOJKShTMtfPePx1a2cuYRWoQSP
 PYYbQaY6WhfUniV3DEmvL1Z+dgL0yyaJKIV2IdBHR8MPKKy+5kD+6HDaNo2lO75J
 wWSN1LtoVKrc5msCD375epGmkbjatpWdfzOE+pljWHz5LnW+2cGwFhCo7+UJhAG5
 XwE8+G9YUyYH51PjFpGBsoPBWEmYmIMnY34p20A1Pz1M7/HFfXc=
 =sNP5
 -----END PGP SIGNATURE-----

Merge tag 'net-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from bpf, wireless, netfilter and
  wireguard trees.

  The bpf vs lockdown+audit fix is the most notable.

  Things haven't slowed down just yet, both in terms of regressions in
  current release and largish fixes for older code, but we usually see a
  slowdown only after -rc5.

  Current release - regressions:

   - virtio-net: fix page faults and crashes when XDP is enabled

   - mlx5e: fix HW timestamping with CQE compression, and make sure they
     are only allowed to coexist with capable devices

   - stmmac:
      - fix kernel panic due to NULL pointer dereference of
        mdio_bus_data
      - fix double clk unprepare when no PHY device is connected

  Current release - new code bugs:

   - mt76: a few fixes for the recent MT7921 devices and runtime power
     management

  Previous releases - regressions:

   - ice:
      - track AF_XDP ZC enabled queues in bitmap to fix copy mode Tx
      - fix allowing VF to request more/less queues via virtchnl
      - correct supported and advertised autoneg by using PHY
        capabilities
      - allow all LLDP packets from PF to Tx

   - kbuild: quote OBJCOPY var to avoid a pahole call break the build

  Previous releases - always broken:

   - bpf, lockdown, audit: fix buggy SELinux lockdown permission checks

   - mt76: address the recent FragAttack vulnerabilities not covered by
     generic fixes

   - ipv6: fix KASAN: slab-out-of-bounds Read in
     fib6_nh_flush_exceptions

   - Bluetooth:
      - fix the erroneous flush_work() order, to avoid double free
      - use correct lock to prevent UAF of hdev object

   - nfc: fix NULL ptr dereference in llcp_sock_getname() after failed
     connect

   - ieee802154: multiple fixes to error checking and return values

   - igb: fix XDP with PTP enabled

   - intel: add correct exception tracing for XDP

   - tls: fix use-after-free when TLS offload device goes down and back
     up

   - ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service

   - netfilter: nft_ct: skip expectations for confirmed conntrack

   - mptcp: fix falling back to TCP in presence of out of order packets
     early in connection lifetime

   - wireguard: switch from O(n) to a O(1) algorithm for maintaining
     peers, fixing stalls and a large memory leak in the process

  Misc:

   - devlink: correct VIRTUAL port to not have phys_port attributes

   - Bluetooth: fix VIRTIO_ID_BT assigned number

   - net: return the correct errno code ENOBUF -> ENOMEM

   - wireguard:
      - peer: allocate in kmem_cache saving 25% on peer memory
      - do not use -O3"

* tag 'net-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
  cxgb4: avoid link re-train during TC-MQPRIO configuration
  sch_htb: fix refcount leak in htb_parent_to_leaf_offload
  wireguard: allowedips: free empty intermediate nodes when removing single node
  wireguard: allowedips: allocate nodes in kmem_cache
  wireguard: allowedips: remove nodes in O(1)
  wireguard: allowedips: initialize list head in selftest
  wireguard: peer: allocate in kmem_cache
  wireguard: use synchronize_net rather than synchronize_rcu
  wireguard: do not use -O3
  wireguard: selftests: make sure rp_filter is disabled on vethc
  wireguard: selftests: remove old conntrack kconfig value
  virtchnl: Add missing padding to virtchnl_proto_hdrs
  ice: Allow all LLDP packets from PF to Tx
  ice: report supported and advertised autoneg using PHY capabilities
  ice: handle the VF VSI rebuild failure
  ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared
  ice: Fix allowing VF to request more/less queues via virtchnl
  virtio-net: fix for skb_over_panic inside big mode
  ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions
  fib: Return the correct errno code
  ...
2021-06-04 18:25:39 -07:00
Yunjian Wang 944d671d5f sch_htb: fix refcount leak in htb_parent_to_leaf_offload
The commit ae81feb733 ("sch_htb: fix null pointer dereference
on a null new_q") fixes a NULL pointer dereference bug, but it
is not correct.

Because htb_graft_helper properly handles the case when new_q
is NULL, and after the previous patch by skipping this call
which creates an inconsistency : dev_queue->qdisc will still
point to the old qdisc, but cl->parent->leaf.q will point to
the new one (which will be noop_qdisc, because new_q was NULL).
The code is based on an assumption that these two pointers are
the same, so it can lead to refcount leaks.

The correct fix is to add a NULL pointer check to protect
qdisc_refcount_inc inside htb_parent_to_leaf_offload.

Fixes: ae81feb733 ("sch_htb: fix null pointer dereference on a null new_q")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Suggested-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04 14:44:18 -07:00
David S. Miller 579028dec1 bluetooth pull request for net:
- Fixes UAF and CVE-2021-3564
  - Fix VIRTIO_ID_BT to use an unassigned ID
  - Fix firmware loading on some Intel Controllers
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmC5RWQZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKS0+D/4kJF7G9FohvLJUzTrrhcPx
 nEE/5IL1eZeCQVCdKmgMeiy6K2iARGY9ZNqnx/AX1SJN9bHI7WsL6uy2RV7r57kx
 iP2XZsV2uzXbwY9KVvfXBMNoCA2E4xS0UxpxA2h1znRUgMWDFLFkZydwYsBieGb6
 tXZwJo3WOnDp169RbKdWTrWstYlL6KTTJoIxaVYWlghXVZ8Fl8LUHbhnx5MEqhqz
 469AfGDlUKEoiYUUDwNrwX1ory/RWhcDxTFpDeji48U0P7oLFL73Aoyy/WP0B2FO
 dhOErn38YUDivwBqSO2O21RUsICREbyLqHy6K/JWe4RqY50nEmWhfQo59ApzSuV3
 e2HcbDwK5vgGYxmU6T9vb5S0nV1AgTV+5O3t1Mj6ZVqTAl6b2OkfqskCZzTrklIS
 aKIP4viRAPLsJMdKKHW1mhR3zBH0deYEovIpFy+LkjX5aFsrEgc8hRn7i5ceF8GW
 d+Ov9LPJQJQTK+r6W7xPiCUkC1dj/SMZ756Gr6cGhXPzY1DgBoyaaoZV1K4mz17g
 dlLwXfF4nIJqJFop3iTPVGWVoeapZ/tgu73iTUdkXIEbqj19wj67nw+xz0WGs1pB
 B1H/OemQS4/yfo4IsfLRDAJ14Q+5JS4qRKBf7p4e/yj533BW6lia0GTdujO+N4eT
 FQfnUoYaexkiPYwGMyjRpQ==
 =X9Cg
 -----END PGP SIGNATURE-----

Merge tag 'for-net-2021-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

bluetooth pull request for net:

 - Fixes UAF and CVE-2021-3564
 - Fix VIRTIO_ID_BT to use an unassigned ID
 - Fix firmware loading on some Intel Controllers

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:32:21 -07:00
David S. Miller e31d57ca14 Merge tag 'ieee802154-for-davem-2021-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:

====================
An update from ieee802154 for your *net* tree.

This time we have fixes for the ieee802154 netlink code, as well as a driver
fix. Zhen Lei, Wei Yongjun and Yang Li each had  a patch to cleanup some return
code handling ensuring we actually get a real error code when things fails.

Dan Robertson fixed a potential null dereference in our netlink handling.

Andy Shevchenko removed of_match_ptr()usage in the mrf24j40 driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:21:58 -07:00
Coco Li 821bbf79fe ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions
Reported by syzbot:
HEAD commit:    90c911ad Merge tag 'fixes' of git://git.kernel.org/pub/scm..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
dashboard link: https://syzkaller.appspot.com/bug?extid=123aa35098fd3c000eb7
compiler:       Debian clang version 11.0.1-2

==================================================================
BUG: KASAN: slab-out-of-bounds in fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline]
BUG: KASAN: slab-out-of-bounds in fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732
Read of size 8 at addr ffff8880145c78f8 by task syz-executor.4/17760

CPU: 0 PID: 17760 Comm: syz-executor.4 Not tainted 5.12.0-rc8-syzkaller #0
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x202/0x31e lib/dump_stack.c:120
 print_address_description+0x5f/0x3b0 mm/kasan/report.c:232
 __kasan_report mm/kasan/report.c:399 [inline]
 kasan_report+0x15c/0x200 mm/kasan/report.c:416
 fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline]
 fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732
 fib6_nh_release+0x9a/0x430 net/ipv6/route.c:3536
 fib6_info_destroy_rcu+0xcb/0x1c0 net/ipv6/ip6_fib.c:174
 rcu_do_batch kernel/rcu/tree.c:2559 [inline]
 rcu_core+0x8f6/0x1450 kernel/rcu/tree.c:2794
 __do_softirq+0x372/0x7a6 kernel/softirq.c:345
 invoke_softirq kernel/softirq.c:221 [inline]
 __irq_exit_rcu+0x22c/0x260 kernel/softirq.c:422
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:434
 sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
RIP: 0010:lock_acquire+0x1f6/0x720 kernel/locking/lockdep.c:5515
Code: f6 84 24 a1 00 00 00 02 0f 85 8d 02 00 00 f7 c3 00 02 00 00 49 bd 00 00 00 00 00 fc ff df 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 3d 00 00 00 00 00 4b c7 44 3d 09 00 00 00 00 43 c7 44 3d
RSP: 0018:ffffc90009e06560 EFLAGS: 00000206
RAX: 1ffff920013c0cc0 RBX: 0000000000000246 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90009e066e0 R08: dffffc0000000000 R09: fffffbfff1f992b1
R10: fffffbfff1f992b1 R11: 0000000000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000000 R15: 1ffff920013c0cb4
 rcu_lock_acquire+0x2a/0x30 include/linux/rcupdate.h:267
 rcu_read_lock include/linux/rcupdate.h:656 [inline]
 ext4_get_group_info+0xea/0x340 fs/ext4/ext4.h:3231
 ext4_mb_prefetch+0x123/0x5d0 fs/ext4/mballoc.c:2212
 ext4_mb_regular_allocator+0x8a5/0x28f0 fs/ext4/mballoc.c:2379
 ext4_mb_new_blocks+0xc6e/0x24f0 fs/ext4/mballoc.c:4982
 ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4238
 ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638
 ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848
 ext4_bread+0x2a/0x1c0 fs/ext4/inode.c:900
 ext4_append+0x1a4/0x360 fs/ext4/namei.c:67
 ext4_init_new_dir+0x337/0xa10 fs/ext4/namei.c:2768
 ext4_mkdir+0x4b8/0xc00 fs/ext4/namei.c:2814
 vfs_mkdir+0x45b/0x640 fs/namei.c:3819
 ovl_do_mkdir fs/overlayfs/overlayfs.h:161 [inline]
 ovl_mkdir_real+0x53/0x1a0 fs/overlayfs/dir.c:146
 ovl_create_real+0x280/0x490 fs/overlayfs/dir.c:193
 ovl_workdir_create+0x425/0x600 fs/overlayfs/super.c:788
 ovl_make_workdir+0xed/0x1140 fs/overlayfs/super.c:1355
 ovl_get_workdir fs/overlayfs/super.c:1492 [inline]
 ovl_fill_super+0x39ee/0x5370 fs/overlayfs/super.c:2035
 mount_nodev+0x52/0xe0 fs/super.c:1413
 legacy_get_tree+0xea/0x180 fs/fs_context.c:592
 vfs_get_tree+0x86/0x270 fs/super.c:1497
 do_new_mount fs/namespace.c:2903 [inline]
 path_mount+0x196f/0x2be0 fs/namespace.c:3233
 do_mount fs/namespace.c:3246 [inline]
 __do_sys_mount fs/namespace.c:3454 [inline]
 __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3431
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665f9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f68f2b87188 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 00000000004665f9
RDX: 00000000200000c0 RSI: 0000000020000000 RDI: 000000000040000a
RBP: 00000000004bfbb9 R08: 0000000020000100 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
R13: 00007ffe19002dff R14: 00007f68f2b87300 R15: 0000000000022000

Allocated by task 17768:
 kasan_save_stack mm/kasan/common.c:38 [inline]
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:427 [inline]
 ____kasan_kmalloc+0xc2/0xf0 mm/kasan/common.c:506
 kasan_kmalloc include/linux/kasan.h:233 [inline]
 __kmalloc+0xb4/0x380 mm/slub.c:4055
 kmalloc include/linux/slab.h:559 [inline]
 kzalloc include/linux/slab.h:684 [inline]
 fib6_info_alloc+0x2c/0xd0 net/ipv6/ip6_fib.c:154
 ip6_route_info_create+0x55d/0x1a10 net/ipv6/route.c:3638
 ip6_route_add+0x22/0x120 net/ipv6/route.c:3728
 inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352
 rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553
 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502
 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
 netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338
 netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmsg+0x319/0x400 net/socket.c:2433
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Last potentially related work creation:
 kasan_save_stack+0x27/0x50 mm/kasan/common.c:38
 kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345
 __call_rcu kernel/rcu/tree.c:3039 [inline]
 call_rcu+0x1b1/0xa30 kernel/rcu/tree.c:3114
 fib6_info_release include/net/ip6_fib.h:337 [inline]
 ip6_route_info_create+0x10c4/0x1a10 net/ipv6/route.c:3718
 ip6_route_add+0x22/0x120 net/ipv6/route.c:3728
 inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352
 rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553
 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502
 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
 netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338
 netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmsg+0x319/0x400 net/socket.c:2433
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Second to last potentially related work creation:
 kasan_save_stack+0x27/0x50 mm/kasan/common.c:38
 kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345
 insert_work+0x54/0x400 kernel/workqueue.c:1331
 __queue_work+0x981/0xcc0 kernel/workqueue.c:1497
 queue_work_on+0x111/0x200 kernel/workqueue.c:1524
 queue_work include/linux/workqueue.h:507 [inline]
 call_usermodehelper_exec+0x283/0x470 kernel/umh.c:433
 kobject_uevent_env+0x1349/0x1730 lib/kobject_uevent.c:617
 kvm_uevent_notify_change+0x309/0x3b0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4809
 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:877 [inline]
 kvm_put_kvm+0x9c/0xd10 arch/x86/kvm/../../../virt/kvm/kvm_main.c:920
 kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3120
 __fput+0x352/0x7b0 fs/file_table.c:280
 task_work_run+0x146/0x1c0 kernel/task_work.c:140
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
 exit_to_user_mode_prepare+0x10b/0x1e0 kernel/entry/common.c:208
 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
 syscall_exit_to_user_mode+0x26/0x70 kernel/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff8880145c7800
 which belongs to the cache kmalloc-192 of size 192
The buggy address is located 56 bytes to the right of
 192-byte region [ffff8880145c7800, ffff8880145c78c0)
The buggy address belongs to the page:
page:ffffea00005171c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x145c7
flags: 0xfff00000000200(slab)
raw: 00fff00000000200 ffffea00006474c0 0000000200000002 ffff888010c41a00
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880145c7780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff8880145c7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff8880145c7880: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
                                                                ^
 ffff8880145c7900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880145c7980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================

In the ip6_route_info_create function, in the case that the nh pointer
is not NULL, the fib6_nh in fib6_info has not been allocated.
Therefore, when trying to free fib6_info in this error case using
fib6_info_release, the function will call fib6_info_destroy_rcu,
which it will access fib6_nh_release(f6i->fib6_nh);
However, f6i->fib6_nh doesn't have any refcount yet given the lack of allocation
causing the reported memory issue above.
Therefore, releasing the empty pointer directly instead would be the solution.

Fixes: f88d8ea67f ("ipv6: Plumb support for nexthop object in a fib6_info")
Fixes: 706ec91916 ("ipv6: Fix nexthop refcnt leak when creating ipv6 route info")
Signed-off-by: Coco Li <lixiaoyan@google.com>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:19:49 -07:00
Zheng Yongjun 59607863c5 fib: Return the correct errno code
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:13:56 -07:00
Zheng Yongjun 49251cd002 net: Return the correct errno code
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:13:56 -07:00
Zheng Yongjun d773695866 net/x25: Return the correct errno code
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:13:56 -07:00
Pavel Skripkin 7f5d86669f net: caif: fix memory leak in cfusbl_device_notify
In case of caif_enroll_dev() fail, allocated
link_support won't be assigned to the corresponding
structure. So simply free allocated pointer in case
of error.

Fixes: 7ad65bf68d ("caif: Add support for CAIF over CDC NCM USB interface")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:05:07 -07:00
Pavel Skripkin b53558a950 net: caif: fix memory leak in caif_device_notify
In case of caif_enroll_dev() fail, allocated
link_support won't be assigned to the corresponding
structure. So simply free allocated pointer in case
of error

Fixes: 7c18d2205e ("caif: Restructure how link caif link layer enroll")
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+7ec324747ce876a29db6@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:05:07 -07:00
Pavel Skripkin a2805dca51 net: caif: add proper error handling
caif_enroll_dev() can fail in some cases. Ingnoring
these cases can lead to memory leak due to not assigning
link_support pointer to anywhere.

Fixes: 7c18d2205e ("caif: Restructure how link caif link layer enroll")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:05:06 -07:00
Pavel Skripkin bce130e7f3 net: caif: added cfserl_release function
Added cfserl_release() function.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 15:05:06 -07:00
Pavel Skripkin c47cc30499 net: kcm: fix memory leak in kcm_sendmsg
Syzbot reported memory leak in kcm_sendmsg()[1].
The problem was in non-freed frag_list in case of error.

In the while loop:

	if (head == skb)
		skb_shinfo(head)->frag_list = tskb;
	else
		skb->next = tskb;

frag_list filled with skbs, but nothing was freeing them.

backtrace:
  [<0000000094c02615>] __alloc_skb+0x5e/0x250 net/core/skbuff.c:198
  [<00000000e5386cbd>] alloc_skb include/linux/skbuff.h:1083 [inline]
  [<00000000e5386cbd>] kcm_sendmsg+0x3b6/0xa50 net/kcm/kcmsock.c:967 [1]
  [<00000000f1613a8a>] sock_sendmsg_nosec net/socket.c:652 [inline]
  [<00000000f1613a8a>] sock_sendmsg+0x4c/0x60 net/socket.c:672

Reported-and-tested-by: syzbot+b039f5699bd82e1fb011@syzkaller.appspotmail.com
Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 14:13:26 -07:00
zhang kai 261ba78cc3 sit: set name of device back to struct parms
addrconf_set_sit_dstaddr will use parms->name.

Signed-off-by: zhang kai <zhangkaiheb@126.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 13:57:36 -07:00
Jiapeng Chong a8db57c1d2 rtnetlink: Fix missing error code in rtnl_bridge_notify()
The error code is missing in this code scenario, add the error code
'-EINVAL' to the return value 'err'.

Eliminate the follow smatch warning:

net/core/rtnetlink.c:4834 rtnl_bridge_notify() warn: missing error code
'err'.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 13:56:27 -07:00
David S. Miller 59717f3931 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Do not allow to add conntrack helper extension for confirmed
   conntracks in the nf_tables ct expectation support.

2) Fix bogus EBUSY in nfnetlink_cthelper when NFCTH_PRIV_DATA_LEN
   is passed on userspace helper updates.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 13:49:08 -07:00
Wei Yongjun 373e864cf5 ieee802154: fix error return code in ieee802154_llsec_getparams()
Fix to return negative error code -ENOBUFS from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 3e9c156e2c ("ieee802154: add netlink interfaces for llsec")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20210519141614.3040055-1-weiyongjun1@huawei.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-06-03 10:59:49 +02:00
Zhen Lei 79c6b8ed30 ieee802154: fix error return code in ieee802154_add_iface()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: be51da0f3e ("ieee802154: Stop using NLA_PUT*().")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210508062517.2574-1-thunder.leizhen@huawei.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-06-03 10:50:08 +02:00
Yang Li ad6f5cc5f6 net/ieee802154: drop unneeded assignment in llsec_iter_devkeys()
In order to keep the code style consistency of the whole file,
redundant return value ‘rc’ and its assignments should be deleted

The clang_analyzer complains as follows:
net/ieee802154/nl-mac.c:1203:12: warning: Although the value stored to
'rc' is used in the enclosing expression, the value is never actually
read from 'rc'

No functional change, only more efficient.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/1619346299-40237-1-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-06-03 10:09:36 +02:00
Josh Triplett b508d5fb69 net: ipconfig: Don't override command-line hostnames or domains
If the user specifies a hostname or domain name as part of the ip=
command-line option, preserve it and don't overwrite it with one
supplied by DHCP/BOOTP.

For instance, ip=::::myhostname::dhcp will use "myhostname" rather than
ignoring and overwriting it.

Fix the comment on ic_bootp_string that suggests it only copies a string
"if not already set"; it doesn't have any such logic.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02 13:27:03 -07:00
Pablo Neira Ayuso 8971ee8b08 netfilter: nfnetlink_cthelper: hit EBUSY on updates if size mismatches
The private helper data size cannot be updated. However, updates that
contain NFCTH_PRIV_DATA_LEN might bogusly hit EBUSY even if the size is
the same.

Fixes: 12f7a50533 ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-02 12:43:50 +02:00
Pablo Neira Ayuso 1710eb913b netfilter: nft_ct: skip expectations for confirmed conntrack
nft_ct_expect_obj_eval() calls nf_ct_ext_add() for a confirmed
conntrack entry. However, nf_ct_ext_add() can only be called for
!nf_ct_is_confirmed().

[ 1825.349056] WARNING: CPU: 0 PID: 1279 at net/netfilter/nf_conntrack_extend.c:48 nf_ct_xt_add+0x18e/0x1a0 [nf_conntrack]
[ 1825.351391] RIP: 0010:nf_ct_ext_add+0x18e/0x1a0 [nf_conntrack]
[ 1825.351493] Code: 41 5c 41 5d 41 5e 41 5f c3 41 bc 0a 00 00 00 e9 15 ff ff ff ba 09 00 00 00 31 f6 4c 89 ff e8 69 6c 3d e9 eb 96 45 31 ed eb cd <0f> 0b e9 b1 fe ff ff e8 86 79 14 e9 eb bf 0f 1f 40 00 0f 1f 44 00
[ 1825.351721] RSP: 0018:ffffc90002e1f1e8 EFLAGS: 00010202
[ 1825.351790] RAX: 000000000000000e RBX: ffff88814f5783c0 RCX: ffffffffc0e4f887
[ 1825.351881] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88814f578440
[ 1825.351971] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88814f578447
[ 1825.352060] R10: ffffed1029eaf088 R11: 0000000000000001 R12: ffff88814f578440
[ 1825.352150] R13: ffff8882053f3a00 R14: 0000000000000000 R15: 0000000000000a20
[ 1825.352240] FS:  00007f992261c900(0000) GS:ffff889faec00000(0000) knlGS:0000000000000000
[ 1825.352343] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1825.352417] CR2: 000056070a4d1158 CR3: 000000015efe0000 CR4: 0000000000350ee0
[ 1825.352508] Call Trace:
[ 1825.352544]  nf_ct_helper_ext_add+0x10/0x60 [nf_conntrack]
[ 1825.352641]  nft_ct_expect_obj_eval+0x1b8/0x1e0 [nft_ct]
[ 1825.352716]  nft_do_chain+0x232/0x850 [nf_tables]

Add the ct helper extension only for unconfirmed conntrack. Skip rule
evaluation if the ct helper extension does not exist. Thus, you can
only create expectations from the first packet.

It should be possible to remove this limitation by adding a new action
to attach a generic ct helper to the first packet. Then, use this ct
helper extension from follow up packets to create the ct expectation.

While at it, add a missing check to skip the template conntrack too
and remove check for IPCT_UNTRACK which is implicit to !ct.

Fixes: 857b46027d ("netfilter: nft_ct: add ct expectations support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-02 12:43:34 +02:00
Maxim Mikityanskiy c55dcdd435 net/tls: Fix use-after-free after the TLS device goes down and up
When a netdev with active TLS offload goes down, tls_device_down is
called to stop the offload and tear down the TLS context. However, the
socket stays alive, and it still points to the TLS context, which is now
deallocated. If a netdev goes up, while the connection is still active,
and the data flow resumes after a number of TCP retransmissions, it will
lead to a use-after-free of the TLS context.

This commit addresses this bug by keeping the context alive until its
normal destruction, and implements the necessary fallbacks, so that the
connection can resume in software (non-offloaded) kTLS mode.

On the TX side tls_sw_fallback is used to encrypt all packets. The RX
side already has all the necessary fallbacks, because receiving
non-decrypted packets is supported. The thing needed on the RX side is
to block resync requests, which are normally produced after receiving
non-decrypted packets.

The necessary synchronization is implemented for a graceful teardown:
first the fallbacks are deployed, then the driver resources are released
(it used to be possible to have a tls_dev_resync after tls_dev_del).

A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
mode. It's used to skip the RX resync logic completely, as it becomes
useless, and some objects may be released (for example, resync_async,
which is allocated and freed by the driver).

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:58:05 -07:00
Maxim Mikityanskiy 05fc8b6cbd net/tls: Replace TLS_RX_SYNC_RUNNING with RCU
RCU synchronization is guaranteed to finish in finite time, unlike a
busy loop that polls a flag. This patch is a preparation for the bugfix
in the next patch, where the same synchronize_net() call will also be
used to sync with the TX datapath.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:58:05 -07:00
Alexander Aring dd9082f4a9 net: sock: fix in-kernel mark setting
This patch fixes the in-kernel mark setting by doing an additional
sk_dst_reset() which was introduced by commit 50254256f3 ("sock: Reset
dst when changing sk_mark via setsockopt"). The code is now shared to
avoid any further suprises when changing the socket mark value.

Fixes: 84d1c61740 ("net: sock: add sock_set_mark")
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:18:49 -07:00
Vladimir Oltean 4ef8d857b5 net: dsa: tag_8021q: fix the VLAN IDs used for encoding sub-VLANs
When using sub-VLANs in the range of 1-7, the resulting value from:

	rx_vid = dsa_8021q_rx_vid_subvlan(ds, port, subvlan);

is wrong according to the description from tag_8021q.c:

 | 11  | 10  |  9  |  8  |  7  |  6  |  5  |  4  |  3  |  2  |  1  |  0  |
 +-----------+-----+-----------------+-----------+-----------------------+
 |    DIR    | SVL |    SWITCH_ID    |  SUBVLAN  |          PORT         |
 +-----------+-----+-----------------+-----------+-----------------------+

For example, when ds->index == 0, port == 3 and subvlan == 1,
dsa_8021q_rx_vid_subvlan() returns 1027, same as it returns for
subvlan == 0, but it should have returned 1043.

This is because the low portion of the subvlan bits are not masked
properly when writing into the 12-bit VLAN value. They are masked into
bits 4:3, but they should be masked into bits 5:4.

Fixes: 3eaae1d05f ("net: dsa: tag_8021q: support up to 8 VLANs per port using sub-VLANs")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:02:05 -07:00
Krzysztof Kozlowski 4ac06a1e01 nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect
It's possible to trigger NULL pointer dereference by local unprivileged
user, when calling getsockname() after failed bind() (e.g. the bind
fails because LLCP_SAP_MAX used as SAP):

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  CPU: 1 PID: 426 Comm: llcp_sock_getna Not tainted 5.13.0-rc2-next-20210521+ #9
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1 04/01/2014
  Call Trace:
   llcp_sock_getname+0xb1/0xe0
   __sys_getpeername+0x95/0xc0
   ? lockdep_hardirqs_on_prepare+0xd5/0x180
   ? syscall_enter_from_user_mode+0x1c/0x40
   __x64_sys_getpeername+0x11/0x20
   do_syscall_64+0x36/0x70
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This can be reproduced with Syzkaller C repro (bind followed by
getpeername):
https://syzkaller.appspot.com/x/repro.c?x=14def446e00000

Cc: <stable@vger.kernel.org>
Fixes: d646960f79 ("NFC: Initial LLCP support")
Reported-by: syzbot+80fb126e7f7d8b1a5914@syzkaller.appspotmail.com
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20210531072138.5219-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:43:27 -07:00
Lin Ma e305509e67 Bluetooth: use correct lock to prevent UAF of hdev object
The hci_sock_dev_event() function will cleanup the hdev object for
sockets even if this object may still be in used within the
hci_sock_bound_ioctl() function, result in UAF vulnerability.

This patch replace the BH context lock to serialize these affairs
and prevent the race condition.

Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-05-31 14:33:26 +02:00
Paolo Abeni dea2b1ea9c mptcp: do not reset MP_CAPABLE subflow on mapping errors
When some mapping related errors occurs we close the main
MPC subflow with a RST. We should instead fallback gracefully
to TCP, and do the reset only for MPJ subflows.

Fixes: d22f4988ff ("mptcp: process MP_CAPABLE data option")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/192
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:51:40 -07:00
Paolo Abeni 06f9a435b3 mptcp: always parse mptcp options for MPC reqsk
In subflow_syn_recv_sock() we currently skip options parsing
for OoO packet, given that such packets may not carry the relevant
MPC option.

If the peer generates an MPC+data TSO packet and some of the early
segments are lost or get reorder, we server will ignore the peer key,
causing transient, unexpected fallback to TCP.

The solution is always parsing the incoming MPTCP options, and
do the fallback only for in-order packets. This actually cleans
the existing code a bit.

Fixes: d22f4988ff ("mptcp: process MP_CAPABLE data option")
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:51:40 -07:00
Paolo Abeni b5941f066b mptcp: fix sk_forward_memory corruption on retransmission
MPTCP sk_forward_memory handling is a bit special, as such field
is protected by the msk socket spin_lock, instead of the plain
socket lock.

Currently we have a code path updating such field without handling
the relevant lock:

__mptcp_retrans() -> __mptcp_clean_una_wakeup()

Several helpers in __mptcp_clean_una_wakeup() will update
sk_forward_alloc, possibly causing such field corruption, as reported
by Matthieu.

Address the issue providing and using a new variant of blamed function
which explicitly acquires the msk spin lock.

Fixes: 64b9cea7a0 ("mptcp: fix spurious retransmissions")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/172
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:51:39 -07:00
Linus Torvalds 5ff2756afd NFS client bugfixes for Linux 5.13
Highlights include:
 
 Stable fixes
 - Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
 - Fix Oops in xs_tcp_send_request() when transport is disconnected
 - Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
 
 Bugfixes
 - Fix instances where signal_pending() should be fatal_signal_pending()
 - fix an incorrect limit in filelayout_decode_layout()
 - Fixes for the SUNRPC backlogged RPC queue
 - Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
 - Revert commit 586a0787ce ("Clean up rpcrdma_prepare_readch()")
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmCvomgACgkQZwvnipYK
 APLg6xAAqlR/HLNYOLAToQ6d9wzrL6Po3x8Lx7VURjCkFaoKB/jMq3Zbu/K8mZ+X
 CC6/XFtB9AikloK7sle6lRPuwwPL6y+vOML0Ais/dkYPNkbhe9ylf0rsYQiPljXT
 8PAcqn8FXTZ9fKpKU8Quw24X1Jfkk6zUEeMy50HYDBfTx+gYEojMEKa6cl4URGzO
 2JpuBO4Ku/vWDOPj7bWBX9wi7wkrJjGSYDnx1A5SOgUdV87H8VJkbTo9vVdEwFoE
 OtE8MQmFhdton0u9+MKImFQdVfxoYLB1Ig1G45NXGHee91dwfYU0U05THj7E/xP9
 RQWtmJcKdvY1w8sRK/PNEHo43Vkow4usffSrIWNBZ6aO5EkbQFn1tmKMSDtsrkZ2
 ONMfKBiEhhQSy+QRXMR/RC86t4dsQ8SApu62qQT4VuuXqzYhrBum2DqkW0X6Zcti
 gi17+PfjRbgWNvul2yegBvDU016H324aCeT9nfWe0D9iwF7tPK4xsuNTYrWwbFOA
 YFAecIXoyBRtbIV6NZ95/+P5HEFBLAYewEVLpdAOBGQ9fjO023ERiC2sitl5P+ku
 v6V+4HAtBgcPfm/8BZwUYUYBpXnnTZFTizqdJdGGydPXeC671gANPJe6e4xOttCK
 frXFGd9OOqPSXdsRZLUVvhczTOFOGa/UVVG0GxIr4ggy8oKjDMk=
 =66ks
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
"Stable fixes:
   - Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
   - Fix Oops in xs_tcp_send_request() when transport is disconnected
   - Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()

  Bugfixes:
   - Fix instances where signal_pending() should be fatal_signal_pending()
   - fix an incorrect limit in filelayout_decode_layout()
   - Fixes for the SUNRPC backlogged RPC queue
   - Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
   - Revert commit 586a0787ce ("Clean up rpcrdma_prepare_readch()")"

* tag 'nfs-for-5.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: Remove trailing semicolon in macros
  xprtrdma: Revert 586a0787ce
  NFSv4: Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
  NFS: Clean up reset of the mirror accounting variables
  NFS: Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
  NFS: Fix an Oopsable condition in __nfs_pageio_add_request()
  SUNRPC: More fixes for backlog congestion
  SUNRPC: Fix Oops in xs_tcp_send_request() when transport is disconnected
  NFSv4: Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
  SUNRPC in case of backlog, hand free slots directly to waiting task
  pNFS/NFSv4: Remove redundant initialization of 'rd_size'
  NFS: fix an incorrect limit in filelayout_decode_layout()
  fs/nfs: Use fatal_signal_pending instead of signal_pending
2021-05-28 08:53:19 -10:00
Jakub Kicinski 44991d61aa Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for net:

1) Fix incorrect sockopts unregistration from error path,
   from Florian Westphal.

2) A few patches to provide better error reporting when missing kernel
   netfilter options are missing in .config.

3) Fix dormant table flag updates.

4) Memleak in IPVS  when adding service with IP_VS_SVC_F_HASHED flag.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
  netfilter: nf_tables: fix table flag updates
  netfilter: nf_tables: extended netlink error reporting for chain type
  netfilter: nf_tables: missing error reporting for not selected expressions
  netfilter: conntrack: unregister ipv4 sockopts on error unwind
====================

Link: https://lore.kernel.org/r/20210527190115.98503-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 15:54:12 -07:00
Ariel Levkovich fb91702b74 net/sched: act_ct: Fix ct template allocation for zone 0
Fix current behavior of skipping template allocation in case the
ct action is in zone 0.

Skipping the allocation may cause the datapath ct code to ignore the
entire ct action with all its attributes (commit, nat) in case the ct
action in zone 0 was preceded by a ct clear action.

The ct clear action sets the ct_state to untracked and resets the
skb->_nfct pointer. Under these conditions and without an allocated
ct template, the skb->_nfct pointer will remain NULL which will
cause the tc ct action handler to exit without handling commit and nat
actions, if such exist.

For example, the following rule in OVS dp:
recirc_id(0x2),ct_state(+new-est-rel-rpl+trk),ct_label(0/0x1), \
in_port(eth0),actions:ct_clear,ct(commit,nat(src=10.11.0.12)), \
recirc(0x37a)

Will result in act_ct skipping the commit and nat actions in zone 0.

The change removes the skipping of template allocation for zone 0 and
treats it the same as any other zone.

Fixes: b57dc7c13e ("net/sched: Introduce action ct")
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/20210526170110.54864-1-lariel@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 14:54:23 -07:00
Paul Blakey 0cc254e5aa net/sched: act_ct: Offload connections with commit action
Currently established connections are not offloaded if the filter has a
"ct commit" action. This behavior will not offload connections of the
following scenario:

$ tc_filter add dev $DEV ingress protocol ip prio 1 flower \
  ct_state -trk \
  action ct commit action goto chain 1

$ tc_filter add dev $DEV ingress protocol ip chain 1 prio 1 flower \
  action mirred egress redirect dev $DEV2

$ tc_filter add dev $DEV2 ingress protocol ip prio 1 flower \
  action ct commit action goto chain 1

$ tc_filter add dev $DEV2 ingress protocol ip prio 1 chain 1 flower \
  ct_state +trk+est \
  action mirred egress redirect dev $DEV

Offload established connections, regardless of the commit flag.

Fixes: 46475bb20f ("net/sched: act_ct: Software offload of established flows")
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Link: https://lore.kernel.org/r/1622029449-27060-1-git-send-email-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 14:54:14 -07:00
Parav Pandit b28d8f0c25 devlink: Correct VIRTUAL port to not have phys_port attributes
Physical port name, port number attributes do not belong to virtual port
flavour. When VF or SF virtual ports are registered they incorrectly
append "np0" string in the netdevice name of the VF/SF.

Before this fix, VF netdevice name were ens2f0np0v0, ens2f0np0v1 for VF
0 and 1 respectively.

After the fix, they are ens2f0v0, ens2f0v1.

With this fix, reading /sys/class/net/ens2f0v0/phys_port_name returns
-EOPNOTSUPP.

Also devlink port show example for 2 VFs on one PF to ensure that any
physical port attributes are not exposed.

$ devlink port show
pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false
pci/0000:06:00.3/196608: type eth netdev ens2f0v0 flavour virtual splittable false
pci/0000:06:00.4/262144: type eth netdev ens2f0v1 flavour virtual splittable false

This change introduces a netdevice name change on systemd/udev
version 245 and higher which honors phys_port_name sysfs file for
generation of netdevice name.

This also aligns to phys_port_name usage which is limited to switchdev
ports as described in [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/Documentation/networking/switchdev.rst

Fixes: acf1ee44ca ("devlink: Introduce devlink port flavour virtual")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20210526200027.14008-1-parav@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 14:43:41 -07:00
Lin Ma 6a137caec2 Bluetooth: fix the erroneous flush_work() order
In the cleanup routine for failed initialization of HCI device,
the flush_work(&hdev->rx_work) need to be finished before the
flush_work(&hdev->cmd_work). Otherwise, the hci_rx_work() can
possibly invoke new cmd_work and cause a bug, like double free,
in late processings.

This was assigned CVE-2021-3564.

This patch reorder the flush_work() to fix this bug.

Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: Hao Xiong <mart1n@zju.edu.cn>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-05-27 18:16:17 +02:00
Chuck Lever ae605ee983 xprtrdma: Revert 586a0787ce
Commit 9ed5af268e ("SUNRPC: Clean up the handling of page padding
in rpc_prepare_reply_pages()") [Dec 2020] affects RPC Replies that
have a data payload (i.e., Write chunks).

rpcrdma_prepare_readch(), as its name suggests, sets up Read chunks
which are data payloads within RPC Calls. Those payloads are
constructed by xdr_write_pages(), which continues to stuff the call
buffer's tail kvec with the payload's XDR roundup. Thus removing
the tail buffer logic in rpcrdma_prepare_readch() was the wrong
thing to do.

Fixes: 586a0787ce ("xprtrdma: Clean up rpcrdma_prepare_readch()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-05-27 08:46:19 -04:00
Julian Anastasov 56e4ee82e8 ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
syzbot reported memory leak [1] when adding service with
HASHED flag. We should ignore this flag both from sockopt
and netlink provided data, otherwise the service is not
hashed and not visible while releasing resources.

[1]
BUG: memory leak
unreferenced object 0xffff888115227800 (size 512):
  comm "syz-executor263", pid 8658, jiffies 4294951882 (age 12.560s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff83977188>] kmalloc include/linux/slab.h:556 [inline]
    [<ffffffff83977188>] kzalloc include/linux/slab.h:686 [inline]
    [<ffffffff83977188>] ip_vs_add_service+0x598/0x7c0 net/netfilter/ipvs/ip_vs_ctl.c:1343
    [<ffffffff8397d770>] do_ip_vs_set_ctl+0x810/0xa40 net/netfilter/ipvs/ip_vs_ctl.c:2570
    [<ffffffff838449a8>] nf_setsockopt+0x68/0xa0 net/netfilter/nf_sockopt.c:101
    [<ffffffff839ae4e9>] ip_setsockopt+0x259/0x1ff0 net/ipv4/ip_sockglue.c:1435
    [<ffffffff839fa03c>] raw_setsockopt+0x18c/0x1b0 net/ipv4/raw.c:857
    [<ffffffff83691f20>] __sys_setsockopt+0x1b0/0x360 net/socket.c:2117
    [<ffffffff836920f2>] __do_sys_setsockopt net/socket.c:2128 [inline]
    [<ffffffff836920f2>] __se_sys_setsockopt net/socket.c:2125 [inline]
    [<ffffffff836920f2>] __x64_sys_setsockopt+0x22/0x30 net/socket.c:2125
    [<ffffffff84350efa>] do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
    [<ffffffff84400068>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Reported-and-tested-by: syzbot+e562383183e4b1766930@syzkaller.appspotmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-27 13:06:48 +02:00
Linus Torvalds d7c5303fbc Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
can and wireless trees. Notably including fixes for the recently
 announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
 touching some core parts of the stack, too, but nothing hair-raising.
 
 Current release - regressions:
 
  - tipc: make node link identity publish thread safe
 
  - dsa: felix: re-enable TAS guard band mode
 
  - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()
 
  - stmmac: fix system hang if change mac address after interface ifdown
 
 Current release - new code bugs:
 
  - mptcp: avoid OOB access in setsockopt()
 
  - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
 
  - ethtool: stats: fix a copy-paste error - init correct array size
 
 Previous releases - regressions:
 
  - sched: fix packet stuck problem for lockless qdisc
 
  - net: really orphan skbs tied to closing sk
 
  - mlx4: fix EEPROM dump support
 
  - bpf: fix alu32 const subreg bound tracking on bitwise operations
 
  - bpf: fix mask direction swap upon off reg sign change
 
  - bpf, offload: reorder offload callback 'prepare' in verifier
 
  - stmmac: Fix MAC WoL not working if PHY does not support WoL
 
  - packetmmap: fix only tx timestamp on request
 
  - tipc: skb_linearize the head skb when reassembling msgs
 
 Previous releases - always broken:
 
  - mac80211: address recent "FragAttacks" vulnerabilities
 
  - mac80211: do not accept/forward invalid EAPOL frames
 
  - mptcp: avoid potential error message floods
 
  - bpf, ringbuf: deny reserve of buffers larger than ringbuf to prevent
                  out of buffer writes
 
  - bpf: forbid trampoline attach for functions with variable arguments
 
  - bpf: add deny list of functions to prevent inf recursion of tracing
         programs
 
  - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
 
  - can: isotp: prevent race between isotp_bind() and isotp_setsockopt()
 
  - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
               fallback to non-AVX2 version
 
 Misc:
 
  - bpf: add kconfig knob for disabling unpriv bpf by default
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCuy2gACgkQMUZtbf5S
 IruE5BAAhihia5EaiV71Bz/Cqr/d+osv5u283riKT8kBft0bWFVFFnT3iweWyR0/
 5X+bB6zmr80Cuqh45ZeYyq+zJtiAAlsbD5hqBIGdMriSWLxciNKjVJRzuEjuqnek
 USMW/LqGyf4NhmLogmQKpx8XcKSG7VYuK7vPrsH8us1dL5vIssceIXn8R9Dzj9NN
 P77K5Z+Oka8XQJgetNLxR3tDAM/92RwIshotkhJbRwgiUvzb+wbnrnSOAZCIPgku
 ydJyOxOklln1Sx07SejgzEl33ri0CkioDPThBWpOn7Mu0JrYKukXPKludoZcRYuJ
 2jNLYfbH0ZS5EkOfk89h7j7MDoAJMUK72M+S1w5DEYz6eH2EjhAq9noZ6E1iQH+U
 9vfoIvQjPh6Zhyk5QeM4dpt0cvR7rSElXkLVxo/x0dSBAi2rIng1bKeCUtv2J689
 CsoD0oghtEzvUTYVxY6iNr15OFGl6KsZv4tVQ709gGA36sDlK8ozGbJH5WReobBl
 f8H2WJlj2tVW5V75yUoio8TumDw34yk/5xlJFzm9GOwkqBrUcqOraHtHdUIsa4qr
 KbELQQ9QVt4zYdLAiWy5BL/QLycp0ibmA1IB8W1bxEVSK1JXzREHzPxv85KOfZkn
 8+vzNHmk2PEZYYsExiEykc5jXKOCPs8L0rJ6p4OverlbpDZcwIg=
 =peMK
 -----END PGP SIGNATURE-----

Merge tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
  can and wireless trees. Notably including fixes for the recently
  announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
  touching some core parts of the stack, too, but nothing hair-raising.

  Current release - regressions:

   - tipc: make node link identity publish thread safe

   - dsa: felix: re-enable TAS guard band mode

   - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()

   - stmmac: fix system hang if change mac address after interface
     ifdown

  Current release - new code bugs:

   - mptcp: avoid OOB access in setsockopt()

   - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers

   - ethtool: stats: fix a copy-paste error - init correct array size

  Previous releases - regressions:

   - sched: fix packet stuck problem for lockless qdisc

   - net: really orphan skbs tied to closing sk

   - mlx4: fix EEPROM dump support

   - bpf: fix alu32 const subreg bound tracking on bitwise operations

   - bpf: fix mask direction swap upon off reg sign change

   - bpf, offload: reorder offload callback 'prepare' in verifier

   - stmmac: Fix MAC WoL not working if PHY does not support WoL

   - packetmmap: fix only tx timestamp on request

   - tipc: skb_linearize the head skb when reassembling msgs

  Previous releases - always broken:

   - mac80211: address recent "FragAttacks" vulnerabilities

   - mac80211: do not accept/forward invalid EAPOL frames

   - mptcp: avoid potential error message floods

   - bpf, ringbuf: deny reserve of buffers larger than ringbuf to
     prevent out of buffer writes

   - bpf: forbid trampoline attach for functions with variable arguments

   - bpf: add deny list of functions to prevent inf recursion of tracing
     programs

   - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT

   - can: isotp: prevent race between isotp_bind() and
     isotp_setsockopt()

   - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
     fallback to non-AVX2 version

  Misc:

   - bpf: add kconfig knob for disabling unpriv bpf by default"

* tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits)
  net: phy: Document phydev::dev_flags bits allocation
  mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
  mptcp: avoid error message on infinite mapping
  mptcp: drop unconditional pr_warn on bad opt
  mptcp: avoid OOB access in setsockopt()
  nfp: update maintainer and mailing list addresses
  net: mvpp2: add buffer header handling in RX
  bnx2x: Fix missing error code in bnx2x_iov_init_one()
  net: zero-initialize tc skb extension on allocation
  net: hns: Fix kernel-doc
  sctp: fix the proc_handler for sysctl encap_port
  sctp: add the missing setting for asoc encap_port
  bpf, selftests: Adjust few selftest result_unpriv outcomes
  bpf: No need to simulate speculative domain for immediates
  bpf: Fix mask direction swap upon off reg sign change
  bpf: Wrap aux data inside bpf_sanitize_info container
  bpf: Fix BPF_LSM kconfig symbol dependency
  selftests/bpf: Add test for l3 use of bpf_redirect_peer
  bpftool: Add sock_release help info for cgroup attach/prog load command
  net: dsa: microchip: enable phy errata workaround on 9567
  ...
2021-05-26 17:44:49 -10:00
Trond Myklebust e86be3a04b SUNRPC: More fixes for backlog congestion
Ensure that we fix the XPRT_CONGESTED starvation issue for RDMA as well
as socket based transports.
Ensure we always initialise the request after waking up from the backlog
list.

Fixes: e877a88d1f ("SUNRPC in case of backlog, hand free slots directly to waiting task")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-05-26 06:36:13 -04:00
David S. Miller f5d287126f Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-05-26

The following pull-request contains BPF updates for your *net* tree.

We've added 14 non-merge commits during the last 14 day(s) which contain
a total of 17 files changed, 513 insertions(+), 231 deletions(-).

The main changes are:

1) Fix bpf_skb_change_head() helper to reset mac_len, from Jussi Maki.

2) Fix masking direction swap upon off-reg sign change, from Daniel Borkmann.

3) Fix BPF offloads in verifier by reordering driver callback, from Yinjun Zhang.

4) BPF selftest for ringbuf mmap ro/rw restrictions, from Andrii Nakryiko.

5) Follow-up fixes to nested bprintf per-cpu buffers, from Florent Revest.

6) Fix bpftool sock_release attach point help info, from Liu Jian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:59:24 -07:00
Davide Caratti d58300c318 mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
when Linux receives an echo-ed ADD_ADDR, it checks the IP address against
the list of "announced" addresses. In case of a positive match, the timer
that handles retransmissions is stopped regardless of the 'Address Id' in
the received packet: this behaviour does not comply with RFC8684 3.4.1.

Fix it by validating the 'Address Id' in received echo-ed ADD_ADDRs.
Tested using packetdrill, with the following captured output:

 unpatched kernel:

 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 3013740213], length 0
 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 3013740213], length 0
        ^^^ retransmission is stopped here, but 'Address Id' is 90

 patched kernel:

 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 1672384568], length 0
 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 1672384568], length 0
 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 198.51.100.2,mptcp dss ack 1672384568], length 0
        ^^^ retransmission is stopped here, only when both 'Address Id' and 'IP Address' match

Fixes: 00cfd77b90 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:56:20 -07:00
Paolo Abeni 3ed0a585bf mptcp: avoid error message on infinite mapping
Another left-over. Avoid flooding dmesg with useless text,
we already have a MIB for that event.

Fixes: 648ef4b886 ("mptcp: Implement MPTCP receive path")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:56:20 -07:00
Paolo Abeni 3812ce8950 mptcp: drop unconditional pr_warn on bad opt
This is a left-over of early day. A malicious peer can flood
the kernel logs with useless messages, just drop it.

Fixes: f296234c98 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:56:20 -07:00
Paolo Abeni 20b5759f21 mptcp: avoid OOB access in setsockopt()
We can't use tcp_set_congestion_control() on an mptcp socket, as
such function can end-up accessing a tcp-specific field -
prior_ssthresh - causing an OOB access.

To allow propagating the correct ca algo on subflow, cache the ca
name at initialization time.

Additionally avoid overriding the user-selected CA (if any) at
clone time.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/182
Fixes: aa1fbd94e5 ("mptcp: sockopt: add TCP_CONGESTION and TCP_INFO")
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:56:20 -07:00