Commit graph

1060058 commits

Author SHA1 Message Date
Tamir Duberstein fb7bc92040 ipv6: raw: check passed optlen before reading
Add a check that the user-provided option is at least as long as the
number of bytes we intend to read. Before this patch we would blindly
read sizeof(int) bytes even in cases where the user passed
optlen<sizeof(int), which would potentially read garbage or fault.

Discovered by new tests in https://github.com/google/gvisor/pull/6957 .

The original get_user call predates history in the git repo.

Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211229200947.2862255-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 12:32:56 -08:00
Ciara Loftus 5bec7ca2be xsk: Initialise xskb free_list_node
This commit initialises the xskb's free_list_node when the xskb is
allocated. This prevents a potential false negative returned from a call
to list_empty for that node, such as the one introduced in commit
199d983bc0 ("xsk: Fix crash on double free in buffer pool")

In my environment this issue caused packets to not be received by
the xdpsock application if the traffic was running prior to application
launch. This happened when the first batch of packets failed the xskmap
lookup and XDP_PASS was returned from the bpf program. This action is
handled in the i40e zc driver (and others) by allocating an skbuff,
freeing the xdp_buff and adding the associated xskb to the
xsk_buff_pool's free_list if it hadn't been added already. Without this
fix, the xskb is not added to the free_list because the check to determine
if it was added already returns an invalid positive result. Later, this
caused allocation errors in the driver and the failure to receive packets.

Fixes: 199d983bc0 ("xsk: Fix crash on double free in buffer pool")
Fixes: 2b43470add ("xsk: Introduce AF_XDP buffer allocation API")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20211220155250.2746-1-ciara.loftus@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 10:00:18 -08:00
Jakub Kicinski 9665e03a8d Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-28

This series contains updates to igc driver only.

Vinicius disables support for crosstimestamp on i225-V as lockups are being
observed.

James McLaughlin fixes Tx timestamping support on non-MSI-X platforms.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: Fix TX timestamp support for non-MSI-X platforms
  igc: Do not enable crosstimestamping for i225-V models
====================

Link: https://lore.kernel.org/r/20211228182421.340354-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-28 16:19:10 -08:00
Christophe JAILLET 140c7bc7d1 ionic: Initialize the 'lif->dbid_inuse' bitmap
When allocated, this bitmap is not initialized. Only the first bit is set a
few lines below.

Use bitmap_zalloc() to make sure that it is cleared before being used.

Fixes: 6461b446f2 ("ionic: Add interrupts and doorbells")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/6a478eae0b5e6c63774e1f0ddb1a3f8c38fa8ade.1640527506.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-28 16:07:35 -08:00
James McLaughlin f85846bbf4 igc: Fix TX timestamp support for non-MSI-X platforms
Time synchronization was not properly enabled on non-MSI-X platforms.

Fixes: 2c344ae245 ("igc: Add support for TX timestamping")
Signed-off-by: James McLaughlin <james.mclaughlin@qsc.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28 09:54:11 -08:00
Vinicius Costa Gomes 1e81dcc1ab igc: Do not enable crosstimestamping for i225-V models
It was reported that when PCIe PTM is enabled, some lockups could
be observed with some integrated i225-V models.

While the issue is investigated, we can disable crosstimestamp for
those models and see no loss of functionality, because those models
don't have any support for time synchronization.

Fixes: a90ec84837 ("igc: Add support for PTP getcrosststamp()")
Link: https://lore.kernel.org/all/924175a188159f4e03bd69908a91e606b574139b.camel@gmx.de/
Reported-by: Stefan Dietrich <roots@gmx.de>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28 09:54:10 -08:00
David S. Miller 16fa29aef7 Merge branch 'smc-fixes'
Dust Li says:

====================
net/smc: fix kernel panic caused by race of smc_sock

This patchset fixes the race between smc_release triggered by
close(2) and cdc_handle triggered by underlaying RDMA device.

The race is caused because the smc_connection may been released
before the pending tx CDC messages got its CQEs. In order to fix
this, I add a counter to track how many pending WRs we have posted
through the smc_connection, and only release the smc_connection
after there is no pending WRs on the connection.

The first patch prevents posting WR on a QP that is not in RTS
state. This patch is needed because if we post WR on a QP that
is not in RTS state, ib_post_send() may success but no CQE will
return, and that will confuse the counter tracking the pending
WRs.

The second patch add a counter to track how many WRs were posted
through the smc_connection, and don't reset the QP on link destroying
to prevent leak of the counter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-28 12:42:46 +00:00
Dust Li 349d43127d net/smc: fix kernel panic caused by race of smc_sock
A crash occurs when smc_cdc_tx_handler() tries to access smc_sock
but smc_release() has already freed it.

[ 4570.695099] BUG: unable to handle page fault for address: 000000002eae9e88
[ 4570.696048] #PF: supervisor write access in kernel mode
[ 4570.696728] #PF: error_code(0x0002) - not-present page
[ 4570.697401] PGD 0 P4D 0
[ 4570.697716] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 4570.698228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc4+ #111
[ 4570.699013] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/0
[ 4570.699933] RIP: 0010:_raw_spin_lock+0x1a/0x30
<...>
[ 4570.711446] Call Trace:
[ 4570.711746]  <IRQ>
[ 4570.711992]  smc_cdc_tx_handler+0x41/0xc0
[ 4570.712470]  smc_wr_tx_tasklet_fn+0x213/0x560
[ 4570.712981]  ? smc_cdc_tx_dismisser+0x10/0x10
[ 4570.713489]  tasklet_action_common.isra.17+0x66/0x140
[ 4570.714083]  __do_softirq+0x123/0x2f4
[ 4570.714521]  irq_exit_rcu+0xc4/0xf0
[ 4570.714934]  common_interrupt+0xba/0xe0

Though smc_cdc_tx_handler() checked the existence of smc connection,
smc_release() may have already dismissed and released the smc socket
before smc_cdc_tx_handler() further visits it.

smc_cdc_tx_handler()           |smc_release()
if (!conn)                     |
                               |
                               |smc_cdc_tx_dismiss_slots()
                               |      smc_cdc_tx_dismisser()
                               |
                               |sock_put(&smc->sk) <- last sock_put,
                               |                      smc_sock freed
bh_lock_sock(&smc->sk) (panic) |

To make sure we won't receive any CDC messages after we free the
smc_sock, add a refcount on the smc_connection for inflight CDC
message(posted to the QP but haven't received related CQE), and
don't release the smc_connection until all the inflight CDC messages
haven been done, for both success or failed ones.

Using refcount on CDC messages brings another problem: when the link
is going to be destroyed, smcr_link_clear() will reset the QP, which
then remove all the pending CQEs related to the QP in the CQ. To make
sure all the CQEs will always come back so the refcount on the
smc_connection can always reach 0, smc_ib_modify_qp_reset() was replaced
by smc_ib_modify_qp_error().
And remove the timeout in smc_wr_tx_wait_no_pending_sends() since we
need to wait for all pending WQEs done, or we may encounter use-after-
free when handling CQEs.

For IB device removal routine, we need to wait for all the QPs on that
device been destroyed before we can destroy CQs on the device, or
the refcount on smc_connection won't reach 0 and smc_sock cannot be
released.

Fixes: 5f08318f61 ("smc: connection data control (CDC)")
Reported-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-28 12:42:45 +00:00
Dust Li 90cee52f2e net/smc: don't send CDC/LLC message if link not ready
We found smc_llc_send_link_delete_all() sometimes wait
for 2s timeout when testing with RDMA link up/down.
It is possible when a smc_link is in ACTIVATING state,
the underlaying QP is still in RESET or RTR state, which
cannot send any messages out.

smc_llc_send_link_delete_all() use smc_link_usable() to
checks whether the link is usable, if the QP is still in
RESET or RTR state, but the smc_link is in ACTIVATING, this
LLC message will always fail without any CQE entering the
CQ, and we will always wait 2s before timeout.

Since we cannot send any messages through the QP before
the QP enter RTS. I add a wrapper smc_link_sendable()
which checks the state of QP along with the link state.
And replace smc_link_usable() with smc_link_sendable()
in all LLC & CDC message sending routine.

Fixes: 5f08318f61 ("smc: connection data control (CDC)")
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-28 12:42:45 +00:00
Wei Yongjun 1b9dadba50 NFC: st21nfca: Fix memory leak in device probe and remove
'phy->pending_skb' is alloced when device probe, but forgot to free
in the error handling path and remove path, this cause memory leak
as follows:

unreferenced object 0xffff88800bc06800 (size 512):
  comm "8", pid 11775, jiffies 4295159829 (age 9.032s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000d66c09ce>] __kmalloc_node_track_caller+0x1ed/0x450
    [<00000000c93382b3>] kmalloc_reserve+0x37/0xd0
    [<000000005fea522c>] __alloc_skb+0x124/0x380
    [<0000000019f29f9a>] st21nfca_hci_i2c_probe+0x170/0x8f2

Fix it by freeing 'pending_skb' in error and remove.

Fixes: 68957303f4 ("NFC: ST21NFCA: Add driver for STMicroelectronics ST21NFCA NFC Chip")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-28 12:40:20 +00:00
Aleksander Jan Bajkowski 5be60a9453 net: lantiq_xrx200: fix statistics of received bytes
Received frames have FCS truncated. There is no need
to subtract FCS length from the statistics.

Fixes: fe1a56420c ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-28 12:18:18 +00:00
Christophe JAILLET 1cd5384c88 net: ag71xx: Fix a potential double free in error handling paths
'ndev' is a managed resource allocated with devm_alloc_etherdev(), so there
is no need to call free_netdev() explicitly or there will be a double
free().

Simplify all error handling paths accordingly.

Fixes: d51b6ce441 ("net: ethernet: add ag71xx driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-28 12:16:34 +00:00
wolfgang huang 8b5fdfc57c mISDN: change function names to avoid conflicts
As we build for mips, we meet following error. l1_init error with
multiple definition. Some architecture devices usually marked with
l1, l2, lxx as the start-up phase. so we change the mISDN function
names, align with Isdnl2_xxx.

mips-linux-gnu-ld: drivers/isdn/mISDN/layer1.o: in function `l1_init':
(.text+0x890): multiple definition of `l1_init'; \
arch/mips/kernel/bmips_5xxx_init.o:(.text+0xf0): first defined here
make[1]: *** [home/mips/kernel-build/linux/Makefile:1161: vmlinux] Error 1

Signed-off-by: wolfgang huang <huangjinhui@kylinos.cn>
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-28 12:12:04 +00:00
Krzysztof Kozlowski 79b69a8370 nfc: uapi: use kernel size_t to fix user-space builds
Fix user-space builds if it includes /usr/include/linux/nfc.h before
some of other headers:

  /usr/include/linux/nfc.h:281:9: error: unknown type name ‘size_t’
    281 |         size_t service_name_len;
        |         ^~~~~~

Fixes: d646960f79 ("NFC: Initial LLCP support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 14:58:37 +00:00
Dmitry V. Levin 7175f02c4e uapi: fix linux/nfc.h userspace compilation errors
Replace sa_family_t with __kernel_sa_family_t to fix the following
linux/nfc.h userspace compilation errors:

/usr/include/linux/nfc.h:266:2: error: unknown type name 'sa_family_t'
  sa_family_t sa_family;
/usr/include/linux/nfc.h:274:2: error: unknown type name 'sa_family_t'
  sa_family_t sa_family;

Fixes: 23b7869c0f ("NFC: add the NFC socket raw protocol")
Fixes: d646960f79 ("NFC: Initial LLCP support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 14:58:03 +00:00
Matthias-Christian Ott ca506fca46 net: usb: pegasus: Do not drop long Ethernet frames
The D-Link DSB-650TX (2001:4002) is unable to receive Ethernet frames
that are longer than 1518 octets, for example, Ethernet frames that
contain 802.1Q VLAN tags.

The frames are sent to the pegasus driver via USB but the driver
discards them because they have the Long_pkt field set to 1 in the
received status report. The function read_bulk_callback of the pegasus
driver treats such received "packets" (in the terminology of the
hardware) as errors but the field simply does just indicate that the
Ethernet frame (MAC destination to FCS) is longer than 1518 octets.

It seems that in the 1990s there was a distinction between
"giant" (> 1518) and "runt" (< 64) frames and the hardware includes
flags to indicate this distinction. It seems that the purpose of the
distinction "giant" frames was to not allow infinitely long frames due
to transmission errors and to allow hardware to have an upper limit of
the frame size. However, the hardware already has such limit with its
2048 octet receive buffer and, therefore, Long_pkt is merely a
convention and should not be treated as a receive error.

Actually, the hardware is even able to receive Ethernet frames with 2048
octets which exceeds the claimed limit frame size limit of the driver of
1536 octets (PEGASUS_MTU).

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Matthias-Christian Ott <ott@mirix.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 14:52:06 +00:00
Zekun Shen 5f50153288 atlantic: Fix buff_ring OOB in aq_ring_rx_clean
The function obtain the next buffer without boundary check.
We should return with I/O error code.

The bug is found by fuzzing and the crash report is attached.
It is an OOB bug although reported as use-after-free.

[    4.804724] BUG: KASAN: use-after-free in aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.805661] Read of size 4 at addr ffff888034fe93a8 by task ksoftirqd/0/9
[    4.806505]
[    4.806703] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G        W         5.6.0 #34
[    4.809030] Call Trace:
[    4.809343]  dump_stack+0x76/0xa0
[    4.809755]  print_address_description.constprop.0+0x16/0x200
[    4.810455]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.811234]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.813183]  __kasan_report.cold+0x37/0x7c
[    4.813715]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.814393]  kasan_report+0xe/0x20
[    4.814837]  aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.815499]  ? hw_atl_b0_hw_ring_rx_receive+0x9a5/0xb90 [atlantic]
[    4.816290]  aq_vec_poll+0x179/0x5d0 [atlantic]
[    4.816870]  ? _GLOBAL__sub_I_65535_1_aq_pci_func_init+0x20/0x20 [atlantic]
[    4.817746]  ? __next_timer_interrupt+0xba/0xf0
[    4.818322]  net_rx_action+0x363/0xbd0
[    4.818803]  ? call_timer_fn+0x240/0x240
[    4.819302]  ? __switch_to_asm+0x40/0x70
[    4.819809]  ? napi_busy_loop+0x520/0x520
[    4.820324]  __do_softirq+0x18c/0x634
[    4.820797]  ? takeover_tasklets+0x5f0/0x5f0
[    4.821343]  run_ksoftirqd+0x15/0x20
[    4.821804]  smpboot_thread_fn+0x2f1/0x6b0
[    4.822331]  ? smpboot_unregister_percpu_thread+0x160/0x160
[    4.823041]  ? __kthread_parkme+0x80/0x100
[    4.823571]  ? smpboot_unregister_percpu_thread+0x160/0x160
[    4.824301]  kthread+0x2b5/0x3b0
[    4.824723]  ? kthread_create_on_node+0xd0/0xd0
[    4.825304]  ret_from_fork+0x35/0x40

Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 14:49:53 +00:00
yangxingwu 6c25449e1a net: udp: fix alignment problem in udp4_seq_show()
$ cat /pro/net/udp

before:

  sl  local_address rem_address   st tx_queue rx_queue tr tm->when
26050: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000
26320: 0100007F:0143 00000000:0000 07 00000000:00000000 00:00000000
27135: 00000000:8472 00000000:0000 07 00000000:00000000 00:00000000

after:

   sl  local_address rem_address   st tx_queue rx_queue tr tm->when
26050: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000
26320: 0100007F:0143 00000000:0000 07 00000000:00000000 00:00000000
27135: 00000000:8472 00000000:0000 07 00000000:00000000 00:00000000

Signed-off-by: yangxingwu <xingwu.yang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 14:47:14 +00:00
Karsten Graul 6d7373dabf net/smc: fix using of uninitialized completions
In smc_wr_tx_send_wait() the completion on index specified by
pend->idx is initialized and after smc_wr_tx_send() was called the wait
for completion starts. pend->idx is used to get the correct index for
the wait, but the pend structure could already be cleared in
smc_wr_tx_process_cqe().
Introduce pnd_idx to hold and use a local copy of the correct index.

Fixes: 09c61d24f9 ("net/smc: wait for departure of an IB message")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 14:46:17 +00:00
William Zhao c1833c3964 ip6_vti: initialize __ip6_tnl_parm struct in vti6_siocdevprivate
The "__ip6_tnl_parm" struct was left uninitialized causing an invalid
load of random data when the "__ip6_tnl_parm" struct was used elsewhere.
As an example, in the function "ip6_tnl_xmit_ctl()", it tries to access
the "collect_md" member. With "__ip6_tnl_parm" being uninitialized and
containing random data, the UBSAN detected that "collect_md" held a
non-boolean value.

The UBSAN issue is as follows:
===============================================================
UBSAN: invalid-load in net/ipv6/ip6_tunnel.c:1025:14
load of value 30 is not a valid value for type '_Bool'
CPU: 1 PID: 228 Comm: kworker/1:3 Not tainted 5.16.0-rc4+ #8
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Workqueue: ipv6_addrconf addrconf_dad_work
Call Trace:
<TASK>
dump_stack_lvl+0x44/0x57
ubsan_epilogue+0x5/0x40
__ubsan_handle_load_invalid_value+0x66/0x70
? __cpuhp_setup_state+0x1d3/0x210
ip6_tnl_xmit_ctl.cold.52+0x2c/0x6f [ip6_tunnel]
vti6_tnl_xmit+0x79c/0x1e96 [ip6_vti]
? lock_is_held_type+0xd9/0x130
? vti6_rcv+0x100/0x100 [ip6_vti]
? lock_is_held_type+0xd9/0x130
? rcu_read_lock_bh_held+0xc0/0xc0
? lock_acquired+0x262/0xb10
dev_hard_start_xmit+0x1e6/0x820
__dev_queue_xmit+0x2079/0x3340
? mark_lock.part.52+0xf7/0x1050
? netdev_core_pick_tx+0x290/0x290
? kvm_clock_read+0x14/0x30
? kvm_sched_clock_read+0x5/0x10
? sched_clock_cpu+0x15/0x200
? find_held_lock+0x3a/0x1c0
? lock_release+0x42f/0xc90
? lock_downgrade+0x6b0/0x6b0
? mark_held_locks+0xb7/0x120
? neigh_connected_output+0x31f/0x470
? lockdep_hardirqs_on+0x79/0x100
? neigh_connected_output+0x31f/0x470
? ip6_finish_output2+0x9b0/0x1d90
? rcu_read_lock_bh_held+0x62/0xc0
? ip6_finish_output2+0x9b0/0x1d90
ip6_finish_output2+0x9b0/0x1d90
? ip6_append_data+0x330/0x330
? ip6_mtu+0x166/0x370
? __ip6_finish_output+0x1ad/0xfb0
? nf_hook_slow+0xa6/0x170
ip6_output+0x1fb/0x710
? nf_hook.constprop.32+0x317/0x430
? ip6_finish_output+0x180/0x180
? __ip6_finish_output+0xfb0/0xfb0
? lock_is_held_type+0xd9/0x130
ndisc_send_skb+0xb33/0x1590
? __sk_mem_raise_allocated+0x11cf/0x1560
? dst_output+0x4a0/0x4a0
? ndisc_send_rs+0x432/0x610
addrconf_dad_completed+0x30c/0xbb0
? addrconf_rs_timer+0x650/0x650
? addrconf_dad_work+0x73c/0x10e0
addrconf_dad_work+0x73c/0x10e0
? addrconf_dad_completed+0xbb0/0xbb0
? rcu_read_lock_sched_held+0xaf/0xe0
? rcu_read_lock_bh_held+0xc0/0xc0
process_one_work+0x97b/0x1740
? pwq_dec_nr_in_flight+0x270/0x270
worker_thread+0x87/0xbf0
? process_one_work+0x1740/0x1740
kthread+0x3ac/0x490
? set_kthread_struct+0x100/0x100
ret_from_fork+0x22/0x30
</TASK>
===============================================================

The solution is to initialize "__ip6_tnl_parm" struct to zeros in the
"vti6_siocdevprivate()" function.

Signed-off-by: William Zhao <wizhao@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 12:17:33 +00:00
Ma Xinjian e6007b85df selftests: mptcp: Remove the deprecated config NFT_COUNTER
NFT_COUNTER was removed since
390ad4295aa ("netfilter: nf_tables: make counter support built-in")
LKP/0Day will check if all configs listing under selftests are able to
be enabled properly.

For the missing configs, it will report something like:
LKP WARN miss config CONFIG_NFT_COUNTER= of net/mptcp/config

- it's not reasonable to keep the deprecated configs.
- configs under kselftests are recommended by corresponding tests.
So if some configs are missing, it will impact the testing results

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ma Xinjian <xinjianx.ma@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-25 17:15:49 +00:00
Xin Long 5ec7d18d18 sctp: use call_rcu to free endpoint
This patch is to delay the endpoint free by calling call_rcu() to fix
another use-after-free issue in sctp_sock_dump():

  BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20
  Call Trace:
    __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
    lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
    __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
    _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
    spin_lock_bh include/linux/spinlock.h:334 [inline]
    __lock_sock+0x203/0x350 net/core/sock.c:2253
    lock_sock_nested+0xfe/0x120 net/core/sock.c:2774
    lock_sock include/net/sock.h:1492 [inline]
    sctp_sock_dump+0x122/0xb20 net/sctp/diag.c:324
    sctp_for_each_transport+0x2b5/0x370 net/sctp/socket.c:5091
    sctp_diag_dump+0x3ac/0x660 net/sctp/diag.c:527
    __inet_diag_dump+0xa8/0x140 net/ipv4/inet_diag.c:1049
    inet_diag_dump+0x9b/0x110 net/ipv4/inet_diag.c:1065
    netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244
    __netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352
    netlink_dump_start include/linux/netlink.h:216 [inline]
    inet_diag_handler_cmd+0x2ce/0x3f0 net/ipv4/inet_diag.c:1170
    __sock_diag_cmd net/core/sock_diag.c:232 [inline]
    sock_diag_rcv_msg+0x31d/0x410 net/core/sock_diag.c:263
    netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477
    sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:274

This issue occurs when asoc is peeled off and the old sk is freed after
getting it by asoc->base.sk and before calling lock_sock(sk).

To prevent the sk free, as a holder of the sk, ep should be alive when
calling lock_sock(). This patch uses call_rcu() and moves sock_put and
ep free into sctp_endpoint_destroy_rcu(), so that it's safe to try to
hold the ep under rcu_read_lock in sctp_transport_traverse_process().

If sctp_endpoint_hold() returns true, it means this ep is still alive
and we have held it and can continue to dump it; If it returns false,
it means this ep is dead and can be freed after rcu_read_unlock, and
we should skip it.

In sctp_sock_dump(), after locking the sk, if this ep is different from
tsp->asoc->ep, it means during this dumping, this asoc was peeled off
before calling lock_sock(), and the sk should be skipped; If this ep is
the same with tsp->asoc->ep, it means no peeloff happens on this asoc,
and due to lock_sock, no peeloff will happen either until release_sock.

Note that delaying endpoint free won't delay the port release, as the
port release happens in sctp_endpoint_destroy() before calling call_rcu().
Also, freeing endpoint by call_rcu() makes it safe to access the sk by
asoc->base.sk in sctp_assocs_seq_show() and sctp_rcv().

Thanks Jones to bring this issue up.

v1->v2:
  - improve the changelog.
  - add kfree(ep) into sctp_endpoint_destroy_rcu(), as Jakub noticed.

Reported-by: syzbot+9276d76e83e3bcde6c99@syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones@linaro.org>
Fixes: d25adbeb0c ("sctp: fix an use-after-free issue in sctp_sock_dump")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-25 17:13:37 +00:00
Miaoqian Lin b45396afa4 net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register
The fixed_phy_get_gpiod function() returns NULL, it doesn't return error
pointers, using NULL checking to fix this.i

Fixes: 5468e82f70 ("net: phy: fixed-phy: Drop GPIO from fixed_phy_add()")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20211224021500.10362-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-24 14:52:10 -08:00
Coco Li 5471d5226c selftests: Calculate udpgso segment count without header adjustment
The below referenced commit correctly updated the computation of number
of segments (gso_size) by using only the gso payload size and
removing the header lengths.

With this change the regression test started failing. Update
the tests to match this new behavior.

Both IPv4 and IPv6 tests are updated, as a separate patch in this series
will update udp_v6_send_skb to match this change in udp_send_skb.

Fixes: 158390e456 ("udp: using datalen to cap max gso segments")
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211223222441.2975883-2-lixiaoyan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 19:20:06 -08:00
Coco Li 736ef37fd9 udp: using datalen to cap ipv6 udp max gso segments
The max number of UDP gso segments is intended to cap to
UDP_MAX_SEGMENTS, this is checked in udp_send_skb().

skb->len contains network and transport header len here, we should use
only data len instead.

This is the ipv6 counterpart to the below referenced commit,
which missed the ipv6 change

Fixes: 158390e456 ("udp: using datalen to cap max gso segments")
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211223222441.2975883-1-lixiaoyan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 19:20:06 -08:00
Jakub Kicinski 6f6f0ac664 mlx5-fixes-2021-12-22
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmHD/VkACgkQSD+KveBX
 +j4AXgf/eOPLXGtPLtWI6J5tVtIRk1sX0BDRcOhvHJiQtFtRGFNpQKdwsZ0bDjos
 YSuRtqiY1SORQOuqDL41r2m68jzXpU49z3O6jD4ELojw2+rKmTC6PiNdRdNm34rl
 1bU25qYNK7rsW1EyoaW1FUp91+5+1pkzWcJwO0JY6mrCoUa2FFdwFDkb6KBRDiCZ
 JLBRKFzsqzprYIFWqBm6FyE+0vFipkMzp33tIYzgoe1/A1gWOspsTJDd2tOFVXvw
 UWOudh3xYi1+7WcFov1K4vf1ppFvPhe3JkzC47Q/qqNia8gDYcXRosGw06c05Z5p
 G8CU7D44OUJ4OLjG8oMyNhFAfpUKEg==
 =tS8n
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-12-22

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
  net/mlx5e: Delete forward rule for ct or sample action
  net/mlx5e: Fix ICOSQ recovery flow for XSK
  net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow
  net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled
  net/mlx5e: Wrap the tx reporter dump callback to extract the sq
  net/mlx5: Fix tc max supported prio for nic mode
  net/mlx5: Fix SF health recovery flow
  net/mlx5: Fix error print in case of IRQ request failed
  net/mlx5: Use first online CPU instead of hard coded CPU
  net/mlx5: DR, Fix querying eswitch manager vport for ECPF
  net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources
====================

Link: https://lore.kernel.org/r/20211223190441.153012-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 19:04:33 -08:00
Linus Torvalds 95b40115a9 drm fixes for 5.16-rc7
mediatek:
 - NULL pointer check
 
 i915:
 - guc submission locking fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmHE9GEACgkQDHTzWXnE
 hr7ofA//ZVrsRwyzOXeUE4REbqta0hlx4a2NlmHRcH0qO448o6tji3f13eegiIps
 v01wkXT/G7XI/9+if8aoMXnsBkg1mw6Tv3F8JHLAcSGDKNAUVvRsYd+4fgPqZisH
 ix9VVO3ewkJ67W/JUN3Vc3jZAfuDGuL4IBudHmnSZSHed0UyqR43XslrH0+dsJzP
 oSVMKn+YNPhjN7pdwLLFT4klMwNsull3IG7gKYUL2R3FTxTUGYQdRdiAKOwOn/5u
 6khtROf/D0qJw7LpxRzqZoH7Gt9JMYyghyGwPn0RMNPjvYPKwAPjdScx22fR1rg5
 6dnJTUb4VxyXNMt9N6je9ZrW+YuXJgBK3oJc1VsddBZY0kHYedXHsPQakIzhHgiB
 Do1ZXhgfk6RnQao/lwKEGAPIbcYGmL9PHyeqf6mKDJqEutYBczVyPSbSDT112xWn
 O7/ftaG2X3Ggm1oICwyPJeqATzzftSKns/fiu0w142ZcuMIaDsUVweOhyZkJBUGP
 UG2JDh1yFexk7R0BdBK09shK42+ffOh+9pGkhZdvKCPwnTM88DFUwvBgWIhqX//W
 8Wkf/g5+WTa//iDWdaeiWqiIc+E+GNPCG4bBmDbhbCV1enTkLlYkLYTEQMVIHwjK
 f/7MC92XZhbFByAwK3idj6RjS0X2LXvz01V2OSXQlnEdLBhGXrQ=
 =RvXb
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2021-12-24' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Happy Xmas. Nothing major, one mediatek and a couple of i915 locking
  fixes. There might be a few stragglers over next week or so but I
  don't expect much before next release.

  mediatek:
   - NULL pointer check

  i915:
   - guc submission locking fixes"

* tag 'drm-fixes-2021-12-24' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/guc: Only assign guc_id.id when stealing guc_id
  drm/i915/guc: Use correct context lock when callig clr_context_registered
  drm/mediatek: hdmi: Perform NULL pointer check for mtk_hdmi_conf
2021-12-23 15:43:25 -08:00
Linus Torvalds a026fa5404 io_uring-5.16-2021-12-23
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmHEmDcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpu1EEACnW4afkADzsu5L0DOYV0k2171cBQFaIeNw
 FJb3ebp6iHbWS0BBnseKKqdpFOQoGF6q8lcDdyx/cmUH7217cZTRDt1jxxXPwmGP
 i3Zy2HMh6Dq/RX2t0Be7oMEFZeRyB+FIQySvsR4JSfhOeCF+x0kg1Rvwljd4u/t/
 sEM4BEPl1ssm1YO5ZXvhnxCoX7eEhS3pbnJj5ZV/y46YTLr9crJQBYfeOHOflEvp
 iyTfK6NGy/CDR8a9DTXrYaGRozd7J8ArO+5K4qdLtNP7Tc6+bMrV/CE5l7gwWdlU
 06eeRxGt/4NQwfWD2aphRaRM5Qa/VcssMy5ikKf98UKvalzwbaAl7Hk5ywbxdQPV
 cgiHTQa1gwpY7FAw+KQwE2m0iAqaLs4eXfFNFt68yJ2NKc3tVWsMXtFskuBdz/E7
 TeoXQTEb5PwDsjHmmFbFlW2OuJajEeIJfRG0pXellOgqm8VDruvjemNDFlg80QmC
 TpF5y/dbhIht9BHTt+K5mniRZXlVfjePok04XWQgYaIvkYATbGzC4vXMAQdwfW6y
 kO6Ez7zaDjQJShDKZvni5VlXhWwWQEjm01f1tLmljPiLmZH+bwGQVin1jHgYpskW
 4mUZD2UpwNY5N5QbEfKeQ/xCxl7EQAN8WPY4HoCssK139WFHczoGTECl6/DFC4sB
 TrTMi9rRaA==
 =CQfA
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.16-2021-12-23' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
 "Single fix for not clearing kiocb->ki_pos back to 0 for a stream,
  destined for stable as well"

* tag 'io_uring-5.16-2021-12-23' of git://git.kernel.dk/linux-block:
  io_uring: zero iocb->ki_pos for stream file types
2021-12-23 15:32:07 -08:00
Linus Torvalds 7fe2bc1b64 Merge branch 'ucount-rlimit-fixes-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucount fix from Eric Biederman:
 "This fixes a silly logic bug in the ucount rlimits code, where it was
  comparing against the wrong limit"

* 'ucount-rlimit-fixes-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Fix rlimit max values check
2021-12-23 15:27:02 -08:00
Linus Torvalds 76657eaef4 Networking fixes for 5.16-rc7, including fixes from netfilter.
Current release - regressions:
 
  - revert "tipc: use consistent GFP flags"
 
 Previous releases - regressions:
 
  - igb: fix deadlock caused by taking RTNL in runtime resume path
 
  - accept UFOv6 packages in virtio_net_hdr_to_skb
 
  - netfilter: fix regression in looped (broad|multi)cast's MAC handling
 
  - bridge: fix ioctl old_deviceless bridge argument
 
  - ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor,
 	avoid stalls when multiple sockets use an interface
 
 Previous releases - always broken:
 
  - inet: fully convert sk->sk_rx_dst to RCU rules
 
  - veth: ensure skb entering GRO are not cloned
 
  - sched: fix zone matching for invalid conntrack state
 
  - bonding: fix ad_actor_system option setting to default
 
  - nf_tables: fix use-after-free in nft_set_catchall_destroy()
 
  - lantiq_xrx200: increase buffer reservation to avoid mem corruption
 
  - ice: xsk: avoid leaking app buffers during clean up
 
  - tun: avoid double free in tun_free_netdev
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHEwHEACgkQMUZtbf5S
 IrvFgw/9GGpIqnx4YBwq9MAsbk/JzE93icLM4obcmqkvHX7PCYotapwqAu2YfiW8
 i/UzkwPY8HpDCOHeYQfwdf3QMGHC+o428SZBjKZDAi/BhLciD//9jp20AReaBznX
 Y+6ia5qH/1T6YzsrGXzv7T+pyHgY36IivAoAJZoDDeOsmy8iQckSNI09nn4EAt0K
 U9qMJo4Dw3jkaAGQdVz3mGO8/xjf9oWmuf+ghmsghMl1QC3i2Qq6+Cuf/H7jGOau
 p72/Fpwwi26lX798m7sqdSmQRVINo+UJbMpoBbfttCAIT/RTpn/lcuqHVV5aeqhs
 0qanit/3weCGSGz8zVX9xxPNO6zyPC47u7zWXD5q3JdkMDOylJR/dUAnPxatQ7BM
 TkUT4sdRVm6AqkmJqKEEthpVX7FPhG9BaCDDUZbY4hCxd2Bdth8eRVHgJAFFsops
 vqY7KMzbUCKl2RmHbm5l+vYl3HTgpE1kQzhD/wTKE4djtKYFut8Jgwq6aDfTeFjh
 d/IhTMJF1iJ2ZEw5fx6RC7Z9rEkSr+JrpW6OusunapbK2U5PeS/b3jeYv4a6wbbS
 03E3Ouin6cuyUwDwdwPhkE9id2Y0Q+r0MeKdK+wNQ4sAKcgVvc0/95HbZx2biNW/
 iduaJoRvGlLC8Fb/aCOBLVO3CeTubr5yPkg0jPZ4QB30lQc44xY=
 =/g9I
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - regressions:

   - revert "tipc: use consistent GFP flags"

  Previous releases - regressions:

   - igb: fix deadlock caused by taking RTNL in runtime resume path

   - accept UFOv6 packages in virtio_net_hdr_to_skb

   - netfilter: fix regression in looped (broad|multi)cast's MAC
     handling

   - bridge: fix ioctl old_deviceless bridge argument

   - ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor,
     avoid stalls when multiple sockets use an interface

  Previous releases - always broken:

   - inet: fully convert sk->sk_rx_dst to RCU rules

   - veth: ensure skb entering GRO are not cloned

   - sched: fix zone matching for invalid conntrack state

   - bonding: fix ad_actor_system option setting to default

   - nf_tables: fix use-after-free in nft_set_catchall_destroy()

   - lantiq_xrx200: increase buffer reservation to avoid mem corruption

   - ice: xsk: avoid leaking app buffers during clean up

   - tun: avoid double free in tun_free_netdev"

* tag 'net-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M
  r8152: sync ocp base
  r8152: fix the force speed doesn't work for RTL8156
  net: bridge: fix ioctl old_deviceless bridge argument
  net: stmmac: ptp: fix potentially overflowing expression
  net: dsa: tag_ocelot: use traffic class to map priority on injected header
  veth: ensure skb entering GRO are not cloned.
  asix: fix wrong return value in asix_check_host_enable()
  asix: fix uninit-value in asix_mdio_read()
  sfc: falcon: Check null pointer of rx_queue->page_ring
  sfc: Check null pointer of rx_queue->page_ring
  net: ks8851: Check for error irq
  drivers: net: smc911x: Check for error irq
  fjes: Check for error irq
  bonding: fix ad_actor_system option setting to default
  igb: fix deadlock caused by taking RTNL in RPM resume path
  gve: Correct order of processing device options
  net: skip virtio_net_hdr_set_proto if protocol already set
  net: accept UFOv6 packages in virtio_net_hdr_to_skb
  docs: networking: replace skb_hwtstamp_tx with skb_tstamp_tx
  ...
2021-12-23 10:45:55 -08:00
Nobuhiro Iwamatsu 391e5975c0 net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M
ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a
value, which is 0. Fix from BIT(0) to 0.

Reported-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Fixes: b38dd98ff8 ("net: stmmac: Add Toshiba Visconti SoCs glue driver")
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:58:13 -08:00
Jakub Kicinski 65fd0c33eb Merge branch 'r8152-fix-bugs'
Hayes Wang says:

====================
r8152: fix bugs

Patch #1 fix the issue of force speed mode for RTL8156.
Patch #2 fix the issue of unexpected ocp_base.
====================

Link: https://lore.kernel.org/r/20211223092702.23841-386-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:56:09 -08:00
Hayes Wang b24edca309 r8152: sync ocp base
There are some chances that the actual base of hardware is different
from the value recorded by driver, so we have to reset the variable
of ocp_base to sync it.

Set ocp_base to -1. Then, it would be updated and the new base would be
set to the hardware next time.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:56:06 -08:00
Hayes Wang 45bf944e67 r8152: fix the force speed doesn't work for RTL8156
It needs to set mdio force mode. Otherwise, link off always occurs when
setting force speed.

Fixes: 195aae321c ("r8152: support new chips")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:56:06 -08:00
Linus Torvalds 996a18eb79 sound fixes for 5.16-rc7
Quite a few small fixes, hopefully the last batch for 5.16.
 Most of them are device-specific quirks and/or fixes, and nothing
 looks scary for the late stage.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmHDfDgOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE8BEQ//bGg53nT2PnTjqPToe61XNti4SZg4esV0bQUL
 wU4rl0kg4vrnoQsZWYeGu3+rRzLQd7Cx7dOoWCaZx00/+O4Bpqb0uebbOyttLTEv
 K/RGLRmvscU3XK+xLJ4TSIh2CPu+8njj2ugl4Za0UclmxdOtB+1oPLVP609FEmIm
 XgLuAQNnWSMvZg+rrG98L+GdcVt8g/e/ga3EAVbWZHUSo/VtnM2U2gI0NQV6+d0Z
 cxmRp8X50VgptM1EdpHDyJpO+Mu23RLhoH8XhhoiuJKCj0K6wuMqZRuVewxSOQyR
 3gg42MLFCklPArWWEJ223HSFjTxx4KAgUdVyJGN+TTOmwNiqLCzFNaLSSy+XZA8r
 I7P+/ol4sJ5JRKbBojOE0w3LyZuI9X0ypH+9OnWRKA/nxoKtQemUa4Z9U3l2wIUg
 csH9gT7VaV7km6heVOuCvH3yDRxe8eP+zCdXgo8j+1krV0lwDSUp0lKTtTWvwuz2
 AoCJpBIUZz8Cj26wdpKPHSc2GkutUv6RmibTLqzd7iSvAIYSTshOYAABRPil9YYj
 sx5NDe/VrI/kyEJF0xWj/3HCFozOhIndsNWe0sAATL3JPqeikLXxukKqACLUbiQu
 1nOhEdZ2LqW6wLHZqQnnrsdG85BMJRcE8cKwqcmoTGq6fCis9iQIcX3i/qz/bHRg
 torPhos=
 =oW9C
 -----END PGP SIGNATURE-----

Merge tag 'sound-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Quite a few small fixes, hopefully the last batch for 5.16.

  Most of them are device-specific quirks and/or fixes, and nothing
  looks scary for the late stage"

* tag 'sound-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Fix quirk for Clevo NJ51CU
  ALSA: rawmidi - fix the uninitalized user_pversion
  ALSA: hda: intel-sdw-acpi: go through HDAS ACPI at max depth of 2
  ALSA: hda: intel-sdw-acpi: harden detection of controller
  ALSA: hda/hdmi: Disable silent stream on GLK
  ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook
  ASoC: meson: aiu: Move AIU_I2S_MISC hold setting to aiu-fifo-i2s
  ASoC: meson: aiu: fifo: Add missing dma_coerce_mask_and_coherent()
  ASoC: tas2770: Fix setting of high sample rates
  ASoC: rt5682: fix the wrong jack type detected
  ALSA: hda/realtek: Add new alc285-hp-amp-init model
  ALSA: hda/realtek: Amp init fixup for HP ZBook 15 G6
  ASoC: tegra: Restore headphones jack name on Nyan Big
  ASoC: tegra: Add DAPM switches for headphones and mic jack
  ALSA: jack: Check the return value of kstrdup()
  ALSA: drivers: opl3: Fix incorrect use of vp->state
  ASoC: SOF: Intel: pci-tgl: add new ADL-P variant
  ASoC: SOF: Intel: pci-tgl: add ADL-N support
2021-12-23 09:55:58 -08:00
Remi Pommarel d95a56207c net: bridge: fix ioctl old_deviceless bridge argument
Commit 561d835281 ("bridge: use ndo_siocdevprivate") changed the
source and destination arguments of copy_{to,from}_user in bridge's
old_deviceless() from args[1] to uarg breaking SIOC{G,S}IFBR ioctls.

Commit cbd7ad29a5 ("net: bridge: fix ioctl old_deviceless bridge
argument") fixed only BRCTL_{ADD,DEL}_BRIDGES commands leaving
BRCTL_GET_BRIDGES one untouched.

The fixes BRCTL_GET_BRIDGES as well and has been tested with busybox's
brctl.

Example of broken brctl:
$ brctl show
bridge name     bridge id               STP enabled     interfaces
brctl: can't get bridge name for index 0: No such device or address

Example of fixed brctl:
$ brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.000000000000       no

Fixes: 561d835281 ("bridge: use ndo_siocdevprivate")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/all/20211223153139.7661-2-repk@triplefau.lt/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:53:50 -08:00
Xiaoliang Yang eccffcf465 net: stmmac: ptp: fix potentially overflowing expression
Convert the u32 variable to type u64 in a context where expression of
type u64 is required to avoid potential overflow.

Fixes: e9e3720002 ("net: stmmac: ptp: update tas basetime after ptp adjust")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:45:55 -08:00
Xiaoliang Yang ae2778a647 net: dsa: tag_ocelot: use traffic class to map priority on injected header
For Ocelot switches, the CPU injected frames have an injection header
where it can specify the QoS class of the packet and the DSA tag, now it
uses the SKB priority to set that. If a traffic class to priority
mapping is configured on the netdevice (with mqprio for example ...), it
won't be considered for CPU injected headers. This patch make the QoS
class aligned to the priority to traffic class mapping if it exists.

Fixes: 8dce89aa5f ("net: dsa: ocelot: add tagger for Ocelot/Felix switches")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Marouen Ghodhbane <marouen.ghodhbane@nxp.com>
Link: https://lore.kernel.org/r/20211223072211.33130-1-xiaoliang.yang_1@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:44:59 -08:00
Linus Torvalds 3bf6f01398 gpio fixes for v5.16-rc7
- fix interrupts when replugging the device in gpio-dln2
 - remove the arbitrary timeout on virtio requests from gpio-virtio
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmHEoTwACgkQEacuoBRx
 13KXRg/6AuCCAEVFSZ545AmfDmLX26sAUPnxsWM/HOX1+hgH1wnL4H/y0EL2zZqM
 PYWkztJNmSIpJ6F232MztE+lJ4qr/3CWB8yHWNt3WK58NQlo4iISQTeRKHTnhHcu
 8d4VeHmqCETfqkkv1/+Pg0/0bNsEvru5EfpqcE/ZrGrtzKWRLDKfmyyFiVHhCyrw
 Q5NK0LgPJuftv2EFuP4Y1LSv1VSANC1AkgG0Cxu8KKuNItIAtQp/+s/2FElFPpxQ
 wldnHP+0Kf3vBP790ANGXVtBaRhUQs8jkdlX1xioOieP/htV9ZP7jfKPubYf9kQU
 QMLtkPUJ4yzL5v/S2KGdyUSlGLui63GqnXPGZVQmRR2Dvcpfdb7EpsuL9PgCc7z5
 XA/mWPX1tWrr+KDiHa2y/TTFoe0iU/zatojo9XqB7bB1XNF5Z0OXtiH1wUGbSV3T
 4AaEafTNUT7negI9gZ3nu6kl/WRFde4fYt7RuA3irpHtFqoDFDu3iVkolyGsOHgU
 cP7nUCEkC2T27mRsKtKX+e+7hVo3uZ2L4MmISQ2kRWa324rHWXuHHVPArAjavozP
 /zIq81wzsX7U6dMuUD8APdyTgr/MSMGNuurK+C0iFKBjnnjUMoCjXzfywHuAX0gf
 UGvm4CIf7xCtpsdS8D7JHhT7fMPGzOW20t1+/QnQCate3Sg5h80=
 =qZzN
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix interrupts when replugging the device in gpio-dln2

 - remove the arbitrary timeout on virtio requests from gpio-virtio

* tag 'gpio-fixes-for-v5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: virtio: remove timeout
  gpio: dln2: Fix interrupts when replugging the device
2021-12-23 09:44:29 -08:00
Paolo Abeni 9695b7de5b veth: ensure skb entering GRO are not cloned.
After commit d3256efd8e ("veth: allow enabling NAPI even without XDP"),
if GRO is enabled on a veth device and TSO is disabled on the peer
device, TCP skbs will go through the NAPI callback. If there is no XDP
program attached, the veth code does not perform any share check, and
shared/cloned skbs could enter the GRO engine.

Ignat reported a BUG triggered later-on due to the above condition:

[   53.970529][    C1] kernel BUG at net/core/skbuff.c:3574!
[   53.981755][    C1] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[   53.982634][    C1] CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc5+ #25
[   53.982634][    C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   53.982634][    C1] RIP: 0010:skb_shift+0x13ef/0x23b0
[   53.982634][    C1] Code: ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0
7f 08 84 c0 0f 85 41 0c 00 00 41 80 7f 02 00 4d 8d b5 d0 00 00 00 0f
85 74 f5 ff ff <0f> 0b 4d 8d 77 20 be 04 00 00 00 4c 89 44 24 78 4c 89
f7 4c 89 8c
[   53.982634][    C1] RSP: 0018:ffff8881008f7008 EFLAGS: 00010246
[   53.982634][    C1] RAX: 0000000000000000 RBX: ffff8881180b4c80 RCX: 0000000000000000
[   53.982634][    C1] RDX: 0000000000000002 RSI: ffff8881180b4d3c RDI: ffff88810bc9cac2
[   53.982634][    C1] RBP: ffff8881008f70b8 R08: ffff8881180b4cf4 R09: ffff8881180b4cf0
[   53.982634][    C1] R10: ffffed1022999e5c R11: 0000000000000002 R12: 0000000000000590
[   53.982634][    C1] R13: ffff88810f940c80 R14: ffff88810f940d50 R15: ffff88810bc9cac0
[   53.982634][    C1] FS:  0000000000000000(0000) GS:ffff888235880000(0000) knlGS:0000000000000000
[   53.982634][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.982634][    C1] CR2: 00007ff5f9b86680 CR3: 0000000108ce8004 CR4: 0000000000170ee0
[   53.982634][    C1] Call Trace:
[   53.982634][    C1]  <TASK>
[   53.982634][    C1]  tcp_sacktag_walk+0xaba/0x18e0
[   53.982634][    C1]  tcp_sacktag_write_queue+0xe7b/0x3460
[   53.982634][    C1]  tcp_ack+0x2666/0x54b0
[   53.982634][    C1]  tcp_rcv_established+0x4d9/0x20f0
[   53.982634][    C1]  tcp_v4_do_rcv+0x551/0x810
[   53.982634][    C1]  tcp_v4_rcv+0x22ed/0x2ed0
[   53.982634][    C1]  ip_protocol_deliver_rcu+0x96/0xaf0
[   53.982634][    C1]  ip_local_deliver_finish+0x1e0/0x2f0
[   53.982634][    C1]  ip_sublist_rcv_finish+0x211/0x440
[   53.982634][    C1]  ip_list_rcv_finish.constprop.0+0x424/0x660
[   53.982634][    C1]  ip_list_rcv+0x2c8/0x410
[   53.982634][    C1]  __netif_receive_skb_list_core+0x65c/0x910
[   53.982634][    C1]  netif_receive_skb_list_internal+0x5f9/0xcb0
[   53.982634][    C1]  napi_complete_done+0x188/0x6e0
[   53.982634][    C1]  gro_cell_poll+0x10c/0x1d0
[   53.982634][    C1]  __napi_poll+0xa1/0x530
[   53.982634][    C1]  net_rx_action+0x567/0x1270
[   53.982634][    C1]  __do_softirq+0x28a/0x9ba
[   53.982634][    C1]  run_ksoftirqd+0x32/0x60
[   53.982634][    C1]  smpboot_thread_fn+0x559/0x8c0
[   53.982634][    C1]  kthread+0x3b9/0x490
[   53.982634][    C1]  ret_from_fork+0x22/0x30
[   53.982634][    C1]  </TASK>

Address the issue by skipping the GRO stage for shared or cloned skbs.
To reduce the chance of OoO, try to unclone the skbs before giving up.

v1 -> v2:
 - use avoid skb_copy and fallback to netif_receive_skb  - Eric

Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Fixes: d3256efd8e ("veth: allow enabling NAPI even without XDP")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Ignat Korchagin <ignat@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/b5f61c5602aab01bac8d711d8d1bfab0a4817db7.1640197544.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:44:06 -08:00
Linus Torvalds 0d81b5faa2 MMC core:
- Disable card detect during shutdown
 
 MMC host:
  - mmci: Fixup tuning support for stm32_sdmmc
  - meson-mx-sdhc: Fix support for multi-block SDIO commands
  - sdhci-tegra: Fix support for eMMC HS400ES mode
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmHEdk4XHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCmfYhAAy66DW8YmcXkXMwpiIRh/aTsr
 Bo60fTk1U1Q4bxC8BsQqItuVKaC5XaiMaJZC0GSQI9pDLI8L8MYGzUfACr8hJI5x
 QGAqajPceAyaX2v5jRkCTxtA7nkIc482/ylCvLLH1cahBFMKMV8ULEoOIzjglcjc
 ZIGwoY8mK+NMMeyRjMalASTK415AdRmlqOhJkXVJ0bsQs+Or+xzEI2RC5UKoZHhg
 LqXCf8F9YHCWtRjYlFfY+wliyFea6bkhq2mnb3KOx4ieXzCC471Ttnz4rR+rAFMs
 XgzfTQ0/W4Yb/UJ4UeiEAdZoBAyiz6q8Pl35GaV8q2MuNvDgPPekLBiVG82ILXCB
 x+CqREzyjmQqqLzVD6f3kl9TPIHXxP3wySLCraGgi7/rKd1Bg//EyhCemRQ2cxEa
 buVlrOQR6bTXqb+qeImjzXBB0/HV4hNPceLNAAbUR6N5lVp1ERsyxi/f20Z7keAU
 Z4Ib6rm1oco3hFHGbjoS87HjeLowduGqmRsEqkcGCjblWudVBqX8Wb/pR6QToGCq
 jjz1ZAk4ZOCvv5uwR1/DhJyreekA6Ce9b8e7FH+wIMHT/tz2kbhxGmPe7u5vcTTX
 8A4CubzdXhOiX5ZtxfRyU0tr3Roe183MaA+Cwgw+l1OmHeuwvfQ3QLxTy1b776kd
 qiAakPQRnumE7RLzw0I=
 =0qCG
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Disable card detect during shutdown

  MMC host:
   - mmci: Fixup tuning support for stm32_sdmmc
   - meson-mx-sdhc: Fix support for multi-block SDIO commands
   - sdhci-tegra: Fix support for eMMC HS400ES mode"

* tag 'mmc-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: mmci: stm32: clear DLYB_CR after sending tuning command
  mmc: meson-mx-sdhc: Set MANUAL_STOP for multi-block SDIO commands
  mmc: core: Disable card detect during shutdown
  mmc: sdhci-tegra: Fix switch to HS400ES mode
2021-12-23 09:37:59 -08:00
Linus Torvalds c8cc50a98e ARM: SoC fixes for 5.16, part 4
This is my last set of fixes for 5.16, including
 
  - multiple code fixes for the op-tee firmware driver
 
  - Two patches for allwinner SoCs, one fixing the phy mode on
    a board, the other one fixing a driver bug in the "RSB"
    bus driver. This was originally targetted for 5.17, but
    seemed worth moving to 5.16.
 
  - Two small fixes for devicetree files on i.MX platforms,
    resolving problems with ethernet and i2c.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmHETMMACgkQmmx57+YA
 GNlLuw//V9qDeX+G3VcIblfcVZsfnP/ruUxMxh9IR9TWFqy4p2ss0Nn3L9J49Qj4
 fG16PYarh3fvpf5w6EWkRv/VUS6nFdmH5D0XUu8OhGzgaCZaJs8HCGOkg1Le9FDh
 PHX/5rS90ZuMKjQQMFUbKOQRvWkbI6Gyf+q0Cs04iaIlDgIRq+0OG77TqFhl/NhF
 xu5C45FpO2U7ICNHoziYDqHJMJxd1M1sIFUdG2V/uSmXYQ7I17eQigdm57QARRzt
 Mdgos4C+suG56NOaenBWqZgaAN+0qNtvDj7umNPb+gdH5vC2bjnwFEcdNzsGX4Vw
 1GbSuasZG8saEDLvbD9tlGeptliSlZNcVaUElx+BNSUoTZ/EQXZx/UMmURXVQAvr
 JIn1Ge5iWuJ8LWGSCFpS6kMbLVDNtM0oO4v72HeVkA7c7ixEUt5Vb4CRIxFFTghA
 sECH0Y5gQQKMTs+Tfstq3ZqvaanerHFvuot7mTrucFk8k/nQxYYTNkPTgY5B/I2H
 0xS9mAKuoHu093eF8WxVmfggFFs59vf/q29L0H6Nbueale/xPVFqT5oVm9ds4o/P
 Ye9GvFqPPPY42jrVE+PcAtOPx9sTdIIiFUczUkwXR1Klv8Tfi69RZbpchaJ6m93G
 qRLoJHO6z1Ct+AMxfsjql7Pn0XGqo1TlZOLxfFSM+DBpUsxt4FM=
 =vUXF
 -----END PGP SIGNATURE-----

Merge tag 'arm-fixes-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "This is my last set of fixes for 5.16, including

   - multiple code fixes for the op-tee firmware driver

   - Two patches for allwinner SoCs, one fixing the phy mode on a board,
     the other one fixing a driver bug in the "RSB" bus driver. This was
     originally targeted for 5.17, but seemed worth moving to 5.16

   - Two small fixes for devicetree files on i.MX platforms, resolving
     problems with ethernet and i2c"

* tag 'arm-fixes-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  optee: Suppress false positive kmemleak report in optee_handle_rpc()
  tee: optee: Fix incorrect page free bug
  arm64: dts: lx2160a: fix scl-gpios property name
  tee: handle lookup of shm with reference count 0
  ARM: dts: imx6qdl-wandboard: Fix Ethernet support
  bus: sunxi-rsb: Fix shutdown
  arm64: dts: allwinner: orangepi-zero-plus: fix PHY mode
2021-12-23 09:22:34 -08:00
Christophe JAILLET 4390c6edc0 net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out'
where 'flow_flag_set(flow, FAILED);' is called.

All but the new error handling paths added by the commits given in the
Fixes tag below.

Fix these error handling paths and branch to 'err_out'.

Fixes: 166f431ec6 ("net/mlx5e: Add indirect tc offload of ovs internal port")
Fixes: b16eb3c81f ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
(cherry picked from commit 31108d142f)
2021-12-22 20:38:49 -08:00
Chris Mi 2820110d94 net/mlx5e: Delete forward rule for ct or sample action
When there is ct or sample action, the ct or sample rule will be deleted
and return. But if there is an extra mirror action, the forward rule can't
be deleted because of the return.

Fix it by removing the return.

Fixes: 69e2916ebc ("net/mlx5: CT: Add support for mirroring")
Fixes: f94d6389f6 ("net/mlx5e: TC, Add support to offload sample action")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:49 -08:00
Maxim Mikityanskiy 19c4aba2d4 net/mlx5e: Fix ICOSQ recovery flow for XSK
There are two ICOSQs per channel: one is needed for RX, and the other
for async operations (XSK TX, kTLS offload). Currently, the recovery
flow for both is the same, and async ICOSQ is mistakenly treated like
the regular ICOSQ.

This patch prevents running the regular ICOSQ recovery on async ICOSQ.
The purpose of async ICOSQ is to handle XSK wakeup requests and post
kTLS offload RX parameters, it has nothing to do with RQ and XSKRQ UMRs,
so the regular recovery sequence is not applicable here.

Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Maxim Mikityanskiy 17958d7cd7 net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow
Both regular RQ and XSKRQ use the same ICOSQ for UMRs. When doing
recovery for the ICOSQ, don't forget to deactivate XSKRQ.

XSK can be opened and closed while channels are active, so a new mutex
prevents the ICOSQ recovery from running at the same time. The ICOSQ
recovery deactivates and reactivates XSKRQ, so any parallel change in
XSK state would break consistency. As the regular RQ is running, it's
not enough to just flush the recovery work, because it can be
rescheduled.

Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Gal Pressman a0cb909644 net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled
When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in
Kconfig), the mlx5e_rep_tc_receive() function which is responsible for
passing the skb to the stack (or freeing it) is defined as a nop, and
results in leaking the skb memory. Replace the nop with a call to
napi_gro_receive() to resolve the leak.

Fixes: 28e7606fa8 ("net/mlx5e: Refactor rx handler of represetor device")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Amir Tzin 918fc3855a net/mlx5e: Wrap the tx reporter dump callback to extract the sq
Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
mlx5e_txqsq *, but in TX-timeout-recovery flow the argument is actually
of type struct mlx5e_tx_timeout_ctx *.

 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
 BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
 kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
 CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 [mlx5_core]
 Call Trace:
 mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
 devlink_health_do_dump.part.91+0x71/0xd0
 devlink_health_report+0x157/0x1b0
 mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
 ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0
 [mlx5_core]
 ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
 ? update_load_avg+0x19b/0x550
 ? set_next_entity+0x72/0x80
 ? pick_next_task_fair+0x227/0x340
 ? finish_task_switch+0xa2/0x280
   mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
   process_one_work+0x1de/0x3a0
   worker_thread+0x2d/0x3c0
 ? process_one_work+0x3a0/0x3a0
   kthread+0x115/0x130
 ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30
 --[ end trace 51ccabea504edaff ]---
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Kernel Offset: disabled
 end Kernel panic - not syncing: Fatal exception

To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which
extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the
TX-timeout-recovery flow dump callback.

Fixes: 5f29458b77 ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Chris Mi d671e109bd net/mlx5: Fix tc max supported prio for nic mode
Only prio 1 is supported if firmware doesn't support ignore flow
level for nic mode. The offending commit removed the check wrongly.
Add it back.

Fixes: 9a99c8f125 ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:47 -08:00
Moshe Shemesh 33de865f7b net/mlx5: Fix SF health recovery flow
SF do not directly control the PCI device. During recovery flow SF
should not be allowed to do pci disable or pci reset, its PF will do it.

It fixes the following kernel trace:
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations
mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations
mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed,
status bad resource state(0x9), syndrome (0x658908)
mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed

Fixes: 1958fc2f07 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:47 -08:00