Make sure that the transmit traffic is tagged correctly or else the
firmware will refuse to transmit and will report an ACL violation.
On receive the hardware will make sure that tagged traffic is delivered
to the appropriate VM. The driver only asserts that the VLAN id that
was extracted from the wire traffic matches the VF's configuration.
All this works when associating a specific VLAN id with a VF. The
'trunk' setting likely needs more work.
MFC after: 1 week
Sponsored by: Chelsio Communications
hw.cxgbe.doorbells_allowed="0xf"
The adapter's doorbells bitmap is clipped to the value specified in the
tunable, which is meant for debug and workarounds only. There is no
change in default behavior.
MFC after: 1 week
Sponsored by: Chelsio Communications
It is pointless to attempt an operation that is not permitted. It spams
the firmware devlog with "insufficient caps" errors that distract from
real errors.
78 2463625358 ERR CORE insufficient caps to process mailbox cmd: pfn 0x0 vfn 0x1; r_caps 0x86 wx_caps 0x82 required r_caps 0x81 w_caps 0x5
MFC after: 1 week
Sponsored by: Chelsio Communications
In several places, a loop tests for powers of two, or iterates through
powers of two. In those places, replace the loop with an invocation
of fls or ilog2 without changing the meaning of the code.
Reviewed by: alc, markj, kib, np, erj, avg (previous version)
Differential Revision: https://reviews.freebsd.org/D45494
The kernel source contains several definitions of an ilog2 function;
some are slower than necessary, and one of them is incorrect.
Elimininate them all and define an ilog2 macro in libkern to replace
them, in a way that is fast, correct for all argument types, and, in a
GENERIC kernel, includes a check for an invalid zero parameter.
Folks at Microsoft have verified that having a correct ilog2
definition for their MANA driver doesn't break it.
Reviewed by: alc, markj, mhorne (older version), jhibbits (older version)
Differential Revision: https://reviews.freebsd.org/D45170
Differential Revision: https://reviews.freebsd.org/D45235
This affects TOE operation when multiple rx c-channels are in use for
offload, which is an unusual configuration.
MFC after: 1 week
Sponsored by: Chelsio Communications
It is the equivalent of tx_chan but for receive so rx_chan is a better
name. Initialize both using helper functions and make sure both are
displayed in the sysctl MIB.
MFC after: 1 week
Sponsored by: Chelsio Communications
PORTVEC obtained from the firmware is the authoritative source of this
information, and nports (calculated from PORTVEC) is available by the
time t4_port_init runs.
MFC after: 1 week
Sponsored by: Chelsio Communications
All the channels are not used on all boards and there's no point
allocating taskqueues that will never be used.
MFC after: 1 week
Sponsored by: Chelsio Communications
The firmware clears the interrupts already and it has a better idea of
exactly what to clear for which generation of the ASIC. There is no
need for the driver to get involved.
MFC after: 1 week
Sponsored by: Chelsio Communications
Use a separate state for when a request to set RX_QUIESCE has been
sent but the resulting TCB reply has not been received. In
particular, this correctly handles the case where data has been
received and queued in the receive queue before the quiesce request
takes effect.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44435
When this socket option is enabled, relatively large contiguous
buffers are allocated and used to receive data from the remote
connection. When data is received a wrapper M_EXT mbuf is queued to
the socket's receive buffer. This reduces the length of the linked
list of received mbufs and allows consumers to consume receive data in
larger chunks.
To minimize reprogramming the page pods in the adapter, receive
buffers for a given connection are recycled. When a buffer has been
fully consumed by the receiver and freed, the buffer is placed on a
per-connection free buffers list.
The size of the receive buffers defaults to 256k and can be set via
the hw.cxgbe.toe.ddp_rcvbuf_len sysctl. The
hw.cxgbe.toe.ddp_rcvbuf_cache sysctl (defaults to 4) determines the
maximum number of free buffers cached per connection. Note that this
limit does not apply to "in-flight" receive buffers that are
associated with mbufs in the socket's receive buffer.
Co-authored-by: Navdeep Parhar <np@FreeBSD.org>
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44001
Most ULP modes in cxgbe's TOE are enabled on the fly when a protocol
is needed (e.g. ULP_MODE_ISCSI is enabled by cxgbei when offloading a
connection using iSCSI, and ULP_MODE_TLS is enabled when RX TLS keys
are programmed for a TOE connection). The one exception to this is
ULP_MODE_TCPDDP.
Currently the cxgbe driver enables ULP_MODE_TCPDDP when a TOE
connection is first created. However, since DDP connections cannot be
converted to other connection types, this requires some special
handling in the driver. For example, iSCSI daemons use the SO_NO_DDP
socket option to ensure TOE connections use ULP_MODE_NONE so they can
be converted to ULP_MODE_ISCSI. Similarly, using TLS receive offload
(ULP_MODE_TLS) requires disabling TCP DDP for new connections by
default.
This commit changes cxgbe to instead switch a connection from
ULP_MODE_NONE to ULP_MODE_TCPDDP when a connection first attempts to
use TCP DDP via aio_read(2). This permits connections to always start
as ULP_MODE_NONE and switch to a protocol-specific mode as needed.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43670
Previously this was only limited on T6 cards to support switching from
ULP_MODE_NONE to ULP_MODE_TLS. To support switching to
ULP_MODE_TCPDDP, enable this for all adapters.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43669
Version : 1.27.5.0
Date : 10/10/2023
=====================
Fixes
-----
BASE:
- Fixed handling the Remote Fault with AN, causing the link failure.
=====================
Obtained from: Chelsio Communications
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Replace the DOOMED flag with a transient DETACHING flag that is cleared
when VI is detached. This fixes VI reattach when only the VI and not
the parent nexus is detached. The old flag was never cleared and
prevented subsequent synch op's related to the VI.
PR: 275260
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43287
Sponsored by: Chelsio Communications
This avoids a mutex reinitialization when the VI is detached and
reattached.
Fixes: 516fe911a6 cxgbe(4): Always use the per-VI callout to read interface stats.
MFC after: 1 week
Sponsored by: Chelsio Communications
Just like it was done for accept(2) in cfb1e92912, use same approach
for two simplier syscalls that return socket addresses. Although,
these two syscalls aren't performance critical, this change generalizes
some code between 3 syscalls trimming code size.
Following example of accept(2), provide VNET-aware and INVARIANT-checking
wrappers sopeeraddr() and sosockaddr() around protosw methods.
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D42694
Let the accept functions provide stack memory for protocols to fill it in.
Generic code should provide sockaddr_storage, specialized code may provide
smaller structure.
While rewriting accept(2) make 'addrlen' a true in/out parameter, reporting
required length in case if provided length was insufficient. Our manual
page accept(2) and POSIX don't explicitly require that, but one can read
the text as they do. Linux also does that. Update tests accordingly.
Reviewed by: rscheff, tuexen, zlei, dchagin
Differential Revision: https://reviews.freebsd.org/D42635
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.
Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
Sponsored by: Netflix
1. The tracing ifnet's name must match the nexus name. One way to do
this is to not use a unit number.
2. Do not hold the tracer lock around ether_ifattach or ether_ifdetach.
netlink calls back into the driver (see stack below) and that leads
to illegal lock recursion.
tracer_ioctl
if_ioctl
get_operstate_ether
get_operstate
dump_iface
rtnl_handle_ifevent
rtnl_handle_ifattach
if_attach_internal
if_attach
ether_ifattach
t4_cloner_create
MFC after: 3 days
Sponsored by: Chelsio Communications
Similar to dcfddc8dc0, replace the
simpler, inlined version with the full version.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D41690
In particular, the kernel RPC layer used by the NFS client never
invokes pru_rcvd since it always reads data from the socket upcall
via MSG_SOCALLBCK which avoids calling pru_rcvd. As a result, on an
NFS client connection managed by t4_tom, RX credits were never
returned to the TOE connection to open the TCP window resulting in
connection hangs.
To fix, expand the set of conditions in do_rx_data where RX credits
are returned to match those in t4_rcvd_locked by calling the function
directly.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D41688
netlink(4) calls back into the driver during detach and it attempts to
start an internal synchronized op recursively, causing an interruptible
hang. Fix it by failing the ioctl if the VI has been marked as DOOMED
by cxgbe_detach.
Here's the stack for the hang for reference.
#6 begin_synchronized_op
#7 cxgbe_media_status
#8 ifmedia_ioctl
#9 cxgbe_ioctl
#10 if_ioctl
#11 get_operstate_ether
#12 get_operstate
#13 dump_iface
#14 rtnl_handle_ifevent
#15 rtnl_handle_ifnet_event
#16 rt_ifmsg
#17 if_unroute
#18 if_down
#19 if_detach_internal
#20 if_detach
#21 ether_ifdetach
#22 cxgbe_vi_detach
#23 cxgbe_detach
#24 DEVICE_DETACH
MFC after: 3 days
Sponsored by: Chelsio Communications
This change adds struct tcp_info fields corresponding to the following
struct tcpcb ones:
- snd_una
- snd_max
- rcv_numsacks
- rcv_adv
- dupacks
Note that while both tcp_fill_info() and fill_tcp_info_from_tcb() are
extended accordingly, no counterpart of rcv_numsacks is available in
the cxgbe(4) TOE PCB, though.
Sponsored by: NetApp, Inc. (originally)
This function actually only ever reads from the TCP PCB. Consequently,
also make the pointer to its TCP PCB parameter const.
Sponsored by: NetApp, Inc. (originally)