mlx5-updates-2023-08-14

1) Handle PTP out of order CQEs issue
2) Check FW status before determining reset successful
3) Expose maximum supported SFs via devlink resource
4) Misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmTan0YACgkQSD+KveBX
 +j7Cugf/Smu/Pc8UWHXpOoqd7glZhxKhA+DruyltC3cS/XJSwG/cA+sjXUo4SCnR
 TPP3ucd3lyIfwzBAJPJgpJDv+uznDhM4VVkD/RbTT5JXpW9aZ/vxwtmBggZ93YXN
 1aQdSr+rvdJWx8hV6XYtkP6GQDqGugL3/lYcwEGSrOU+cds8RJzTnltEHIcc9ldr
 gj7/N6JmliYvyDXqN7tK2YLSVAd0oouVDr19qDIYORctLnYHnCg06aTzqDJH1Hfh
 xltQMPy+EtfCnKrsMr7Hkd9INrEECkP5l27a4PuTbFQDzCdzLClbAdAS3Mq4EG+q
 e9byCdAlpLoBe5Sjmb4baCy/F4mkYg==
 =6IRd
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2023-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-08-14

1) Handle PTP out of order CQEs issue
2) Check FW status before determining reset successful
3) Expose maximum supported SFs via devlink resource
4) Misc cleanups

* tag 'mlx5-updates-2023-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Don't query MAX caps twice
  net/mlx5: Remove unused MAX HCA capabilities
  net/mlx5: Remove unused CAPs
  net/mlx5: Fix error message in mlx5_sf_dev_state_change_handler()
  net/mlx5: Remove redundant check of mlx5_vhca_event_supported()
  net/mlx5: Use mlx5_sf_start_function_id() helper instead of directly calling MLX5_CAP_GEN()
  net/mlx5: Remove redundant SF supported check from mlx5_sf_hw_table_init()
  net/mlx5: Use auxiliary_device_uninit() instead of device_put()
  net/mlx5: E-switch, Add checking for flow rule destinations
  net/mlx5: Check with FW that sync reset completed successfully
  net/mlx5: Expose max possible SFs via devlink resource
  net/mlx5e: Add recovery flow for tx devlink health reporter for unhealthy PTP SQ
  net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs
  net/mlx5: Consolidate devlink documentation in devlink/mlx5.rst
====================

Link: https://lore.kernel.org/r/20230814214144.159464-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
commit ccd06f502b
Committed by: Jakub Kicinski, 2023-08-15 19:21:49 -07:00
25 changed files with 677 additions and 574 deletions


@ -683,6 +683,12 @@ the software port.
time protocol.
- Error
* - `ptp_cq[i]_late_cqe`
- Number of times a CQE was delivered on the PTP timestamping CQ after the
time window in which it was expected had elapsed, i.e. past the point
where the device normally guarantees no CQE will be posted.
- Error
.. [#ring_global] The corresponding ring and global counters do not share the
same name (i.e. do not follow the common naming scheme).
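These counters surface through the standard ethtool statistics. A hedged
usage sketch (interface name hypothetical, assuming the PTP channel is
active so the per-CQ counters are instantiated)::

$ ethtool -S eth0 | grep ptp_cq0_late_cqe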


@ -1,313 +0,0 @@
.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
.. include:: <isonum.txt>
=======
Devlink
=======
:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Contents
========
- `Info`_
- `Parameters`_
- `Health reporters`_
Info
====
The devlink info reports the running and stored firmware versions of the device.
It also prints the device PSID, which represents the HCA board type ID.
User command example::
$ devlink dev info pci/0000:00:06.0
pci/0000:00:06.0:
driver mlx5_core
versions:
fixed:
fw.psid MT_0000000009
running:
fw.version 16.26.0100
stored:
fw.version 16.26.0100
Parameters
==========
flow_steering_mode: Device flow steering mode
---------------------------------------------
The flow steering mode parameter controls the flow steering mode of the driver.
Two modes are supported:
1. 'dmfs' - Device managed flow steering.
2. 'smfs' - Software/Driver managed flow steering.
In DMFS mode, the HW steering entities are created and managed through the
Firmware.
In SMFS mode, the HW steering entities are created and managed by the
driver directly, programming the hardware without firmware intervention.
SMFS mode is faster and provides a better rule insertion rate than the default DMFS mode.
User command examples:
- Set SMFS flow steering mode::
$ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime
- Read device flow steering mode::
$ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
pci/0000:06:00.0:
name flow_steering_mode type driver-specific
values:
cmode runtime value smfs
enable_roce: RoCE enablement state
----------------------------------
If the device supports RoCE disablement, RoCE enablement state controls device
support for RoCE capability. Otherwise, the control occurs in the driver stack.
When RoCE is disabled at the driver level, only raw ethernet QPs are supported.
To change RoCE enablement state, a user must change the driverinit cmode value
and run devlink reload.
User command examples:
- Disable RoCE::
$ devlink dev param set pci/0000:06:00.0 name enable_roce value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0
- Read RoCE enablement state::
$ devlink dev param show pci/0000:06:00.0 name enable_roce
pci/0000:06:00.0:
name enable_roce type generic
values:
cmode driverinit value true
esw_port_metadata: Eswitch port metadata state
----------------------------------------------
When applicable, disabling eswitch metadata can increase packet rate
up to 20% depending on the use case and packet sizes.
Eswitch port metadata state controls whether to internally tag packets with
metadata. Metadata tagging must be enabled for multi-port RoCE, failover
between representors and stacked devices.
By default metadata is enabled on the supported devices in E-switch.
Metadata is applicable only for E-switch in switchdev mode and
users may disable it when NONE of the below use cases will be in use:
1. HCA is in Dual/multi-port RoCE mode.
2. VF/SF representor bonding (Usually used for Live migration)
3. Stacked devices
When metadata is disabled, the above use cases will fail to initialize if
users try to enable them.
- Show eswitch port metadata::
$ devlink dev param show pci/0000:06:00.0 name esw_port_metadata
pci/0000:06:00.0:
name esw_port_metadata type driver-specific
values:
cmode runtime value true
- Disable eswitch port metadata::
$ devlink dev param set pci/0000:06:00.0 name esw_port_metadata value false cmode runtime
- Change eswitch mode to switchdev mode after choosing the metadata value::
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
hairpin_num_queues: Number of hairpin queues
--------------------------------------------
We refer to a TC NIC rule that involves forwarding as "hairpin".
Hairpin queues are an mlx5 hardware-specific implementation for hardware
forwarding of such packets.
- Show the number of hairpin queues::
$ devlink dev param show pci/0000:06:00.0 name hairpin_num_queues
pci/0000:06:00.0:
name hairpin_num_queues type driver-specific
values:
cmode driverinit value 2
- Change the number of hairpin queues::
$ devlink dev param set pci/0000:06:00.0 name hairpin_num_queues value 4 cmode driverinit
hairpin_queue_size: Size of the hairpin queues
----------------------------------------------
Control the size of the hairpin queues.
- Show the size of the hairpin queues::
$ devlink dev param show pci/0000:06:00.0 name hairpin_queue_size
pci/0000:06:00.0:
name hairpin_queue_size type driver-specific
values:
cmode driverinit value 1024
- Change the size (in packets) of the hairpin queues::
$ devlink dev param set pci/0000:06:00.0 name hairpin_queue_size value 512 cmode driverinit
Health reporters
================
tx reporter
-----------
The tx reporter is responsible for reporting and recovering from the following two error scenarios:
- tx timeout
Report on kernel tx timeout detection.
Recover by searching for lost interrupts.
- tx error completion
Report on error tx completion.
Recover by flushing and resetting the tx queue.
The tx reporter also supports an on-demand diagnose callback, which provides
real-time information on the status of its send queues.
User commands examples:
- Diagnose send queues status::
$ devlink health diagnose pci/0000:82:00.0 reporter tx
.. note::
This command has valid output only when the interface is up; otherwise the command has empty output.
- Show the number of tx errors indicated, the number of recovery flows that completed
successfully, whether auto-recovery is enabled, and the grace period since the last recovery::
$ devlink health show pci/0000:82:00.0 reporter tx
rx reporter
-----------
The rx reporter is responsible for reporting and recovering from the following two error scenarios:
- rx queues' initialization (population) timeout
Population of rx queues' descriptors on ring initialization is done
in napi context via triggering an irq. In case of a failure to get
the minimum amount of descriptors, a timeout would occur, and
descriptors could be recovered by polling the EQ (Event Queue).
- rx completions with errors (reported by HW on interrupt context)
Report on rx completion error.
Recover (if needed) by flushing and resetting the related queue.
The rx reporter also supports an on-demand diagnose callback, which
provides real-time information on the status of its receive queues.
- Diagnose rx queues' status and corresponding completion queue::
$ devlink health diagnose pci/0000:82:00.0 reporter rx
NOTE: This command has valid output only when the interface is up. Otherwise, the command has empty output.
- Show the number of rx errors indicated, the number of recovery flows that completed
successfully, whether auto-recovery is enabled, and the grace period since the last recovery::
$ devlink health show pci/0000:82:00.0 reporter rx
fw reporter
-----------
The fw reporter implements `diagnose` and `dump` callbacks.
It follows symptoms of a fw error, such as a fw syndrome, by triggering a
fw core dump and storing it in the dump buffer.
The fw reporter diagnose command can be triggered any time by the user to check
current fw status.
User commands examples:
- Check fw health status::
$ devlink health diagnose pci/0000:82:00.0 reporter fw
- Read the FW core dump if already stored, or trigger a new one::
$ devlink health dump show pci/0000:82:00.0 reporter fw
.. note::
This command can run only on the PF which has fw tracer ownership;
running it on another PF or any VF will return "Operation not permitted".
fw fatal reporter
-----------------
The fw fatal reporter implements `dump` and `recover` callbacks.
It follows fatal error indications with a CR-space dump and a recovery flow.
The CR-space dump uses the vsc interface, which is valid even if the FW command
interface is not functional, which is the case in most FW fatal errors.
The recover function runs the recovery flow, which reloads the driver and
triggers a fw reset if needed.
On firmware error, the health buffer is dumped into the dmesg. The log
level is derived from the error's severity (given in health buffer).
User commands examples:
- Run fw recover flow manually::
$ devlink health recover pci/0000:82:00.0 reporter fw_fatal
- Read the FW CR-space dump if already stored, or trigger a new one::
$ devlink health dump show pci/0000:82:00.1 reporter fw_fatal
.. note::
This command can run only on PF.
vnic reporter
-------------
The vnic reporter implements only the `diagnose` callback.
It is responsible for querying the vnic diagnostic counters from fw and
displaying them in real time.
Description of the vnic counters:
- total_q_under_processor_handle
number of queues in an error state due to
an async error or errored command.
- send_queue_priority_update_flow
number of QP/SQ priority/SL update events.
- cq_overrun
number of times CQ entered an error state due to an overflow.
- async_eq_overrun
number of times an EQ mapped to async events was overrun.
- comp_eq_overrun
number of times an EQ mapped to completion events was overrun.
- quota_exceeded_command
number of commands issued and failed due to quota exceeded.
- invalid_command
number of commands issued and failed due to any reason other than quota
exceeded.
- nic_receive_steering_discard
number of packets that completed RX flow
steering but were discarded due to a mismatch in flow table.
- generated_pkt_steering_fail
number of packets generated by the VNIC experiencing unexpected steering
failure (at any point in steering flow).
- handled_pkt_steering_fail
number of packets handled by the VNIC experiencing unexpected steering
failure (at any point in steering flow owned by the VNIC, including the FDB
for the eswitch owner).
User commands examples:
- Diagnose PF/VF vnic counters::
$ devlink health diagnose pci/0000:82:00.1 reporter vnic
- Diagnose representor vnic counters (performed by supplying the devlink port of
the representor, which can be obtained via the devlink port command)::
$ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic
.. note::
This command can run on all interfaces, such as PF/VF and representor ports.


@ -13,7 +13,6 @@ Contents:
:maxdepth: 2
kconfig
devlink
switchdev
tracepoints
counters


@ -18,6 +18,11 @@ Parameters
* - ``enable_roce``
- driverinit
- Type: Boolean
If the device supports RoCE disablement, RoCE enablement state controls
device support for RoCE capability. Otherwise, the control occurs in the
driver stack. When RoCE is disabled at the driver level, only raw
ethernet QPs are supported.
* - ``io_eq_size``
- driverinit
- The range is between 64 and 4096.
@ -48,6 +53,9 @@ parameters.
* ``smfs`` Software managed flow steering. In SMFS mode, the HW
steering entities are created and managed by the driver without
firmware intervention.
SMFS mode is faster and provides a better rule insertion rate than the
default DMFS mode.
* - ``fdb_large_groups``
- u32
- driverinit
@ -71,7 +79,24 @@ parameters.
deprecated.
Default: disabled
* - ``esw_port_metadata``
- Boolean
- runtime
- When applicable, disabling eswitch metadata can increase packet rate up
to 20% depending on the use case and packet sizes.
Eswitch port metadata state controls whether to internally tag packets
with metadata. Metadata tagging must be enabled for multi-port RoCE,
failover between representors and stacked devices. By default metadata is
enabled on the supported devices in E-switch. Metadata is applicable only
for E-switch in switchdev mode and users may disable it when NONE of the
below use cases will be in use:
1. HCA is in Dual/multi-port RoCE mode.
2. VF/SF representor bonding (Usually used for Live migration)
3. Stacked devices
When metadata is disabled, the above use cases will fail to initialize if
users try to enable them.
* - ``hairpin_num_queues``
- u32
- driverinit
@ -104,3 +129,160 @@ The ``mlx5`` driver reports the following versions
* - ``fw.version``
- stored, running
- Three digit major.minor.subminor firmware version number.
Health reporters
================
tx reporter
-----------
The tx reporter is responsible for reporting and recovering from the following three error scenarios:
- tx timeout
Report on kernel tx timeout detection.
Recover by searching for lost interrupts.
- tx error completion
Report on error tx completion.
Recover by flushing and resetting the tx queue.
- tx PTP port timestamping CQ unhealthy
Report when too many CQEs are never delivered on the port ts CQ.
Recover by flushing and re-creating all PTP channels.
The tx reporter also supports an on-demand diagnose callback, which provides
real-time information on the status of its send queues.
User commands examples:
- Diagnose send queues status::
$ devlink health diagnose pci/0000:82:00.0 reporter tx
.. note::
This command has valid output only when the interface is up; otherwise the command has empty output.
- Show the number of tx errors indicated, the number of recovery flows that completed
successfully, whether auto-recovery is enabled, and the grace period since the last recovery::
$ devlink health show pci/0000:82:00.0 reporter tx
rx reporter
-----------
The rx reporter is responsible for reporting and recovering from the following two error scenarios:
- rx queues' initialization (population) timeout
Population of rx queues' descriptors on ring initialization is done
in napi context via triggering an irq. In case of a failure to get
the minimum amount of descriptors, a timeout would occur, and
descriptors could be recovered by polling the EQ (Event Queue).
- rx completions with errors (reported by HW on interrupt context)
Report on rx completion error.
Recover (if needed) by flushing and resetting the related queue.
The rx reporter also supports an on-demand diagnose callback, which
provides real-time information on the status of its receive queues.
- Diagnose rx queues' status and corresponding completion queue::
$ devlink health diagnose pci/0000:82:00.0 reporter rx
.. note::
This command has valid output only when the interface is up. Otherwise, the command has empty output.
- Show the number of rx errors indicated, the number of recovery flows that completed
successfully, whether auto-recovery is enabled, and the grace period since the last recovery::
$ devlink health show pci/0000:82:00.0 reporter rx
fw reporter
-----------
The fw reporter implements `diagnose` and `dump` callbacks.
It follows symptoms of a fw error, such as a fw syndrome, by triggering a
fw core dump and storing it in the dump buffer.
The fw reporter diagnose command can be triggered any time by the user to check
current fw status.
User commands examples:
- Check fw health status::
$ devlink health diagnose pci/0000:82:00.0 reporter fw
- Read the FW core dump if already stored, or trigger a new one::
$ devlink health dump show pci/0000:82:00.0 reporter fw
.. note::
This command can run only on the PF which has fw tracer ownership;
running it on another PF or any VF will return "Operation not permitted".
fw fatal reporter
-----------------
The fw fatal reporter implements `dump` and `recover` callbacks.
It follows fatal error indications with a CR-space dump and a recovery flow.
The CR-space dump uses the vsc interface, which is valid even if the FW command
interface is not functional, which is the case in most FW fatal errors.
The recover function runs the recovery flow, which reloads the driver and
triggers a fw reset if needed.
On firmware error, the health buffer is dumped into the dmesg. The log
level is derived from the error's severity (given in health buffer).
User commands examples:
- Run fw recover flow manually::
$ devlink health recover pci/0000:82:00.0 reporter fw_fatal
- Read the FW CR-space dump if already stored, or trigger a new one::
$ devlink health dump show pci/0000:82:00.1 reporter fw_fatal
.. note::
This command can run only on PF.
vnic reporter
-------------
The vnic reporter implements only the `diagnose` callback.
It is responsible for querying the vnic diagnostic counters from fw and
displaying them in real time.
Description of the vnic counters:
- total_q_under_processor_handle
number of queues in an error state due to
an async error or errored command.
- send_queue_priority_update_flow
number of QP/SQ priority/SL update events.
- cq_overrun
number of times CQ entered an error state due to an overflow.
- async_eq_overrun
number of times an EQ mapped to async events was overrun.
- comp_eq_overrun
number of times an EQ mapped to completion events was overrun.
- quota_exceeded_command
number of commands issued and failed due to quota exceeded.
- invalid_command
number of commands issued and failed due to any reason other than quota
exceeded.
- nic_receive_steering_discard
number of packets that completed RX flow
steering but were discarded due to a mismatch in flow table.
- generated_pkt_steering_fail
number of packets generated by the VNIC experiencing unexpected steering
failure (at any point in steering flow).
- handled_pkt_steering_fail
number of packets handled by the VNIC experiencing unexpected steering
failure (at any point in steering flow owned by the VNIC, including the FDB
for the eswitch owner).
User commands examples:
- Diagnose PF/VF vnic counters::
$ devlink health diagnose pci/0000:82:00.1 reporter vnic
- Diagnose representor vnic counters (performed by supplying the devlink port of
the representor, which can be obtained via the devlink port command)::
$ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic
.. note::
This command can run on all interfaces, such as PF/VF and representor ports.


@ -212,6 +212,9 @@ static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_a
/* On fw_activate action, the driver is also reloaded and reinit is performed */
*actions_performed |= BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
ret = mlx5_load_one_devl_locked(dev, true);
if (ret)
return ret;
ret = mlx5_fw_reset_verify_fw_complete(dev, extack);
break;
default:
/* Unsupported action should not get to this function */
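
With this change, a ``fw_activate`` reload only reports success after the
driver confirms with FW that the sync reset completed (see
mlx5_fw_reset_verify_fw_complete() below). A hedged usage sketch (bus
address hypothetical)::

$ devlink dev reload pci/0000:06:00.0 action fw_activate

If the FW reports a non-zero reset state, the command now fails with an
extack message such as "Sync reset did not complete successfully" instead
of silently succeeding.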


@ -6,6 +6,14 @@
#include <net/devlink.h>
enum mlx5_devlink_resource_id {
MLX5_DL_RES_MAX_LOCAL_SFS = 1,
MLX5_DL_RES_MAX_EXTERNAL_SFS,
__MLX5_ID_RES_MAX,
MLX5_ID_RES_MAX = __MLX5_ID_RES_MAX - 1,
};
enum mlx5_devlink_param_id {
MLX5_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX,
MLX5_DEVLINK_PARAM_ID_FLOW_STEERING_MODE,


@ -18,6 +18,7 @@ void mlx5e_reporter_tx_create(struct mlx5e_priv *priv);
void mlx5e_reporter_tx_destroy(struct mlx5e_priv *priv);
void mlx5e_reporter_tx_err_cqe(struct mlx5e_txqsq *sq);
int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq);
void mlx5e_reporter_tx_ptpsq_unhealthy(struct mlx5e_ptpsq *ptpsq);
int mlx5e_health_cq_diag_fmsg(struct mlx5e_cq *cq, struct devlink_fmsg *fmsg);
int mlx5e_health_cq_common_diag_fmsg(struct mlx5e_cq *cq, struct devlink_fmsg *fmsg);


@ -2,9 +2,12 @@
// Copyright (c) 2020 Mellanox Technologies
#include "en/ptp.h"
#include "en/health.h"
#include "en/txrx.h"
#include "en/params.h"
#include "en/fs_tt_redirect.h"
#include <linux/list.h>
#include <linux/spinlock.h>
struct mlx5e_ptp_fs {
struct mlx5_flow_handle *l2_rule;
@ -19,6 +22,48 @@ struct mlx5e_ptp_params {
struct mlx5e_rq_param rq_param;
};
struct mlx5e_ptp_port_ts_cqe_tracker {
u8 metadata_id;
bool inuse : 1;
struct list_head entry;
};
struct mlx5e_ptp_port_ts_cqe_list {
struct mlx5e_ptp_port_ts_cqe_tracker *nodes;
struct list_head tracker_list_head;
/* Sync list operations in xmit and napi_poll contexts */
spinlock_t tracker_list_lock;
};
static inline void
mlx5e_ptp_port_ts_cqe_list_add(struct mlx5e_ptp_port_ts_cqe_list *list, u8 metadata)
{
struct mlx5e_ptp_port_ts_cqe_tracker *tracker = &list->nodes[metadata];
WARN_ON_ONCE(tracker->inuse);
tracker->inuse = true;
spin_lock(&list->tracker_list_lock);
list_add_tail(&tracker->entry, &list->tracker_list_head);
spin_unlock(&list->tracker_list_lock);
}
static void
mlx5e_ptp_port_ts_cqe_list_remove(struct mlx5e_ptp_port_ts_cqe_list *list, u8 metadata)
{
struct mlx5e_ptp_port_ts_cqe_tracker *tracker = &list->nodes[metadata];
WARN_ON_ONCE(!tracker->inuse);
tracker->inuse = false;
spin_lock(&list->tracker_list_lock);
list_del(&tracker->entry);
spin_unlock(&list->tracker_list_lock);
}
void mlx5e_ptpsq_track_metadata(struct mlx5e_ptpsq *ptpsq, u8 metadata)
{
mlx5e_ptp_port_ts_cqe_list_add(ptpsq->ts_cqe_pending_list, metadata);
}
struct mlx5e_skb_cb_hwtstamp {
ktime_t cqe_hwtstamp;
ktime_t port_hwtstamp;
@ -79,75 +124,97 @@ void mlx5e_skb_cb_hwtstamp_handler(struct sk_buff *skb, int hwtstamp_type,
memset(skb->cb, 0, sizeof(struct mlx5e_skb_cb_hwtstamp));
}
#define PTP_WQE_CTR2IDX(val) ((val) & ptpsq->ts_cqe_ctr_mask)
static bool mlx5e_ptp_ts_cqe_drop(struct mlx5e_ptpsq *ptpsq, u16 skb_ci, u16 skb_id)
static struct sk_buff *
mlx5e_ptp_metadata_map_lookup(struct mlx5e_ptp_metadata_map *map, u16 metadata)
{
return (ptpsq->ts_cqe_ctr_mask && (skb_ci != skb_id));
return map->data[metadata];
}
static bool mlx5e_ptp_ts_cqe_ooo(struct mlx5e_ptpsq *ptpsq, u16 skb_id)
static struct sk_buff *
mlx5e_ptp_metadata_map_remove(struct mlx5e_ptp_metadata_map *map, u16 metadata)
{
u16 skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc);
u16 skb_pi = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_pc);
if (PTP_WQE_CTR2IDX(skb_id - skb_ci) >= PTP_WQE_CTR2IDX(skb_pi - skb_ci))
return true;
return false;
}
static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_ci,
u16 skb_id, int budget)
{
struct skb_shared_hwtstamps hwts = {};
struct sk_buff *skb;
ptpsq->cq_stats->resync_event++;
skb = map->data[metadata];
map->data[metadata] = NULL;
while (skb_ci != skb_id) {
skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo);
hwts.hwtstamp = mlx5e_skb_cb_get_hwts(skb)->cqe_hwtstamp;
skb_tstamp_tx(skb, &hwts);
ptpsq->cq_stats->resync_cqe++;
napi_consume_skb(skb, budget);
skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc);
}
return skb;
}
static bool mlx5e_ptp_metadata_map_unhealthy(struct mlx5e_ptp_metadata_map *map)
{
/* Consider the map unhealthy once more than 15/16 of its capacity ((capacity >> 4) * 15) holds undelivered CQEs, e.g. 241+ of 256. */
return map->undelivered_counter > (map->capacity >> 4) * 15;
}
static void mlx5e_ptpsq_mark_ts_cqes_undelivered(struct mlx5e_ptpsq *ptpsq,
ktime_t port_tstamp)
{
struct mlx5e_ptp_port_ts_cqe_list *cqe_list = ptpsq->ts_cqe_pending_list;
ktime_t timeout = ns_to_ktime(MLX5E_PTP_TS_CQE_UNDELIVERED_TIMEOUT);
struct mlx5e_ptp_metadata_map *metadata_map = &ptpsq->metadata_map;
struct mlx5e_ptp_port_ts_cqe_tracker *pos, *n;
spin_lock(&cqe_list->tracker_list_lock);
list_for_each_entry_safe(pos, n, &cqe_list->tracker_list_head, entry) {
struct sk_buff *skb =
mlx5e_ptp_metadata_map_lookup(metadata_map, pos->metadata_id);
ktime_t dma_tstamp = mlx5e_skb_cb_get_hwts(skb)->cqe_hwtstamp;
if (!dma_tstamp ||
ktime_after(ktime_add(dma_tstamp, timeout), port_tstamp))
break;
metadata_map->undelivered_counter++;
WARN_ON_ONCE(!pos->inuse);
pos->inuse = false;
list_del(&pos->entry);
}
spin_unlock(&cqe_list->tracker_list_lock);
}
#define PTP_WQE_CTR2IDX(val) ((val) & ptpsq->ts_cqe_ctr_mask)
static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq,
struct mlx5_cqe64 *cqe,
int budget)
{
u16 skb_id = PTP_WQE_CTR2IDX(be16_to_cpu(cqe->wqe_counter));
u16 skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc);
struct mlx5e_ptp_port_ts_cqe_list *pending_cqe_list = ptpsq->ts_cqe_pending_list;
u8 metadata_id = PTP_WQE_CTR2IDX(be16_to_cpu(cqe->wqe_counter));
bool is_err_cqe = !!MLX5E_RX_ERR_CQE(cqe);
struct mlx5e_txqsq *sq = &ptpsq->txqsq;
struct sk_buff *skb;
ktime_t hwtstamp;
if (unlikely(MLX5E_RX_ERR_CQE(cqe))) {
skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo);
if (likely(pending_cqe_list->nodes[metadata_id].inuse)) {
mlx5e_ptp_port_ts_cqe_list_remove(pending_cqe_list, metadata_id);
} else {
/* Reclaim space in the unlikely event CQE was delivered after
* marking it late.
*/
ptpsq->metadata_map.undelivered_counter--;
ptpsq->cq_stats->late_cqe++;
}
skb = mlx5e_ptp_metadata_map_remove(&ptpsq->metadata_map, metadata_id);
if (unlikely(is_err_cqe)) {
ptpsq->cq_stats->err_cqe++;
goto out;
}
if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_ci, skb_id)) {
if (mlx5e_ptp_ts_cqe_ooo(ptpsq, skb_id)) {
/* already handled by a previous resync */
ptpsq->cq_stats->ooo_cqe_drop++;
return;
}
mlx5e_ptp_skb_fifo_ts_cqe_resync(ptpsq, skb_ci, skb_id, budget);
}
skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo);
hwtstamp = mlx5e_cqe_ts_to_ns(sq->ptp_cyc2time, sq->clock, get_cqe_ts(cqe));
mlx5e_skb_cb_hwtstamp_handler(skb, MLX5E_SKB_CB_PORT_HWTSTAMP,
hwtstamp, ptpsq->cq_stats);
ptpsq->cq_stats->cqe++;
mlx5e_ptpsq_mark_ts_cqes_undelivered(ptpsq, hwtstamp);
out:
napi_consume_skb(skb, budget);
mlx5e_ptp_metadata_fifo_push(&ptpsq->metadata_freelist, metadata_id);
if (unlikely(mlx5e_ptp_metadata_map_unhealthy(&ptpsq->metadata_map)) &&
!test_and_set_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state))
queue_work(ptpsq->txqsq.priv->wq, &ptpsq->report_unhealthy_work);
}
static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int budget)
@ -291,36 +358,86 @@ static void mlx5e_ptp_destroy_sq(struct mlx5_core_dev *mdev, u32 sqn)
static int mlx5e_ptp_alloc_traffic_db(struct mlx5e_ptpsq *ptpsq, int numa)
{
int wq_sz = mlx5_wq_cyc_get_size(&ptpsq->txqsq.wq);
struct mlx5_core_dev *mdev = ptpsq->txqsq.mdev;
struct mlx5e_ptp_metadata_fifo *metadata_freelist = &ptpsq->metadata_freelist;
struct mlx5e_ptp_metadata_map *metadata_map = &ptpsq->metadata_map;
struct mlx5e_ptp_port_ts_cqe_list *cqe_list;
int db_sz;
int md;
ptpsq->skb_fifo.fifo = kvzalloc_node(array_size(wq_sz, sizeof(*ptpsq->skb_fifo.fifo)),
GFP_KERNEL, numa);
if (!ptpsq->skb_fifo.fifo)
cqe_list = kvzalloc_node(sizeof(*ptpsq->ts_cqe_pending_list), GFP_KERNEL, numa);
if (!cqe_list)
return -ENOMEM;
ptpsq->ts_cqe_pending_list = cqe_list;
db_sz = min_t(u32, mlx5_wq_cyc_get_size(&ptpsq->txqsq.wq),
1 << MLX5_CAP_GEN_2(ptpsq->txqsq.mdev,
ts_cqe_metadata_size2wqe_counter));
ptpsq->ts_cqe_ctr_mask = db_sz - 1;
cqe_list->nodes = kvzalloc_node(array_size(db_sz, sizeof(*cqe_list->nodes)),
GFP_KERNEL, numa);
if (!cqe_list->nodes)
goto free_cqe_list;
INIT_LIST_HEAD(&cqe_list->tracker_list_head);
spin_lock_init(&cqe_list->tracker_list_lock);
metadata_freelist->data =
kvzalloc_node(array_size(db_sz, sizeof(*metadata_freelist->data)),
GFP_KERNEL, numa);
if (!metadata_freelist->data)
goto free_cqe_list_nodes;
metadata_freelist->mask = ptpsq->ts_cqe_ctr_mask;
for (md = 0; md < db_sz; ++md) {
cqe_list->nodes[md].metadata_id = md;
metadata_freelist->data[md] = md;
}
metadata_freelist->pc = db_sz;
metadata_map->data =
kvzalloc_node(array_size(db_sz, sizeof(*metadata_map->data)),
GFP_KERNEL, numa);
if (!metadata_map->data)
goto free_metadata_freelist;
metadata_map->capacity = db_sz;
ptpsq->skb_fifo.pc = &ptpsq->skb_fifo_pc;
ptpsq->skb_fifo.cc = &ptpsq->skb_fifo_cc;
ptpsq->skb_fifo.mask = wq_sz - 1;
if (MLX5_CAP_GEN_2(mdev, ts_cqe_metadata_size2wqe_counter))
ptpsq->ts_cqe_ctr_mask =
(1 << MLX5_CAP_GEN_2(mdev, ts_cqe_metadata_size2wqe_counter)) - 1;
return 0;
free_metadata_freelist:
kvfree(metadata_freelist->data);
free_cqe_list_nodes:
kvfree(cqe_list->nodes);
free_cqe_list:
kvfree(cqe_list);
return -ENOMEM;
}
static void mlx5e_ptp_drain_skb_fifo(struct mlx5e_skb_fifo *skb_fifo)
static void mlx5e_ptp_drain_metadata_map(struct mlx5e_ptp_metadata_map *map)
{
while (*skb_fifo->pc != *skb_fifo->cc) {
struct sk_buff *skb = mlx5e_skb_fifo_pop(skb_fifo);
int idx;
for (idx = 0; idx < map->capacity; ++idx) {
struct sk_buff *skb = map->data[idx];
dev_kfree_skb_any(skb);
}
}
static void mlx5e_ptp_free_traffic_db(struct mlx5e_skb_fifo *skb_fifo)
static void mlx5e_ptp_free_traffic_db(struct mlx5e_ptpsq *ptpsq)
{
mlx5e_ptp_drain_skb_fifo(skb_fifo);
kvfree(skb_fifo->fifo);
mlx5e_ptp_drain_metadata_map(&ptpsq->metadata_map);
kvfree(ptpsq->metadata_map.data);
kvfree(ptpsq->metadata_freelist.data);
kvfree(ptpsq->ts_cqe_pending_list->nodes);
kvfree(ptpsq->ts_cqe_pending_list);
}
static void mlx5e_ptpsq_unhealthy_work(struct work_struct *work)
{
struct mlx5e_ptpsq *ptpsq =
container_of(work, struct mlx5e_ptpsq, report_unhealthy_work);
mlx5e_reporter_tx_ptpsq_unhealthy(ptpsq);
}
static int mlx5e_ptp_open_txqsq(struct mlx5e_ptp *c, u32 tisn,
@ -348,11 +465,12 @@ static int mlx5e_ptp_open_txqsq(struct mlx5e_ptp *c, u32 tisn,
if (err)
goto err_free_txqsq;
err = mlx5e_ptp_alloc_traffic_db(ptpsq,
dev_to_node(mlx5_core_dma_dev(c->mdev)));
err = mlx5e_ptp_alloc_traffic_db(ptpsq, dev_to_node(mlx5_core_dma_dev(c->mdev)));
if (err)
goto err_free_txqsq;
INIT_WORK(&ptpsq->report_unhealthy_work, mlx5e_ptpsq_unhealthy_work);
return 0;
err_free_txqsq:
@ -366,7 +484,9 @@ static void mlx5e_ptp_close_txqsq(struct mlx5e_ptpsq *ptpsq)
struct mlx5e_txqsq *sq = &ptpsq->txqsq;
struct mlx5_core_dev *mdev = sq->mdev;
mlx5e_ptp_free_traffic_db(&ptpsq->skb_fifo);
if (current_work() != &ptpsq->report_unhealthy_work)
cancel_work_sync(&ptpsq->report_unhealthy_work);
mlx5e_ptp_free_traffic_db(ptpsq);
cancel_work_sync(&sq->recover_work);
mlx5e_ptp_destroy_sq(mdev, sq->sqn);
mlx5e_free_txqsq_descs(sq);
@ -534,7 +654,10 @@ static void mlx5e_ptp_build_params(struct mlx5e_ptp *c,
/* SQ */
if (test_bit(MLX5E_PTP_STATE_TX, c->state)) {
params->log_sq_size = orig->log_sq_size;
params->log_sq_size =
min(MLX5_CAP_GEN_2(c->mdev, ts_cqe_metadata_size2wqe_counter),
MLX5E_PTP_MAX_LOG_SQ_SIZE);
params->log_sq_size = min(params->log_sq_size, orig->log_sq_size);
mlx5e_ptp_build_sq_param(c->mdev, params, &cparams->txq_sq_param);
}
/* RQ */

View file

@ -7,18 +7,38 @@
#include "en.h"
#include "en_stats.h"
#include "en/txrx.h"
#include <linux/ktime.h>
#include <linux/ptp_classify.h>
#include <linux/time64.h>
#include <linux/workqueue.h>
#define MLX5E_PTP_CHANNEL_IX 0
#define MLX5E_PTP_MAX_LOG_SQ_SIZE (8U)
#define MLX5E_PTP_TS_CQE_UNDELIVERED_TIMEOUT (1 * NSEC_PER_SEC)
struct mlx5e_ptp_metadata_fifo {
u8 cc;
u8 pc;
u8 mask;
u8 *data;
};
struct mlx5e_ptp_metadata_map {
u16 undelivered_counter;
u16 capacity;
struct sk_buff **data;
};
struct mlx5e_ptpsq {
struct mlx5e_txqsq txqsq;
struct mlx5e_cq ts_cq;
u16 skb_fifo_cc;
u16 skb_fifo_pc;
struct mlx5e_skb_fifo skb_fifo;
struct mlx5e_ptp_cq_stats *cq_stats;
u16 ts_cqe_ctr_mask;
struct work_struct report_unhealthy_work;
struct mlx5e_ptp_port_ts_cqe_list *ts_cqe_pending_list;
struct mlx5e_ptp_metadata_fifo metadata_freelist;
struct mlx5e_ptp_metadata_map metadata_map;
};
enum {
@ -69,12 +89,35 @@ static inline bool mlx5e_use_ptpsq(struct sk_buff *skb)
fk.ports.dst == htons(PTP_EV_PORT));
}
static inline bool mlx5e_ptpsq_fifo_has_room(struct mlx5e_txqsq *sq)
static inline void mlx5e_ptp_metadata_fifo_push(struct mlx5e_ptp_metadata_fifo *fifo, u8 metadata)
{
if (!sq->ptpsq)
return true;
fifo->data[fifo->mask & fifo->pc++] = metadata;
}
return mlx5e_skb_fifo_has_room(&sq->ptpsq->skb_fifo);
static inline u8
mlx5e_ptp_metadata_fifo_pop(struct mlx5e_ptp_metadata_fifo *fifo)
{
return fifo->data[fifo->mask & fifo->cc++];
}
static inline void
mlx5e_ptp_metadata_map_put(struct mlx5e_ptp_metadata_map *map,
struct sk_buff *skb, u8 metadata)
{
WARN_ON_ONCE(map->data[metadata]);
map->data[metadata] = skb;
}
static inline bool mlx5e_ptpsq_metadata_freelist_empty(struct mlx5e_ptpsq *ptpsq)
{
struct mlx5e_ptp_metadata_fifo *freelist;
if (likely(!ptpsq))
return false;
freelist = &ptpsq->metadata_freelist;
return freelist->pc == freelist->cc;
}
int mlx5e_ptp_open(struct mlx5e_priv *priv, struct mlx5e_params *params,
@ -89,6 +132,8 @@ void mlx5e_ptp_free_rx_fs(struct mlx5e_flow_steering *fs,
const struct mlx5e_profile *profile);
int mlx5e_ptp_rx_manage_fs(struct mlx5e_priv *priv, bool set);
void mlx5e_ptpsq_track_metadata(struct mlx5e_ptpsq *ptpsq, u8 metadata);
enum {
MLX5E_SKB_CB_CQE_HWTSTAMP = BIT(0),
MLX5E_SKB_CB_PORT_HWTSTAMP = BIT(1),

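Aside: the metadata freelist above is a small power-of-two ring. ``pc``
counts pushes and ``cc`` counts pops; both are u8 and wrap naturally, and
``pc == cc`` means the pool is empty. A minimal standalone model of that
discipline (hypothetical demo code, not part of the driver)::

/* Model of the mlx5e_ptp_metadata_fifo push/pop discipline. */
#include <assert.h>
#include <stdint.h>

struct freelist {
	uint8_t cc;      /* consumer counter, incremented on pop */
	uint8_t pc;      /* producer counter, incremented on push */
	uint8_t mask;    /* ring size - 1; size is a power of two */
	uint8_t data[8];
};

static void fl_push(struct freelist *f, uint8_t id)
{
	f->data[f->mask & f->pc++] = id;
}

static uint8_t fl_pop(struct freelist *f)
{
	return f->data[f->mask & f->cc++];
}

static int fl_empty(const struct freelist *f)
{
	return f->pc == f->cc; /* equal counters: no free ids left */
}

int main(void)
{
	struct freelist f = { .mask = 7 };
	uint8_t md, id;

	for (md = 0; md < 8; md++) /* seed: every metadata id starts free */
		fl_push(&f, md);

	id = fl_pop(&f);  /* xmit path claims an id for a PTP skb */
	fl_push(&f, id);  /* CQE handler returns it to the pool */
	assert(!fl_empty(&f));
	return 0;
}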

@ -164,6 +164,43 @@ static int mlx5e_tx_reporter_timeout_recover(void *ctx)
return err;
}
static int mlx5e_tx_reporter_ptpsq_unhealthy_recover(void *ctx)
{
struct mlx5e_ptpsq *ptpsq = ctx;
struct mlx5e_channels *chs;
struct net_device *netdev;
struct mlx5e_priv *priv;
int carrier_ok;
int err;
if (!test_bit(MLX5E_SQ_STATE_RECOVERING, &ptpsq->txqsq.state))
return 0;
priv = ptpsq->txqsq.priv;
mutex_lock(&priv->state_lock);
chs = &priv->channels;
netdev = priv->netdev;
carrier_ok = netif_carrier_ok(netdev);
netif_carrier_off(netdev);
mlx5e_deactivate_priv_channels(priv);
mlx5e_ptp_close(chs->ptp);
err = mlx5e_ptp_open(priv, &chs->params, chs->c[0]->lag_port, &chs->ptp);
mlx5e_activate_priv_channels(priv);
/* return carrier back if needed */
if (carrier_ok)
netif_carrier_on(netdev);
mutex_unlock(&priv->state_lock);
return err;
}
/* state lock cannot be grabbed within this function.
* It can cause a dead lock or a read-after-free.
*/
@ -516,6 +553,15 @@ static int mlx5e_tx_reporter_timeout_dump(struct mlx5e_priv *priv, struct devlin
return mlx5e_tx_reporter_dump_sq(priv, fmsg, to_ctx->sq);
}
static int mlx5e_tx_reporter_ptpsq_unhealthy_dump(struct mlx5e_priv *priv,
struct devlink_fmsg *fmsg,
void *ctx)
{
struct mlx5e_ptpsq *ptpsq = ctx;
return mlx5e_tx_reporter_dump_sq(priv, fmsg, &ptpsq->txqsq);
}
static int mlx5e_tx_reporter_dump_all_sqs(struct mlx5e_priv *priv,
struct devlink_fmsg *fmsg)
{
@ -621,6 +667,25 @@ int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq)
return to_ctx.status;
}
void mlx5e_reporter_tx_ptpsq_unhealthy(struct mlx5e_ptpsq *ptpsq)
{
struct mlx5e_ptp_metadata_map *map = &ptpsq->metadata_map;
char err_str[MLX5E_REPORTER_PER_Q_MAX_LEN];
struct mlx5e_txqsq *txqsq = &ptpsq->txqsq;
struct mlx5e_cq *ts_cq = &ptpsq->ts_cq;
struct mlx5e_priv *priv = txqsq->priv;
struct mlx5e_err_ctx err_ctx = {};
err_ctx.ctx = ptpsq;
err_ctx.recover = mlx5e_tx_reporter_ptpsq_unhealthy_recover;
err_ctx.dump = mlx5e_tx_reporter_ptpsq_unhealthy_dump;
snprintf(err_str, sizeof(err_str),
"Unhealthy TX port TS queue: %d, SQ: 0x%x, CQ: 0x%x, Undelivered CQEs: %u Map Capacity: %u",
txqsq->ch_ix, txqsq->sqn, ts_cq->mcq.cqn, map->undelivered_counter, map->capacity);
mlx5e_health_report(priv, priv->tx_reporter, err_str, &err_ctx);
}
static const struct devlink_health_reporter_ops mlx5_tx_reporter_ops = {
.name = "tx",
.recover = mlx5e_tx_reporter_recover,


@ -2061,7 +2061,8 @@ static int set_pflag_tx_port_ts(struct net_device *netdev, bool enable)
struct mlx5e_params new_params;
int err;
if (!MLX5_CAP_GEN(mdev, ts_cqe_to_dest_cqn))
if (!MLX5_CAP_GEN(mdev, ts_cqe_to_dest_cqn) ||
!MLX5_CAP_GEN_2(mdev, ts_cqe_metadata_size2wqe_counter))
return -EOPNOTSUPP;
/* Don't allow changing the PTP state if HTB offload is active, because


@ -2142,9 +2142,7 @@ static const struct counter_desc ptp_cq_stats_desc[] = {
{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, err_cqe) },
{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, abort) },
{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, abort_abs_diff_ns) },
{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, resync_cqe) },
{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, resync_event) },
{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, ooo_cqe_drop) },
{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, late_cqe) },
};
static const struct counter_desc ptp_rq_stats_desc[] = {


@ -449,9 +449,7 @@ struct mlx5e_ptp_cq_stats {
u64 err_cqe;
u64 abort;
u64 abort_abs_diff_ns;
u64 resync_cqe;
u64 resync_event;
u64 ooo_cqe_drop;
u64 late_cqe;
};
struct mlx5e_rep_stats {


@ -372,7 +372,7 @@ mlx5e_txwqe_complete(struct mlx5e_txqsq *sq, struct sk_buff *skb,
const struct mlx5e_tx_attr *attr,
const struct mlx5e_tx_wqe_attr *wqe_attr, u8 num_dma,
struct mlx5e_tx_wqe_info *wi, struct mlx5_wqe_ctrl_seg *cseg,
bool xmit_more)
struct mlx5_wqe_eth_seg *eseg, bool xmit_more)
{
struct mlx5_wq_cyc *wq = &sq->wq;
bool send_doorbell;
@ -394,11 +394,16 @@ mlx5e_txwqe_complete(struct mlx5e_txqsq *sq, struct sk_buff *skb,
mlx5e_tx_check_stop(sq);
if (unlikely(sq->ptpsq)) {
if (unlikely(sq->ptpsq &&
(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))) {
u8 metadata_index = be32_to_cpu(eseg->flow_table_metadata);
mlx5e_skb_cb_hwtstamp_init(skb);
mlx5e_skb_fifo_push(&sq->ptpsq->skb_fifo, skb);
mlx5e_ptpsq_track_metadata(sq->ptpsq, metadata_index);
mlx5e_ptp_metadata_map_put(&sq->ptpsq->metadata_map, skb,
metadata_index);
if (!netif_tx_queue_stopped(sq->txq) &&
!mlx5e_skb_fifo_has_room(&sq->ptpsq->skb_fifo)) {
mlx5e_ptpsq_metadata_freelist_empty(sq->ptpsq)) {
netif_tx_stop_queue(sq->txq);
sq->stats->stopped++;
}
@ -483,13 +488,16 @@ mlx5e_sq_xmit_wqe(struct mlx5e_txqsq *sq, struct sk_buff *skb,
if (unlikely(num_dma < 0))
goto err_drop;
mlx5e_txwqe_complete(sq, skb, attr, wqe_attr, num_dma, wi, cseg, xmit_more);
mlx5e_txwqe_complete(sq, skb, attr, wqe_attr, num_dma, wi, cseg, eseg, xmit_more);
return;
err_drop:
stats->dropped++;
dev_kfree_skb_any(skb);
if (unlikely(sq->ptpsq && (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)))
mlx5e_ptp_metadata_fifo_push(&sq->ptpsq->metadata_freelist,
be32_to_cpu(eseg->flow_table_metadata));
mlx5e_tx_flush(sq);
}
@ -645,9 +653,9 @@ void mlx5e_tx_mpwqe_ensure_complete(struct mlx5e_txqsq *sq)
static void mlx5e_cqe_ts_id_eseg(struct mlx5e_ptpsq *ptpsq, struct sk_buff *skb,
struct mlx5_wqe_eth_seg *eseg)
{
if (ptpsq->ts_cqe_ctr_mask && unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
eseg->flow_table_metadata = cpu_to_be32(ptpsq->skb_fifo_pc &
ptpsq->ts_cqe_ctr_mask);
if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
eseg->flow_table_metadata =
cpu_to_be32(mlx5e_ptp_metadata_fifo_pop(&ptpsq->metadata_freelist));
}
static void mlx5e_txwqe_build_eseg(struct mlx5e_priv *priv, struct mlx5e_txqsq *sq,
@ -766,7 +774,7 @@ void mlx5e_txqsq_wake(struct mlx5e_txqsq *sq)
{
if (netif_tx_queue_stopped(sq->txq) &&
mlx5e_wqc_has_room_for(&sq->wq, sq->cc, sq->pc, sq->stop_room) &&
mlx5e_ptpsq_fifo_has_room(sq) &&
!mlx5e_ptpsq_metadata_freelist_empty(sq->ptpsq) &&
!test_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state)) {
netif_tx_wake_queue(sq->txq);
sq->stats->wake++;
@ -1031,7 +1039,7 @@ void mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
if (unlikely(num_dma < 0))
goto err_drop;
mlx5e_txwqe_complete(sq, skb, &attr, &wqe_attr, num_dma, wi, cseg, xmit_more);
mlx5e_txwqe_complete(sq, skb, &attr, &wqe_attr, num_dma, wi, cseg, eseg, xmit_more);
return;


@ -535,6 +535,28 @@ esw_src_port_rewrite_supported(struct mlx5_eswitch *esw)
MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, ignore_flow_level);
}
static bool
esw_dests_to_vf_pf_vports(struct mlx5_flow_destination *dests, int max_dest)
{
bool vf_dest = false, pf_dest = false;
int i;
for (i = 0; i < max_dest; i++) {
if (dests[i].type != MLX5_FLOW_DESTINATION_TYPE_VPORT)
continue;
if (dests[i].vport.num == MLX5_VPORT_UPLINK)
pf_dest = true;
else
vf_dest = true;
if (vf_dest && pf_dest)
return true;
}
return false;
}
static int
esw_setup_dests(struct mlx5_flow_destination *dest,
struct mlx5_flow_act *flow_act,
@ -671,6 +693,15 @@ mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
rule = ERR_PTR(err);
goto err_create_goto_table;
}
/* Header rewrite with combined wire+loopback in FDB is not allowed */
if ((flow_act.action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) &&
esw_dests_to_vf_pf_vports(dest, i)) {
esw_warn(esw->dev,
"FDB: Header rewrite with forwarding to both PF and VF is not allowed\n");
rule = ERR_PTR(-EINVAL);
goto err_esw_get;
}
}
if (esw_attr->decap_pkt_reformat)

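For illustration, a hedged sketch of a rule shape the new check rejects
(representor names hypothetical): a TC flower rule that both rewrites a
header field (pedit, i.e. MOD_HDR) and forwards to the uplink (wire) and a
VF representor at once::

$ tc filter add dev enp8s0f0_0 ingress protocol ip prio 1 flower \
      action pedit ex munge ip ttl set 63 pipe \
      action mirred egress mirror dev enp8s0f0 pipe \
      action mirred egress redirect dev enp8s0f0_1

Offloading such a rule into the FDB now hits the esw_warn above and fails
with -EINVAL rather than programming an invalid combination.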

@ -143,90 +143,86 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
{
int err;
err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_GENERAL, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
if (MLX5_CAP_GEN(dev, port_selection_cap)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_PORT_SELECTION);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_PORT_SELECTION, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, hca_cap_2)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL_2);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_GENERAL_2, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, eth_net_offloads)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ETHERNET_OFFLOADS);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ETHERNET_OFFLOADS,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, ipoib_enhanced_offloads)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_IPOIB_ENHANCED_OFFLOADS);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_IPOIB_ENHANCED_OFFLOADS,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, pg)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ODP);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ODP, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, atomic)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ATOMIC);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ATOMIC, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, roce)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ROCE);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ROCE, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, nic_flow_table) ||
MLX5_CAP_GEN(dev, ipoib_enhanced_offloads)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_FLOW_TABLE);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_FLOW_TABLE, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_ESWITCH_MANAGER(dev)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ESWITCH_FLOW_TABLE);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ESWITCH_FLOW_TABLE,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
err = mlx5_core_get_caps(dev, MLX5_CAP_ESWITCH);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, vector_calc)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_VECTOR_CALC);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ESWITCH, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, qos)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_QOS);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_QOS, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, debug))
mlx5_core_get_caps(dev, MLX5_CAP_DEBUG);
mlx5_core_get_caps_mode(dev, MLX5_CAP_DEBUG, HCA_CAP_OPMOD_GET_CUR);
if (MLX5_CAP_GEN(dev, pcam_reg))
mlx5_get_pcam_reg(dev);
if (MLX5_CAP_GEN(dev, mcam_reg)) {
mlx5_get_mcam_access_reg_group(dev, MLX5_MCAM_REGS_FIRST_128);
mlx5_get_mcam_access_reg_group(dev, MLX5_MCAM_REGS_0x9080_0x90FF);
mlx5_get_mcam_access_reg_group(dev, MLX5_MCAM_REGS_0x9100_0x917F);
}
@ -234,57 +230,52 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
mlx5_get_qcam_reg(dev);
if (MLX5_CAP_GEN(dev, device_memory)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_DEV_MEM);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_DEV_MEM, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, event_cap)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_DEV_EVENT);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_DEV_EVENT, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, tls_tx) || MLX5_CAP_GEN(dev, tls_rx)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_TLS);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_TLS, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN_64(dev, general_obj_types) &
MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q) {
err = mlx5_core_get_caps(dev, MLX5_CAP_VDPA_EMULATION);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_VDPA_EMULATION, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, ipsec_offload)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_IPSEC);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_IPSEC, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, crypto)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_CRYPTO);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, shampo)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_DEV_SHAMPO);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_CRYPTO, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN_64(dev, general_obj_types) &
MLX5_GENERAL_OBJ_TYPES_CAP_MACSEC_OFFLOAD) {
err = mlx5_core_get_caps(dev, MLX5_CAP_MACSEC);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_MACSEC, HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, adv_virtualization)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ADV_VIRTUALIZATION);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ADV_VIRTUALIZATION,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}


@ -127,17 +127,23 @@ static int mlx5_fw_reset_get_reset_state_err(struct mlx5_core_dev *dev,
if (mlx5_reg_mfrl_query(dev, NULL, NULL, &reset_state))
goto out;
if (!reset_state)
return 0;
switch (reset_state) {
case MLX5_MFRL_REG_RESET_STATE_IN_NEGOTIATION:
case MLX5_MFRL_REG_RESET_STATE_RESET_IN_PROGRESS:
NL_SET_ERR_MSG_MOD(extack, "Sync reset was already triggered");
NL_SET_ERR_MSG_MOD(extack, "Sync reset still in progress");
return -EBUSY;
case MLX5_MFRL_REG_RESET_STATE_TIMEOUT:
NL_SET_ERR_MSG_MOD(extack, "Sync reset got timeout");
case MLX5_MFRL_REG_RESET_STATE_NEG_TIMEOUT:
NL_SET_ERR_MSG_MOD(extack, "Sync reset negotiation timeout");
return -ETIMEDOUT;
case MLX5_MFRL_REG_RESET_STATE_NACK:
NL_SET_ERR_MSG_MOD(extack, "One of the hosts disabled reset");
return -EPERM;
case MLX5_MFRL_REG_RESET_STATE_UNLOAD_TIMEOUT:
NL_SET_ERR_MSG_MOD(extack, "Sync reset unload timeout");
return -ETIMEDOUT;
}
out:
@ -151,7 +157,7 @@ int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel,
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
u32 out[MLX5_ST_SZ_DW(mfrl_reg)] = {};
u32 in[MLX5_ST_SZ_DW(mfrl_reg)] = {};
int err;
int err, rst_res;
set_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags);
@ -164,13 +170,34 @@ int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel,
return 0;
clear_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags);
if (err == -EREMOTEIO && MLX5_CAP_MCAM_FEATURE(dev, reset_state))
return mlx5_fw_reset_get_reset_state_err(dev, extack);
if (err == -EREMOTEIO && MLX5_CAP_MCAM_FEATURE(dev, reset_state)) {
rst_res = mlx5_fw_reset_get_reset_state_err(dev, extack);
return rst_res ? rst_res : err;
}
NL_SET_ERR_MSG_MOD(extack, "Sync reset command failed");
return mlx5_cmd_check(dev, err, in, out);
}
int mlx5_fw_reset_verify_fw_complete(struct mlx5_core_dev *dev,
struct netlink_ext_ack *extack)
{
u8 rst_state;
int err;
err = mlx5_fw_reset_get_reset_state_err(dev, extack);
if (err)
return err;
rst_state = mlx5_get_fw_rst_state(dev);
if (!rst_state)
return 0;
mlx5_core_err(dev, "Sync reset did not complete, state=%d\n", rst_state);
NL_SET_ERR_MSG_MOD(extack, "Sync reset did not complete successfully");
return rst_state;
}
int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev)
{
return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL0, 0, 0, false);


@ -12,6 +12,8 @@ int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel,
int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev);
int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev);
int mlx5_fw_reset_verify_fw_complete(struct mlx5_core_dev *dev,
struct netlink_ext_ack *extack);
void mlx5_fw_reset_events_start(struct mlx5_core_dev *dev);
void mlx5_fw_reset_events_stop(struct mlx5_core_dev *dev);
void mlx5_drain_fw_reset(struct mlx5_core_dev *dev);


@ -361,9 +361,8 @@ void mlx5_core_uplink_netdev_event_replay(struct mlx5_core_dev *dev)
}
EXPORT_SYMBOL(mlx5_core_uplink_netdev_event_replay);
static int mlx5_core_get_caps_mode(struct mlx5_core_dev *dev,
enum mlx5_cap_type cap_type,
enum mlx5_cap_mode cap_mode)
int mlx5_core_get_caps_mode(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type,
enum mlx5_cap_mode cap_mode)
{
u8 in[MLX5_ST_SZ_BYTES(query_hca_cap_in)];
int out_sz = MLX5_ST_SZ_BYTES(query_hca_cap_out);
@ -1620,21 +1619,24 @@ static int mlx5_query_hca_caps_light(struct mlx5_core_dev *dev)
return err;
if (MLX5_CAP_GEN(dev, eth_net_offloads)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ETHERNET_OFFLOADS);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ETHERNET_OFFLOADS,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, nic_flow_table) ||
MLX5_CAP_GEN(dev, ipoib_enhanced_offloads)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_FLOW_TABLE);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_FLOW_TABLE,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
if (MLX5_CAP_GEN_64(dev, general_obj_types) &
MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q) {
err = mlx5_core_get_caps(dev, MLX5_CAP_VDPA_EMULATION);
err = mlx5_core_get_caps_mode(dev, MLX5_CAP_VDPA_EMULATION,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
}
@ -1714,7 +1716,6 @@ static const int types[] = {
MLX5_CAP_FLOW_TABLE,
MLX5_CAP_ESWITCH_FLOW_TABLE,
MLX5_CAP_ESWITCH,
MLX5_CAP_VECTOR_CALC,
MLX5_CAP_QOS,
MLX5_CAP_DEBUG,
MLX5_CAP_DEV_MEM,
@ -1723,7 +1724,6 @@ static const int types[] = {
MLX5_CAP_VDPA_EMULATION,
MLX5_CAP_IPSEC,
MLX5_CAP_PORT_SELECTION,
MLX5_CAP_DEV_SHAMPO,
MLX5_CAP_MACSEC,
MLX5_CAP_ADV_VIRTUALIZATION,
MLX5_CAP_CRYPTO,


@ -174,6 +174,9 @@ static inline int mlx5_flexible_inlen(struct mlx5_core_dev *dev, size_t fixed,
#define MLX5_FLEXIBLE_INLEN(dev, fixed, item_size, num_items) \
mlx5_flexible_inlen(dev, fixed, item_size, num_items, __func__, __LINE__)
int mlx5_core_get_caps(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type);
int mlx5_core_get_caps_mode(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type,
enum mlx5_cap_mode cap_mode);
int mlx5_query_hca_caps(struct mlx5_core_dev *dev);
int mlx5_query_board_id(struct mlx5_core_dev *dev);
int mlx5_query_module_num(struct mlx5_core_dev *dev, int *module_num);


@ -129,7 +129,7 @@ static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u16 fn_id,
err = auxiliary_device_add(&sf_dev->adev);
if (err) {
put_device(&sf_dev->adev.dev);
auxiliary_device_uninit(&sf_dev->adev);
goto add_err;
}
@ -167,7 +167,7 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
if (!max_functions)
return 0;
base_id = MLX5_CAP_GEN(table->dev, sf_base_id);
base_id = mlx5_sf_start_function_id(table->dev);
if (event->function_id < base_id || event->function_id >= (base_id + max_functions))
return 0;
@ -185,7 +185,7 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
mlx5_sf_dev_del(table->dev, sf_dev, sf_index);
else
mlx5_core_err(table->dev,
"SF DEV: teardown state for invalid dev index=%d fn_id=0x%x\n",
"SF DEV: teardown state for invalid dev index=%d sfnum=0x%x\n",
sf_index, event->sw_function_id);
break;
case MLX5_VHCA_STATE_ACTIVE:
@ -209,7 +209,7 @@ static int mlx5_sf_dev_vhca_arm_all(struct mlx5_sf_dev_table *table)
int i;
max_functions = mlx5_sf_max_functions(dev);
function_id = MLX5_CAP_GEN(dev, sf_base_id);
function_id = mlx5_sf_start_function_id(dev);
/* Arm the vhca context as the vhca event notifier */
for (i = 0; i < max_functions; i++) {
err = mlx5_vhca_event_arm(dev, function_id);
@ -234,7 +234,7 @@ static void mlx5_sf_dev_add_active_work(struct work_struct *work)
int i;
max_functions = mlx5_sf_max_functions(dev);
function_id = MLX5_CAP_GEN(dev, sf_base_id);
function_id = mlx5_sf_start_function_id(dev);
for (i = 0; i < max_functions; i++, function_id++) {
if (table->stop_active_wq)
return;
@ -299,7 +299,7 @@ void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
unsigned int max_sfs;
int err;
if (!mlx5_sf_dev_supported(dev) || !mlx5_vhca_event_supported(dev))
if (!mlx5_sf_dev_supported(dev))
return;
table = kzalloc(sizeof(*table), GFP_KERNEL);


@ -9,6 +9,7 @@
#include "mlx5_core.h"
#include "eswitch.h"
#include "diag/sf_tracepoint.h"
#include "devlink.h"
struct mlx5_sf_hw {
u32 usr_sfnum;
@ -243,31 +244,61 @@ static void mlx5_sf_hw_table_hwc_cleanup(struct mlx5_sf_hwc_table *hwc)
kfree(hwc->sfs);
}
static void mlx5_sf_hw_table_res_unregister(struct mlx5_core_dev *dev)
{
devl_resources_unregister(priv_to_devlink(dev));
}
static int mlx5_sf_hw_table_res_register(struct mlx5_core_dev *dev, u16 max_fn,
u16 max_ext_fn)
{
struct devlink_resource_size_params size_params;
struct devlink *devlink = priv_to_devlink(dev);
int err;
devlink_resource_size_params_init(&size_params, max_fn, max_fn, 1,
DEVLINK_RESOURCE_UNIT_ENTRY);
err = devl_resource_register(devlink, "max_local_SFs", max_fn, MLX5_DL_RES_MAX_LOCAL_SFS,
DEVLINK_RESOURCE_ID_PARENT_TOP, &size_params);
if (err)
return err;
devlink_resource_size_params_init(&size_params, max_ext_fn, max_ext_fn, 1,
DEVLINK_RESOURCE_UNIT_ENTRY);
return devl_resource_register(devlink, "max_external_SFs", max_ext_fn,
MLX5_DL_RES_MAX_EXTERNAL_SFS, DEVLINK_RESOURCE_ID_PARENT_TOP,
&size_params);
}
int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev)
{
struct mlx5_sf_hw_table *table;
u16 max_ext_fn = 0;
u16 ext_base_id = 0;
u16 max_fn = 0;
u16 base_id;
u16 max_fn;
int err;
if (!mlx5_vhca_event_supported(dev))
return 0;
if (mlx5_sf_supported(dev))
max_fn = mlx5_sf_max_functions(dev);
max_fn = mlx5_sf_max_functions(dev);
err = mlx5_esw_sf_max_hpf_functions(dev, &max_ext_fn, &ext_base_id);
if (err)
return err;
if (mlx5_sf_hw_table_res_register(dev, max_fn, max_ext_fn))
mlx5_core_dbg(dev, "failed to register max SFs resources");
if (!max_fn && !max_ext_fn)
return 0;
table = kzalloc(sizeof(*table), GFP_KERNEL);
if (!table)
return -ENOMEM;
if (!table) {
err = -ENOMEM;
goto alloc_err;
}
mutex_init(&table->table_lock);
table->dev = dev;
@ -291,6 +322,8 @@ int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev)
table_err:
mutex_destroy(&table->table_lock);
kfree(table);
alloc_err:
mlx5_sf_hw_table_res_unregister(dev);
return err;
}
@ -299,12 +332,14 @@ void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev)
struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table;
if (!table)
return;
goto res_unregister;
mutex_destroy(&table->table_lock);
mlx5_sf_hw_table_hwc_cleanup(&table->hwc[MLX5_SF_HWC_EXTERNAL]);
mlx5_sf_hw_table_hwc_cleanup(&table->hwc[MLX5_SF_HWC_LOCAL]);
mutex_destroy(&table->table_lock);
kfree(table);
res_unregister:
mlx5_sf_hw_table_res_unregister(dev);
}
static int mlx5_sf_hw_vhca_event(struct notifier_block *nb, unsigned long opcode, void *data)

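The two resources registered in mlx5_sf_hw_table_res_register() above are
visible through the generic devlink resource interface. A hedged usage
sketch (bus address and sizes hypothetical)::

$ devlink resource show pci/0000:06:00.0
pci/0000:06:00.0:
  name max_local_SFs size 256 unit entry dpipe_tables none
  name max_external_SFs size 0 unit entry dpipe_tables none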

@@ -1208,9 +1208,7 @@ enum mlx5_cap_type {
 	MLX5_CAP_FLOW_TABLE,
 	MLX5_CAP_ESWITCH_FLOW_TABLE,
 	MLX5_CAP_ESWITCH,
-	MLX5_CAP_RESERVED,
-	MLX5_CAP_VECTOR_CALC,
-	MLX5_CAP_QOS,
+	MLX5_CAP_QOS = 0xc,
 	MLX5_CAP_DEBUG,
 	MLX5_CAP_RESERVED_14,
 	MLX5_CAP_DEV_MEM,
@@ -1220,7 +1218,6 @@ enum mlx5_cap_type {
 	MLX5_CAP_DEV_EVENT = 0x14,
 	MLX5_CAP_IPSEC,
 	MLX5_CAP_CRYPTO = 0x1a,
-	MLX5_CAP_DEV_SHAMPO = 0x1d,
 	MLX5_CAP_MACSEC = 0x1f,
 	MLX5_CAP_GENERAL_2 = 0x20,
 	MLX5_CAP_PORT_SELECTION = 0x25,
@@ -1239,7 +1236,6 @@ enum mlx5_pcam_feature_groups {
 enum mlx5_mcam_reg_groups {
 	MLX5_MCAM_REGS_FIRST_128 = 0x0,
-	MLX5_MCAM_REGS_0x9080_0x90FF = 0x1,
 	MLX5_MCAM_REGS_0x9100_0x917F = 0x2,
 	MLX5_MCAM_REGS_NUM = 0x3,
 };
@@ -1279,10 +1275,6 @@ enum mlx5_qcam_feature_groups {
 	MLX5_GET(per_protocol_networking_offload_caps,\
 		 mdev->caps.hca[MLX5_CAP_ETHERNET_OFFLOADS]->cur, cap)
 
-#define MLX5_CAP_ETH_MAX(mdev, cap) \
-	MLX5_GET(per_protocol_networking_offload_caps,\
-		 mdev->caps.hca[MLX5_CAP_ETHERNET_OFFLOADS]->max, cap)
-
 #define MLX5_CAP_IPOIB_ENHANCED(mdev, cap) \
 	MLX5_GET(per_protocol_networking_offload_caps,\
 		 mdev->caps.hca[MLX5_CAP_IPOIB_ENHANCED_OFFLOADS]->cur, cap)
@@ -1305,77 +1297,40 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP64_FLOWTABLE(mdev, cap) \
 	MLX5_GET64(flow_table_nic_cap, (mdev)->caps.hca[MLX5_CAP_FLOW_TABLE]->cur, cap)
 
-#define MLX5_CAP_FLOWTABLE_MAX(mdev, cap) \
-	MLX5_GET(flow_table_nic_cap, mdev->caps.hca[MLX5_CAP_FLOW_TABLE]->max, cap)
-
 #define MLX5_CAP_FLOWTABLE_NIC_RX(mdev, cap) \
 	MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_receive.cap)
 
-#define MLX5_CAP_FLOWTABLE_NIC_RX_MAX(mdev, cap) \
-	MLX5_CAP_FLOWTABLE_MAX(mdev, flow_table_properties_nic_receive.cap)
-
 #define MLX5_CAP_FLOWTABLE_NIC_TX(mdev, cap) \
 	MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_transmit.cap)
 
-#define MLX5_CAP_FLOWTABLE_NIC_TX_MAX(mdev, cap) \
-	MLX5_CAP_FLOWTABLE_MAX(mdev, flow_table_properties_nic_transmit.cap)
-
 #define MLX5_CAP_FLOWTABLE_SNIFFER_RX(mdev, cap) \
 	MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_receive_sniffer.cap)
 
-#define MLX5_CAP_FLOWTABLE_SNIFFER_RX_MAX(mdev, cap) \
-	MLX5_CAP_FLOWTABLE_MAX(mdev, flow_table_properties_nic_receive_sniffer.cap)
-
 #define MLX5_CAP_FLOWTABLE_SNIFFER_TX(mdev, cap) \
 	MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_transmit_sniffer.cap)
 
-#define MLX5_CAP_FLOWTABLE_SNIFFER_TX_MAX(mdev, cap) \
-	MLX5_CAP_FLOWTABLE_MAX(mdev, flow_table_properties_nic_transmit_sniffer.cap)
-
 #define MLX5_CAP_FLOWTABLE_RDMA_RX(mdev, cap) \
 	MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_receive_rdma.cap)
 
-#define MLX5_CAP_FLOWTABLE_RDMA_RX_MAX(mdev, cap) \
-	MLX5_CAP_FLOWTABLE_MAX(mdev, flow_table_properties_nic_receive_rdma.cap)
-
 #define MLX5_CAP_FLOWTABLE_RDMA_TX(mdev, cap) \
 	MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_transmit_rdma.cap)
 
-#define MLX5_CAP_FLOWTABLE_RDMA_TX_MAX(mdev, cap) \
-	MLX5_CAP_FLOWTABLE_MAX(mdev, flow_table_properties_nic_transmit_rdma.cap)
-
 #define MLX5_CAP_ESW_FLOWTABLE(mdev, cap) \
 	MLX5_GET(flow_table_eswitch_cap, \
 		 mdev->caps.hca[MLX5_CAP_ESWITCH_FLOW_TABLE]->cur, cap)
 
-#define MLX5_CAP_ESW_FLOWTABLE_MAX(mdev, cap) \
-	MLX5_GET(flow_table_eswitch_cap, \
-		 mdev->caps.hca[MLX5_CAP_ESWITCH_FLOW_TABLE]->max, cap)
-
 #define MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) \
 	MLX5_CAP_ESW_FLOWTABLE(mdev, flow_table_properties_nic_esw_fdb.cap)
 
-#define MLX5_CAP_ESW_FLOWTABLE_FDB_MAX(mdev, cap) \
-	MLX5_CAP_ESW_FLOWTABLE_MAX(mdev, flow_table_properties_nic_esw_fdb.cap)
-
 #define MLX5_CAP_ESW_EGRESS_ACL(mdev, cap) \
 	MLX5_CAP_ESW_FLOWTABLE(mdev, flow_table_properties_esw_acl_egress.cap)
 
-#define MLX5_CAP_ESW_EGRESS_ACL_MAX(mdev, cap) \
-	MLX5_CAP_ESW_FLOWTABLE_MAX(mdev, flow_table_properties_esw_acl_egress.cap)
-
 #define MLX5_CAP_ESW_INGRESS_ACL(mdev, cap) \
 	MLX5_CAP_ESW_FLOWTABLE(mdev, flow_table_properties_esw_acl_ingress.cap)
 
-#define MLX5_CAP_ESW_INGRESS_ACL_MAX(mdev, cap) \
-	MLX5_CAP_ESW_FLOWTABLE_MAX(mdev, flow_table_properties_esw_acl_ingress.cap)
-
 #define MLX5_CAP_ESW_FT_FIELD_SUPPORT_2(mdev, cap) \
 	MLX5_CAP_ESW_FLOWTABLE(mdev, ft_field_support_2_esw_fdb.cap)
 
-#define MLX5_CAP_ESW_FT_FIELD_SUPPORT_2_MAX(mdev, cap) \
-	MLX5_CAP_ESW_FLOWTABLE_MAX(mdev, ft_field_support_2_esw_fdb.cap)
-
 #define MLX5_CAP_ESW(mdev, cap) \
 	MLX5_GET(e_switch_cap, \
 		 mdev->caps.hca[MLX5_CAP_ESWITCH]->cur, cap)
@@ -1384,10 +1339,6 @@ enum mlx5_qcam_feature_groups {
 	MLX5_GET64(flow_table_eswitch_cap, \
 		   (mdev)->caps.hca[MLX5_CAP_ESWITCH_FLOW_TABLE]->cur, cap)
 
-#define MLX5_CAP_ESW_MAX(mdev, cap) \
-	MLX5_GET(e_switch_cap, \
-		 mdev->caps.hca[MLX5_CAP_ESWITCH]->max, cap)
-
 #define MLX5_CAP_PORT_SELECTION(mdev, cap) \
 	MLX5_GET(port_selection_cap, \
 		 mdev->caps.hca[MLX5_CAP_PORT_SELECTION]->cur, cap)
@@ -1400,26 +1351,15 @@ enum mlx5_qcam_feature_groups {
 	MLX5_GET(adv_virtualization_cap, \
 		 mdev->caps.hca[MLX5_CAP_ADV_VIRTUALIZATION]->cur, cap)
 
-#define MLX5_CAP_ADV_VIRTUALIZATION_MAX(mdev, cap) \
-	MLX5_GET(adv_virtualization_cap, \
-		 mdev->caps.hca[MLX5_CAP_ADV_VIRTUALIZATION]->max, cap)
-
 #define MLX5_CAP_FLOWTABLE_PORT_SELECTION(mdev, cap) \
 	MLX5_CAP_PORT_SELECTION(mdev, flow_table_properties_port_selection.cap)
 
-#define MLX5_CAP_FLOWTABLE_PORT_SELECTION_MAX(mdev, cap) \
-	MLX5_CAP_PORT_SELECTION_MAX(mdev, flow_table_properties_port_selection.cap)
-
 #define MLX5_CAP_ODP(mdev, cap)\
 	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, cap)
 
 #define MLX5_CAP_ODP_MAX(mdev, cap)\
 	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->max, cap)
 
-#define MLX5_CAP_VECTOR_CALC(mdev, cap) \
-	MLX5_GET(vector_calc_cap, \
-		 mdev->caps.hca[MLX5_CAP_VECTOR_CALC]->cur, cap)
-
 #define MLX5_CAP_QOS(mdev, cap)\
 	MLX5_GET(qos_cap, mdev->caps.hca[MLX5_CAP_QOS]->cur, cap)
@@ -1436,10 +1376,6 @@ enum mlx5_qcam_feature_groups {
 	MLX5_GET(mcam_reg, (mdev)->caps.mcam[MLX5_MCAM_REGS_FIRST_128], \
 		 mng_access_reg_cap_mask.access_regs.reg)
 
-#define MLX5_CAP_MCAM_REG1(mdev, reg) \
-	MLX5_GET(mcam_reg, (mdev)->caps.mcam[MLX5_MCAM_REGS_0x9080_0x90FF], \
-		 mng_access_reg_cap_mask.access_regs1.reg)
-
 #define MLX5_CAP_MCAM_REG2(mdev, reg) \
 	MLX5_GET(mcam_reg, (mdev)->caps.mcam[MLX5_MCAM_REGS_0x9100_0x917F], \
 		 mng_access_reg_cap_mask.access_regs2.reg)
@@ -1485,9 +1421,6 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP_CRYPTO(mdev, cap)\
 	MLX5_GET(crypto_cap, (mdev)->caps.hca[MLX5_CAP_CRYPTO]->cur, cap)
 
-#define MLX5_CAP_DEV_SHAMPO(mdev, cap)\
-	MLX5_GET(shampo_cap, mdev->caps.hca_cur[MLX5_CAP_DEV_SHAMPO], cap)
-
 #define MLX5_CAP_MACSEC(mdev, cap)\
 	MLX5_GET(macsec_cap, (mdev)->caps.hca[MLX5_CAP_MACSEC]->cur, cap)
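
These removals lean on the cur/max split visible in the surviving accessors: each queried capability type is stored twice, `->max` as reported by the device and `->cur` as currently enabled, and a `*_MAX` accessor only needs to exist while some caller, such as the probe-time capability negotiation, compares the device maximum before enabling a feature. A hedged illustration of the pair (the helper is invented for this sketch; `vport_group_manager` is a real general capability field and `MLX5_CAP_GEN_MAX()` survives the series because it has such callers)::

	/* Hedged sketch: the same capability field read from both copies. */
	static void example_cap_pair(struct mlx5_core_dev *mdev)
	{
		u8 enabled  = MLX5_CAP_GEN(mdev, vport_group_manager);	   /* ->cur */
		u8 possible = MLX5_CAP_GEN_MAX(mdev, vport_group_manager); /* ->max */

		mlx5_core_dbg(mdev, "vport_group_manager: enabled %u, device max %u\n",
			      enabled, possible);
	}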

View file

@@ -1022,7 +1022,6 @@ bool mlx5_cmd_is_down(struct mlx5_core_dev *dev);
 void mlx5_core_uplink_netdev_set(struct mlx5_core_dev *mdev, struct net_device *netdev);
 void mlx5_core_uplink_netdev_event_replay(struct mlx5_core_dev *mdev);
-int mlx5_core_get_caps(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type);
 void mlx5_health_cleanup(struct mlx5_core_dev *dev);
 int mlx5_health_init(struct mlx5_core_dev *dev);
 void mlx5_start_health_poll(struct mlx5_core_dev *dev);

View file

@@ -1314,33 +1314,6 @@ struct mlx5_ifc_odp_cap_bits {
 	u8         reserved_at_120[0x6E0];
 };
 
-struct mlx5_ifc_calc_op {
-	u8        reserved_at_0[0x10];
-	u8        reserved_at_10[0x9];
-	u8        op_swap_endianness[0x1];
-	u8        op_min[0x1];
-	u8        op_xor[0x1];
-	u8        op_or[0x1];
-	u8        op_and[0x1];
-	u8        op_max[0x1];
-	u8        op_add[0x1];
-};
-
-struct mlx5_ifc_vector_calc_cap_bits {
-	u8         calc_matrix[0x1];
-	u8         reserved_at_1[0x1f];
-
-	u8         reserved_at_20[0x8];
-	u8         max_vec_count[0x8];
-	u8         reserved_at_30[0xd];
-	u8         max_chunk_size[0x3];
-
-	struct mlx5_ifc_calc_op calc0;
-	struct mlx5_ifc_calc_op calc1;
-	struct mlx5_ifc_calc_op calc2;
-	struct mlx5_ifc_calc_op calc3;
-
-	u8         reserved_at_c0[0x720];
-};
-
 struct mlx5_ifc_tls_cap_bits {
 	u8         tls_1_2_aes_gcm_128[0x1];
 	u8         tls_1_3_aes_gcm_128[0x1];
@@ -3435,20 +3408,6 @@ struct mlx5_ifc_roce_addr_layout_bits {
 	u8         reserved_at_e0[0x20];
 };
 
-struct mlx5_ifc_shampo_cap_bits {
-	u8         reserved_at_0[0x3];
-	u8         shampo_log_max_reservation_size[0x5];
-	u8         reserved_at_8[0x3];
-	u8         shampo_log_min_reservation_size[0x5];
-	u8         shampo_min_mss_size[0x10];
-
-	u8         reserved_at_20[0x3];
-	u8         shampo_max_log_headers_entry_size[0x5];
-	u8         reserved_at_28[0x18];
-
-	u8         reserved_at_40[0x7c0];
-};
-
 struct mlx5_ifc_crypto_cap_bits {
 	u8         reserved_at_0[0x3];
 	u8         synchronize_dek[0x1];
@@ -3484,14 +3443,12 @@ union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_flow_table_eswitch_cap_bits flow_table_eswitch_cap;
 	struct mlx5_ifc_e_switch_cap_bits e_switch_cap;
 	struct mlx5_ifc_port_selection_cap_bits port_selection_cap;
-	struct mlx5_ifc_vector_calc_cap_bits vector_calc_cap;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
 	struct mlx5_ifc_debug_cap_bits debug_cap;
 	struct mlx5_ifc_fpga_cap_bits fpga_cap;
 	struct mlx5_ifc_tls_cap_bits tls_cap;
 	struct mlx5_ifc_device_mem_cap_bits device_mem_cap;
 	struct mlx5_ifc_virtio_emulation_cap_bits virtio_emulation_cap;
-	struct mlx5_ifc_shampo_cap_bits shampo_cap;
 	struct mlx5_ifc_macsec_cap_bits macsec_cap;
 	struct mlx5_ifc_crypto_cap_bits crypto_cap;
 	u8         reserved_at_0[0x8000];
@@ -10858,8 +10815,9 @@ enum {
 	MLX5_MFRL_REG_RESET_STATE_IDLE = 0,
 	MLX5_MFRL_REG_RESET_STATE_IN_NEGOTIATION = 1,
 	MLX5_MFRL_REG_RESET_STATE_RESET_IN_PROGRESS = 2,
-	MLX5_MFRL_REG_RESET_STATE_TIMEOUT = 3,
+	MLX5_MFRL_REG_RESET_STATE_NEG_TIMEOUT = 3,
 	MLX5_MFRL_REG_RESET_STATE_NACK = 4,
+	MLX5_MFRL_REG_RESET_STATE_UNLOAD_TIMEOUT = 5,
 };
 
 enum {
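
For orientation, the renamed and added MFRL states above feed the new post-reset verification from "net/mlx5: Check with FW that sync reset completed successfully": after a sync reset the driver reads back the reset state to decide whether the reset actually concluded. A hedged sketch of how a caller might fold these states into an errno (the helper and its exact mapping are illustrative, not the driver's actual code)::

	/* Hedged sketch: map the MFRL reset_state read back from firmware
	 * to an errno after a sync reset attempt.
	 */
	static int example_mfrl_state_to_errno(u8 reset_state)
	{
		switch (reset_state) {
		case MLX5_MFRL_REG_RESET_STATE_IDLE:
			return 0;		/* reset completed */
		case MLX5_MFRL_REG_RESET_STATE_IN_NEGOTIATION:
		case MLX5_MFRL_REG_RESET_STATE_RESET_IN_PROGRESS:
			return -EBUSY;		/* still running, poll again */
		case MLX5_MFRL_REG_RESET_STATE_NEG_TIMEOUT:
			return -ETIMEDOUT;	/* hosts failed to negotiate */
		case MLX5_MFRL_REG_RESET_STATE_NACK:
			return -EPERM;		/* a host refused the reset */
		case MLX5_MFRL_REG_RESET_STATE_UNLOAD_TIMEOUT:
			return -ETIMEDOUT;	/* a host missed the unload window */
		default:
			return -EIO;
		}
	}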