linux

mirror of https://github.com/torvalds/linux synced 2024-09-21 03:28:37 +00:00

Author	SHA1	Message	Date
Florian Westphal	9b1a4d0f91	netfilter: conntrack: convert sysctls to u8 log_invalid sysctl allows values of 0 to 255 inclusive so we no longer need a range check: the min/max values can be removed. This also removes all member variables that were moved to net_generic data in previous patches. This reduces size of netns_ct struct by one cache line. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-13 13:10:39 +02:00
Florian Westphal	c53bd0e966	netfilter: conntrack: move ct counter to net_generic data Its only needed from slowpath (sysctl, ctnetlink, gc worker) and when a new conntrack object is allocated. Furthermore, each write dirties the otherwise read-mostly pernet data in struct net.ct, which are accessed from packet path. Move it to the net_generic data. This makes struct netns_ct read-mostly. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-13 13:10:39 +02:00
Florian Westphal	f6f2e580d5	netfilter: conntrack: move expect counter to net_generic data Creation of a new conntrack entry isn't a frequent operation (compared to 'ct entry already exists'). Creation of a new entry that is also an expected (related) connection even less so. Place this counter in net_generic data. A followup patch will also move the conntrack count -- this will make netns_ct a read-mostly structure. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-13 13:10:39 +02:00
Florian Westphal	67f28216ca	netfilter: conntrack: move autoassign_helper sysctl to net_generic data While at it, make it an u8, no need to use an integer for a boolean. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-13 13:10:39 +02:00
Florian Westphal	098b5d3565	netfilter: conntrack: move autoassign warning member to net_generic data Not accessed in fast path, place this is generic_net data instead. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-13 13:10:39 +02:00
wenxu	efce49dfe6	netfilter: flowtable: add vlan pop action offload support This patch adds vlan pop action offload in the flowtable offload. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-13 13:09:41 +02:00
wenxu	3e1b0c168f	netfilter: flowtable: add vlan match offload support This patch adds support for vlan_id, vlan_priority and vlan_proto match for flowtable offload. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-13 12:59:54 +02:00
Adam Ford	8ef7adc6be	net: ethernet: ravb: Enable optional refclk For devices that use a programmable clock for the AVB reference clock, the driver may need to enable them. Add code to find the optional clock and enable it when available. Signed-off-by: Adam Ford <aford173@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 14:09:59 -07:00
Adam Ford	6f43735b6d	dt-bindings: net: renesas,etheravb: Add additional clocks The AVB driver assumes there is an external crystal, but it could be clocked by other means. In order to enable a programmable clock, it needs to be added to the clocks list and enabled in the driver. Since there currently only one clock, there is no clock-names list either. Update bindings to add the additional optional clock, and explicitly name both of them. Signed-off-by: Adam Ford <aford173@gmail.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 14:09:59 -07:00
David S. Miller	d27139c544	Merge branch 'enetc-ptp' Yangbo Lu says: ==================== enetc: support PTP Sync packet one-step timestamping This patch-set is to add support for PTP Sync packet one-step timestamping. Since ENETC single-step register has to be configured dynamically per packet for correctionField offeset and UDP checksum update, current one-step timestamping packet has to be sent only when the last one completes transmitting on hardware. So, on the TX, this patch handles one-step timestamping packet as below: - Trasmit packet immediately if no other one in transfer, or queue to skb queue if there is already one in transfer. The test_and_set_bit_lock() is used here to lock and check state. - Start a work when complete transfer on hardware, to release the bit lock and to send one skb in skb queue if has. Changes for v2: - Rebased. - Fixed issues from patchwork checks. - netif_tx_lock for one-step timestamping packet sending. Changes for v3: - Used system workqueue. - Set bit lock when transmitted one-step packet, and scheduled work when completed. The worker cleared the bit lock, and transmitted one skb in skb queue if has, instead of a loop. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:34:21 -07:00
Yangbo Lu	7294380c52	enetc: support PTP Sync packet one-step timestamping This patch is to add support for PTP Sync packet one-step timestamping. Since ENETC single-step register has to be configured dynamically per packet for correctionField offeset and UDP checksum update, current one-step timestamping packet has to be sent only when the last one completes transmitting on hardware. So, on the TX, this patch handles one-step timestamping packet as below: - Trasmit packet immediately if no other one in transfer, or queue to skb queue if there is already one in transfer. The test_and_set_bit_lock() is used here to lock and check state. - Start a work when complete transfer on hardware, to release the bit lock and to send one skb in skb queue if has. And the configuration for one-step timestamping on ENETC before transmitting is, - Set one-step timestamping flag in extension BD. - Write 30 bits current timestamp in tstamp field of extension BD. - Update PTP Sync packet originTimestamp field with current timestamp. - Configure single-step register for correctionField offeset and UDP checksum update. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:34:21 -07:00
Yangbo Lu	f768e75130	enetc: mark TX timestamp type per skb Mark TX timestamp type per skb on skb->cb[0], instead of global variable for all skbs. This is a preparation for one step timestamp support. For one-step timestamping enablement, there will be both one-step and two-step PTP messages to transfer. And a skb queue is needed for one-step PTP messages making sure start to send current message only after the last one completed on hardware. (ENETC single-step register has to be dynamically configured per message.) So, marking TX timestamp type per skb is required. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:34:21 -07:00
David S. Miller	8043edee9a	Merge branch 'ibmvnic-errors' Lijun Pan says: ==================== ibmvnic: improve error printing Patch 1 prints reset reason as a string. Patch 2 prints adapter state as a string. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:31:27 -07:00
Lijun Pan	0666ef7f61	ibmvnic: print adapter state as a string The adapter state can be added or deleted over different versions of the source code. Print a string instead of a number. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:31:26 -07:00
Lijun Pan	caee7bf5b0	ibmvnic: print reset reason as a string The reset reason can be added or deleted over different versions of the source code. Print a string instead of a number. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:31:26 -07:00
Lijun Pan	c82eaa4064	ibmvnic: clean up the remaining debugfs data structures Commit `e704f0434e` ("ibmvnic: Remove debugfs support") did not clean up everything. Remove the remaining code. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:29:10 -07:00
David S. Miller	645b34a7b5	Merge branch 'netns-sysctl-isolation' Jonathon Reinhart says: ==================== Ensuring net sysctl isolation This patchset is the result of an audit of /proc/sys/net to prove that it is safe to be mouted read-write in a container when a net namespace is in use. See [1]. The first commit adds code to detect sysctls which are not netns-safe, and can "leak" changes to other net namespaces. My manual audit found, and the above feature confirmed, that there are two nf_conntrack sysctls which are in fact not netns-safe. I considered sending the latter to netfilter-devel, but I think it's better to have both together on net-next: Adding only the former causes undesirable warnings in the kernel log. [1]: https://github.com/opencontainers/runc/issues/2826 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:27:11 -07:00
Jonathon Reinhart	2671fa4dc0	netfilter: conntrack: Make global sysctls readonly in non-init netns These sysctls point to global variables: - NF_SYSCTL_CT_MAX (&nf_conntrack_max) - NF_SYSCTL_CT_EXPECT_MAX (&nf_ct_expect_max) - NF_SYSCTL_CT_BUCKETS (&nf_conntrack_htable_size_user) Because their data pointers are not updated to point to per-netns structures, they must be marked read-only in a non-init_net ns. Otherwise, changes in any net namespace are reflected in (leaked into) all other net namespaces. This problem has existed since the introduction of net namespaces. The current logic marks them read-only only if the net namespace is owned by an unprivileged user (other than init_user_ns). Commit `d0febd81ae` ("netfilter: conntrack: re-visit sysctls in unprivileged namespaces") "exposes all sysctls even if the namespace is unpriviliged." Since we need to mark them readonly in any case, we can forego the unprivileged user check altogether. Fixes: `d0febd81ae` ("netfilter: conntrack: re-visit sysctls in unprivileged namespaces") Signed-off-by: Jonathon Reinhart <Jonathon.Reinhart@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:27:11 -07:00
Jonathon Reinhart	31c4d2f160	net: Ensure net namespace isolation of sysctls This adds an ensure_safe_net_sysctl() check during register_net_sysctl() to validate that sysctl table entries for a non-init_net netns are sufficiently isolated. To be netns-safe, an entry must adhere to at least (and usually exactly) one of these rules: 1. It is marked read-only inside the netns. 2. Its data pointer does not point to kernel/module global data. An entry which fails both of these checks is indicative of a bug, whereby a child netns can affect global net sysctl values. If such an entry is found, this code will issue a warning to the kernel log, and force the entry to be read-only to prevent a leak. To test, simply create a new netns: $ sudo ip netns add dummy As it sits now, this patch will WARN for two sysctls which will be addressed in a subsequent patch: - /proc/sys/net/netfilter/nf_conntrack_max - /proc/sys/net/netfilter/nf_conntrack_expect_max Signed-off-by: Jonathon Reinhart <Jonathon.Reinhart@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:27:11 -07:00
wengjianfeng	a115d24a63	nfc: pn533: remove redundant assignment In many places,first assign a value to a variable and then return the variable. which is redundant, we should directly return the value. in pn533_rf_field funciton,return rc also in the if statement, so we use return 0 to replace the last return rc. Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:23:03 -07:00
David S. Miller	5711ffd313	Merge branch 'bnxt_en-error-recovery' Michael Chan says: ==================== bnxt_en: Error recovery fixes. This series adds some fixes and enhancements to the error recovery logic. The health register logic is improved and we also add missing code to free and re-create VF representors in the firmware after error recovery. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:20:38 -07:00
Sriharsha Basavapatna	ac797ced1f	bnxt_en: Free and allocate VF-Reps during error recovery. During firmware recovery, VF-Rep configuration in the firmware is lost. Fix it by freeing and (re)allocating VF-Reps in FW at relevant points during the error recovery process. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:20:38 -07:00
Michael Chan	90f4fd0296	bnxt_en: Refactor __bnxt_vf_reps_destroy(). Add a new helper function __bnxt_free_one_vf_rep() to free one VF rep. We also reintialize the VF rep fields to proper initial values so that the function can be used without freeing the VF rep data structure. This will be used in subsequent patches to free and recreate VF reps after error recovery. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:20:38 -07:00
Sriharsha Basavapatna	ea2d37b2b3	bnxt_en: Refactor bnxt_vf_reps_create(). Add a new function bnxt_alloc_vf_rep() to allocate a VF representor. This function will be needed in subsequent patches to recreate the VF reps after error recovery. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:20:38 -07:00
Vasundhara Volam	190eda1a9d	bnxt_en: Invalidate health register mapping at the end of probe. After probe is successful, interface may not be bought up in all the cases and health register mapping could be invalid if firmware undergoes reset. Fix it by invalidating the health register at the end of probe. It will be remapped during ifup. Fixes: `43a440c400` ("bnxt_en: Improve the status_reliable flag in bp->fw_health.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:20:38 -07:00
Michael Chan	17e1be342d	bnxt_en: Treat health register value 0 as valid in bnxt_try_reover_fw(). The retry loop in bnxt_try_recover_fw() should not abort when the health register value is 0. It is a valid value that indicates the firmware is booting up. Fixes: `861aae786f` ("bnxt_en: Enhance retry of the first message to the firmware.") Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:20:38 -07:00
Andrea Mayer	0d77036057	net: seg6: trivial fix of a spelling mistake in comment There is a comment spelling mistake "interfarence" -> "interference" in function parse_nla_action(). Fix it. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:17:09 -07:00
Colin Ian King	d0494135f9	net: hns3: Fix potential null pointer defererence of null ae_dev The reset_prepare and reset_done calls have a null pointer check on ae_dev however ae_dev is being dereferenced via the call to ns3_is_phys_func with the ae->pdev argument. Fix this by performing a null pointer check on ae_dev and hence short-circuiting the dereference to ae_dev on the call to ns3_is_phys_func. Addresses-Coverity: ("Dereference before null check") Fixes: `715c58e94f` ("net: hns3: add suspend and resume pm_ops") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:15:21 -07:00
Colin Ian King	e701a25840	net: thunderx: Fix unintentional sign extension issue The shifting of the u8 integers rq->caching by 26 bits to the left will be promoted to a 32 bit signed int and then sign-extended to a u64. In the event that rq->caching is greater than 0x1f then all then all the upper 32 bits of the u64 end up as also being set because of the int sign-extension. Fix this by casting the u8 values to a u64 before the 26 bit left shift. Addresses-Coverity: ("Unintended sign extension") Fixes: `4863dea3fa` ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:13:57 -07:00
Colin Ian King	dd2c796773	cxgb4: Fix unintentional sign extension issues The shifting of the u8 integers f->fs.nat_lip[] by 24 bits to the left will be promoted to a 32 bit signed int and then sign-extended to a u64. In the event that the top bit of the u8 is set then all then all the upper 32 bits of the u64 end up as also being set because of the sign-extension. Fix this by casting the u8 values to a u64 before the 24 bit left shift. Addresses-Coverity: ("Unintended sign extension") Fixes: `12b276fbf6` ("cxgb4: add support to create hash filters") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:13:17 -07:00
David S. Miller	5b489fea97	Merge branch 'ipa-next' Alex Elder says: ==================== net: ipa: support two more platforms This series adds IPA support for two more Qualcomm SoCs. The first patch updates the DT binding to add compatible strings. The second temporarily disables checksum offload support for IPA version 4.5 and above. Changes are required to the RMNet driver to support the "inline" checksum offload used for IPA v4.5+, and once those are present this capability will be enabled for IPA. The third and fourth patches add configuration data for IPA versions 4.5 (used for the SDX55 SoC) and 4.11 (used for the SD7280 SoC). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:49:08 -07:00
Alex Elder	927c504345	net: ipa: add IPA v4.11 configuration data Add support for the SC7280 SoC, which includes IPA version 4.11. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:49:08 -07:00
Alex Elder	fbb763e7e7	net: ipa: add IPA v4.5 configuration data Add support for the SDX55 SoC, which includes IPA version 4.5. Starting with IPA v4.5, a few of the memory regions have a different number of "canary" values; update comments in the where the region identifers are defined to accurately reflect that. I'll note three differences in SDX55 versus the other two existing platforms (SDM845 and SC7180): - SDX55 uses a 32-bit Linux kernel - SDX55 has four interconnects rather than three - SDX55 uses IPA v4.5, which uses inline checksum offload Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:49:08 -07:00
Alex Elder	c88c34fcf8	net: ipa: disable checksum offload for IPA v4.5+ Checksum offload for IPA v4.5+ is implemented differently, using "inline" offload (which uses a common header format for both upload and download offload). The IPA hardware must be programmed to enable MAP checksum offload, but the RMNet driver is responsible for interpreting checksum metadata supplied with messages. Currently, the RMNet driver does not support inline checksum offload. This support is imminent, but until it is available, do not allow newer versions of IPA to specify checksum offload for endpoints. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:49:08 -07:00
Alex Elder	c3264fee72	dt-bindings: net: qcom,ipa: add some compatible strings Add existing supported platform "qcom,sc7180-ipa" to the set of IPA compatible strings. Also add newly-supported "qcom,sdx55-ipa", "qcom,sc7280-ipa". Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:49:08 -07:00
Qiheng Lin	95291ced81	ehea: add missing MODULE_DEVICE_TABLE This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qiheng Lin <linqiheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:42:38 -07:00
David S. Miller	23cfa4d4aa	Merge branch 'veth-gro' Paolo Abeni says: ==================== veth: allow GRO even without XDP This series allows the user-space to enable GRO/NAPI on a veth device even without attaching an XDP program. It does not change the default veth behavior (no NAPI, no GRO), except that the GRO feature bit on top of this series will be effectively off by default on veth devices. Note that currently the GRO bit is on by default, but GRO never takes place in absence of XDP. On top of this series, setting the GRO feature bit enables NAPI and allows the GRO to take place. The TSO features on the peer device are preserved. The main goal is improving UDP forwarding performances for containers in a typical virtual network setup: (container) veth -> veth peer -> bridge/ovs -> vxlan -> NIC Enabling the NAPI threaded mode, GRO the NETIF_F_GRO_UDP_FWD feature on the veth peer improves the UDP stream performance with not void netfilter configuration by 2x factor with no measurable overhead for TCP traffic: some heuristic ensures that TCP will not go through the additional NAPI/GRO layer. Some self-tests are added to check the expected behavior in the default configuration, with XDP and with plain GRO enabled. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:39:28 -07:00
Paolo Abeni	1c3cadbe02	self-tests: add veth tests Add some basic veth tests, that verify the expected flags and aggregation with different setups (default, xdp, etc...) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:39:28 -07:00
Paolo Abeni	47e550e010	veth: refine napi usage After the previous patch, when enabling GRO, locally generated TCP traffic experiences some measurable overhead, as it traverses the GRO engine without any chance of aggregation. This change refine the NAPI receive path admission test, to avoid unnecessary GRO overhead in most scenarios, when GRO is enabled on a veth peer. Only skbs that are eligible for aggregation enter the GRO layer, the others will go through the traditional receive path. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:39:28 -07:00
Paolo Abeni	d3256efd8e	veth: allow enabling NAPI even without XDP Currently the veth device has the GRO feature bit set, even if no GRO aggregation is possible with the default configuration, as the veth device does not hook into the GRO engine. Flipping the GRO feature bit from user-space is a no-op, unless XDP is enabled. In such scenario GRO could actually take place, but TSO is forced to off on the peer device. This change allow user-space to really control the GRO feature, with no need for an XDP program. The GRO feature bit is now cleared by default - so that there are no user-visible behavior changes with the default configuration. When the GRO bit is set, the per-queue NAPI instances are initialized and registered. On xmit, when napi instances are available, we try to use them. Some additional checks are in place to ensure we initialize/delete NAPIs only when needed in case of overlapping XDP and GRO configuration changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:39:28 -07:00
Paolo Abeni	c75fb320d4	veth: use skb_orphan_partial instead of skb_orphan As described by commit `9c4c325252` ("skbuff: preserve sock reference when scrubbing the skb."), orphaning a skb in the TX path will cause OoO. Let's use skb_orphan_partial() instead of skb_orphan(), so that we keep the sk around for queue's selection sake and we still avoid the problem fixed with commit `4bf9ffa0fb` ("veth: Orphan skb before GRO") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:39:28 -07:00
David S. Miller	7dc85b599a	Merge branch 'ethtool-eeprom' Moshe Shemesh says: ==================== ethtool: Extend module EEPROM dump API Ethtool supports module EEPROM dumps via the `ethtool -m <dev>` command. But in current state its functionality is limited - offset and length parameters, which are used to specify a linear desired region of EEPROM data to dump, is not enough, considering emergence of complex module EEPROM layouts such as CMIS 4.0. Moreover, CMIS 4.0 extends the amount of pages that may be accessible by introducing another parameter for page addressing - banks. Besides, currently module EEPROM is represented as a chunk of concatenated pages, where lower 128 bytes of all pages, except page 00h, are omitted. Offset and length are used to address parts of this fake linear memory. But in practice drivers, which implement get_module_info() and get_module_eeprom() ethtool ops still calculate page number and set I2C address on their own. This series tackles these issues by adding ethtool op, which allows to pass page number, bank number and I2C address in addition to offset and length parameters to the driver, adds corresponding netlink infrastructure and implements the new interface in mlx5 driver. This allows to extend userspace 'ethtool -m' CLI by adding new parameters - page, bank and i2c. New command line format: ethtool -m <dev> [hex on\|off] [raw on\|off] [offset N] [length N] [page N] [bank N] [i2c N] The consequence of this series is a possibility to dump arbitrary EEPROM page at a time, in contrast to dumps of concatenated pages. Therefore, offset and length change their semantics and may be used only to specify a part of data within half page boundary, which size is currently limited to 128 bytes. As for drivers that support legacy get_module_info() and get_module_eeprom() pair, the series addresses it by implementing a fallback mechanism. As mentioned earlier, such drivers derive a page number from 'global' offset, so this can be done vice versa without their involvement thanks to standardization. If kernel netlink handler of 'ethtool -m' command detects that new ethtool op is not supported by the driver, it calculates offset from given page number and page offset and calls old ndos, if they are available. ==================== \Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Andrew Lunn	c97a31f66e	ethtool: wire in generic SFP module access If the device has a sfp bus attached, call its sfp_get_module_eeprom_by_page() function, otherwise use the ethtool op for the device. This follows how the IOCTL works. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Andrew Lunn	d740513f05	phy: sfp: add netlink SFP support to generic SFP code The new netlink API for reading SFP data requires a new op to be implemented. The idea of the new netlink SFP code is that userspace is responsible to parsing the EEPROM data and requesting pages, rather than have the kernel decide what pages are interesting and returning them. This allows greater flexibility for newer formats. Currently the generic SFP code only supports simple SFPs. Allow i2c address 0x50 and 0x51 to be accessed with page and bank must always be 0. This interface will later be extended when for example QSFP support is added. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Vladyslav Tarasiuk	96d971e307	ethtool: Add fallback to get_module_eeprom from netlink command In case netlink get_module_eeprom_by_page() callback is not implemented by the driver, try to call old get_module_info() and get_module_eeprom() pair. Recalculate parameters to get_module_eeprom() offset and len using page number and their sizes. Return error if this can't be done. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Andrew Lunn	95dfc7effd	net: ethtool: Export helpers for getting EEPROM info There are two ways to retrieve information from SFP EEPROMs. Many devices make use of the common code, and assign the sfp_bus pointer in the netdev to point to the bus holding the SFP device. Some MAC drivers directly implement ops in there ethool structure. Export within net/ethtool the two helpers used to call these methods, so that they can also be used in the new netlink code. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Vladyslav Tarasiuk	4c88fa412a	net/mlx5: Add support for DSFP module EEPROM dumps Allow the driver to recognise DSFP transceiver module ID and therefore allow its EEPROM dumps using ethtool. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Vladyslav Tarasiuk	e109d2b204	net/mlx5: Implement get_module_eeprom_by_page() Implement ethtool_ops::get_module_eeprom_by_page() to enable support of new SFP standards. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Vladyslav Tarasiuk	e19b0a3474	net/mlx5: Refactor module EEPROM query Prepare for ethtool_ops::get_module_eeprom_data() implementation by extracting common part of mlx5_query_module_eeprom() into a separate function. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Vladyslav Tarasiuk	c781ff12a2	ethtool: Allow network drivers to dump arbitrary EEPROM data Define get_module_eeprom_by_page() ethtool callback and implement netlink infrastructure. get_module_eeprom_by_page() allows network drivers to dump a part of module's EEPROM specified by page and bank numbers along with offset and length. It is effectively a netlink replacement for get_module_info() and get_module_eeprom() pair, which is needed due to emergence of complex non-linear EEPROM layouts. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00

1 2 3 4 5 ...

999746 commits