Commit graph

950564 commits

Author SHA1 Message Date
Wei Wang de033b7d15 ip: pass tos into ip_build_and_send_pkt()
This commit adds tos as a new passed in parameter to
ip_build_and_send_pkt() which will be used in the later commit.
This is a pure restructure and does not have any functional change.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 13:15:40 -07:00
Wei Wang e9b12edc13 tcp: record received TOS value in the request socket
A new field is added to the request sock to record the TOS value
received on the listening socket during 3WHS:
When not under syn flood, it is recording the TOS value sent in SYN.
When under syn flood, it is recording the TOS value sent in the ACK.
This is a preparation patch in order to do TOS reflection in the later
commit.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 13:15:40 -07:00
Lorenzo Bianconi 3a8c4ad161 net: mventa: drop mvneta_stats from mvneta_swbm_rx_frame signature
Remove mvneta_stats from mvneta_swbm_rx_frame signature since now stats
are accounted in mvneta_run_xdp routine

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 13:09:33 -07:00
David S. Miller 6198f44690 Merge branch 'netpoll-make-sure-napi_list-is-safe-for-RCU-traversal'
Jakub Kicinski says:

====================
netpoll: make sure napi_list is safe for RCU traversal

This series is a follow-up to the fix in commit 96e97bc07e ("net:
disable netpoll on fresh napis"). To avoid any latent race conditions
convert dev->napi_list to a proper RCU list. We need minor restructuring
because it looks like netif_napi_del() used to be idempotent, and
it may be quite hard to track down everyone who depends on that.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 13:08:47 -07:00
Jakub Kicinski 5251ef8299 net: make sure napi_list is safe for RCU traversal
netpoll needs to traverse dev->napi_list under RCU, make
sure it uses the right iterator and that removal from this
list is handled safely.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 13:08:46 -07:00
Jakub Kicinski 4d092dd204 net: manage napi add/del idempotence explicitly
To RCUify napi->dev_list we need to replace list_del_init()
with list_del_rcu(). There is no _init() version for RCU for
obvious reasons. Up until now netif_napi_del() was idempotent
so to make sure it remains such add a bit which is set when
NAPI is listed, and cleared when it removed. Since we don't
expect multiple calls to netif_napi_add() to be correct,
add a warning on that side.

Now that napi_hash_add / napi_hash_del are only called by
napi_add / del we can actually steal its bit. We just need
to make sure hash node is initialized correctly.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 13:08:46 -07:00
Jakub Kicinski 5198d545db net: remove napi_hash_del() from driver-facing API
We allow drivers to call napi_hash_del() before calling
netif_napi_del() to batch RCU grace periods. This makes
the API asymmetric and leaks internal implementation details.
Soon we will want the grace period to protect more than just
the NAPI hash table.

Restructure the API and have drivers call a new function -
__netif_napi_del() if they want to take care of RCU waits.

Note that only core was checking the return status from
napi_hash_del() so the new helper does not report if the
NAPI was actually deleted.

Some notes on driver oddness:
 - veth observed the grace period before calling netif_napi_del()
   but that should not matter
 - myri10ge observed normal RCU flavor
 - bnx2x and enic did not actually observe the grace period
   (unless they did so implicitly)
 - virtio_net and enic only unhashed Rx NAPIs

The last two points seem to indicate that the calls to
napi_hash_del() were a left over rather than an optimization.
Regardless, it's easy enough to correct them.

This patch may introduce extra synchronize_net() calls for
interfaces which set NAPI_STATE_NO_BUSY_POLL and depend on
free_netdev() to call netif_napi_del(). This seems inevitable
since we want to use RCU for netpoll dev->napi_list traversal,
and almost no drivers set IFF_DISABLE_NETPOLL.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 13:08:46 -07:00
David S. Miller 8b40f21b69 Merge branch 'mlx4-avoid-devlink-port-type-not-set-warnings'
Jakub Kicinski says:

====================
mlx4: avoid devlink port type not set warnings

This small set addresses the issue of mlx4 potentially not setting
devlink port type when Ethernet or IB driver is not built, but
port has that type.

v2:
 - add patch 1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:49:00 -07:00
Jakub Kicinski 0313c7c2e4 mlx4: make sure to always set the port type
Even tho mlx4_core registers the devlink ports, it's mlx4_en
and mlx4_ib which set their type. In situations where one of
the two is not built yet the machine has ports of given type
we see the devlink warning from devlink_port_type_warn() trigger.

Having ports of a type not supported by the kernel may seem
surprising, but it does occur in practice - when the unsupported
port is not plugged in to a switch anyway users are more than happy
not to see it (and potentially allocate any resources to it).

Set the type in mlx4_core if type-specific driver is not built.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:49:00 -07:00
Jakub Kicinski 3ea87ca772 devlink: don't crash if netdev is NULL
Following change will add support for a corner case where
we may not have a netdev to pass to devlink_port_type_eth_set()
but we still want to set port type.

This is definitely a corner case, and drivers should not normally
pass NULL netdev - print a warning message when this happens.

Sadly for other port types (ib) switches don't have a device
reference, the way we always do for Ethernet, so we can't put
the warning in __devlink_port_type_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:49:00 -07:00
Lorenzo Bianconi 6eb8b7fbe3 net: mvneta: rely on MVNETA_MAX_RX_BUF_SIZE for pkt split in mvneta_swbm_rx_frame()
In order to easily change the rx buffer size, rely on
MVNETA_MAX_RX_BUF_SIZE instead of PAGE_SIZE in mvneta_swbm_rx_frame
routine for rx buffer split. Currently this is not an issue since we set
MVNETA_MAX_RX_BUF_SIZE to PAGE_SIZE - MVNETA_SKB_PAD but it is a good to
have to configure a different rx buffer size.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:40:19 -07:00
David S. Miller 8c5c49a6a0 Merge branch 'Allow-more-than-255-IPv4-multicast-interfaces'
Paul Davey says:

====================
Allow more than 255 IPv4 multicast interfaces

Currently it is not possible to use more than 255 multicast interfaces
for IPv4 due to the format of the igmpmsg header which only has 8 bits
available for the VIF ID.  There is space available in the igmpmsg
header to store the full VIF ID in the form of an unused byte following
the VIF ID field.  There is also enough space for the full VIF ID in
the Netlink cache notifications, however the value is currently taken
directly from the igmpmsg header and has thus already been truncated.

Adding the high byte of the VIF ID into the unused3 byte of igmpmsg
allows use of more than 255 IPv4 multicast interfaces. The full VIF ID
is  also available in the Netlink notification by assembling it from
both bytes from the igmpmsg.

Additionally this reveals a deficiency in the Netlink cache report
notifications, they lack any means for differentiating cache reports
relating to different multicast routing tables.  This is easily
resolved by adding the multicast route table ID to the cache reports.

changes in v2:
 - Added high byte of VIF ID to igmpmsg struct replacing unused3
   member.
 - Assemble VIF ID in Netlink notification from both bytes in igmpmsg
   header.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:25:51 -07:00
Paul Davey bb82067c57 ipmr: Use full VIF ID in netlink cache reports
Insert the full 16 bit VIF ID into ipmr Netlink cache reports.

The VIF_ID attribute has 32 bits of space so can store the full VIF ID
extracted from the high and low byte fields in the igmpmsg.

Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:25:51 -07:00
Paul Davey c8715a8e9f ipmr: Add high byte of VIF ID to igmpmsg
Use the unused3 byte in struct igmpmsg to hold the high 8 bits of the
VIF ID.

If using more than 255 IPv4 multicast interfaces it is necessary to have
access to a VIF ID for cache reports that is wider than 8 bits, the VIF
ID present in the igmpmsg reports sent to mroute_sk was only 8 bits wide
in the igmpmsg header.  Adding the high 8 bits of the 16 bit VIF ID in
the unused byte allows use of more than 255 IPv4 multicast interfaces.

Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:25:51 -07:00
Paul Davey 501cb00890 ipmr: Add route table ID to netlink cache reports
Insert the multicast route table ID as a Netlink attribute to Netlink
cache report notifications.

When multiple route tables are in use it is necessary to have a way to
determine which route table a given cache report belongs to when
receiving the cache report.

Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:25:51 -07:00
Florian Fainelli 4f6a5caf18 net: dsa: b53: Report VLAN table occupancy via devlink
We already maintain an array of VLANs used by the switch so we can
simply iterate over it to report the occupancy via devlink.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:29:02 -07:00
David S. Miller 4a056990e1 Merge branch 'Marvell-PP2-2-PTP-support'
Russell King says:

====================
Marvell PP2.2 PTP support

This series adds PTP support for PP2.2 hardware to the mvpp2 driver.
Tested on the Macchiatobin eth1 port.

Note that on the Macchiatobin, eth0 uses a separate TAI block from
eth1, and there is no hardware synchronisation between the two.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:22:42 -07:00
Russell King f5015a594c net: mvpp2: ptp: add support for transmit timestamping
Add support for timestamping transmit packets.  We allocate SYNC
messages to queue 1, every other message to queue 0.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:22:42 -07:00
Russell King ce3497e207 net: mvpp2: ptp: add support for receive timestamping
Add support for receive timestamping. When enabled, the hardware adds
a timestamp into the receive queue descriptor for all received packets
with no filtering. Hence, we can only support NONE or ALL receive
filter modes.

The timestamp in the receive queue contains two bit sof seconds and
the full nanosecond timestamp. This has to be merged with the remainder
of the seconds from the TAI clock to arrive at a full timestamp before
we can convert it to a ktime for the skb hardware timestamp field.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:22:42 -07:00
Russell King 91dd71950b net: mvpp2: ptp: add TAI support
Add support for the TAI block in the mvpp2.2 hardware.

Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:22:42 -07:00
Russell King b4b17714c3 net: mvpp2: check first level interrupt status registers
Check the first level interrupt status registers to determine how to
further process the port interrupt. We will need this to know whether
to invoke the link status processing and/or the PTP processing for
both XLG and GMAC.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:22:41 -07:00
Russell King 8914197269 net: mvpp2: rename mis-named "link status" interrupt
The link interrupt is used for way more than just the link status; it
comes from a collection of units to do with the port. The Marvell
documentation describes the interrupt as "GOP port X interrupt".

Since we are adding PTP support, and the PTP interrupt uses this,
rename it to be more inline with the documentation.

This interrupt is also mis-named in the DT binding, but we leave that
alone.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:22:41 -07:00
Russell King 36cfd3a6e5 net: mvpp2: restructure "link status" interrupt handling
The "link status" interrupt is used for more than just link status.
Restructure mvpp2_link_status_isr() so we can add additional handling.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:22:41 -07:00
David S. Miller b599a5b9e1 Merge branch 'devlink-show-controller-number'
Parav Pandit says:

====================
devlink show controller number

Currently a devlink instance that supports an eswitch handles eswitch
ports of two type of controllers.
(1) controller discovered on same system where eswitch resides.
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external system.
This is the case where a controller is plugged in one system and its
devlink eswitch ports are located in a different system. In this case
devlink instance of the eswitch only have access to ports of the
controller.
However, there is no way to describe that a eswitch devlink port
belongs to which controller (mainly which external host controller).
This problem is more prevalent when port attribute such as PF and VF
numbers are overlapping between multiple controllers of same eswitch.
Due to this, for a specific switch_id, unique phys_port_name cannot
be constructed for such devlink ports.

This short series overcomes this limitation by defining two new
attributes.
(a) external: Indicates if port belongs to external controller
(b) controller number: Indicates a controller number of the port

Based on this a unique phys_port_name is prepared using controller
number.

phys_port_name construction using unique controller number is only
applicable to external controller ports. This ensures that for
non smartnic usecases where there is no external controller,
phys_port_name stays same as before.

Patch summary:
Patch-1 Added mlx5 driver to read controller number
Patch-2 Adds the missing comment for the port attributes
Patch-3 Move structure comments away from structure fields
Patch-4 external attribute added for PCI port flavours
Patch-5 Add controller number
Patch-6 Use controller number to build phys_port_name

---
Changelog:
v2->v3:
 - Updated diagram to get rid of controller 'A' and 'B'
 - Kept ports of single controller together in diagram
 - Updated diagram for pf1's VF and SF and its ports
v1->v2:
 - Added text diagram of multiple controllers
 - Updated example for a VF
 - Addressed comments from Jiri and Jakub
 - Moved controller number attribute to PCI port flavours
   This enables to better, hirerchical view with controller and its
    PF, VF numbers
 - Split 'external' and 'controller number' attributes as two
   different attributes
 - Merged mlx5_core driver to avoid compiliation break
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:19:56 -07:00
Parav Pandit 66b17082d1 devlink: Use controller while building phys_port_name
Now that controller number attribute is available, use it when
building phsy_port_name for external controller ports.

An example devlink port and representor netdev name consist of controller
annotation for external controller with controller number = 1,
for a VF 1 of PF 0:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0c1pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0c1pf0vf1",
            "flavour": "pcivf",
            "controller": 1,
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Controller number annotation is skipped for non external controllers to
maintain backward compatibility.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:19:55 -07:00
Parav Pandit 3a2d9588c4 devlink: Introduce controller number
A devlink port may be for a controller consist of PCI device.
A devlink instance holds ports of two types of controllers.
(1) controller discovered on same system where eswitch resides
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external host system.
This is the case where a controller is located in one system and its
devlink eswitch ports are located in a different system.

When a devlink eswitch instance serves the devlink ports of both
controllers together, PCI PF/VF numbers may overlap.
Due to this a unique phys_port_name cannot be constructed.

For example in below such system controller-0 and controller-1, each has
PCI PF pf0 whose eswitch ports can be present in controller-0.
These results in phys_port_name as "pf0" for both.
Similar problem exists for VFs and upcoming Sub functions.

An example view of two controller systems:

             ---------------------------------------------------------
             |                                                       |
             |           --------- ---------         ------- ------- |
-----------  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
| server  |  | -------   ----/---- ---/----- ------- ---/--- ---/--- |
| pci rc  |=== | pf0 |______/________/       | pf1 |___/_______/     |
| connect |  | -------                       -------                 |
-----------  |     | controller_num=1 (no eswitch)                   |
             ------|--------------------------------------------------
             (internal wire)
                   |
             ---------------------------------------------------------
             | devlink eswitch ports and reps                        |
             | ----------------------------------------------------- |
             | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
             | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
             | ----------------------------------------------------- |
             | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
             | |pf1    | pf1vfN | pf1sfN | pf1    | pf1vfN |pf0sfN | |
             | ----------------------------------------------------- |
             |                                                       |
             |                                                       |
             |           --------- ---------         ------- ------- |
             |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
             | -------   ----/---- ---/----- ------- ---/--- ---/--- |
             | | pf0 |______/________/       | pf1 |___/_______/     |
             | -------                       -------                 |
             |                                                       |
             |  local controller_num=0 (eswitch)                     |
             ---------------------------------------------------------

An example devlink port for external controller with controller
number = 1 for a VF 1 of PF 0:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0pf0vf1",
            "flavour": "pcivf",
            "controller": 1,
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:19:55 -07:00
Parav Pandit 05b595e9c4 devlink: Introduce external controller flag
A devlink eswitch port may represent PCI PF/VF ports of a controller.

A controller either located on same system or it can be an external
controller located in host where such NIC is plugged in.

Add the ability for driver to specify if a port is for external
controller.

Use such flag in the mlx5_core driver.

An example of an external controller having VF1 of PF0 belong to
controller 1.

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00
$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0pf0vf1",
            "flavour": "pcivf",
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:19:55 -07:00
Parav Pandit ff03e63ad1 devlink: Move structure comments outside of structure
To add more fields to the PCI PF and VF port attributes, follow standard
structure comment format.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:19:55 -07:00
Parav Pandit 2efbe6aebe devlink: Add comment block for missing port attributes
Add comment block for physical, PF and VF port attributes.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:19:55 -07:00
Parav Pandit a53cf9497a net/mlx5: E-switch, Read controller number from device
ECPF supports one external host controller. Read controller number
from the device.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:19:55 -07:00
Zhang Changzhong 6b5472d4f1 net: stmmac: dwmac-intel-plat: remove redundant null check before clk_disable_unprepare()
Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:17:02 -07:00
Zhang Changzhong a0d48518cd net: pxa168_eth: remove redundant null check before clk_disable_unprepare()
Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:16:33 -07:00
David S. Miller 34e435438c Merge branch 'SMSC-Cleanups-and-clock-setup'
Marco Felsch says:

====================
SMSC: Cleanups and clock setup

this small series cleans the smsc-phy code a bit and adds the support to
specify the phy clock source. Adding the phy clock source support is
also the main purpose of this series.

Each file has its own changelog.

Thanks a lot to Florian and Andrew for reviewing it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:15:02 -07:00
Marco Felsch d65af21842 net: phy: smsc: LAN8710/20: remove PHY_RST_AFTER_CLK_EN flag
Don't reset the phy without respect to the PHY library state machine
because this breaks the phy IRQ mode. The same behaviour can be archived
now by specifying the refclk.

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:15:02 -07:00
Marco Felsch bedd8d78ab net: phy: smsc: LAN8710/20: add phy refclk in support
Add support to specify the clock provider for the PHY refclk and don't
rely on 'magic' host clock setup. [1] tried to address this by
introducing a flag and fixing the corresponding host. But this commit
breaks the IRQ support since the irq setup during .config_intr() is
thrown away because the reset comes from the side without respecting the
current PHY state within the PHY library state machine. Furthermore the
commit fixed the problem only for FEC based hosts other hosts acting
like the FEC are not covered.

This commit goes the other way around to address the bug fixed by [1].
Instead of resetting the device from the side every time the refclk gets
(re-)enabled it requests and enables the clock till the device gets
removed. Now the PHY library is the only place where the PHY gets reset
to respect the PHY library state machine.

[1] commit 7f64e5b18e ("net: phy: smsc: LAN8710/20: add
    PHY_RST_AFTER_CLK_EN flag")

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:15:02 -07:00
Marco Felsch 84475a9e04 dt-bindings: net: phy: smsc: document reference clock
Add support to specify the reference clock for the phy.

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:15:02 -07:00
Marco Felsch 436e380064 net: phy: smsc: simplify config_init callback
Exit the driver specific config_init hook early if energy detection is
disabled. We can do this because we don't need to clear the interrupt
status here. Clearing the status should be removed anyway since this is
handled by the phy_enable_interrupts().

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:15:02 -07:00
Marco Felsch 7365494550 net: phy: smsc: skip ENERGYON interrupt if disabled
Don't enable the interrupt if the platform disable the energy detection
by "smsc,disable-energy-detect".

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:15:02 -07:00
Wang Hai 74c654a852 net: cavium: Fix a bunch of kerneldoc parameter issues
Rename ptp to ptp_info.

Fix W=1 compile warnings (invalid kerneldoc):

drivers/net/ethernet/cavium/common/cavium_ptp.c:94: warning: Excess function parameter 'ptp' description in 'cavium_ptp_adjfine'
drivers/net/ethernet/cavium/common/cavium_ptp.c:141: warning: Excess function parameter 'ptp' description in 'cavium_ptp_adjtime'
drivers/net/ethernet/cavium/common/cavium_ptp.c:163: warning: Excess function parameter 'ptp' description in 'cavium_ptp_gettime'
drivers/net/ethernet/cavium/common/cavium_ptp.c:185: warning: Excess function parameter 'ptp' description in 'cavium_ptp_settime'
drivers/net/ethernet/cavium/common/cavium_ptp.c:208: warning: Excess function parameter 'ptp' description in 'cavium_ptp_enable'

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:11:58 -07:00
Ayush Sawal 76f919ebff cxgb4/ch_ipsec: Registering xfrmdev_ops with cxgb4
As ch_ipsec was removed without clearing xfrmdev_ops and netdev
feature(esp-hw-offload). When a recalculation of netdev feature is
triggered by changing tls feature(tls-hw-tx-offload) from user
request, it causes a page fault due to absence of valid xfrmdev_ops.

Fixes: 6dad4e8ab3 ("chcr: Add support for Inline IPSec")
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:30:14 -07:00
David S. Miller 8794ebfe9a Merge branch 'ksz9477-dsa-switch-driver-improvements'
Paul Barker says:

====================
ksz9477 dsa switch driver improvements

These changes were made while debugging the ksz9477 driver for use on a
custom board which uses the ksz9893 switch supported by this driver. The
patches have been runtime tested on top of Linux 5.8.4, I couldn't
runtime test them on top of 5.9-rc3 due to unrelated issues. They have
been build tested on top of net-next.

These changes can also be pulled from:

  https://gitlab.com/pbarker.dev/staging/linux.git
  tag: for-net-next/ksz-v3_2020-09-09

Changes from v2:

  * Fixed incorrect type in assignment error.
    Reported-by: kernel test robot <lkp@intel.com>

Changes from v1:

  * Rebased onto net-next.

  * Dropped unnecessary `#include <linux/printk.h>`.

  * Instead of printing the phy mode in `ksz9477_port_setup()`, modify
    the existing print in `ksz9477_config_cpu_port()` to always produce
    output and to be more clear.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Paul Barker 5b79798090 net: dsa: microchip: Implement recommended reset timing
The datasheet for the ksz9893 and ksz9477 switches recommend waiting at
least 100us after the de-assertion of reset before trying to program the
device through any interface.

Also switch the existing msleep() call to usleep_range() as recommended
in Documentation/timers/timers-howto.rst. The 2ms range used here is
somewhat arbitrary, as long as the reset is asserted for at least 10ms
we should be ok.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Paul Barker ade64eb5be net: dsa: microchip: Disable RGMII in-band status on KSZ9893
We can't assume that the link partner supports the in-band status
reporting which is enabled by default on the KSZ9893 when using RGMII
for the upstream port.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Paul Barker 805a7e6f53 net: dsa: microchip: Improve phy mode message
Always print the selected phy mode for the CPU port when using the
ksz9477 driver. If the phy mode was changed, also print the previous
mode to aid in debugging.

To make the message more clear, prefix it with the port number which it
applies to and improve the language a little.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Paul Barker 3c85f77515 net: dsa: microchip: Make switch detection more informative
To make switch detection more informative print the result of the
ksz9477/ksz9893 compatibility check. With debug output enabled also
print the contents of the Chip ID registers as a 40-bit hex string.

As this detection is the first communication with the switch performed
by the driver, making it easy to see any errors here will help identify
issues with SPI data corruption or reset sequencing.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
David S. Miller d85427e3c8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Rewrite inner header IPv6 in ICMPv6 messages in ip6t_NPT,
   from Michael Zhou.

2) do_ip_vs_set_ctl() dereferences uninitialized value,
   from Peilin Ye.

3) Support for userdata in tables, from Jose M. Guisado.

4) Do not increment ct error and invalid stats at the same time,
   from Florian Westphal.

5) Remove ct ignore stats, also from Florian.

6) Add ct stats for clash resolution, from Florian Westphal.

7) Bump reference counter bump on ct clash resolution only,
   this is safe because bucket lock is held, again from Florian.

8) Use ip_is_fragment() in xt_HMARK, from YueHaibing.

9) Add wildcard support for nft_socket, from Balazs Scheidler.

10) Remove superfluous IPVS dependency on iptables, from
    Yaroslav Bolyukin.

11) Remove unused definition in ebt_stp, from Wang Hai.

12) Replace CONFIG_NFT_CHAIN_NAT_{IPV4,IPV6} by CONFIG_NFT_NAT
    in selftests/net, from Fabian Frederick.

13) Add userdata support for nft_object, from Jose M. Guisado.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:21:19 -07:00
Randy Dunlap ac99a822c6 net: ethernet/neterion/vxge: fix spelling of "functionality"
Fix typo/spello of "functionality".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08 20:26:13 -07:00
Randy Dunlap f5499c6747 nfc: pn533/usb.c: fix spelling of "functions"
Fix typo/spello of "functions".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-nfc@lists.01.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08 20:25:51 -07:00
Wei Wang e92dd77e6f ipv6: add tos reflection in TCP reset and ack
Currently, ipv6 stack does not do any TOS reflection. To make the
behavior consistent with v4 stack, this commit adds TOS reflection in
tcp_v6_reqsk_send_ack() and tcp_v6_send_reset(). We clear the lower
2-bit ECN value of the received TOS in compliance with RFC 3168 6.1.5
robustness principles.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08 20:20:55 -07:00
David S. Miller 56bbc22d83 RxRPC development
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl9X6OcACgkQ+7dXa6fL
 C2s3Dg/9FyGc4Na08Rc5q8K2TSVhsx1dqBscf/t3gIulCaShbbRRFAb4UecrxOTn
 GV/eekGVejh/a+cCCVjMGFNP2FxyW9WTFoxINaTFp+bdGURdo8MOJTCTc/1Sgflb
 M7s+2mubYHfHLw4/VUR8tCAhODPC0JUTiDnkZqXZxA8SizUPAkNZySH7OfgW9PjA
 PFy/lB5IbR9mU64FWMALU2Rs09Gav/KRA+/X8fyE80i1MgYAz82U7x/EQa0XCmdR
 B61VObK8z9/o7Juu8GzM01k8b5zoWraHph0e/BdFaEN1KY5phPdPyerGa67GcgwX
 imnvcW4N75QBlHhE8C34eKnbIaNIPUR9ZNJdEgnA29CPOtP87rSPaeNNn4ISyF8r
 TnGGpAeBHMahrY2dhKK0lIv9Gbwd1j+OAlrL9O+j4KTaKZc9CqnXLzKp3ugXDeex
 RzHptlsc0I0bQ8ZeCA1jS9OFmypaRtYabE9DSrPS2Epbb8SdcCE4ZTXabB3A/ytk
 HBle/MfGcQN9yFAJpDJ/0Bj6PmUZgohVS7qMrZi+JV99vkNPebxxxEvoiM5il2km
 3DXPrD3rv9/qg8F6xVHYkWXVxZk6FCMe8/sg9VKUfOhmwEiFBkXv2IwTP/s3nN0g
 0h46rXAzjWXY8YMZzsgBHVe/Vp2MfJGsx9hyO1JppRbxGb1PVeU=
 =f+/X
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-next-20200908' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Allow more calls to same peer

Here are some development patches for AF_RXRPC that allow more simultaneous
calls to be made to the same peer with the same security parameters.  The
current code allows a maximum of 4 simultaneous calls, which limits the afs
filesystem to that many simultaneous threads.  This increases the limit to
16.

To make this work, the way client connections are limited has to be changed
(incoming call/connection limits are unaffected) as the current code
depends on queuing calls on a connection and then pushing the connection
through a queue.  The limit is on the number of available connections.

This is changed such that there's a limit[*] on the total number of calls
systemwide across all namespaces, but the limit on the number of client
connections is removed.

Once a call is allowed to proceed, it finds a bundle of connections and
tries to grab a call slot.  If there's a spare call slot, fine, otherwise
it will wait.  If there's already a waiter, it will try to create another
connection in the bundle, unless the limit of 4 is reached (4 calls per
connection, giving 16).

A number of things throttle someone trying to set up endless connections:

 - Calls that fail immediately have their conns deleted immediately,

 - Calls that don't fail immediately have to wait for a timeout,

 - Connections normally get automatically reaped if they haven't been used
   for 2m, but this is sped up to 2s if the number of connections rises
   over 900.  This number is tunable by sysctl.

[*] Technically two limits - kernel sockets and userspace rxrpc sockets are
    accounted separately.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08 20:18:17 -07:00