Commit graph

458208 commits

Author SHA1 Message Date
Willem de Bruijn e7fd288538 net-timestamp: SCHED timestamp on entering packet scheduler
Kernel transmit latency is often incurred in the packet scheduler.
Introduce a new timestamp on transmission just before entering the
scheduler. When data travels through multiple devices (bonding,
tunneling, ...) each device will export an individual timestamp.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:35:54 -07:00
Willem de Bruijn 09c2d251b7 net-timestamp: add key to disambiguate concurrent datagrams
Datagrams timestamped on transmission can coexist in the kernel stack
and be reordered in packet scheduling. When reading looped datagrams
from the socket error queue it is not always possible to unique
correlate looped data with original send() call (for application
level retransmits). Even if possible, it may be expensive and complex,
requiring packet inspection.

Introduce a data-independent ID mechanism to associate timestamps with
send calls. Pass an ID alongside the timestamp in field ee_data of
sock_extended_err.

The ID is a simple 32 bit unsigned int that is associated with the
socket and incremented on each send() call for which software tx
timestamp generation is enabled.

The feature is enabled only if SOF_TIMESTAMPING_OPT_ID is set, to
avoid changing ee_data for existing applications that expect it 0.
The counter is reset each time the flag is reenabled. Reenabling
does not change the ID of already submitted data. It is possible
to receive out of order IDs if the timestamp stream is not quiesced
first.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:35:54 -07:00
Willem de Bruijn b9f40e21ef net-timestamp: move timestamp flags out of sk_flags
sk_flags is reaching its limit. New timestamping options will not fit.
Move all of them into a new field sk->sk_tsflags.

Added benefit is that this removes boilerplate code to convert between
SOF_TIMESTAMPING_.. and SOCK_TIMESTAMPING_.. in getsockopt/setsockopt.

SOCK_TIMESTAMPING_RX_SOFTWARE is also used to toggle the receive
timestamp logic (netstamp_needed). That can be simplified and this
last key removed, but will leave that for a separate patch.

Signed-off-by: Willem de Bruijn <willemb@google.com>

----

The u16 in sock can be moved into a 16-bit hole below sk_gso_max_segs,
though that scatters tstamp fields throughout the struct.
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:35:54 -07:00
Willem de Bruijn f24b9be595 net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
Applications that request kernel tx timestamps with SO_TIMESTAMPING
read timestamps as recvmsg() ancillary data. The response is defined
implicitly as timespec[3].

1) define struct scm_timestamping explicitly and

2) add support for new tstamp types. On tx, scm_timestamping always
   accompanies a sock_extended_err. Define previously unused field
   ee_info to signal the type of ts[0]. Introduce SCM_TSTAMP_SND to
   define the existing behavior.

The reception path is not modified. On rx, no struct similar to
sock_extended_err is passed along with SCM_TIMESTAMPING.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:35:53 -07:00
Anish Bhatt a2b81b35f9 cxgb4i : Move stray CPL definitions to cxgb4 driver
These belong to the t4 msg header, will ensure there is no accidental code
duplication in the future

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:30:18 -07:00
Neal Cardwell 5ae344c949 tcp: reduce spurious retransmits due to transient SACK reneging
This commit reduces spurious retransmits due to apparent SACK reneging
by only reacting to SACK reneging that persists for a short delay.

When a sequence space hole at snd_una is filled, some TCP receivers
send a series of ACKs as they apparently scan their out-of-order queue
and cumulatively ACK all the packets that have now been consecutiveyly
received. This is essentially misbehavior B in "Misbehaviors in TCP
SACK generation" ACM SIGCOMM Computer Communication Review, April
2011, so we suspect that this is from several common OSes (Windows
2000, Windows Server 2003, Windows XP). However, this issue has also
been seen in other cases, e.g. the netdev thread "TCP being hoodwinked
into spurious retransmissions by lack of timestamps?" from March 2014,
where the receiver was thought to be a BSD box.

Since snd_una would temporarily be adjacent to a previously SACKed
range in these scenarios, this receiver behavior triggered the Linux
SACK reneging code path in the sender. This led the sender to clear
the SACK scoreboard, enter CA_Loss, and spuriously retransmit
(potentially) every packet from the entire write queue at line rate
just a few milliseconds before the ACK for each packet arrives at the
sender.

To avoid such situations, now when a sender sees apparent reneging it
does not yet retransmit, but rather adjusts the RTO timer to give the
receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs
that will restore sanity to the SACK scoreboard. If the reneging
persists until this RTO then, as before, we clear the SACK scoreboard
and enter CA_Loss.

A 10ms delay tolerates a receiver sending such a stream of ACKs at
56Kbit/sec. And to allow for receivers with slower or more congested
paths, we wait for at least RTT/2.

We validated the resulting max(RTT/2, 10ms) delay formula with a mix
of North American and South American Google web server traffic, and
found that for ACKs displaying transient reneging:

 (1) 90% of inter-ACK delays were less than 10ms
 (2) 99% of inter-ACK delays were less than RTT/2

In tests on Google web servers this commit reduced reneging events by
75%-90% (as measured by the TcpExtTCPSACKReneging counter), without
any measurable impact on latency for user HTTP and SPDY requests.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:29:33 -07:00
David S. Miller 61675fea33 Merge branch 'xen-netback-next'
Zoltan Kiss says:

====================
xen-netback: Changes around carrier handling

This series starts using carrier off as a way to purge packets when the guest is
not able (or willing) to receive them. It is a much faster way to get rid of
packets waiting for an overwhelmed guest.
The first patch changes current netback code where it relies currently on
netif_carrier_ok.
The second turns off the carrier if the guest times out on a queue, and only
turn it on again if that queue (or queues) resurrects.
====================

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:04:55 -07:00
Zoltan Kiss f34a4cf9c9 xen-netback: Turn off the carrier if the guest is not able to receive
Currently when the guest is not able to receive more packets, qdisc layer starts
a timer, and when it goes off, qdisc is started again to deliver a packet again.
This is a very slow way to drain the queues, consumes unnecessary resources and
slows down other guests shutdown.
This patch change the behaviour by turning the carrier off when that timer
fires, so all the packets are freed up which were stucked waiting for that vif.
Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to
signal the thread that either the timeout happened or an RX interrupt arrived,
so the thread can check what it should do. It also disables NAPI, so the guest
can't transmit, but leaves the interrupts on, so it can resurrect.
Only the queues which brought down the interface can enable it again, the bit
QUEUE_STATUS_RX_STALLED makes sure of that.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:04:46 -07:00
Zoltan Kiss 3d1af1df97 xen-netback: Using a new state bit instead of carrier
This patch introduces a new state bit VIF_STATUS_CONNECTED to track whether the
vif is in a connected state. Using carrier will not work with the next patch
in this series, which aims to turn the carrier temporarily off if the guest
doesn't seem to be able to receive packets.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org

v2:
- rename the bitshift type to "enum state_bit_shift" here, not in the next patch
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:04:46 -07:00
David S. Miller aef4f5b6db Merge tag 'master-2014-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
Conflicts:
	net/6lowpan/iphc.c

Minor conflicts in iphc.c were changes overlapping with some
style cleanups.

John W. Linville says:

====================
Please pull this last(?) batch of wireless change intended for the
3.17 stream...

For the NFC bits, Samuel says:

"This is a rather quiet one, we have:

- A new driver from ST Microelectronics for their NCI ST21NFCB,
  including device tree  support.

- p2p support for the ST21NFCA driver

- A few fixes an enhancements for the NFC digital laye"

For the Atheros bits, Kalle says:

"Michal and Janusz did some important RX aggregation fixes, basically we
were missing RX reordering altogether. The 10.1 firmware doesn't support
Ad-Hoc mode and Michal fixed ath10k so that it doesn't advertise Ad-Hoc
support with that firmware. Also he implemented a workaround for a KVM
issue."

For the Bluetooth bits, Gustavo and Johan say:

"To quote Gustavo from his previous request:

'Some last minute fixes for -next. We have a fix for a use after free in
RFCOMM, another fix to an issue with ADV_DIRECT_IND and one for ADV_IND with
auto-connection handling.  Last, we added support for reading the codec and
MWS setting for controllers that support these features.'

Additionally there are fixes to LE scanning, an update to conform to the 4.1
core specification as well as fixes for tracking the page scan state. All
of these fixes are important for 3.17."

And,

"We've got:

- 6lowpan fixes/cleanups
- A couple crash fixes, one for the Marvell HCI driver and another in LE SMP.
- Fix for an incorrect connected state check
- Fix for the bondable requirement during pairing (an issue which had
  crept in because of using "pairable" when in fact the actual meaning
  was "bondable" (these have different meanings in Bluetooth)"

Along with those are some late-breaking hardware support patches in
brcmfmac and b43 as well as a stray ath9k patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 13:18:20 -07:00
Ricardo Ribalda 61ab9efddf net/usb/hso: Add support for Option GTM671WFS
After this patch:

[   32.985530] hso: drivers/net/usb/hso.c: Option Wireless
[   33.000452] hso 2-1.4:1.7: Not our interface
[   33.001849] usbcore: registered new interface driver hso

root@qt5022:~# ls /dev/ttyHS*
/dev/ttyHS0  /dev/ttyHS1  /dev/ttyHS2  /dev/ttyHS3  /dev/ttyHS4
/dev/ttyHS5

root@qt5022:~# lsusb -d 0af0: -vvv

Bus 002 Device 003: ID 0af0:9200 Option
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol       255 Vendor Specific Protocol
  bMaxPacketSize0        64
  idVendor           0x0af0 Option
  idProduct          0x9200
  bcdDevice            0.00
  iManufacturer           3 Option N.V.
  iProduct                2 Globetrotter HSUPA Modem
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength          200
    bNumInterfaces          8
    bConfigurationValue     1
    iConfiguration          1 Option Configuration
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        2
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        3
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x84  EP 4 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x04  EP 4 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        4
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x85  EP 5 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x05  EP 5 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        5
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x06  EP 6 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x86  EP 6 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        6
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x87  EP 7 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x88  EP 8 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x07  EP 7 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        7
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x08  EP 8 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x89  EP 9 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol       255 Vendor Specific Protocol
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 13:06:04 -07:00
Hans Wennborg 45dfab6894 net: smc911x: fix %d confusingly prefixed with 0x in format string
Signed-off-by: Hans Wennborg <hans@hanshq.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 13:04:32 -07:00
Hans Wennborg 597aec3f93 drivers: atm: fix %d confusingly prefixed with 0x in format strings
Signed-off-by: Hans Wennborg <hans@hanshq.net>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 13:04:32 -07:00
Eric Dumazet 67a24ac18b netlink: fix lockdep splats
With netlink_lookup() conversion to RCU, we need to use appropriate
rcu dereference in netlink_seq_socket_idx() & netlink_seq_next()

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: e341694e3e ("netlink: Convert netlink_lookup() to use RCU protected hash table")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 22:58:06 -07:00
Dmitry Popov 64a124edcc tcp: md5: remove unneeded check in tcp_v4_parse_md5_keys
tcpm_key is an array inside struct tcp_md5sig, there is no need to check it
against NULL.

Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 15:09:45 -07:00
KY Srinivasan 06b47aac49 Drivers: net-next: hyperv: Increase the size of the sendbuf region
Intel did some benchmarking on our network throughput when Linux on Hyper-V
is as used as a gateway. This fix gave us almost a 1 Gbps additional throughput
on about 5Gbps base throughput we hadi, prior to increasing the sendbuf size.
The sendbuf mechanism is a copy based transport that we have which is clearly
more optimal than the copy-free page flipping mechanism (for small packets).
In the forwarding scenario, we deal only with MTU sized packets,
and increasing the size of the senbuf area gave us the additional performance.
For what it is worth, Windows guests on Hyper-V, I am told use similar sendbuf
size as well.

The exact value of sendbuf I think is less important than the fact that it needs
to be larger than what Linux can allocate as physically contiguous memory.
Thus the change over to allocating via vmalloc().

We currently allocate 16MB receive buffer and we use vmalloc there for allocation.
Also the low level channel code has already been modified to deal with physically
dis-contiguous memory in the ringbuffer setup.

Based on experimentation Intel did, they say there was some improvement in throughput
as the sendbuf size was increased up to 16MB and there was no effect on throughput
beyond 16MB. Thus I have chosen 16MB here.

Increasing the sendbuf value makes a material difference in small packet handling

In this version of the patch, based on David's feedback, I have added
additional details in the commit log.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 15:07:14 -07:00
Francois Romieu 9041263ca9 net: remove spurious zd1201 rule.
Leftover from 5c601d0c94 ("wireless: move
zd1201 where it belongs").

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 15:02:54 -07:00
Himangi Saraogi b32a8b6410 net: phy: spi_ks8995: Introduce the use of devm_kzalloc
This patch introduces the use of devm_kzalloc and does away with the
kfrees in the probe and remove functions. Also, a label is removed.

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 12:55:23 -07:00
Hariprasad Shenai 5fa766946b cxgb4: only free allocated fls
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 12:51:28 -07:00
Michael S. Tsirkin 4b7a9168e1 bridge: remove a useless comment
commit 6cbdceeb1c
    bridge: Dump vlan information from a bridge port
introduced a comment in an attempt to explain the
code logic. The comment is unfinished so it confuses more
than it explains, remove it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 12:46:51 -07:00
Anish Bhatt e6b92c25d2 cxgb4i : remove spurious use of rcu
As pointed out by the intel guys, there is no need to hold rcu read lock in
cxgbi_inet6addr_handler(), this patch removes it.

Fixes: 759a0cc5a3 ("cxgb4i: Add ipv6 code to driver, call into libcxgbi ipv6 api")
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 20:34:33 -07:00
David S. Miller bae2e81a69 Merge branch 'concurrent_hash_tables'
Thomas Graf says:

====================
Lockless netlink_lookup() with new concurrent hash table

Netlink sockets are maintained in a hash table to allow efficient lookup
via the port ID for unicast messages. However, lookups currently require
a read lock to be taken. This series adds a new generic, resizable,
scalable, concurrent hash table based on the paper referenced in the first
patch. It then makes use of the new data type to implement lockless
netlink_lookup().

Patch 3/3 to convert nft_hash is included for reference but should be
merged via the netfilter tree. Inclusion in this series is to provide
context for the suggested API.

Against net-next since the initial user of the new hash table is in net/

Changes:
v4-v5:
 - use GFP_KERNEL to alloc Netlink buckets as suggested by Nikolay
   Aleksandrov
 - free nft hash element on removal as spotted by Nikolay Aleksandrov
   and Patrick McHardy
v3-v4:
 - fixed wrong shift assignment placement as spotted by Nikolay Aleksandrov
 - reverted default size of nft_hash to 4 as requested by Patrick McHardy,
   default size for other hash tables remains at 64 if no hint is given
 - fixed copyright as requested by Patrick McHardy
v2-v3:
 - fixed typo in nft_hash_destroy() when passing rhashtable handle
v1-v2:
 - fixed traversal off-by-one as spotted by Tobias Klauser
 - removed unlikely() from BUG_ON() as spotted by Josh Triplett
 - new 3rd patch to convert nft_hash to rhashtable
 - make rhashtable_insert() return void
 - nl_sk_hash_lock must be a mutex
 - fixed wrong name of rht_shrink_below_30()
 - exported symbols rht_grow_above_75() and rht_shrink_below_30()
 - allow table freeing with RCU callback
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:49:47 -07:00
Thomas Graf cfe4a9dda0 nftables: Convert nft_hash to use generic rhashtable
The sizing of the hash table and the practice of requiring a lookup
to retrieve the pprev to be stored in the element cookie before the
deletion of an entry is left intact.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Patrick McHardy <kaber@trash.net>
Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:49:38 -07:00
Thomas Graf e341694e3e netlink: Convert netlink_lookup() to use RCU protected hash table
Heavy Netlink users such as Open vSwitch spend a considerable amount of
time in netlink_lookup() due to the read-lock on nl_table_lock. Use of
RCU relieves the lock contention.

Makes use of the new resizable hash table to avoid locking on the
lookup.

The hash table will grow if entries exceeds 75% of table size up to a
total table size of 64K. It will automatically shrink if usage falls
below 30%.

Also splits nl_table_lock into a separate mutex to protect hash table
mutations and allow synchronize_rcu() to sleep while waiting for readers
during expansion and shrinking.

Before:
   9.16%  kpktgend_0  [openvswitch]      [k] masked_flow_lookup
   6.42%  kpktgend_0  [pktgen]           [k] mod_cur_headers
   6.26%  kpktgend_0  [pktgen]           [k] pktgen_thread_worker
   6.23%  kpktgend_0  [kernel.kallsyms]  [k] memset
   4.79%  kpktgend_0  [kernel.kallsyms]  [k] netlink_lookup
   4.37%  kpktgend_0  [kernel.kallsyms]  [k] memcpy
   3.60%  kpktgend_0  [openvswitch]      [k] ovs_flow_extract
   2.69%  kpktgend_0  [kernel.kallsyms]  [k] jhash2

After:
  15.26%  kpktgend_0  [openvswitch]      [k] masked_flow_lookup
   8.12%  kpktgend_0  [pktgen]           [k] pktgen_thread_worker
   7.92%  kpktgend_0  [pktgen]           [k] mod_cur_headers
   5.11%  kpktgend_0  [kernel.kallsyms]  [k] memset
   4.11%  kpktgend_0  [openvswitch]      [k] ovs_flow_extract
   4.06%  kpktgend_0  [kernel.kallsyms]  [k] _raw_spin_lock
   3.90%  kpktgend_0  [kernel.kallsyms]  [k] jhash2
   [...]
   0.67%  kpktgend_0  [kernel.kallsyms]  [k] netlink_lookup

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:49:38 -07:00
Thomas Graf 7e1e77636e lib: Resizable, Scalable, Concurrent Hash Table
Generic implementation of a resizable, scalable, concurrent hash table
based on [0]. The implementation supports both, fixed size keys specified
via an offset and length, or arbitrary keys via own hash and compare
functions.

Lookups are lockless and protected as RCU read side critical sections.
Automatic growing/shrinking based on user configurable watermarks is
available while allowing concurrent lookups to take place.

Objects to be hashed must include a struct rhash_head. The reason for not
using the existing struct hlist_head is that the expansion and shrinking
will have two buckets point to a single entry which would lead in obscure
reverse chaining behaviour.

Code includes a boot selftest if CONFIG_TEST_RHASHTABLE is defined.

[0] https://www.usenix.org/legacy/event/atc11/tech/final_files/Triplett.pdf

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:49:38 -07:00
David S. Miller d39a9ffce7 Merge branch 'intel-next'
Aaron Brown says:

====================
Intel Wired LAN Driver Updates

This series contains updates to the i40e and i40evf drivers.

Vasu adds FCOE support, build options and a documentation pointer to i40e.

Shannon exposes a Firmware API request used to do register writes on the
driver's behalf and disables local loopback on VMDQ VSI in order to stop the
VEB from echoing the VMDQ packets back at the VSI.

Ashish corrects the vf_id offset for virtchnl messages in the case of multiple
PFs, removes support for vf unicast promiscuos mode to disallow VFs from
receiving traffic intended for another VF, updates the vfr_stat state check to
handle the existing and future mechanism and adds an adapter state check to
prevent re-arming the watchdog timer after i40evf_remove has been called and
the timer has been deleted.

Serey fixes an issue where a guest OS would panic when removing the vf driver
while the device is being reset due to an attempt to clean a non initialized
mac_filter_list.

Akeem makes a minor comment change.

Jessie changes an instance of sprintf to snprintf that was missed when the
driver was converted to use snprintf everywhere.

Mitch plugs a few memory leaks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:24 -07:00
Serey Kong 8bb1a54045 i40evf: Fixed guest OS panic when removing vf driver
Removing VF driver during device still in reset caused guest OS panic.

in the i40evf_remove(), we're trying to clean mac_filter_list which has
not been initialized since the device is still stuck at the reset.
The change is to initialize the filter_list before setting any task.

Change-ID: I8b59df7384416c7e6f2d264b598f447e1c2c92b0
Signed-off-by: Serey Kong <serey.kong@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:15 -07:00
Mitch Williams 6ba36a246e i40evf: fix memory leak on unused interfaces
If the driver is loaded and then unloaded before the interface is
brought up, then it will allocate a MAC filter entry and never free it.
To fix this, on unload, run through the mac filter list and free all the
entries. We also do this during reset recovery when the driver cannot
contact the PF and needs to shut down completely.

Change-ID: I15fabd67eb4a1bfc57605a7db60d0b5d819839db
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:15 -07:00
Mitch Williams d31944d6f0 i40evf: don't leak queue vectors
Fix a memory leak. Driver was allocating memory for queue vectors on
init but not freeing them on shutdown. These need to be freed at two
different times: during module unload, and during reset recovery when
the driver cannot contact the PF driver and needs to give up.

Change-ID: I7c1d0157a776e960d4da432dfe309035aad7c670
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:15 -07:00
Ashish Shah d3e2edb70e i40evf: do not re-arm watchdog after remove
Add in an adapter state check to prevent re-arming watchdog timer after
i40evf_remove has been called and timer has been deleted.

Change-ID: I636ba7c6322be8cbf053231959f90c0a2d8d803a
Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:14 -07:00
Ashish Shah fd35886ad3 i40evf: future-proof vfr_stat state check
Previously defined state I40E_VFR_VFACTIVE uses bit 1 which is now set to
"reserved."  Update the state checks to also include I40E_VFR_COMPLETED.
This change will allow the VF to work with both existing and future PFs.

Change-ID: Ifd1d34f79f3b0ffd6d2550ee4dadc55825ff52f8
Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:14 -07:00
Ashish Shah 89cb86c3b2 i40e: remove support for vf unicast promiscuous mode
Remove the ability of a VF to set unicast promiscuous mode.
Considered to be a security risk to allow VFs to receive traffic
intended for other VFs so don't allow it, simply ignore the flag.

Also fix it to send the correct seid to aq for multicast promiscuous set.

Change-ID: Icb9c49a281a8e9d3aeebf991ef1533ac82b84b14
Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Tested-by: Jim Young  <jamesx.m.young@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:14 -07:00
Akeem G Abodunrin 84a9208d9e i40e: Minor comment changes
Fixes comment for reset reason

Change-ID: I6fda4fa292255e6eb0f874502b4d38d722149b10
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:14 -07:00
Jesse Brandeburg b39c1e2c58 i40evf: fix scan warning on sprintf
The driver was converted to use snprintf everywhere but this one function.
Just use snprintf, instead of sprintf.

Also a small spelling correction in a comment.

Change-ID: I59d45f94a52754c7b4cd6034df9a61d8132b7f77
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:14 -07:00
Shannon Nelson 738abbac9b i40e: disable local loopback on vmdq vsi
The local loopback should only be enabled for VSIs that are supporting
cascaded VEBs or VEPA setups.  This is not the case here, and we need
to stop the VEB from echoing the VMDQ VSI packets back at the VSI.

Change-ID: I9dfb6ac79db24d04360d7efde62d81e20abc5090
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:14 -07:00
Ashish Shah f19efbb5ef i40e: use correct vf_id offset for virtchnl message
The vf_id needs to be offset by the vf_base_id from hw function capabilities
for the case of multiple PFs.

Change-ID: I20ca8621f98e9cdf98649380b8eeaa35db52677c
Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:14 -07:00
Shannon Nelson 53db45cd9a i40e: expose debug_write_register request
Now that the HW registers are no longer in debug mode and many are
locked down for writes, we need to expose the Firmware API request
used to do writes on the driver's behalf.

Change-ID: I09a05c4dc9ea0b24c00193faac34d7799eaa8496
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:13 -07:00
Vasu Dev 38758f552d i40e: adds FCoE to build and updates its documentation
Adds newly added FCoE files to the build but only if FCoE module is configured.

Also, updates i40e document for added FCoE support.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Jack Morgan<jack.morgan@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:13 -07:00
Vasu Dev 38e0043886 i40e: Adds FCoE related code to i40e core driver
Adds FCoE specific code to existing i40e core driver to:-

1. have separate FCoE VSI with additional FCoE queues pairs.
2. have FCoE related hash defines.
3. have additional FCoE related stats code.
4. export and then re-use existing functions required by FCoE build.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Jack Morgan<jack.morgan@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:13 -07:00
Vasu Dev a1a693698d i40e: adds FCoE code to the i40e driver
This patch adds FCoE ( Fibre Channel Over Ethernet ) code for
Intel XL710 adapters. This patch is limited to only new FCoE
offloads code in newly added files by this patch and then
following patches in the series modifies rest of the existing
driver to enable FCoE with i40e driver.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Jack Morgan<jack.morgan@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:41:13 -07:00
David S. Miller a4f090fda3 Merge branch 'amd-xgbe-next'
Tom Lendacky says:

====================
amd-xgbe: AMD XGBE driver update 2014-08-01

The following series of patches includes minor fixes/updates to the
driver.

- Remove some uses of spinlock around ethtool/phylib areas
- Update Rx/Tx ready check logic
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:29:59 -07:00
Lendacky, Thomas 1fa1f2e098 amd-xgbe-phy: Allow more time for Rx/Tx to become ready
The current time range waiting for Rx/Tx to become ready can sometimes
be too short if a connection is not present.  Increase the number of
retries and the sleep to give a bit more time. Also, change level of
the message issued from _err to _dbg if Rx/Tx do not become ready
since the underlying logic will function as if no link is established
and retry eventually.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:29:53 -07:00
Lendacky, Thomas 8c43a2cc75 amd-xgbe: Remove unnecessary spinlocks
Remove the spinlocks around the ethtool get and set settings
functions and within the link adjustment callback routine.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 19:29:53 -07:00
Himangi Saraogi ae29223eaf net: dnet: Use managed interfaces
This patch introduces the use of managed interfaces like
devm_ioremap_resource and does away with the calls to free the
allocated memory in the probe and remove functions. Also, some
labels and variable are done away with. This fixes a bug as there
was a missing release_mem_region in the remove function.

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 16:40:52 -07:00
Himangi Saraogi 54789983d1 net: ks8851-ml: Use devm_ioremap_resource
This patch introduces the use of devm_ioremap_resource, devm_kmalloc and
does away with the functions to free the allocated memory in the probe
and remove functions. Also, some labels are done away with. A bug is
fixed as two regions are allocated in the probe function, but only one
is freed in the remove function.

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 16:40:52 -07:00
Himangi Saraogi 6751edeb87 cirrus: cs89x0: Use managed interfaces
This patch introduces the use of managed interfaces like
devm_ioremap_resource and does away with the functions to free the
allocated memory in the probe and remove functions. Also, many labels
are done away with. The field size in no longer needed and is hence
removed from the struct net_local.

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 16:40:52 -07:00
Hisashi Nakamura 0f76b9d83b net: sh_eth: Add r8a7794 support
Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
[uli: added bindings documentation]
Signed-off-by: Ulrich Hecht <ulrich.hecht+renesas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 16:38:32 -07:00
David S. Miller 58e70b5940 Merge branch 'be2net-next'
Sathya Perla says:

====================
be2net: patch set

Patch 1 fixes a regression caused by a previous commit on net-next.
Old versions of BE3 FW may not support cmds to re-provision (and hence
optimize) resources/queues in SR-IOV config. Do not treat this FW cmd
failure as fatal and fail the function initialization. Instead, just
enable SR-IOV with the resources provided by the FW.

Patch 2 ignores a VF mac address setting if the new mac is already active
on the VF.

Patch 3 adds support to delete a FW-dump via ethtool on Lancer adapters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:59:22 -07:00
Kalesh AP f061338015 be2net: support deleting FW dump via ethtool (only for Lancer)
This patch adds support to delete an existing FW-dump in Lancer via ethtool.
Initiating a new dump is not allowed if a FW dump is already present in the
adapter. The existing dump has to be first explicitly deleted.

Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:59:18 -07:00
Vasundhara Volam 3c31aaf340 be2net: ignore VF mac address setting for the same mac
ndo_set_vf_mac() call may be issued for a mac-addr that is already
active on a VF. If so, silently ignore the request.

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:59:17 -07:00