Commit graph

504708 commits

Author SHA1 Message Date
David S. Miller
c0eebfa323 Merge branch 'rhashtable'
Daniel Borkmann says:

====================
rhashtable updates

As discussed, I'm sending out rhashtable fixups for -net.

I have a couple of more patches I was working on last week pending,
i.e. to get rid of ht->nelems and ht->shift atomic operations which
speed-up pure insertions/deletions, e.g. on my laptop I have 2 threads,
inserting 7M entries each, that will reduce insertion time from ~1,450 ms
to 865 ms (performance should even be better after removing the
grow/shrink indirections). I guess that however is rather something
for net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:06:21 -05:00
Daniel Borkmann
4c4b52d9b2 rhashtable: remove indirection for grow/shrink decision functions
Currently, all real users of rhashtable default their grow and shrink
decision functions to rht_grow_above_75() and rht_shrink_below_30(),
so that there's currently no need to have this explicitly selectable.

It can/should be generic and private inside rhashtable until a real
use case pops up. Since we can make this private, we'll save us this
additional indirection layer and can improve insertion/deletion time
as well.

Reference: http://patchwork.ozlabs.org/patch/443040/
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:06:02 -05:00
Daniel Borkmann
8331de75cb rhashtable: unconditionally grow when max_shift is not specified
While commit c0c09bfdc4 ("rhashtable: avoid unnecessary wakeup for
worker queue") rightfully moved part of the decision making of
whether we should expand or shrink from the expand/shrink functions
themselves into insert/delete functions in order to avoid unnecessary
worker wake-ups, it however introduced a regression by doing so.

Before that change, if no max_shift was specified (= 0) on rhashtable
initialization, rhashtable_expand() would just grow unconditionally
and lets the available memory be the limiting factor. After that
change, if no max_shift was specified, there would be _no_ expansion
step at all.

Given that netlink and tipc have a max_shift specified, it was not
visible there, but Josh Hunt reported that if nft that starts out
with a default element hint of 3 if not otherwise provided, would
slow i.e. inserts down trememdously as it cannot grow larger to
relax table occupancy.

Given that the test case verifies shrinks/expands manually, we also
must remove pointer to the helper functions to explicitly avoid
parallel resizing on insertions/deletions. test_bucket_stats() and
test_rht_lookup() could also be wrapped around rhashtable mutex to
explicitly synchronize a walk from resizing, but I think that defeats
the actual test case which intended to have explicit test steps,
i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object
verification after each stage.

Reported-by: Josh Hunt <johunt@akamai.com>
Fixes: c0c09bfdc4 ("rhashtable: avoid unnecessary wakeup for worker queue")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Josh Hunt <johunt@akamai.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:06:02 -05:00
Michael S. Tsirkin
0d79a493e5 vhost: drop hard-coded num_buffers size
The 2 that we use for copy_to_iter comes from sizeof(u16),
it used to be that way before the iov iter update.
Fix it up, making it obvious the size of stack access
is right.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Michael S. Tsirkin
4c5a84421c vhost: cleanup iterator update logic
Recent iterator-related changes in vhost made it
harder to follow the logic fixing up the header.
In fact, the fixup always happens at the same
offset: sizeof(virtio_net_hdr): sometimes the
fixup iterator is updated by copy_to_iter,
sometimes-by iov_iter_advance.

Rearrange code to make this obvious.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Dan Carpenter
5f2ebfbee6 rocker: silence shift wrapping warning
"val" is declared as a u64 so static checkers complain that this shift
can wrap.  I don't have the hardware but probably it's doesn't have over
31 ports.  Still we may as well silence the warning even if it's not a
real bug.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Dan Carpenter
e65ad3be86 rocker: add a check for NULL in rocker_probe_ports()
Make sure kmalloc() succeeds.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:43 -05:00
Hariprasad Shenai
f01aa633e0 cxgb4: Fix PCI-E Memory window interface for big-endian systems
When doing reads and writes to adapter memory via the PCI-E Memory Window
interface, data gets swizzled on 4-byte boundaries on Big-Endian systems
because we need to account for the register read/write interface which
incorporates a swizzle onto the Little-Endian PCI-E Bus.

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:43 -05:00
Sujith Sankar
2b0c2e2d2a enic: do notify_check before returning credits
We should complete notify_check before returning the credits. Once we return the
credits, adaptor may access the notify data.

Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:43 -05:00
Andy Gospodarek
31639b94ca MAINTAINERS: update my email address
I have been signing off on patches with this address so I'll change it.

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-25 17:01:40 -05:00
Tom Lendacky
74ad752442 amd-xgbe-phy: PHY KX/KR mode differences
The PHY requires different settings for the Decision Feedback Analyzer
(DFE) when running in KX mode vs. KR mode. Update the code to change
these settings when changing modes in order to provide a more stable
link.

Additionally, adjust the 10GbE PQ skew default setting to a more sane
value.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-25 16:57:42 -05:00
Yannick Guerrini
5c2d2b148b r8169: Fix trivial typo in rtl_check_firmware
Change 'firwmare' to 'firmware'

Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-24 16:26:07 -05:00
David Vrabel
7fbb9d8415 xen-netback: release pending index before pushing Tx responses
If the pending indexes are released /after/ pushing the Tx response
then a stale pending index may be used if a new Tx request is
immediately pushed by the frontend.  The may cause various WARNINGs or
BUGs if the stale pending index is actually still in use.

Fix this by releasing the pending index before pushing the Tx
response.

The full barrier for the pending ring update is not required since the
the Tx response push already has a suitable write barrier.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-24 16:24:22 -05:00
Alexander Drozdov
41a50d621a af_packet: don't pass empty blocks for PACKET_V3
Before da413eec72 ("packet: Fixed TPACKET V3 to signal poll when block is
closed rather than every packet") poll listening for an af_packet socket was
not signaled if there was no packets to process. After the patch poll is
signaled evety time when block retire timer expires. That happens because
af_packet closes the current block on timeout even if the block is empty.

Passing empty blocks to the user not only wastes CPU but also wastes ring
buffer space increasing probability of packets dropping on small timeouts.

Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Cc: Dan Collins <dan@dcollins.co.nz>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Guy Harris <guy@alum.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-24 16:08:42 -05:00
Sasha Levin
4e10fd5b4a rtnetlink: avoid 0 sized arrays
Arrays (when not in a struct) "shall have a value greater than zero".

GCC complains when it's not the case here.

Fixes: ba7d49b1f0 ("rtnetlink: provide api for getting and setting slave info")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-24 15:39:09 -05:00
Marcelo Leitner
77751427a1 ipv6: addrconf: validate new MTU before applying it
Currently we don't check if the new MTU is valid or not and this allows
one to configure a smaller than minimum allowed by RFCs or even bigger
than interface own MTU, which is a problem as it may lead to packet
drops.

If you have a daemon like NetworkManager running, this may be exploited
by remote attackers by forging RA packets with an invalid MTU, possibly
leading to a DoS. (NetworkManager currently only validates for values
too small, but not for too big ones.)

The fix is just to make sure the new value is valid. That is, between
IPV6_MIN_MTU and interface's MTU.

Note that similar check is already performed at
ndisc_router_discovery(), for when kernel itself parses the RA.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 18:16:12 -05:00
Vlastimil Setka
8d4ac39df0 altera_tse: Fixes in NAPI and interrupt handling paths
Incorrect NAPI polling caused WARNING at net/core/dev.c net_rx_action.
Some stability issues were also seen at high throughput and system
load before this patch.

This patch contains several changes in altera_tse_main.c:

- tse_rx() is fixed to not process more than `limit` frames

- tse_poll() is refactored to match NAPI logic
  - only received frames are counted for return value
  - removed bogus condition `(rxcomplete >= budget || txcomplete > 0)`
  - replace by: if (rxcomplete < budget) -> call __napi_complete and enable irq

- altera_isr()
  - replace spin_lock_irqsave() by spin_lock() - we are in isr
  - use spinlocks just over irq manipulation, not over __napi_schedule
  - reset IRQ first, then disable and schedule napi

This is a cleaned up resubmission from Vlastimil's recent submission.

Signed-off-by: Vlastimil Setka <setka@vsis.cz>
Signed-off-by: Roman Pisl <rpisl@kky.zcu.cz>
Signed-off-by: Vince Bridgers <vbridger@opensource.altera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 18:07:36 -05:00
Vlastimil Setka
fe6e4081a6 altera_tse: Correct typo in obtaining tx_fifo_depth from devicetree
This patch corrects a typo in the way tx_fifo_depth is read from the
devicetree. This patch was submitted by Vlastimil about a week ago,
and is now cleaned up and resubmitted.

Signed-off-by: Vlastimil Setka <setka@vsis.cz>
Signed-off-by: Vince Bridgers <vbridger@opensource.altera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 18:07:35 -05:00
Catalin Marinas
d720d8cec5 net: compat: Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg
With commit a7526eb5d0 (net: Unbreak compat_sys_{send,recv}msg), the
MSG_CMSG_COMPAT flag is blocked at the compat syscall entry points,
changing the kernel compat behaviour from the one before the commit it
was trying to fix (1be374a051, net: Block MSG_CMSG_COMPAT in
send(m)msg and recv(m)msg).

On 32-bit kernels (!CONFIG_COMPAT), MSG_CMSG_COMPAT is 0 and the native
32-bit sys_sendmsg() allows flag 0x80000000 to be set (it is ignored by
the kernel). However, on a 64-bit kernel, the compat ABI is different
with commit a7526eb5d0.

This patch changes the compat_sys_{send,recv}msg behaviour to the one
prior to commit 1be374a051.

The problem was found running 32-bit LTP (sendmsg01) binary on an arm64
kernel. Arguably, LTP should not pass 0xffffffff as flags to sendmsg()
but the general rule is not to break user ABI (even when the user
behaviour is not entirely sane).

Fixes: a7526eb5d0 (net: Unbreak compat_sys_{send,recv}msg)
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 17:22:05 -05:00
Fabian Frederick
a948f8ce77 irda: replace current->state by set_current_state()
Use helper functions to access current->state.
Direct assignments are prone to races and therefore buggy.

current->state = TASK_RUNNING can be replaced by __set_current_state()

Thanks to Peter Zijlstra for the exact definition of the problem.

Suggested-By: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 17:21:11 -05:00
Jamal Hadi Salim
30ff547659 net: sched: export tc_connmark.h so it is uapi accessible
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 16:41:08 -05:00
Jiri Pirko
57e5956319 team: fix possible null pointer dereference in team_handle_frame
Currently following race is possible in team:

CPU0                                        CPU1
                                            team_port_del
                                              team_upper_dev_unlink
                                                priv_flags &= ~IFF_TEAM_PORT
team_handle_frame
  team_port_get_rcu
    team_port_exists
      priv_flags & IFF_TEAM_PORT == 0
    return NULL (instead of port got
                 from rx_handler_data)
                                              netdev_rx_handler_unregister

The thing is that the flag is removed before rx_handler is unregistered.
If team_handle_frame is called in between, team_port_exists returns 0
and team_port_get_rcu will return NULL.
So do not check the flag here. It is guaranteed by netdev_rx_handler_unregister
that team_handle_frame will always see valid rx_handler_data pointer.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Fixes: 3d249d4ca7 ("net: introduce ethernet teaming device")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 15:30:28 -05:00
Rasmus Villemoes
46b9e4bb76 decnet: Fix obvious o/0 typo
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 15:28:50 -05:00
Sasha Levin
71bb0012c3 rhashtable: initialize all rhashtable walker members
Commit f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*") forgot to
initialize the members of struct rhashtable_walker after allocating it, which
caused an undefined value for 'resize' which is used later on.

Fixes: f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 15:23:19 -05:00
David S. Miller
85689b24c5 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:

====================
pull request: bluetooth 2015-02-23

Here's one important fix for the 4.0-rc series. Refactoring of Intel
Bluetooth controller detection ended up disabling some older ones which
are based on CSR hardware. This patch re-introduces the necessary USB id
and fixes the breakage.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 15:05:39 -05:00
Marcel Holtmann
407550fe2c Bluetooth: btusb: Fix issue with CSR based Intel Wireless controllers
Older Wireless controllers from Intel used CSR chips to provide support
for Bluetooth.

The commit d0ac9eb72 (Bluetooth: btusb: Ignore unknown Intel devices
with generic descriptor) disabled these older controllers. To enable
them again, put them into the blacklist and mark them clearly as CSR
based controllers.

T:  Bus=02 Lev=02 Prnt=02 Port=05 Cnt=01 Dev#=  3 Spd=12   MxCh= 0
D:  Ver= 2.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=8087 ProdID=07da Rev=78.69
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=  0mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms

Reported-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-02-23 09:30:35 +02:00
Neal Cardwell
6514890f7a tcp: fix tcp_should_expand_sndbuf() to use tcp_packets_in_flight()
tcp_should_expand_sndbuf() does not expand the send buffer if we have
filled the congestion window.

However, it should use tcp_packets_in_flight() instead of
tp->packets_out to make this check.

Testing has established that the difference matters a lot if there are
many SACKed packets, causing a needless performance shortfall.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 23:07:11 -05:00
Eric Dumazet
52d6c8c6ca net: pktgen: disable xmit_clone on virtual devices
Trying to use burst capability (aka xmit_more) on a virtual device
like bonding is not supported.

For example, skb might be queued multiple times on a qdisc, with
various list corruptions.

Fixes: 38b2cf2982 ("net: pktgen: packet bursting via skb->xmit_more")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 22:43:20 -05:00
David S. Miller
87cda7cb43 r8169: Revert BQL and xmit_more support.
There are certain regressions which are pointing to
these two commits which we are having a hard time
resolving.  So revert them for now.

Specifically this reverts:

	commit 0bec3b700d
	Author: Florian Westphal <fw@strlen.de>
	Date:   Wed Jan 7 10:49:49 2015 +0100

	    r8169: add support for xmit_more

and

	commit 1e91887685
	Author: Florian Westphal <fw@strlen.de>
	Date:   Wed Oct 1 13:38:03 2014 +0200

	    r8169: add support for Byte Queue Limits

There were some attempts by Eric Dumazet to address some obvious
problems in the TX flow, to see if they would fix the problems,
but none of them seem to help for the regression reporters.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 15:54:29 -05:00
Fabian Frederick
2c45015a66 wan: cosa: replace current->state by set_current_state()
Use helper functions to access current->state.
Direct assignments are prone to races and therefore buggy.

current->state = TASK_RUNNING is replaced by __set_current_state()

Thanks to Peter Zijlstra for the exact definition of the problem.

Suggested-By: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-By: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 15:24:10 -05:00
Fabian Frederick
50462ce005 hso: replace current->state by __set_current_state()
Use helper functions to access current->state.
Direct assignments are prone to races and therefore buggy.

Thanks to Peter Zijlstra for the exact definition of the problem.

Suggested-By: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 15:24:10 -05:00
Fabian Frederick
45cee4f594 mISDN: replace current->state by set_current_state()
Use helper function to access current->state.
Direct assignments are prone to races and therefore buggy.

Thanks to Peter Zijlstra for the exact definition of the problem.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 15:24:10 -05:00
Alexander Drozdov
3f34b24a73 af_packet: allow packets defragmentation not only for hash fanout type
Packets defragmentation was introduced for PACKET_FANOUT_HASH only,
see 7736d33f42 ("packet: Add pre-defragmentation support for ipv4
fanouts")

It may be useful to have defragmentation enabled regardless of
fanout type. Without that, the AF_PACKET user may have to:
1. Collect fragments from different rings
2. Defragment by itself

Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-21 23:00:18 -05:00
Eric Dumazet
b9ebafbe8c rhashtable: ensure cache line alignment on bucket_table
struct bucket_table contains mostly read fields :

size, locks_mask, locks.

Make sure these are not sharing a cache line with buckets[]

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-21 21:59:57 -05:00
Matthew Thode
a4176a9391 net: reject creation of netdev names with colons
colons are used as a separator in netdev device lookup in dev_ioctl.c

Specific functions are SIOCGIFTXQLEN SIOCETHTOOL SIOCSIFNAME

Signed-off-by: Matthew Thode <mthode@mthode.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-21 21:45:25 -05:00
Florian Fainelli
ddede6d536 net: dsa: bcm_sf2: fix 64-bits register reads
Reading 64-bits register was not working because we inverted the steps
between reading the lower 32-bits of the register and reading the upper
32-bits. Swapping these operations is how the HW guarantees that 64-bits
reads are latched correctly. We only have a handful of 64-bits registers
for now, mostly MIB counters, so the imapct is low.

Fixes: 246d7f773c ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:52:46 -05:00
Daniel Borkmann
6dd0c1655b rhashtable: allow to unload test module
There's no good reason why to disallow unloading of the rhashtable
test case module.

Commit 9d6dbe1bba moved the code from a boot test into a stand-alone
module, but only converted the subsys_initcall() handler into a
module_init() function without a related exit handler, and thus
preventing the test module from unloading.

Fixes: 9d6dbe1bba ("rhashtable: Make selftest modular")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:38:10 -05:00
Daniel Borkmann
eb6d1abf1b rhashtable: better high order allocation attempts
When trying to allocate future tables via bucket_table_alloc(), it seems
overkill on large table shifts that we probe for kzalloc() unconditionally
first, as it's likely to fail.

Only probe with kzalloc() for more reasonable table sizes and use vzalloc()
either as a fallback on failure or directly in case of large table sizes.

Fixes: 7e1e77636e ("lib: Resizable, Scalable, Concurrent Hash Table")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:38:09 -05:00
Daniel Borkmann
342100d937 rhashtable: don't test for shrink on insert, expansion on delete
Restore pre 54c5b7d311 behaviour and only probe for expansions on inserts
and shrinks on deletes. Currently, it will happen that on initial inserts
into a sparse hash table, we may i.e. shrink it first simply because it's
not fully populated yet, only to later realize that we need to grow again.

This however is counter intuitive, e.g. an initial default size of 64
elements is already small enough, and in case an elements size hint is given
to the hash table by a user, we should avoid unnecessary expansion steps,
so a shrink is clearly unintended here.

Fixes: 54c5b7d311 ("rhashtable: introduce rhashtable_wakeup_worker helper function")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ying Xue <ying.xue@windriver.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:38:09 -05:00
David S. Miller
ee92259849 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains updates for your net tree, they are:

1) Fix removal of destination in IPVS when the new mixed family support
   is used, from Alexey Andriyanov via Simon Horman.

2) Fix module refcount undeflow in nft_compat when reusing a match /
   target.

3) Fix iptables-restore when the recent match is used with a new hitcount
   that exceeds threshold, from Florian Westphal.

4) Fix stack corruption in xt_socket due to using stack storage to save
   the inner IPv6 header, from Eric Dumazet.

I'll follow up soon with another batch with more fixes that are still
cooking.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:36:20 -05:00
Dan Carpenter
278f7b4fff caif: fix a signedness bug in cfpkt_iterate()
The cfpkt_iterate() function can return -EPROTO on error, but the
function is a u16 so the negative value gets truncated to a positive
unsigned short.  This causes a static checker warning.

The only caller which might care is cffrml_receive(), when it's checking
the frame checksum.  I modified cffrml_receive() so that it never says
-EPROTO is a valid checksum.

Also this isn't ever going to be inlined so I removed the "inline".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:35:14 -05:00
Anish Bhatt
5a8eeec468 cxgb4: Fix incorrect 'c' suffix to %pI4, use %pISc instead
Issue caught by 0-day kernel test infrastructure. Code changed to use sockaddr
members so that %pISc can be used instead.

Fixes: b5a02f503c ('cxgb4 : Update ipv6 address handling api')

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:25:52 -05:00
Rami Rosen
65e9256c4e ethtool: Add hw-switch-offload to netdev_features_strings.
commit aafb3e98b2 (netdev: introduce new NETIF_F_HW_SWITCH_OFFLOAD feature
flag for switch device offloads) add a new feature without adding it to
netdev_features_strings array; this patch fixes this.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 16:36:43 -05:00
Mahesh Bandewar
f4c2b7a081 ipvlan: Fix text that talks about ip util support
ipvlan was added into 3.19 release and iproute2 added support
for the same in iproute2-3.19 package.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 16:35:44 -05:00
Daniel Borkmann
b7f5e5c7f8 rhashtable: don't allocate ht structure on stack in test_rht_init
With object runtime debugging enabled, the rhashtable test suite
will rightfully throw a warning "ODEBUG: object is on stack, but
not annotated" from rhashtable_init().

This is because run_work is (correctly) being initialized via
INIT_WORK(), and not annotated by INIT_WORK_ONSTACK(). Meaning,
rhashtable_init() is okay as is, we just need to move ht e.g.,
into global scope.

It never triggered anything, since test_rhashtable is rather a
controlled environment and effectively runs to completion, so
that stack memory is not vanishing underneath us, we shouldn't
confuse any testers with it though.

Fixes: 7e1e77636e ("lib: Resizable, Scalable, Concurrent Hash Table")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 16:33:30 -05:00
Arnd Bergmann
bf60e50cb3 net/appletalk: LTPC needs virt_to_bus
The ltpc driver is rather outdated and does not get built on most
platforms because it requires the ISA_DMA_API symbol. However
there are some ARM platforms that have ISA_DMA_API but no virt_to_bus,
and they get this build error when enabling the ltpc driver.

drivers/net/appletalk/ltpc.c: In function 'handlefc':
drivers/net/appletalk/ltpc.c:380:2: error: implicit declaration of function 'virt_to_bus' [-Werror=implicit-function-declaration]
  set_dma_addr(dma,virt_to_bus(ltdmacbuf));
  ^

This adds another dependency in Kconfig to avoid that configuration.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 16:28:46 -05:00
Arnd Bergmann
15added6a5 net: smc91x: improve neponset hack
The smc91x driver tries to support multiple platforms at compile
time, but they are mutually exclusive at runtime, and not clearly
defined.

Trying to build for CONFIG_SA1100_ASSABET without CONFIG_ASSABET_NEPONSET
results in this link error:

drivers/built-in.o: In function `smc_drv_probe':
:(.text+0x33310c): undefined reference to `neponset_ncr_frob'

since the neponset_ncr_set function is not defined otherwise.

Similarly, building for both CONFIG_SA1100_ASSABET and CONFIG_SA1100_PLEB
results in a different build error:

smsc/smc91x.c: In function 'smc_drv_probe':
smsc/smc91x.c:2299:2: error: implicit declaration of function 'neponset_ncr_set' [-Werror=implicit-function-declaration]
  neponset_ncr_set(NCR_ENET_OSC_EN);
  ^
smsc/smc91x.c:2299:19: error: 'NCR_ENET_OSC_EN' undeclared (first use in this function)
  neponset_ncr_set(NCR_ENET_OSC_EN);
                   ^

This is an attempt to fix the call site responsible for both
errors, making sure we call the function exactly when the driver
is actually trying to run on the assabet/neponset machine. With
this patch, I no longer see randconfig build errors in this file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 16:23:27 -05:00
Eric Dumazet
997d5c3f44 sock: sock_dequeue_err_skb() needs hard irq safety
Non NAPI drivers can call skb_tstamp_tx() and then sock_queue_err_skb()
from hard IRQ context.

Therefore, sock_dequeue_err_skb() needs to block hard irq or
corruptions or hangs can happen.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 364a9e9324 ("sock: deduplicate errqueue dequeue")
Fixes: cb820f8e4b ("net: Provide a generic socket error queue delivery method for Tx time stamps.")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 15:52:21 -05:00
Geert Uytterhoeven
846cd66788 net: Initialize all members in skb_gro_remcsum_init()
skb_gro_remcsum_init() initializes the gro_remcsum.delta member only,
leading to compiler warnings about a possibly uninitialized
gro_remcsum.offset member:

drivers/net/vxlan.c: In function ‘vxlan_gro_receive’:
drivers/net/vxlan.c:602: warning: ‘grc.offset’ may be used uninitialized in this function
net/ipv4/fou.c: In function ‘gue_gro_receive’:
net/ipv4/fou.c:262: warning: ‘grc.offset’ may be used uninitialized in this function

While these are harmless for now:
  - skb_gro_remcsum_process() sets offset before changing delta,
  - skb_gro_remcsum_cleanup() checks if delta is non-zero before
    accessing offset,
it's safer to let the initialization function initialize all members.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 15:49:57 -05:00
Derrick Pallas
f81edc6ac1 ethernet/ixp4xx: prevent allmulti from clobbering promisc
If both promisc and allmulti are set, promisc should trump allmulti and
disable the MAC filter; otherwise, the interface is not really promisc.

Previously, this code checked IFF_ALLMULTI prior to and without regard for
IFF_PROMISC; if both were set, only multicast and direct unicast traffic
would make it through the filter.

Signed-off-by: Derrick Pallas <pallas@meraki.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 15:49:07 -05:00