Commit graph

519986 commits

Author SHA1 Message Date
Robert Shearman 25cc8f0763 mpls: fix possible use after free of device
The mpls device is used in an RCU read context without a lock being
held. As the memory is freed without waiting for the RCU grace period
to elapse, the freed memory could still be in use.

Address this by using kfree_rcu to free the memory for the mpls device
after the RCU grace period has elapsed.

Fixes: 03c57747a7 ("mpls: Per-device MPLS state")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 19:37:27 -07:00
Sriharsha Basavapatna e51000db4c be2net: Replace dma/pci_alloc_coherent() calls with dma_zalloc_coherent()
There are several places in the driver (all in control paths) where
coherent dma memory is being allocated using either dma_alloc_coherent()
or the deprecated pci_alloc_consistent(). All these calls should be
changed to use dma_zalloc_coherent() to avoid uninitialized fields in
data structures backed by this memory.

Reported-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 15:35:11 -07:00
Wilson Kok 1d7c49037b bridge: use _bh spinlock variant for br_fdb_update to avoid lockup
br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to use spin_lock_bh because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. These locks were _bh before commit f8ae737dee
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit:
292d139898 ("bridge: add NTF_USE support")

Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 292d139898 ("bridge: add NTF_USE support")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 15:24:54 -07:00
Lendacky, Thomas 078b29d7e9 amd-xgbe: Use disable_irq_nosync from within timer function
Since the Tx timer function runs in softirq context the driver needs
to call disable_irq_nosync instead of a disable_irq.

Reported-by: Josh Stone <jistone@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 00:21:12 -07:00
Hauke Mehrtens 6d7954130c rhashtable: add missing import <linux/export.h>
rhashtable uses EXPORT_SYMBOL_GPL() without importing linux/export.h
directly it is only imported indirectly through some other includes.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 00:10:15 -07:00
David S. Miller c6271b7633 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-06-04

This series contains updates to i40e and i40evf.

Anjali provides three fixes, first to resolve a Tx queue hang if mixed
size frags are passed to the driver while using TSO.  There was a corner
case where we needed to linearize but we were not.  Next fixes a bug in
the default configuration which prevented a software bridge loaded on the
PF interface from working correctly because broadcast packets are
incorrectly looped back.  Lastly fixes an NPAR bug when SRIOV is enabled,
where we need to be in VEB mode, not VEPA mode at probe.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-05 21:15:50 -07:00
Anjali Singhai Jain fa11cb3d16 i40e: Make sure to be in VEB mode if SRIOV is enabled at probe
If SRIOV is enabled we need to be in VEB mode not VEPA mode at probe.
This fixes an NPAR bug when SRIOV is enabled in the BIOS.

Change-ID: Ibf006abafd9a0ca3698ec24848cd771cf345cbbc
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-04 20:14:23 -07:00
Anjali Singhai Jain fc60861e9b i40e: start up in VEPA mode by default
The patch fixes a bug in the default configuration which
prevented a software bridge loaded on the PF interface from
working correctly because broadcast packets are incorrectly
looped back.

Fix the general case, by loading the driver in VEPA mode Until a
VF or VMDq VSI is added. This way loopback on the Main VSI is
turned off until needed and can resolve the issue of unnecessary
reflection for users that do not have VF or VMDq VSIs setup.

The driver must now coordinate the loopback setting for the Flow
Director (FDIR) VSI to make sure it is in sync with the current
VEB or VEPA mode setting.

The user can still switch bridge modes from the bridge commands and
choose to be in VEPA mode with VF VSIs. Because of hardware
requirements, the call to switch to VEB mode when no VF/VMDqs are
present will be rejected.

NOTE: This patch uses BIT_ULL as that is preferred going forward,
a followup patch in the lower priority queue to net-next will fix
up the remaining 1 << usages.

Change-ID: Ib121ddb18fe4b3c4f52e9deda6fcbeb9105683d1
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-04 20:10:30 -07:00
Anjali Singhai Jain 30520831f0 i40e/i40evf: Fix mixed size frags and linearization
This patch fixes a bug where the i40e Tx queue will hang if this
skb is passed to the driver.

With mixed size fragments while using TSO there was a corner case
where we needed to linearize but we were not. This was seen with
iSCSI traffic and could be reproduced with a frag list that looks
like this:

num_frags = 17, gso_segs = 17, hdr_len = 66,
skb_shinfo(skb)->gso_size = 1448
size = 3002, j = 1, frag_size = 2936, num_frags = 17
size = 4268, j = 1, frag_size = 4096, num_frags = 16
size = 5534, j = 1, frag_size = 4096, num_frags = 15
size = 5352, j = 1, frag_size = 4096, num_frags = 14
size = 5170, j = 1, frag_size = 4096, num_frags = 13
size = 3468, j = 1, frag_size = 2576, num_frags = 12
size = 750, j = 1, frag_size = 112, num_frags = 11
size = 862, j = 2, frag_size = 112, num_frags = 10
size = 974, j = 3, frag_size = 112, num_frags = 9
size = 1126, j = 4, frag_size = 152, num_frags = 8
size = 1330, j = 5, frag_size = 204, num_frags = 7
size = 1534, j = 6, frag_size = 204, num_frags = 6
size = 356, j = 1, frag_size = 204, num_frags = 5
size = 560, j = 2, frag_size = 204, num_frags = 4
size = 764, j = 3, frag_size = 204, num_frags = 3
size = 968, j = 4, frag_size = 204, num_frags = 2
size = 1140, j = 5, frag_size = 172, num_frags = 1
result: linearize = 0, j = 6

Change-ID: I79bb1aeab0af255fe2ce28e93672a85d85bf47e8
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-04 20:06:06 -07:00
Shawn Bohrer 6e54030932 ipv4/udp: Verify multicast group is ours in upd_v4_early_demux()
421b3885bf "udp: ipv4: Add udp early
demux" introduced a regression that allowed sockets bound to INADDR_ANY
to receive packets from multicast groups that the socket had not joined.
For example a socket that had joined 224.168.2.9 could also receive
packets from 225.168.2.9 despite not having joined that group if
ip_early_demux is enabled.

Fix this by calling ip_check_mc_rcu() in udp_v4_early_demux() to verify
that the multicast packet is indeed ours.

Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 00:46:26 -07:00
Jiri Benc 640b2b107c openvswitch: disable LRO
Currently, openvswitch tries to disable LRO from the user space. This does
not work correctly when the device added is a vlan interface, though.
Instead of dealing with possibly complex stacked cross name space relations
in the user space, do the same as bridging does and call dev_disable_lro in
the kernel.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-03 19:39:35 -07:00
Michael Holzheu 88aeca15d6 s390/bpf: fix bpf frame pointer setup
Currently the bpf frame pointer is set to the old r15. This is
wrong because of packed stack. Fix this and adjust the frame pointer
to respect packed stack. This now generates a prolog like the following:

 3ff8001c3fa: eb67f0480024   stmg    %r6,%r7,72(%r15)
 3ff8001c400: ebcff0780024   stmg    %r12,%r15,120(%r15)
 3ff8001c406: b904001f       lgr     %r1,%r15      <- load backchain
 3ff8001c40a: 41d0f048       la      %r13,72(%r15) <- load adjusted bfp
 3ff8001c40e: a7fbfd98       aghi    %r15,-616
 3ff8001c412: e310f0980024   stg     %r1,152(%r15) <- save backchain

Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-03 19:31:39 -07:00
Michael Holzheu bbac1c9488 s390/bpf: fix stack allocation
On s390x we have to provide 160 bytes stack space before we can call
the next function. From the 160 bytes that we got from the previous
function we only use 11 * 8 bytes and have 160 - 11 * 8 bytes left.
Currently for BPF we allocate additional 160 - 11 * 8 bytes for the
next function. This is wrong because then the next function only gets:

 (160 - 11 * 8) + (160 - 11 * 8) = 2 * 72 = 144 bytes

Fix this and allocate enough memory for the next function.

Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-03 19:31:39 -07:00
Linus Torvalds c46a024ea5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Various VTI tunnel (mark handling, PMTU) bug fixes from Alexander
    Duyck and Steffen Klassert.

 2) Revert ethtool PHY query change, it wasn't correct.  The PHY address
    selected by the driver running the PHY to MAC connection decides
    what PHY address GET ethtool operations return information from.

 3) Fix handling of sequence number bits for encryption IV generation in
    ESP driver, from Herbert Xu.

 4) UDP can return -EAGAIN when we hit a bad checksum on receive, even
    when there are other packets in the receive queue which is wrong.
    Just respect the error returned from the generic socket recv
    datagram helper.  From Eric Dumazet.

 5) Fix BNA driver firmware loading on big-endian systems, from Ivan
    Vecera.

 6) Fix regression in that we were inheriting the congestion control of
    the listening socket for new connections, the intended behavior
    always was to use the default in this case.  From Neal Cardwell.

 7) Fix NULL deref in brcmfmac driver, from Arend van Spriel.

 8) OTP parsing fix in iwlwifi from Liad Kaufman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
  vti6: Add pmtu handling to vti6_xmit.
  Revert "net: core: 'ethtool' issue with querying phy settings"
  bnx2x: Move statistics implementation into semaphores
  xen: netback: read hotplug script once at start of day.
  xen: netback: fix printf format string warning
  Revert "netfilter: ensure number of counters is >0 in do_replace()"
  net: dsa: Properly propagate errors from dsa_switch_setup_one
  tcp: fix child sockets to use system default congestion control if not set
  udp: fix behavior of wrong checksums
  sfc: free multiple Rx buffers when required
  bna: fix soft lock-up during firmware initialization failure
  bna: remove unreasonable iocpf timer start
  bna: fix firmware loading on big-endian machines
  bridge: fix br_multicast_query_expired() bug
  via-rhine: Resigning as maintainer
  brcmfmac: avoid null pointer access when brcmf_msgbuf_get_pktid() fails
  mac80211: Fix mac80211.h docbook comments
  iwlwifi: nvm: fix otp parsing in 8000 hw family
  iwlwifi: pcie: fix tracking of cmd_in_flight
  ip_vti/ip6_vti: Preserve skb->mark after rcv_cb call
  ...
2015-06-01 20:51:18 -07:00
Linus Torvalds 2459c6099b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull Sparc fixes from David Miller:

 1) Setup the core/threads/sockets bitmaps correctly so that 'lscpus'
    and friends operate properly.  Frtom Chris Hyser.

 2) The bit that normally means "Cached Virtually" on sun4v systems,
    actually changes meaning in M7 and later chips.  Fix from Khalid
    Aziz.

 3) One some PCI-E systems we need to probe different OF properties to
    fill in the PCI slot information properly, from Eric Snowberg.

 4) Kill an extraneous memset after kzalloc(), from Christophe Jaillet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE
  sparc64: pci slots information is not populated in sysfs
  sparc: kernel: GRPCI2: Remove a useless memset
  sparc64: Setup sysfs to mark LDOM sockets, cores and threads correctly
2015-06-01 20:44:51 -07:00
Linus Torvalds fec345baa5 virtio: last-minute fix for 4.1
This tweaks an exported user-space header to fix
 build breakage for userspace using it.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVbKwGAAoJECgfDbjSjVRp8t0H/3H9RYIMZskzWFwLt3iguVsR
 A6C+JNMszd4+SBlRYHJ1LyNOYh0xQN089VeHIqVp3agGX4ecOJ58LlUKb670XSFT
 ItXY1QVwlR1cQ2HjYjllSijKy13W4zeDmnYdicsp0DmOngr3Jy1xw8b0LvhhUDEl
 bhzJ4eLV9Cn0oxrirbHhvqi4ESNKa7avP7jbq/7r0qtgBSzdf6WSP7ZhCu+BdEwW
 tpXBh9m3vriFGXmagn10pxsO8k8oDRGFPJPT2SQeyvbTZWkojHPY034PtbrFNpba
 YT2/FCs8nvq7/fmz47nRdX6KivyeSF+RZrCumozjSWJntNolxu3EbHcnJbd0kmU=
 =q1Wp
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fix from Michael Tsirkin:
 "Last-minute virtio fix for 4.1

  This tweaks an exported user-space header to fix build breakage for
  userspace using it"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h
2015-06-01 18:49:45 -07:00
David S. Miller e453581dd5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fix for net

The following patch reverts the ebtables chunk that enforces counters that was
introduced in the recently applied d26e2c9ffa ('Revert "netfilter: ensure
number of counters is >0 in do_replace()"') since this breaks ebtables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 16:56:43 -07:00
David S. Miller cd842a67e6 iwlwifi:
* fix OTP parsing 8260
 * fix powersave handling for 8260
 
 brcmfmac:
 
 * fix null pointer crash
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJVa/iiAAoJEG4XJFUm622bTdcH/0N1GxozHLqnSpWb6GF0QNN1
 R80ws2rStBS4/wMNg21xxaXgkzNZfA91dzIOOsNQlbo+RaBu8B/95GZVTMyYRKKT
 eSKfRN1TOwR2eLB9plGM/MQvKGGn/xAPnXqGukhZXE8F1usyVOJhkGwiLlBEgfuE
 2cJOlnWVwe1s8nfjKtZ40kh069oAVqv7sI1AT2+S1EAVtD1DtgYeA+bgo7TvHFDT
 WYz5LlX10tmpcfr0MBE3ikZAEUAOxjbq/nmMaqKqstAbeyzgURxKsyX7YfLQAHjL
 vZ9PGCcnHQ+WXqXeYxyvrY3e49LIzK2jtmgDAG/fPE5z7HAkPQEswuoL5XT+TRU=
 =uZ8u
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2015-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
iwlwifi:

* fix OTP parsing 8260
* fix powersave handling for 8260

brcmfmac:

* fix null pointer crash
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 16:06:29 -07:00
Steffen Klassert ccd740cbc6 vti6: Add pmtu handling to vti6_xmit.
We currently rely on the PMTU discovery of xfrm.
However if a packet is localy sent, the PMTU mechanism
of xfrm tries to to local socket notification what
might not work for applications like ping that don't
check for this. So add pmtu handling to vti6_xmit to
report MTU changes immediately.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 16:03:43 -07:00
David S. Miller 18ec898ee5 Revert "net: core: 'ethtool' issue with querying phy settings"
This reverts commit f96dee13b8.

It isn't right, ethtool is meant to manage one PHY instance
per netdevice at a time, and this is selected by the SET
command.  Therefore by definition the GET command must only
return the settings for the configured and selected PHY.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 14:43:50 -07:00
Yuval Mintz c6e36d8c1a bnx2x: Move statistics implementation into semaphores
Commit dff173de84 ("bnx2x: Fix statistics locking scheme") changed the
bnx2x locking around statistics state into using a mutex - but the lock
is being accessed via a timer which is forbidden.

[If compiled with CONFIG_DEBUG_MUTEXES, logs show a warning about
accessing the mutex in interrupt context]

This moves the implementation into using a semaphore [with size '1']
instead.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 12:04:31 -07:00
Ian Campbell 31a418986a xen: netback: read hotplug script once at start of day.
When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).

In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.

A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).

Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.

The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...

A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 12:03:04 -07:00
Ian Campbell dc5e7a811d xen: netback: fix printf format string warning
drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
        (txreq.offset&~PAGE_MASK) + txreq.size);
        ^

PAGE_MASK's type can vary by arch, so a cast is needed.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
----
v2: Cast to unsigned long, since PAGE_MASK can vary by arch.
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 12:03:04 -07:00
Bernhard Thaler d26e2c9ffa Revert "netfilter: ensure number of counters is >0 in do_replace()"
This partially reverts commit 1086bbe97a ("netfilter: ensure number of
counters is >0 in do_replace()") in net/bridge/netfilter/ebtables.c.

Setting rules with ebtables does not work any more with 1086bbe97a place.

There is an error message and no rules set in the end.

e.g.

~# ebtables -t nat -A POSTROUTING --src 12:34:56:78:9a:bc -j DROP
Unable to update the kernel. Two possible causes:
1. Multiple ebtables programs were executing simultaneously. The ebtables
   userspace tool doesn't by default support multiple ebtables programs
running

Reverting the ebtables part of 1086bbe97a makes this work again.

Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-01 19:45:47 +02:00
Mikko Rapeli 8a7b19d8b5 include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h
Fixes userspace compilation error:

error: unknown type name ‘__virtio16’
  __virtio16 tag;

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-06-01 15:46:54 +02:00
Khalid Aziz 494e5b6fae sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE
sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE

Bit 9 of TTE is CV (Cacheable in V-cache) on sparc v9 processor while
the same bit 9 is MCDE (Memory Corruption Detection Enable) on M7
processor. This creates a conflicting usage of the same bit. Kernel
sets TTE.cv bit on all pages for sun4v architecture which works well
for sparc v9 but enables memory corruption detection on M7 processor
which is not the intent. This patch adds code to determine if kernel
is running on M7 processor and takes steps to not enable memory
corruption detection in TTE erroneously.

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-31 22:15:01 -07:00
Eric Snowberg f0c1a11737 sparc64: pci slots information is not populated in sysfs
Add PCI slot numbers within sysfs for PCIe hardware.  Larger
PCIe systems with nested PCI bridges and slots further
down on these bridges were not being populated within sysfs.
This will add ACPI style PCI slot numbers for these systems
since the OF 'slot-names' information is not available on
all PCIe platforms.

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-31 22:14:40 -07:00
Christophe Jaillet 8642ad1c7b sparc: kernel: GRPCI2: Remove a useless memset
grpci2priv is allocated using kzalloc, so there is no need to memset it.

Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-31 22:14:05 -07:00
Florian Fainelli 24595346d7 net: dsa: Properly propagate errors from dsa_switch_setup_one
While shuffling some code around, dsa_switch_setup_one() was introduced,
and it was modified to return either an error code using ERR_PTR() or a
NULL pointer when running out of memory or failing to setup a switch.

This is a problem for its caler: dsa_switch_setup() which uses IS_ERR()
and expects to find an error code, not a NULL pointer, so we still try
to proceed with dsa_switch_setup() and operate on invalid memory
addresses. This can be easily reproduced by having e.g: the bcm_sf2
driver built-in, but having no such switch, such that drv->setup will
fail.

Fix this by using PTR_ERR() consistently which is both more informative
and avoids for the caller to use IS_ERR_OR_NULL().

Fixes: df197195a5 ("net: dsa: split dsa_switch_setup into two functions")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-31 21:50:34 -07:00
Neal Cardwell 9f950415e4 tcp: fix child sockets to use system default congestion control if not set
Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.

This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.

In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.

Fixes: 55d8694fa8 ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-31 21:49:14 -07:00
Eric Dumazet beb39db59d udp: fix behavior of wrong checksums
We have two problems in UDP stack related to bogus checksums :

1) We return -EAGAIN to application even if receive queue is not empty.
   This breaks applications using edge trigger epoll()

2) Under UDP flood, we can loop forever without yielding to other
   processes, potentially hanging the host, especially on non SMP.

This patch is an attempt to make things better.

We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-31 21:42:18 -07:00
Linus Torvalds c65b99f046 Linux 4.1-rc6 2015-05-31 19:01:07 -07:00
Daniel Pieczko 9eb0a5d190 sfc: free multiple Rx buffers when required
When Rx packet data must be dropped, all the buffers
associated with that Rx packet must be freed. Extend
and rename efx_free_rx_buffer() to efx_free_rx_buffers()
and loop through all the fragments.
By doing so this patch fixes a possible memory leak.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-31 17:36:20 -07:00
Linus Torvalds 8ba64dc338 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
 "Off-by-one in d_walk()/__dentry_kill() race fix.

  It's very hard to hit; possible in the same conditions as the original
  bug, except that you need the skipped branch to contain all the
  remaining evictables, so that the d_walk()-calling loop in
  d_invalidate() decides there's nothing more to do and doesn't go for
  another pass - otherwise that next pass will sweep the sucker.

  So it's not too urgent, but seeing that the fix is obvious and the
  original commit has spread into all -stable branches..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  d_walk() might skip too much
2015-05-31 16:00:34 -07:00
Linus Torvalds 36a8b9a774 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Three fixes this time around:

   - fix a memory leak which occurs when probing performance monitoring
     unit interrupts

   - fix handling of non-PMD aligned end of RAM causing boot failures

   - fix missing syscall trace exit path with syscall tracing enabled
     causing a kernel oops in the audit code"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8357/1: perf: fix memory leak when probing PMU PPIs
  ARM: fix missing syscall trace exit
  ARM: 8356/1: mm: handle non-pmd-aligned end of RAM
2015-05-31 12:20:59 -07:00
Linus Torvalds e4ca714b63 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux
Pull MIPS fixes from Ralf Baechle:
 "MIPS fixes for 4.1 all across the tree"

* 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux:
  MIPS: strnlen_user.S: Fix a CPU_DADDI_WORKAROUNDS regression
  MIPS: BMIPS: Fix bmips_wr_vec()
  MIPS: ath79: fix build problem if CONFIG_BLK_DEV_INITRD is not set
  MIPS: Fuloong 2E: Replace CONFIG_USB_ISP1760_HCD by CONFIG_USB_ISP1760
  MIPS: irq: Use DECLARE_BITMAP
  ttyFDC: Fix to use native endian MMIO reads
  MIPS: Fix CDMM to use native endian MMIO reads
2015-05-31 12:03:42 -07:00
Linus Torvalds 50f5a1ee32 Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat tool fixes from Len Brown:
 "Just one minor kernel dependency in this batch -- added a #define to
  msr-index.h"

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: update version number to 4.7
  tools/power turbostat: allow running without cpu0
  tools/power turbostat: correctly decode of ENERGY_PERFORMANCE_BIAS
  tools/power turbostat: enable turbostat to support Knights Landing (KNL)
  tools/power turbostat: correctly display more than 2 threads/core
2015-05-31 11:39:25 -07:00
Linus Torvalds dae8f283bf Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "These are mostly minor fixes, with the exception of the following that
  address fall-out from recent v4.1-rc1 changes:

   - regression fix related to the big fabric API registration changes
     and configfs_depend_item() usage, that required cherry-picking one
     of HCH's patches from for-next to address the issue for v4.1 code.

   - remaining TCM-USER -v2 related changes to enforce full CDB
     passthrough from Andy + Ilias.

  Also included is a target_core_pscsi driver fix from Andy that
  addresses a long standing issue with a Scsi_Host reference being
  leaked on PSCSI device shutdown"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iser-target: Fix error path in isert_create_pi_ctx()
  target: Use a PASSTHROUGH flag instead of transport_types
  target: Move passthrough CDB parsing into a common function
  target/user: Only support full command pass-through
  target/user: Update example code for new ABI requirements
  target/pscsi: Don't leak scsi_host if hba is VIRTUAL_HOST
  target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem
  target: Drop signal_pending checks after interruptible lock acquire
  target: Add missing parentheses
  target: Fix bidi command handling
  target/user: Disallow full passthrough (pass_level=0)
  ISCSI: fix minor memory leak
2015-05-31 11:31:42 -07:00
Linus Torvalds 30a5f11896 Some late hwmon patches, all headed for -stable
- Fix sysfs attribute initialization in nct6775 and nct6683 drivers
 - Do not attempt to auto-detect tmp435 on I2C address 0x37
 - Ensure iio channel is of type IIO_VOLTAGE in ntc_thermistor driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVaQlBAAoJEMsfJm/On5mBLw8P/RWUBhL04n4dy7bw8Ap78XOo
 vJBBK/viNonECxShAdyG2dP+/qgNI/1Zbd3ApaDuJKiqV36WdL7wb2/D+wDWxkZG
 MRnxpQ4E67EifbdXZ0+xbIWv/VoHv7JafFS7E7t7GeXih7KophDj3xuMBcR85/Dt
 89TA0+RyX4Cyk7WagXXa+7ybpYEa8DbsHRdPGxdddNh3+BC87/LUoOVgIshN/pv2
 3bYYBo7de7jxR/Avf8G7VETxS/MrG6RTeZ6lM0QSAj4uRDPYR/dtkrVOpE6ZcGmh
 +XjWJE0YWDFh4OEPheF2aXBhsZuakI055vLXrFJDomzJGdzIUnttZo2+IB9I/lI+
 fzZslRtTdecF8ee9XJSOocNnLhV5WtNEg7zYbk7titSnp5zoNxM6fvHC5HkK+uTb
 BgEtjgs+l8+BLXKAp7y92YUSvLt/esGbBr+AnscvRbcG//41FV1bY3zG3Xxh4tCV
 cjg9WokSkIvvbxYqA6x4qr7WHTtruTxHtHkc9BKRzIjzrISm/HFKbBS80gQ6RGPm
 YLK5eAMYuDCaRtsaPJAPx3Sqmzl+fCy2MAmcPz36OTqGhps/YtVmeQpCAqIuP2eu
 8v/5rJU7rjs+OjJxPYeH630C7WeQK7Q09xWEYNoXMl2cqAjGJVka/tJFDF1Ru4+M
 JdiRx1tia5AXu1aThx1h
 =cHeh
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-linus-v4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
 "Some late hwmon patches, all headed for -stable

   - fix sysfs attribute initialization in nct6775 and nct6683 drivers

   - do not attempt to auto-detect tmp435 on I2C address 0x37

   - ensure iio channel is of type IIO_VOLTAGE in ntc_thermistor driver"

* tag 'hwmon-for-linus-v4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (nct6683) Add missing sysfs attribute initialization
  hwmon: (nct6775) Add missing sysfs attribute initialization
  hwmon: (tmp401) Do not auto-detect chip on I2C address 0x37
  hwmon: (ntc_thermistor) Ensure iio channel is of type IIO_VOLTAGE
2015-05-31 11:24:49 -07:00
David S. Miller fdf7c64444 Merge branch 'bna-fixes'
Ivan Vecera says:

====================
bna: misc bugfixes

These patches fix several bugs found during device initialization debugging.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-30 23:46:54 -07:00
Ivan Vecera 4818e85647 bna: fix soft lock-up during firmware initialization failure
Bug in the driver initialization causes soft-lockup if firmware
initialization timeout is reached. Polling function bfa_ioc_poll_fwinit()
incorrectly calls bfa_nw_iocpf_timeout() when the timeout is reached.
The problem is that bfa_nw_iocpf_timeout() calls again
bfa_ioc_poll_fwinit()... etc. The bfa_ioc_poll_fwinit() should directly
send timeout event for iocpf and the same should be done if firmware
download into HW fails.

Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-30 23:46:49 -07:00
Ivan Vecera 4918eb1e7c bna: remove unreasonable iocpf timer start
Driver starts iocpf timer prior bnad_ioceth_enable() call and this is
unreasonable. This piece of code probably originates from Brocade/Qlogic
out-of-box driver during initial import into upstream. This driver uses
only one timer and queue to implement multiple timers and this timer is
started at this place. The upstream driver uses multiple timers instead
of this.

Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-30 23:46:39 -07:00
Ivan Vecera e236b95423 bna: fix firmware loading on big-endian machines
Firmware required by bna is stored in appropriate files as sequence
of LE32 integers. After loading by request_firmware() they need to be
byte-swapped on big-endian arches. Without this conversion the NIC
is unusable on big-endian machines.

Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-30 23:46:30 -07:00
David S. Miller 50dce303d0 This just has a single docbook build fix. In my confusion
I'd already sent the same fix for -next, but Ben Hutchings
 noted it's necessary in 4.1.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJVZwzBAAoJEDBSmw7B7bqrH4YQAMZazIdDKxZNtprjT1WdB+qJ
 v82zJJUXLzNf7XPYACmwVQpyJq+uG9GHnvIr1sAU5JNtkHgSJQdRrhhICty0Y5dS
 U6dqb/5wvE3oQfPZNYYtEbtUP18j9z34nr18vWhJ3yra1ZWHjk1JJylv8Wy+TVdR
 iHQH/VVs8ACcytBQa7my5642ISWUABtOwOGJ0+cWltxbBUGpzGojXskJwclQ47TW
 bRqS6vXX27RCtfIKwq5BCdryM5uAyWIDwefRbsl1FEE6ergLUYTRQaaNV93kNCN2
 gvitY0xlK2nYHUOodNL03bifg6nnafBkellUTszi4Xfi6dP8WjEtXvpHb2ZHOMpV
 fiiYYtN3Uc6/XiW3UUufMGKmRUu2Fhl/4tIoubcmhaurbPXT1yMkoe0V6ZidKZvU
 N0sLeuCNnR0T9w2Ml23qUSQhHILT6VoaeW1SqTXBAAdlOcqTXuchU6RbOFRgqJyF
 C7VvKE+8+5Rjeu2orgj7bwOQN9cZh8MC1kVhTjjs/jm21CUKRt7pUSqBxmN0WihE
 lkfEqohIwti6v/bHOWpB+RnhWKeV1YX4G0ZMIwmn8tP0807hd9b9ov2F8CBK4e+g
 ylb4Y45WyzuR4cYGY5kndvV0xl8k7XYiFsUj1QnylFPuhaTtuZ5BtlgDq7CKMDTm
 AgkH8YAhLnWvzpTbeDPT
 =gYtq
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2015-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
This just has a single docbook build fix. In my confusion
I'd already sent the same fix for -next, but Ben Hutchings
noted it's necessary in 4.1.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-30 23:37:46 -07:00
Eric Dumazet 71d9f6149c bridge: fix br_multicast_query_expired() bug
br_multicast_query_expired() querier argument is a pointer to
a struct bridge_mcast_querier :

struct bridge_mcast_querier {
        struct br_ip addr;
        struct net_bridge_port __rcu    *port;
};

Intent of the code was to clear port field, not the pointer to querier.

Fixes: 2cd4143192 ("bridge: memorize and export selected IGMP/MLD querier port")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Cc: Linus Lüssing <linus.luessing@web.de>
Cc: Steinar H. Gunderson <sesse@samfundet.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-30 23:31:28 -07:00
Roland Dreier b2feda4feb iser-target: Fix error path in isert_create_pi_ctx()
We don't assign pi_ctx to desc->pi_ctx until we're certain to succeed
in the function.  That means the cleanup path should use the local
pi_ctx variable, not desc->pi_ctx.

This was detected by Coverity (CID 1260062).

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 20:01:04 -07:00
Andy Grover a3541703eb target: Use a PASSTHROUGH flag instead of transport_types
It seems like we only care if a transport is passthrough or not. Convert
transport_type to a flags field and replace TRANSPORT_PLUGIN_* with a
flag, TRANSPORT_FLAG_PASSTHROUGH.

Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 19:58:11 -07:00
Andy Grover 7bfea53b5c target: Move passthrough CDB parsing into a common function
Aside from whether they handle BIDI ops or not, parsing of the CDB by
kernel and user SCSI passthrough modules should be identical. Move this
into a new passthrough_parse_cdb() and call it from tcm-pscsi and tcm-user.

Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 19:57:59 -07:00
Andy Grover 9c1cd1b68c target/user: Only support full command pass-through
After much discussion, give up on only passing a subset of SCSI commands
to userspace and pass them all. Based on what pscsi is doing, make sure
to set SCF_SCSI_DATA_CDB for I/O ops, and define attributes identical to
pscsi.

Make hw_block_size configurable via dev param.

Remove mention of command filtering from tcmu-design.txt.

Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 19:57:48 -07:00
Andy Grover cf87edc602 target/user: Update example code for new ABI requirements
We now require that the userspace handler set a bit if the command is not
handled.

Update calls to tcmu_hdr_get_op for v2.

Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 19:57:33 -07:00