The cxgb3 NIC driver can handle more firmware versions than iw_cxgb3,
and since commit 8207befa ("cxgb3: untie strict FW matching") cxgb3
will load with firmware versions that iw_cxgb3 can't handle. The FW
major number indicates a specific interface between the FW and
iw_cxgb3. Thus if the major number of the running firmware does not
match the required version compiled into iw_cxgb3, then iw_cxgb3 must
not register that device.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
According to the ConnectX programmer's reference manual, all
operations should be stopped, all QPs should be torn down and all WQEs
flushed before the CLOSE_PORT command is invoked. In some cases
reversing the order of operations (as implemented now) could cause
a loss of completions.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
While doing testing, there are failures as MPA Reject call is not
handled. To handle MPA Reject call, following changes are done:
*Handle inbound/outbound MPA Reject response message.
When nes_reject() is called for pending MPA request reply,
send the MPA Reject message to its peer (active
side)cm_node. The peer cm_node (active side) will indicate
Reject message event for the pending Connect Request.
*Handle MPA Reject response message for loopback connections and listener.
When MPA Request is rejected, check if it is a loopback
connection and if it is then it will send Reject message event
to its peer loopback node. Also when destroying listener,
check if the cm_nodes for that listener are loopback or not.
*Add gracefull connection close with the MPA Reject response message.
Send gracefull close (FIN, FIN ACK..) to terminate the cm_nodes.
*Some code re-org while making the above changes.
Removed recv_list and recv_list_lock from the cm_node
structure as there can be only one receive close entry on the
timer. Also implemented handle_recv_entry() as receive close
entry is processed from both nes_rem_ref_cm_node() as well as
nes_cm_timer_tick().
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Two level 256 byte PBLs was not implemented so the driver could report
out of memory when in fact there were PBLs still available.
This solution prefers to use 4KB PBLs over two level 256B PBLs until
the number of 4KB PBLs falls below a threshold. At this point the 4KB
PBL structure is converted to use 256B PBLs which prevents the driver
from running out of 4KB PBLs too quickly.
Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
NETIF_F_LLTX is deprecated. Remove private TX locking from the driver
and remove the NETIF_F_LLTX feature flag. This also fixes a warning
in some configs that comes from doing skb_linearize() call in the
hard_start_xmit method with IRQs disabled (if HIGHMEM is enabled,
skb_linearize() may end up enabling BHs, which is a no-no if hard IRQs
are disabled in that context). By getting rid of LLTX, we do not
disable IRQs when skb_linearize() is called.
Remove the sq_lock as it is not needed for non-LLTX. Fix ethtool not
to show the counter for sq_lock.
Reported-by: aluno3@poczta.onet.pl
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When asynchronous events are processed by software, it is necessary
to let the hardware know that software has handled the event. This
frees up the entry in the asynchronous event queue.
Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In find_node(), tmp_addr causes an "unused variable" warning when
INFINIBAND_NES_DEBUG is not defined. It's only used in a nes_debug()
and the print does not make sense. So take out the whole thing.
Reported-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ibv_devinfo displays 0 for vendor_id and vendor_part_id. Fill in OUI
and device_id for those two fields.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Update copyright to the new legal entity, Intel-NE, Inc., an Intel
company. Update copyright for the new year.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix occurrences where the software PBL counts were changed before the
hardware was updated. This bug allowed another thread to overallocate
the hardware resources.
Add proper PBL accounting in case nes_reg_mr() fails.
Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some attribute show functions test ibdev_is_alive() to make sure that
it's OK to access device state. However, the sysfs attributes will
not be registered until the device is fully initialized, and they'll
be unregistered before anything is torn down, so ibdev_is_alive()
doesn't do anything useful. Remove it.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Our testing uncovered a race condition in ib_sa_event():
spin_lock_irqsave(&port->ah_lock, flags);
if (port->sm_ah)
kref_put(&port->sm_ah->ref, free_sm_ah);
port->sm_ah = NULL;
spin_unlock_irqrestore(&port->ah_lock, flags);
schedule_work(&sa_dev->port[event->element.port_num -
sa_dev->start_port].update_task);
If two events occur back-to-back (e.g., client-reregister and LID
change), both may pass the spinlock-protected code above before the
scheduled work updates the port->sm_ah handle. Then if the scheduled
work ends up running twice, the second operation will then find a
non-NULL port->sm_ah, and will simply overwrite it in update_sm_ah --
resulting in an AH leak.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If ib_post_send_mad() returns 0, the API guarantees that there will be
a callback to send_buf->mad_agent->send_handler() so that the sender
can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf will be
leaked and the mad_agent reference count will never go to zero and the
IB device module cannot be unloaded. The above can happen without
this patch if process_mad() returns (IB_MAD_RESULT_SUCCESS |
IB_MAD_RESULT_CONSUMED).
If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent
registered to receive the mad being sent, handle_outgoing_dr_smp()
returns zero which causes a MAD packet which is at the end of the
directed route to be incorrectly sent on the wire but doesn't cause a
hang since the HCA generates a send completion.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There is a potential race in ib_register_mad_agent() where the struct
ib_mad_agent_private is not fully initialized before it is added to
the list of agents per IB port. This means the ib_mad_agent_private
could be seen before the refcount, spin locks, and linked lists are
initialized. The fix is to initialize the structure earlier.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
handle_outgoing_dr_smp() can queue a struct ib_mad_local_private
*local on the mad_agent_priv->local_work work queue with
local->mad_priv == NULL if device->process_mad() returns
IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and
(!ib_response_mad(&mad_priv->mad.mad) ||
!mad_agent_priv->agent.recv_handler).
In this case, local_completions() will be called with local->mad_priv
== NULL. The code does check for this case and skips calling
recv_mad_agent->agent.recv_handler() but recv == 0 so
kmem_cache_free() is called with a NULL pointer.
Also, since recv isn't reinitialized each time through the loop, it
can cause a memory leak if recv should have been zero.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Remove hard setting of the IB MTU used by iSER's RC queue-pair to 1K,
as this was done due to inter-op issues with an old iser target which
is not used any more.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Move the ib_device_unregister_sysfs() call from ib_dealloc_device() to
ib_unregister_device(). The old code allows device unregister to
proceed even if some sysfs files are open, which leaves a window where
userspace can open a file before a device is removed but then end up
reading the file after the device is removed, which leads to various
kernel crashes either because the device data structure is freed or
because the low-level driver code is gone after module removal.
By not returning from ib_unregister_device() until after all sysfs
entries are removed, we make sure that data structures and/or module
code is not freed until after all sysfs access is done.
Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_release_user_pages_on_close() just allocated a structure to
schedule work with but just returned (leaking the structure) rather than
actually doing schedule_work(). Fix the logic to what was intended.
This was spotted by the Coverity checker (CID 2700).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the second vmalloc() fails, the wrong pointer is pased to vfree(), so
the first vmalloc() ends up getting leaked.
This was spotted by the Coverity checker (CID 2709).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If path_rec_start() returns error, call path_free() only if the path
was newly-created. If we free an existing path whose valid flag was zero,
(but do not detach it from the list) we cause corruption of the
path list (of which it is a member), and get a kernel crash.
The simplest solution is to not free an existing path -- just leave it
in the list as-is (i.e., with its valid flag cleared).
Thanks to Yossi Etigin of Voltaire for identifying the problem flow
which caused the kernel crash.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Moni Shua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Remove modulo usage to avoid a divide in the fast path (not all
gcc versions do strength reduction here).
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The poll and flush code needs to handle all send opcodes: SEND,
SEND_WITH_SE, SEND_WITH_INV, and SEND_WITH_SE_INV.
Ignore TERM indications if the connection already gone.
Ignore HW receive completions if the RQ is empty.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The variable 'offset' in iwch_sgl2pbl_map() needs to be a u64.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched. This way of decision
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).
This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.
Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID. If and only if they are not identical then a
LID_CHANGE event is dispatched.
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched. This way of decision
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).
This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.
Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID. If and only if they are not identical then a
LID_CHANGE event is dispatched.
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The base versions handle constant folding just fine, use them
directly. The replacements are OK in the include/ files as they are
not exported to userspace so we don't need the __ prefixed versions.
This patch does not affect code generation at all.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ehca_plpar_hcall9() takes an unsigned long array, so make all callers
pass that in. This fixes warnings introduced by commit fe333321
("powerpc: Change u64/s64 to a long long integer type"), which changed
u64 from unsigned long to unsigned long long.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit fe333321 ("powerpc: Change u64/s64 to a long long integer
type") changed u64 from unsigned long to unsigned long long, which
means that printk formats for printing u64 values should use "ll"
instead of "l" to avoid warnings. Fix all the places affected by this
in ehca.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When IPoIB tries to join a multicast group, and the SA module's SM
address handle is NULL (because of an SM change, etc), the join
returns with -EAGAIN status. In that case, don't print an error
message unless multicast debugging is enabled.
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The current work request posting code writes the LSO segment before
writing any data segments. This leaves a window where the LSO segment
overwrites the stamping in one cacheline that the HCA prefetches
before the rest of the cacheline is filled with the correct data
segments. When the HCA processes this work request, a local
protection error may result.
Fix this by saving the LSO header size field off and writing it only
after all data segments are written. This fix is a cleaned-up version
of a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>.
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1383>.
Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix a deadlock between child interface creation/deletion and ipoib
start/stop. The former takes vlan_mutex, and then might take RTNL via
register_netdev()/unregister_netdev(). The latter is executed with
RTNL held, and tries to take vlan_mutex, which can lead to an AB-BA
deadlock.
Fix this by having the child interface creation/deletion code take the
RTNL first so vlan_mutex always nests inside RTNL. We can use
register_netdevice() for child interfaces because we form the
interface name from the parent interface and hence don't need the '%'
expansion of register_netdev().
Reported-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
After commit fe25c561 ("IPoIB: Don't enable NAPI when it's already
enabled"), if an interface is brought up but the corresponding P_Key
never appears, then ipoib_stop() will hang in napi_disable(), because
ipoib_open() returns before it does napi_enable().
Fix this by changing ipoib_open() to call napi_enable() even if the
P_Key isn't present.
Reported-by: Yossi Etigin <yosefe@Voltaire.COM>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/iser: Add dependency on INFINIBAND_ADDR_TRANS
IPoIB: Do not join broadcast group if interface is brought down
RDMA/nes: Fix for NIPQUAD removal
IPoIB: Fix loss of connectivity after bonding failover on both sides
IB/mlx4: Don't register IB device for adapters with no IB ports
mlx4_core: Fix warning from min()
IB/ehca: spin_lock_irqsave() takes an unsigned long
Fix ib_iser build to depend on INFINIBAND_ADDR_TRANS; if INET=y but
IPV6=n, then the RDMA CM is not built but INFINIBAND_ISER can be
enabled, leading to:
ERROR: "rdma_destroy_id" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_connect" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_destroy_qp" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_create_id" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_create_qp" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_resolve_route" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_disconnect" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_resolve_addr" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Because the ipoib_workqueue is not flushed when ipoib interface is
brought down, ipoib_mcast_join() may trigger a join to the broadcast
group after priv->broadcast was set to NULL (during cleanup). This
will cause the system to be a member of the broadcast group when
interface is down. As a side effect, this breaks the optimization of
setting the Q_key only when joining the broadcast group.
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit 63779436 ("drivers: replace NIPQUAD()") accidentally replaced
some HIPQUAD()s, causing IP addresses to be printed in reverse order.
Add temporary local vars until the byteswapping can be pushed further
up the stack.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix bonding failover in the case both peers failover and the
gratuitous ARP is lost. In that case, the sender side will create an
ipoib_neigh and issue a path request with the old GID first. When
skb->dst->neighbour->ha changes due to ARP refresh, this ipoib_neigh
will not be added to the path->list of the path of the new GID,
because the ipoib_neigh already exists. It will not have an AH
either, because of sender-side failover. Therefore, it will not get
an AH when the path is resolved.
The solution here is to compare GIDs in ipoib_start_xmit() even if
neigh->ah is invalid. Comparing with an uninitialized value of
neigh->dgid should be fine, since a spurious match is harmless (and
astronomically unlikely too).
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the mlx4_ib driver finds an adapter that has only ethernet ports, the
current code will register an IB device with 0 ports. Nothing useful or
sensible can be done with such a device, so just skip registering it.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When I review ocfs2 code, find there are 2 typos to "successfull". After
doing grep "successfull " in kernel tree, 22 typos found totally -- great
minds always think alike :)
This patch fixes all the similar typos. Thanks for Randy's ack and comments.
Signed-off-by: Coly Li <coyli@suse.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits)
trivial: chack -> check typo fix in main Makefile
trivial: Add a space (and a comma) to a printk in 8250 driver
trivial: Fix misspelling of "firmware" in docs for ncr53c8xx/sym53c8xx
trivial: Fix misspelling of "firmware" in powerpc Makefile
trivial: Fix misspelling of "firmware" in usb.c
trivial: Fix misspelling of "firmware" in qla1280.c
trivial: Fix misspelling of "firmware" in a100u2w.c
trivial: Fix misspelling of "firmware" in megaraid.c
trivial: Fix misspelling of "firmware" in ql4_mbx.c
trivial: Fix misspelling of "firmware" in acpi_memhotplug.c
trivial: Fix misspelling of "firmware" in ipw2100.c
trivial: Fix misspelling of "firmware" in atmel.c
trivial: Fix misspelled firmware in Kconfig
trivial: fix an -> a typos in documentation and comments
trivial: fix then -> than typos in comments and documentation
trivial: update Jesper Juhl CREDITS entry with new email
trivial: fix singal -> signal typo
trivial: Fix incorrect use of "loose" in event.c
trivial: printk: fix indentation of new_text_line declaration
trivial: rtc-stk17ta8: fix sparse warning
...
The flags argument to spin_lock_irqsave() should really be unsigned
long. This will also help prevent some warnings when we change u64 to
unsigned long long.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
... and don't bother in callers. Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.
i_mode is not worth the effort; it has no common default value.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Fix reading SL field out of cqe->sl_vid
RDMA/addr: Fix build breakage when IPv6 is disabled