Received from Mark Salyzyn
If the adapter is in blinkled (Firmware Assert) when error recovery
timeout actions have been triggered, perform an adapter warm reset and
restart the initialization.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
The enclosed patch cleans up some code fragments, adds some paranoia
(unproven causes of potential driver failures).
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
If the adapter should be in a blinkled (Firmware Assert) state when the
driver loads, we will perform a warm restart of the Adapter Firmware to
see if we can rescue the adapter. Possible causes of a blinkled can
occur on some early release motherboard BIOSes, transitory PCI bus
problems on embedded systems or non-x86 based architectures, transitory
startup failures of early release drives or transitory hardware
failures; some of which can bite the adapter later at runtime. Future
enhancements will include recovery during runtime.
Fixed extra whitespace space issue.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
This patch allows the FSACTL_SEND_LARGE_FIB, FSACTL_SENDFIB and
FSACTL_SEND_RAW_SRB ioctl calls into the aacraid driver to be
interruptible. Only necessary if the adapter and/or the management
software has gone into some sort of misbehavior and the system is being
rebooted, thus permitting the user management software applications to
be killed relatively cleanly. The FIB queue resource is held out of the
free queue until the adapter finally, if ever, completes the command.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Attached is a patch that should limit a possible recursion that can
lead to a stack overflow like follows:
Kernel stack overflow.
CPU: 3 Not tainted
Process zfcperp0.0.d819
(pid: 13897, task: 000000003e0d8cc8, ksp: 000000003499dbb8)
Krnl PSW : 0404000180000000 000000000030f8b2 (get_device+0x12/0x48)
Krnl GPRS: 00000000135a1980 000000000030f758 000000003ed6c1e8 0000000000000005
0000000000000000 000000000044a780 000000003dbf7000 0000000034e15800
000000003621c048 070000003499c108 000000003499c1a0 000000003ed6c000
0000000040895000 00000000408ab630 000000003499c0a0 000000003499c0a0
Krnl Code: a7 fb ff e8 a7 19 00 00 b9 02 00 22 e3 e0 f0 98 00 24 a7 84
Call Trace:
([<000000004089edc2>] scsi_request_fn+0x13e/0x650 [scsi_mod])
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
[<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
...
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
[<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089fa9e>] scsi_run_host_queues+0x196/0x230 [scsi_mod]
[<00000000409eba28>] zfcp_erp_thread+0x2638/0x3080 [zfcp]
[<0000000000107462>] kernel_thread_starter+0x6/0xc
[<000000000010745c>] kernel_thread_starter+0x0/0xc
<0>Kernel panic - not syncing: Corrupt kernel stack, can't continue.
This stack overflow occurred during tests on s390 using zfcp.
Recursion depth for this panic was 19.
Usually recursion between blk_run_queue and a request_fn is avoided
using QUEUE_FLAG_REENTER. But this does not help if the scsi stack
tries to flush the starved_list of a scsi_host.
Limit recursion depth when flushing the starved_list
of a scsi_host.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
A recent drivers base commit:
3e95637a48
Caused the bus to be added to dev_printk, so now our SCSI inquiry short
messages print like this:
scsiscsi 2:0:0:0: Direct access IBM-ESXS ST973401SS B519 PQ: 0 ANSI: 5
Just remove the "scsi" from the sdev_printk to compensate.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Replace scsi_device_types array API with scsi_device_type function API.
Gets rid of a lot of common code, as well as being easier to use.
- Add the new device types in SPC4 r05a, and rename some of the older ones.
- Reformat the printing of inquiry data; now fits on one line and
includes PQ.
I think I've addressed all the feedback from the previous versions. My
current test box prints:
scsi 2:0:1:0: Direct access HP 18.2G ATLAS10K3_18_SCA HP05 PQ: 0 ANSI: 2
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix up a logic error in the checking for valid sense data.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The ipr driver currently translates adapter recovered errors
to DID_ERROR. This patch fixes this to translate these
errors to success instead.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add definitions for some SAS error codes that can be
logged by ipr SAS adapters.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add some hardware defined types for SATA. This is required
by future patches to add SATA support to ipr.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add host_supported_speeds, host_maxframe_size, host_speed, host_fabric_name,
host_port_type, host_port_state, and host_symbolic_name transport attributes
to fusion fibre channel.
Signed-off-by: Michael Reed <mdr@sgi.com>
Acked-by: Moore, Eric <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If you examine the queue_work() routine you'll see that it returns
1 on success, 0 if the work is already queued.
This patch corrects the source code documentation for the
scsi_queue_work function.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The PCI ID table in the DAC960 driver conflicts with some devices
that use the ipr driver. All ipr adapters that use this chip
have an IBM subvendor ID and all DAC960 adapters that use this
chip have a Mylex subvendor id.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove a lot of duplicate #defines from the advansys driver,
and make them look like PCI IDs as defined elsewhere in the kernel.
Also add a module table so that it automatically gets picked up
by tools relying on modinfo output (like say, distro installers).
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The sysfs files in arcmsr are non-standard in that they aren't simple
filename value pairs, the values actually contain preceeding text which
would have to be parsed. The idea of sysfs files is that the file name
is the description and the contents is a simple value.
Fix up arcmsr to conform to this standard.
Acked-By: Erich Chen <erich@areca.com.tw>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove sysfs_remove_bin_file() return-value checking from the areca driver.
There's nothing a driver can do if sysfs file removal fails, so we'll soon be
changing sysfs_remove_bin_file() to internally print a diagnostic and to
return void.
Cc: Erich Chen <erich@areca.com.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
arcmsr is a driver for the Areca Raid controller, a host based RAID
subsystem that speaks SCSI at the firmware level.
This patch is quite a clean up over the initial submission with
contributions from:
Randy Dunlap <rdunlap@xenotime.net>
Christoph Hellwig <hch@lst.de>
Matthew Wilcox <matthew@wil.cx>
Adrian Bunk <bunk@stusta.de>
Signed-off-by: Erich Chen <erich@areca.com.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This takes advantage of the sas class backlink function to show which
port on an expander is used to communicate with the parent.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adds support for change_queue_depth so that device
queue depth can be changed at runtime through sysfs.
Signed-off-by: <brking@charter.net>
Acked-by: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This fixes three drivers to compile again after my patch that removes
the data_cmnd member from struct scsi_cmnd.
The fas216 change is trivial, it should have been using ->cmnd all the
time.
NCR53C9 (which seem to be mostly duplicate driver with esp.c!) is doing
something odd, it should only have looked at ->cmnd before not the saved
copy that is kept for the error handlers sake. Note that it really
should deal with the sync setting themselves but use the generic domain
validation code that get this right - but that's for later let's push
this simple compile fix for now.
And sorry for the late fix for this, I have been busy with OLS and
associated activities last week.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SCSI] esp: Fix build.
[SPARC]: Fix SA_STATIC_ALLOC value.
[SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.
The patch below moves the cpu hotplugging higher up in the cpufreq
layering; this is needed to avoid recursive taking of the cpu hotplug
lock and to otherwise detangle the mess.
The new rules are:
1. you must do lock_cpu_hotplug() around the following functions:
__cpufreq_driver_target
__cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only)
__cpufreq_set_policy
2. governer methods (.governer) must NOT take the lock_cpu_hotplug()
lock in any way; they are called with the lock taken already
3. if your governer spawns a thread that does things, like calling
__cpufreq_driver_target, your thread must honor rule #1.
4. the policy lock and other cpufreq internal locks nest within
the lock_cpu_hotplug() lock.
I'm not entirely happy about how the __cpufreq_governor rule ended up
(conditional locking rule depending on the argument) but basically all
callers pass this as a constant so it's not too horrible.
The patch also removes the cpufreq_governor() function since during the
locking audit it turned out to be entirely unused (so no need to fix it)
The patch works on my testbox, but it could use more testing
(otoh... it can't be much worse than the current code)
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Handle dev_alloc_skb() failures when initializing the RX rings.
Without proper handling, the driver will crash when using a partial
ring.
Thanks to Stephane Doyon <sdoyon@max-t.com> for reporting the bug and
providing the initial patch.
Howie Xu <howie@vmware.com> also reported the same issue.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tg3_restart_hw() to handle failures when re-initializing the
device.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to postpone the queue startup until after the softirq
handler has actually finished some requests, otherwise we could
be racing with cciss_softirq_done() and not actually restart
the queue handling.
Signed-off-by: Jens Axboe <axboe@suse.de>
The data_cmd[] member got deleted, so do not use it any more. Scsi
commands do not have their ->cmd[] overwritten temporary to probe for
status after an error before retrying.
Signed-off-by: David S. Miller <davem@davemloft.net>
The Broadcom dongles with HID proxy support actually support SCO over
HCI if the SCO buffer size values are corrected. So instead of disabling
the SCO support, mark this dongle with the quirk for the Bluetooth core
to correct the wrong buffer size values.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch disables the ISOC transfers for another broken RTX Telecom
based USB dongle. Starting the USB ISOC transfers only ends in a burst
of error messages for invalid SCO packets on connection handle 0.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Belkin F8T012 and F8T013 devices are both based on a Bluetooth chip
from Broadcom and their SCO buffer size values are wrong. The Bluetooth
core should correct these values.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The SCO buffer size values on IBM/Lenovo ThinkPad laptops with a
Bluetooth chip from Broadcom are wrong. The USB Bluetooth driver
has to set a quirk to correct the SCO buffer size values.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Read the max_cmds value from the response to the QUERY_FW command
before printing out the value, so that the real value goes into the
debug output.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The neighbour ha field may get updated without destroying the
neighbour. In this case, the ha field gets out of sync with the
address handle stored in ipoib_neigh->ah, with the result that
the ah field would point to an incorrect path, resulting in all
packets being lost.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Need to set mcast->ah before debug code dereferences it.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Validate MADs sent by userspace clients for spec compliance with
C13-18.1.1 (prevent duplicate requests and responses sent on the
same port). Without this, RMPP transactions get aborted because
of duplicate packets.
This patch is similar to that provided by Jack Morgenstein.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_skip_sge() doesn't exactly duplicate the side effects of
ipath_copy_sge() if num_sge > 1 since it doesn't decrement ss->num_sge.
This could result in the sg_list being accessed out of bounds.
Since ipath_skip_sge() is almost always called with num_sge == 1,
the original "optimization" is almost never used.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
I am still working on a proposal to remove the phys_to_virt() calls
in the ib_ipath driver. In the mean time, this patch allows SRP
to work by fixing the R_Key check and conversion from IB address
to kernel virtual address. It also returns the correct page size
for FMRs.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a problem where certain error packets are passed
to the InfiniBand layer for processing even though the packet
actually was received with an error.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Mem-free HCAs always keep one spare SRQ WQE, so the SRQ limit cannot
be set beyond srq->max - 1.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Lockdep warns because uverbs is trying to take uobj->mutex when it
already holds that lock. This is because there are really multiple
types of uobjs even though all of their locks are initialized in
common code.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the
PD's read lock in their error paths, which lead to deadlock when
destroying the PD.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Shutting down the ondemand policy was fraught with potential
problems, causing issues for SMP suspend (which wants to hot-
unplug) all but the last CPU.
This should fix at least the worst problems (divide-by-zero
and infinite wait for the workqueue to shut down).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
[TIPC]: Removing useless casts
[IPV4]: Fix nexthop realm dumping for multipath routes
[DUMMY]: Avoid an oops when dummy_init_one() failed
[IFB] After ifb_init_one() failed, i is increased. Decrease
[NET]: Fix reversed error test in netif_tx_trylock
[MAINTAINERS]: Mark LAPB as Oprhan.
[NET]: Conversions from kmalloc+memset to k(z|c)alloc.
[NET]: sun happymeal, little pci cleanup
[IrDA]: Use alloc_skb() in IrDA TX path
[I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine
[I/OAT]: net/core/user_dma.c should #include <net/netdma.h>
[SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
[SCTP]: Set chunk->data_accepted only if we are going to accept it.
[SCTP]: Verify all the paths to a peer via heartbeat before using them.
[SCTP]: Unhash the endpoint in sctp_endpoint_free().
[SCTP]: Check for NULL arg to sctp_bucket_destroy().
[PKT_SCHED] netem: Fix slab corruption with netem (2nd try)
[WAN]: Converted synclink drivers to use netif_carrier_*()
[WAN]: Cosmetic changes to N2 and C101 drivers
[WAN]: Added missing netif_dormant_off() to generic HDLC
...
It before entering in the loop for freeing the other ifb devices.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use pci_register_driver instead of pci_module_init. Use PCI_DEVICE macro.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changes pci_module_init() to pci_register_driver().
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: David S. Miller <davem@davemloft.net>