Commit graph

132072 commits

Author SHA1 Message Date
Tony Battersby
c6517b7942 [SCSI] sg: fix races during device removal
sg has the following problems related to device removal:

* opening a sg fd races with removing a device
* closing a sg fd races with removing a device
* /proc/scsi/sg/* access races with removing a device
* command completion races with removing a device
* command completion races with closing a sg fd
* can rmmod sg with active commands

These problems can cause kernel oopses, memory-use-after-free, or
double-free errors.  This patch fixes these problems by using krefs
to manage the lifetime of sg_device and sg_fd.

Each command submitted to the midlevel holds a reference to sg_fd
until the completion callback.  This ensures that sg_fd doesn't go
away if the fd is closed with commands still outstanding.

sg_fd gets the reference of sg_device (with scsi_device) and also
makes sure that the sg module doesn't go away.

/proc/scsi/sg/* functions don't play nicely with krefs because they
give information about sg_fds which have been closed but not yet
freed due to still having outstanding commands and sg_devices which
have been removed but not yet freed due to still being referenced
by one or more sg_fds.  To deal with this safely without removing
functionality, /proc functions now access sg_device and sg_fd while
holding a lock instead of using kref_get()/kref_put().

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:04 -05:00
Ed Lin - PTU
bd5cd9cdc5 [SCSI] stex: Version update
Update version to 4.6.0000.1

Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:04 -05:00
Ed Lin - PTU
7cfe99a526 [SCSI] stex: Small fixes
Some small fixes, including:
- add data direction in req_msg because new firmware version
  may require this (backward compatible)
- change internal timeout value
- change judgment of type st_vsc1
- blank line handling, etc.

Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:04 -05:00
Ed Lin - PTU
e8a091b36c [SCSI] stex: Fix for controller type st_yosemite
This is the fix for controller type st_yosemite, including
- max_lun is 256 (backward compatible)
- remove unneeded special handling of INQUIRY
- remove unnecessary listing of sub device ids

Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:04 -05:00
Ed Lin - PTU
62e5b3d850 [SCSI] stex: Add new device id
Add new device id for controller type st_seq.

Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:03 -05:00
Ed Lin - PTU
dd48ebf7ca [SCSI] stex: Fix for potential invalid response
The interrupt routine is good for normal cases. However, if the firmware
is abnormal and returns an invalid response, the driver may reuse a
ccb structure that has already been handled. This may cause problem.
Fix this by setting the req member to NULL. Next time we know the
response is invalid and handle accordingly if req is NULL.

Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:03 -05:00
James Bottomley
a4976d6886 [SCSI] osst: Remove SUGGEST flags
Fix up remaining bit of SUGGEST flag removal done by patch

commit 0f10274300857d98ea5ea4c800c561a9ad9ac89f
Author: Martin K. Petersen <martin.petersen@oracle.com>
Date:   Sun Jan 4 03:14:11 2009 -0500

    [SCSI] Remove SUGGEST flags

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:03 -05:00
Martin K. Petersen
1c9fbafc8c [SCSI] Remove SUGGEST flags
The SUGGEST_* flags in the SCSI command result have been out of fashion
for a while and we don't actually use them in the error handling.
Remove the remaining occurrences.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:02 -05:00
Wayne Boyer
5a9ef25b14 [SCSI] ipr: add MSI support
Enable MSI if available/supported.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:58 -05:00
Christof Schmitt
951948a397 [SCSI] scsi_transport_fc: Add missing parenthesis to Point-To-Point description
Fix typo by adding closing parenthesis.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:57 -05:00
Aaro Koskinen
49799fee82 [SCSI] sym53c8xx: Keep transfer negotiations valid
(The patch updated based on testing and comments from Tony Battersby.)

Change the sym53c8xx_2 driver negotiation logic so that the driver will
tolerate better device removals. Negotiation message(s) will be sent
with every INQUIRY and REQUEST SENSE command, and whenever there is a
change in goals or when the device reports check condition.

The patch was made specifically to address the case where you hotswap
the disk using remove-single-device/add-single-device commands through
/proc/scsi/scsi. Without the patch the driver keeps using old transfer
parameters even though the target is reset and reports check condition,
so the data transfer of the very first INQUIRY will fail.

Signed-off-by: Aaro Koskinen <Aaro.Koskinen@nokia.com>
Tested-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:57 -05:00
Robert Love
5ef074161b [SCSI] Improve SCSI_LOGGING Kconfig entry
The Kconfig entry for SCSI_LOGGING refers the reader to
drivers/scsi/scsi.c, but I didn't find any useful information
there. There is certainly logging code in that file, but the
logging types and logging levels are described in
drivers/scsi/scsi_logging.h.

Also, the procfs file referred to in the section is incorrect.
It should be /proc/sys/dev/scsi/logging_level and not
/proc/scsi/scsi.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:57 -05:00
Hannes Reinecke
0762a4824d [SCSI] Check for deleted device in scsi_device_online()
scsi_device_online() is not just a negation of SDEV_OFFLINE,
also devices in state SDEV_DEL are actually offline.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:56 -05:00
Jan Engelhardt
71fa742182 [SCSI] lpfc: constify virtual function tables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:56 -05:00
FUJITA Tomonori
6e7490c73d [SCSI] libfc: fix compile warning
I got the following warnings on IA64:

drivers/scsi/libfc/fc_lport.c: In function 'fc_lport_recv_flogi_req':
drivers/scsi/libfc/fc_lport.c:788: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'u64'
drivers/scsi/libfc/fc_lport.c:792: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'u64'
scsi/libfc/fc_rport.c: In function 'fc_rport_recv_plogi_req':
/home/fujita/git/linux-2.6/drivers/scsi/libfc/fc_rport.c:968: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:56 -05:00
Randy Dunlap
d943aeebc5 [SCSI] libfc: needs CRC32
libfc uses crc32 functions, so cause it to be built
via select:

drivers/built-in.o: In function `fc_frame_crc_check':
(.text+0x75dae): undefined reference to `crc32_le'
drivers/built-in.o: In function `fc_fcp_recv':
fc_fcp.c:(.text+0x7b919): undefined reference to `crc32_le'
fc_fcp.c:(.text+0x7b9d5): undefined reference to `crc32_le'
fc_fcp.c:(.text+0x7ba54): undefined reference to `crc32_le'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:56 -05:00
Randy Dunlap
d0ace3c5ee [SCSI] scsi_debug: needs CRC_T10DIF
Fix scsi_debug build error:

drivers/built-in.o: In function `resp_read':
scsi_debug.c:(.text+0x21379a): undefined reference to `crc_t10dif'
drivers/built-in.o: In function `resp_write':
scsi_debug.c:(.text+0x213fca): undefined reference to `crc_t10dif'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:55 -05:00
Martin K. Petersen
c6a4428741 [SCSI] scsi_debug: DIF/DIX support
This patch adds support for DIX and DIF in scsi_debug.  A separate
buffer is allocated for the protection information.

 - The dix parameter indicates whether the controller supports DIX
   (protection information DMA)

 - The dif parameter indicates whether the simulated storage device
   supports DIF

 - The guard parameter switches between T10 CRC(0) and IP checksum(1)

 - The ato parameter indicates whether the application tag is owned by
   the disk(0) or the OS(1)

 - DIF and DIX errors can be triggered using the scsi_debug_opts mask

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:55 -05:00
Randy Dunlap
59d3270326 [SCSI] scsi_sysfs: delete extra kernel-doc
Warning(linux-2.6.28-git13//drivers/scsi/scsi_sysfs.c:1049): Excess function parameter 'dev' description in 'scsi_sysfs_add_host'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:55 -05:00
Matthew Wilcox
40c3460f3c [SCSI] ses: Use new scsi VPD helper
SES had its own code to retrieve VPD from devices; convert it to use the
new scsi_get_vpd_page helper.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:54 -05:00
Matthew Wilcox
881a256d84 [SCSI] Add VPD helper
Based on prior work by Martin Petersen and James Bottomley, this patch
adds a generic helper for retrieving VPD pages from SCSI devices.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:57:54 -05:00
Andrew Vasquez
5fa0ae1982 [SCSI] qla2xxx: Update version number to 8.03.00-k4.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:28:03 -05:00
Andrew Vasquez
ca42f2b5f5 [SCSI] qla2xxx: Correct overwrite of pre-assigned init-control-block structure size.
The value is already pre-assigned prior to the qla2x00_mem_alloc().

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:27:55 -05:00
Andrew Vasquez
c6b2fca820 [SCSI] qla2xxx: Correct truncation in return-code status checking.
QLA_* return codes are 'int' in size.  There were still several
legacy check-points which assumed a return-code width of 8-bits.
This could cause incorrect assumptions of 'good' status if a
return of QLA_FUNCTION_TIMEOUT.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:27:54 -05:00
Anirban Chakraborty
ee546b6e04 [SCSI] qla2xxx: Correct vport delete bug.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:27:54 -05:00
Lalit Chandivade
605aa2bcd5 [SCSI] qla2xxx: Use correct value for max vport in LOOP topology.
Use minimum value for max vport during firmware initialization in LOOP
topology. Using max vport value from get resource count in LOOP topology
causes firmware initialization failure.

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:27:54 -05:00
Andrew Vasquez
6431c5dc5e [SCSI] qla2xxx: Correct address range checking for option-rom updates.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:27:53 -05:00
Robert Love
4469c195da [SCSI] fcoe: Change fcoe receive thread nice value from 19 (lowest priority) to -20
This change makes the fcoe Rx threads have the same nice value
as lpfc and qla2xxx Rx threads.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:10:02 -05:00
Chris Leech
55c8bafba5 [SCSI] fcoe: fix handling of pending queue, prevent out of order frames (v3)
In fcoe_check_wait_queue() the queue length could temporarily drop to 0,
before the last frame was successfully sent.  This resulted in out of order
data frames within a single sequence, leading to IO timeout errors.

This builds on the approach from Vasu Dev to only fix the queue management in
fcoe_check_wait_queue, where my first patch added locking to the transmit
path even when the pending queue was not in use.

This patch continues to use fcoe_pending_queue.qlen instead of introducing a
new length counter, but takes precautions to ensure it never drops to 0 before
the final frame in the queue has successfully been passed to the netdev qdisc
layer.  It also includes some cleanup of fcoe_check_wait_queue and removes the
fcoe_insert_wait_queue(_head) wrapper functions.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:09:40 -05:00
Vasu Dev
c826a31457 [SCSI] fcoe: Out of order tx frames was causing several check condition SCSI status
frames followed by these errors in log.

	[sdp] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
	[sdp] Sense Key : Aborted Command [current]
	[sdp] Add. Sense: Data phase error

This was causing some test apps to exit due to write failure under heavy
load.

This was due to a race around adding and removing tx frame skb in
fcoe_pending_queue, Chris Leech helped me to find that brief unlocking
period when pulling skb from fcoe_pending_queue in various contexts
(fcoe_watchdog and fcoe_xmit) and then adding skb back into fcoe_pending_queue
up on a failed fcoe_start_io could change skb/tx frame order in
fcoe_pending_queue. Thanks Chris.

This patch allows only single context to pull skb from fcoe_pending_queue
at any time to prevent above described ordering issue/race by use of
fcoe_pending_queue_active flag.

This patch simplified fcoe_watchdog with modified fcoe_check_wait_queue by
use of FCOE_LOW_QUEUE_DEPTH instead previously used several conditionals
to clear and set lp->qfull.

I think FCOE_MAX_QUEUE_DEPTH with FCOE_LOW_QUEUE_DEPTH  will work better
in re/setting lp->qfull and these could be fine tuned for performance.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:09:21 -05:00
Roel Kluin
e904158159 [SCSI] fcoe: fix kfree(skb)
Use kfree_skb instead of kfree for struct sk_buff pointers.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:09:01 -05:00
Yi Zou
74846bf85e [SCSI] fcoe: ETH_P_8021Q is already in if_ether and fcoe is not using it anyway
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:07:09 -05:00
Yi Zou
422819cfa3 [SCSI] libfc: do not change the fh_rx_id of a recevied frame
We shouldn't be altering inbound frames.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:06:36 -05:00
Robert Love
03ec862dff [SCSI] fcoe: Correct fcoe_transports initialization vs. registration
The registration function shouldn't initialize the mutex or
list head. The fcoe SW transport should initialize itself
before registering.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:06:17 -05:00
Robert Love
a468f328ad [SCSI] fcoe: Use setup_timer() and mod_timer()
Use helper functions for watchdog timer setup.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:05:57 -05:00
Robert Love
fc47ff6b1b [SCSI] libfc, fcoe: Remove unnecessary cast by removing inline wrapper
Comment from "Andrew Morton <akpm@linux-foundation.org>"

> +{
> +     return (struct fcoe_softc *)lport_priv(lp);

unneeded/undesirable cast of void*.  There are probably zillions of
instances of this - there always are.

This whole inline function was unnecessary. The FCoE layer knows
that it's data structure is stored in the lport private data, it
can just access it from lport_priv().

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:05:35 -05:00
Robert Love
b2ab99c9a3 [SCSI] libfc, fcoe: Cleanup function formatting and minor typos
1) There were a few functions with a strange layout, i.e. all
   arguments on the second line, when not necessary.

   Where ever possible I moved the return value to the same line
   as the function name. However, when the line was too long
   to have a single argument on the same line I moved the
   return value to above line. For example:

   <short return> <function name>(<arg 1>, <arg2>)

   and

   <very long return value>
   <function name>(<arg1>,
		   <arg2>)

2) Removed one extra whitespace line

3) Fixed two typos

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:05:09 -05:00
Robert Love
34f42a070f [SCSI] libfc, fcoe: Fix kerneldoc comments
1) Added '()' for function names in kerneldoc comments

2) Changed comment bookends from '**/' to '*/'. The comment on the the
   mailing list was that '**/' "is consistently unconventional.  Not
   wrong, just odd." The Documentation/kernel-doc-nano-HOWTO.txt
   states that kerneldoc comment blocks should end with '**/' but most
   (if not all) instance I found under drivers/scsi/ were only using
   the '*/' so I converted to that style.

3) Removed incorrect linebreaks in kerneldoc comments where found

4) Removed a few unnecessary blank comment lines in kerneldoc comment
   blocks

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-10 09:04:40 -05:00
Robert Love
0ae4d4ae47 [SCSI] libfc: Cleanup libfc_function_template comments
Made the comments more like the comments for struct scsi_host_template.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:45:32 -06:00
Robert Love
efaf5c085d [SCSI] libfc: check for err when recv and state is incorrect
If we've just created an interface and the an rport is
logging in we may have a request on the wire (say PRLI).
If we destroy the interface, we'll go through each rport
on the disc->rports list and set each rport's state to NONE.
Then the lport will reset the EM. The EM reset will send a
CLOSED event to the prli_resp() handler which will notice
that the state != PRLI. In this case it frees the frame
pointer, decrements the refcount and unlocks the rport.

The problem is that there isn't a frame in this case. It's
just a pointer with an embedded error code. The free causes
an Oops.

This patch moves the error checking to be before the state
checking.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:44:36 -06:00
Robert Love
d3b33327ca [SCSI] libfc: rename rp to rdata in fc_disc_new_target()
Just rename the variable as per our naming convention.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:41:37 -06:00
Robert Love
23f11f9076 [SCSI] libfc: correct RPORT_TO_PRIV usage
We only need to use this macro when assigning a value to
rport->dd_data. All other accesses should just use dd_data.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:41:16 -06:00
Robert Love
5101ff99f5 [SCSI] libfc: Don't violate transport template for rogue port creation
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:41:01 -06:00
Steve Ma
f7db2c150c [SCSI] libfc: exch mgr is freed while lport still retrying sequences
When a sequence cannot be delivered to the target, the local
port will schedule retries, While this process is in progress,
if we destroy the FCoE interface, the fcoe_sw_destroy routine is
entered, and the fc_exch_mgr_free(lp->emp) is called.  Thus
if fc_exch_alloc() is called when retrying the sequence,
the mempool_alloc() will fail to allocate the exchange because
the mempool of the exchange manager has already been released.
This patch is to cancel any pending retry work of the local
port before we start to destroy the interface.

Also, when resetting the local port, we should also stop the
scheduled pending retries.

Signed-off-by: Steve Ma <steve.ma@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:40:45 -06:00
Vasu Dev
26d9cab558 [SCSI] libfc: fixed a read IO data integrity issue when a IO data frame lost
The fc_fcp_complete_locked detected data underrun in this case and set
the FC_DATA_UNDRUN but that was ignored by fc_io_compl for all cases
including read underrun.

Added code to not to ignore FC_DATA_UNDRUN for read IO and instead
suggested scsi-ml to retry cmd to  recover from lost data frame.

Not sure if it is okay to ignore FC_DATA_UNDRUN for other case, so let
code as is for other cases but removed or-ing with zero valued fsp->cdb_status
for those cases.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:40:06 -06:00
Chris Leech
6755db1cd4 [SCSI] libfc: rport retry on LS_RJT from certain ELS
This allows any rport ELS to retry on LS_RJT.

The rport error handling would only retry on resource allocation failures
and exchange timeouts.  I have a target that will occasionally reject PLOGI
when we do a quick LOGO/PLOGI.  When a critical ELS was rejected, libfc would
fail silently leaving the rport in a dead state.

The retry count and delay are managed by fc_rport_error_retry.  If the retry
count is exceeded fc_rport_error will be called.  When retrying is not the
correct course of action, fc_rport_error can be called directly.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:39:34 -06:00
Vasu Dev
bc0e17f691 [SCSI] libfc, fcoe: fixed locking issues with lport->lp_mutex around lport->link_status
The fcoe_xmit could call fc_pause in case the pending skb queue len is larger
than FCOE_MAX_QUEUE_DEPTH, the fc_pause was trying to grab lport->lp_muex to
change lport->link_status and that had these issues :-

1. The fcoe_xmit was getting called with bh disabled, thus causing
"BUG: scheduling while atomic" when grabbing lport->lp_muex with bh disabled.

2. fc_linkup and fc_linkdown function calls lport_enter function with
lport->lp_mutex held and these enter function in turn calls fcoe_xmit to send
lport related FC frame, e.g. fc_linkup => fc_lport_enter_flogi to send flogi
req. In this case grabbing the same lport->lp_mutex again in fc_puase from
fcoe_xmit would cause deadlock.

The lport->lp_mutex was used for setting FC_PAUSE in fcoe_xmit path but
FC_PAUSE bit was not used anywhere beside just setting and clear this
bit in lport->link_status, instead used a separate field qfull in fc_lport
to eliminate need for lport->lp_mutex to track pending queue full condition
and in turn avoid above described two locking issues.

Also added check for lp->qfull in fc_fcp_lport_queue_ready to trigger
SCSI_MLQUEUE_HOST_BUSY when lp->qfull is set to prevent more scsi-ml cmds
while lp->qfull is set.

This patch eliminated FC_LINK_UP and FC_PAUSE and instead used dedicated
fields in fc_lport for this, this simplified all related conditional
code.

Also removed fc_pause and fc_unpause functions and instead used newly added
lport->qfull directly in fcoe.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:37:49 -06:00
Vasu Dev
a7e84f2b83 [SCSI] libfc: fixed a soft lockup issue in fc_exch_recv_abts
The fc_seq_start_next grabs ep->ex_lock but this lock was already held here,
so instead called fc_seq_start_next_locked to avoid soft lockup.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:37:23 -06:00
Vasu Dev
78342da368 [SCSI] libfc: handle RRQ exch timeout
Cleanup exchange held due to RRQ when RRQ exch times out, in this case the
ABTS is already done causing RRQ req therefore proceeding with cleanup in
fc_exch_rrq_resp should be okay to restore exch resource.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:36:56 -06:00
Abhijeet Joglekar
571f824c3c [SCSI] libfc: when rport goes away (re-plogi), clean up exchanges to/from rport
When a rport goes away, libFC does a plogi which will reset exchanges
    at the rport. Clean exchanges at our end, both in transport and libFC.
    If transport hooks into exch_mgr_reset, it will call back into
    fc_exch_mgr_reset() to clean up libFC exchanges.

Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-06 15:36:28 -06:00