Commit graph

133 commits

Author SHA1 Message Date
Swen Schillig a4623c467f [SCSI] zfcp: Improve request allocation through mempools
Remove the special case for NO_QTCB requests and optimize the
mempool and cache processing for fsfreqs. Especially use seperate
mempools for the zfcp_fsf_req and zfcp_qtcb structs.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-05 08:49:20 -05:00
Swen Schillig 058b864789 [SCSI] zfcp: Replace fsf_req wait_queue with completion
The combination wait_queue/wakeup in conjunction with the flag
ZFCP_STATUS_FSFREQ_COMPLETED to signal the completion of an fsfreq
was not race-safe and can be better solved by a completion.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-05 08:49:18 -05:00
Swen Schillig bd63eaf4b8 [SCSI] zfcp: fix layering oddities between zfcp_fsf and zfcp_qdio
There is no need for the QDIO layer to have knowledge or do things
wich are done better by the FSF layer and vice versa.  Straighten a
few things to improve vividness.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-05 08:49:16 -05:00
Christof Schmitt 44f09f7376 [SCSI] zfcp: Remove useless assignment
Using a bitwise OR to not set anything at all is pointless so remove
the useless statement.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-05 08:49:12 -05:00
Christof Schmitt dcd20e2316 [SCSI] zfcp: Only collect SCSI debug data for matching trace levels
The default trace level is to only trace failed SCSI commands. Thus it
is not necessary to collect trace data for most SCSI commands since it
will be thrown away later. Restructure the SCSI trace infrastructure
to first check the trace level in a inline function and only do the
expensive data collection for matching trace levels.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-05 08:49:08 -05:00
Swen Schillig 27f492ccec [SCSI] zfcp: Fix wka port processing
Under certain conditions it is possible that a WKA port ist not opened
within the expected timeframe of half a second. In this situation
the WKA port remains in the state OPENING preventing any succeding
request to open the port. This led to unrecoverable remote ports.
Fixing this by always setting an appropriate WKA port status before
leaving the function and removing the timeout value here since it's
not needed here because the general timeout processing would deal
with it if required.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-07-30 08:49:58 -05:00
Christof Schmitt cbf1ed0264 [SCSI] zfcp: Recover from stalled outbound queue
Depending on interruptions on some storage systems, the complete
channel can stall which looks like an outbound queue stall to Linux.
When trying to acquire a free SBAL for a non-SCSI command, zfcp waits
for 5 seconds for a free slot to appear. This is the right place to
detect a queue stall: If the wait times out, we assume a stalled queue
and try to recover this.

The overall strategy should be to trigger the erp from specific
events, and not try an overall escalation from one failed port to a
full-blown queue recovery. If we manage to send a command, the status
codes for this command or a timeout will trigger the right follow-on
actions.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-07-30 08:49:57 -05:00
Christof Schmitt 9072df4dc6 [SCSI] zfcp: Use -EIO for SBAL allocation failures
-ENOMEM is for memory allocation problems, -EIO for queue/SBAL
allocation problems.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-07-30 08:49:56 -05:00
Christof Schmitt 426f6059b0 [SCSI] zfcp: Use unchained mode for small ct and els requests
The ELS ADISC and the GID_PN requests sent from zfcp fit into
unchained FSF requests. Change the FSF allocation logic to use
unchained requests whenever possible where everything fits in one
SBAL. This avoids acquiring more SBALs than necessary, especially
during zfcp recovery when things might be stalled.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-07-30 08:49:56 -05:00
Christof Schmitt 1e9b16430f [SCSI] zfcp: Return -ENOMEM for allocation failures in zfcp_fsf
When a fsf_req or a qtcb cannot be allocated return -ENOMEM instead of
-EIO.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-07-30 08:49:55 -05:00
Swen Schillig dfb3cf00e4 [SCSI] zfcp: Fix invalid command order
We should not modify the port status after triggering an ERP action
for the port. It is not guaranteed which status is finally active
when the ERP action is performed. This can lead to situations which
are unwanted and hard to debug in case of a failure.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-07-30 08:49:54 -05:00
Christof Schmitt dc577d554a [SCSI] zfcp: Update FC pass-through support
Don't access the block layer request, get the payload length instead
from the FC job. Simplify access to the zfcp_port, only the d_id is
required, if the port is no longer accessed later. This is possible
when the els_handler does not access the port pointer from the ELS
request.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-12 14:20:06 -05:00
Christof Schmitt 6fcf41d1d8 [SCSI] zfcp: Keep ccw device and model id in zfcp_ccw.c
Keep the information about the device and model id in zfcp_ccw. This
requires an additional helper function to check for the privileged
cfdc subchannel, but it allows the removal of the redundant defines
from the zfcp_def header file.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:16 -05:00
Martin Petermann a17c585564 [SCSI] zfcp: Increase ref counter for port open requests
In rare cases, open port request might timeout, erp calls
zfcp_port_put, port gets dequeued. Now, the late returning (or
dismissed) fsf-port-open calls the fsf_port_open_handler that tries to
reference the port data structure leading to a kernel oops.

Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:15 -05:00
Christof Schmitt dceab655d9 [SCSI] zfcp: Add comments to switch/case fallthroughs
Add comments where there is a deliberate fall through in switch/case
statements. This makes some code checkers happy and makes it clear
that there is no missing break statement.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:15 -05:00
Christof Schmitt bc90c8632f [SCSI] zfcp: Remove unnecessary default case and assignments
enum dma_data_direction only has the 4 values DMA_BIDIRECTIONAL,
DMA_TO_DEVICE, DMA_FROM_DEVICE and DMA_NONE. No need to have the
default case. While changing this, setup sbtype in one place to make
sparse happy.

The default value of retval is already -EIO, so remove the
additional assignment for these two cases.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:15 -05:00
Christof Schmitt 70932935b6 [SCSI] zfcp: Fix oops when port disappears
The zfcp_port might have been removed, while the FC fast_io_fail timer
is still running and could trigger the terminate_rport_io callback.
Set the pointer to the zfcp_port to NULL and check accordingly
before using it.

Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:37 -05:00
Martin Petermann 7001f0c486 [SCSI] zfcp: revert previous patch for sbal counting
The current sbal counting can be wrong if a fsf request is
waiting for free sbals and at the same time qdio request queue
is shutdown and re-opened. Revering a previous patch fixes this
issue.

Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:34 -05:00
Christof Schmitt f7306bf615 [SCSI] zfcp: Let actcli handle control file errors
Error codes specific to the control file requests are evaluated by the
actcli tool, so don't report -ENXIO for those. Generic problems are
still checked for outside the command specific handler.

Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:31 -05:00
Christof Schmitt ada81b748b [SCSI] zfcp: Dont call zfcp_fsf_req_free on NULL pointer
Fix problem that zfcp_fsf_exchange_config_data_sync and
zfcp_fsf_exchange_config_data_sync could try to call zfcp_fsf_req_free
with a NULL pointer.

Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:25 -05:00
Martin Petermann 135ea137e3 [SCSI] zfcp: Avoid referencing freed memory in req send
Avoid referencing a fsf request after sending it in fcp_fsf_req_send,
it might have already completed and deallocated.

Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:15 -05:00
Christof Schmitt 0282985da5 [SCSI] zfcp: Report fc_host_port_type as NPIV
Report the fc_host_port_type as FC_PORTTYPE_NPIV when the subchannel
is running in NPIV mode. This allows to see the correct type with
lsscsi -H -t --list

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:22 -05:00
Christof Schmitt a2fa0aede0 [SCSI] zfcp: Block FC transport rports early on errors
Use the I/O blocking mechanism in the FC transport class to allow
faster failovers for multipathing:
- Call fc_remote_port_delete early to set the rport to BLOCKED.
- Check the rport status in queuecommand with fc_remote_portchkready
  to no longer accept new I/O for this port and fail the I/O with the
  appropriate scsi_cmnd result.
- Implement the terminate_rport_io handler to abort all pending I/O
  requests
- Return SCSI commands with DID_TRANSPORT_DISRUPTED while erp is
  running.
- When updating the remote port status, check for late changes and
  update the remote ports status accordingly.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:21 -05:00
Swen Schillig 2409549068 [SCSI] zfcp: incorrect reaction on incoming RSCN
After an error condition resolved a remote storage port was never
re-opened. The incoming RSCN was not processed accordingly due
to a misinterpreted status flag / return value combination.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:21 -05:00
Swen Schillig 5ffd51a5e4 [SCSI] zfcp: replace current ERP logging with a more convenient version
The current number based id ERP logging is replaced by a string
based tag version. The benefit is an easier location of the code in
question and the removal of the lengthy array referencing the
individual messages.
The string (7 bytes) based version does not use more space since those
bytes were "used" anyway due to the alignment of the structure.
The encoding of the 7 byte string is as follows
        [0-1] = filename
        [2-5] = task/function
        [6]   = section
Due to the character of this string (fixed length) a string
termination is not required here.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:20 -05:00
Swen Schillig 2128391632 [SCSI] zfcp: remove undefined subtype for status read response
The status read response FSF_STATUS_READ_SUB_ERROR_PORT is not
defined in the specs and therefore not valid.
All occurrences are removed from the code.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:20 -05:00
Christof Schmitt 8fdf30d542 [SCSI] zfcp: Send ELS ADISC from workqueue
Issue ELS ADISC requests from workqueue. This allows the link test
request to be sent when the request queue is full due to I/O load for
other remote ports. It also simplifies request queue locking,
zfcp_fsf_send_fcp_command_task is now the only function that has
interrupts disabled from the caller. This is also a prereq for the FC
passthrough support that issues ELS requests from userspace.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:19 -05:00
Christof Schmitt 63caf367e1 [SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp
When the SCSI midlayer is running error recovery, the low-level error
recovery in zfcp could be running and preventing the SCSI midlayer to
issue error recovery requests. To avoid unnecessary error recovery
escalation, wait for the zfcp erp to finish and retry if necessary.

While reworking the SCSI eh handlers, alsa cleanup the code and
simplify the interface from zfcp_scsi to the fsf layer.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:19 -05:00
Christof Schmitt 92cab0d93a [SCSI] zfcp: Wait for free SBALs when possible
For calls from zfcp erp, scsi_eh and sysfs switch the calls issuing
FSF requests to zfcp_fsf_req_sbal_get to wait for free SBALs.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:19 -05:00
Christof Schmitt 52bfb558d2 [SCSI] zfcp: Only increment req_id for successfully issued requests
Only increment the req_id for successfully issued requests. This
avoids some confusion when debugging issued fsf requests.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:19 -05:00
Christof Schmitt 49f0f01c99 [SCSI] zfcp: Simplify latency lock handling
The lock only needs to protect the softirq context called from qdio
against the userspace context called from sysfs. spin_lock and
spin_lock_bh is enough.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:18 -05:00
Christof Schmitt a5b11dda12 [SCSI] zfcp: Remove some port flags
PORT_PHYS_CLOSING is only set and cleared, but not actually used
for status checking.

PORT_INVALID_WWPN is set when the GID_PN request does not return
a d_id for a remote port, e.g. when a remote port has been
unplugged. For this case, the d_id is zero. In the erp we can
check the d_id and use the normal escalation procedure that gives
up after three retries and remove the special case.

PORT_NO_WWPN is unused: Each port in the remote port list has a
valid wwpn. The WKA ports are now tracked outside the port
list. Remove the PORT_NO_WWPN flag, since this is no longer set
for any port.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:18 -05:00
Martin K. Petersen 1c9fbafc8c [SCSI] Remove SUGGEST flags
The SUGGEST_* flags in the SCSI command result have been out of fashion
for a while and we don't actually use them in the error handling.
Remove the remaining occurrences.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:02 -05:00
Christof Schmitt b632ade282 [SCSI] zfcp: Remove unnecessary warning message
Remove a message that was emitted for a port that could not initially
be opened. This is a rare case when the port discovery hits an
initiator port and only confuses the user with an initator port logged
in the message. Remove the whole special case: The failed "open port"
request triggers required follow-up actions anyway.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:38:29 -06:00
Christof Schmitt 39eb7e9aca [SCSI] zfcp: Add support for unchained FSF requests
Add the support to send CT and ELS requests as unchained FSF requests. This is
required for older hardware and was somehow omitted during the cleanup of the
FSF layer. The req_count and resp_count attributes are unused, so remove them
instead of adding a special case for setting them. Also add debug data and a
warning, when the ct request hits a limit.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:38:28 -06:00
Christof Schmitt b98478d71b [SCSI] zfcp: remove DID_DID flag
The port flag DID_DID indicates whether we know the current id of the
port. This is always set in parallel. Since the id 0 is invalid
(because the port id 0 is invalid) we can remove the DID_DID flag:
d_id of 0 indicates an invalid d_id != 0 is a valid one.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:38:28 -06:00
Christof Schmitt dedbc2b3cb [SCSI] zfcp: Simplify SBAL allocation to fix sparse warnings
When waiting for a request claim the SBAL before waiting. This way,
locking before each check of the free counter is not required and
sparse does not emit warnings for the complicated locking scheme.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:36 -06:00
Christof Schmitt 27c3f0a6e4 [SCSI] zfcp: Fix message line break
Move the closing parenthesis before the line break.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:35 -06:00
Christof Schmitt ecf39d4212 [S390] convert zfcp printks to pr_xxx macros.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:27 +01:00
Swen Schillig 0ac55aa90f [SCSI] zfcp: eliminate race between validation and locking
The check of having a valid pointer was performed before the
processing was secured by the lock. Between those two steps the
pointer can turn invalid.  During further processing another value is
used (referenced by the pointer described above) as a function pointer
which is never verified to be valid either, resulting under some
circumstances in an invalid function call.  This patch is fixing both
issues.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:17:50 -06:00
Swen Schillig 633528c304 [SCSI] zfcp: returning an ERR_PTR where a NULL value is expected
Aborting a SCSI cmnd might requrie to send a abort_fsf_cmnd. If the
creation of this fsf_req fails an ERR_PTR is returned where a NULL
value would be expected as an error indicator. This ERR_PTR is
dereferenced as valid fsf_req in succeeding processing leading to
an error.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:17:14 -06:00
Christof Schmitt 1c1cba17a9 [SCSI] zfcp: Fix opening of wka ports
Running two wka_port_get calls in parallel could issue two open_port
requests, overwriting the port handle. Don't issue an open_port
for the state PORT_OPENING, and only read the data from GOOD
responses.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:16:59 -06:00
Christof Schmitt 3765138ae9 [SCSI] zfcp: Fix request list handling in error path
Fix the handling of the request list in the error path:
 - Use irqsave for the lock as in the good path.
 - Before removing the request, check if it is still in the list, a
   call to dismiss_all might have changed the list in between.
 - zfcp_qdio_send does not change the queue counters on failure,
   trying revert something is wrong, so remove this.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:46:39 -05:00
Christof Schmitt 88f2a97787 [SCSI] zfcp: fix mempool usage for status_read requests
When allocating fsf requests without qtcb, store the pointer to the
mempool in the fsf requests for later call to mempool_free. This
codepath is only used by the status_read requests.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:45:07 -05:00
Heiko Carstens 45316a86a6 [SCSI] zfcp: fix req_list_locking.
The per adapter req_list_lock must be held with interrupts disabled, otherwise
we might end up with nice deadlocks as lockdep tells us (see below).

zfcp 0.0.1804: QDIO problem occurred.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.27-rc8-00035-g4a77035-dirty #86
---------------------------------------------------------
swapper/0 just changed the state of lock:
 (&adapter->erp_lock){++..}, at: [<00000000002c82ae>] zfcp_erp_adapter_reopen+0x4e/0x8c
but this lock took another, hard-irq-unsafe lock in the past:
 (&adapter->req_list_lock){-+..}

and interrupts could create inverse lock ordering between them.

[tons of backtraces, but only the interesting part follows]

the second lock's dependencies:
-> (&adapter->req_list_lock){-+..} ops: 2280627634176 {
   initial-use  at:
                        [<0000000000071f10>] __lock_acquire+0x504/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002cf684>] zfcp_fsf_req_dismiss_all+0x50/0x140
                        [<00000000002c87ee>] zfcp_erp_adapter_strategy_generic+0x66/0x3d0
                        [<00000000002c9498>] zfcp_erp_thread+0x88c/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
   in-softirq-W at:
                        [<0000000000072172>] __lock_acquire+0x766/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002ca73e>] zfcp_qdio_int_resp+0xbe/0x2ac
                        [<000000000027a1d6>] qdio_kick_inbound_handler+0x82/0xa0
                        [<000000000027daba>] tiqdio_inbound_processing+0x62/0xf8
                        [<0000000000047ba4>] tasklet_action+0x100/0x1f4
                        [<0000000000048b5a>] __do_softirq+0xae/0x154
                        [<0000000000021e4a>] do_softirq+0xea/0xf0
                        [<00000000000485de>] irq_exit+0xde/0xe8
                        [<0000000000268c64>] do_IRQ+0x160/0x1fc
                        [<00000000000261a2>] io_return+0x0/0x8
                        [<000000000001b8f8>] cpu_idle+0x17c/0x224
   hardirq-on-W at:
                        [<0000000000072190>] __lock_acquire+0x784/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d702c>] _spin_lock+0x5c/0x9c
                        [<00000000002caff6>] zfcp_fsf_req_send+0x3e/0x158
                        [<00000000002ce7fe>] zfcp_fsf_exchange_config_data+0x106/0x124
                        [<00000000002c8948>] zfcp_erp_adapter_strategy_generic+0x1c0/0x3d0
                        [<00000000002c98ea>] zfcp_erp_thread+0xcde/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
 }
 ... key      at: [<0000000000e356c8>] __key.26629+0x0/0x8

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmit@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:44:37 -05:00
Stefan Raspl 0997f1c5fe blktrace: pass zfcp driver data
This patch writes the channel and fabric latencies in nanoseconds per
request via blktrace for later analysis. The utilization of the inbound
and outbound adapter queue is also reported.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-17 08:46:57 +02:00
Swen Schillig 41bfcf9010 [SCSI] zfcp: fix double dbf id usage
Trace ids 107 and 3 are used twice, fix this to have unique ids for
the erp triggers.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:56 -05:00
Swen Schillig b7f15f3c94 [SCSI] zfcp: fix deadlock caused by shared work queue tasks
Each adapter reopen trigger automatically a scan_port task which
is waiting for the ERP to be finished before further processing.
Since the initial device setup enqueues adapter, port and LUN which
are individual ERP actions, this process would start after
everything is done. Unfortunately the port_reopen requires another
scheduled work to be finished which is queued after the automatic
scan_port -> deadlock !

This fix creates an own work queue for ERP based nameserver requests.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:55 -05:00
Swen Schillig 5706938669 [SCSI] zfcp: put threshold data in hba trace
Now that we removed the long messages for the bit error threshold
data, put the data in the hba trace. This way, we get a short warning
for the threshold event from the hardware and have the data in the
trace for further analysis.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:55 -05:00
Christof Schmitt 0406289ed5 [SCSI] zfcp: Simplify zfcp data structures
Reduce the size of zfcp data structures by removing unused and
redundant members. scsi_lun is only the mangled version of the
fcp_lun. So, remove the redundant field and use the fcp_lun instead.

Since the queue lock and the pci_batch indicator are only used in the
request queue, move them from the common queue struct to the adapter
struct.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:54 -05:00