linux/drivers
Mauricio Faria de Oliveira 05a05872c8 lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd
'lpfc_cmd->pCmd' to be non-null and to point to the midlayer command.

That does not hold in the .eh_(device|target|bus)_reset_handler path:
lpfc_send_taskmgmt() sends commands that do not come from the midlayer,
so it does not set 'lpfc_cmd->pCmd'.

It does hold in the .queuecommand path, because lpfc_queuecommand()
stores the scsi_cmnd from the midlayer in lpfc_cmd->pCmd, and
lpfc_scsi_prep_cmnd() stores lpfc_cmd in piocbq->context1, which is
passed to lpfc_sli4_scmd_to_wqidx_distr() as the lpfc_cmd parameter.
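
The difference between the two paths can be illustrated with a minimal,
standalone sketch. It is not the driver code and not necessarily the
change made by this commit: the mock_* types, the hw_queue field, and
the fallback to queue 0 are all assumptions made up for illustration.
It only shows how a queue-selection helper can tolerate a command that
carries no midlayer scsi_cmnd.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real SCSI/lpfc structures. */
struct mock_scsi_cmnd {
	unsigned int hw_queue;		/* assumed per-command queue tag */
};

struct mock_lpfc_cmd {
	struct mock_scsi_cmnd *pCmd;	/* NULL when not from the midlayer */
};

/*
 * Pick a work queue index for a command.
 *
 * Commands that came through a queuecommand-like path have pCmd set and
 * are spread across queues by their tag. Commands built internally (as
 * task management requests are) have pCmd == NULL, so pCmd must not be
 * dereferenced; this sketch falls back to queue 0 for them.
 */
unsigned int mock_select_wqidx(const struct mock_lpfc_cmd *lpfc_cmd,
			       unsigned int nr_queues)
{
	assert(nr_queues > 0);

	if (lpfc_cmd == NULL || lpfc_cmd->pCmd == NULL)
		return 0;	/* fallback chosen for this sketch only */

	return lpfc_cmd->pCmd->hw_queue % nr_queues;
}
```

Without the NULL check, the second kind of command would fault on the
pCmd dereference, which is the same class of failure as the oopses in
the test-cases.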

This problem can be hit during SCSI EH, and immediately with sg_reset.
The two test-cases below demonstrate the problem and the fix on
next-20160601.

Test-case 1) sg_reset

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...>
    +++ killed by SIGSEGV +++
    Segmentation fault

    # dmesg
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd00000001c88442c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 104 PID: 16333 Comm: sg_reset Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable)
    [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0
    [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0
    [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120
    [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70
    [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90
    [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890
    [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0
    [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108
    Instruction dump:
    <...>

    With fix:

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0
    close(3)                                = 0
    exit_group(0)                           = ?
    +++ exited with 0 +++

    # dmesg
    [  424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002

Test-case 2) SCSI EH

    Using this debug patch to wire an SCSI EH trigger, for lpfc_scsi_cmd_iocb_cmpl():
    -       cmd->scsi_done(cmd);
    +       if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000)
    +               printk(KERN_ALERT "lpfc: skip scsi_done()\n");
    +       else
    +               cmd->scsi_done(cmd);

    # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose

    # dd if=/dev/sdm of=/dev/null iflag=direct &
    <...>

    After a while:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd0000000199e448c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable)
    [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0
    [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680
    [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130
    [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4
    Instruction dump:
    <...>

    With fix:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data:
    <...>

Fixes: 8b0dff1416 ("lpfc: Add support for using block multi-queue")
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-07-27 00:31:09 -04:00
accessibility
acpi Merge branches 'acpica-fixes', 'acpi-pci-fixes' and 'acpi-debug-fixes' 2016-07-07 23:37:37 +02:00
amba ARM: 8566/1: drivers: amba: properly handle devices with power domains 2016-05-05 19:00:40 +01:00
android
ata Merge branch 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2016-06-28 12:11:31 -07:00
atm atm: iphase: off by one in rx_pkt() 2016-05-31 11:52:59 -07:00
auxdisplay
base Driver core fixes for 4.7-rc4 2016-06-18 06:04:01 -10:00
bcma MTD updates for v4.7: 2016-05-24 11:00:20 -07:00
block Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2016-07-07 15:34:09 -07:00
bluetooth Bluetooth: Add USB ID 13D3:3487 to ath3k 2016-05-13 16:54:59 +02:00
bus Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2016-05-19 10:02:26 -07:00
cdrom
char ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg() 2016-06-13 08:56:28 -05:00
clk A bunch of fixes. Some for the newly added rk3399 clock tree, some 2016-06-20 17:01:45 -07:00
clocksource Small release overall. 2016-05-19 11:27:09 -07:00
connector connector: fix out-of-order cn_proc netlink message delivery 2016-06-28 08:48:33 -04:00
cpufreq cpufreq: Avoid false-positive WARN_ON()s in cpufreq_update_policy() 2016-06-28 03:29:29 +02:00
cpuidle cpuidle: Fix last_residency division 2016-07-04 14:17:34 +02:00
crypto crypto: ux500 - memmove the right size 2016-06-13 17:43:05 +08:00
dax /dev/dax, core: file operations and dax-mmap 2016-05-20 22:02:55 -07:00
dca
devfreq PM / devfreq: Send the DEVFREQ_POSTCHANGE notification when target() is failed 2016-06-23 23:15:12 +02:00
dio
dma dmaengine: mv_xor: Fix incorrect offset in dma_map_page() 2016-06-07 12:44:23 +05:30
dma-buf dma-buf: use vma_pages() 2016-05-31 22:17:05 +05:30
edac EDAC, sb_edac: Readd accidentally dropped Broadwell-D support 2016-06-03 17:28:21 +02:00
eisa
extcon extcon: palmas: Fix boot up state of VBUS when using GPIO detection 2016-06-15 17:17:22 +09:00
firewire treewide: replace dev->trans_start update with helper 2016-05-04 14:16:49 -04:00
firmware efi/arm: Fix the format of EFI debug messages 2016-06-03 09:57:36 +02:00
fmc
fpga
gpio Revert "gpio: gpiolib-of: Allow compile testing" 2016-07-05 19:03:04 +02:00
gpu Allwinner DRM driver fixes for 4.7, take 2 2016-07-08 13:29:11 +10:00
hid HID: multitouch: enable palm rejection for Windows Precision Touchpad 2016-06-28 13:24:14 +02:00
hsi HSI: omap-ssi: move omap_ssi_port_update_fclk 2016-05-09 22:45:18 +02:00
hv
hwmon hwmon: (dell-smm) Cache fan_type() calls and change fan detection 2016-06-23 06:24:23 -07:00
hwspinlock drivers/hwspinlock: use correct radix tree API 2016-05-20 17:58:30 -07:00
hwtracing coresight: Handle build path error 2016-06-16 00:13:06 -07:00
i2c i2c: mux: reg: Provide of_match_table 2016-06-09 22:38:16 +02:00
ide
idle
iio iio:ad7266: Fix probe deferral for vref 2016-06-26 17:39:26 +01:00
infiniband Merge branches '4.7-rc-misc', 'hfi1-fixes', 'i40iw-rc-fixes' and 'mellanox-rc-fixes' into k.o/for-4.7-rc 2016-06-23 12:22:33 -04:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2016-06-27 20:34:43 -07:00
iommu iommu/amd: Fix unity mapping initialization race 2016-07-06 18:04:55 +02:00
ipack
irqchip irqchip/mips-gic: Match IPI IRQ domain by bus token only 2016-07-05 16:54:21 +02:00
isdn TTY and Serial driver update for 4.7-rc1 2016-05-20 20:57:27 -07:00
leds leds: handle suspend/resume in heartbeat trigger 2016-06-08 11:47:06 +02:00
lguest
lightnvm lightnvm: reserved space calculation incorrect 2016-05-06 12:51:10 -06:00
macintosh
mailbox mailbox: Fix devm_ioremap_resource error detection code 2016-05-08 22:44:46 +05:30
mcb mcb: Acquire reference to carrier module in core 2016-06-13 18:49:30 -07:00
md Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2016-05-27 14:28:09 -07:00
media Update my main e-mails at the Kernel tree 2016-06-15 15:35:37 -10:00
memory memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing 2016-06-16 11:43:48 +03:00
memstick drivers/memstick/core/mspro_block: use kmemdup 2016-05-23 17:04:14 -07:00
message SCSI misc on 20160517 2016-05-18 16:38:59 -07:00
mfd mfd: max77620: Fix FPS switch statements 2016-06-30 07:44:23 +01:00
misc mei: don't use wake_up_interruptible for wr_ctrl 2016-06-10 22:14:24 -07:00
mmc mmc: sunxi: Re-enable eMMC HS-DDR modes on Allwinner A80 2016-06-02 10:40:20 +02:00
mtd ubi: Make recover_peb power cut aware 2016-06-23 00:29:32 +02:00
net cxgb4: update latest firmware version supported 2016-07-05 11:53:25 -07:00
nfc NFC: pn533: handle interrupted commands in pn533_recv_frame 2016-05-10 00:01:47 +02:00
ntb
nubus
nvdimm libnvdimm, pfn, dax: fix initialization vs autodetect for mode + alignment 2016-06-23 17:50:39 -07:00
nvme NVMe: Only release requested regions 2016-06-09 14:28:28 -06:00
nvmem remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
of drivers/of: Fix depth for sub-tree blob in unflatten_dt_nodes() 2016-06-09 14:36:34 -05:00
oprofile
parisc
parport
pci PCI: Fix unaligned accesses in VC code 2016-06-20 13:24:20 -05:00
pcmcia
perf arm: pmu: Fix non-devicetree probing 2016-06-15 09:51:35 +01:00
phy - Final patches fixing Reset API change 2016-07-01 15:17:16 -07:00
pinctrl pinctrl: baytrail: Fix mingled clock pins 2016-06-23 11:05:04 +02:00
platform platform/chrome: cros_ec_dev - double fetch bug in ioctl 2016-07-05 14:01:52 -07:00
pnp driver core update for 4.7-rc1 2016-05-20 21:26:15 -07:00
power power_supply: tps65217-charger: Fix NULL deref during property export 2016-06-16 15:54:11 +02:00
powercap Power management material for v4.7-rc1 2016-05-16 19:17:22 -07:00
pps
ps3
ptp ptp: oops in ptp_ioctl() 2016-05-29 22:32:27 -07:00
pwm pwm: atmel-hlcdc: Fix default PWM polarity 2016-06-14 10:51:45 +02:00
rapidio rapidio/mport_cdev: fix uapi type definitions 2016-05-05 17:38:53 -07:00
ras
regulator regulator: Fix qcom-smd list voltage issues for msm8974 2016-07-13 04:22:16 +09:00
remoteproc remoteproc: Add additional crash reasons 2016-05-12 15:50:19 -07:00
reset
rpmsg rpmsg: add THIS_MODULE to rpmsg_driver in rpmsg core 2016-05-06 11:08:58 -07:00
rtc rtc: tps6586x: rename so module can be autoloaded 2016-05-21 17:07:17 +02:00
s390 qeth: delete napi struct when removing a qeth device 2016-07-04 23:32:08 -07:00
sbus openprom: fix warning 2016-05-20 18:33:37 -07:00
scsi lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() 2016-07-27 00:31:09 -04:00
sfi
sh
sn
soc soc: mtk-pmic-wrap: avoid integer overflow warning 2016-05-19 15:20:24 +02:00
spi Merge remote-tracking branches 'spi/fix/ep93xx', 'spi/fix/rockchip', 'spi/fix/sunxi' and 'spi/fix/ti-qspi' into spi-linus 2016-06-30 13:17:29 +01:00
spmi
ssb
staging staging: iio: accel: fix error check 2016-06-26 15:57:02 +01:00
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-05-28 12:04:17 -07:00
tc
thermal Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2016-06-12 06:30:39 -07:00
thunderbolt
tty devpts: fix null pointer dereference on failed memory allocation 2016-06-26 11:39:00 -07:00
uio
usb - Final patches fixing Reset API change 2016-07-01 15:17:16 -07:00
uwb
vfio vfio/pci: Allow VPD short read 2016-05-31 21:25:52 -06:00
vhost target: make close_session optional 2016-05-10 01:19:26 -07:00
video OMAPDSS: HDMI5: Change DDC timings 2016-05-31 08:20:43 +03:00
virt
virtio virtio_balloon: fix PFN format for virtio-1 2016-05-22 19:44:13 +03:00
vlynq
vme
w1
watchdog watchdog: ebc-c384_wdt: Allow build for X86_64 2016-06-17 20:21:12 -07:00
xen xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7 2016-07-08 14:53:13 +01:00
zorro
Kconfig libnvdimm for 4.7 2016-05-23 11:18:01 -07:00
Makefile /dev/dax, pmem: direct access to persistent memory 2016-05-20 22:02:53 -07:00