Commit graph

1715 commits

Author SHA1 Message Date
Jens Axboe
488211844e floppy: switch to one queue per drive instead of sharing a queue
Pretty straight forward conversion. Note that we do round-robin
between the drives that have available requests, before we simply
used the drive that the IO scheduler told us to. Since the IO
scheduler doesn't care about multiple devices per queue, the resulting
sort would not have made sense.

Fixed by Vivek to get rid of a double lock problem in set_next_request()

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2010-09-22 09:32:36 +02:00
Dan Carpenter
b0722cb1ac cciss: freeing uninitialized data on error path
The "h->scatter_list" is allocated inside a for loop.  If any of those
allocations fail, then the rest of the list is uninitialized data.  When
we free it we should start from the top and free backwards so that we
don't call kfree() on uninitialized pointers.

Also if the allocation for "h->scatter_list" fails then we would get an
Oops here.  I should have noticed this when I send: 4ee69851c "cciss:
handle allocation failure."  but I didn't.  Sorry about that.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-21 11:49:17 +02:00
Christoph Hellwig
dd3932eddf block: remove BLKDEV_IFL_WAIT
All the blkdev_issue_* helpers can only sanely be used for synchronous
caller.  To issue cache flushes or barriers asynchronously the caller needs
to set up a bio by itself with a completion callback to move the asynchronous
state machine ahead.  So drop the BLKDEV_IFL_WAIT flag that is always
specified when calling blkdev_issue_* and also remove the now unused flags
argument to blkdev_issue_flush and blkdev_issue_zeroout.  For
blkdev_issue_discard we need to keep it for the secure discard flag, which
gains a more descriptive name and loses the bitops vs flag confusion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-16 20:52:58 +02:00
Martin K. Petersen
c8bf133682 Consolidate min_not_zero
We have several users of min_not_zero, each of them using their own
definition.  Move the define to kernel.h.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2010-09-10 20:07:38 +02:00
Linus Torvalds
ff3cb3fec3 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Range check cpu in blk_cpu_to_group
  scatterlist: prevent invalid free when alloc fails
  writeback: Fix lost wake-up shutting down writeback thread
  writeback: do not lose wakeup events when forking bdi threads
  cciss: fix reporting of max queue depth since init
  block: switch s390 tape_block and mg_disk to elevator_change()
  block: add function call to switch the IO scheduler from a driver
  fs/bio-integrity.c: return -ENOMEM on kmalloc failure
  bio-integrity.c: remove dependency on __GFP_NOFAIL
  BLOCK: fix bio.bi_rw handling
  block: put dev->kobj in blk_register_queue fail path
  cciss: handle allocation failure
  cfq-iosched: Documentation help for new tunables
  cfq-iosched: blktrace print per slice sector stats
  cfq-iosched: Implement tunable group_idle
  cfq-iosched: Do group share accounting in IOPS when slice_idle=0
  cfq-iosched: Do not idle if slice_idle=0
  cciss: disable doorbell reset on reset_devices
  blkio: Fix return code for mkdir calls
2010-09-10 07:26:27 -07:00
Tejun Heo
02c42b7a68 virtio_blk: drop REQ_HARDBARRIER support
Remove now unused REQ_HARDBARRIER support.  virtio_blk already
supports REQ_FLUSH and the usefulness of REQ_FUA for virtio_blk is
questionable at this point, so there's nothing else to do to support
new REQ_FLUSH/FUA interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00
Tejun Heo
6259f28459 block/loop: implement REQ_FLUSH/FUA support
Deprecate REQ_HARDBARRIER and implement REQ_FLUSH/FUA instead.  Also,
instead of checking file->f_op->fsync() directly, look at the value of
vfs_fsync() and ignore -EINVAL return.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00
Tejun Heo
9cbbdca44a block: remove spurious uses of REQ_HARDBARRIER
REQ_HARDBARRIER is deprecated.  Remove spurious uses in the following
users.  Please note that other than osdblk, all other uses were
already spurious before deprecation.

* osdblk: osdblk_rq_fn() won't receive any request with
  REQ_HARDBARRIER set.  Remove the test for it.

* pktcdvd: use of REQ_HARDBARRIER in pkt_generic_packet() doesn't mean
  anything.  Removed.

* aic7xxx_old: Setting MSG_ORDERED_Q_TAG on REQ_HARDBARRIER is
  spurious.  Removed.

* sas_scsi_host: Setting TASK_ATTR_ORDERED on REQ_HARDBARRIER is
  spurious.  Removed.

* scsi_tcq: The ordered tag path wasn't being used anyway.  Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: James Bottomley <James.Bottomley@suse.de>
Cc: Peter Osterlund <petero2@telia.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Tejun Heo
4913efe456 block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush()
Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
-EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
blk_queue_flush().

blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
device has write cache and can flush it, it should set REQ_FLUSH.  If
the device can handle FUA writes, it should also set REQ_FUA.

All blk_queue_ordered() users are converted.

* ORDERED_DRAIN is mapped to 0 which is the default value.
* ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
* ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Tejun Heo
6958f14545 block: kill QUEUE_ORDERED_BY_TAG
Nobody is making meaningful use of ORDERED_BY_TAG now and queue
draining for barrier requests will be removed soon which will render
the advantage of tag ordering moot.  Kill ORDERED_BY_TAG.  The
following users are affected.

* brd: converted to ORDERED_DRAIN.
* virtio_blk: ORDERED_TAG path was already marked deprecated.  Removed.
* xen-blkfront: ORDERED_TAG case dropped.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Tejun Heo
589d7ed02a block/loop: queue ordered mode should be DRAIN_FLUSH
loop implements FLUSH using fsync but was incorrectly setting its
ordered mode to DRAIN.  Change it to DRAIN_FLUSH.  In practice, this
doesn't change anything as loop doesn't make use of the block layer
ordered implementation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Stephen M. Cameron
fcfb5c0ce1 cciss: remove some superfluous tests from cciss_bigpassthru()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:40 +02:00
Stephen M. Cameron
0c9f5ba7cb cciss: factor out cciss_big_passthru
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:39 +02:00
Stephen M. Cameron
f32f125b1c cciss: factor out cciss_passthru
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:37 +02:00
Stephen M. Cameron
0894b32c5c cciss: factor out cciss_getluninfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:36 +02:00
Stephen M. Cameron
c525919ddf cciss: factor out cciss_getdrivver
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:35 +02:00
Stephen M. Cameron
8a4f7fbfdd cciss: factor out cciss_getfirmver
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:34 +02:00
Stephen M. Cameron
d18dfad4e2 cciss: factor out cciss_getbustypes
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:33 +02:00
Stephen M. Cameron
93c7493113 cciss: factor out cciss_getheartbeat
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:32 +02:00
Stephen M. Cameron
4f43f32cd3 cciss: factor out cciss_setnodename
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:32 +02:00
Stephen M. Cameron
2521610942 cciss: factor out cciss_getnodename
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:31 +02:00
Stephen M. Cameron
4c800eed9a cciss: factor out cciss_setintinfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:30 +02:00
Stephen M. Cameron
576e661c65 cciss: factor out cciss_getintinfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:29 +02:00
Stephen M. Cameron
0a25a5aee7 cciss: factor out cciss_getpciinfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:28 +02:00
Stephen M. Cameron
2a643ec67f cciss: fix reporting of max queue depth since init
The ioctl path and the scsi tape path were not accounting
for their additions to the queue depth.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-25 19:58:53 +02:00
Linus Torvalds
c05e1e23b8 Merge branch 'for-upstream/pvhvm' of git://xenbits.xensource.com/people/ianc/linux-2.6
* 'for-upstream/pvhvm' of git://xenbits.xensource.com/people/ianc/linux-2.6:
  xen: pvhvm: make it clearer that XEN_UNPLUG_* define bits in a bitfield
  xen: pvhvm: rename xen_emul_unplug=ignore to =unnnecessary
  xen: pvhvm: allow user to request no emulated device unplug
2010-08-23 18:29:18 -07:00
Milan Broz
ee86273062 loop: add some basic read-only sysfs attributes
Create /sys/block/loopX/loop directory and provide these attributes:
 - backing_file
 - autoclear
 - offset
 - sizelimit

This loop directory is present only if loop device is configured.

To be used in util-linux-ng (and possibly elsewhere like udev rules)
where code need to get loop attributes from kernel (and not store
duplicate info in userspace).

Moreover loop ioctls are not even able to provide full backing
file info because of buffer limits.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 15:18:10 +02:00
Jens Axboe
52cc2eef31 block: switch s390 tape_block and mg_disk to elevator_change()
Now that we have this API, switch the two in-kernel users to it.
Resolves an oops introduced by commit
1abec4fdbb.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 14:02:44 +02:00
Ian Campbell
1dc7ce99b0 xen: pvhvm: rename xen_emul_unplug=ignore to =unnnecessary
It is not immediately clear what this option causes to become
ignored. The actual meaning is that it is not necessary to unplug the
emulated devices to safely use the PV ones, even if the platform does
not support the unplug protocol. (pressumably the user will only add
this option if they have ensured that their domain configuration is
safe).

I think xen_emul_unplug=unnecessary better captures this.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
2010-08-23 11:59:29 +01:00
Jiri Slaby
5e00d1b5b4 BLOCK: fix bio.bi_rw handling
Return of the bi_rw tests is no longer bool after commit 74450be1. But
results of such tests are stored in bools. This doesn't fit in there
for some compilers (gcc 4.5 here), so either use !! magic to get real
bools or use ulong where the result is assigned somewhere.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 12:33:10 +02:00
Dan Carpenter
4ee69851cd cciss: handle allocation failure
If kmalloc() fails then cleanup and return failure (-1).

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 12:28:15 +02:00
Stephen M. Cameron
75230ff275 cciss: disable doorbell reset on reset_devices
The doorbell reset initially appears to work correctly,
the controller resets, comes up, some i/o can even be
done, but on at least some Smart Arrays in some servers,
it eventually causes a subsequent controller lockup due
to some kind of PCIe error, and kdump can end up leaving
the root filesystem in an unbootable state.  For this
reason, until the problem is fixed, or at least isolated
to certain hardware enough to be avoided, the doorbell
reset should not be used at all.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 11:02:17 +02:00
Graeme Smecher
7a50d06e24 of: fix missing headers for of_address_to_resource() in MTD and SysACE drivers
The drivers for Xilinx' SystemACE and physically mapped MTDs were missing
prototypes for of_address_to_resource(). This patch adds the necessary
headers.

Signed-off-by: Graeme Smecher <graeme.smecher@mail.mcgill.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-17 13:16:47 -06:00
Linus Torvalds
58d4ea65b9 Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
  mmc_spi: Fix unterminated of_match_table
  of/sparc: fix build regression from of_device changes
  of/device: Replace struct of_device with struct platform_device
2010-08-12 09:11:31 -07:00
Linus Torvalds
2f9e825d3e Merge branch 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block: (149 commits)
  block: make sure that REQ_* types are seen even with CONFIG_BLOCK=n
  xen-blkfront: fix missing out label
  blkdev: fix blkdev_issue_zeroout return value
  block: update request stacking methods to support discards
  block: fix missing export of blk_types.h
  writeback: fix bad _bh spinlock nesting
  drbd: revert "delay probes", feature is being re-implemented differently
  drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
  drbd: Disable delay probes for the upcomming release
  writeback: cleanup bdi_register
  writeback: add new tracepoints
  writeback: remove unnecessary init_timer call
  writeback: optimize periodic bdi thread wakeups
  writeback: prevent unnecessary bdi threads wakeups
  writeback: move bdi threads exiting logic to the forker thread
  writeback: restructure bdi forker loop a little
  writeback: move last_active to bdi
  writeback: do not remove bdi from bdi_list
  writeback: simplify bdi code a little
  writeback: do not lose wake-ups in bdi threads
  ...

Fixed up pretty trivial conflicts in drivers/block/virtio_blk.c and
drivers/scsi/scsi_error.c as per Jens.
2010-08-10 15:22:42 -07:00
Jens Axboe
a4cc14ec9f xen-blkfront: fix missing out label
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-08 21:50:05 -04:00
Lars Ellenberg
e7f52dfb4f drbd: revert "delay probes", feature is being re-implemented differently
It was a now abandoned attempt to throttle resync bandwidth
based on the delay it causes on the bulk data socket.
It has no userbase yet, and has been disabled by
9173465ccb51c09cc3102a10af93e9f469a0af6f already.
This removes the now unused code.

The basic feature, namely using up "idle" bandwith
of network and disk IO subsystem, with minimal impact
to application IO, is being reimplemented differently.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:57 +02:00
Philipp Reisner
85f4cc17a6 drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:57 +02:00
Philipp Reisner
6710a57603 drbd: Disable delay probes for the upcomming release
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:57 +02:00
Kulikov Vasiliy
f6c4c8e19a cpqarray: check put_user() result
put_user() may fail, if so return -EFAULT.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:03 +02:00
Jeremy Fitzhardinge
7901d14144 xen/blkfront: Use QUEUE_ORDERED_DRAIN for old backends
If there's no feature-barrier key in xenstore, then it means its a fairly
old backend which does uncached in-order writes, which means ORDERED_DRAIN
is appropriate.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:52:53 +02:00
Jeremy Fitzhardinge
4dab46ff26 xen/blkfront: use tagged queuing for barriers
When barriers are supported, then use QUEUE_ORDERED_TAG to tell the block
subsystem that it doesn't need to do anything else with the barriers.
Previously we used ORDERED_DRAIN which caused the block subsystem to
drain all pending IO before submitting the barrier, which would be
very expensive.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:52:53 +02:00
Stephen Hemminger
3b06c21e84 floppy: make controller const
The struct cont_t is just a set of virtual function pointers.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:31 +02:00
Julia Lawall
ad96a7a7ea drivers/block: use memdup_user
Use memdup_user when user data is immediately copied into the
allocated region.  Some checkpatch cleanups in nearby code.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = \(kmalloc@p\|kzalloc@p\)(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Chirag Kantharia <chirag.kantharia@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:31 +02:00
Stephen M. Cameron
8112586063 cciss: cleanup interrupt_not_for_us
cciss: cleanup interrupt_not_for_us
In the case of MSI/MSIX interrutps, we don't need to check
if the interrupt is for us, and in the case of the intx interrupt
handler, when checking if the interrupt is for us, we don't need
to check if we're using MSI/MSIX, we know we're not.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron
b2a4a43dba cciss: change printks to dev_warn, etc.
cciss: change printks to dev_warn, etc.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron
6b4d96b878 cciss: separate cmd_alloc() and cmd_special_alloc()
cciss: separate cmd_alloc() and cmd_special_alloc()
cmd_alloc() took a parameter which caused it to either allocate
from a pre-allocated pool, or allocate using pci_alloc_consistent.
This parameter is always known at compile time, so this would
be better handled by breaking the function into two functions
and differentiating the cases by function names.  Same goes
for cmd_free().

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron
f70dba8366 cciss: use consistent variable names
cciss: use consistent variable names
"h", for the hba structure and "c" for the command structures.
and get rid of trivial CCISS_LOCK macro.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron
058a0f9f31 cciss: forbid hard reset of 640x boards
cciss: forbid hard reset of 640x boards
The 6402/6404 are two PCI devices -- two Smart Array controllers
-- that fit into one slot.  It is possible to reset them independently,
however, they share a battery backed cache module.  One of the pair
controls the cache and the 2nd one access the cache through the first
one.  If you reset the one controlling the cache, the other one will
not be a happy camper.  So we just forbid resetting this conjoined
mess.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron
adfbc1ff34 cciss: sanitize max commands
cciss: sanitize max commands
Some controllers might try to tell us they support 0 commands
in performant mode.  This is a lie told by buggy firmware.
We have to be wary of this lest we try to allocate a negative
number of command blocks, which will be treated as unsigned,
and get an out of memory condition.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron
a6528d0172 cciss: fix hard reset code.
cciss: Fix hard reset code.
Smart Array controllers newer than the P600 do not honor the
PCI power state method of resetting the controllers.  Instead,
in these cases we can get them to reset via the "doorbell" register.

This escaped notice until we began using "performant" mode because
the fact that the controllers did not reset did not normally
impede subsequent operation, and so things generally appeared to
"work".  Once the performant mode code was added, if the controller
does not reset, it remains in performant mode.  The code immediately
after the reset presumes the controller is in "simple" mode
(which previously, it had remained in simple mode the whole time).
If the controller remains in performant mode any code which presumes
it is in simple mode will not work.  So the reset needs to be fixed.

Unfortunately there are some controllers which cannot be reset by
either method. (eg. p800).  We detect these cases by noticing that
the controller seems to remain in performant mode even after a
reset has been attempted.  In those cases we ignore the controller,
as any commands outstanding on it will result in stale completions.
To sum up, we try to do a better job of resetting the controller if
"reset_devices" is set, and if it doesn't work, we ignore that
controller.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron
83123cb11b cciss: factor out cciss_reset_devices()
cciss: factor out cciss_reset_devices()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:12 +02:00
Stephen M. Cameron
8e93bf6d6c cciss: factor out cciss_find_cfg_addrs.
Rationale for this is that I will also need to use this code
in fixing kdump host reset code prior to having the hba structure.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:12 +02:00
Stephen M. Cameron
b993313540 cciss: factor out cciss_enter_performant_mode
cciss: factor out cciss_enter_performant_mode

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:12 +02:00
Stephen M. Cameron
0f8a6a1e7b cciss: factor out cciss_wait_for_mode_change_ack()
cciss: factor out cciss_wait_for_mode_change_ack()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
fe3b7527db cciss: make cciss_put_controller_into_performant_mode as __devinit
cciss: make cciss_put_controller_into_performant_mode as __devinit

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
ff5f58f06d cciss: cleanup some debug ifdefs
cciss: cleanup some debug ifdefs

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
bfd63ee571 cciss: factor out cciss_p600_dma_prefetch_quirk()
cciss: factor out cciss_p600_dma_prefetch_quirk()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
322e304c4d cciss: factor out cciss_enable_scsi_prefetch()
cciss: factor out cciss_enable_scsi_prefetch()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
501b92cd6b cciss: factor out CISS_signature_present()
cciss: factor out CISS_signature_present()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
afadbf4b95 cciss: factor out cciss_find_board_params
cciss: factor out cciss_find_board_params

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
da5503217d cciss: fix leak of ioremapped memory
cciss: fix leak of ioremapped memory
in cciss_pci_init error path.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
4809d0988f cciss: factor out cciss_find_cfgtables
cciss: factor out cciss_find_cfgtables

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
e99ba13627 cciss: factor out cciss_wait_for_board_ready()
cciss: factor out cciss_wait_for_board_ready()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron
d474830da6 cciss: factor out cciss_find_memory_BAR()
cciss: factor out cciss_find_memory_BAR()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron
dac5488a9e cciss: remove board_id parameter from cciss_interrupt_mode()
cciss: remove board_id parameter from cciss_interrupt_mode()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron
dd9c426e92 cciss: factor out cciss_board_disabled
cciss: factor out cciss_board_disabled

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron
6539fa9b2e cciss: factor out cciss_lookup_board_id
cciss: factor out cciss_lookup_board_id

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron
292e50dd39 cciss: save pdev pointer in per hba structure early to avoid passing it around so much.
cciss: save pdev pointer in per hba structure early to avoid passing it around so much.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron
373b45f7b6 cciss: Set the performant mode bit in the scsi half of the driver
cciss: Set the performant mode bit in the scsi half of the driver
In a couple of places, the performant mode bit wasn't being set in
the scsi half of the driver, causing commands to seem to hang.  Use
enqueue_cmd_and_start_io() where appropriate.  This fixes a bug that

	echo engage scsi > /proc/driver/cciss/cciss0

would hang.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Daniel Stodden
d54142c71f blkfront: Klog the unclean release path
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:51:21 +02:00
Daniel Stodden
7b32d1044a blkfront: Remove obsolete info->users
This is just bd_openers, protected by the bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:49:20 +02:00
Daniel Stodden
acfca3c622 blkfront: Remove obsolete info->users
This is just bd_openers, protected by the bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:47:26 +02:00
Daniel Stodden
fa1bd3591a blkfront: Lock blockfront_info during xbdev removal
Same approach as blkfront_closing:
 * Grab the bdev safely, holding the info mutex.
 * Zap xbdev safely, holding the info mutex.
 * Try bdev removal safely, holding bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:45:27 +02:00
Daniel Stodden
7fd152f4b6 blkfront: Fix blkfront backend switch race (bdev release)
We cannot read backend state within bdev operations, because it risks
grabbing the state change before xenbus gets to do it.

Fixed by tracking deferral with a frontend switch to Closing. State
exposure isn't strictly necessary, but the backends won't mind.

For a 'clean' deferral this seems actually a more decent protocol than
raising errors.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:45:12 +02:00
Daniel Stodden
139617437a blkfront: Fix blkfront backend switch race (bdev open)
We need not mind if users grab a late handle on a closing disk. We
probably even should not. But we have to make sure it's not a dead
one already

Let the bdev deal with a gendisk deleted under its feet. Takes the
info mutex to decide a race against backend closing.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:38:43 +02:00
Daniel Stodden
b70f5fa043 blkfront: Lock blkfront_info when closing
The bdev .open/.release fops race against backend switches to Closing,
handled by the XenBus thread.

The original code attempted to serialize block device holders and
xenbus only via bd_mutex. This is insufficient, the info->bd pointer
may already be stale (or null) while xenbus tries to bump up the
refcount.

Protect blkfront_info with a dedicated mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:38:43 +02:00
Daniel Stodden
a66b5aebb7 blkfront: Clean up vbd release
* Current blkfront_closing is rather a xlvbd_release_gendisk.
   Renamed in preparation of later patches (need the name again).

 * Removed the misleading comment -- this only applied to the backend
   switch handler, and the queue is already flushed btw.

 * Break out the xenbus call, callers know better when to switch
   frontend state.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:38:43 +02:00
Daniel Stodden
9897cb5323 blkfront: Fix gendisk leak
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:37 +02:00
Daniel Stodden
89de1669ac blkfront: Fix backtrace in del_gendisk
The call to del_gendisk follows an non-refcounted gd->queue
pointer. We release the last ref in blk_cleanup_queue. Fixed by
reordering releases accordingly.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:35 +02:00
K. Y. Srinivasan
2def141e71 xen/blkfront: revalidate after setting capacity
Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:31 +02:00
Jeremy Fitzhardinge
b4dddb498c xen/blkfront: avoid compiler warning from missing cases
Fix:
drivers/block/xen-blkfront.c: In function ‘blkfront_connect’:
drivers/block/xen-blkfront.c:933: warning: enumeration value ‘BLKIF_STATE_DISCONNECTED’ not handled in switch

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:29 +02:00
K. Y. Srinivasan
1fa73be6be xen/front: Propagate changed size of VBDs
Support dynamic resizing of virtual block devices. This patch supports
both file backed block devices as well as physical devices that can be
dynamically resized on the host side.

Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:27 +02:00
Jan Beulich
5d7ed20e82 blkfront: don't access freed struct xenbus_device
Unfortunately commit "blkfront: fixes for 'xm block-detach ... --force'"
still wasn't quite right - there was a reference to freed memory left
from blkfront_closing().

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:31:12 +02:00
Jan Beulich
0e34582699 blkfront: fixes for 'xm block-detach ... --force'
Prevent prematurely freeing 'struct blkfront_info' instances (when the
xenbus data structures are gone, but the Linux ones are still needed).

Prevent adding a disk with the same (major, minor) [and hence the same
name and sysfs entries, which leads to oopses] when the previous
instance wasn't fully de-allocated yet.

This still doesn't address all issues resulting from forced detach:
I/O submitted after the detach still blocks forever, likely preventing
subsequent un-mounting from completing. It's not clear to me (not
knowing much about the block layer) how this can be avoided.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:28:55 +02:00
Ian Campbell
203fd61f42 xen: use less generic names in blkfront driver.
All Xen frontend drivers have a couple of identically named functions which
makes figuring out which device went wrong from a stacktrace harder than it
needs to be. Rename them to something specificto the device type.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:26:39 +02:00
Arnd Bergmann
6e9624b8ca block: push down BKL into .open and .release
The open and release block_device_operations are currently
called with the BKL held. In order to change that, we must
first make sure that all drivers that currently rely
on this have no regressions.

This blindly pushes the BKL into all .open and .release
operations for all block drivers to prepare for the
next step. The drivers can subsequently replace the BKL
with their own locks or remove it completely when it can
be shown that it is not needed.

The functions blkdev_get and blkdev_put are the only
remaining users of the big kernel lock in the block
layer, besides a few uses in the ioctl code, none
of which need to serialize with blkdev_{get,put}.

Most of these two functions is also under the protection
of bdev->bd_mutex, including the actual calls to
->open and ->release, and the common code does not
access any global data structures that need the BKL.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:25:34 +02:00
Arnd Bergmann
8a6cfeb6de block: push down BKL into .locked_ioctl
As a preparation for the removal of the big kernel
lock in the block layer, this removes the BKL
from the common ioctl handling code, moving it
into every single driver still using it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:25:00 +02:00
FUJITA Tomonori
00fff26539 block: remove q->prepare_flush_fn completely
This removes q->prepare_flush_fn completely (changes the
blk_queue_ordered API).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:15 +02:00
FUJITA Tomonori
dd40e456a4 virtio_blk: stop using q->prepare_flush_fn
use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:14 +02:00
FUJITA Tomonori
98d8c8f40e ps3disk: stop using q->prepare_flush_fn
REQ_FLUSH flag enables us to kill ps3disk_prepare_flush().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:03 +02:00
FUJITA Tomonori
7f9815f09d osdblk: stop using q->prepare_flush_fn
use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:00 +02:00
Randy Dunlap
511d37af66 block/xd.c: fix brace typo
Fix extra brace typo that is causing build errors.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:23:14 +02:00
Christoph Hellwig
4c4762d10f block: fix some more cmd_type cleanup fallout
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:22:29 +02:00
Jens Axboe
15fa6e8165 virtio_blk: add default case to cmd type switch
On compilation, gcc correctly detects that we do not handle
all types:

In function ‘blk_done’:
warning: enumeration value ‘REQ_TYPE_FS’ not handled in switch
warning: enumeration value ‘REQ_TYPE_SENSE’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_SUSPEND’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_RESUME’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_SHUTDOWN’ not handled in switch
warning: enumeration value ‘REQ_TYPE_LINUX_BLOCK’ not handled in switch
warning: enumeration value ‘REQ_TYPE_ATA_TASKFILE’ not handled in switch
warning: enumeration value ‘REQ_TYPE_ATA_PC’ not handled in switch

which is a bit pointless since this is at the end of the request
processessing. Add a default case that just breaks out.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:22:26 +02:00
Christoph Hellwig
7b6d91daee block: unify flags for struct bio and struct request
Remove the current bio flags and reuse the request flags for the bio, too.
This allows to more easily trace the type of I/O from the filesystem
down to the block driver.  There were two flags in the bio that were
missing in the requests:  BIO_RW_UNPLUG and BIO_RW_AHEAD.  Also I've
renamed two request flags that had a superflous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:20:39 +02:00
Christoph Hellwig
33659ebbae block: remove wrappers for request type/flags
Remove all the trivial wrappers for the cmd_type and cmd_flags fields in
struct requests.  This allows much easier grepping for different request
types instead of unwinding through macros.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:17:56 +02:00
Stephen Hemminger
01b6b67eda floppy: use warning macros
Convert assertions to use WARN().  There are several error checks in the
code for things that should never happen.  Convert them to standard
warnings so kerneloops.org will see them.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:43 +02:00
Stephen Hemminger
b862f26fe1 floppy: use wait_event_interruptible
Convert wait loops to use wait_event_ macros.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:41 +02:00
Stephen Hemminger
21af544804 floppy: fix signed/unsigned warnings
Ioctl cmd value is unsigned, so change normalize_ioctl

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:39 +02:00
Stephen Hemminger
be1c0fbfb4 floppy: cmos attribute should be static
As reported by sparse, cmos attribute is local.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:37 +02:00
Stephen Hemminger
575cfc673e floppy: use atomic type for usage_count
The usage_count was being protected by a lock which was only there to
create an atomic counter.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:36 +02:00
Stephen Hemminger
41a55b4de3 floppy: silence warning during disk test
The first thing the floppy does is read block 0 to test geometry and to
test for disk presence.  If disk is not present this causes a console
warning message about failed I/O.  Set flag to silence.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:34 +02:00
Stephen Hemminger
be7a12bb1a floppy: remove unnecessary inlines
These routines are all big enough that is better to let the compiler
decide to inline or not.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:32 +02:00
Stephen Hemminger
285203c8ff floppy: initialize debug jiffies offset
Set debug jiffies offset at initialization.  Avoids wierd values showing
up if debugging enabled.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:30 +02:00
Mike Miller
f3bcb14332 cciss: change pad value from 32 to 0
Change the command padding on 32-bit systems to 0 since setting it to 32
has the identical effect.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:29 +02:00
Mike Miller
b0dd5cad3a cciss: remove errant debug code
Remove a debug statement left behind by accident Ths debug statement got
left behind.  It was commented out after use but not deleted.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:27 +02:00
Mike Miller
29979a7122 cciss: move next_command function from ifdef
The definition of next_command also ended up in wrong place It ended up
inside an "#ifdef CONFIG_PROCFS".  Already caught by Randy Dunlap and a
couple others.  Tried to put it somewhere that made sense.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:25 +02:00
Mike Miller
b14aa6dcd0 cciss: fix call to put_controller_in_performant_mode
call to put_controller_in_performant_mode was in the wrong place
The call inadvertently ended up in an error path.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:23 +02:00
Mike Miller
256aea3fd3 cciss: make sure we request the performant mode irq
Make sure we register the performant mode interrupt Another blunder.
Seemed to work because the call to put_controller_into_performant_mode was
never called.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:21 +02:00
Mike Miller
841fdffdd3 cciss: new controller support and bump driver version
Add support for new controllers due out next year.  HP must continue to
support new controllers in older distros.  All vendors require support be
upstream.  These controllers support only 16 commands in simple mode but
can support up to 1024 in performant mode.  See patch 5/6/ We have no
marketing names yet.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:51 +02:00
Mike Miller
5e216153c3 cciss: add performant mode support for Stars/Sirius
Add a mode of controller operation called Performant Mode.  Even though
cciss has been deprecated in favor of hpsa there are new controllers due
out next year that HP must support in older vendor distros.  Vendors
require all fixes/features be upstream.  These new controllers support
only 16 commands in simple mode but support up to 1024 in performant mode.
This requires us to add this support at this late date.

The performant mode transport minimizes host PCI accesses by performinf
many completions per read.  PCI writes are posted so the host can write
then immediately get off the bus not waiting for the writwe to complete to
the target.  In the context of performant mode the host read out to a
controller pulls all posted writes into host memory ensuring the reply
queue is coherent.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:51 +02:00
Mike Miller
1d1414419f cciss: make interrupt access methods return type bool
Change the return type of our interrupt access routines to bool from
unsigned long.  It makes more sense that way.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:51 +02:00
Mike Miller
2cf3af1c9e cciss: check for msi in interrupt_not_for_us
Check to see if h->msi[x]_vector is set.  We need this for a following
patch.  Without this check we process one interrupt then stop because in
msi[x] mode the interrupt pending bit is not set.  Not sure why we didn't
encounter this before.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:35 +02:00
Mike Miller
0c2b39087c cciss: clean up interrupt handler
Simplify the interrupt handler code to more closely match hpsa and to
hopefully make it easier to follow.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:33 +02:00
Mike Miller
664a717d3a cciss: enqueue and submit io
Clean up some code where we subit our io.  The same 5 lines appeared
several times.  Also helps for a following patch.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:32 +02:00
Grant Likely
2dc1158137 of/device: Replace struct of_device with struct platform_device
of_device is just an alias for platform_device, so remove it entirely.  Also
replace to_of_device() with to_platform_device() and update comment blocks.

This patch was initially generated from the following semantic patch, and then
edited by hand to pick up the bits that coccinelle didn't catch.

@@
@@
-struct of_device
+struct platform_device

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: David S. Miller <davem@davemloft.net>
2010-08-06 09:25:50 -06:00
Linus Torvalds
552c7dbb34 Merge branch 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio_blk: Remove VBID ioctl
  virtio_blk: Add 'serial' attribute to virtio-blk devices (v2)
  virtio_blk: support barriers without FLUSH feature
2010-08-05 13:49:37 -07:00
Linus Torvalds
db7a1535d2 Merge branch 'upstream/xen' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'upstream/xen' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: (23 commits)
  xen/panic: use xen_reboot and fix smp_send_stop
  Xen: register panic notifier to take crashes of xen guests on panic
  xen: support large numbers of CPUs with vcpu info placement
  xen: drop xen_sched_clock in favour of using plain wallclock time
  pvops: do not notify callers from register_xenstore_notifier
  Introduce CONFIG_XEN_PVHVM compile option
  blkfront: do not create a PV cdrom device if xen_hvm_guest
  support multiple .discard.* sections to avoid section type conflicts
  xen/pvhvm: fix build problem when !CONFIG_XEN
  xenfs: enable for HVM domains too
  x86: Call HVMOP_pagetable_dying on exit_mmap.
  x86: Unplug emulated disks and nics.
  x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock.
  implement O_NONBLOCK for /proc/xen/xenbus
  xen: Fix find_unbound_irq in presence of ioapic irqs.
  xen: Add suspend/resume support for PV on HVM guests.
  xen: Xen PCI platform device driver.
  x86/xen: event channels delivery on HVM.
  x86: early PV on HVM features initialization.
  xen: Add support for HVM hypercalls.
  ...
2010-08-05 13:45:50 -07:00
Ryan Harper
6c99a8528f virtio_blk: Remove VBID ioctl
With the availablility of a sysfs device attribute for examining disk serial
numbers the ioctl is no longer needed.  The user-space changes for this aren't
upstream yet so we don't have any users to worry about.

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-05 13:05:31 +09:30
Ryan Harper
a5eb9e4ff1 virtio_blk: Add 'serial' attribute to virtio-blk devices (v2)
Create a new attribute for virtio-blk devices that will fetch the serial number
of the block device.  This attribute can be used by udev to create disk/by-id
symlinks for devices that don't have a UUID (filesystem) associated with them.

ATA_IDENTIFY strings are special in that they can be up to 20 chars long
and aren't required to be nul-terminated.  The buffer is also zero-padded
meaning that if the serial is 19 chars or less that we get a nul-terminated
string.  When copying this value into a string buffer, we must be careful to
copy up to the nul (if it present) and only 20 if it is longer and not to
attempt to nul terminate; this isn't needed.

Changes since v1:
- Added BUILD_BUG_ON() for PAGE_SIZE check
- Removed min() since BUILD_BUG_ON() handles the check
- Replaced serial_sysfs() by copying id directly to buffer

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-05 13:05:30 +09:30
Christoph Hellwig
10bc310c27 virtio_blk: support barriers without FLUSH feature
If we want to support barriers with the cache=writethrough mode in qemu
we need to tell the block layer that we only need queue drains to
implement a barrier.  Follow the model set by SCSI and IDE and assume
that there is no volatile write cache if the host doesn't advertize it.
While this might imply working barriers on old qemu versions or other
hypervisors that actually have a volatile write cache this is only a
cosmetic issue - these hypervisors don't guarantee any data integrity
with or without this patch, but with the patch we at least provide
data ordering.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-05 13:05:29 +09:30
Jiri Kosina
d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Stefano Stabellini
b98a409b80 blkfront: do not create a PV cdrom device if xen_hvm_guest
It is not possible to unplug emulated cdrom devices, and PV cdroms don't
handle media insert, eject and stream, so we are better off disabling PV
cdroms when running as a Xen HVM guest.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2010-07-29 11:11:08 -07:00
Stefano Stabellini
c1c5413ad5 x86: Unplug emulated disks and nics.
Add a xen_emul_unplug command line option to the kernel to unplug
xen emulated disks and nics.

Set the default value of xen_emul_unplug depending on whether or
not the Xen PV frontends and the Xen platform PCI driver have
been compiled for this kernel (modules or built-in are both OK).

The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM
even if the host platform doesn't support unplug.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-26 23:13:25 -07:00
Kulikov Vasiliy
0e4a9d03df block: cciss: use ARRAY_SIZE
Change sizeof(x) / sizeof(*x) to ARRAY_SIZE(x).

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-20 17:02:03 +02:00
Pavel Machek
a2531293db update email address
pavel@suse.cz no longer works, replace it with working address.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-19 10:56:54 +02:00
Uwe Kleine-König
698f93159a fix comment/printk typos concerning "already"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-11 21:45:40 +02:00
Stephen M. Cameron
79600aadcf cciss: set SCSI max cmd len to 16, as default is wrong
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: Mike Miller <mikem@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-15 08:12:34 +02:00
Jens Axboe
552618d124 cpqarray: fix two more wrong section type
cpqarray_register_ctlr() and cpqarray_eisa_detect() also
need to be marked as __devinit.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-14 15:21:33 +02:00
Jens Axboe
d4a3895f5d cpqarray: fix wrong __init type on pci probe function
It needs to be __devinit, not __init.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-14 12:55:09 +02:00
Philipp Reisner
dc66c74de6 drbd: Fixed a race between disk-attach and unexpected state changes
This was a very hard to trigger race condition.

If we got a state packet from the peer, after drbd_nl_disk() has
already changed the disk state to D_NEGOTIATING but
after_state_ch() was not yet run by the worker, then receive_state()
might called drbd_sync_handshake(), which in turn crashed
when accessing p_uuid.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-06-14 12:19:41 +02:00
Linus Torvalds
d2dd328b7f Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits)
  block: make blk_init_free_list and elevator_init idempotent
  block: avoid unconditionally freeing previously allocated request_queue
  pipe: change /proc/sys/fs/pipe-max-pages to byte sized interface
  pipe: change the privilege required for growing a pipe beyond system max
  pipe: adjust minimum pipe size to 1 page
  block: disable preemption before using sched_clock()
  cciss: call BUG() earlier
  Preparing 8.3.8rc2
  drbd: Reduce verbosity
  drbd: use drbd specific ratelimit instead of global printk_ratelimit
  drbd: fix hang on local read errors while disconnected
  drbd: Removed the now empty w_io_error() function
  drbd: removed duplicated #includes
  drbd: improve usage of MSG_MORE
  drbd: need to set socket bufsize early to take effect
  drbd: improve network latency, TCP_QUICKACK
  drbd: Revert "drbd: Create new current UUID as late as possible"
  brd: support discard
  Revert "writeback: fix WB_SYNC_NONE writeback from umount"
  Revert "writeback: ensure that WB_SYNC_NONE writeback with sb pinned is sync"
  ...
2010-06-04 15:37:44 -07:00
Linus Torvalds
39059cceed Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/macio: Fix probing of macio devices by using the right of match table
  agp/uninorth: Fix oops caused by flushing too much
  powerpc/pasemi: Update MAINTAINERS file
  powerpc/cell: Fix integer constant warning
  powerpc/kprobes: Remove resume_execution() in kprobes
  powerpc/macio: Don't dereference pointer before null check
2010-06-03 15:46:37 -07:00
Christoph Hellwig
a5b365a652 virtio-blk: fix minimum number of S/G elements
We need at least one S/G element to operate properly, as does the block
layer which increments it to one anyway.  We hit this due to a qemu
bug which advertises a sg_elements of 0 under some circumstances.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (tweaked logic)
2010-06-03 22:39:18 +09:30
Benjamin Herrenschmidt
c2cdf6aba0 powerpc/macio: Fix probing of macio devices by using the right of match table
Grant patches added an of mach table to struct device_driver. However,
while he changed the macio device code to use that, he left the match
table pointer in struct macio_driver and didn't update drivers to use
the "new" one, thus breaking the probing.

This completes the change by moving all drivers to setup the "new"
one, removing all traces of the old one, and while at it (since it
changes the exact same locations), I also remove two other duplicates
from struct driver which are the name and owner fields.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-02 17:50:38 +10:00
Jens Axboe
b4ca761577 Merge branch 'master' into for-linus
Conflicts:
	fs/pipe.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 12:42:12 +02:00
Dan Carpenter
713b686494 cciss: call BUG() earlier
I moved the range check after the increment.  The current code would
write past the end of the array once before calling BUG().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 12:17:48 +02:00
Philipp Reisner
2a0ab2cd73 drbd: Reduce verbosity
The "Local READ/WRITE failed" messages are too verbose.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg
7383506c87 drbd: use drbd specific ratelimit instead of global printk_ratelimit
using the global printk_ratelimit() may mask other messages.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg
d255e5ff5f drbd: fix hang on local read errors while disconnected
"canceled" w_read_retry_remote never completed, if they have been
canceled after drbd_disconnect connection teardown cleanup has already
run (or we are currently not connected anyways).

Fixed by not queueing a remote retry if we already know it won't work
(pdsk not uptodate), and cleanup ourselves on "cancel", in case we hit a
race with drbd_disconnect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Philipp Reisner
32fa7e91f9 drbd: Removed the now empty w_io_error() function
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Andrea Gelmini
039e1fb654 drbd: removed duplicated #includes
drbd/drbd_receiver.c: linux/mm.h is included more than once.

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg
ba11ad9a3b drbd: improve usage of MSG_MORE
It seems to improve performance if we allow the "p_data" header in its
own frame (no MSG_MORE), but sendpage all but the last page with MSG_MORE.
This is also in preparation of a later zero copy receive implementation.

Suggested by Eduard.Guzovsky@stratus.com on drbd-dev.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg
5dbf167338 drbd: need to set socket bufsize early to take effect
quoting tcp(7):
    On individual connections, the socket buffer size must be set prior to the
    listen(2) or connect(2) calls in order to have it take effect.

This adds a wrapper to do so, and uses it appropriately.
Improves performance in certain situations.

Note that because we cannot easily determine which socket will be
"meta" and wich "data" (bulk) socket, we adjust both sockets.
Previously, DRBD only adjusted the bufsizes of the "data" socket.

Thanks again to Eduard.Guzovsky@stratus.com.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg
344fa462e3 drbd: improve network latency, TCP_QUICKACK
On Thu, Apr 29, 2010 at 04:00:50PM -0400, Eduard.Guzovsky@stratus.com
 wrote on drbd-dev@lists.linbit.com
 Subject: [Drbd-dev] DRBD small synchronous writes performance improvements

> 1. TCP_QUICKACK option is set incorrectly. The goal was force TCP to
> send and ACK as a  "one time" event.  Instead the code permanently sets
> connection in the QUICKACK mode.

He is right, we actually want to use an even val with TCP_QUICKACK.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Philipp Reisner
2c8d196759 drbd: Revert "drbd: Create new current UUID as late as possible"
The late-UUID writing is delayed until the next release.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:26 +02:00
Nick Piggin
b7c335713e brd: support discard
Support discard requests in brd by zeroing or deleting the underlying backing
pages. This is simply to help with testing and documentation nature of
brd code.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-06-01 11:09:20 +02:00
Grant Likely
cf9b59e9d3 Merge remote branch 'origin' into secretlab/next-devicetree
Merging in current state of Linus' tree to deal with merge conflicts and
build failures in vio.c after merge.

Conflicts:
	drivers/i2c/busses/i2c-cpm.c
	drivers/i2c/busses/i2c-mpc.c
	drivers/net/gianfar.c

Also fixed up one line in arch/powerpc/kernel/vio.c to use the
correct node pointer.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22 00:36:56 -06:00
Grant Likely
4018294b53 of: Remove duplicate fields from of_platform_driver
.name, .match_table and .owner are duplicated in both of_platform_driver
and device_driver.  This patch is a removes the extra copies from struct
of_platform_driver and converts all users to the device_driver members.

This patch is a pretty mechanical change.  The usage model doesn't change
and if any drivers have been missed, or if anything has been fixed up
incorrectly, then it will fail with a compile time error, and the fixup
will be trivial.  This patch looks big and scary because it touches so
many files, but it should be pretty safe.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sean MacLennan <smaclennan@pikatech.com>
2010-05-22 00:10:40 -06:00
Linus Torvalds
e8bebe2f71 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (69 commits)
  fix handling of offsets in cris eeprom.c, get rid of fake on-stack files
  get rid of home-grown mutex in cris eeprom.c
  switch ecryptfs_write() to struct inode *, kill on-stack fake files
  switch ecryptfs_get_locked_page() to struct inode *
  simplify access to ecryptfs inodes in ->readpage() and friends
  AFS: Don't put struct file on the stack
  Ban ecryptfs over ecryptfs
  logfs: replace inode uid,gid,mode initialization with helper function
  ufs: replace inode uid,gid,mode initialization with helper function
  udf: replace inode uid,gid,mode init with helper
  ubifs: replace inode uid,gid,mode initialization with helper function
  sysv: replace inode uid,gid,mode initialization with helper function
  reiserfs: replace inode uid,gid,mode initialization with helper function
  ramfs: replace inode uid,gid,mode initialization with helper function
  omfs: replace inode uid,gid,mode initialization with helper function
  bfs: replace inode uid,gid,mode initialization with helper function
  ocfs2: replace inode uid,gid,mode initialization with helper function
  nilfs2: replace inode uid,gid,mode initialization with helper function
  minix: replace inode uid,gid,mode init with helper
  ext4: replace inode uid,gid,mode init with helper
  ...

Trivial conflict in fs/fs-writeback.c (mark bitfields unsigned)
2010-05-21 19:37:45 -07:00
Linus Torvalds
1756ac3d3c Merge branch 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
  drivers/char: Eliminate use after free
  virtio: console: Accept console size along with resize control message
  virtio: console: Store each console's size in the console structure
  virtio: console: Resize console port 0 on config intr only if multiport is off
  virtio: console: Add support for nonblocking write()s
  virtio: console: Rename wait_is_over() to will_read_block()
  virtio: console: Don't always create a port 0 if using multiport
  virtio: console: Use a control message to add ports
  virtio: console: Move code around for future patches
  virtio: console: Remove config work handler
  virtio: console: Don't call hvc_remove() on unplugging console ports
  virtio: console: Return -EPIPE to hvc_console if we lost the connection
  virtio: console: Let host know of port or device add failures
  virtio: console: Add a __send_control_msg() that can send messages without a valid port
  virtio: Revert "virtio: disable multiport console support."
  virtio: add_buf_gfp
  trans_virtio: use virtqueue_xxx wrappers
  virtio-rng: use virtqueue_xxx wrappers
  virtio_ring: remove a level of indirection
  virtio_net: use virtqueue_xxx wrappers
  ...

Fix up conflicts in drivers/net/virtio_net.c due to new virtqueue_xxx
wrappers changes conflicting with some other cleanups.
2010-05-21 17:22:52 -07:00
Christoph Hellwig
8018ab0574 sanitize vfs_fsync calling conventions
Now that the last user passing a NULL file pointer is gone we can remove
the redundant dentry argument and associated hacks inside vfs_fsynmc_range.

The next step will be removig the dentry argument from ->fsync, but given
the luck with the last round of method prototype changes I'd rather
defer this until after the main merge window.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:21 -04:00
Jens Axboe
ee9a3607fb Merge branch 'master' into for-2.6.35
Conflicts:
	fs/ext3/fsync.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:27:26 +02:00
Philipp Reisner
4e23a59ed1 drbd: Do not free p_uuid early, this is done in the exit code of the receiver
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:01 +02:00
Philipp Reisner
23ce422748 drbd: Null pointer deref fix to the large "multi bio rewrite"
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:01 +02:00
Philipp Reisner
fc8ce1941d drbd: Fix: Do not detach, if a bio with a barrier fails
Introduced a few days ago:
  commit 45bb912bd5
  Author: Lars Ellenberg <lars.ellenberg@linbit.com>
  Date:   Fri May 14 17:10:48 2010 +0200

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:00 +02:00
Philipp Reisner
4604d63668 drbd: Ensure to not trigger late-new-UUID creation multiple times
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:00 +02:00
Philipp Reisner
31a31dccdd drbd: Do not Oops when C_STANDALONE when uuid gets generated
Got introduces with

commit 0c3f34516e
Author: Philipp Reisner <philipp.reisner@linbit.com>
Date:   Mon May 17 16:10:43 2010 +0200

    drbd: Create new current UUID as late as possible

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:00 +02:00
David Zeuthen
c3473c6354 generate "change" uevent for loop device
Recent udev versions probe loop devices for filesystems meaning that
the /dev/disk hierarchy may contain useful entries such as

 $ ls -l /dev/disk/by-label/Fedora-12-x86_64-Live
 lrwxrwxrwx 1 root root 11 Mar 11 13:41 /dev/disk/by-label/Fedora-12-x86_64-Live -> ../../loop0

Unfortunately, no "change" uevent is generated when the loop device is
detached so the symlink persists. Additionally, no "change" uevent is
guaranteed to be generated when attaching an fd or changing capacity.
For example,  user space could open the loop device O_RDONLY (in fact,
recent util-linux-ng does this) so udev's OPTIONS+="watch" machinery may
not trigger the "change" uevent.

This patch ensures that the "change" uevent is generated in all of
these cases. As a result, the /dev/disk hierarchy works as expected
for loop devices.

Signed-off-by: David Zeuthen <davidz@redhat.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:30 -07:00
Linus Torvalds
f39d01be4c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
  vlynq: make whole Kconfig-menu dependant on architecture
  add descriptive comment for TIF_MEMDIE task flag declaration.
  EEPROM: max6875: Header file cleanup
  EEPROM: 93cx6: Header file cleanup
  EEPROM: Header file cleanup
  agp: use NULL instead of 0 when pointer is needed
  rtc-v3020: make bitfield unsigned
  PCI: make bitfield unsigned
  jbd2: use NULL instead of 0 when pointer is needed
  cciss: fix shadows sparse warning
  doc: inode uses a mutex instead of a semaphore.
  uml: i386: Avoid redefinition of NR_syscalls
  fix "seperate" typos in comments
  cocbalt_lcdfb: correct sections
  doc: Change urls for sparse
  Powerpc: wii: Fix typo in comment
  i2o: cleanup some exit paths
  Documentation/: it's -> its where appropriate
  UML: Fix compiler warning due to missing task_struct declaration
  UML: add kernel.h include to signal.c
  ...
2010-05-20 09:20:59 -07:00
Michael S. Tsirkin
09ec6b69d2 virtio_blk: use virtqueue_xxx wrappers
Switch virtio_blk to new virtqueue_xxx wrappers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:42 +09:30
Rusty Russell
bdb4a13057 virtio_blk: remove multichar constant.
drivers/block/virtio_blk.c:228:13: warning: multi-character character constant

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: john cooper <john.cooper@redhat.com>
2010-05-19 22:15:41 +09:30
john cooper
234f2725a5 Add virtio disk identification ioctl
Return serial string to the guest application via
ioctl driver call.

Note this form of interface to the guest userland
was the consensus when the prior version using
the ATA_IDENTIFY came under dispute.

Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:40 +09:30
john cooper
4cb2ea28c5 Add virtio disk identification support
Add virtio-blk device id (s/n) support via virtio request.

Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:40 +09:30
Grant Likely
61c7a080a5 of: Always use 'struct device.of_node' to get device node pointer.
The following structure elements duplicate the information in
'struct device.of_node' and so are being eliminated.  This patch
makes all readers of these elements use device.of_node instead.

(struct of_device *)->node
(struct dev_archdata *)->prom_node (sparc)
(struct dev_archdata *)->of_node (powerpc & microblaze)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-18 16:10:44 -06:00
Linus Torvalds
1014cfe2fb Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: Reduce stack_trace usage
  lockdep: No need to disable preemption in debug atomic ops
  lockdep: Actually _dec_ in debug_atomic_dec
  lockdep: Provide off case for redundant_hardirqs_on increment
  lockdep: Simplify debug atomic ops
  lockdep: Fix redundant_hardirqs_on incremented with irqs enabled
  lockstat: Make lockstat counting per cpu
  i8253: Convert i8253_lock to raw_spinlock
2010-05-18 08:17:35 -07:00
Julia Lawall
2db4e42eac drivers/block/drbd: Use kzalloc
Use kzalloc rather than the combination of kmalloc and memset.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,size,flags;
statement S;
@@

-x = kmalloc(size,flags);
+x = kzalloc(size,flags);
 if (x == NULL) S
-memset(x, 0, size);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:04:10 +02:00
Philipp Reisner
0c3f34516e drbd: Create new current UUID as late as possible
The choice was to either delay creation of the new UUID until
IO got thawed or to delay it until the first IO request.

Both are correct, the later is more friendly to users of
dual-primary setups, that actually only write on one side.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:49 +02:00
Philipp Reisner
9a25a04c80 drbd: If we detect late that IO got frozen, retry after we thawed.
If we detect late (= after grabing mdev->req_lock) that IO got frozen, we
return 1 to generic_make_request(), which simply will retry to make a
request for that bio.

In the subsequent call of generic_make_request() into drbd_make_request_26()
we sleep in inc_ap_bio().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:32 +02:00
Lars Ellenberg
a1c88d0d7a drbd: always use_bmbv, ignore setting
Now that the peer may handle multi-bio EEs,
we can ignore the peer's limit,
and concentrate on the limits of the local IO stack.

This is safe accross drbd protocol versions,
as our queue_max_sectors() will be adjusted accordingly.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:05 +02:00
Lars Ellenberg
bb3d000cb9 drbd: allow resync requests to be larger than max_segment_size
this should allow for better background resync performance.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:02:36 +02:00
Lars Ellenberg
45bb912bd5 drbd: Allow drbd_epoch_entries to use multiple bios.
This should allow for better performance if the lower level IO stack
of the peers differs in limits exposed either via the queue,
or via some merge_bvec_fn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:01:23 +02:00
Lars Ellenberg
708d740ed8 drbd: reduce sizeof struct drbd_epoch_entry by 8 byte by aligning members
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:28:35 +02:00
Philipp Reisner
162f3ec7f0 drbd: Fixes to the new delay_probes code
* Only send delay_probes with protocol 93 or newer
* drbd_send_delay_probes() is called only from worker context,
  no atomic_t needed for delay_seq

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:28:08 +02:00
Philipp Reisner
a8cdfd8d3b drbd: A fixes to the new resync speed code
* Mention P_DELAY_PROBE in the packet naming array
* Do not corrupt the mdev->data.work list in case the timer goes
  off before delay_probe_work got handled by the worker
* Do not mod_timer() twice for a single delay_probe pair

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:26:51 +02:00
Philipp Reisner
eedf386ae9 drbd: Proc bits of new resync speed stuff
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:26:27 +02:00
Philipp Reisner
cdd67a7460 drbd: Control the actual resync rate based on the queuing delay of data packets
In a setup with a high bandwidth and high latency network, eventually
involving deep queues in routers, it is beneficial to only fill those
queues up to an limited extend with resync data.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:47 +02:00
Philipp Reisner
bd26bfc5b4 drbd: Actually send delay probes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:28 +02:00
Philipp Reisner
67c7ddd055 drbd: Four new configuration settings for resync speed control
To reasonably control resync speed over drbd-proxy connections,
drbd has to measure the current delay of packets transmitted over
the (possibly congested) data socket vs the meta-data socket.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:00 +02:00
Philipp Reisner
7237bc430f drbd: Sending of delay_probes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:22:46 +02:00
Philipp Reisner
0ced55a3be drbd: Receiving of delay_probes
Delay_probes are new packets in the DRBD protocol, which allow
DRBD to know the current delay packets have on the data socket.
(relative to the meta data socket)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:22:11 +02:00
Philipp Reisner
5223671bb0 drbd: Fixed bitmap in case of online-grow without resync
The "surplus" bits of the old (smaller) bitmap must be clean
in case of online-grow without resync.

Note: Reverted 67ae8b80d4a116ab3b7094eb3723506b20c06dff as
well, since the lines added by this patch are redundant. The
bits get set by the bm_set_surplus(b) call before that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:20:33 +02:00
Philipp Reisner
6b4388ac1f drbd: Added transmission faults to the fault injection code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:19:51 +02:00
Philipp Reisner
087c24925c drbd: bugfix: Make resize work, if remote's size was limiting and increased in the meantime
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:18:22 +02:00
Philipp Reisner
6495d2c6d0 drbd: Implemented the --assume-clean option for drbdsetup resize
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:17:47 +02:00
Philipp Reisner
b4ee79dac3 drbd: Added some missing statics
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:17:11 +02:00
Philipp Reisner
fd76438c24 drbd: Make sure to resync all of the new storage upon online resize
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:16:20 +02:00
Philipp Reisner
e89b591c3a drbd: Implemented flags for the resize packet
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:15:44 +02:00
Philipp Reisner
02d9a94bbb drbd: Implemented the set_new_bits parameter for drbd_bm_resize()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:14:43 +02:00
Philipp Reisner
d845030f21 drbd: made determin_dev_size's parameter an flag enum
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:14:04 +02:00
Adam Gandelman
3a11a48789 drbd: New handler: initial-split-brain
Some wish to be notified of all instances of split brain, not just those that
go unresolved.  The initial-split-brain handler is called to notify someone
upon  detection of all split brain conditions even if auto-recovery policies
are configured.

Signed-off-by: Adam Gandelman <adam.gandelman@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:13:33 +02:00
Lars Ellenberg
979f5c7f1f drbd: fail_requests_early: remove incorrect and unnecessary optimization
The condition does not fit the commend (I may well be Primary,
even if I lost the disk earlier and now the connection).

And this is catched below anyways, where it also gets logged.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:10:31 +02:00
Lars Ellenberg
6666032ade drbd: check for corrupt or malicous sector addresses when receiving data
Even if it should never happen if the peer does behave, we need to
double check, and not even attempt access beyond end of device.
It usually would be caught by lower layers, resulting in "IO error",
but may also end up in the internal meta data area.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:09:57 +02:00
Philipp Reisner
c3fe30b0e7 drbd: cleanup: This code path to trigger a resync is no longer needed
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:09:13 +02:00
Lars Ellenberg
8d4ce82b3c drbd: don't start a resync without access to up-to-date Data
In case both nodes are "inconsistent", invalidate would
have started a resync anyways, without a chance to ever
succeed, just filling the logs with warning messages.

Simply disallow that state change,
re-using the SS_NO_UP_TO_DATE_DISK return value.

This also changes the corresponding error string to
"Need access to UpToDate Data" -- I found the
"Refusing to be Primary without at least one UpToDate disk"
answer misleading in some situations anyways.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:08:18 +02:00
Lars Ellenberg
c3470cde57 drbd: fix potential protocol error
Don't forget to drain the digest in case we cannot satisfy a
checksum based resync or online-verify request.

It would additionally cause a protocoll error,
dropping the connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:07:38 +02:00
Lars Ellenberg
8d1894ebe4 drbd: remove bogus ASSERT
block_id may be ID_SYNCER,
as well as checksum based resync request magic, or online verify magic.

Let's just drop that ASSERT.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:06:59 +02:00
Lars Ellenberg
e0f83012dc drbd: fix regression: attach while connected failed
commit e4f925e12e
Author: Philipp Reisner <philipp.reisner@linbit.com>
Date:   Wed Mar 17 14:18:41 2010 +0100

    drbd: Do not upgrade state to Outdated if already Inconsistent

prevented the necessary state transition for attaching while connected
(Diskless -> Consistent respectively Outdated).
This is the fix for the fix.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:06:07 +02:00
Philipp Reisner
e4f925e12e drbd: Do not upgrade state to Outdated if already Inconsistent [Bugz 277]
There was a race condition:
  In a situation with a SyncSource+Primary and a SyncTarget+Secondary node,
  and a resync dependency to some other device. After both nodes decided
  to do the resync, the other device finishes its resync process.
  At that time SyncSource already sent the P_SYNC_UUID packet, and
  already updated its peer disk state to Inconsistent.
  The SyncTarget node waits for the P_SYNC_UUID and sends a state packet
  to report the resync dependency change. That packet still carries
  a disk state of Outdated.

Impact:
  If application writes come in, during that time on the Primary node,
  those do not get replicated, and the out-of-sync counter gets increased.
  => The completion of resync is not detected on the primary node.
  => stalled.
  Those blocks get resync'ed with the next resync, since the are get
  marked as out-of-sync in the bitmap.

In order to fix this, we filter out that wrong state change in the
sanitize_state() function.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:01:05 +02:00