linux

mirror of https://github.com/torvalds/linux synced 2024-10-23 11:46:42 +00:00

Author	SHA1	Message	Date
Lars Ellenberg	e334f55095	drbd: make sure disk cleanup happens in worker context The recent fix to put_ldev() (correct ordering of access to local_cnt and state.disk; memory barrier in __drbd_set_state) guarantees that the cleanup happens exactly once. However it does not yet guarantee that the cleanup happens from worker context, the last put_ldev() may still happen from atomic context, which must not happen: blkdev_put() may sleep. Fix this by scheduling the cleanup to the worker instead, using a couple more bits in device->flags and a new helper, drbd_device_post_work(). Generalized the "resync progress" work to cover these new work bits. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:55 +02:00
Lars Ellenberg	ba3c6fb87d	drbd: close race when detaching from disk BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: bd_release+0x21/0x70 Process drbd_w_t7146 Call Trace: close_bdev_exclusive drbd_free_ldev [drbd] drbd_ldev_destroy [drbd] w_after_state_ch [drbd] Race probably went like this: state.disk = D_FAILED ... first one to hit zero during D_FAILED: put_ldev() /* ----------------> 0 */ i = atomic_dec_return() if (i == 0) if (state.disk == D_FAILED) schedule_work(go_diskless) /* 1 <------ */ get_ldev_if_state() go_diskless() do_some_pre_cleanup() corresponding put_ldev(): force_state(D_DISKLESS) /* 0 <------ */ i = atomic_dec_return() if (i == 0) atomic_inc() /* ---------> 1 */ state.disk = D_DISKLESS schedule_work(after_state_ch) /* execution pre-empted by IRQ ? */ after_state_ch() put_ldev() i = atomic_dec_return() /* 0 */ if (i == 0) if (state.disk == D_DISKLESS) if (state.disk == D_DISKLESS) drbd_ldev_destroy() drbd_ldev_destroy(); Trying to fix this by checking the disk state *before* the atomic_dec_return(), which implies memory barriers, and by inserting extra memory barriers around the state assignment in __drbd_set_state(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:54 +02:00
Lars Ellenberg	2ed912e9d3	drbd: explicitly submit meta data requests with REQ_NOIDLE For some reason we have assumed NOIDLE was implied by one of the other flags we set. It is not (anymore?). Explicitly set REQ_NOIDLE for synchronous meta data updates, or we can seriously starve random writes when using CFQ. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:54 +02:00
Lars Ellenberg	720979fb90	drbd: move set_disk_ro() to after we persisted the new role This probably does not have any real life impact, but we should first persist any potentially new UUID and other meta data flags, as well as our new role, before we allow/disallow write access. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:53 +02:00
Lars Ellenberg	123ff122ad	drbd: trigger tcp_push_pending_frames() for PING and PING_ACK This should reduce latency for such in-DRBD-protocol "pings", and may help reduce spurious disconnect/reconnect cycles due to "PingAck did not arrive in time." Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:52 +02:00
Lars Ellenberg	66ce6dbce2	drbd: re-add lost conf_mutex protection in drbd_set_role The conf_update mutex used to be held while clearing the net_conf->discard_my_data flag inside drbd_set_role. It was moved into drbd_adm_set_role with drbd: allow parallel promote/demote actions but then replaced at that location by the newly introduced adm_mutex with drbd: Fix a potential deadlock in drbdsetup, introduce resource->adm_mutex And I simply forgot to put it back in at the original location. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:52 +02:00
Lars Ellenberg	fcb096740a	drbd: stop the meta data sync timer before open coded meta data sync If we re-write all meta data due to resize, we have open-coded write-out of our meta data super block. Stop the md_sync_timer, it would just trigger scary but in this case spurious "timer expired" messages. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:51 +02:00
Lars Ellenberg	5ab7d2c005	drbd: fix resync finished detection This fixes one recent regression, and one long existing bug. The bug: drbd_try_clear_on_disk_bm() assumed that all "count" bits have to be accounted in the resync extent corresponding to the start sector. Since we allow application requests to cross our "extent" boundaries, this assumption is no longer true, resulting in possible misaccounting, scary messages ("BAD! sector=12345s enr=6 rs_left=-7 rs_failed=0 count=58 cstate=..."), and potentially, if the last bit to be cleared during resync would reside in a previously misaccounted resync extent, the resync would never be recognized as finished, but would be "stalled" forever, even though all blocks are in sync again and all bits have been cleared... The regression was introduced by drbd: get rid of atomic update on disk bitmap works For an "empty" resync (rs_total == 0), we must not "finish" the resync on the SyncSource before the SyncTarget knows all relevant information (sync uuid). We need to wait for the full round-trip, the SyncTarget will then explicitly notify us. Also for normal, non-empty resyncs (rs_total > 0), the resync-finished condition needs to be tested before the schedule() in wait_for_work, or it is likely to be missed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:50 +02:00
Lars Ellenberg	a80ca1ae81	drbd: fix a race stopping the worker thread We may implicitly call drbd_send() from inside wait_for_work(), via maybe_send_barrier(). If the "stop" signal was sent just before that, drbd_send() would call flush_signals(), and we would run an unbounded schedule() afterwards. Fix: check for thread_state == RUNNING before we schedule() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:50 +02:00
Lars Ellenberg	c7a58db4e9	drbd: get rid of atomic update on disk bitmap works Just trigger the occasional lazy bitmap write-out during resync from the central wait_for_work() helper. Previously, during resync, bitmap pages would be written out separately, synchronously, one at a time, at least 8 times each (every 512 bytes worth of bitmap cleared). Now we trigger "merge friendly" bulk write out of all cleared pages every two seconds during resync, and once the resync is finished. Most pages will be written out only once. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 18:34:49 +02:00
Lars Ellenberg	70df70927b	drbd: allow write-ordering policy to be bumped up again Previously, once you disabled flushes as a means of enforcing write-ordering, you'd need to detach/re-attach to enable them again. Allow drbdsetup disk-options to re-enable previously disabled write-ordering policy options at runtime. While at it fix RCU in drbd_bump_write_ordering() max_allowed_wo() uses rcu_dereference, therefore it must be called within rcu_read_lock()/rcu_read_unlock() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 15:22:22 +02:00
Lars Ellenberg	44a4d55184	drbd: refactor use of first_peer_device() Reduce the number of calls to first_peer_device(). Instead, call first_peer_device() just once to assign a local variable peer_device. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 15:22:22 +02:00
Lars Ellenberg	35b5ed5bba	drbd: reduce number of spinlock drop/re-aquire cycles Instead of dropping and re-acquiring the spinlock around the submit, just remember that we want to submit, and do that only once we have dropped the spinlock for good. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 15:22:21 +02:00
Philipp Reisner	28995af5cf	drbd: rename drbd_free_bc() to drbd_free_ldev() Since the member of drbd_device is called ldev Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 15:22:21 +02:00
Philipp Reisner	8fe39aac05	drbd: device->ldev is not guaranteed on an D_ATTACHING disk Some parts of the code assumed that get_ldev_if_state(device, D_ATTACHING) is sufficient to access the ldev member of the device object. That was wrong. ldev may not be there or might be freed at any time if the device has a disk state of D_ATTACHING. bm_rw() Documented that drbd_bm_read() is only called from drbd_adm_attach. drbd_bm_write() is only called when a reference is held, and it is documented that a caller has to hold a reference before calling drbd_bm_write() drbd_bm_write_page() Use get_ldev() instead of get_ldev_if_state(device, D_ATTACHING) drbd_bmio_set_n_write() No longer use get_ldev_if_state(device, D_ATTACHING). All callers hold a reference to ldev now. drbd_bmio_clear_n_write() All callers were holding a reference to ldev anyway. Remove the misleading get_ldev_if_state(device, D_ATTACHING) drbd_reconsider_max_bio_size() Removed the get_ldev_if_state(device, D_ATTACHING). All callers now pass a struct drbd_backing_dev* when they have a proper reference, or a NULL pointer. Before this fix, the receiver could trigger a NULL pointer deref when in drbd_reconsider_max_bio_size() drbd_bump_write_ordering() Used get_ldev_if_state(device, D_ATTACHING) with the wrong assumption. Remove it, and allow the caller to pass in a struct drbd_backing_dev* when the caller knows that accessing this bdev is safe. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 15:22:20 +02:00
Philipp Reisner	e952658020	drbd: Move write_ordering from connection to resource Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-07-10 15:22:19 +02:00
Ming Lei	6a27b656fc	block: virtio-blk: support multi virt queues per virtio-blk device Firstly this patch supports more than one virtual queues for virtio-blk device. Secondly this patch maps the virtual queue to blk-mq's hardware queue. With this approach, both scalability and performance can be improved. Signed-off-by: Ming Lei <ming.lei@canonical.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:51:03 -06:00
Ming Lei	cb553215d5	include/uapi/linux/virtio_blk.h: introduce feature of VIRTIO_BLK_F_MQ The current virtio-blk spec supports only one virtual queue for transferring data between VM and host, and inside the VM all kinds of operations on the virtual queue need to hold one lock, which causes the problems below: - bad scalability - bad throughput This patch requests to introduce the feature VIRTIO_BLK_F_MQ so that more than one virtual queue can be used by a virtio-blk device; the above problems can then be solved or eased. Signed-off-by: Ming Lei <ming.lei@canonical.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:51:01 -06:00
Douglas Gilbert	d15156138d	block SG_IO: add SG_FLAG_Q_AT_HEAD flag After the SG_IO ioctl was copied into the block layer and later into the bsg driver, subtle differences emerged. One difference is the way injected commands are queued through the block layer (i.e. this is not SCSI device queueing nor SATA NCQ). Summarizing: - SG_IO on block layer device: blk_exec*(at_head=false) - sg device SG_IO: at_head=true - bsg device SG_IO: at_head=true Some time ago Boaz Harrosh introduced a sg v4 flag called BSG_FLAG_Q_AT_TAIL to override the bsg driver default. A recent patch titled: "sg: add SG_FLAG_Q_AT_TAIL flag" allowed the sg driver default to be overridden. This patch allows a SG_IO ioctl sent to a block layer device to have its default overridden. ChangeLog: - introduce SG_FLAG_Q_AT_HEAD flag in sg.h to cause commands that are injected via a block layer device SG_IO ioctl to set at_head=true - make comments clearer about queueing in sg.h since the header is used both by the sg device and block layer device implementations of the SG_IO ioctl. - introduce BSG_FLAG_Q_AT_HEAD in bsg.h for compatibility (it does nothing) and update comments. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:48:05 -06:00
Akinobu Mita	9b4231bf99	block: fix SG_[GS]ET_RESERVED_SIZE ioctl when max_sectors is huge SG_GET_RESERVED_SIZE and SG_SET_RESERVED_SIZE ioctls access a reserved buffer in bytes as int type. The value needs to be capped at the request queue's max_sectors. But integer overflow is not correctly handled in the calculation when converting max_sectors from sectors to bytes. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Cc: linux-scsi@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:43:09 -06:00
Akinobu Mita	63f2649659	block: fix BLKSECTGET ioctl when max_sectors is greater than USHRT_MAX BLKSECTGET ioctl loads the request queue's max_sectors as unsigned short value to the argument pointer. So if the max_sector is greater than USHRT_MAX, the upper 16 bits of that is just discarded. In such case, USHRT_MAX is more preferable than the lower 16 bits of max_sectors. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Cc: linux-scsi@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:43:07 -06:00
Fabian Frederick	16e1556526	block/partitions/efi.c: kerneldoc fixing Adding function documentation and fixing kerneldoc warnings ('field: description' uniformization). Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:40:05 -06:00
Fabian Frederick	dce14c239a	block/partitions/msdos.c: code clean-up checkpatch fixing: WARNING: Missing a blank line after declarations WARNING: space prohibited between function name and open parenthesis '(' ERROR: spaces required around that '<' (ctx:VxV) Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:40:03 -06:00
Fabian Frederick	600ffc5ead	block/partitions/amiga.c: replace nolevel printk by pr_err Also add no prefix pr_fmt to avoid any future default format update Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:40:02 -06:00
Fabian Frederick	472d5e2af2	block/partitions/aix.c: replace count*size kzalloc by kcalloc kcalloc manages count*sizeof overflow. Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:40:00 -06:00
Gu Zheng	cbcd1054a1	bio-integrity: add "bip_max_vcnt" into struct bio_integrity_payload Commit `08778795` ("block: Fix nr_vecs for inline integrity vectors") from Martin introduces the function bip_integrity_vecs(get the useful vectors) to fix the issue about nr_vecs for inline integrity vectors that reported by David Milburn. But it seems that bip_integrity_vecs() will return the wrong number if the bio is not based on any bio_set for some reason(bio->bi_pool == NULL), because in that case, the bip_inline_vecs[0] is malloced directly. So here we add the bip_max_vcnt to record the count of vector slots, and cleanup the function bip_integrity_vecs(). Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:36:47 -06:00
Tejun Heo	add703fda9	blk-mq: use percpu_ref for mq usage count Currently, blk-mq uses a percpu_counter to keep track of how many usages are in flight. The percpu_counter is drained while freezing to ensure that no usage is left in-flight after freezing is complete. blk_mq_queue_enter/exit() and blk_mq_[un]freeze_queue() implement this per-cpu gating mechanism. This type of code has relatively high chance of subtle bugs which are extremely difficult to trigger and it's way too hairy to be open coded in blk-mq. percpu_ref can serve the same purpose after the recent changes. This patch replaces the open-coded per-cpu usage counting and draining mechanism with percpu_ref. blk_mq_queue_enter() performs tryget_live on the ref and exit() performs put. blk_mq_freeze_queue() kills the ref and waits until the reference count reaches zero. blk_mq_unfreeze_queue() revives the ref and wakes up the waiters. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Cc: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:34:38 -06:00
Tejun Heo	72d6f02a8d	blk-mq: collapse __blk_mq_drain_queue() into blk_mq_freeze_queue() Keeping __blk_mq_drain_queue() as a separate function doesn't buy us anything and it's gonna be further simplified. Let's flatten it into its caller. This patch doesn't make any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:33:02 -06:00
Tejun Heo	780db2071a	blk-mq: decouble blk-mq freezing from generic bypassing blk_mq freezing is entangled with generic bypassing which bypasses blkcg and io scheduler and lets IO requests fall through the block layer to the drivers in FIFO order. This allows forward progress on IOs with the advanced features disabled so that those features can be configured or altered without worrying about stalling IO which may lead to deadlock through memory allocation. However, generic bypassing doesn't quite fit blk-mq. blk-mq currently doesn't make use of blkcg or ioscheds and it maps bypassing to freezing, which blocks request processing and drains all the in-flight ones. This causes problems as bypassing assumes that request processing is online. blk-mq works around this by conditionally allowing request processing for the problem case - during queue initialization. Another weirdity is that except for during queue cleanup, bypassing started on the generic side prevents blk-mq from processing new requests but doesn't drain the in-flight ones. This shouldn't break anything but again highlights that something isn't quite right here. The root cause is conflating blk-mq freezing and generic bypassing which are two different mechanisms. The only intersecting purpose that they serve is during queue cleanup. Let's properly separate blk-mq freezing from generic bypassing and simply use it where necessary. * request_queue->mq_freeze_depth is added and blk_mq_[un]freeze_queue() now operate on this counter instead of ->bypass_depth. The replacement for QUEUE_FLAG_BYPASS isn't added but the counter is tested directly. This will be further updated by later changes. * blk_mq_drain_queue() is dropped and "__" prefix is dropped from blk_mq_freeze_queue(). Queue cleanup path now calls blk_mq_freeze_queue() directly. * blk_queue_enter()'s fast path condition is simplified to simply check @q->mq_freeze_depth. Previously, the condition was !blk_queue_dying(q) && (!blk_queue_bypass(q) || !blk_queue_init_done(q)) mq_freeze_depth is incremented right after dying is set and blk_queue_init_done() exception isn't necessary as blk-mq doesn't start frozen, which only leaves the blk_queue_bypass() test which can be replaced by @q->mq_freeze_depth test. This change simplifies the code and reduces confusion in the area. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:31:13 -06:00
Tejun Heo	776687bce4	block, blk-mq: draining can't be skipped even if bypass_depth was non-zero Currently, both blk_queue_bypass_start() and blk_mq_freeze_queue() skip queue draining if bypass_depth was already above zero. The assumption is that the one which bumped the bypass_depth should have performed draining already; however, there's nothing which prevents a new instance of bypassing/freezing from starting before the previous one finishes draining. The current code may allow the later bypassing/freezing instances to complete while there still are in-flight requests which haven't finished draining. Fix it by draining regardless of bypass_depth. We still skip draining from blk_queue_bypass_start() while the queue is initializing to avoid introducing excessive delays during boot. INIT_DONE setting is moved above the initial blk_queue_bypass_end() so that bypassing attempts can't slip inbetween. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:29:17 -06:00
Tejun Heo	531ed6261e	blk-mq: fix a memory ordering bug in blk_mq_queue_enter() blk-mq uses a percpu_counter to keep track of how many usages are in flight. The percpu_counter is drained while freezing to ensure that no usage is left in-flight after freezing is complete. blk_mq_queue_enter/exit() and blk_mq_[un]freeze_queue() implement this per-cpu gating mechanism; unfortunately, it contains a subtle bug - smp_wmb() in blk_mq_queue_enter() doesn't prevent the cpu from fetching @q->bypass_depth before incrementing @q->mq_usage_counter and if freezing happens in between the caller can slip through and freezing can be complete while there are active users. Use smp_mb() instead so that bypass_depth and mq_usage_counter modifications and tests are properly interlocked. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-07-01 10:27:06 -06:00
Jens Axboe	17737d3b59	Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu into for-3.17/core Merge the percpu_ref changes from Tejun, he says they are stable now.	2014-07-01 10:19:04 -06:00
Linus Torvalds	4c834452aa	Linux 3.16-rc3	2014-06-29 14:11:36 -07:00
Linus Torvalds	ef2e0391e5	Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM fixes from Russell King: "Another round of ARM fixes. The largest change here is the L2 changes to work around problems for the Armada 37x/380 devices, where most of the size comes down to comments rather than code. The other significant fix here is for the ptrace code, to ensure that rewritten syscalls work as intended. This was pointed out by Kees Cook, but Will Deacon reworked the patch to be more elegant. The remainder are fairly trivial changes" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8087/1: ptrace: reload syscall number after secure_computing() check ARM: 8086/1: Set memblock limit for nommu ARM: 8085/1: sa1100: collie: add top boot mtd partition ARM: 8084/1: sa1100: collie: revert back to cfi_probe ARM: 8080/1: mcpm.h: remove unused variable declaration ARM: 8076/1: mm: add support for HW coherent systems in PL310 cache	2014-06-29 13:40:08 -07:00
Randy Dunlap	97be078b87	MAINTAINERS: exceptions for Documentation maintainer Note that I don't maintain Documentation/ABI/, Documentation/devicetree/, or the language translation files. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-29 13:38:33 -07:00
Dan Carpenter	7d19e91b52	Documentation: add section about git to email-clients.txt These days most people use git to send patches so I have added a section about that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-29 13:38:33 -07:00
Will Deacon	42309ab450	ARM: 8087/1: ptrace: reload syscall number after secure_computing() check On the syscall tracing path, we call out to secure_computing() to allow seccomp to check the syscall number being attempted. As part of this, a SIGTRAP may be sent to the tracer and the syscall could be re-written by a subsequent SET_SYSCALL ptrace request. Unfortunately, this new syscall is ignored by the current code unless TIF_SYSCALL_TRACE is also set on the current thread. This patch slightly reworks the enter path of the syscall tracing code so that we always reload the syscall number from current_thread_info()->syscall after the potential ptrace traps. Acked-by: Kees Cook <keescook@chromium.org> Tested-by: Kees Cook <keescook@chromium.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-06-29 10:29:35 +01:00
Laura Abbott	6980c3e251	ARM: 8086/1: Set memblock limit for nommu Commit `1c2f87c` (ARM: 8025/1: Get rid of meminfo) changed find_limits to use memblock_get_current_limit for calculating the max_low pfn. nommu targets never actually set a limit on memblock though which means memblock_get_current_limit will just return the default value. Set the memblock_limit to be the end of DDR to make sure bounds are calculated correctly. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-06-29 10:29:34 +01:00
Andrea Adami	3abe742339	ARM: 8085/1: sa1100: collie: add top boot mtd partition The CFI mapping is now perfect so we can expose the top block, read only. There isn't much to read, though, just the sharpsl_params values. Signed-off-by: Andrea Adami <andrea.adami@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-06-29 10:29:34 +01:00
Andrea Adami	92183103d8	ARM: 8084/1: sa1100: collie: revert back to cfi_probe Reverts commit `d26b17edaf` ARM: sa1100: collie.c: fall back to jedec_probe flash detection Unfortunately the detection was challenged on the defective unit used for tests: one of the NOR chips did not respond to the CFI query. Moreover that bad device needed extra delays on erase-suspend/resume cycles. Tested personally on 3 different units and with feedback of two other users. Signed-off-by: Andrea Adami <andrea.adami@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-06-29 10:29:33 +01:00
Nicolas Pitre	d0ba7cc02c	ARM: 8080/1: mcpm.h: remove unused variable declaration The sync_phys variable has been replaced by link time computation in mcpm_head.S before the code was submitted upstream. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-06-29 10:29:32 +01:00
Thomas Petazzoni	98ea2dba65	ARM: 8076/1: mm: add support for HW coherent systems in PL310 cache When a PL310 cache is used on a system that provides hardware coherency, the outer cache sync operation is useless, and can be skipped. Moreover, on some systems, it is harmful as it causes deadlocks between the Marvell coherency mechanism, the Marvell PCIe controller and the Cortex-A9. To avoid this, this commit introduces a new Device Tree property 'arm,io-coherent' for the L2 cache controller node, valid only for the PL310 cache. It identifies the usage of the PL310 cache in an I/O coherent configuration. Internally, it makes the driver disable the outer cache sync operation. Note that technically speaking, a fully coherent system wouldn't require any of the other .outer_cache operations. However, in practice, when booting secondary CPUs, these are not yet coherent, and therefore a set of cache maintenance operations are necessary at this point. This explains why we keep the other .outer_cache operations and only ->sync is disabled. While in theory any write to a PL310 register could cause the deadlock, in practice, disabling ->sync is sufficient to workaround the deadlock, since the other cache maintenance operations are only used in very specific situations. Contrary to previous versions of this patch, this new version does not simply NULL-ify the ->sync member, because the l2c_init_data structures are now 'const' and therefore cannot be modified, which is a good thing. Therefore, this patch introduces a separate l2c_init_data instance, called of_l2c310_coherent_data. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-06-29 10:26:37 +01:00
Linus Torvalds	24b414d5a7	spi: Fixes for v3.16 A few driver specific fixes, the biggest one being a fix for the newly added Qualcomm SPI controller driver to make it not use its internal chip select due to hardware bugs, replacing it with GPIOs. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTrr4lAAoJELSic+t+oim9/30P/3wie8c4eNjRQLPYrNmq9y1t S9sF6U6ur3mDD1GXol1hoe3FU3aaF9TiNU8sq8wtj2TCvmB6BLlH70uKdIIbKB4B b1WVNoclBTZYE3v8FdsJce5eyBjaNY90cRJ5TGwwIWowSNuWwo0/+1zWWzzyH9AF u2STutPSZAJL99RRnUZj+5zx9cspTnc+TNX5UFIXbRcvLPnYkg8TeQvXFUAH9CgL p0sbveu8C3ZS2es5h1Py/f9v0/Pv00fGCVNW073JA82I5viUAogI5+63e65kKZcm xjdY4yIrIrsORjbpUny7YjkgdxdpAWjO4IFnFdPT4GZBe9Ad7ACYKfiox/q/EO7O H8z8v6Ebq/6+wV9uvZtSTIWd2PRV7YUsYijTZGFN02+f4QQq5OckgQMD0ODOBb4k uI1qiTcd11g1gXlExne23fzvBzpCfD1h2N8h/DyXRBhlyb9NupaABjqVP/fkds5g k5j6VuYcGy3UZeGmSOrx85d5pWgOBp/hDJ8DwWo9AUGPrnDx9HgAerxhNMXaJcQA PFypfT2L23ACxbTSZmAj2uRbKv4zyiqM+xtxdLmY/KrvpuDoKAlDGtZV1TFeNo7c 4H3c96fAc/i0BsinY3b1weQ+oGNcNWv9nvdqVGxuBvWs862HkvUcP5mB5ki5BGxh h3uVSfQeETPizC+3w52A =Jdak -----END PGP SIGNATURE----- Merge tag 'spi-v3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few driver specific fixes, the biggest one being a fix for the newly added Qualcomm SPI controller driver to make it not use its internal chip select due to hardware bugs, replacing it with GPIOs" * tag 'spi-v3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: qup: Remove chip select function spi: qup: Fix order of spi_register_master spi: sh-sci: fix use-after-free in sh_sci_spi_remove() spi/pxa2xx: fix incorrect SW mode chipselect setting for BayTrail LPSS SPI	2014-06-28 11:32:32 -07:00
Linus Torvalds	4194976b09	regulator: Fixes for v3.16 Several driver specific fixes here, the palmas fixes being especially important for a range of boards - the recent updates to support new devices have introduced several regressions. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTrsM3AAoJELSic+t+oim9ACIP/RIatlNlq0ALnWaYqaGCkQIF /kQJkBmInCIZ9e7zumnOxpE39OQIcfhqpQeH1aeVlQA6FabGhjdss8AJoM0VbUNS Cj46QnvADMruuow/ZQ4er2Na2w0lzO7CGffnrK+J1N2K3mivkJbBDomtYyJ93/kw rgqEWahszvIvHFD+yFITDhFW7I1f6YaKGj7oTIh844XD3kmphXxCRgQuojYaw7Xa qwLou8OfQL7NZv+z5suffO6n6ga3FgxNge+zynPXSx9lUgnUL/DZzf4tbE42GyXk wPANjTg2JBAwyb/X6zvb7hU7ag5clZhpfvOZRtmYHeyQDQcdUYfVKadBZE9ZAqQv 1B6q5XJQeE/XzHicQav0WN+FI/lOdf/ePyRRL2kGqAdQkDqnBM3lpRvM8c7uAhLT r75z822QBug7pw9HSUN3CIhTMpHxdmXq3tVLGOULLXdBxDPNOvXEKqjj3khj+NZm 5JJEcCYsu0PgbyJKdSF/yIeCFLY0slXNfnuWUvoMlOcpY3EgxwUi+FOxjmPxhq5v df47PJ4vcRXh/wHeA1oZQ1NDUEMfP1iX4ysUktQfaUUKq1QjdIegj4aykRHriVfc OQGJ4S3Dco8FDeWZtpmaFnA68cJR4nFCMzZOqLiK/wHadpF7qcgEKNync0QAvKK3 Y15h23vpcq+OWdHGnObL =K0Hp -----END PGP SIGNATURE----- Merge tag 'regulator-v3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Several driver specific fixes here, the palmas fixes being especially important for a range of boards - the recent updates to support new devices have introduced several regressions" * tag 'regulator-v3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: tps65218: Correct the the config register for LDO1 regulator: tps65218: Add the missing of_node assignment in probe regulator: palmas: fix typo in enable_reg calculation regulator: bcm590xx: fix vbus name regulator: palmas: Fix SMPS enable/disable/is_enabled	2014-06-28 11:31:58 -07:00
Linus Torvalds	eb477e03fe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target fixes from Nicholas Bellinger: "Mostly minor fixes this time around. The highlights include: - iscsi-target CHAP authentication fixes to enforce explicit key values (Tejas Vaykole + rahul.rane) - fix a long-standing OOPs in target-core when a alua configfs attribute is accessed after port symlink has been removed. (Sebastian Herbszt) - fix a v3.10.y iscsi-target regression causing the login reject status class/detail to be ignored (Christoph Vu-Brugier) - fix a v3.10.y iscsi-target regression to avoid rejecting an existing ITT during Data-Out when data-direction is wrong (Santosh Kulkarni + Arshad Hussain) - fix a iscsi-target related shutdown deadlock on UP kernels (Mikulas Patocka) - fix a v3.16-rc1 build issue with vhost-scsi + !CONFIG_NET (MST)" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: iscsi-target: fix iscsit_del_np deadlock on unload iovec: move memcpy_from/toiovecend to lib/iovec.c iscsi-target: Avoid rejecting incorrect ITT for Data-Out tcm_loop: Fix memory leak in tcm_loop_submission_work error path iscsi-target: Explicily clear login response PDU in exception path target: Fix left-over se_lun->lun_sep pointer OOPs iscsi-target; Enforce 1024 byte maximum for CHAP_C key value iscsi-target: Convert chap_server_compute_md5 to use kstrtoul	2014-06-28 09:43:58 -07:00
Mark Brown	7216a41839	Merge remote-tracking branches 'spi/fix/pxa2xx', 'spi/fix/qup' and 'spi/fix/sh-sci' into spi-linus	2014-06-28 14:01:23 +01:00
Mark Brown	11767484b8	Merge remote-tracking branches 'regulator/fix/bcm590xx', 'regulator/fix/palmas' and 'regulator/fix/tps65218' into regulator-linus	2014-06-28 14:01:04 +01:00
Tejun Heo	2d7227828e	percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero() Now that explicit invocation of percpu_ref_exit() is necessary to free the percpu counter, we can implement percpu_ref_reinit() which reinitializes a released percpu_ref. This can be used implement scalable gating switch which can be drained and then re-opened without worrying about memory allocation failures. percpu_ref_is_zero() is added to be used in a sanity check in percpu_ref_exit(). As this function will be useful for other purposes too, make it a public interface. v2: Use smp_read_barrier_depends() instead of smp_load_acquire(). We only need data dep barrier and smp_load_acquire() is stronger and heavier on some archs. Spotted by Lai Jiangshan. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>	2014-06-28 08:10:14 -04:00
Tejun Heo	9a1049da9b	percpu-refcount: require percpu_ref to be exited explicitly Currently, a percpu_ref undoes percpu_ref_init() automatically by freeing the allocated percpu area when the percpu_ref is killed. While seemingly convenient, this has the following niggles. * It's impossible to re-init a released reference counter without going through re-allocation. * In the similar vein, it's impossible to initialize a percpu_ref count with static percpu variables. * We need and have an explicit destructor anyway for failure paths - percpu_ref_cancel_init(). This patch removes the automatic percpu counter freeing in percpu_ref_kill_rcu() and repurposes percpu_ref_cancel_init() into a generic destructor now named percpu_ref_exit(). percpu_ref_destroy() is considered but it gets confusing with percpu_ref_kill() while "exit" clearly indicates that it's the counterpart of percpu_ref_init(). All percpu_ref_cancel_init() users are updated to invoke percpu_ref_exit() instead and explicit percpu_ref_exit() calls are added to the destruction path of all percpu_ref users. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Cc: Li Zefan <lizefan@huawei.com>	2014-06-28 08:10:14 -04:00
Tejun Heo	7d74207512	percpu-refcount: use unsigned long for pcpu_count pointer percpu_ref->pcpu_count is a percpu pointer with a status flag in its lowest bit. As such, it always goes through arithmetic operations which is very cumbersome to do on a pointer. It has to be first casted to unsigned long and then back. Let's just make the field unsigned long so that we can skip the first casts. While at it, rename it to pcpu_counter_ptr to clarify that it's a pointer value. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Christoph Lameter <cl@linux-foundation.org>	2014-06-28 08:10:13 -04:00

455659 commits