linux/block
Qais Yousef af550e4c96 block/blk-mq: Don't complete locally if capacities are different
The logic in blk_mq_complete_need_ipi() assumes SMP systems where all
CPUs have equal compute capacities and only LLC cache can make
a different on perceived performance. But this assumption falls apart on
HMP systems where LLC is shared, but the CPUs have different capacities.
Staying local then can have a big performance impact if the IO request
was done from a CPU with higher capacity but the interrupt is serviced
on a lower capacity CPU.

Use the new cpus_equal_capacity() function to check if we need to send
an IPI.

Without the patch I see the BLOCK softirq always running on little cores
(where the hardirq is serviced). With it I can see it running on all
cores.

This was noticed after the topology change [1] where now on a big.LITTLE
we truly get that the LLC is shared between all cores where as in the
past it was being misrepresented for historical reasons. The logic
exposed a missing dependency on capacities for such systems where there
can be a big performance difference between the CPUs.

This of course introduced a noticeable change in behavior depending on
how the topology is presented. Leading to regressions in some workloads
as the performance of the BLOCK softirq on littles can be noticeably
worse on some platforms.

Worth noting that we could have checked for capacities being greater
than or equal instead for equality. This will lead to favouring higher
performance always. But opted for equality instead to match the
performance of the requester without making an assumption that can lead
to power trade-offs which these systems tend to be sensitive about. If
the requester would like to run faster, it's better to rely on the
scheduler to give the IO requester via some facility to run on a faster
core; and then if the interrupt triggered on a CPU with different
capacity we'll make sure to match the performance the requester is
supposed to run at.

[1] https://lpc.events/event/16/contributions/1342/attachments/962/1883/LPC-2022-Android-MC-Phantom-Domains.pdf

Signed-off-by: Qais Yousef <qyousef@layalina.io>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240223155749.2958009-3-qyousef@layalina.io
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-24 12:48:01 -07:00
..
partitions block: Move checking GENHD_FL_NO_PART to bdev_add_partition() 2024-01-22 09:51:29 -07:00
badblocks.c badblocks: avoid checking invalid range in badblocks_check() 2023-12-23 18:38:08 -07:00
bdev.c vfs-6.8.super 2024-01-08 10:43:51 -08:00
bfq-cgroup.c block: add blk_time_get_ns() and blk_time_get() helpers 2024-02-05 10:07:22 -07:00
bfq-iosched.c block: add blk_time_get_ns() and blk_time_get() helpers 2024-02-05 10:07:22 -07:00
bfq-iosched.h block, bfq: remove BFQ_WEIGHT_LEGACY_DFL 2023-04-06 16:17:32 -06:00
bfq-wf2q.c block, bfq: inject I/O to underutilized actuators 2023-01-29 15:18:33 -07:00
bio-integrity.c block: support PI at non-zero offset within metadata 2024-02-12 08:49:31 -07:00
bio.c block: io wait hang check helper 2024-02-24 12:46:46 -07:00
blk-cgroup-fc-appid.c block: Replace all non-returning strlcpy with strscpy 2023-06-01 09:13:31 -06:00
blk-cgroup-rwstat.c Revert "blk-cgroup: pin the gendisk in struct blkcg_gq" 2023-02-14 14:24:09 -07:00
blk-cgroup-rwstat.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
blk-cgroup.c block: add blk_time_get_ns() and blk_time_get() helpers 2024-02-05 10:07:22 -07:00
blk-cgroup.h block: move cgroup time handling code into blk.h 2024-02-05 10:07:17 -07:00
blk-core.c block: pass a queue_limits argument to blk_alloc_queue 2024-02-13 08:56:59 -07:00
blk-crypto-fallback.c blk-crypto: dynamically allocate fallback profile 2023-08-18 15:00:39 -06:00
blk-crypto-internal.h blk-crypto: remove blk_crypto_insert_cloned_request() 2023-03-16 09:35:09 -06:00
blk-crypto-profile.c blk-crypto: use dynamic lock class for blk_crypto_profile::lock 2023-07-05 16:36:12 -06:00
blk-crypto-sysfs.c block: make kobj_type structures constant 2023-02-09 09:38:16 -07:00
blk-crypto.c blk-crypto: make blk_crypto_evict_key() more robust 2023-03-16 09:35:09 -06:00
blk-flush.c block: add blk_time_get_ns() and blk_time_get() helpers 2024-02-05 10:07:22 -07:00
blk-ia-ranges.c block: make kobj_type structures constant 2023-02-09 09:38:16 -07:00
blk-integrity.c block: support PI at non-zero offset within metadata 2024-02-12 08:49:31 -07:00
blk-ioc.c blk-ioc: fix recursive spin_lock/unlock_irq() in ioc_clear_queue() 2023-06-07 07:51:00 -06:00
blk-iocost.c block: add blk_time_get_ns() and blk_time_get() helpers 2024-02-05 10:07:22 -07:00
blk-iolatency.c block: add blk_time_get_ns() and blk_time_get() helpers 2024-02-05 10:07:22 -07:00
blk-ioprio.c blk-ioprio: Introduce promote-to-rt policy 2023-06-06 22:26:26 -06:00
blk-ioprio.h blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-lib.c blk-lib: check for kill signal 2024-02-24 12:46:46 -07:00
blk-map.c block: Fix WARNING in _copy_from_iter 2024-01-23 08:56:55 -07:00
blk-merge.c block: remove two comments in bio_split_discard 2023-12-29 08:44:12 -07:00
blk-mq-cpumap.c blk-mq: include <linux/blk-mq.h> in block/blk-mq.h 2023-04-13 06:52:29 -06:00
blk-mq-debugfs-zoned.c block: move zone related fields to struct gendisk 2022-07-06 06:46:26 -06:00
blk-mq-debugfs.c blk-mq: Remove the hctx 'run' debugfs attribute 2024-01-17 14:16:34 -07:00
blk-mq-debugfs.h block: remove per-disk debugfs files in blk_unregister_queue 2022-06-17 07:31:05 -06:00
blk-mq-pci.c blk-mq: include <linux/blk-mq.h> in block/blk-mq.h 2023-04-13 06:52:29 -06:00
blk-mq-sched.c blk-mq: Remove the hctx 'run' debugfs attribute 2024-01-17 14:16:34 -07:00
blk-mq-sched.h blk-mq: make sure elevator callbacks aren't called for passthrough request 2023-05-18 19:42:54 -06:00
blk-mq-sysfs.c blk-mq: include <linux/blk-mq.h> in block/blk-mq.h 2023-04-13 06:52:29 -06:00
blk-mq-tag.c for-6.5/block-2023-06-23 2023-06-26 12:47:20 -07:00
blk-mq-virtio.c blk-mq: include <linux/blk-mq.h> in block/blk-mq.h 2023-04-13 06:52:29 -06:00
blk-mq.c block/blk-mq: Don't complete locally if capacities are different 2024-02-24 12:48:01 -07:00
blk-mq.h blk-mq: update driver tags request table when start request 2023-09-22 08:52:13 -06:00
blk-pm.c block: Remove blk_set_runtime_active() 2023-11-20 10:22:40 -07:00
blk-pm.h block: Remove unused blk_pm_*() function definitions 2021-02-22 06:33:48 -07:00
blk-rq-qos.c block: correct stale comment in rq_qos_wait 2023-09-18 14:15:28 -06:00
blk-rq-qos.h block: skip QUEUE_FLAG_STATS and rq-qos for passthrough io 2023-12-01 18:29:18 -07:00
blk-settings.c block: Clear zone limits for a non-zoned stacked queue 2024-02-22 10:35:18 -07:00
blk-stat.c blk-mq: include <linux/blk-mq.h> in block/blk-mq.h 2023-04-13 06:52:29 -06:00
blk-stat.h block: make queue stat accounting a reference 2021-12-14 17:23:05 -07:00
blk-sysfs.c block: use queue_limits_commit_update in queue_discard_max_store 2024-02-13 08:56:59 -07:00
blk-throttle.c blk-throttle: Eliminate redundant checks for data direction 2024-02-05 10:16:12 -07:00
blk-throttle.h blk-throttle: print signed value 'carryover_bytes/ios' for user 2023-08-30 10:15:01 -06:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c block: add blk_time_get_ns() and blk_time_get() helpers 2024-02-05 10:07:22 -07:00
blk-wbt.h blk-wbt: remove the separate write cache tracking 2023-12-26 09:28:10 -07:00
blk-zoned.c block: Do not include rbtree.h in blk-zoned.c 2024-02-22 10:35:18 -07:00
blk.h block: io wait hang check helper 2024-02-24 12:46:46 -07:00
bounce.c block: change the blk_queue_bounce calling convention 2022-08-02 17:22:54 -06:00
bsg-lib.c block: pass a queue_limits argument to blk_mq_init_queue 2024-02-13 08:56:59 -07:00
bsg.c SCSI misc on 20230629 2023-06-30 11:57:07 -07:00
disk-events.c block: move bdev_mark_dead out of disk_check_media_change 2023-10-28 13:29:23 +02:00
early-lookup.c block: don't return -EINVAL for not found names in devt_from_devname 2023-06-22 09:09:33 -06:00
elevator.c blk-mq: release scheduler resource when request completes 2023-08-19 07:47:17 -06:00
elevator.h blk-mq: pass a flags argument to elevator_type->insert_requests 2023-04-13 06:52:30 -06:00
fops.c fs: convert block_write_full_page to block_write_full_folio 2023-12-29 11:58:35 -08:00
genhd.c block: pass a queue_limits argument to blk_alloc_disk 2024-02-19 16:58:23 -07:00
holder.c block: fix deadlock between bd_link_disk_holder and partition scan 2024-02-23 07:44:19 -07:00
ioctl.c block: Move checking GENHD_FL_NO_PART to bdev_add_partition() 2024-01-22 09:51:29 -07:00
ioprio.c block: move __get_task_ioprio() into header file 2024-01-08 12:27:39 -07:00
Kconfig block: Add config option to not allow writing to mounted devices 2023-11-18 14:59:25 +01:00
Kconfig.iosched block: Default to use cgroup support for BFQ 2023-01-30 09:42:42 -07:00
kyber-iosched.c blk-mq: pass a flags argument to elevator_type->insert_requests 2023-04-13 06:52:30 -06:00
Makefile block: move the code to do early boot lookup of block devices to block/ 2023-06-05 10:57:40 -06:00
mq-deadline.c block/mq-deadline: use correct way to throttling write requests 2023-08-08 15:46:41 -06:00
opal_proto.h block: sed-opal: Implement IOC_OPAL_REVERT_LSP 2023-08-22 11:10:26 -06:00
sed-opal.c for-6.7/block-2023-10-30 2023-11-01 12:30:07 -10:00
t10-pi.c block: support PI at non-zero offset within metadata 2024-02-12 08:49:31 -07:00