linux/io_uring
Pavel Begunkov a0727c7383 io_uring: improve cqe !tracing hot path
While looking at io_fill_cqe_req()'s asm I stumbled on our trace points
turning into the chunk below:

trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
			req->cqe.res, req->cqe.flags,
			req->extra1, req->extra2);

io_uring/io_uring.c:898: 	trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
	movq	232(%rbx), %rdi	# req_44(D)->big_cqe.extra2, _5
	movq	224(%rbx), %rdx	# req_44(D)->big_cqe.extra1, _6
	movl	84(%rbx), %r9d	# req_44(D)->cqe.D.81184.flags, _7
	movl	80(%rbx), %r8d	# req_44(D)->cqe.res, _8
	movq	72(%rbx), %rcx	# req_44(D)->cqe.user_data, _9
	movq	88(%rbx), %rsi	# req_44(D)->ctx, _10
./arch/x86/include/asm/jump_label.h:27: 	asm_volatile_goto("1:"
	1:jmp .L1772 # objtool NOPs this 	#
	...

It does a jump_label for actual tracing, but those 6 moves will stay
there in the hottest io_uring path. As an optimisation, add a
trace_io_uring_complete_enabled() check, which is also uses jump_labels,
it tricks the compiler into behaving. It removes the junk without
changing anything else int the hot path.

Note: apparently, it's not only me noticing it, and people are also
working it around. We should remove the check when it's solved
generically or rework tracing.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/555d8312644b3776f4be7e23f9b92943875c4bc7.1692916914.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-24 17:16:19 -06:00
..
advise.c io_uring: always go async for unsupported fadvise flags 2023-01-29 15:18:26 -07:00
advise.h
alloc_cache.h io_uring/rsrc: consolidate node caching 2023-04-12 12:09:41 -06:00
cancel.c io_uring/cancel: wire up IORING_ASYNC_CANCEL_OP for sync cancel 2023-07-17 10:05:48 -06:00
cancel.h io_uring/cancel: support opcode based lookup and cancelation 2023-07-17 10:05:48 -06:00
epoll.c io_uring: undeprecate epoll_ctl support 2023-05-26 20:22:41 -06:00
epoll.h
fdinfo.c io_uring/fdinfo: get rid of ref tryget 2023-08-10 10:24:19 -06:00
fdinfo.h
filetable.c io_uring: add helpers to decode the fixed file file_ptr 2023-06-20 09:36:22 -06:00
filetable.h io_uring: add helpers to decode the fixed file file_ptr 2023-06-20 09:36:22 -06:00
fs.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
fs.h
io-wq.c io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used 2023-08-16 13:40:28 -06:00
io-wq.h io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used 2023-08-16 13:40:28 -06:00
io_uring.c io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used 2023-08-16 13:40:28 -06:00
io_uring.h io_uring: improve cqe !tracing hot path 2023-08-24 17:16:19 -06:00
kbuf.c for-6.4/io_uring-2023-04-21 2023-04-26 12:40:31 -07:00
kbuf.h io_uring: add support for user mapped provided buffer ring 2023-04-03 07:14:21 -06:00
Makefile
msg_ring.c io_uring: use io_file_from_index in io_msg_grab_file 2023-06-20 09:36:22 -06:00
msg_ring.h io_uring: get rid of double locking 2022-12-07 06:47:13 -07:00
net.c io_uring: never overflow io_aux_cqe 2023-08-11 10:42:57 -06:00
net.h io_uring: Add KASAN support for alloc_caches 2023-04-03 07:16:14 -06:00
nop.c
nop.h
notif.c io_uring/notif: add constant for ubuf_info flags 2023-04-15 14:21:04 -06:00
notif.h io_uring/notif: add constant for ubuf_info flags 2023-04-15 14:21:04 -06:00
opdef.c io_uring: Pass whole sqe to commands 2023-05-04 08:19:05 -06:00
opdef.h io_uring: Split io_issue_def struct 2023-01-29 15:17:41 -07:00
openclose.c fsnotify: move fsnotify_open() hook into do_dentry_open() 2023-06-12 10:43:45 +02:00
openclose.h
poll.c io_uring: never overflow io_aux_cqe 2023-08-11 10:42:57 -06:00
poll.h io_uring: avoid indirect function calls for the hottest task_work 2023-06-02 08:55:37 -06:00
refs.h
rsrc.c io_uring/rsrc: keep one global dummy_ubuf 2023-08-11 10:42:57 -06:00
rsrc.h io_uring/rsrc: Annotate struct io_mapped_ubuf with __counted_by 2023-08-17 19:14:47 -06:00
rw.c io_uring: open code io_fill_cqe_req() 2023-08-11 10:42:57 -06:00
rw.h io_uring: avoid indirect function calls for the hottest task_work 2023-06-02 08:55:37 -06:00
slist.h io_uring: silence variable ‘prev’ set but not used warning 2023-03-09 10:10:58 -07:00
splice.c io_uring/splice: use fput() directly 2023-08-10 10:24:25 -06:00
splice.h
sqpoll.c io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used 2023-08-16 13:40:28 -06:00
sqpoll.h io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used 2023-08-16 13:40:28 -06:00
statx.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
statx.h
sync.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
sync.h
tctx.c io_uring: Add io_uring_setup flag to pre-register ring fd and never install it 2023-05-16 08:06:00 -06:00
tctx.h io_uring: simplify __io_uring_add_tctx_node 2022-10-07 12:25:30 -06:00
timeout.c io_uring: never overflow io_aux_cqe 2023-08-11 10:42:57 -06:00
timeout.h io_uring: remove unused return from io_disarm_next 2022-09-21 13:15:01 -06:00
uring_cmd.c io_uring: Add io_uring command support for sockets 2023-08-09 10:46:15 -06:00
uring_cmd.h io_uring: Remove unnecessary BUILD_BUG_ON 2023-05-04 08:19:05 -06:00
xattr.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
xattr.h