Commit graph

1646 commits

Author SHA1 Message Date
Pavel Begunkov 7e3709d576 io_uring: optimise kiocb layout
We want ->comp_list in the second cacheline, which is hotter comparing
to the 3rd. Swap the field with ->link, which is not as hot and
controlled by flags and so not accessed unless there is a link.

By the way add a couple of comments for io_kiocb fields.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9d9dde31f8f62279a5f48c575bbc27b8290edc0c.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov 6224590d24 io_uring: add flag to not fail link after timeout
For some reason non-off IORING_OP_TIMEOUT always fails links, it's
pretty inconvenient and unnecessary limits chaining after it to hard
linking, which is far from ideal, e.g. doesn't pair well with timeout
cancellation. Add a flag forcing it to not fail links on -ETIME.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/17c7ec0fb7a6113cc6be8cdaedcada0ba836ac0e.1633199723.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov 30d51dd4ad io_uring: clean up buffer select
Hiding a pointer to a struct io_buffer in rw.addr is error prone. We
have some place in io_kiocb, so keep kbuf's in a separate field
without aliasing and risks of it being misused.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3e63a6a953b04cad81d9ea827b12344dd57b37b4.1633107393.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov fc0ae0244b io_uring: init opcode in io_init_req()
Move io_req_prep() call inside of io_init_req(), it simplifies a bit
error handling for callers.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a0f59291fd52da4672c323542fd56fd899e23f8f.1633107393.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov e0eb71dcfc io_uring: don't return from io_drain_req()
Never return from io_drain_req() but punt to tw if we've got there but
it's a false positive and we shouldn't actually drain.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/93583cee51b8783706b76c73196c155b28d9e762.1633107393.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov 22b2ca310a io_uring: extra a helper for drain init
Add a helper io_init_req_drain for initialising requests with
IOSQE_DRAIN set. Also move bits from preambule of io_drain_req() in
there, because we already modify all the bits needed inside the helper.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/dcb412825b35b1cb8891245a387d7d69f8d14cef.1633107393.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov 5e371265ea io_uring: disable draining earlier
Clear ->drain_active in two more cases where we check for a need of
draining. It's not a bug, but still may lead to some extra requests
being punted to io-wq, and that may be not desirable.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d20b265f77bb4e8860b15b9987252c7c711dfcba.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov a1cdbb4cb5 io_uring: comment why inline complete calls io_clean_op()
io_req_complete_state() calls io_clean_op() and it may be not entirely
obvious, leave a comment.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/21806f862151e223fdf439e5e8ed7178a8d66979.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov ef05d9ebcc io_uring: kill off ->inflight_entry field
->inflight_entry is not used anymore after converting everything to
single linked lists, remove it. Also adjust io_kiocb layout, so all hot
bits are in first 3 cachelines.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/fd8d68087ede26c4e1707ce6b175aa1eb2381f2b.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov 6962980947 io_uring: restructure submit sqes to_submit checks
Put an explicit check for number of requests to submit. First,
we can turn while into do-while and it generates better code, and second
that if can be cheaper, e.g. by using CPU flags after sub in
io_sqring_entries().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5926baadd20c28feab7a5e1725fedf32e4553ff7.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov d9f9d2842c io_uring: reshuffle queue_sqe completion handling
If a request completed inline the result should only be zero, it's a
grave error otherwise. So, when we see REQ_F_COMPLETE_INLINE it's not
even necessary to check the return code, and the flag check can be moved
earlier.

It's one "if" less for inline completions, and same two checks for it
normally completing (ret == 0). Those are two cases we care about the
most.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ebd4e397a9c26d96c99b24447acc309741041a83.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Pavel Begunkov d475a9a622 io_uring: inline hot path of __io_queue_sqe()
Extract slow paths from __io_queue_sqe() into a function and inline the
hot path. With that we have everything completely inlined on the
submission path up until io_issue_sqe().

-> io_submit_sqes()
  -> io_submit_sqe() (inlined)
    -> io_queue_sqe() (inlined)
       -> __io_queue_sqe() (inlined)
         -> io_issue_sqe()

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f1606864d95d7f26dc28c7eec3dc6ed6ec32618a.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 4652fe3f10 io_uring: split slow path from io_queue_sqe
We don't want the slow path of io_queue_sqe to be inlined, so extract a
function from it.

   text    data     bss     dec     hex filename
  91950   13986       8  105944   19dd8 ./fs/io_uring.o
  91758   13986       8  105752   19d18 ./fs/io_uring.o

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/fb01253911f8fb374268f65b1ba939b54ca6583f.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 2a56a9bd64 io_uring: remove drain_active check from hot path
req->ctx->active_drain is a bit too expensive, partially because of two
dereferences. Do a trick, if we see it set in io_init_req(), set
REQ_F_FORCE_ASYNC and it automatically goes through a slower path where
we can catch it. It's nearly free to do in io_init_req() because there
is already ->restricted check and it's in the same byte of a bitmask.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d7e7ddc63c15e8a300833132abb3eb8fd3918aef.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov f15a343177 io_uring: deduplicate io_queue_sqe() call sites
There are two call sites of io_queue_sqe() in io_submit_sqe(), combine
them into one, because io_queue_sqe() is inline and we don't want to
bloat binary, and will become even bigger

   text    data     bss     dec     hex filename
  92126   13986       8  106120   19e88 ./fs/io_uring.o
  91966   13986       8  105960   19de8 ./fs/io_uring.o

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/506124b8e767f0a4576f7a459f6aea3d13fb4dda.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 553deffd09 io_uring: don't pass state to io_submit_state_end
Submission state and ctx and coupled together, no need to passs

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e22d77a5786ef77e0c49b933ad74bae55cfb6ca6.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 1cce17aca6 io_uring: don't pass tail into io_free_batch_list
io_free_batch_list() iterates all requests in the passed in list,
so we don't really need to know the tail but can keep iterating until
meet NULL. Just pass the first node into it and it will be enough.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4a12c84b6d887d980e05f417ba4172d04c64acae.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov d4b7a5ef2b io_uring: inline completion batching helpers
We now have a single function for batched put of requests, just inline
struct req_batch and all related helpers into it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/595a2917f80dd94288cd7203052c7934f5446580.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov f5ed3bcd5b io_uring: optimise batch completion
First, convert rest of iopoll bits to single linked lists, and also
replace per-request list_add_tail() with splicing a part of slist.

With that, use io_free_batch_list() to put/free requests. The main
advantage of it is that it's now the only user of struct req_batch and
friends, and so they can be inlined. The main overhead there was
per-request call to not-inlined io_req_free_batch(), which is expensive
enough.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b37fc6d5954b241e025eead7ab92c6f44a42f229.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov b3fa03fd1b io_uring: convert iopoll_completed to store_release
Convert explicit barrier around iopoll_completed to smp_load_acquire()
and smp_store_release(). Similar on the callback side, but replaces a
single smp_rmb() with per-request smp_load_acquire(), neither imply any
extra CPU ordering for x86. Use READ_ONCE as usual where it doesn't
matter.

Use it to move filling CQEs by iopoll earlier, that will be necessary
to avoid traversing the list one extra time in the future.

Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8bd663cb15efdc72d6247c38ee810964e744a450.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 3aa83bfb6e io_uring: add a helper for batch free
Add a helper io_free_batch_list(), which takes a single linked list and
puts/frees all requests from it in an efficient manner. Will be reused
later.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4fc8306b542c6b1dd1d08e8021ef3bdb0ad15010.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 5eef4e87eb io_uring: use single linked list for iopoll
Use single linked lists for keeping iopoll requests, takes less space,
may be faster, but mostly will be of benefit for further patches.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/314033676b100cd485518c3bc55e1b95a0dcd71f.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov e3f721e6f6 io_uring: split iopoll loop
The main loop of io_do_iopoll() iterates and does ->iopoll() until it
meets a first completed request, then it continues from that position
and splices requests to pass them through io_iopoll_complete().

Split the loop in two for clearness, iopolling and reaping completed
requests from the list.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a7f6fd27a94845e5dc925a47a4a9765a92e514fb.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov c2b6c6bc4e io_uring: replace list with stack for req caches
Replace struct list_head free_list serving for caching requests with
singly linked stack, which is faster.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1bc942b82422fb2624b8353bd93aca183a022846.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 3ab665b74e io_uring: remove allocation cache array
We have several of request allocation layers, remove the last one, which
is the submit->reqs array, and always use submit->free_reqs instead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8547095c35f7a87bab14f6447ecd30a273ed7500.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 6f33b0bc4e io_uring: use slist for completion batching
Currently we collect requests for completion batching in an array.
Replace them with a singly linked list. It's as fast as arrays but
doesn't take some much space in ctx, and will be used in future patches.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a666826f2854d17e9fb9417fb302edfeb750f425.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 5ba3c874eb io_uring: make io_do_iopoll return number of reqs
Don't pass nr_events pointer around but return directly, it's less
expensive than pointer increments.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f771a8153a86f16f12ff4272524e9e549c5de40b.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 87a115fb71 io_uring: force_nonspin
We don't really need to pass the number of requests to complete into
io_do_iopoll(), a flag whether to enforce non-spin mode is enough.

Should be straightforward, maybe except io_iopoll_check(). We pass !min
there, because we do never enter with the number of already reaped
requests is larger than the specified @min, apart from the first
iteration, where nr_events is 0 and so the final check should be
identical.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/782b39d1d8ec584eae15bca0a1feb6f0571fe5b8.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 6878b40e7b io_uring: mark having different creds unlikely
Hint the compiler that it's not as likely to have creds different from
current attached to a request. The current code generation is far from
ideal, hopefully it can help to some compilers to remove duplicated jump
tables and so.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e7815251ac4bf5a4a23d298c752f029ae19f3837.1632516769.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Hao Xu 8d4af6857c io_uring: return boolean value for io_alloc_async_data
boolean value is good enough for io_alloc_async_data.

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210922101522.9179-1-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov 68fe256aad io_uring: optimise io_req_init() sqe flags checks
IOSQE_IO_DRAIN is quite marginal and we don't care too much about
IOSQE_BUFFER_SELECT. Save to ifs and hide both of them under
SQE_VALID_FLAGS check. Now we first check whether it uses a "safe"
subset, i.e. without DRAIN and BUFFER_SELECT, and only if it's not
true we test the rest of the flags.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/dccfb9ab2ab0969a2d8dc59af88fa0ce44eeb1d5.1631703764.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Pavel Begunkov a3f349071e io_uring: remove ctx referencing from complete_post
Now completions are done from task context, that means that it's either
the task itself, task_work or io-wq worker. In all those cases the ctx
will be staying alive by mutexing, explicit referencing or req references
by iowq. Remove extra ctx pinning from io_req_complete_post().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/60a0e96434c16ab4fe587651448290d61ec9a113.1631703756.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:53 -06:00
Hao Xu 83f84356bc io_uring: add more uring info to fdinfo for debug
Developers may need some uring info to help themselves debug and address
issues in production. This includes sqring/cqring head/tail and the
detailed sqe/cqe info, which is very useful when an application is hung
on a ring.

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210913130854.38542-1-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:52 -06:00
Pavel Begunkov d97ec6239a io_uring: kill extra wake_up_process in tw add
TWA_SIGNAL already wakes the thread, no need in wake_up_process() after
it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7e90cf643f633e857443e0c9e72471b221735c50.1631115443.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:52 -06:00
Pavel Begunkov c450178d9b io_uring: dedup CQE flushing non-empty checks
We don't do io_submit_flush_completions() when there is no requests
enqueued, and every single caller checks for it. Hide that check into
the function not forgetting about inlining. That will make it much
easier for changing the empty check condition in the future.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d7ff8cef5da1b38e8ea648f5aad9a315ddfc7b57.1631115443.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:52 -06:00
Pavel Begunkov d81499bfcd io_uring: inline linked part of io_req_find_next
Inline part of __io_req_find_next() that returns a request but doesn't
need io_disarm_next(). It's just two places, but makes links a bit
faster.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4126d13f23d0e91b39b3558e16bd86cafa7fcef2.1631115443.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:52 -06:00
Pavel Begunkov 6b639522f6 io_uring: inline io_dismantle_req
io_dismantle_req() is hot, and not _too_ huge. Inline it, there are 3
call sites, which hopefully will turn into 2 in the future.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/bdd2dc30716cac270c2403e99bccd6286e4ae201.1631115443.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:52 -06:00
Pavel Begunkov 4b628aeb69 io_uring: kill off ios_left
->ios_left is only used to decide whether to plug or not, kill it to
avoid this extra accounting, just use the initial submission number.
There is no much difference in regards of enabling plugging, where this
one does it in a few more cases, but all major ones should be covered
well.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f13993bcf5b477f9a7d52881fc49f9457ea9870a.1631115443.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:52 -06:00
Jens Axboe a87acfde94 io_uring: dump sqe contents if issue fails
I recently had to look at a production problem where a request ended
up getting the dreaded -EINVAL error on submit. The most used and
hence useless of error codes, as it just tells you that something
was wrong with your request, but not more than that.

Let's dump the full sqe contents if we run into an issue failure,
that'll allow easier diagnosing of a wide variety of issues.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:52 -06:00
Jens Axboe b688f11e86 io_uring: utilize the io batching infrastructure for more efficient polled IO
Wire up using an io_comp_batch for f_op->iopoll(). If the lower stack
supports it, we can handle high rates of polled IO more efficiently.

This raises the single core efficiency on my system from ~6.1M IOPS to
~6.6M IOPS running a random read workload at depth 128 on two gen2
Optane drives.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 14:40:46 -06:00
Jens Axboe 5a72e899ce block: add a struct io_comp_batch argument to fops->iopoll()
struct io_comp_batch contains a list head and a completion handler, which
will allow completions to more effciently completed batches of IO.

For now, no functional changes in this patch, we just define the
io_comp_batch structure and add the argument to the file_operations iopoll
handler.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 14:40:40 -06:00
Christoph Hellwig d729cf9acb io_uring: don't sleep when polling for I/O
There is no point in sleeping for the expected I/O completion timeout
in the io_uring async polling model as we never poll for a specific
I/O.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mark Wunderlich <mark.wunderlich@intel.com>
Link: https://lore.kernel.org/r/20211012111226.760968-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 06:17:36 -06:00
Christoph Hellwig ef99b2d376 block: replace the spin argument to blk_iopoll with a flags argument
Switch the boolean spin argument to blk_poll to passing a set of flags
instead.  This will allow to control polling behavior in a more fine
grained way.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mark Wunderlich <mark.wunderlich@intel.com>
Link: https://lore.kernel.org/r/20211012111226.760968-10-hch@lst.de
[axboe: adapt to changed io_uring iopoll]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 06:17:36 -06:00
Christoph Hellwig 30da1b45b1 io_uring: fix a layering violation in io_iopoll_req_issued
syscall-level code can't just poke into the details of the poll cookie,
which is private information of the block layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012111226.760968-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 06:17:35 -06:00
Hao Xu 14cfbb7a78 io_uring: fix wrong condition to grab uring lock
Grab uring lock when we are in io-worker rather than in the original
or system-wq context since we already hold it in these two situation.

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Fixes: b66ceaf324 ("io_uring: move iopoll reissue into regular IO path")
Link: https://lore.kernel.org/r/20211014140400.50235-1-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-14 09:06:11 -06:00
Pavel Begunkov 3f008385d4 io_uring: kill fasync
We have never supported fasync properly, it would only fire when there
is something polling io_uring making it useless. The original support came
in through the initial io_uring merge for 5.1. Since it's broken and
nobody has reported it, get rid of the fasync bits.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2f7ca3d344d406d34fa6713824198915c41cea86.1633080236.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-01 11:16:02 -06:00
Pavel Begunkov 7df778be2f io_uring: make OP_CLOSE consistent with direct open
From recently open/accept are now able to manipulate fixed file table,
but it's inconsistent that close can't. Close the gap, keep API same as
with open/accept, i.e. via sqe->file_slot.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 14:07:54 -06:00
Pavel Begunkov 9f3a2cb228 io_uring: kill extra checks in io_write()
We don't retry short writes and so we would never get to async setup in
io_write() in that case. Thus ret2 > 0 is always false and
iov_iter_advance() is never used. Apparently, the same is found by
Coverity, which complains on the code.

Fixes: cd65869512 ("io_uring: use iov_iter state save/restore helpers")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5b33e61034748ef1022766efc0fb8854cfcf749c.1632500058.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:26:11 -06:00
Jens Axboe cdb31c29d3 io_uring: don't punt files update to io-wq unconditionally
There's no reason to punt it unconditionally, we just need to ensure that
the submit lock grabbing is conditional.

Fixes: 05f3fb3c53 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00
Jens Axboe 9990da93d2 io_uring: put provided buffer meta data under memcg accounting
For each provided buffer, we allocate a struct io_buffer to hold the
data associated with it. As a large number of buffers can be provided,
account that data with memcg.

Fixes: ddf0322db7 ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24 10:24:34 -06:00