linux/arch
Alexei Starovoitov 1e6c62a882 bpf: Introduce sleepable BPF programs
Introduce sleepable BPF programs that can request this property for themselves
via the BPF_F_SLEEPABLE flag at program load time. In that case they will be
able to use helpers like bpf_copy_from_user() that might sleep. At present only
fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only
when they are attached to kernel functions that are known to allow sleeping.
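
The load-time rule above can be modelled in plain C. This is a minimal
userspace sketch, not the kernel's verifier code: the enum and function names
are hypothetical, and only the BPF_F_SLEEPABLE bit value (1U << 4) is taken
from the UAPI header. It leaves out the second condition, that the attach
target must be a kernel function known to allow sleeping.

```c
#include <stdbool.h>

/* Same bit as BPF_F_SLEEPABLE in include/uapi/linux/bpf.h. */
#define SKETCH_F_SLEEPABLE (1U << 4)

/* Hypothetical, reduced set of program kinds, for illustration only. */
enum sketch_prog_kind {
	SKETCH_KIND_FENTRY,
	SKETCH_KIND_FEXIT,
	SKETCH_KIND_FMOD_RET,
	SKETCH_KIND_LSM,
	SKETCH_KIND_XDP,	/* stands in for any networking program type */
};

/* Returns true if a program of this kind may be loaded with these flags. */
static bool sketch_sleepable_ok(enum sketch_prog_kind kind, unsigned int flags)
{
	if (!(flags & SKETCH_F_SLEEPABLE))
		return true;	/* non-sleepable: no extra restriction here */

	switch (kind) {
	case SKETCH_KIND_FENTRY:
	case SKETCH_KIND_FEXIT:
	case SKETCH_KIND_FMOD_RET:
	case SKETCH_KIND_LSM:
		return true;	/* the only kinds allowed to be sleepable */
	default:
		return false;	/* e.g. networking types are rejected */
	}
}
```

In this model an lsm program that sets the flag is accepted, while the
networking stand-in is rejected as soon as it sets the flag.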

The non-sleepable programs rely on implicit rcu_read_lock() and
migrate_disable() to protect the lifetime of programs, the maps that they use
and the per-cpu kernel structures used to pass info between bpf programs and
the kernel. The sleepable programs cannot be enclosed in rcu_read_lock().
migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs
should not be enclosed in migrate_disable() either. Therefore
rcu_read_lock_trace is used to protect the lifetime of sleepable progs.
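
The two protection schemes described above can be contrasted with a small
userspace model. This is only a sketch under stated assumptions: the stub_*()
functions are stand-ins that count nesting depth and do not implement RCU or
migration control, and struct sketch_prog, sketch_run() and sketch_observe()
are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Stand-ins for the kernel primitives; they only track nesting depth. */
static int rcu_depth, rcu_trace_depth, migrate_depth;

static void stub_rcu_read_lock(void)         { rcu_depth++; }
static void stub_rcu_read_unlock(void)       { rcu_depth--; }
static void stub_rcu_read_lock_trace(void)   { rcu_trace_depth++; }
static void stub_rcu_read_unlock_trace(void) { rcu_trace_depth--; }
static void stub_migrate_disable(void)       { migrate_depth++; }
static void stub_migrate_enable(void)        { migrate_depth--; }

/* Hypothetical program descriptor for this model. */
struct sketch_prog {
	bool sleepable;
	int (*run)(void *ctx);
};

/* Snapshot of the protection state as seen from inside a program. */
struct sketch_seen {
	int rcu, rcu_trace, migrate;
};

/* Sample "program": records which protections are active while it runs. */
static int sketch_observe(void *ctx)
{
	struct sketch_seen *seen = ctx;

	seen->rcu = rcu_depth;
	seen->rcu_trace = rcu_trace_depth;
	seen->migrate = migrate_depth;
	return 0;
}

/* Run a program under the protection scheme that matches its kind. */
static int sketch_run(const struct sketch_prog *prog, void *ctx)
{
	int ret;

	if (prog->sleepable) {
		/* May sleep: no rcu_read_lock(), no migrate_disable(). */
		stub_rcu_read_lock_trace();
		ret = prog->run(ctx);
		stub_rcu_read_unlock_trace();
	} else {
		/* Classic path: RCU read side plus disabled migration. */
		stub_rcu_read_lock();
		stub_migrate_disable();
		ret = prog->run(ctx);
		stub_migrate_enable();
		stub_rcu_read_unlock();
	}
	return ret;
}
```

In this model a sleepable program observes only the trace read-side section,
while a non-sleepable one observes both rcu_read_lock() and migrate_disable().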

There are many networking and tracing program types. In many cases the
'struct bpf_prog *' pointer itself is rcu protected within some other kernel
data structure and the kernel code is using rcu_dereference() to load that
program pointer and call BPF_PROG_RUN() on it. None of these cases are touched.
Instead, sleepable bpf programs are allowed with bpf trampoline only. The
program pointers are hard-coded into generated assembly of bpf trampoline and
synchronize_rcu_tasks_trace() is used to protect the lifetime of the program.
The same trampoline can hold both sleepable and non-sleepable progs.

When rcu_read_lock_trace is held it means that some sleepable bpf program is
running from bpf trampoline. Those programs can use bpf arrays and preallocated
hash/lru maps. These map types wait for programs to complete via
synchronize_rcu_tasks_trace().

Updates to the trampoline now have to do synchronize_rcu_tasks_trace() and
synchronize_rcu_tasks() to wait for sleepable progs to finish and for
trampoline assembly to finish.

This is the first step of introducing sleepable progs. Eventually dynamically
allocated hash maps can be allowed and networking program types can become
sleepable too.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
..
alpha iomap: constify ioreadX() iomem argument (as in generic implementation) 2020-08-14 19:56:57 -07:00
arc mm/gup: remove task_struct pointer for all gup code 2020-08-12 10:58:04 -07:00
arm all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
arm64 all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
c6x Merge branch 'work.regset' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 09:29:25 -07:00
csky mm/csky: use general page fault accounting 2020-08-12 10:58:03 -07:00
h8300 uaccess: remove segment_eq 2020-08-12 10:57:58 -07:00
hexagon mm/hexagon: use general page fault accounting 2020-08-12 10:58:03 -07:00
ia64 all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
m68k Cleanup, SECCOMP_FILTER support, message printing fixes, and other 2020-08-15 18:50:32 -07:00
microblaze all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
mips all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
nds32 mm/nds32: use general page fault accounting 2020-08-12 10:58:03 -07:00
nios2 mm/nios2: use general page fault accounting 2020-08-12 10:58:03 -07:00
openrisc OpenRISC updates for 5.9 2020-08-14 14:04:53 -07:00
parisc parisc: fix PMD pages allocation by restoring pmd_alloc_one() 2020-08-16 10:53:13 -07:00
powerpc iomap: constify ioreadX() iomem argument (as in generic implementation) 2020-08-14 19:56:57 -07:00
riscv A RISC-V Fix for 5.9 2020-08-15 18:54:42 -07:00
s390 all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
sh Cleanup, SECCOMP_FILTER support, message printing fixes, and other 2020-08-15 18:50:32 -07:00
sparc all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
um Cleanup, SECCOMP_FILTER support, message printing fixes, and other 2020-08-15 18:50:32 -07:00
x86 bpf: Introduce sleepable BPF programs 2020-08-28 21:20:33 +02:00
xtensa all arch: remove system call sys_sysctl 2020-08-14 19:56:56 -07:00
.gitignore
Kconfig A set of timekeeping/VDSO updates: 2020-08-14 14:26:08 -07:00