linux/kernel/bpf
Andrii Nakryiko a4561f5afe bpf: Use O(log(N)) binary search to find line info record
Real-world BPF applications keep growing in size. Medium-sized production
application can easily have 50K+ verified instructions, and its line
info section in .BTF.ext has more than 3K entries.

When verifier emits log with log_level>=1, it annotates assembly code
with matched original C source code. Currently it uses linear search
over line info records to find a match. As complexity of BPF
applications grows, this O(K * N) approach scales poorly.

So, let's instead of linear O(N) search for line info record use faster
equivalent O(log(N)) binary search algorithm. It's not a plain binary
search, as we don't look for exact match. It's an upper bound search
variant, looking for rightmost line info record that starts at or before
given insn_off.

Some unscientific measurements were done before and after this change.
They were done in VM and fluctuate a bit, but overall the speed up is
undeniable.

BASELINE
========
File                              Program           Duration (us)   Insns
--------------------------------  ----------------  -------------  ------
katran.bpf.o                      balancer_ingress        2497130  343552
pyperf600.bpf.linked3.o           on_event               12389611  627288
strobelight_pyperf_libbpf.o       on_py_event              387399   52445
--------------------------------  ----------------  -------------  ------

BINARY SEARCH
=============

File                              Program           Duration (us)   Insns
--------------------------------  ----------------  -------------  ------
katran.bpf.o                      balancer_ingress        2339312  343552
pyperf600.bpf.linked3.o           on_event                5602203  627288
strobelight_pyperf_libbpf.o       on_py_event              294761   52445
--------------------------------  ----------------  -------------  ------

While Katran's speed up is pretty modest (about 105ms, or 6%), for
production pyperf BPF program (on_py_event) it's much greater already,
going from 387ms down to 295ms (23% improvement).

Looking at BPF selftests's biggest pyperf example, we can see even more
dramatic improvement, shaving more than 50% of time, going from 12.3s
down to 5.6s.

Different amount of improvement is the function of overall amount of BPF
assembly instructions in .bpf.o files (which contributes to how much
line info records there will be and thus, on average, how much time linear
search will take), among other things:

$ llvm-objdump -d katran.bpf.o | wc -l
3863
$ llvm-objdump -d strobelight_pyperf_libbpf.o | wc -l
6997
$ llvm-objdump -d pyperf600.bpf.linked3.o | wc -l
87854

Granted, this only applies to debugging cases (e.g., using veristat, or
failing verification in production), but seems worth doing to improve
overall developer experience anyways.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240214002311.2197116-1-andrii@kernel.org
2024-02-14 23:53:42 +01:00
..
preload bpf: make preloaded map iterators to display map elements count 2023-07-06 12:42:25 -07:00
arraymap.c bpf: Consistently use BPF token throughout BPF verifier logic 2024-01-24 16:21:01 -08:00
bloom_filter.c bpf: Centralize permissions checks for all BPF map types 2023-06-19 14:04:04 +02:00
bpf_cgrp_storage.c bpf: Enable bpf_cgrp_storage for cgroup1 non-attach case 2023-12-08 17:08:18 -08:00
bpf_inode_storage.c Networking changes for 6.4. 2023-04-26 16:07:23 -07:00
bpf_iter.c bpf: Add __bpf_kfunc_{start,end}_defs macros 2023-11-01 22:33:53 -07:00
bpf_local_storage.c bpf: Allow compiler to inline most of bpf_local_storage_lookup() 2024-02-11 14:06:24 -08:00
bpf_lru_list.c bpf: Address KCSAN report on bpf_lru_list 2023-05-12 12:01:03 -07:00
bpf_lru_list.h bpf: lru: Remove unused declaration bpf_lru_promote() 2023-08-08 17:21:42 -07:00
bpf_lsm.c bpf: Minor clean-up to sleepable_lsm_hooks BTF set 2024-02-01 18:37:45 +01:00
bpf_struct_ops.c bpf: Create argument information for nullable arguments. 2024-02-13 15:16:44 -08:00
bpf_task_storage.c
btf.c bpf: don't infer PTR_TO_CTX for programs with unnamed context type 2024-02-13 18:46:47 -08:00
cgroup.c bpf: remove check in __cgroup_bpf_run_filter_skb 2024-02-13 15:41:17 -08:00
cgroup_iter.c bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg 2023-11-07 15:24:25 -08:00
core.c bpf: Consistently use BPF token throughout BPF verifier logic 2024-01-24 16:21:01 -08:00
cpumap.c net, bpf: Add a warning if NAPI cb missed xdp_do_flush(). 2023-10-17 15:02:03 +02:00
cpumask.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
devmap.c net, bpf: Add a warning if NAPI cb missed xdp_do_flush(). 2023-10-17 15:02:03 +02:00
disasm.c bpf: change bpf_alu_sign_string and bpf_movsx_string to static 2023-08-04 16:15:50 -07:00
disasm.h
dispatcher.c bpf: Use arch_bpf_trampoline_size 2023-12-06 17:17:20 -08:00
hashtab.c Networking changes for 6.8. 2024-01-11 10:07:29 -08:00
helpers.c bpf: Mark bpf_spin_{lock,unlock}() helpers with notrace correctly 2024-02-13 11:11:25 -08:00
inode.c bpf: Support symbolic BPF FS delegation mount options 2024-01-24 16:21:02 -08:00
Kconfig bpf: Merge two CONFIG_BPF entries 2024-02-07 16:38:20 -08:00
link_iter.c
local_storage.c cgroup changes for v6.4-rc1 2023-04-29 10:05:22 -07:00
log.c bpf: Use O(log(N)) binary search to find line info record 2024-02-14 23:53:42 +01:00
lpm_trie.c bpf, lpm: Fix check prefixlen before walking trie 2023-11-09 19:07:38 -08:00
Makefile bpf: Introduce BPF token object 2024-01-24 16:21:01 -08:00
map_in_map.c bpf: Optimize the free of inner map 2023-12-04 17:50:26 -08:00
map_in_map.h bpf: Add map and need_defer parameters to .map_fd_put_ptr() 2023-12-04 17:50:26 -08:00
map_iter.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
memalloc.c bpf: Remove unnecessary cpu == 0 check in memalloc 2024-01-04 10:18:14 -08:00
mmap_unlock_work.h
mprog.c bpf: Handle bpf_mprog_query with NULL entry 2023-10-06 17:11:20 -07:00
net_namespace.c
offload.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-09-21 21:49:45 +02:00
percpu_freelist.c
percpu_freelist.h
prog_iter.c
queue_stack_maps.c bpf: Avoid deadlock when using queue and stack maps from NMI 2023-09-11 19:04:49 -07:00
reuseport_array.c bpf: Centralize permissions checks for all BPF map types 2023-06-19 14:04:04 +02:00
ringbuf.c bpf: Fold smp_mb__before_atomic() into atomic_set_release() 2023-10-24 14:26:07 +02:00
stackmap.c bpf: Add crosstask check to __bpf_get_stack 2023-11-10 11:06:10 -08:00
syscall.c bpf,lsm: Refactor bpf_map_alloc/bpf_map_free LSM hooks 2024-01-24 16:21:01 -08:00
sysfs_btf.c
task_iter.c bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos) 2023-11-19 11:43:44 -08:00
tcx.c bpf, tcx: Get rid of tcx_link_const 2023-10-23 15:01:53 -07:00
tnum.c bpf: simplify tnum output if a fully known constant 2023-12-02 11:36:51 -08:00
token.c bpf,token: Use BIT_ULL() to convert the bit mask 2024-01-29 20:04:55 -08:00
trampoline.c bpf: Use arch_bpf_trampoline_size 2023-12-06 17:17:20 -08:00
verifier.c bpf: simplify btf_get_prog_ctx_type() into btf_is_prog_ctx_type() 2024-02-13 18:46:46 -08:00