linux/kernel
Andrey Ignatov d74bad4e74 bpf: Hooks for sys_connect
== The problem ==

See description of the problem in the initial patch of this patch set.

== The solution ==

The patch provides much more reliable in-kernel solution for the 2nd
part of the problem: making outgoing connecttion from desired IP.

It adds new attach types `BPF_CGROUP_INET4_CONNECT` and
`BPF_CGROUP_INET6_CONNECT` for program type
`BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that can be used to override both
source and destination of a connection at connect(2) time.

Local end of connection can be bound to desired IP using newly
introduced BPF-helper `bpf_bind()`. It allows to bind to only IP though,
and doesn't support binding to port, i.e. leverages
`IP_BIND_ADDRESS_NO_PORT` socket option. There are two reasons for this:
* looking for a free port is expensive and can affect performance
  significantly;
* there is no use-case for port.

As for remote end (`struct sockaddr *` passed by user), both parts of it
can be overridden, remote IP and remote port. It's useful if an
application inside cgroup wants to connect to another application inside
same cgroup or to itself, but knows nothing about IP assigned to the
cgroup.

Support is added for IPv4 and IPv6, for TCP and UDP.

IPv4 and IPv6 have separate attach types for same reason as sys_bind
hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields
when user passes sockaddr_in since it'd be out-of-bound.

== Implementation notes ==

The patch introduces new field in `struct proto`: `pre_connect` that is
a pointer to a function with same signature as `connect` but is called
before it. The reason is in some cases BPF hooks should be called way
before control is passed to `sk->sk_prot->connect`. Specifically
`inet_dgram_connect` autobinds socket before calling
`sk->sk_prot->connect` and there is no way to call `bpf_bind()` from
hooks from e.g. `ip4_datagram_connect` or `ip6_datagram_connect` since
it'd cause double-bind. On the other hand `proto.pre_connect` provides a
flexible way to add BPF hooks for connect only for necessary `proto` and
call them at desired time before `connect`. Since `bpf_bind()` is
allowed to bind only to IP and autobind in `inet_dgram_connect` binds
only port there is no chance of double-bind.

bpf_bind() sets `force_bind_address_no_port` to bind to only IP despite
of value of `bind_address_no_port` socket field.

bpf_bind() sets `with_lock` to `false` when calling to __inet_bind()
and __inet6_bind() since all call-sites, where bpf_bind() is called,
already hold socket lock.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-31 02:15:54 +02:00
..
bpf bpf: Hooks for sys_connect 2018-03-31 02:15:54 +02:00
cgroup cgroup: fix rule checking for threaded mode switching 2018-02-21 11:39:22 -08:00
configs KVM changes for 4.16 2018-02-10 13:16:35 -08:00
debug signal: Simplify and fix kdb_send_sig 2018-01-03 18:01:08 -06:00
events perf/core: Fix ctx_event_type in ctx_resched() 2018-03-09 08:03:02 +01:00
gcov License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
irq genirq/matrix: Handle CPU offlining proper 2018-02-22 22:05:43 +01:00
livepatch Merge branch 'for-4.16/remove-immediate' into for-linus 2018-01-31 16:36:38 +01:00
locking rtmutex: Make rt_mutex_futex_unlock() safe for irq-off callsites 2018-03-09 11:06:16 +01:00
power x86/power: Fix swsusp_arch_resume prototype 2018-02-02 23:33:50 +01:00
printk Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2018-03-01 10:06:39 -08:00
rcu SCSI misc on 20180131 2018-01-31 11:23:28 -08:00
sched Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2018-03-19 15:39:02 -07:00
time timers: Forward timer base before migrating timers 2018-02-28 23:34:33 +01:00
trace bpf: Check attach type at prog load time 2018-03-31 02:14:44 +02:00
.gitignore
acct.c kernel/acct.c: fix the acct->needcheck check in check_free_space() 2018-01-04 16:45:09 -08:00
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2018-02-06 18:32:44 -08:00
audit.c net: Convert audit_net_ops 2018-02-13 10:36:06 -05:00
audit.h audit/stable-4.15 PR 20171113 2017-11-15 13:28:48 -08:00
audit_fsnotify.c
audit_tree.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-11-14 14:08:20 -08:00
audit_watch.c audit/stable-4.13 PR 20170816 2017-08-16 16:48:34 -07:00
auditfilter.c audit: filter PATH records keyed on filesystem magic 2017-11-10 16:08:56 -05:00
auditsc.c audit/stable-4.15 PR 20171113 2017-11-15 13:28:48 -08:00
backtracetest.c
bounds.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
capability.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat.c signals: Move put_compat_sigset to compat.h to silence hardened usercopy 2018-03-02 21:31:55 +00:00
configs.c
context_tracking.c
cpu.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-12-31 12:30:34 -08:00
cpu_pm.c PM / CPU: replace raw_notifier with atomic_notifier 2017-07-31 13:09:49 +02:00
crash_core.c kdump: write correct address of mem_section into vmcoreinfo 2018-01-13 10:42:48 -08:00
crash_dump.c
cred.c
delayacct.c delayacct: Account blkio completion on the correct task 2018-01-16 03:29:36 +01:00
dma.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
elfcore.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
exec_domain.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
exit.c kernel/exit.c: export abort() to modules 2018-01-04 16:45:09 -08:00
extable.c extable: Make init_kernel_text() global 2018-02-21 16:54:06 +01:00
fail_function.c error-injection: Fix to prohibit jump optimization 2018-03-12 16:16:00 +01:00
fork.c include/linux/sched/mm.h: re-inline mmdrop() 2018-02-21 15:35:42 -08:00
freezer.c
futex.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
futex_compat.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2017-12-14 16:00:49 -08:00
hung_task.c
irq_work.c irq/work: Improve the flag definitions 2018-01-08 19:43:15 +01:00
jump_label.c jump_label: Fix sparc64 warning 2018-03-14 16:35:26 +01:00
kallsyms.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2018-02-01 13:36:15 -08:00
kcmp.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: detect double association with a single task 2018-02-06 18:32:46 -08:00
kexec.c
kexec_core.c
kexec_file.c resource: Provide resource struct in resource walk callback 2017-11-07 15:35:57 +01:00
kexec_internal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kmod.c kmod: move #ifdef CONFIG_MODULES wrapper to Makefile 2017-09-08 18:26:51 -07:00
kprobes.c kprobes: Propagate error from disarm_kprobe_ftrace() 2018-02-16 09:12:58 +01:00
ksysfs.c
kthread.c treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
latencytop.c
Makefile error-injection: Support fault injection framework 2018-01-12 17:33:38 -08:00
memremap.c kernel/memremap: Remove stale devres_free() call 2018-03-06 10:58:54 -08:00
module-internal.h
module.c module: propagate error in modules_open() 2018-03-08 21:58:51 +01:00
module_signing.c
notifier.c
nsproxy.c
padata.c padata: add SPDX identifier 2018-01-05 18:43:00 +11:00
panic.c bug: use %pB in BUG and stack protector failure 2018-03-09 16:40:01 -08:00
params.c kernel/params.c: improve STANDARD_PARAM_DEF readability 2017-10-03 17:54:26 -07:00
pid.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
pid_namespace.c pid: remove pidhash 2017-11-17 16:10:04 -08:00
profile.c
ptrace.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
range.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
reboot.c kernel/reboot.c: add devm_register_reboot_notifier() 2017-11-17 16:10:04 -08:00
relay.c kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE 2018-02-21 15:35:43 -08:00
resource.c Merge branch 'akpm' (patches from Andrew) 2018-02-06 22:15:42 -08:00
seccomp.c - Fix seccomp GET_METADATA to deal with field sizes correctly (Tycho Andersen) 2018-02-22 10:50:24 -08:00
signal.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2018-01-31 13:02:18 -08:00
smp.c smp/core: Use lockdep to assert IRQs are disabled/enabled 2017-11-08 11:13:50 +01:00
smpboot.c watchdog/core, powerpc: Lock cpus across reconfiguration 2017-10-04 10:53:54 +02:00
smpboot.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
softirq.c softirq: Eliminate cond_resched_rcu_qs() in favor of cond_resched() 2017-12-04 10:28:58 -08:00
stacktrace.c
stop_machine.c
sys.c fix typo in assignment of fs default overflow gid 2017-12-14 16:01:45 -06:00
sys_ni.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sysctl.c pipe: reject F_SETPIPE_SZ with size over UINT_MAX 2018-02-06 18:32:47 -08:00
sysctl_binary.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
task_work.c locking/barriers: Convert users of lockless_dereference() to READ_ONCE() 2017-12-17 13:57:15 +01:00
taskstats.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
test_kprobes.c kprobes: Disable the jprobes test code 2017-10-20 11:02:54 +02:00
torture.c torture: Save a line in stutter_wait(): while -> for 2017-12-11 09:18:30 -08:00
tracepoint.c tracepoint: Remove smp_read_barrier_depends() from comment 2017-12-04 10:52:56 -08:00
tsacct.c
ucount.c
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2017-12-14 16:00:49 -08:00
umh.c kernel/umh.c: optimize 'proc_cap_handler()' 2017-11-17 16:10:01 -08:00
up.c smp: Avoid using two cache lines for struct call_single_data 2017-08-29 15:14:38 +02:00
user-return-notifier.c
user.c efivarfs: Limit the rate for non-root to read files 2018-02-22 10:21:02 -08:00
user_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-11-16 12:20:15 -08:00
utsname.c
utsname_sysctl.c
watchdog.c Merge branch 'linus' into sched/core, to pick up fixes 2017-11-08 10:17:15 +01:00
watchdog_hld.c Merge branch 'linus' into core/urgent, to pick up dependent commits 2017-11-04 08:53:04 +01:00
workqueue.c workqueue: remove unused cancel_work() 2018-03-13 13:37:42 -07:00
workqueue_internal.h Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2017-11-06 12:26:49 -08:00