linux/kernel/bpf
Andrey Ignatov d74bad4e74 bpf: Hooks for sys_connect
== The problem ==

See description of the problem in the initial patch of this patch set.

== The solution ==

The patch provides much more reliable in-kernel solution for the 2nd
part of the problem: making outgoing connecttion from desired IP.

It adds new attach types `BPF_CGROUP_INET4_CONNECT` and
`BPF_CGROUP_INET6_CONNECT` for program type
`BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that can be used to override both
source and destination of a connection at connect(2) time.

Local end of connection can be bound to desired IP using newly
introduced BPF-helper `bpf_bind()`. It allows to bind to only IP though,
and doesn't support binding to port, i.e. leverages
`IP_BIND_ADDRESS_NO_PORT` socket option. There are two reasons for this:
* looking for a free port is expensive and can affect performance
  significantly;
* there is no use-case for port.

As for remote end (`struct sockaddr *` passed by user), both parts of it
can be overridden, remote IP and remote port. It's useful if an
application inside cgroup wants to connect to another application inside
same cgroup or to itself, but knows nothing about IP assigned to the
cgroup.

Support is added for IPv4 and IPv6, for TCP and UDP.

IPv4 and IPv6 have separate attach types for same reason as sys_bind
hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields
when user passes sockaddr_in since it'd be out-of-bound.

== Implementation notes ==

The patch introduces new field in `struct proto`: `pre_connect` that is
a pointer to a function with same signature as `connect` but is called
before it. The reason is in some cases BPF hooks should be called way
before control is passed to `sk->sk_prot->connect`. Specifically
`inet_dgram_connect` autobinds socket before calling
`sk->sk_prot->connect` and there is no way to call `bpf_bind()` from
hooks from e.g. `ip4_datagram_connect` or `ip6_datagram_connect` since
it'd cause double-bind. On the other hand `proto.pre_connect` provides a
flexible way to add BPF hooks for connect only for necessary `proto` and
call them at desired time before `connect`. Since `bpf_bind()` is
allowed to bind only to IP and autobind in `inet_dgram_connect` binds
only port there is no chance of double-bind.

bpf_bind() sets `force_bind_address_no_port` to bind to only IP despite
of value of `bind_address_no_port` socket field.

bpf_bind() sets `with_lock` to `false` when calling to __inet_bind()
and __inet6_bind() since all call-sites, where bpf_bind() is called,
already hold socket lock.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-31 02:15:54 +02:00
..
arraymap.c bpf: add schedule points in percpu arrays management 2018-02-22 21:27:06 +01:00
bpf_lru_list.c bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4 2017-04-17 13:55:52 -04:00
bpf_lru_list.h bpf: Only set node->ref = 1 if it has not been set 2017-09-01 09:57:39 -07:00
cgroup.c bpf: Hooks for sys_bind 2018-03-31 02:15:18 +02:00
core.c bpf: fix bpf_prog_array_copy_to_user warning from perf event prog query 2018-02-14 08:59:37 -08:00
cpumap.c bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc() 2018-02-14 15:34:27 +01:00
devmap.c bpf: add helper for copying attrs to struct bpf_map 2018-01-14 23:36:29 +01:00
disasm.c bpf: Remove struct bpf_verifier_env argument from print_bpf_insn 2018-03-23 17:38:57 +01:00
disasm.h bpf: Remove struct bpf_verifier_env argument from print_bpf_insn 2018-03-23 17:38:57 +01:00
hashtab.c bpf: add helper for copying attrs to struct bpf_map 2018-01-14 23:36:29 +01:00
helpers.c bpf: rename ARG_PTR_TO_STACK 2017-01-09 16:56:27 -05:00
inode.c bpf: comment why dots in filenames under BPF virtual FS are not allowed 2018-03-09 10:30:30 +01:00
lpm_trie.c bpf: fix rcu lockdep warning for lpm_trie map_free callback 2018-02-22 21:29:12 +01:00
Makefile bpf: only build sockmap with CONFIG_INET 2018-01-04 19:01:14 +01:00
map_in_map.c bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
map_in_map.h bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
offload.c bpf: offload: report device information about offloaded maps 2018-01-18 22:54:25 +01:00
percpu_freelist.c bpf: fix lockdep splat 2017-11-15 19:46:32 +09:00
percpu_freelist.h bpf: introduce percpu_freelist 2016-03-08 15:28:31 -05:00
sockmap.c bpf: sockmap: initialize sg table entries properly 2018-03-30 22:50:15 +02:00
stackmap.c bpf: extend stackmap to save binary_build_id+offset instead of address 2018-03-15 01:09:28 +01:00
syscall.c bpf: Hooks for sys_connect 2018-03-31 02:15:54 +02:00
tnum.c bpf/verifier: track signed and unsigned min/max values 2017-08-08 17:51:34 -07:00
verifier.c bpf: Hooks for sys_bind 2018-03-31 02:15:18 +02:00