docs: document that one shouldn't pass the audit caps to containers

Apparently this is not well known, so let's document this.
Lennart Poettering 2021-04-28 16:40:58 +02:00
parent e6f1d7f4ec
commit feb10c665f

@@ -140,7 +140,7 @@ manager, please consider supporting the following interfaces.
`$CREDENTIALS_DIRECTORY` environment variable. If the container manager
does this, the credentials passed to the service manager can be propagated
to services via `LoadCredential=` (see ...). The container manager can
choose any path, but `/run/host/credentials` is recommended.
## Advanced Integration
@@ -329,6 +329,19 @@ care should be taken to avoid naming conflicts. `systemd` (and in particular
sub-directories of `/sys/` writable, but make sure to leave the root of
`/sys/` read-only.)
8. Do not pass the `CAP_AUDIT_CONTROL`, `CAP_AUDIT_READ`, or `CAP_AUDIT_WRITE`
capabilities to the container, in particular not to containers that make use of
user namespaces. The kernel's audit subsystem is still not virtualized for
containers, so passing these capabilities is pointless: any actual attempt to
use the audit subsystem will fail. systemd's audit support is partially
conditioned on these capabilities, so by dropping them you get an entirely
clean boot, as systemd will make no attempt to use audit. If you do pass the
capabilities to the payload, systemd will assume that audit is available and
works, and some components will subsequently fail in various ways. Once the
kernel gains native support for container-virtualized audit, adding the
capabilities to the container description will automatically make the
container payload use it.
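
As an illustration of the audit capability rule above, here is a minimal Python sketch of how a container manager author might check a capability mask for the three audit capabilities and clear them. It is not part of systemd or of any container manager; the function names are hypothetical. The capability numbers are the ones defined in the kernel's `<linux/capability.h>`, and `CapBnd` is the bounding-set line the kernel exposes in `/proc/<pid>/status`.

```python
# Capability numbers as defined in the Linux UAPI header <linux/capability.h>.
AUDIT_CAPS = {
    "CAP_AUDIT_WRITE": 29,
    "CAP_AUDIT_CONTROL": 30,
    "CAP_AUDIT_READ": 37,
}


def audit_caps_in_mask(mask_hex):
    """Return the sorted names of the audit capabilities set in a hex mask."""
    mask = int(mask_hex, 16)
    return sorted(name for name, bit in AUDIT_CAPS.items() if mask & (1 << bit))


def strip_audit_caps(mask_hex):
    """Return the mask, as 16 hex digits, with the audit capability bits cleared."""
    mask = int(mask_hex, 16)
    for bit in AUDIT_CAPS.values():
        mask &= ~(1 << bit)
    return format(mask, "016x")


def read_bounding_set(status_path="/proc/self/status"):
    """Read the CapBnd value (capability bounding set) from a proc status file."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapBnd:"):
                return line.split()[1]
    raise RuntimeError("no CapBnd line found in " + status_path)
```

Running `audit_caps_in_mask(read_bounding_set())` from inside the container payload should yield an empty list if the container manager dropped the audit capabilities as recommended. `strip_audit_caps()` shows the mask arithmetic a manager could apply before handing the bounding set to the payload; how the resulting set is actually applied depends on the container manager.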
## Fully Unprivileged Container Payload
First things first, to make this clear: Linux containers are not a security