core: Avoid spurious realization of unit cgroups

Cgroups may be realized even when they are not needed. This happens,
e.g., for mount units parsed from /proc/$PID/mountinfo. To reproduce:

        touch /run/ns_mount
        unshare -n sh -c "mount --bind /proc/self/ns/net /run/ns_mount"
        # no cgroup exists
        file /sys/fs/cgroup/system.slice/run-ns_mount.mount
        systemctl daemon-reload
        # the needless cgroup now exists
        file /sys/fs/cgroup/system.slice/run-ns_mount.mount

(Such cgroups can add up to a large number when there are many similar
mounts.)

The code already accounts for "lazy" realization (see the various checks
of Unit.cgroup_realized), but unit_deserialize() in the reload/reexec
path performs realization unconditionally.

Invalidate (and queue) units for realization only if we know that they
were already realized in the past. This is safe even if the reload
brings new cgroup settings (controllers, BPF), because units that are
not yet realized will pick up the updated settings when the time for
their realization comes. (The existing code comment does not need to
change, because its wording already describes the new behavior.)
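
The intended behavior can be illustrated with a small stand-alone model.
This is only a sketch under stated assumptions: MockUnit and all mock_*
functions are hypothetical stand-ins invented for illustration, not
systemd's real Unit type or API, and the "queue" is reduced to a single
flag per unit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for systemd's Unit; only the state relevant here. */
typedef struct MockUnit {
        bool cgroup_realized;    /* a cgroup was created for this unit earlier */
        bool cgroup_invalidated; /* unit is queued for (re-)realization */
} MockUnit;

/* Stand-in for unit_invalidate_cgroup() + unit_invalidate_cgroup_bpf():
 * marks the unit so that the next queue run realizes it again. */
static void mock_invalidate_cgroup(MockUnit *u) {
        assert(u != NULL);
        u->cgroup_invalidated = true;
}

/* Models the end of unit_deserialize() after this change: only units
 * that were realized before the reload are invalidated. */
static int mock_deserialize_tail(MockUnit *u) {
        assert(u != NULL);
        if (u->cgroup_realized)
                mock_invalidate_cgroup(u);
        return 0;
}

/* Models processing of the realization queue after the reload: every
 * invalidated unit ends up realized (in the real manager, this is where
 * a cgroup directory would appear). */
static void mock_process_queue(MockUnit *u) {
        assert(u != NULL);
        if (u->cgroup_invalidated) {
                u->cgroup_realized = true;
                u->cgroup_invalidated = false;
        }
}
```

In this model, a unit that starts with cgroup_realized == false (like the
mount unit from the reproducer) passes through mock_deserialize_tail() and
mock_process_queue() without becoming realized, while a previously
realized unit is queued and realized again with whatever settings are
current at that point.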
Michal Koutný 2021-06-10 15:58:43 +02:00 committed by Luca Boccassi
parent dbb3b26f1b
commit cc815b7fea

@@ -526,8 +526,10 @@ int unit_deserialize(Unit *u, FILE *f, FDSet *fds) {
         /* Let's make sure that everything that is deserialized also gets any potential new cgroup settings
          * applied after we are done. For that we invalidate anything already realized, so that we can
          * realize it again. */
-        unit_invalidate_cgroup(u, _CGROUP_MASK_ALL);
-        unit_invalidate_cgroup_bpf(u);
+        if (u->cgroup_realized) {
+                unit_invalidate_cgroup(u, _CGROUP_MASK_ALL);
+                unit_invalidate_cgroup_bpf(u);
+        }
 
         return 0;
 }