Commit graph

1485 commits

Author SHA1 Message Date
Daan De Meyer 1ab6ae1957 units: Use ImportCredential= where applicable 2023-06-08 14:09:36 +02:00
Lennart Poettering 1775872679 units: change TimeoutSec=0 to TimeoutSec=infinity
Follow-up for #27936

Let's also update a bunch of static unit files, matching what we just
did for the generators.
2023-06-06 18:23:43 +01:00
Lennart Poettering 13ffc60749 pid1: add "soft-reboot" reboot method
This adds a new mechanism for rebooting, a form of "userspace reboot"
hereby dubbed "soft-reboot". It will stop all services as in a usual
shutdown, possibly transition into a new root fs and then issue a fresh
initial transaction. The kernel is not replaced.

File descriptors can be passed over, thus opening the door for leaving
certain resources around between such reboots.

Usecase: this is an extremely quick way to reset userspace fully when
updating image based systems, without going through a full
hardware/firmware/boot loader/kernel/initrd cycle. It minimizes "grayout time"
for OS updates. (In particular when combined with kernel live patching)
2023-06-02 16:49:38 +02:00
Lennart Poettering d120ce478d units: don't stop blockdev@.target unit at shutdown
We want that cryptsetup/veritysetup devices can stick around until the
very end, as well as the users of them which might depend on
blockdev@.target for the devices. Hence leave the targets around till
the very end.

Note that their runtime is managed via StopWhenUnneeded= anyway, hence
unless their are volumes that actually survive still the very end they
target units will still be stopped.
2023-06-01 18:49:43 +02:00
Lennart Poettering 7a2f3194ff units: set DefaultDependencies=no for veritysetup slice
This mimics what we already have for cryptsetup services: the slice they
are placed in (they have their own slice since that's what we do by
default for instantiated services) shouldn't conflict with
shutdown.target, so that veritysetup services can stay around until the
very end (which is what we want for the root and usr verity volumes).

It's literally just a copy of the same unit we already have for
cryptsetup, just with an updated description string.
2023-06-01 18:49:43 +02:00
Zbigniew Jędrzejewski-Szmek bec89355c5 units: pull in local-fs-pre.target from systemd-tmpfiles-setup-dev.service
local-fs-pre.target is a passive unit, which means that it is supposed to be
pulled in by everything that is ordered before it. We had
Before=local-fs-pre.target, so add Wants= too.

I don't expect this to change anything. Instead, just make things follow the
docs so it's easier to reason about the dependency set.
2023-05-31 15:44:44 +02:00
Mike Yuan 97d822abac
Merge pull request #27787 from keszybz/firstboot-synchronous-restart
firstboot: make restart of vconsole-setup synchronuous
2023-05-27 02:30:45 +08:00
Zbigniew Jędrzejewski-Szmek b2ce20aa0c units: order systemd-firstboot after systemd-tmpfiles-setup
We may copy files from factory to /etc. The default mkosi config has
factory/etc/vconsole.conf. systemd-firstboot would race with tmpfiles-setup,
and sometimes ask for the keymap, and sometimes not.

I guess that if there are files in factory, we shouldn't ask the user for
the same configuration.
2023-05-26 15:07:01 +02:00
Zbigniew Jędrzejewski-Szmek 8eb668b9ab firstboot: synchronously wait for systemd-vconsole-setup.service/restart job
Requested in https://github.com/systemd/systemd/pull/27755#pullrequestreview-1443489520.

I dropped the info message about the job being requested, because we get
fairly verbose logs from starting the unit, and the additional message isn't
useful.

In the unit, the ordering before systemd-vconsole-setup.service is dropped,
because now it needs to happen in parallel, while systemd-firstboot.service
is running. This means that we may potentially execute vconsole-setup twice,
but it's fairly quick, so this doesn't matter much.
2023-05-26 15:07:01 +02:00
Daan De Meyer 75efd16fb0 units: Shut down networkd and resolved on switch-root
Let's explicitly order these against initrd-switch-root.target, so
that they are properly shut down before we switch root. Otherwise,
there's a race condition where networkd might only shut down after
switching root and after we've already we've loaded the unit graph,
meaning it won't be restarted in the rootfs.

Fixes #27718
2023-05-26 06:54:56 +08:00
Zbigniew Jędrzejewski-Szmek a777a59243 firstboot: process the root account after sysusers created it
We would create root account from sysusers or from firstboot, depending on
which one ran earlier. Since firstboot offers more options, in particular can
set the root password, we needed to order it earlier. This created an ugly
ordering requirement:

systemd-sysusers.service > systemd-firstboot.service > ... >
  systemd-remount-fs.service > systemd-tmpfiles-setup-dev.service >
  systemd-sysusers.service

We want sysusers.service to create basic users, so we can create nodes in dev,
so we can operate on block devices and such, so that we can resize and remount
things. But at the same time, systemd-firstboot.service can only work if it is
run early, before systemd-sysusers.service has created /etc/passwd. We can't
have it both ways: the units that want to have a fully writable root file
system cannot be ordered before units which are required to do file system
preparation.

Instead of trying to order firstboot very early, let's let it do its thing even
if it is started later. Instead of refusing to create to the root account if
/etc/passwd and /etc/shadow exist, actually check if the account is configured.
Now sysusers writes root account with password PASSWORD_UNPROVISIONED
("!unprovisioned"), and then firstboot checks for this, and will configure root
in this case.

This allows sysusers to be executed earlier (or accounts to be set up earlier
in another way).

This effectively reverts b825ab1a99.
2023-05-23 15:09:39 +02:00
Zbigniew Jędrzejewski-Szmek b42482af90 units: create /dev with --graceful first, allow sysusers to run later
We want to call systemd-tmpfiles-setup-dev.service to create /dev/fuse and
other device nodes so that module probing will work. But it is possible that
when we're in first boot, some users or groups need to be created by
systemd-sysusers first. But it is also possible that systemd-sysusers cannot
actually execute configuration because the root partition is not fully writable
yet. So let systemd-tmpfiles-setup-dev.service run earlier, possibly without
all users and groups in place. Since systemd-tmpfiles-setup-dev.service writes
to /dev only, it doesn't care how the root partition is mounted. In this early
run, some some nodes might be created with default permissions (i.e. not
accessible to non-root users or groups). This should be OK for the early boot
phase. Afterwards, we let systemd-tmpfiles-setup.service execute full
configuration. We will configure any files in /dev twice, but considering that
there's only a few of them and that the second run should only adjust ownership
and permissions, this should be OK. This way, we avoid the dependency loop.
2023-05-23 15:09:39 +02:00
Zbigniew Jędrzejewski-Szmek b93562a1a1 units: make sure proc-sys-binfmt_misc.automount is actually stopped
As with other units, stopping of the automount requires actual work,
and without the ordering dependency systemd might not execute the stop
job before shutdown.target is reached and units ordered after that are
executed.
2023-05-23 12:39:34 +02:00
Zbigniew Jędrzejewski-Szmek d6f6846464 units/systemd-repart: stop pretending that root config is executed in the initrd
I have a system with /usr/lib/repart.d/50-root.conf with GrowFileSystem=yes.
The partition wouldn't be resized in the initrd, because
ConditionDirectoryNotEmpty=|/sysusr/usr/lib/repart.d was evaluated very early,
before /sysroot was mounted. There was no ordering dependency between
systemd-repart.service and sysroot.mount. (There was After=initrd-usr-fs.target,
but it seems to be only referred to by systemd-fstab-generator, which in my
case doesn't even run, because there's no fstab.)

But in fact, we neeed to run systemd-repart in the initrd only in limited
circumstances: when we need to create the root device based on config under
sysusr.mount. If there is config on the root device, it can be executed in
the host system, early during boot. Thus, let's remove the condition on
/sysroot/…. Without an ordering dependency on sysroot.mount, it was subject to
a race condition anyway. (A race condition with a low probability of "winning",
because systemd-repart.service has no dependencies, but sysroot.mount requires
a device to be detected and the mount to happen.)

The other problem was that systemd-repart.service didn't have the ordering wrt.
initrd-switch-root.target, so it was subject to the same race condition that
was fixed for other units in 7c0e2b5559. (If the
systemd-repart.service/stop job is slow, we could end up not restarting
systemd-repart.service in the host system.)

With the changes here, I see systemd-repart.service/start running twice:
in the initrd it is skipped because the conditions fail, and then in the
host system it runs normally.

Note: support for /sysroot is retained in systemd-repart code. I don't see a
strong reason to remove it, since it may still be useful to people invoking
repart in the initrd in other circumstances.
2023-05-23 12:39:33 +02:00
Zbigniew Jędrzejewski-Szmek 4e66876dfc units: do more reordering of ordering config
No functional change, just a cleanup to make the subsequent changes easier to
see. This is a continuation of 9810e41942

> The block is reordered and split to have:
>    1. description + documentation
>    2. (optionally) conditions
>    3. all the dependencies

The dependencies for shutdown.target are listed separately because they are the
other deps are for startup, and shutdown.target only matter much later.
2023-05-23 12:39:16 +02:00
Zbigniew Jędrzejewski-Szmek a6f3a7eb8a units: order sysinit.target, debug-shell.service after systemd-vconsole-setup
Previous patch to add an implicit dependency effectively orders various getty
services after systemd-vconsole-setup.service. But I think it's cleaner to also
order the service before sysinit.target, like it was before
8125e8d38e. There might be units which don't do
use TTYVHangup= but would like to have the console fully initialized.

Also, add a manual ordering to debug-shell.service, because it has
ImplicitDependencies=no. This might delay debug-shell.service a bit, but
systemd-vconsole-setup.service has no dependencies and should be very quick, so
this should not be noticable in practice. Without the ordering, the terminal
might not have a key map loaded, making debug-shell.service hard to use.
2023-05-19 17:47:14 +02:00
Zbigniew Jędrzejewski-Szmek ce0fe01f22 units: order getty units after getty-pre.target unconditionally
Those two units had this ordering conditionalized on HAVE_SYSV_COMPAT. This
seems strange. 45e2753297 added the ordering
differently for those two files without any comment, and I think it was just
pasted or scripted erroneously.
2023-05-19 15:22:45 +02:00
Yu Watanabe d0e3ae838f unit: add conditions and deps to make oomd.socket and .service consistent
Fixes #27690.
2023-05-19 08:58:56 +02:00
Daan De Meyer 4340e5b6df Revert "units: Add missing dependencies on initrd-switch-root.target"
This reverts commit f0ad3e6b96.
2023-05-15 15:42:21 +02:00
Daan De Meyer f0ad3e6b96 units: Add missing dependencies on initrd-switch-root.target
These are all services that valid to be run in the initrd, so let's
make sure they have the appropriate dependencies on
initrd-switch-root.target so that they are stopped when we're about
to switch root.
2023-05-13 02:14:02 +08:00
Daan De Meyer 97211510b0 units: Add CAP_NET_ADMIN condition to systemd-networkd-wait-online@.service as well
It was added to CAP_NET_ADMIN but we forgot to add it to the template
service as well.
2023-05-09 17:59:55 +02:00
Yu Watanabe d421db6e8b units: add/fix Documentation= about bus interface 2023-05-09 06:10:23 +09:00
Lennart Poettering 5ae89ef347 core/systemctl: when switching root default to /sysroot/
We hardcode the path the initrd uses to prepare the final mount point at
so many places, let's also imply it in "systemctl switch-root" if not
specified.

This adds the fallback both to systemctl and to PID 1 (this is because
both to — different – checks on the path).
2023-04-28 23:26:20 +01:00
Lennart Poettering 1a3704dcc3 nspawn: port over to /supervisor/ subcgroup being delegated to nspawn
Let's make use of the new DelegateSubgroup= feature and delegate the
/supervisor/ subcgroup already to nspawn, so that moving the supervisor
process there is unnecessary.
2023-04-27 12:18:32 +02:00
Lennart Poettering f8371dbd56 udev: port to DelegateSubgroup= 2023-04-27 12:18:32 +02:00
Lennart Poettering 3975e3f8ae units: make system service manager create init.scope subcgroup for user service manager
This one is basically for free, since the service manager is already
prepared for being invoked in init.scope. Hence let's start it in the
right cgroup right-away.
2023-04-27 12:18:32 +02:00
Lennart Poettering e76b3d4ed2 units: restrict hugepages fs a bit
suid binaries and device nodes should not be placed there, hence forbid
it.

Of all the API VFS we mount from PID 1 or via a unit file this one is
the only one where we didn't add MS_NODEV/MS_NOSUID. Let's address that,
since there's really no reason why device nodes or suid binaries would
be placed in hugetlbfs.
2023-04-27 12:28:50 +09:00
Eric Curtin b9dac41837 Support /etc/system-update for OSTree systems
This is required when / is immutable and cannot be written at runtime.

Co-authored-by: Richard Hughes <richard@hughsie.com>
2023-04-25 17:40:41 +02:00
Lennart Poettering 3af48a86d9
Merge pull request #25608 from poettering/dissect-moar
dissect: add dissection policies
2023-04-12 13:46:08 +02:00
Kai Lueke 721412ac98 systemd-sysext/confext.service: Refresh on start/reload
When adding a sysext image to the system and manuall merging it, a
later "systemctl (re)start systemd-sysext" won't work because "merge"
refuses to work when something is merged already. Another problem with
"merge" at start plus "unmerge" at stop is that a service restart can't
make use of the new MOVE_MOUNT_BENEATH in the future even which would
only be available in "refresh". It also prepares us for setting up the
merged overlay for the sysroot from the initrd already, which also
would lead to the mentioned start problem of the service (One
optimization could be to skip the loading but only if we are sure that
all images were loaded and weren't modified since - this assumption is
hard because early services could want to inject a sysext, too).

Use "refresh" on service start to fix the problem that the service
can't start as soon as a manual merge was done. Also add a reload
action that allows to issue "systemctl reload systemd-sysext" and it
will make use of MOVE_MOUNT_BENEATH once we implement this in
systemd-sysext refresh (and it's available from the kernel).
2023-04-06 20:47:26 +09:00
maanyagoenka 1f839f48e0 confext: add the systemd-confext.service file 2023-04-05 21:50:04 +00:00
Lennart Poettering 73740c9f84 discover-image: automaticaly pick up sysext images from /.extra/sysext 2023-04-05 20:52:21 +02:00
Luca Boccassi de862276ed sysext: stop storing under /usr/lib[/local]/extensions/
sysexts are meant to extend /usr. All extension images and directories are opened and merged in a
single, read-only overlayfs layer, mounted on /usr.
So far, we had fallback storage directories in /usr/lib/extensions and /usr/local/lib/extensions.
This is problematic for three reasons.

Firstly, technically, for directory-based extensions the kernel will reject
creating such an overlay, as there is a recursion problem. It actively
validates that a lowerdir is not a child of another lowerdir, and fails with
-ELOOP if it is. So having a sysext /usr/lib/extensions/myextdir/ would result
in an overlayfs config lowerdir=/usr/lib/extensions/myextdir/usr/:/usr which is
not allowed, as indicated by Christian the kernel performs this check:

/*
 * Check if this layer root is a descendant of:
 * - another layer of this overlayfs instance
 * - upper/work dir of any overlayfs instance
 */

<...>

/* Walk back ancestors to root (inclusive) looking for traps */
while (!err && parent != next) {
        if (is_lower && ovl_lookup_trap_inode(sb, parent)) {
                err = -ELOOP;
                pr_err("overlapping %s path\n", name);

Secondly, there's a confusing aspect to this recursive storage. If you
have /usr/lib/extensions/myext.raw which contains /usr/lib/extensions/mynested.raw
'systemd-sysext merge' will only pick up the first one, but both will appear in
the merged root under /usr/lib/extensions/. So you have two extension images, both
appear in your merged filesystem, but only one is actually in use.

Finally, there's a conceptual aspect: the idea behind sysexts and hermetic /usr
is that the /usr tree is not modified locally, but owned by the vendor. Dropping
extensions in /usr thus goes contrary to this foundational concept.
2023-03-30 11:25:17 +01:00
Lennart Poettering 62c72c60b5 units: let's establish the coredump socket before writting core_pattern sysctl
It's a bit nicer if we only write the sysctl core_pattern once the
coredump socket is established, since it's the backend for the handler.

Given the systemd-coredump.socket basically has no dependencies that run
before it this should not really make things slower or so, it just
removes the tiny window where core pattern is in effect that wants to
connect to the backend socket but cannot.

The status quo isn't terrible, and not too different in effect: either
way, until the socket unit is up we won't process coredumps. It's mostly
what kind of behaviour you get then: an error due to /bin/false being
invoked, or an error because systemd-coredump can't connect to its
socket. After this patch we'll exclusively see the former.
2023-03-30 08:53:52 +09:00
Mike Yuan 23c4c03406
unit: sysext: update unit name for sd-tmpfiles-setup
Fixes #26882
2023-03-19 01:29:48 +08:00
Daan De Meyer cafd2c0be4 units: Order user@.service after systemd-oomd.service
The user manager connects to oomd over varlink. Currently, during
shutdown, if oomd is stopped before any user manager, the user
manager will try to reconnect to the socket, leading to a warning
from pid 1 about a conflicting transaction.

Let's fix this by ordering user@.service after systemd-oomd.service,
so that user sessions are stopped before systemd-oomd is stopped,
which makes sure that the user sessions won't try to start oomd via
its socket after systemd-oomd is stopped.
2023-03-18 15:05:43 +09:00
Daan De Meyer a4180c0fb3 journald-console: Add colors when forwarding to console
Let's color output when we're forwarding to the console. To make this
work, we inherit TERM from pid 1 and use it to decide whether we should
output colors or not.
2023-03-16 11:22:58 +01:00
Jan Janssen dfca5587cf tree-wide: Drop gnu-efi
This drops all mentions of gnu-efi and its manual build machinery. A
future commit will bring bootloader builds back. A new bootloader meson
option is now used to control whether to build sd-boot and its userspace
tooling.
2023-03-10 11:41:03 +01:00
Jan Engelhardt 18fe76eba5 doc: correct wrong use "'s" contractions 2023-03-07 13:39:31 +01:00
Lennart Poettering 9d03637404 units: let systemd --user manage its own memory pressure handling
Let's make things systematic: the per-user and the per-system manager
should manage their own memory pressure, as they are, well, managers of
things.

This is particularly relevant and the per-user service manager should
watch its own "init.scope" subcgroup, instead of the main service unit
cgroup, and hence $MEMORY_PRESSURE_WATCH as set by the per-system
service manager would simply be wrong.
2023-03-01 09:43:24 +01:00
Luca Boccassi 7ef09e2099 units: change assert to condition to skip running in initrd/os
These units are also present in the initrd, so instead of an assert,
just use a condition so they are skipped where they need to be skipped.

Fixes https://github.com/systemd/systemd/issues/26358
2023-02-09 12:04:21 +00:00
Zbigniew Jędrzejewski-Szmek e4c7b5f517 core: split system/user job timeouts and make them configurable
Config options are -Ddefault-timeout-sec= and -Ddefault-user-timeout-sec=.
Existing -Dupdate-helper-user-timeout= is renamed to -Dupdate-helper-user-timeout-sec=
for consistency. All three options take an integer value in seconds. The
renaming and type-change of the option is a small compat break, but it's just
at compile time and result in a clear error message. I also doubt that anyone was
actually using the option.

This commit separates the user manager timeouts, but keeps them unchanged at 90 s.
The timeout for the user manager is set to 4/3*user-timeout, which means that it
is still 120 s.

Fedora wants to experiment with lower timeouts, but doing this via a patch would
be annoying and more work than necessary. Let's make this easy to configure.
2023-02-01 11:52:29 +00:00
Frantisek Sumsal 0eb635ef4b units: don't install pcrphase-related units without gnu-efi
since we don't have systemd-pcrphase built anyway, which breaks the tests:

...
I: Attempting to install /usr/lib/systemd/systemd-networkd-wait-online (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-network-generator (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-oomd (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-pcrphase (based on unit file reference)
W: Failed to install '/usr/lib/systemd/systemd-pcrphase'
make: *** [Makefile:4: setup] Error 1
make: Leaving directory '/root/systemd/test/TEST-01-BASIC'

Follow-up to 04959faa63.
2023-01-17 14:30:02 +01:00
Lennart Poettering 04959faa63 generators: optionally, measure file systems at boot
If we use gpt-auto-generator, automatically measure root fs and /var.

Otherwise, add x-systemd.measure option to request this.
2023-01-17 09:42:16 +01:00
Lennart Poettering 50072ccf1b units: rework growfs units to be just a regular unit that is instantiated
The systemd-growfs@.service units are currently written in full for each
file system to grow. Which is kinda pointless given that (besides an
optional ordering dep) they contain always the same definition. Let's
fix that and add a static template for this logic, that the generator
simply instantiates (and adds an ordering dep for).

This mimics how systemd-fsck@.service is handled. Similar to the wait
that for root fs there's a special instance systemd-fsck-root.service
we also add a special instance systemd-growfs-root.service for the root
fs, since it has slightly different deps.

Fixes: #20788
See: #10014
2023-01-17 09:42:16 +01:00
Lennart Poettering 072c8f6505 units: measure /etc/machine-id into PCR 15 during early boot
We want PCR 15 to be useful for binding per-system policy to. Let's
measure the machine ID into it, to ensure that every OS we can
distinguish will get a different PCR (even if the root disk encryption
key is already measured into it).
2023-01-17 09:42:16 +01:00
Franck Bui 2aba77057e journal: give the ability to enable/disable systemd-journald-audit.socket
Before this patch the only way to prevent journald from reading the audit
messages was to mask systemd-journald-audit.socket. However this had main
drawback that downstream couldn't ship the socket disabled by default (beside
the fact that masking units is not supposed to be the usual way to disable
them).

Fixes #15777
2023-01-11 17:18:57 +01:00
Lennart Poettering 5d71e463f4 logind: implement Type=notify-reload protocol properly
So close already. Let's add the two missing notifications too.

Fixes: #18484
2023-01-10 18:28:38 +01:00
Lennart Poettering f84331539d udevd: implement the full Type=notify-reload protocol
We are basically already there, just need to add MONOTONIC_USEC= to the
RELOADING=1 message, and make sure the message is generated in really
all cases.
2023-01-10 18:28:38 +01:00
Lennart Poettering 0e07cdb0e7 networkd: implement Type=notify-reload protocol 2023-01-10 18:28:38 +01:00