Commit graph

24 commits

Author SHA1 Message Date
Daan De Meyer 7a321b5a21 test: Rename testsuite-XX units to match test name
Having these named differently than the test itself mostly creates
unecessary confusion and makes writing logic against the tests harder
so let's rename the testsuite-xx units and scripts to just use the
test name itself.
2024-05-14 12:43:28 +02:00
Daan De Meyer 3c0a1b1e70 core: Imply DefaultDependencies=no for credential mounts
Currently, on soft-reboot, /run/credentials/@system is unmounted
because it has DefaultDependencies=yes and as such will have
Conflicts=umount.target and Before=umount.target. Let's make sure
credential mounts survive soft-reboot by implying DefaultDependencies=no
for credential mounts.
2024-05-14 12:42:45 +02:00
Daan De Meyer e290b45dfa TEST-82-SOFTREBOOT: Exit with exit status 123
Required to make mkosi consider the test successful.
2024-04-30 22:10:05 +02:00
Luca Boccassi ed35851693 run: fix generated unit name clash after soft-reboot
When sd-run connects to D-Bus rather than the private socket, it will
generate the transient unit name using the bus ID assigned by the D-Bus
broker/daemon. The issue is that this ID is only unique per D-Bus run,
if the broker/daemon restarts it starts again from 1, and it's a simple
incremental counter for each client.
So if a transient unit run-u6.service starts and fails, and it is not
collected (default on failure), and the system soft-reboots, any new
transient unit might conflict as the counter will restart:

Failed to start transient service unit: Unit run-u6.service was already loaded or has a fragment file.

Get the soft-reboot counter, and if it's greater than zero, append it
to the autogenerated unit name to avoid clashes.
2024-03-28 11:19:46 +09:00
Luca Boccassi 66f35161f6 core: add counter for soft-reboot iterations
Allow to query via D-Bus how many times the current booted system has
been soft rebooted
2024-03-27 01:27:35 +00:00
Luca Boccassi ebc7510380 core: relax dependency on RootImage= storage from Requires= to Wants=
If a unit is running in an image and wants to survive a soft-reboot,
then it can't be deactivated by the storage of the image going away.
Relax the dependency to a Wants=. Access to the image is not needed
when the unit is running anyway, so downgrade to Wants=.
2023-12-08 11:16:31 +09:00
Yu Watanabe 5edb35ef7a test: check journal files are not corrupted after soft-reboot 2023-11-28 18:28:18 +09:00
Luca Boccassi 665a3d6d15 systemctl: automatically softreboot/kexec if set up on reboot
Automatically softreboot if the nextroot has been set up with an OS
tree, or automatically kexec if a kernel has been loaded with kexec
--load.

Add SYSTEMCTL_SKIP_AUTO_KEXEC and SYSTEMCTL_SKIP_AUTO_SOFT_REBOOT to
skip the automated switchover.
2023-10-20 11:45:37 +01:00
Frantisek Sumsal 2f397514ad test: spawn the to-be-killed-on-soft-reboot units with --collect
Otherwise they might leave stuff behind if they don't respond fast
enough to the first SIGTERM and get SIGKILLEd, which then breaks reusing
the unit name further in the test:

[ 2993.620849] H testsuite-82.sh[43]: + systemd-run -p Type=exec -p DefaultDependencies=no -p IgnoreOnIsolate=yes --unit=testsuite-82-nosurvive.service sleep infinity
[ 2993.628686] H systemd[1]: testsuite-82-nosurvive.service: About to execute: /usr/bin/sleep infinity
[ 2993.628886] H systemd[1]: testsuite-82-nosurvive.service: Forked /usr/bin/sleep as 65
[ 2993.629328] H systemd[1]: testsuite-82-nosurvive.service: Changed dead -> start
...
[ 2993.699892] H testsuite-82.sh[43]: + systemctl --no-block --check-inhibitors=yes soft-reboot
[ 2993.704326] H systemd-logind[41]: The system will soft-reboot now!
...
[ 3001.249302] H systemd[1]: Sending SIGKILL to PID 65 (sleep).
...
[ 3001.303158] H testsuite-82.sh[136]: + systemd-notify '--status=Second Boot'
...
[ 3001.409504] H testsuite-82.sh[136]: + systemd-run -p Type=exec --unit=testsuite-82-nosurvive.service sleep infinity
[ 3001.414061] H testsuite-82.sh[165]: Failed to start transient service unit: Unit testsuite-82-nosurvive.service was already loaded or has a fragment file.

Spotted in Ubuntu CI.
2023-10-03 16:40:49 +02:00
Frantisek Sumsal 399a8a5eb1 test: use --service-type= instead of -p Type= 2023-10-03 16:38:35 +02:00
Frantisek Sumsal 82abce7a89 test: use Type=exec for the auxiliary services
To make sure the respective binaries are exec()ed before moving further
with the test.
2023-09-29 22:10:42 +02:00
Frantisek Sumsal 47f6baccfe test: shutdown the machine on fail after soft-reboot
Since the soft-reboot drops the enqueued end.service, we won't shutdown
the test VM if the test fails and have to wait for the watchdog to kill
us (which may take quite a long time). Let's just forcibly kill the
machine instead to save CI resources.
2023-09-29 22:07:12 +02:00
Frantisek Sumsal d37b9154a7 test: check soft-reboot behavior wrt argv[0][0] == '@' 2023-09-28 13:48:14 +01:00
Luca Boccassi 559214cbbd pid1: add SurviveFinalKillSignal= to skip units on final sigterm/sigkill spree
Add a new boolean for units, SurviveFinalKillSignal=yes/no. Units that
set it will not have their process receive the final sigterm/sigkill in
the shutdown phase.

This is implemented by checking if a process is part of a cgroup marked
with a user.survive_final_kill_signal xattr (or a trusted xattr if we
can't set a user one, which were added only in kernel v5.7 and are not
supported in CentOS 8).
2023-09-28 13:48:14 +01:00
Luca Boccassi e4aab5cf1a logind: add PrepareForShutdownWithMetadata signal
The existing signal doesn't say which type of shutdown is going to happen.
With the introduction of soft-reboot, it is useful to have this information
broadcasted, so that clients can choose to do different things based on the
reboot type.
Add a{sv} as the payload so that more metadata can be added later if
needed, without needing to add yet another signal.
Send both old and new signal for backward compatibility, and send the new
one first so that clients can just wait for the first one on both old and
new systems.
2023-09-11 12:56:00 +01:00
Luca Boccassi 663e27564f core: stage /run/host/os-release with a symlink to avoid possible race condition
If someone reads /run/host/os-release at the exact same time it is being updated, and it
is large enough, they might read a half-written file. This is very unlikely as
os-release is typically small and very rarely changes, but it is not
impossible.

Bind mount a staging directory instead of the file, and symlink the file
into into, so that we can do atomic file updates and close this gap.
Atomic replacement creates a new inode, so existing bind mounts would
continue to see the old file, and only new services would see the new file.
The indirection via the directory allows to work around this, as the
directory is fixed and never changes so the bind mount is always valid,
and its content is shared with all existing services.

Fixes https://github.com/systemd/systemd/issues/28794

Follow-up for 3f37a82545
2023-08-16 16:17:41 +01:00
Luca Boccassi bf85c2395e core: copy os-release with COPY_TRUNCATE
Otherwise if the os-release file shrinks between updates, there
will be a merge of the two.
Also remove redundant ENOENT check.

Follow-up for 3f37a82545
2023-08-11 17:14:09 +01:00
Luca Boccassi b41ab9b3f4 softreboot: ensure all processes are killed
Having surviving processes is not ready yet as a feature, so ensure
everything is killed on the transition for now
2023-07-24 10:45:28 +01:00
Luca Boccassi 3835b9aa4b Revert "core: add IgnoreOnSoftReboot= unit option"
The feature is not ready, postpone it

This reverts commit b80fc61e89.
2023-07-22 23:27:27 +01:00
Luca Boccassi b80fc61e89 core: add IgnoreOnSoftReboot= unit option
As it says on the tin, configures the unit to survive a soft reboot.
Currently all the following options have to be set by hand:

Conflicts=reboot.target kexec.target poweroff.target halt.target
Before=reboot.target kexec.target poweroff.target halt.target
After=sysinit.target basic.target
DefaultDependencies=no
IgnoreOnIsolate=yes

This is not very user friendly. If new default dependencies are added,
or new shutdown/reboot types, they also have to be added manually.

The new option is much simpler, easy to find, and does the right thing
by default.
2023-07-21 18:05:41 +02:00
Luca Boccassi 3f37a82545 core: copy the host's os-release for /run/host/os-release
Currently for portable services we automatically add a bind mount
os-release -> /run/host/os-release. This becomes problematic for the
soft-reboot case, as it's likely that portable services will be configured
to survive it, and thus would forever keep a reference to the old host's
os-release, which would be a problem because it becomes outdated, and also
it stops the old rootfs from being garbage collected.

Create a copy when the manager starts under /run/systemd/propagate instead,
and bind mount that for all services using RootDirectory=/RootImage=, so
that on soft-reboot the content gets updated (without creating a new file,
so the existing bind mounts will see the new content too).

This expands the /run/host/os-release protocol to more services, but I
think that's a nice thing to have too.

Closes https://github.com/systemd/systemd/issues/28023
2023-07-18 17:26:02 +01:00
Frantisek Sumsal 07268394d6 test: unify /testok & /failed handling
And drop it where not necessary.
2023-07-12 16:03:40 +02:00
Frantisek Sumsal 9a27ef092e tree-wide: fix a couple of typos
As reported by Fossies.org.
2023-06-15 20:52:45 +02:00
Lennart Poettering 093d545658 test: add integration test for soft reboots incl. fdstore passing 2023-06-02 18:43:11 +02:00