To make things symmetric to the $SYSTEMD_SSH logic that the varlink
transport supports, let's also honour such a variable in sd-bus when
picking ssh transport.
This uses openssh 9.4's -W support for AF_UNIX. Unfortunately older versions
don't work with this, and I couldn#t figure a way that would work for
older versions too, would not be racy and where we'd still could keep
track of the forked off ssh process.
Unfortunately, on older versions -W will just hang (because it tries to
resolve the AF_UNIX path as regular host name), which sucks, but hopefully this
issue will go away sooner or later on its own, as distributions update.
Fedora is still stuck at 9.3 at the time of posting this (even on
Fedora), even though 9.4, 9.5, 9.6 have all already been released by
now.
Example:
varlinkctl call -j ssh:root@somehost:/run/systemd/io.systemd.Credentials io.systemd.Credentials.Encrypt '{"text":"foobar"}'
Otherwise, udev workers cannot detect slow programs invoked by
IMPORT{program}=, PROGRAM=, or RUN=, and whole worker process may be
killed.
Fixes#30436.
Co-authored-by: sushmbha <sushmita.bhattacharya@oracle.com>
Same as $KERNEL_INSTALL_BYPASS, but for hwdb. This will speed up
cross architecture image builds in mkosi as I can disable package
managers from running the costly hwdb update stuff in qemu user
mode and run it myself with a native systemd-hwdb with --root=.
Now that mkosi-kernel is a thing, this logic in systemd is just mostly
bitrotting since I just use mkosi-kernel these days. If I ever need to
hack on systemd and the kernel in tandem, I'll just add support for
building systemd to mkosi-kernel instead, so let's drop the support for
building a custom kernel in systemd's mkosi configuration.
This code doesn't link when gcc+lld is used:
$ LDFLAGS=-fuse-ld=lld meson setup build-lld && ninja -C build-lld udevadm
...
ld.lld: error: src/shared/libsystemd-shared-255.a(libsystemd-shared-255.a.p/cryptsetup-util.c.o):
symbol crypt_token_external_path@@ has undefined version
collect2: error: ld returned 1 exit status
As a work-around, restrict it to developer mode.
Closes https://github.com/systemd/systemd/issues/30218.
- Use mkosi.images/ instead of mkosi.presets/
- Use the .chroot suffix to run scripts in the image
- Use BuildSources= match for the kernel build
- Move 10-systemd.conf to mkosi.conf and rely on mkosi.local.conf
for local configuration
don't let the devices to be announced just as model "Linux". Let's instead
propagate the underlying block device's model. Also do something
reasonably smart for the serial and firmware version fields.
Introduce a new env variable $SYSTEMD_NSPAWN_CHECK_OS_RELEASE, that can
be used to disable the os-release check for bootable OS trees. Useful
when trying to boot a container with empty /etc/ and bind-mounted /usr/.
Resolves: #29185
I tried to get something similar upstream:
https://gitlab.com/cryptsetup/cryptsetup/-/issues/846
But no luck, it was suggested I use ELF interposition instead. Hence,
let's do so (but not via ugly LD_PRELOAD, but simply by overriding the
relevant symbol natively in our own code).
This makes debugging tokens a ton easier.
Automatically softreboot if the nextroot has been set up with an OS
tree, or automatically kexec if a kernel has been loaded with kexec
--load.
Add SYSTEMCTL_SKIP_AUTO_KEXEC and SYSTEMCTL_SKIP_AUTO_SOFT_REBOOT to
skip the automated switchover.
Instead of using ExtraTrees=, let's use the new RuntimeTrees= option
to mount the full repository into the VM/container. Let's also store
the sources under /usr/src/systemd and update the gdbinit file and
vscode HACKING guide section to match the new location.
Currently we have a 100ms delay which allows for people to enter/show
the boot menu even when timeout is set to zero.
In a handful of cases, that may not be needed - both in terms of access
policy, as well as latency.
For example: the option to provide the boot menu may be hidden behind an
"expert only" UX in the OS, to avoid end users from accidentally
entering it.
In addition, the current 100ms input polling may cause unexpected
additional delays in the boot. Some example numbers from my SteamDeck:
- boot counting/rename/flush doubles 300us -> 600us
- seed/hash setup doubles 900us -> 1800us
- kernel/image load gets ~40% slower 107ms -> 167ms
It's not entirely clear why the UEFI calls gets slower, nevertheless the
information in itself proves useful.
This commit introduces a new option "menu-disabled", which omits the
100ms delay. The option is documented throughout the manual pages as
well as the Boot Loader Specification.
v2:
- use STR_IN_SET
v3:
- drop erroneous whitespace
v4:
- add a new LoaderFeature bit,
- don't change ABI keep TIMEOUT_* tokens the same
- move new token in the 64bit range, update API and storage for it
- change inc/dec behaviour to TIMEOUT_MIN : TIMEOUT_MENU_FORCE
- user cannot opt-in from sd-boot itself, add assert_not_reached()
v5:
- s/Menu disablement control/Menu can be disabled/
- rewrap comments to 109
- use SYNTHETIC_ERRNO(EOPNOTSUPP)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
To be on the safe side, explicitly mention that apart from the numerical
entries we can allow string ones.
Implementation-wise, bootctl will use internal numerical values that
match sd-boot's ABI. The latter also accepts the string options.
Going forward we'd like to avoid adding more internal magic and be more
explicit.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Currently we spawn services by forking a child process, doing a bunch
of work, and then exec'ing the service executable.
There are some advantages to this approach:
- quick: we immediately have access to all the enourmous amount of
state simply by virtue of sharing the memory with the parent
- easy to refactor and add features
- part of the same binary, will never be out of sync
There are however significant drawbacks:
- doing work after fork and before exec is against glibc's supported
case for several APIs we call
- copy-on-write trap: anytime any memory is touched in either parent
or child, a copy of that page will be triggered
- memory footprint of the child process will be memory footprint of
PID1, but using the cgroup memory limits of the unit
The last issue is especially problematic on resource constrained
systems where hard memory caps are enforced and swap is not allowed.
As soon as PID1 is under load, with no page out due to no swap, and a
service with a low MemoryMax= tries to start, hilarity ensues.
Add a new systemd-executor binary, that is able to receive all the
required state via memfd, deserialize it, prepare the appropriate
data structures and call exec_child.
Use posix_spawn which uses CLONE_VM + CLONE_VFORK, to ensure there is
no copy-on-write (same address space will be used, and parent process
will be frozen, until exec).
The sd-executor binary is pinned by FD on startup, so that we can
guarantee there will be no incompatibilities during upgrades.
Let's mention that we just need the latest stable release of mkosi,
not the latest git commit. We also split the instructions for building
on the host and the instructions for building with mkosi into two blocks,
as it's not required to build on the host anymore to build with mkosi.
On normal systems, triggering a timeout should be a bug in code or
configuration error, so I do not think we should extend the default
timeout. Also, we should not introduce a 'first class' configuration
option about that. But, making it configurable may be useful for cases
such that "an extremely highly utilized system (lots of OOM kills,
very high CPU utilization, etc)".
Closes#25441.
The tool initially just measured the boot phase, but was subsequently
extended to measure file system and machine IDs, too. At AllSystemsGo
there were request to add more, and make the tool generically
accessible.
Hence, let's rename the binary (but not the pcrphase services), to make
clear the tool is not just measureing the boot phase, but a lot of other
things too.
The tool is located in /usr/lib/ and still relatively new, hence let's
just rename the binary and be done with it, while keeping the unit names
stable.
While we are at it, also move the tool out of src/boot/ and into its own
src/pcrextend/ dir, since it's not really doing boot related stuff
anymore.
ispell made some suggestions which I applied.
Addresses: https://github.com/systemd/systemd/pull/29209#pullrequestreview-1632623460
Also adds a brief paragraph about initrd transitions. (Plymouth really
should start using the fdstore for pinning DRM objects, and stop trying
to survive the initrd→host transition)
There were a couple spelling/grammatical errors in the docs that made
it hard to read and understand parts of this doc. I cleaned up those
errors and reflowed the line breaks to keep to the 80 char limit.
The article "a" goes before consonant sounds and "an" goes before vowel
sounds. This commit changes an to a for UKI, UDP, UTF-8, URL, UUID, U-Label, UI
and USB, since they start with the sound /ˌjuː/.
Since mkosi is now smart enough to drop the caches when the list of
packages changes, let's enable Incremental= mode by default to ensure
a good experience for anyone new to hacking on systemd with mkosi.
ImportCredential= takes a credential name and searches for a matching
credential in all the credential stores we know about it. It supports
globs which are expanded so that all matching credentials are loaded.
The kernel, systemd, and many other things print their version during boot.
sd-boot and sd-stub are also important, so let's print the version if EFI_DEBUG.
(If !EFI_DEBUG, continue to be quiet.)
When updating the docs, I saw that that the text in HACKING.md was out of date.
Instead of trying to update the instructions there, make it shorter and refer
the reader to tools/debug-sd-boot.sh for details.
Before 7cd43e34c5, it was possible to use
SYSTEMD_PROC_CMDLINE=systemd.condition-first-boot to override autodetection.
But now this doesn't work anymore, and it's useful to be able to do that for
testing.
We provide the same stability for all the headers that are public.
Also, mark id128 as portable to other systems. There is really nothing in the
code that would make it hard. It would probably work out-of-the-box.
Let's start moving towards a more involved partitioning setup to
test our stuff more when using mkosi.
The root partition is generated on boot with systemd-repart.
CentOS supports neither erofs nor btrfs so we use squashfs and xfs
instead.
We also enable SecureBoot= locally for additional coverage. This
and the use of verity means users need to run `mkosi genkey` once
to generate the keys necessary to do secure boot and verity.
This implements a minimal subset of #24961, but in a lot more
restrictive way: we only allow one level of subcgroup (as that's enough
to address the no-processes in inner cgroups rule), and does not change
anything about threaded cgroup logic or similar, or make any of this new
behaviour mandatory.
All this does is this: all non-control processes we invoke for a unit
we'll invoke in a subgroup by the specified name.
We'll later port all our current services that use cgroup delegation
over to this, i.e. user@.service, systemd-nspawn@.service and
systemd-udevd.service.
To make it consistent with other env vars, e.g. $SYSTEMD_ESP_PATH or
$SYSTEMD_XBOOTLDR_PATH.
This is useful when the root is specified by a file descriptor, instead
of a path.
Fixes RHBZ#2183546 (https://bugzilla.redhat.com/show_bug.cgi?id=2183546).
Previously, journal file is always compressed with the default algorithm
set at compile time. So, if a newer algorithm is used, journal files
cannot be read by older version of journalctl that does not support the
algorithm.
Co-authored-by: Colin Walters <walters@verbum.org>
- Drop Netdev= as it was removed in mkosi
- Always install python-psutil in the final image (required for networkd tests)
- Always Install python-pytest in the final image (required for ukify tests)
- Use the narrow glob for all centos python packages
- Drop the networkd mkosi config files (the default image can be used instead)
- Use ".conf" as the mkosi config file suffix everywhere
- Copy src/ to /root/src in the final image and set gdb substitute path in
.gdbinit to make gdb work properly
This is useful to identify log messages with metadata from the images
they run on. Look for ID/VERSION_ID/IMAGE_ID/IMAGE_VERSION/BUILD_ID,
with a SYSEXT_ prefix if we are looking at an extension, and append via
LogExtraFields= as respectively PORTABLE_NAME_AND_VERSION= in case of a
single image. In case of extensions, append as PORTABLE_ROOT_NAME_AND_VERSION=
for the base and one PORTABLE_EXTENSION_AND_VERSION= for each extension.
Example with a base and two extensions, with the unit coming from the
first extension:
[Service]
RootImage=/home/bluca/git/systemd/base.raw
Environment=PORTABLE=app0.raw
BindReadOnlyPaths=/etc/os-release:/run/host/os-release
LogExtraFields=PORTABLE=app0.raw
Environment=PORTABLE_ROOT=base.raw
LogExtraFields=PORTABLE_ROOT=base.raw
LogExtraFields=PORTABLE_ROOT_NAME_AND_VERSION=debian_10
ExtensionImages=/home/bluca/git/systemd/app0.raw
LogExtraFields=PORTABLE_EXTENSION=app0.raw
LogExtraFields=PORTABLE_EXTENSION_NAME_AND_VERSION=app_0
ExtensionImages=/home/bluca/git/systemd/app1.raw
LogExtraFields=PORTABLE_EXTENSION=app1.raw
LogExtraFields=PORTABLE_EXTENSION_NAME_AND_VERSION=app_1
When a portable service uses extensions, we use the 'main' image name
(the one where the unit was found in) as PORTABLE=. It is useful to
also list all the images actually used at runtime, as they might
contain libraries and so on.
Use PORTABLE_ROOT= for the image/directory that is used as RootImage=
or RootDirectory=, and PORTABLE_EXTENSION= for the image/directory that
is used as ExtensionImages= or ExtensionDirectories=.
Note that these new fields are only added if extensions are used,
there's no change for single-DDI portables.
Example with a base and two extensions, with the unit coming from the
first extension:
[Service]
RootImage=/home/bluca/git/systemd/base.raw
Environment=PORTABLE=app0.raw
BindReadOnlyPaths=/etc/os-release:/run/host/os-release
LogExtraFields=PORTABLE=app0.raw
LogExtraFields=PORTABLE_ROOT=base.raw
ExtensionImages=/home/bluca/git/systemd/app0.raw
LogExtraFields=PORTABLE_EXTENSION=app0.raw
ExtensionImages=/home/bluca/git/systemd/app1.raw
LogExtraFields=PORTABLE_EXTENSION=app1.raw