Commit graph

43140 commits

Author SHA1 Message Date
Yu Watanabe 2b43ab00b0 udev-rules: fix negative match rule for SYMLINK and TAG
Fixes #27396.
2023-04-26 09:51:08 +09:00
Frantisek Sumsal c74e13a5c3 creds: make --pretty behave in a slightly more expected manner 2023-04-25 18:34:49 +02:00
Frantisek Sumsal ee46e4d982 test: cover missed stuff from securebits-util.h 2023-04-25 18:34:49 +02:00
Zbigniew Jędrzejewski-Szmek 1c7ed99027 resolved: adjust message about credentials
"credential provided widget" would be better spelled as "credential-provided widget".
But let's adjust the message to name the bad credential explicitly: this
makes it easier to fix for the user.
2023-04-25 18:08:15 +02:00
Zbigniew Jędrzejewski-Szmek 55ace8e5c5 shared/creds-util: return 0 for missing creds in read_credential_strings_many
Realistically, the only thing that the caller can do is ignore failures related
to missing credentials. If the caller requires some credentials to be present,
they should just check which output variables are not NULL. One of the callers
was already doing that, and the other wanted to, but missed -ENOENT. By
suppressing -ENOENT and -ENXIO, both callers are simplified.

Fixes a warning at boot:
systemd-vconsole-setup[221]: Failed to import credentials, ignoring: No such file or directory
2023-04-25 18:08:15 +02:00
Eric Curtin b9dac41837 Support /etc/system-update for OSTree systems
This is required when / is immutable and cannot be written at runtime.

Co-authored-by: Richard Hughes <richard@hughsie.com>
2023-04-25 17:40:41 +02:00
Lennart Poettering d30d5a0374
Merge pull request #27347 from bluca/sd_bus_nonce
sd: avoid closing sd-bus in a fork, store module-global id for sd-bus/sd-session/sd-journal
2023-04-25 17:40:15 +02:00
Lennart Poettering 17b798d915 mount-util: split remount_idmap() in two
This will make things a bit longer for now, but more powerful as we can
reuse the userns fd between calls to remount_idmap() if we need to
adjust multiple mounts.

No change in behaviour, just some minor refactoring.
2023-04-25 17:39:16 +02:00
Lennart Poettering 4054d76151 sd-daemon: add sd_pid_notifyf_with_fds()
I guess it was only a question of time until we need to add the final
frontier of notification functions: one that combines the features of
all the others:

1. specifiying a source PID
2. taking a list of fds to send along
3. accepting a format string for the status string

Hence, let's add it.
2023-04-25 17:38:57 +02:00
Luca Boccassi 4a75704b16 pam: do not attempt to close sd-bus after fork in pam_end()
When pam_end() is called after a fork, and it cleans up caches, it sets
PAM_DATA_SILENT in error_status. FDs will be shared with the parent, so
we do not want to attempt to close them from a child process, or we'll
hit assertions. Complain loudly and skip.
2023-04-25 17:19:57 +02:00
Lennart Poettering 973527648b logind: always use 64bit session IDs
it's a bit confusing that on 32bit systems we'd risk session IDs
overruns like this. Let's expose the same behaviour everywhere and stick
to 64bit ids.

Since we format the ids as strings anyway this doesn't really change
anything performance-wise, it just pushes out collisions by overrun to
basically never happen.
2023-04-25 15:52:19 +01:00
Thorsten Kukuk 092e6cd19a sd-login: add SetTTY session object #26611 2023-04-25 14:33:09 +02:00
Lennart Poettering eb3641fc3c user-record-nss: make return values optional
If we only want to know if some user ID/user name is already allocated,
we don't care for the returned data.
2023-04-25 14:00:57 +02:00
Lennart Poettering c8ab89e569 mountpoint-util: make path_get_mnt_id_at() work with a NULL path 2023-04-25 14:00:38 +02:00
Luca Boccassi 2eeff0f4f1 sd-event: store and compare per-module static origin id
sd-event objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-event object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-event are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
2023-04-25 12:24:25 +01:00
Luca Boccassi e046719b74 sd-journal: store and compare per-module static origin id
sd-journal objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-journal object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-journal are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
2023-04-25 12:24:25 +01:00
Luca Boccassi bf876e3f3e sd-bus: store and compare per-module static origin id
sd-bus objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-bus object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-bus are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
2023-04-25 12:24:25 +01:00
Luca Boccassi bf2d930fa1 macro: add helper for module origin id
These need to be redefined in every module that we need to guard, so add
a macro
2023-04-25 11:34:39 +01:00
Lennart Poettering 797f6cc514 fs-util: make sure open_mkdir_at() does something roughly sensible when invoked with '/' 2023-04-25 18:38:00 +09:00
Yu Watanabe 3d008416d6
Merge pull request #27380 from poettering/bpf-meson-tweaks
two bpf build system changes
2023-04-25 18:37:36 +09:00
Lennart Poettering 3cd60148b4
Merge pull request #27388 from poettering/assert-fd
add ASSERT_FD() similar to ASSERT_PTR(), but for fds
2023-04-25 09:54:20 +02:00
Daan De Meyer afc47ee2af Drop log level of header limits log message
Especially when using in-memory logging, these are too noisy so
let's drop them back to debug level.
2023-04-25 07:31:40 +02:00
Luca Boccassi 7556f29694
Merge pull request #27386 from dtardon/test-cleanup
More automatic cleanup in tests
2023-04-25 02:00:56 +01:00
Wolfgang Müller 38fc5e0314 cryptsetup-fido2: Depend on libcryptsetup
crypsetup-fido2 always depended on both libfido2 and libcryptsetup, but
0a8e026e82 forgot to make the then
implicit dependency on libcryptsetup explicit when moving it from
cryptsetup/ to shared/. This breaks builds when libfido2 is autodetected
but the system is missing libcryptsetup.

Introduce an explicit check for HAVE_LIBCRYPTSETUP such that
cryptsetup-fido2 is only built when both libraries are available.

Fixes #27374.
2023-04-25 02:00:16 +01:00
Luca Boccassi 7d9f6034a9 sd-bus: check for pid change before closing
If we try to close after a fork, the FDs will have been cloned
too and we'll assert. This can happen for example in PAM modules.

Avoid the macro and define ref/unref by hand to do the same check.
2023-04-25 00:54:07 +01:00
Lennart Poettering 0593b34adc homed: rename make_userns() to avoid name conflict with mount-util.[ch]
This doesn't really matter too much as both are static functions. But
it's confusing as hell both when debugging and reading code, given that
homed actually uses mount-util.c

Hence, let's just rename one of the two, to minimize confusion.

No actual change in behaviour.

(and sooner or later we might want to export mount-util.c's version of
the function, since it's generically useful)
2023-04-24 22:29:47 +02:00
Zbigniew Jędrzejewski-Szmek 208a59c15f
Merge pull request #27113 from keszybz/variable-expansion-rework
Rework serialization of command lines in pid1 and make run not expand variables
2023-04-24 22:03:06 +02:00
Lennart Poettering 91ce42f008 parse-util: allow parse_pid() to work with NULL return parameter
That way the function becomes useful for validating pids formatted as
strings.
2023-04-25 03:16:33 +08:00
David Tardon 5b87bccc00 test-hashmap-plain: use _cleanup_ 2023-04-24 21:15:50 +02:00
David Tardon 8f25d740f1 test-set: use _cleanup_ 2023-04-24 21:15:50 +02:00
Lennart Poettering 8e398254ba loop-util: port some code over to ASSERT_FD() 2023-04-24 20:52:52 +02:00
Lennart Poettering 6f81bcef25 fd-util: add ASSERT_FD() that is similar to ASSERT_PTR() but for fds 2023-04-24 20:51:51 +02:00
David Tardon 7a9f8b9053 test-calendarspec: use _cleanup_ 2023-04-24 20:44:29 +02:00
Franck Bui c821ad7d60 locale: convert generated vconsole keymap to x11 layout automatically
When doing x11->console conversions, find_converted_keymap() searches
automatically for a candidate in the converted keymap directory for a given x11
layout.

However doing console->x11 conversions, this automatic search is not done hence
simple conversion in this direction can't be achieved without populating
kbd-model-map with entries for converted keymaps.

For example, let's consider "at" layout which is not part of kbd-model-map. The
"at" x11 layout has a generated keymap
"/usr/share/kbd/keymaps/xkb/at.map.gz". If we configure "at" for the x11
layout, localed is able to automatically find the "at" converted vc layout and
the conversion just works :

  $ localectl set-x11-keymap at
  $ localectl
  System Locale: LANG=en_US.UTF-8
      VC Keymap: at
     X11 Layout: at

However in the opposite direction, ie when setting the vc keymap to "at", no
conversion is done and the x11 layout is not defined:

  $ localectl set-keymap at
  $ localectl
  System Locale: LANG=en_US.UTF-8
      VC Keymap: at
     X11 Layout: (unset)

This patch fixes this limitation as the implemenation is relatively simple and
it removes the need to populate kbd-model-map with (many) entries for converted
keymaps. However the patch doesn't remove the existing entries in kbd-model-map
which became unneeded after this change to be on the safe side.

Note: by default the automatically generated x11 keyboard configs use keyboard
model "microsoftpro" which should be equivalent to "pc105" model but with the
internet/media key mapping added.
2023-04-24 18:44:57 +02:00
Daan De Meyer d404c8d887 nspawn: Don't follow /etc/resolv.conf symlinks
When we're checking if /etc/resolv.conf exists so we can bind mount
on top of it, we care about whether the symlink itself exists if
/etc/resolv.conf exists and not the file it points to, so add
CHASE_NOFOLLOW to make sure we check existence of the symlink and
not the file it points to.
2023-04-24 18:14:12 +02:00
Lennart Poettering 906dff812e pid1: simplify bpf meson import 2023-04-24 17:10:08 +02:00
Lennart Poettering 4d3ef2d1a2 meson: move bpf hookup into main meson build file
This way we can use it in systemd-userdbd later on, too.
2023-04-24 17:10:08 +02:00
Luca Boccassi a2dd39b4cb pam: cache sd-bus separately per module
sd-bus connection is cached by the two pam modules globally, but this
can lead to issues due to hashmaps (used by sd-bus) using a global
static variable for the shared hash key, which is different per module
as both modules are loaded in the same process.

This happens because the sd-bus object is create in one module, but
used in the other, so global state does not match.

Use a different pam cache identifier for the sd-bus pointer, so that
each module uses a different sd-bus connection as a workaround.

Fixes https://github.com/systemd/systemd/issues/27216
Fixes https://github.com/systemd/systemd/issues/17266
2023-04-24 14:18:50 +02:00
Luca Boccassi db0c0f5e00 pam_systemd_home: clean up sd-bus when called about something else's user
acquire_home() takes a reference to a sd-bus object, which the open_session
hook cleans on success. But only when handling a user actually owned by homed,
it did not clean it up when skipping because it is being invoked on a system
user.
We need to be careful with sd-bus here as pam_sm_open_session is the last
hook before forking, and we want to clean up sd-bus before that happens, or
we'll have a broken reference (FDs are cloexec) in the child process, which
will then assert when attempting to close them, or leak the bus connection
which causes dbus to complain loudly:

 dbus-daemon[62]: [system] Connection has not authenticated soon enough, closing it (auth_timeout=30000ms, elapsed: 30020ms)
2023-04-24 14:18:22 +02:00
Franck Bui 3c7012cdda localed-util: make use of strdupcspn() 2023-04-24 14:12:58 +02:00
Luca Boccassi 54e4b42fde stub: add comment on measurement of io.systemd.stub.kernel-cmdline-extra 2023-04-24 11:04:50 +01:00
Zbigniew Jędrzejewski-Szmek 2ed7a221fa run: expand variables also with --scope
This makes syntax be the same for commands which are started by the manager and
those which are spawned directly (when --scope is used).

Before:
$ systemd-run -q -t echo '$TERM'
xterm-256color

$ systemd-run -q --scope echo '$TERM'
$TERM

Now:
$ systemd-run -q --scope echo '$TERM'
xterm-256color

Previous behaviour can be restored via --expand-environment=no:
$ systemd-run -q --scope --expand-environment=no echo '$TERM'
$TERM

Fixes #22948.

At some level, this is a compat break. Fortunately --scope is not very widely
used, so I think we can get away with this. Having different syntax depending
on whether --scope was used or not was bad UX.

A NEWS entry will be required.
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek f872ddd182 run: add --expand-environment=no to disable server-side envvar expansion
This uses StartExecEx to get the equivalent of ExecStart=:. StartExecEx was
added in b3d593673c, so this will not work with
older systemds.

A hint is emitted if we get an error indicating lack of support. PID1 returns
SD_BUS_ERROR_PROPERTY_READ_ONLY, but I'm checking for
SD_BUS_ERROR_UNKNOWN_PROPERTY too for safety.
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek b58026bddc run: split out creation of unit creation messages
Just refactoring, in preparation for future changes.
(Though I think it'd be reasonable to do anyway, those functions were
awfully long.)

'git diff' displays this badly. The middle part of start_transient_service()
is moved to make_transient_service_unit(), and the middle part of
start_transient_trigger() is moved to make_transient_trigger_unit().
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek ac9a75d05e run: simplify returning of status
start_transient_service() would return two ints: one normally and one via
*retval. We can just return one int and propagate it directly, because we
use DEFINE_MAIN_FUNCTION_WITH_POSITIVE_FAILURE().
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek 0a27d86a3f core: fix writing of ExecStartEx and friends
The property name is called ExecStartEx, but we have to write it as ExecStart=
in the unit file. :(
Bug introduced in b3d593673c when ex-properties
were initially added.

In addition, we cannot escape $ as $$, because when ":" is used, we wouldn't
unescape $$ back to $.
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek 8c41640a71 core/unit: add UNIT_ESCAPE_EXEC_SYNTAX
Unfortunately we can't escape $ when ':' is used to prohibit variable expansion:
  ExecStart=:echo $$
is not the same as
  ExecStart=:echo $

This just adds the functionality and the unittests, without using it anywhere
for real yet.
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek f3af629050 core/unit: rename UNIT_ESCAPE_EXEC_SYNTAX → *_ENV
In preparation for future changes.
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek e7416db183 core/unit: fix shell-escaping of strings
Our escaping of '$' is '$$', not '\$'. We would write unit files that
were not valid:
  $ systemd-run --user bash -c 'echo $$; sleep 1000'
  Running as unit: run-r1c7c45b5b69f487c86ae205e12100808.service
  $ systemctl cat --user run-r1c7c45b5b69f487c86ae205e12100808
  # /run/user/1000/systemd/transient/run-r1c7c45b5b69f487c86ae205e12100808.service
  ...
  ExecStart="/usr/bin/bash" "-c" "echo \$\$\; sleep 1000"

  $ systemd-analyze verify /run/user/1000/systemd/transient/run-r1c7c45b5b69f487c86ae205e12100808.service
  /run/user/1000/systemd/transient/run-r1c7c45b5b69f487c86ae205e12100808.service:7:
    Ignoring unknown escape sequences: "echo \$\$\; sleep 1000"

Similarly, ';' cannot be escaped as '\;'. Only a handful of characters
listed in "Supported escapes" is allowed.

Escaping of "'" can be done, but it's not useful because we use double quotes
around the string anyway whenever we do escaping.

unit_write_setting() is called all over the place. In a great majority of
places we write either fixed strings or something that we generate ourselves,
so no escaping or quoting is needed. (And it's not allowed, e.g.
'Type="oneshot"' would not work.)  But if we forgot to add escaping or quoting
for a free-style string, it would probably allow writing a unit file that would
be read completely wrong. I looked over various places where
unit_write_setting() is called, and I couldn't find any place where
quoting/escaping was forgotten. But trying to figure out the full
ramifications of this change is not easy.
2023-04-24 10:02:30 +02:00
Zbigniew Jędrzejewski-Szmek a12bc99ef0 basic/logarithm: add popcount() wrapper
__builtin_popcount() is a bit of a mouthful, so let's provide a helper.
Using _Generic has the advantage that if a type other then the ones on
the list is given, compilation will fail. This is nice, because if by any
change we pass a wider type, it is rejected immediately instead of being
truncated.

log.h is also needed. It is included transitively, but let's include it
directly.

macro.h is *not* needed.
2023-04-24 10:02:30 +02:00
Daan De Meyer 750d9859c1 sulogin-shell: Start initrd.target on exit in the initrd
sulogin is documented to continue booting up on exit. To do that
in the initrd, we need to start initrd.target and not default.target.
2023-04-21 16:46:06 +02:00
David Tardon 596b44b178 test: use _cleanup_ for temp. files 2023-04-21 16:44:05 +02:00
David Tardon 925c51d95c test-fdset: use _cleanup_ 2023-04-21 16:29:15 +02:00
David Tardon 39dcab9062 test: shorten a bit 2023-04-21 16:29:15 +02:00
Lennart Poettering 4560d99e5e tre-wide: use FORMAT_DEVNUM() a bit more 2023-04-21 12:45:49 +02:00
Lennart Poettering 67458536af tree-wide: convert more cases do DEVNUM_FORMAT_STR()/DEVNUM_FORMAT_VAL()
Let's use our nice macros a bit more.

(Not comprehensive)
2023-04-21 12:41:15 +02:00
Luca Boccassi 21453b8b4b
Merge pull request #27349 from mrc0mmand/codespell
tree-wide: code spelling fixes
2023-04-20 22:02:17 +01:00
Frantisek Sumsal 94d82b5980 tree-wide: code spelling fixes
As reported by Fossies.
2023-04-20 21:54:59 +02:00
Zbigniew Jędrzejewski-Szmek 08c2f9c626 detect-virt: add message at debug level
Normal users do not have permissions to access /proc/1/root, so
'systemd-detect-virt -r' fails, but the output, even at debug level
is cryptic:

$ SYSTEMD_LOG_LEVEL=debug build/systemd-detect-virt -r
Failed to check for chroot() environment: Permission denied

Let's make this a bit easier to figure out:

$ SYSTEMD_LOG_LEVEL=debug build/systemd-detect-virt -r
Cannot stat /proc/1/root: Permission denied
Failed to check for chroot() environment: Permission denied

I looked over other users of files_same(), and I think in general the message
at debug level is OK for them too.
2023-04-21 03:20:24 +08:00
Gustavo Noronha Silva 6b8e90545e Apply known iocost solutions to block devices
Meta's resource control demo project[0] includes a benchmark tool that can
be used to calculate the best iocost solutions for a given SSD.

  [0]: https://github.com/facebookexperimental/resctl-demo

A project[1] has now been started to create a publicly available database
of results that can be used to apply them automatically.

  [1]: https://github.com/iocost-benchmark/iocost-benchmarks

This change adds a new tool that gets triggered by a udev rule for any
block device and queries the hwdb for known solutions. The format for
the hwdb file that is currently generated by the github action looks like
this:

  # This file was auto-generated on Tue, 23 Aug 2022 13:03:57 +0000.
  # From the following commit:
  # ca82acfe93
  #
  # Match key format:
  # block:<devpath>:name:<model name>:

  # 12 points, MOF=[1.346,1.346], aMOF=[1.249,1.249]
  block:*:name:HFS256GD9TNG-62A0A:fwver:*:
    IOCOST_SOLUTIONS=isolation isolated-bandwidth bandwidth naive
    IOCOST_MODEL_ISOLATION=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
    IOCOST_QOS_ISOLATION=rpct=0.00 rlat=8807 wpct=0.00 wlat=59023 min=100.00 max=100.00
    IOCOST_MODEL_ISOLATED_BANDWIDTH=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
    IOCOST_QOS_ISOLATED_BANDWIDTH=rpct=0.00 rlat=8807 wpct=0.00 wlat=59023 min=100.00 max=100.00
    IOCOST_MODEL_BANDWIDTH=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
    IOCOST_QOS_BANDWIDTH=rpct=0.00 rlat=8807 wpct=0.00 wlat=59023 min=100.00 max=100.00
    IOCOST_MODEL_NAIVE=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
    IOCOST_QOS_NAIVE=rpct=99.00 rlat=8807 wpct=99.00 wlat=59023 min=75.00 max=100.00

The IOCOST_SOLUTIONS key lists the solutions available for that device
in the preferred order for higher isolation, which is a reasonable
default for most client systems. This can be overriden to choose better
defaults for custom use cases, like the various data center workloads.

The tool can also be used to query the known solutions for a specific
device or to apply a non-default solution (say, isolation or bandwidth).

Co-authored-by: Santosh Mahto <santosh.mahto@collabora.com>
2023-04-20 16:45:57 +02:00
Lennart Poettering 18010d394b
Merge pull request #27327 from DaanDeMeyer/hotplug
kmod-setup: Add early loading for virtio_console
2023-04-20 16:34:12 +02:00
Daan De Meyer a93aaede29 kmod-setup: Add early loading for virtio_console
getty-generator enables serial-getty@.service for virtualizer consoles
that it can find in /sys/class/tty. To make sure this works for
virtio consoles, let's make sure we load the module is loaded early
so that the /sys/class/tty/hvc0 exists before we run getty-generator.
2023-04-20 13:43:37 +02:00
Daan De Meyer d2f57745d5 core: Parse logging environment earlier
Let's make sure we parse the logging environment ASAP so that the
options apply to more code. e.g. to allow debugging kmod-setup.c
for example.
2023-04-20 13:43:37 +02:00
Daan De Meyer e1d8f702a2 kmod-setup: Introduce match_modalias_recurse_dir_cb()
Let's make the logic around matching a modalias a bit more generic.
2023-04-20 13:43:37 +02:00
Daan De Meyer 70cc7ed97e string-util: Add startswith_strv()
This is the function version of STARTSWITH_SET(). We also move
STARTSWITH_SET() to string-util.h as it fits more there than in
strv.h and reimplement it using startswith_strv().
2023-04-20 13:43:37 +02:00
Daan De Meyer 3fe07e9525 log: Log when kmsg is being ratelimited
Let's avoid confusing developers and users when log messages suddenly
stop getting logged to kmsg because of ratelimiting by logging an
additional message if we start ratelimiting log messages to kmsg.
2023-04-20 13:43:36 +02:00
Daan De Meyer 8750a06b6c log: Add knob to disable kmsg ratelimiting
This allows us to disable kmsg ratelimiting in the integration tests
and mkosi for easier debugging.
2023-04-20 13:43:34 +02:00
Lennart Poettering 14ce246771 dissect: let's check for crypto_LUKS before fstype allowlist check
When trying to mount a partition that is encrypted without the
encryption first having been set up we want to return a
recognizable error (EUNATCH). This was broken by
80ce8580f5 which added an allowlist check
for permissible file systems first. Let's reverse the check order, so
that we get EUNATCH again, as before. (And leave EIDRM as error for the
failed allowlist check).
2023-04-20 13:39:28 +02:00
Lennart Poettering ed6a6bac45 ratelimit: handle counter overflows somewhat sanely
An overflow here (i.e. the counter reaching 2^32 within a ratelimit time
window) is not so unlikely. Let's handle this somewhat sanely
and simply stop counting, while remaining in the "limit is hit" state until
the time window has passed.
2023-04-20 13:39:06 +02:00
Lennart Poettering 4d49f44f0f dissect-image: issue BLKFLSBUF before probing an fs at block device offset != 0
See added code comment for a longer explanation. TLDR: Linux maintains
distinct block device caches for partition and "whole" block devices,
and a simply BLKFLSBUF should make the worst confusions this causes go
away.
2023-04-20 13:38:32 +02:00
Robert Meijers 4646cdaa37 networkd: fallback to chaddr for static lease lookup when not found
DHCP static leases are looked up by the client identifier as send by
the client, while configured based on MAC. As RFC 2131 states the client
identifier is an opaque key and must not be interpreted by the server
this means that DHCP clients can (/will) also use a client identifier
which is not a MAC address. One of these clients actually is
systemd-networkd which uses an RFC 4361 by default to generate the
client identifier. For these kind of DHCP clients static leases thus
don't work because of this mismatch between configuring a MAC address
but the server matching based on client identifier. This adds a fallback
to try to look up a configured static lease based on the "chaddr" of the
DHCP message as this will always contain the MAC address of the client.

Fixes #21368
2023-04-20 19:18:50 +09:00
Yu Watanabe 114e85d28e core/device: rewrite how device unit is removed from Manager.devices_by_sysfs
If the device unit is not the head of the list saved in
Manager.devices_by_sysfs, then it is not necessary to replace the
existing hashmap entry. This should not change any behavior, just
refactoring.
2023-04-20 09:22:25 +02:00
Yu Watanabe 24a5370bbc list: fix double evaluation 2023-04-20 09:20:08 +02:00
Daan De Meyer 59e4eeed78
Merge pull request #27299 from yuwata/chase-absolute
chase: return absolute path when dir_fd points to the root directory
2023-04-20 09:19:22 +02:00
Yu Watanabe cb3c6aec3a core: add one missing assertion for release_resource_queue
Follow-up for 6ac62d61db.
2023-04-19 21:12:08 +01:00
Quintin Hill 0214ead6ee dissect-image: fix log level in dissect_log_error
Actually use the log_level argument in this function!

Fixes 4953e39
2023-04-20 02:04:15 +08:00
Yu Watanabe 60e761d8f3 chase: replace path_prefix_root_cwd() with chaseat_prefix_root()
The function path_prefix_root_cwd() was introduced for prefixing the
result from chaseat() with root, but
- it is named slightly generic,
- the logic is different from what chase() does.

This makes the name more explanative and specific for the result of the
chaseat(), and make the logic consistent with chase().

Fixes https://github.com/systemd/systemd/pull/27199#issuecomment-1511387731.

Follow-up for #27199.
2023-04-19 03:38:59 +09:00
Yu Watanabe 8d3c49b168 fd-util: skip to check mount ID if kernel is too old and /proc is not mounted
Now, dir_fd_is_root() is heavily used in chaseat(), which is used at
various places. If the kernel is too old and /proc is not mounted, then
there is no way to get the mount ID of a directory. In that case, let's
silently skip the mount ID check.

Fixes https://github.com/systemd/systemd/pull/27299#issuecomment-1511403680.
2023-04-19 03:38:47 +09:00
Yu Watanabe 4b1e461c49 mountpoint-util: check /proc is mounted on failure 2023-04-19 03:28:34 +09:00
Yu Watanabe 9a0dcf03fa chase: prefix with the root directory only when it is not "/" 2023-04-19 03:28:34 +09:00
Yu Watanabe 237bf933de chase: drop repeated call of empty_to_root() 2023-04-19 03:28:34 +09:00
Yu Watanabe b3ef56bc8e chase: update outdated comment about result path 2023-04-19 03:28:34 +09:00
Yu Watanabe 24be89ebd8 chase: make the result absolute when a symlink is absolute
As the path may be outside of the specified dir_fd.
2023-04-19 03:28:34 +09:00
Yu Watanabe c0552b359c chase: make chaseat() provides absolute path also when dir_fd points to the root directory
Usually, we pass the file descriptor of the root directory to chaseat()
when `--root=` is not specified. Previously, even in such case, the
result was relative, and we need to prefix the path with "/" when we
want to pass the path to other functions that do not support dir_fd, or
log or show the path. That's inconvenient.
2023-04-19 03:28:34 +09:00
Mike Yuan d81fc15254
Merge pull request #27323 from keszybz/gpt-auto-generator-warning-cleanup
gpt-auto-generator: do not error out when no partitions are found
2023-04-19 02:06:06 +08:00
Zbigniew Jędrzejewski-Szmek 4953e39c70 gpt-auto-generator: "translate" errno codes into proper messages
E.g. in logs on jammy-ppc64el in https://github.com/systemd/systemd/pull/27294:
Apr 16 17:42:50 H systemd-gpt-auto-generator[300]: Failed to dissect partition table of block device /dev/sda: No message of desired type
Apr 16 17:42:50 H (sd-execu[295]: /usr/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.

ee0e6e476e made this particular condition not an
error. But for other errnos we want to print a better message too.
dissect_loop_device_and_warn() already does this, but it always prints the
error at error level. We want to suppress some of the errors, so let's make the
print helper public and do the error suppression in the caller.
2023-04-18 11:58:33 +02:00
Zbigniew Jędrzejewski-Szmek de47cd0610 fstab-generator: add missing phrase in comment 2023-04-18 11:55:03 +02:00
Lennart Poettering 0a5d3c0b5b kmod-setup: bypass heavy virtio-rng check if we are not running in a VM anyway
detect_vm() is cheap, because cached, let's hence do that early before
we get out the big guns and sweep through sysfs.
2023-04-18 10:52:04 +02:00
Lennart Poettering fa505db314 kmod-setup: use STARTSWITH_SET() where appropriate 2023-04-18 10:51:00 +02:00
Lennart Poettering ff707dd1b1 Revert "getty-generator: Use device hotplug to instantiate virtualizer consoles"
This reverts commit e7e6ce5f8d.
2023-04-18 10:38:38 +02:00
Lennart Poettering 766c30a3b5
Merge pull request #27256 from medhefgo/boot-rdtsc
boot: Improve timer frequency detection
2023-04-18 10:38:15 +02:00
Yu Watanabe ee0e6e476e gpt-auto: do not fail when no suitable partitions found
Follow-up for 598fd4da1c.
2023-04-18 17:37:56 +09:00
Daan De Meyer e7e6ce5f8d getty-generator: Use device hotplug to instantiate virtualizer consoles
If getty-generator runs in the initrd, the corresponding tty might not
have been instantiated yet in /dev, which means a serial getty is not
spawned on it. Instead, let's instantiate the serial-getty when the
device appears so that it always gets instantiated.
2023-04-18 09:35:14 +02:00
Lennart Poettering b3a062cb80 lsm-util: move detection of support of LSMs into a new lsm-util.[ch] helper
This makes the bpf LSM check generic, so that we can use it elsewhere.
it also drops the caching inside it, given that bpf-lsm code in PID1
will cache it a second time a stack frame further up when it checks for
various other bpf functionality.
2023-04-18 08:22:21 +02:00
Dominique Martinet 25d9c6cdaf bpf-firewall: give a name to maps used
Running systemd with IP accounting enabled generates many bpf maps (two
per unit for accounting, another two if IPAddressAllow/Deny are used).

Systemd itself knows which maps belong to what unit and commands like
`systemctl status <unit>` can be used to query what service has which
map, but monitoring these values all the time costs 4 dbus requests
(calling the .IP{E,I}gress{Bytes,Packets} method for each unit) and
makes services like the prometheus systemd_exporter[1] somewhat slow
when doing that for every units, while less precise information could
quickly be obtained by looking directly at the maps.

Unfortunately, bpf map names are rather limited:
- only 15 characters in length (16, but last byte must be 0)
- only allows isalnum(), _ and . characters

If it wasn't for the length limit we could use the normal unit escape
functions but I've opted to just make any forbidden character into
underscores for maximum brievty -- the map prefix is also rather short:
This isn't meant as a precise mapping, but as a hint for admins who want
to look at these.

(Note there is no problem if multiple maps have the same name)

Link: https://github.com/povilasv/systemd_exporter [1]
2023-04-18 08:23:55 +09:00
Lennart Poettering 38cdd08b22 process-util: be more careful with pidfd_get_pid() special cases
Let's be more careful with generating error codes for (expected) error
causes.

This does not introduce new error conditions, it just changes what we
return under specific cases, to make things nicely recognizable in each
case. Most importantly this detects if fdinfo reports a pid of "-1" for
pidfds with processes that are already reaped (and thus have no PID
anymore)

None of our current users care about these error codes, but let's get
this right for the future.
2023-04-17 21:38:41 +01:00
Florian Klink 360c9cdc65 fsck: use execv_p_ and execl_p_
Instead of invoking find_executable on our own, use the variants of exec
provided by glibc which does this for us.
2023-04-17 19:56:06 +01:00
Luca Boccassi c9210b7470 creds: make available to all ExecStartPre= and ExecStart= processes
Fixes https://github.com/systemd/systemd/issues/27275
2023-04-17 17:47:28 +01:00
jcg 1034dfd0d8 user-util:remove duplicate includes 2023-04-17 23:58:04 +08:00
Benjamin Herrenschmidt aab896e213 virt: Further improve detection of EC2 metal instances
Commit f90eea7d18
virt: Improve detection of EC2 metal instances

Added support for detecting EC2 metal instances via the product
name in DMI by testing for the ".metal" suffix.

Unfortunately this doesn't cover all cases, as there are going to be
instance types where ".metal" is not a suffix (ie, .metal-16xl,
.metal-32xl, ...)

This modifies the logic to also allow those new forms.

Signed-off-by: Benjamin Herrenschmidt <benh@amazon.com>
2023-04-17 13:21:11 +01:00
Luca Boccassi ad7793b59c
Merge pull request #27298 from mrc0mmand/test-async-tweaks
test: modernize test-async a bit
2023-04-16 23:32:33 +01:00
Yu Watanabe 2cd04086ee process-util: make safe_fork() unset $NOTIFY_SOCKET
Propagating $NOTIFY_SOCKET is typically dangerous. Let's unset it unless
explicitly requested to keep it.

Fixes #27288.
Replaces #27291.
2023-04-17 05:46:32 +08:00
Frantisek Sumsal 3d9c3b7e89 test: modernize test-async a bit
Mainly to give it some debug output to, hopefully, see why it sometimes
gets stuck in CI when run with sanitizers.
2023-04-16 20:30:58 +02:00
Yu Watanabe 8521338f95 exec-util: make execute_strv() optionally take root directory
Preparation for rewriting kernel-install in C.
2023-04-16 19:40:12 +09:00
Yu Watanabe f384ce1187
Merge pull request #27283 from mrc0mmand/assorted-test-tweaks
test: a bunch of assorted tweaks, Saturday edition
2023-04-16 19:39:58 +09:00
Yu Watanabe d8e75260e9
Merge pull request #27253 from yuwata/cmsg-find-and-copy-data
socket-util: introduce CMSG_FIND_AND_COPY_DATA()
2023-04-16 16:28:26 +09:00
Frantisek Sumsal 841834d9c3 test: add a couple of tests with invalid UTF-8 characters 2023-04-16 09:21:13 +02:00
Frantisek Sumsal 192242c986 test: add a simple test for getenv_path_list() 2023-04-16 09:21:13 +02:00
Frantisek Sumsal 10a9466135 test: add a simple test for secure-bits stuff 2023-04-16 09:21:13 +02:00
Frantisek Sumsal 1b2719c2c5 shared: add a missing include 2023-04-16 09:21:13 +02:00
Frantisek Sumsal 9f7fcf80ad test: add tests for uuid/uint64 specifiers
They're used in repart, but are not part of the "common" specifier
lists, so cover them explicitly.
2023-04-16 09:21:13 +02:00
Yu Watanabe b5d39bb3ca tree-wide: also use CMSG_TYPED_DATA() on writing message header 2023-04-16 13:26:58 +09:00
Yu Watanabe 1ebb0953f0 sd-dhcp-server: use CMSG_FIND_DATA() at one more place 2023-04-16 13:26:58 +09:00
Yu Watanabe 789f5c6f70 tree-wide: copy timestamp data from cmsg
On RISCV32, time_t is 64bit and size_t is 32bit, hence the timestamp
data in message header may not be aligned.

Fixes #27241.
2023-04-16 13:26:58 +09:00
Yu Watanabe 4836f4c67d socket-util: introduce CMSG_FIND_AND_COPY_DATA()
The cmd(3) man page says about CMSG_DATA():
> The pointer returned cannot be assumed to be suitably aligned for
> accessing arbitrary payload data types. Applications should not cast
> it to a pointer type matching the payload, but should instead use
> memcpy(3) to copy data to or from a suitably declared object.

Hence, if we want to use unaligned data in cmsg, we need to copy it
before use. That's typically important for reading timestamps in
RISCV32, as the time_t is 64bit and size_t is 32bit on the system.
2023-04-16 13:26:55 +09:00
Frantisek Sumsal cb68860ece test: add a test case for table_dup_cell()
Also, sneak in coverage for "less popular" cell types.
2023-04-15 23:36:40 +02:00
Florian Klink a108fcbace fsck: look for fsck binary not just in /sbin
This removes remaining hardcoded occurences of `/sbin/fsck`, and instead
uses `find_executable` to find `fsck`.

We also use `fsck_exists_for_fstype` to check for the `fsck.*`
executable, which also checks in `$PATH`, so it's fair to assume fsck
itself is also available.
2023-04-15 10:29:50 +01:00
Daan De Meyer e77e07f601 preset: Add ignore directive
The ignore directive specifies to not do anything with the given
unit and leave existing configuration intact. This allows distributions
to gradually adopt preset files by shipping a ignore * preset file.
2023-04-14 20:27:59 +01:00
Luca Boccassi 3e5b771755
Merge pull request #27269 from poettering/statx-dont-sync
mountpoint-util: don't go to the network when doing statx() to detect mountpoints/mnt_id
2023-04-14 16:23:51 +01:00
Lennart Poettering d791013ff5 string-util: add strstrafter()
strstrafter() is like strstr() but returns a pointer to the first
character *after* the found substring, not on the substring itself.
Quite often this is what we actually want.

Inspired by #27267 I think it makes sense to add a helper for this,
to avoid the potentially fragile manual pointer increment afterwards.
2023-04-14 16:56:15 +02:00
Daan De Meyer bb7b1da8fe
Merge pull request #27252 from yuwata/chase-mkdir
chase: refuse CHASE_MKDIR_0755 without CHASE_NONEXISTENT or CHASE_PARENT
2023-04-14 15:19:57 +02:00
Luca Boccassi 4d67245472
Merge pull request #27266 from dtardon/take-struct
Use TAKE_STRUCT() to copy and reset structs
2023-04-14 14:15:35 +01:00
Lennart Poettering d230d4770d mountpoint-util: use memcmp_nn() where appropriate 2023-04-14 13:15:39 +02:00
Lennart Poettering 524ea5852a mountpoint-util: fix hosed overflow check
The overflow check was hosed in two ways: overflows in C are undefined,
hence gcc was free to just optimize the whole thing away. We need to
catch overflows before we run into them, not after.

It checked for an overflow against size_t, but the field we need to
write this in is unsigned. i.e. typically 32bit rather than 64bit. Hence
check for the right maximum.

(The whole check is paranoia anyway, the kernel really shouldn't return
values that would induce an overflow, but you never know, the syscall
turned out to be problematic in so many other ways, hence let's stick to
this.)
2023-04-14 13:15:39 +02:00
Lennart Poettering 92851defbd mountpoint-util: pass AT_STATX_DONT_SYNC to statx() when looking for mnt_id/mountpoints
The concept of a "mount" is a local one, hence there's no point in going
to the network to retrieve mnt_id or STATX_ATTR_MOUNT_ROOT. Hence set
AT_STATX_DONT_SYNC so that the call will not go to the network ever, and
risk deadlocking on that.

Just some extra safety.
2023-04-14 13:15:35 +02:00
David Tardon f52477d611 install: use FOREACH_ARRAY 2023-04-14 10:24:07 +02:00
David Tardon 05cdf6a701 tree-wide: rename cleanup function
... with accordance to the current coding style.
2023-04-14 10:24:07 +02:00
David Tardon 52c788e6e0 install: fix memory leak if GREEDY_REALLOC() fails 2023-04-14 10:23:15 +02:00
David Tardon cfc28ee232 tree-wide: add some asserts 2023-04-14 10:16:01 +02:00
David Tardon 088d71f8ed tree-wide: use TAKE_STRUCT 2023-04-14 10:15:44 +02:00
Yu Watanabe 4ea0bcb922 chase: CHASE_MKDIR_0755 requires CHASE_NONEXISTENT and/or CHASE_PARENT
When CHASE_MKDIR_0755 is specified without CHASE_NONEXISTENT and
CHASE_PARENT, then chase() succeeds only when the file specified by
the path already exists, and in that case, chase() does not create
any parent directories, and CHASE_MKDIR_0755 is meaningless.

Let's mention that CHASE_MKDIR_0755 needs to be specified with
CHASE_NONEXISTENT or CHASE_PARENT, and adds a assertion about that.
2023-04-14 16:36:13 +09:00
Yu Watanabe 5a2f674a00 chase: use FLAGS_SET() macro 2023-04-14 16:28:54 +09:00
Yu Watanabe 1113e50796 tree-wide: replace __alignof__() with alignof()
Addresses https://github.com/systemd/systemd/pull/27254#discussion_r1165267046.
2023-04-14 14:39:06 +09:00
Yu Watanabe 4db752e4aa socket-util: add one missing paren
Follow-up for b6256af75e.
2023-04-14 13:49:35 +09:00
Yu Watanabe 924937cbc0 timesync: drop unnecessary initialization 2023-04-14 13:49:35 +09:00
Yu Watanabe 13524b29a2
Merge pull request #27254 from poettering/cmsg-align-check
socket-util: tighten CMSG_TYPED_DATA() alignment checks
2023-04-14 13:49:04 +09:00
Luca Boccassi 2cba2fcd25
Merge pull request #27144 from enr0n/fix-scope-timer-on-coldplug
scope: do not disable timer event source when state is SCOPE_RUNNING
2023-04-14 00:25:06 +01:00
Luca Boccassi 6ef721cbc7 user units: implicitly enable PrivateUsers= when sandboxing options are set
Enabling these options when not running as root requires a user
namespace, so implicitly enable PrivateUsers=.
This has a side effect as it changes which users are visible to the unit.
However until now these options did not work at all for user units, and
in practice just a handful of user units in Fedora, Debian and Ubuntu
mistakenly used them (and they have been all fixed since).

This fixes the long-standing confusing issue that the user and system
units take the same options but the behaviour is wildly (and sometimes
silently) different depending on which is which, with user units
requiring manually specifiying PrivateUsers= in order for sandboxing
options to actually work and not be silently ignored.
2023-04-13 21:33:48 +01:00
Luca Boccassi ce963a747f
Merge pull request #27244 from bluca/uphold_retry
Uphold/StopWhenUnneeded/BindsTo: add retry timer on rate limit
2023-04-13 21:33:06 +01:00
Mike Yuan 6b7f150bbf core/main: fix a typo for --log-target
Follow-up for d2ebd50d7f

Fixes #27105
2023-04-13 21:29:35 +01:00
Nick Rosbrook e1f85b49b0 scope: do not disable timer event source when state is SCOPE_RUNNING
In scope_set_state(), the timer event source may be disabled depending
on the state. Currently, it will be disabled when the state is
SCOPE_RUNNING. This has the effect of new RuntimeMaxSec values being
ignored on coldplug.

Note that this issue is not currently present when scopes are started
because when scope_start() is called, scope_arm_timer() is called after
scope_set_state().
2023-04-13 14:34:41 -04:00
Luca Boccassi 0607a9f9da systemd-confext: mount confexts as noexec and nosuid
Confexts should not contain code, so mount confexts with noexec.
We cannot mount invidial extensions as noexec, as the overlay ignores
it and bypasses it, we need to use the flag on the whole overlay for
it to be effective.
But given there are legacy scripts still shipped in /etc, allow to
override it with --noexec=false.
2023-04-14 01:21:48 +08:00
Jan Janssen 2a3ae5fae0 boot: Use CPUID to detect TSC frequency
Aside from being more accurate on CPUs that report the information this
is also orders of magnitude faster than sleeping for 1ms.
2023-04-13 15:39:32 +02:00
Jan Janssen 706fd67e4a boot: Rework timer frquency reading
This is in preparation for the next commit.
2023-04-13 15:39:14 +02:00
Jan Janssen 09614b35c0 boot: Use compiler intrinsic for TSC 2023-04-13 15:36:27 +02:00
Luca Boccassi 4c7a0fc8d0 Uphold/StopWhenUnneeded/BindsTo: requeue when job finishes
When a unit is upheld and fails, and there are no state changes in
the upholder, it will not be retried, which is against what the
documentation suggests.

Requeue when the job finishes. Same for the other two queues.
2023-04-13 13:28:25 +01:00
OMOJOLA JOSHUA DAMILOLA 96ead603b8 systemd-cryptenroll: add string aliases for tpm2 PCRs
Fixes #26697. RFE.
2023-04-13 12:08:32 +01:00
Yu Watanabe 85ba4ca8f6 test: add several assertions
Follow-up for 7947dbe322.

Fixes CID#1508781 and CID#1508783.
2023-04-13 11:57:29 +01:00
Lennart Poettering 796da645a0
Merge pull request #18789 from gportay/veritysetup-add-options-for-parity-with-cryptsetup-verity-utility
veritysetup: Add options for parity support with the cryptsetup's verity utility
2023-04-13 11:32:57 +02:00
Yu Watanabe 06e78680e3 image-policy: introduce parse_image_policy_argument() helper
Addresses
84be0c710d (r1060130312),
84be0c710d (r1067927293), and
84be0c710d (r1067926416).

Follow-up for 84be0c710d.
2023-04-13 11:17:28 +02:00
Sjoerd Simons 771805eb44 repart: Discard from/to first/last usable lba
Repart considers the start and end of the usable space to the first multiple
of grainsz (at least 4096 bytes). However the first usable LBA of a GPT
partition is at sector 34 (512 bytes sectors) which is not a multiple of 4096.
The backup GPT label at the end also takes up 33 sectors, meaning the last
usable LBA is at 34 sectors from the end, unlikely to be a 4096 multiple as
well.

This meant that the very first and last sectors were never discarded. However
more problematically if an existing partition started before the first
usable grainsz multiple its start didn't get taken into account as a valid
starting point and got its data discarded.

Signed-off-by: Sjoerd Simons <sjoerd@collabora.com>
2023-04-13 11:12:52 +02:00
Lennart Poettering ca918f63b7 udev,sd-device: use CMSG_FIND_DATA() more 2023-04-13 10:49:23 +02:00
Lennart Poettering b1d0219136 tree-wide: port more code over to CMSG_TYPED_DATA() 2023-04-13 10:49:23 +02:00
Lennart Poettering 79dec6f5cc socket-util: tighten aignment check for CMSG_TYPED_DATA()
Apparently CMSG_DATA() alignment is very much undefined. Which is quite
an ABI fuck-up, but we need to deal with this. CMSG_TYPED_DATA() already
checks alignment of the specified pointer. Let's also check matching
alignment of the underlying structures, which we already can do at
compile-time.

See: #27241

(This does not fix #27241, but should catch such errors already at
compile-time instead of runtime)
2023-04-13 10:21:31 +02:00
Lennart Poettering 39857544ee
Merge pull request #27027 from dtardon/unit-file-list-cleanup
Use _cleanup_ for UnitFileList hash
2023-04-13 09:10:17 +02:00
Yu Watanabe 37734dc677 repart: always take BSD lock when whole block device is opened
Fixes #27236.
2023-04-13 09:07:00 +02:00
Lennart Poettering e8783d7620 pid1: add some debug logging when stashing ds into the fdstore 2023-04-13 06:44:27 +02:00
Lennart Poettering 81a1d6d679 service: rename service_close_socket_fd() → service_release_socket_fd()
Just to match service_release_stdio_fd() and service_release_fd_store()
in the name, since they do similar things.

This follows the concept that we "release" resources, and this is all
generically wrapped in "service_release_resources()".
2023-04-13 06:44:27 +02:00
Lennart Poettering 1ba84fef3c core: move runtime directory removal into release_resource handler
We already clear the various fds we keep from the release_resources()
handler, let's also destroy the runtime dir from there if this
preservation mode is selected.

This makes a minor semantic change: previously we'd keep a runtime
directory around if RuntimeDirectoryPreserve=restart is selected and at
least one JOB_START job was around. With this logic we'll keep it around
a tiny bit longer: as long as any job for the unit is around.
2023-04-13 06:44:27 +02:00
Lennart Poettering 99620f457e service: close fdstore asynchronously
The file descriptors we keep in the fdstore might be basically anything,
let's clean it up with our asynchronous closing feature, to not
deadlock on close().

(Let's also do the same for stdin/stdout/stderr fds, since they might
point to network services these days.)
2023-04-13 06:44:27 +02:00
Lennart Poettering 4fb8f1e883 service: allow freeing the fdstore via cleaning
Now that we have a potentially pinned fdstore let's add a concept for
cleaning it explicitly on user requested. Let's expose this via
"systemctl clean", i.e. the same way as user directories are cleaned.
2023-04-13 06:44:27 +02:00
Lennart Poettering b9c1883a9c service: add ability to pin fd store
Oftentimes it is useful to allow the per-service fd store to survive
longer than for a restart. This is useful in various scenarios:

1. An fd to some security relevant object needs to be stashed somewhere,
   that should not be cleaned automatically, because the security
   enforcement would be dropped then.

2. A user namespace fd should be allocated on first invocation and be
   kept around until the user logs out (i.e. systemd --user ends), á la
   #16328 (This does not implement what #16318 asks for, but should
   solve the use-case discussed there.)

3. There's interest in allow a concept of "userspace reboots" where the
   kernel stays running, and userspace is swapped out (i.e. all services
   exit, and the rootfs transitioned into a new version of it) while
   keeping some select resources pinned, very similar to how we
   implement a switch root. Thus it is useful to allow services to exit,
   while leaving their fds around till the very end.

This is exposed through a new FileDescriptorStorePreserve= setting that
is closely modelled after RuntimeDirectoryPreserve= (in fact it reused
the same internal type), since we want similar behaviour in the end, and
quite often they probably want to be used together.
2023-04-13 06:44:27 +02:00
Lennart Poettering c25fac9a17 service: rework how we release resources
Let's normalize how we release service resources, i.e. the three types
of fds we maintain for each service:

1. the fdstore
2. the socket fd for per-connection socket activated services
3. stdin/stdout/stderr

The generic service_release_resources() hook now calls into
service_release_fd_store() + service_close_socket_fd()
service_release_stdio_fd() one after the other, releasing them all for
the generic "release_resources" infra of the unit lifecycle.

We do no longer close the socket fd from service_set_state(), moving
this exclusively into service_release_resources(), so that all fds are
closed the same way.
2023-04-13 06:44:27 +02:00
Lennart Poettering 6ac62d61db service: release resources from a seperate queue, not unit_check_gc()
The per-unit-type release_resources() hook (most prominent use: to
release a service unit's fdstore once a unit is entirely dead and has no
jobs more) was currently invoked as part of unit_check_gc(), whose
primary purpose is to determine if a unit should be GC'ed. This was
always a bit ugly, as release_resources() changes state of the unit,
while unit_check_gc() is otherwise (and was before release_resources()
was added) a "passive" function that just checks for a couple of
conditions.

unit_check_gc() is called at various places, including when we wonder if
we should add a unit to the gc queue, and then again when we take it out
of the gc queue to dtermine whether to really gc it now. The fact that
these checks have side effects so far wasn't too problematic, as the
state changes (primarily: that services would empty their fdstores) were
relatively limited and scope.

A later patch in this series is supposed to extend the service state
engine with a separate state distinct from SERVICE_DEAD that is very
much like it but indicates that the service still has active resources
(specifically the fdstore). For cases like that the releasing of the
fdstore would result in state changes (as we'd then return to a classic
SERVICE_DEAD state).  And this is where the fact that the
release_resources() is called as side-effect becomes problematic: it
would mean that unit state changes would instantly propagate to state
changes elsewhere, though we usually want this to be done through the
run queue for coalescing and avoidance of recursion.

Hence, let's clean this up: let's move the release_resources() logic
into a queue of its own, and then enqueue items into it from the general
state change notification handle in unit_notify().
2023-04-13 06:44:27 +02:00
Lennart Poettering 47226e893b core: fix property getter method for NFileDescriptorStore bus property
Since da6053d0a7 this is a size_t, not an
unsigned. The difference doesn't matter on LE archs, but it matters on
BE (i.e. s390x), since we'll return entirely nonsensical data.

Let's fix that.

Follow-up-for: da6053d0a7

An embarassing bug introduced in 2018... That made me scratch my head
for way too long, as it made #27135 fail on s390x while it passed
everywhere else.
2023-04-13 06:41:27 +02:00
Gaël PORTAY 21c60c76e1 veritysetup: add support for fec options
The verity fec_* parameters allows to use Forward Error Correction to
recover from corruption if hash verification fails.

This adds the options fec_device, fec_offset and fec_roots (sixth
argument) which are the equivalent of the options --fec-device,
--fec-offset and --fec-roots in the veritysetup world.
 - fec-device=FILE
 - fec-offset=BYTES
 - fec-roots=UINT64

See `veritysetup(8)` for more details.
2023-04-13 05:39:49 +02:00
Gaël PORTAY 0bbf7a842a veritysetup: add support for superblock and underlying options
The verity parameter no_superblock allows to format/open an hash device
without the superblock. However, the superblock data must be set to open
the data-device.

This adds the option superblocks (sixth argument) and all the underlying
options which are implied to set the superblock manually if hash device
has no superblock:

 - superblock=BOOL
 - format=NUMBER (hash version type, 0 for original ChromeOS, 1 for
   modern)
 - data-block-size=BYTES (max page-size, multiple of 512)
 - hash-block-size=BYTES (max page-size, multiple of 512)
 - data-blocks=BLOCKS (size of data-device in blocks)
 - salt=HEXSTR (salt used at format, max 256 bytes)
 - uuid=UUID
 - hash=STR (algorithm name for dm-verity used at format, default is
   sha256)

See `veritysetup(8)` for more details.
2023-04-13 05:15:20 +02:00
Gaël PORTAY 14de7ef914 veritysetup: add support for hash-offset option
The verity parameter hash_area_offset allows to locate the superblock in
the hash device. It can be used to have a single device which contains
both data and hashes.

This adds the option hash-offset=BYTES (sixth argument) which is the
equivalent of the option --hash-offset in the veritysetup world.

See `veritysetup(8)` for more details.
2023-04-13 05:15:17 +02:00
David Schroeder 9c669abb71
pid1: fix coredump_filter setting
Correct what appears to be a copy/paste error in config_parse_exec_coredump_filter that is preventing the coredump_filter setting from working correctly.
2023-04-13 07:48:21 +08:00
Luca Boccassi 7223d500ac Uphold/StopWhenUnneeded/BindsTo: add retry timer on rate limit
The Upholds= promise is that as long as unit A is up and Upholds=B,
B will be activated if failed or inactive. But there is a hard-coded,
non-configurable rate limit for this, so add a timed retry after the
ratelimit has expired.

Apply to BindsTo= and StopWhenUnneeded= as well.
2023-04-12 21:49:48 +01:00
Lennart Poettering 112f27fdbf
Merge pull request #27153 from poettering/varlin-fd-pass
varlink: implement file descriptor passing
2023-04-12 20:34:01 +02:00
Mike Yuan 93ba4c1bc0
Merge pull request #27212 from DaanDeMeyer/notify-exit
core: Propagate exit status via notify socket when running in VM
2023-04-13 01:12:03 +08:00
Mike Yuan 7581da99a1
Merge pull request #27229 from poettering/dissect-policy-confext
dissect: follow-up for image policy merge
2023-04-13 00:14:30 +08:00
David Tardon 90570f6107 systemctl: fix a memory leak
valgrind systemctl is-enabled --root=/ -l default.target >/dev/null
==746041== Memcheck, a memory error detector
==746041== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==746041== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==746041== Command: systemctl is-enabled --root=/ -l default.target
==746041==
==746041==
==746041== HEAP SUMMARY:
==746041==     in use at exit: 8,251 bytes in 4 blocks
==746041==   total heap usage: 3,440 allocs, 3,436 frees, 1,163,346 bytes allocated
==746041==
==746041== LEAK SUMMARY:
==746041==    definitely lost: 24 bytes in 1 blocks
==746041==    indirectly lost: 35 bytes in 1 blocks
==746041==      possibly lost: 0 bytes in 0 blocks
==746041==    still reachable: 8,192 bytes in 2 blocks
==746041==         suppressed: 0 bytes in 0 blocks
==746041== Rerun with --leak-check=full to see details of leaked memory
==746041==
==746041== For lists of detected and suppressed errors, rerun with: -s
==746041== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
2023-04-12 17:13:52 +02:00
David Tardon 2768156357 install: rename function 2023-04-12 17:11:52 +02:00
David Tardon 27beea26ef install: drop unused function 2023-04-12 17:09:41 +02:00
David Tardon 1abcc826ca test: use _cleanup_ for UnitFileList hash 2023-04-12 17:09:41 +02:00
David Tardon 6ff02eac41 systemctl-list-unit-files: drop workaround for Coverity
This partially reverts commit 0da999fada .
2023-04-12 17:09:38 +02:00
David Tardon 0bd5a57a57 systemctl: drop stray assignment 2023-04-12 17:04:38 +02:00
David Tardon 6ecf4b7819 systemctl: use _cleanup_ for UnitFileList hash
This also fixes a memory leak in the old code.

valgrind systemctl -t socket --root=/ list-unit-files >/dev/null
==2601899== Memcheck, a memory error detector
==2601899== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==2601899== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==2601899== Command: systemctl -t socket --root=/ list-unit-files
==2601899==
==2601899==
==2601899== HEAP SUMMARY:
==2601899==     in use at exit: 39,984 bytes in 994 blocks
==2601899==   total heap usage: 344,414 allocs, 343,420 frees, 2,001,612,404 bytes allocated
==2601899==
==2601899== LEAK SUMMARY:
==2601899==    definitely lost: 7,952 bytes in 497 blocks
==2601899==    indirectly lost: 32,032 bytes in 497 blocks
==2601899==      possibly lost: 0 bytes in 0 blocks
==2601899==    still reachable: 0 bytes in 0 blocks
==2601899==         suppressed: 0 bytes in 0 blocks
==2601899== Rerun with --leak-check=full to see details of leaked memory
==2601899==
==2601899== For lists of detected and suppressed errors, rerun with: -s
==2601899== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
2023-04-12 17:03:55 +02:00
Yu Watanabe 8e1d6003fb
Merge pull request #27217 from yuwata/boot-entry-at
boot-entry: introduce _at() variant
2023-04-12 22:59:54 +09:00
Lennart Poettering db1f7c84ea varlink: honour "sensitive" flag of json variant objects all the way into the socket
Let's honour the flag if it is set, just to be safe.

(This only handles the case for the writing side: whenever the client
code hands us a json object with the flag set we'll honour it till the
it's out of reach for us. This does *not* handle the reading side, which
is left for a later patch once needed. We probably should add a
per-connection flag that simply globally enables the sensitive logic for
all messages coming in on a specific varlink conneciton.)
2023-04-12 15:14:21 +02:00
Lennart Poettering 7947dbe322 test: add varlink fd passing test 2023-04-12 15:14:21 +02:00
Lennart Poettering d37cdac6ce varlink: implement file descriptor passing
Let's add infrastructure to implement fd passing in varlink, when used
over AF_UNIX.

This will optionally associate one or more fds with a message sent via
varlink and deliver it to the server.
2023-04-12 15:14:21 +02:00
Lennart Poettering 790446bd6c varlink: add helper that clears the currently processed incoming message JSON object
Some minor refactoring. This adds a helper call whose only job is to
unref the JSON object of the currently processed incoming message.

This doesn't make too much sense on its own, given this just replaces
one line by another. However, in a later patch when we'll add fd passing
we'll extend the function to also destroy associated fds, and then it
will start to make more sense.
2023-04-12 15:14:21 +02:00
Lennart Poettering 8531631763 varlink: get rid of "reply" field
So far, if we do a synchronous varlink call from the client side via
varlink_call(), we'll
move the returned json object from "v->current" into "v->reply", and
keep it referenced there until the next call. We then return a pointer
to it. This ensures that the json object remains valid between two
varlink_call() invocations.

But the thing is, we don't need a separate field for that, we can just
leave the data in "v->current". This means VARLINK_IDLE_CLIENT state
will be permitted with and without v->current initialized. Initially,
after connection setup it will be set to NULL, but after the first
varlink_call() it will be set to the most recent response, pinning it
into memory.
2023-04-12 15:14:21 +02:00
Lennart Poettering a3861b4726 varlink: add some comments explaining what by various errors are defined 2023-04-12 15:14:21 +02:00
Daan De Meyer 88eec29d18 core: Send ERRNO= via notify socket on exit 2023-04-12 15:03:45 +02:00
Daan De Meyer 3a89cb84a6 core: Propagate exit status via notify socket when running in VM
When running in a container, we can propagate the exit status of
pid1 as usual via the process exit status. This is not possible
when running in a VM. Instead, let's send EXIT_STATUS=%i via the
notify socket if one is configured. The user running the VM can then
pick up the exit status from the notify socket after the VM has shut
down.
2023-04-12 15:03:43 +02:00
Daan De Meyer 623a00020f notify: Add EXIT_STATUS field
Whenever one of our tools or daemons exits, let's send the exit status
via sd-notify in the EXIT_STATUS field.
2023-04-12 15:02:34 +02:00
Lennart Poettering 4f25844a4b sysext: define a default image dissection policy for confext images 2023-04-12 14:54:44 +02:00
Lennart Poettering b151e69671 discover-image: bring discover path list up-to-date.
While merge 3af48a86d9 was for a working
PR it was based on an older version of git main. Let's catch up with the
search path changes from de862276ed.
2023-04-12 14:41:32 +02:00
Daan De Meyer 14cb10b737 Fix compilation error 2023-04-12 14:36:14 +02:00
Thierry Martin 2f091b1b49 nspawn: container network interface naming
systemd-nspawn now optionally supports colon-separated pair of
host interface name and container interface name for --network-macvlan, --network-ipvlan and --network-interface options.
Also supported in .nspawn configuration files (i.e Interface=, MACVLAN=, IPVLAN= parameters).

man page changed for ntwk interface naming
2023-04-12 14:28:43 +02:00
Lennart Poettering 3af48a86d9
Merge pull request #25608 from poettering/dissect-moar
dissect: add dissection policies
2023-04-12 13:46:08 +02:00
Luca Boccassi 068943453f
Merge pull request #27165 from poettering/fdstore-envvar
service: tell service processes that the fdstore is available via an e…
2023-04-12 12:13:43 +01:00
Yu Watanabe d2d969bb45 boot-entry: introduce boot_entry_token_ensure_at() 2023-04-12 19:47:34 +09:00
Yu Watanabe e61ab091b7
Merge pull request #27223 from dtardon/install-changes
Simplify use of bus_deserialize_and_dump_unit_file_changes()
2023-04-12 19:30:51 +09:00
Daan De Meyer ea24ed79f6
Merge pull request #27220 from yuwata/sd-device-follow-ups-for-devlink
sd-device: several follow-ups about devlink creation
2023-04-12 11:49:08 +02:00
Lennart Poettering 75b29fda71 service: tell service processes that the fdstore is available via an env var 2023-04-12 10:34:31 +02:00
David Tardon 234d964c2e systemctl: reduce variable scope 2023-04-12 09:53:55 +02:00
David Tardon 5e891cbb5c tree-wide: drop unneeded output params
Neither of the callers of bus_deserialize_and_dump_unit_file_changes()
touches the changes array, so let's simplify things and keep it internal
to the function.
2023-04-12 09:53:55 +02:00
Yu Watanabe f643ca1767
Merge pull request #27033 from dtardon/array-cleanup
Use CLEANUP_ARRAY more
2023-04-12 16:43:39 +09:00
Yu Watanabe fda18ce2b6 boot-entry: use chase_and_fopen_unlocked() to open /etc/kernel/entry-token
Otherwise, when 'root' is specified, the file may be a symlink to a host
file, and we may read wrong entry.
2023-04-12 16:23:03 +09:00
Yu Watanabe 70e4510805 sd-device: absolute devlink must start with /dev/
This also makes device node path is handled with the same logic.

Addresses https://github.com/systemd/systemd/pull/27169#discussion_r1162739511.

Follow-up for 2c5f119c3c.
2023-04-12 09:20:11 +09:00
Yu Watanabe 3b5fc5fb1b boot-entry: prioritize machine ID only when it is not randomly generated
Preparation for later commits. The parameter will be used in
kernel-install later.
2023-04-12 08:31:50 +09:00
Daan De Meyer 965b481d9b
Merge pull request #27214 from DaanDeMeyer/firstboot
firstboot: Use root directory file descriptor for everything
2023-04-11 22:30:09 +02:00
Tanishka fd7623193d Modified to use STRV_MAKE() in strv_env_name_is_valid() function listed in env-util.c 2023-04-11 21:05:22 +02:00
Mike Yuan 8a826a979a systemctl: suppress error for try-* if unit is masked
Closes #16521
2023-04-11 17:54:02 +01:00
Jan Janssen b87d6da447 boot: Fix alignment of long long inside structs on x86
On x86 EFI follows the windows ABI, which expects 8-byte aligned long
long. The x86 sysv ELF ABI expects them to be 8-byte aligned when used
alone, but 4-byte aligned when they appear inside of structs:

    struct S {
        int i;
        long long ll;
    };

    // _Static_assert(sizeof(struct S) == 12, "x86 sysv ABI");
    _Static_assert(sizeof(struct S) == 16, "EFI/MS ABI");

To get the behavior we need when building with sysv ELF ABI we need to
pass '-malign-double' to the compiler as done by EDK2.

This in turn will make ubsan unhappy as the stack may not be properly
aligned on entry, so we have to tell the compiler explicitly to re-align
the stack on entry to efi_main.

This fixes loading EFI drivers on x86 that were previously always
rejected as the EFI_LOADED_IMAGE_PROTOCOL had a wrong memory layout.

See also: https://github.com/rhboot/shim/pull/516
2023-04-11 17:09:18 +01:00
David Tardon f86a41291b portabled-image-bus: use CLEANUP_ARRAY 2023-04-11 16:32:48 +02:00
David Tardon a5290effe8 portabled-image-bus: use CLEANUP_ARRAY 2023-04-11 16:32:47 +02:00
David Tardon 2b4b01b00a portabled-image-bus: use CLEANUP_ARRAY 2023-04-11 16:32:46 +02:00
David Tardon 0dab8d5dc7 portabled-bus: use CLEANUP_ARRAY 2023-04-11 16:32:45 +02:00
David Tardon bd92527752 sd-bus: use _cleanup_ 2023-04-11 16:31:52 +02:00
David Tardon 04375b6213 sd-bus: use CLEANUP_ARRAY 2023-04-11 16:30:07 +02:00
David Tardon 29933daf9e execute: use CLEANUP_ARRAY 2023-04-11 16:25:07 +02:00
David Tardon 93404d340e execute: use more automatic cleanup 2023-04-11 16:16:33 +02:00
David Tardon ed8267c727 execute: use CLEANUP_ARRAY 2023-04-11 16:11:14 +02:00
David Tardon 608022a935 systemctl-set-default: use CLEANUP_ARRAY 2023-04-11 16:11:13 +02:00
David Tardon cc8fc3d3db systemctl-preset-all: shorten code a tiny bit 2023-04-11 16:11:11 +02:00
David Tardon ae9ff778cd systemctl-preset-all: use CLEANUP_ARRAY 2023-04-11 16:11:09 +02:00
David Tardon 9a57f69844 systemctl-enable: use CLEANUP_ARRAY 2023-04-11 16:11:06 +02:00
David Tardon aa1c1ba1d6 systemctl-add-dependency: shorten code a tiny bit 2023-04-11 16:11:02 +02:00
David Tardon a372f9f16b systemctl-add-dependency: use CLEANUP_ARRAY 2023-04-11 16:11:01 +02:00
David Tardon 1b544e323e portablectl: use CLEANUP_ARRAY 2023-04-11 16:08:00 +02:00
David Tardon 48a50accfe machinectl: do not repeat the same comparison 2023-04-11 15:34:13 +02:00
David Tardon 8df3e0eec5 machinectl: drop unneeded else 2023-04-11 15:33:33 +02:00
David Tardon ffddb3c945 machinectl: use CLEANUP_ARRAY 2023-04-11 15:31:50 +02:00
David Tardon 2a711edd87 dbus-manager: use CLEANUP_ARRAY 2023-04-11 15:28:36 +02:00
Daan De Meyer a0657479f5 firstboot: Use root directory file descriptor for everything
There were a few remaining cases where we used arg_root instead of
the root directory file descriptor. Let's port those over to use the
root directory file descriptor as well.
2023-04-11 15:22:08 +02:00
Daan De Meyer bd595c10e7 user-util: Add default_root_shell_at() 2023-04-11 15:21:51 +02:00
David Tardon f8888b9a3c dbus-manager: use CLEANUP_ARRAY 2023-04-11 15:08:03 +02:00
Daan De Meyer 73c43e96e7
Merge pull request #27186 from yuwata/os-release
os-util: several cleanups and introduce _at() variants of os-release parsers
2023-04-11 14:54:56 +02:00
Zbigniew Jędrzejewski-Szmek ba5a469648
Merge pull request #27169 from yuwata/udev-rule-refuse-unsafe-path
sd-device,udev: refuse unsafe path in SYMLINK= and TAG=
2023-04-11 14:43:50 +02:00
Yu Watanabe 538d878dbd os-util: introduce several _at() variants of os-release parsers 2023-04-11 18:49:45 +09:00
Yu Watanabe 5cf69e709e os-util: make $SYSTEMD_OS_RELEASE prefixed with the root directory
To make it consistent with other env vars, e.g. $SYSTEMD_ESP_PATH or
$SYSTEMD_XBOOTLDR_PATH.

This is useful when the root is specified by a file descriptor, instead
of a path.
2023-04-11 18:49:23 +09:00
Yu Watanabe f4a1d32c82 os-util: merge parse_{extension,os}_release() 2023-04-11 18:49:23 +09:00
Yu Watanabe 7ef43c78df os-util: invert order of arguments in extension release parser
For consistency with other functions.
Unfortunately, va_start() requires that the previous argument is a
pointer, hence the order of the arguments in the internal function
cannot be changed.
2023-04-11 18:49:23 +09:00
Yu Watanabe 61acfd8311 os-util: shorten temporal variable names
No functional change, just refactoring.
2023-04-11 18:49:20 +09:00
Yu Watanabe 59c4707594 os-util: log one more error cause 2023-04-11 18:48:58 +09:00
Yu Watanabe c9d64f8a2c os-util: do not use 'r' for storing loop status
The variable 'r' is usually used for storing return value of functional
call. Let's introduce another boolean to store the current loop status.

No functional change, just refactoring.
2023-04-11 18:48:58 +09:00
Yu Watanabe 7421f20c7e os-util: return earlier when unsupported image class is specified 2023-04-11 18:47:15 +09:00
Yu Watanabe 7213c75045 os-util: return earlier when extension release file is found
No functional change, just refactoring.
2023-04-11 18:44:50 +09:00
Yu Watanabe a84677e0f4 os-util: split-out open_os_release() from open_extension_release()
The logics of opening os-release and extension-release are completely
different.
No functional change, just refactoring.
2023-04-11 18:44:50 +09:00
Yu Watanabe 6f0f4d1488 os-util: fix fd leak on failure 2023-04-11 18:44:50 +09:00
Yu Watanabe 396ec9587c os-util: make open_extension_release() return O_PATH fd 2023-04-11 18:44:50 +09:00
Yu Watanabe 53cbf5f9a6 os-util: drop fopen_extension_release() 2023-04-11 18:44:50 +09:00
Yu Watanabe bfeaa62dbc compress: replace compress_blob() with compress_blob_explicit()
And make compress_xyz() return 0 on success, as we know which compression
algorithm is used when calling compress_blob().

Follow-up for 2360352ef0.
2023-04-11 09:14:34 +02:00
Daan De Meyer 7cb9ed5d38
Merge pull request #27206 from yuwata/udev-rename
udev: rename arguments and options, update comments
2023-04-11 09:12:21 +02:00