Commit graph

72364 commits

Author SHA1 Message Date
Lennart Poettering 593428680c build-sys: pick up vmlinux.h from running kernel BTF or user 2024-04-06 16:08:24 +02:00
Lennart Poettering 5300fe74a1 dissect-image: document one more dissected_image_decrypt() error code 2024-04-06 16:08:23 +02:00
Lennart Poettering 44e3097dff dissect-image: make dissected_image_acquire_metadata() operate within a userns if possible
This opens the door for making the call work without privileges: if we
pass in a userns fd and DissectedImage that has mount fds then we can
acquire all information without privs.
2024-04-06 16:08:23 +02:00
Lennart Poettering 77740bddbe dissect-image: add a new helper that checks if VeritySettings has anything set at all 2024-04-06 16:08:23 +02:00
Lennart Poettering 9444e54e56 dissect-image: add dissected_image_close() that closes all references to resources 2024-04-06 16:08:23 +02:00
Lennart Poettering f7178a04db discover-image: export search paths array
This way we can use it to validate image paths later.
2024-04-06 16:08:23 +02:00
Lennart Poettering b2dcfd8e11 cgroup-setup: add fd-based version of cg_attach() 2024-04-06 16:08:23 +02:00
Lennart Poettering 3b2874952f cgroup-util: add helpers for opening cgroup by id 2024-04-06 16:08:23 +02:00
Lennart Poettering cb1b813f0d lock-util: make global lock return parameter to image_path_lock() optional
When adding unprivileged nspawn support we don't really want a global
lock file, since we cannot even access the dir they are stored in, hence
make the concept optional.

Some other minor modernizations.
2024-04-06 16:08:23 +02:00
Lennart Poettering 0f716ace41 bpf-dlopen: pick up more symbols from libbpf 2024-04-06 16:08:23 +02:00
Lennart Poettering e4f62e7a12 namespace-util: add new helper is_our_namespace() 2024-04-06 16:08:23 +02:00
Lennart Poettering 574a07c79d namespace-util: add namespace_open_by_type() helper 2024-04-06 16:08:23 +02:00
Lennart Poettering 2ad2f0c89e namespace-util: add detach_mount_namespace_userns() 2024-04-06 16:08:23 +02:00
Lennart Poettering e02fb2099c namespace-util: add helper for allocating an empty userns fd 2024-04-06 16:08:23 +02:00
Lennart Poettering 5783b4a954 namespace-util: add detach_mount_namespace_harder()
This is just like detach_mount_namespace() but if need be uses unpriv
user namespaces to be able to execute CLONE_NEWNS.
2024-04-06 16:08:23 +02:00
Lennart Poettering afdd0efa63 uid-range: add some basic operations on UidRange objects
Helpers to compare and get size, and whether the object is empty.
2024-04-06 16:08:23 +02:00
Lennart Poettering 20ba086e77 uid-range: add new uid_range_load_userns_by_fd() helper
This is similar to uid_range_load_userns() but instead of reading the
uid_map off a process it reads it off a userns fd.

(Of course the kernel has no API for this right now, hence we fork off a
throw-away process which joins the user namespace, and then read off the
data from there.)
2024-04-06 16:08:23 +02:00
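The fork-based approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not systemd's implementation: the helper name `ex_read_uid_map_by_userns_fd()` is hypothetical, and it assumes the caller is allowed to join the namespace with setns(). A throw-away child joins the user namespace, reads its own `/proc/self/uid_map`, and passes the text back through a pipe.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for setns() and CLONE_NEWUSER */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: copy the uid_map of the user namespace referenced by
 * userns_fd into buf (NUL-terminated). Returns the number of bytes stored,
 * or a negative errno-style value on failure. */
static ssize_t ex_read_uid_map_by_userns_fd(int userns_fd, char *buf, size_t size) {
        int p[2], status;
        size_t total = 0;
        ssize_t n = 0;
        pid_t pid;

        if (!buf || size == 0)
                return -EINVAL;
        if (pipe(p) < 0)
                return -errno;

        pid = fork();
        if (pid < 0) {
                int e = errno;
                close(p[0]);
                close(p[1]);
                return -e;
        }

        if (pid == 0) {
                /* Throw-away child: join the namespace, then stream our own
                 * uid_map into the pipe and exit. */
                char tmp[4096];
                int fd;

                close(p[0]);
                if (setns(userns_fd, CLONE_NEWUSER) < 0)
                        _exit(1);
                fd = open("/proc/self/uid_map", O_RDONLY | O_CLOEXEC);
                if (fd < 0)
                        _exit(2);
                while ((n = read(fd, tmp, sizeof tmp)) > 0)
                        if (write(p[1], tmp, (size_t) n) != n)
                                _exit(3);
                _exit(n < 0 ? 4 : 0);
        }

        /* Parent: collect whatever the child wrote, then reap it. */
        close(p[1]);
        while (total < size - 1 && (n = read(p[0], buf + total, size - 1 - total)) > 0)
                total += (size_t) n;
        close(p[0]);
        buf[total] = 0;

        if (waitpid(pid, &status, 0) < 0)
                return -errno;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                return -EIO;
        return (ssize_t) total;
}
```

With an invalid fd the child fails in setns() and the helper reports an error; with a valid userns fd the buffer would hold the same lines one would see in the uid_map procfs file of a process inside that namespace.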
Lennart Poettering 6ebb53d945 uid-range: optionally load outside view of UID range from uid_map procfs file 2024-04-06 16:08:23 +02:00
Lennart Poettering 5bff40e719 uid-range: add uid_range_overlaps() helper 2024-04-06 16:08:23 +02:00
Lennart Poettering 2251e4ef90 image-policy: add a new image_policy_intersect() call
This new call takes two image policy objects and generates an
"intersection" policy, i.e. only allows what is allowed by both. Or in
other words it conceptually implements a binary AND of the policy flags.
(Except that it's a bit harder, due to normalization, and underspecified
flags).

We can use this later for mountfsd: a client can specify a policy, and
mountfsd can specify another policy, and we'll then apply only what both
allow.

Note that a policy generated like this might be invalid. For example, if
one policy says root must exist and be verity or luks protected, and the
other policy says root must be absent, then the intersection is invalid,
since one policy only allows what the other prohibits and vice versa.
We'll return a clear error code in that case (ENAVAIL). (This is because
we simply don't allow encoding such impossible policies in an
ImagePolicy structure, for good reasons.)
2024-04-06 16:08:23 +02:00
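The "binary AND" idea from the commit above can be illustrated with a deliberately simplified sketch. The flag names and the helper `ex_policy_intersect()` are hypothetical and cover a single partition only; the real call also has to deal with normalization and underspecified flags, which this sketch ignores. An empty result is reported as ENAVAIL, mirroring the error code named in the message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-partition policy flags, for illustration only. */
enum {
        EX_POLICY_VERITY      = 1 << 0, /* partition may be Verity protected */
        EX_POLICY_ENCRYPTED   = 1 << 1, /* partition may be LUKS encrypted */
        EX_POLICY_UNPROTECTED = 1 << 2, /* partition may be unprotected */
        EX_POLICY_ABSENT      = 1 << 3, /* partition may be absent */
};

/* Keep only what both policies allow. If nothing is left, the two policies
 * contradict each other and no valid policy can be encoded. */
static int ex_policy_intersect(unsigned a, unsigned b, unsigned *ret) {
        unsigned both = a & b;

        assert(ret);
        if (both == 0)
                return -ENAVAIL;
        *ret = both;
        return 0;
}
```

For example, intersecting "verity or encrypted" with "verity or unprotected" leaves "verity", while intersecting "verity or encrypted" with "absent" leaves nothing and fails, which corresponds to the root partition example in the commit message.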
Lennart Poettering b219dcd45a varlink: add varlink_peek_dup_fd() helper
This new call is like varlink_peek_fd() (i.e. gets an fd out of the
connection but leaving it also in there), and combines it with
F_DUPFD_CLOEXEC to make a copy of it.

We previously already had varlink_dup_fd() which was a duplicating
version for pushing an fd *into* the connection. To reduce confusion,
let's rename that one varlink_push_dup_fd() to make the symmetry to
varlink_push_fd() clear, so that we now have:

varlink_push_fd()        → put fd in without dup'ing
varlink_push_dup_fd()    → same with F_DUPFD_CLOEXEC
varlink_peek_fd()        → get fd out without dup'ing
varlink_peek_dup_fd()    → same with F_DUPFD_CLOEXEC
2024-04-06 16:08:23 +02:00
Lennart Poettering 52bd61373b varlink: add varlink_get_peer_gid() helper 2024-04-06 16:08:23 +02:00
Frantisek Sumsal b3a8264831 test: improve debug-ability of test-execute
Since e56a8790a0, debugging test-execute failures has been a royal PITA, since
we ditch all potentially useful output from the test units (that, for
the most part, run `sh -x ...`). Let's improve the situation a bit by
setting EXEC_OUTPUT_NULL only when running the single test case that
needs it, and inheriting stdout otherwise.

For example, with a purposefully introduced error we get this output
with this patch:
exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
Serializing sd-executor-state to memfd.
...
        Personality: x86-64
        LockPersonality: no
        SystemCallErrorNumber: kill
++ uname -m
+ c=x86_64
+ test x86_64 = foo_bar
Received SIGCHLD from PID 1520588 (sh).
Child 1520588 (sh) died (code=exited, status=1/FAILURE)
exec-personality-x86-64.service: Child 1520588 belongs to exec-personality-x86-64.service.
exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
exec-personality-x86-64.service: Failed with result 'exit-code'.
...
        Exit Status: 1
src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
(test-execute-root) terminated by signal ABRT.
Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
Aborted

But without it, we'd miss the most important part:
exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
Serializing sd-executor-state to memfd.
...
        Personality: x86-64
        LockPersonality: no
        SystemCallErrorNumber: kill
Received SIGCHLD from PID 1521365 (sh).
Child 1521365 (sh) died (code=exited, status=1/FAILURE)
exec-personality-x86-64.service: Child 1521365 belongs to exec-personality-x86-64.service.
exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
exec-personality-x86-64.service: Failed with result 'exit-code'.
...
        Exit Status: 1
src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
(test-execute-root) terminated by signal ABRT.
Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
Aborted
2024-04-06 13:24:36 +01:00
Luca Boccassi 3abc3671f5
Merge pull request #31131 from poettering/dlopen-kmod
turn libkmod into a dlopen() dependency, too
2024-04-06 13:19:27 +01:00
Vito Caputo a7d8cacce0 man: fix typo s/veno/reno/ 2024-04-06 07:12:33 +02:00
Luca Boccassi b9c5a0d2cc
Merge pull request #32115 from YHNdnzj/service-main-pid-take
core/service: a few improvements for main pid handling
2024-04-05 23:53:13 +01:00
Luca Boccassi e92042269e
Merge pull request #32123 from mrc0mmand/assorted-tweaks
A couple of assorted tweaks
2024-04-05 22:22:06 +01:00
Luca Boccassi 1281115957
Merge pull request #32125 from YHNdnzj/post-merge-stuff
Trivial post merge stuff
2024-04-05 22:18:31 +01:00
Mike Yuan 120be68b8d
core/service: add a FIXME to use pidfd to monitor foreign processes 2024-04-06 02:22:19 +08:00
Mike Yuan a3980843ef
core/service: complain louder if new MAINPID= is refused 2024-04-06 02:22:19 +08:00
Mike Yuan c603f523d0
core/service: make service_set_main_pidref consume pidref
Currently, the memory management of service_set_main_pidref
is a bit odd. Normally we either invalidate the original
resource on caller's side after the call succeeds, or
just pass the ownership wholly. But service_set_main_pidref
takes a pointer, and calls pidref_done() internally.

Let's just make it consume the passed pidref. This is more
straightforward.
2024-04-06 02:22:19 +08:00
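The "consume" convention described above can be sketched with hypothetical types. `ExRef` stands in for a reference that owns an fd, and `ex_service_set_main()` takes it by value to signal that ownership passes to the callee. None of these names come from the source.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical reference type that owns an fd; fd < 0 means "empty". */
typedef struct ExRef {
        int fd;
} ExRef;

#define EX_REF_NULL ((ExRef) { .fd = -1 })

/* Release whatever the reference owns and reset it to the empty state. */
static void ex_ref_done(ExRef *r) {
        assert(r);
        if (r->fd >= 0)
                close(r->fd);
        *r = EX_REF_NULL;
}

typedef struct ExService {
        ExRef main;
} ExService;

/* Consumes 'ref': after a successful call the service owns it and releases
 * it later. Callers must treat their own copy as invalid after the call. */
static int ex_service_set_main(ExService *s, ExRef ref) {
        assert(s);
        if (ref.fd < 0)
                return -EBADF; /* nothing to take over */
        ex_ref_done(&s->main); /* drop the previous reference, if any */
        s->main = ref;
        return 0;
}
```

The point of the pattern is that there is exactly one owner at any time, so neither side has to guess who calls the cleanup function.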
Mike Yuan 36b21fac8f
sleep: rename SleepMemMode= to MemorySleepMode=
Addresses https://github.com/systemd/systemd/pull/31986#discussion_r1554053623
2024-04-06 02:16:54 +08:00
Mike Yuan 99f3b67f3f
os-util: use ENDSWITH_SET where appropriate
Addresses https://github.com/systemd/systemd/pull/31435#discussion_r1553969156

Co-authored-by: Lennart Poettering <lennart@poettering.net>
2024-04-06 02:16:53 +08:00
Frantisek Sumsal 1d07188b15 base-filesystem: check for __s390x__ first
On s390x both __s390__ and __s390x__ are defined, and with the original
order we'd go through the __s390__ branch and emit a warning:

[169/2118] Compiling C object src/shared/libsystemd-shared-256.a.p/base-filesystem.c.o
../src/shared/base-filesystem.c:136:11: note: ‘#pragma message: Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links.’
  136 | #  pragma message "Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links."
      |           ^~~~~~~
2024-04-05 19:44:44 +02:00
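The ordering issue from the commit above comes down to checking the more specific macro first. A minimal sketch, with a hypothetical function name and return strings chosen for illustration:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: since s390x defines both __s390__ and __s390x__,
 * the 64-bit check has to come first, otherwise the generic __s390__
 * branch would win on s390x as well. */
static const char *ex_arch_variant(void) {
#if defined(__s390x__)
        return "s390x";
#elif defined(__s390__)
        return "s390";
#else
        return "other";
#endif
}
```

Swapping the first two branches would still compile everywhere, but on s390x it would silently select the 31-bit branch.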
Frantisek Sumsal 3ebd598624 test: account for build dir being under one of the tmpfs-ed directories
If we're running test-execute from the build directory which is under
one of the tmpfs-ed directories (i.e. /root or /tmp), test-execute might
behave strangely, since in that case manager_new() pins the system
systemd-executor binary instead of the build dir one, which may lead to
very confusing test failures (if there's enough difference between the
system and built sd-executor binary). Let's account for that and
bind-mount the build dir under the tmpfs-ed directory if necessary.
2024-04-05 19:44:41 +02:00
Frantisek Sumsal a9805f8ca9 test: make test-fd-util more lenient when using fd_move_above_stdio()
On s390x this test fails when the SUT uses the z90crypt kernel module,
as it's another FD the test doesn't account for:

/* test_rearrange_stdio */
Successfully forked off 'rearrange' as PID 57293.
test_rearrange_stdio: r=0
/proc/57293/fd:
total 0
lrwx------. 1 root root 64 Apr  5 06:18 0 -> /dev/pts/0
lrwx------. 1 root root 64 Apr  5 06:18 1 -> /dev/pts/0
lrwx------. 1 root root 64 Apr  5 06:18 2 -> /dev/pts/0
lrwx------. 1 root root 64 Apr  5 06:18 3 -> /dev/z90crypt
rearrange terminated by signal ABRT.

Debugging this was a pain, since the child process didn't log anything
once we closed stdout/stderr (for obvious reasons). Let's fix both
issues by switching logging to kmsg once we close stdin/stdout/stderr,
and also by making the test work fine when there are some extra FDs in
the child's environment.
2024-04-05 19:40:23 +02:00
Zbigniew Jędrzejewski-Szmek c1e7f938ca
Merge pull request #31435 from bluca/portable_fix_versioned
portable: assorted bug fixes
2024-04-05 17:04:17 +02:00
Antonio Alvarez Feijoo 1eeae735ad sd-journal: fix check in journal_file_verify_header()
Fixes 6ea51363c8
2024-04-05 13:03:19 +02:00
Frantisek Sumsal e55db9e792 log: fix comment 2024-04-05 12:14:18 +02:00
Daan De Meyer aaa872a713 core: Serialize both pid and pidfd to keep downgrades working
Currently, when downgrading from a version with pidfd support to a
version without pidfd support, all information about running processes
is lost, as the newer systemd will serialize pidfds, which are not recognized
by the older systemd when deserializing.

To improve the situation, let's serialize both the pid and the pidfd.
This is safe because existing versions will either replace the first
deserialized pidref with the second one or discard the second one in
favor of the first one depending on the unit and field. Older versions
that don't support pidfd's will silently discard any fields that contain
a pidfd as those will try to parse the field as a pid and since a pidfd
field will start with '@', those versions will log an error at debug level and ignore
the value.

To make sure we reuse the existing pidfd as much as possible, the pidfd
is serialized first. Both for scopes and service main pids, if the same
pid is seen multiple times, the first pidref is kept. So by serializing
the pidfd first we make sure the original pidfd is used instead of the
new one which is opened when deserializing the first pid field.

For other control units, older versions with pidfd support will discard
the first pidfd and replace it with a new pidfd from the second pid field.
This is a slight regression on downgrades, but we make sure it doesn't
happen for future versions (and older versions when this commit is
backported) by modifying the logic to only use the first successfully
deserialized pidref so that the raw pid without pidfd is discarded instead
of it replacing the existing pidfd.
2024-04-05 12:56:26 +09:00
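Why older versions harmlessly skip a pidfd field can be shown with a small sketch of a strict pid parser. The helper `ex_parse_pid()` is hypothetical and is not systemd's parser; the only detail taken from the commit message is that a pidfd field starts with '@' and therefore fails to parse as a pid.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical strict parser: accepts only a plain positive decimal number.
 * On failure *ret is left untouched, so a previously parsed value survives. */
static int ex_parse_pid(const char *s, pid_t *ret) {
        char *end = NULL;
        long v;

        assert(ret);
        if (!s || *s < '0' || *s > '9')
                return -EINVAL; /* rejects "@7", "", "-1", " 7" */

        errno = 0;
        v = strtol(s, &end, 10);
        if (errno != 0 || *end != 0)
                return -EINVAL; /* overflow or trailing garbage */
        if (v <= 0 || v > INT_MAX)
                return -ERANGE;

        *ret = (pid_t) v;
        return 0;
}
```

A deserializer built on such a parser would log the failure for a value like "@7" and move on, then pick up the plain pid from the second field.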
Luca Boccassi 1ce28e5a24 meson: set -fno-ssa-phiopt when building bpf with gcc
There are bugs in the kernel verifier that cause legitimate code
to be rejected. Disabling this optimization makes bpf programs
built with a new enough gcc work again.

Fixes https://github.com/systemd/systemd/issues/31888
2024-04-05 12:55:53 +09:00
Kirk 57cd604fde
hwdb: fix missing colon (#32108)
Missing colon prevents this from working correctly on the Chuwi UBook X and UBook X Pro.
2024-04-05 10:18:59 +09:00
Luca Boccassi 360f486cc6
Merge pull request #32085 from yuwata/udev-check-processing
udev: check ID_PROCESSING udev property more
2024-04-04 23:46:26 +01:00
Yu Watanabe 36ca167220
Merge pull request #31373 from yuwata/network-neighbor-advertisement
network: add basic support of neighbor advertisement
2024-04-05 05:54:12 +09:00
Yu Watanabe e17e438ede udevadm-test: also show security labels if specified
Follow-up for 03b6879f4d.
2024-04-04 21:30:26 +01:00
Yu Watanabe 9990552f9a backlight: fix detection of multiple graphic cards
Follow-up for e0504dd011.

Hopefully, devices in the PCI subsystem have some properties and thus
have their udev database file, but that may not be true. Here, we only
read sysattrs of enumerated devices, hence it is not necessary to check
whether the device is initialized.
2024-04-04 21:29:57 +01:00
Yu Watanabe f7cb6801a2 udev: do not update sysattr and sysctl value on testing
Follow-up for 089bef6631.
2024-04-04 21:29:30 +01:00
Luca Boccassi 9a937ea2a6
Merge pull request #32102 from YHNdnzj/efi-var-consistent
Trivial follow-up for hibernate-resume
2024-04-04 21:21:10 +01:00
Mike Yuan 05d2a63139
man/kernel-command-line: document resume_offset= too 2024-04-05 03:03:09 +08:00
Mike Yuan 166ad35fa8
hibernate-util: say "HibernateLocation EFI variable" consistently 2024-04-05 02:59:59 +08:00