This reverts PR #23269 and its follow-up commit. Especially,
2299b1cae3 (partially), and
3cf63830ac.
The PR was merged without final approval, and has several issues:
- The NetLabel for static addresses are not assigned, as labels are
stored in the Address objects managed by Network, instead of Link.
- If NetLabel is specified for a static address, then the address
section will be invalid and the address will not be configured,
- It should be implemented with Request object,
- There is no test about the feature.
This reverts PR #22587 and its follow-up commit. More specifically,
2299b1cae3 (partially),
e176f85527,
ceb46a31a0, and
51bb9076ab.
The PR was merged without final approval, and has several issues:
- OSS fuzz reported issues in the conf parser,
- It calls synchrnous netlink call, it should not be especially in PID1,
- The importance of NFTSet for CGroup and DynamicUser may be
questionable, at least, there was no justification PID1 should support
it.
- For networkd, it should be implemented with Request object,
- There is no test for the feature.
Fixes#23711.
Fixes#23717.
Fixes#23719.
Fixes#23720.
Fixes#23721.
Fixes#23759.
Maximum attempts to send mDNS requests is one except for probe requests, which should be attempted thrice.
Implemented fix to account for the difference between regular queries and probe requests, and prevent
even regular queries from being attempted thrice.
See RFC 6762 Section 8.1
arg_early_core_pattern and arg_watchdog_device hold pointers to memory
allocated with strdup() (inside path_make_absolute_cwd). The memory needs
to be freed in reset_arguments() during reload rather than forgotten.
Fixes build with musl:
| ../git/src/shared/dissect-image.c: In function 'mount_image_privately_interactively':
| ../git/src/shared/dissect-image.c:2986:34: error: 'LOCK_SH' undeclared (first use in this function)
| 2986 | r = loop_device_flock(d, LOCK_SH);
| | ^~~~~~~
Commit 0307afc681 ("networkctl: add support to display Transmit/Recieve queue
length (#12633)") added the display of the number of RX and TX Queues to the
output of `networkctl status $DEV`. However the row description says "Queue
Length".
This patch fixes the output by replacing "Queue Length" by "Number of Queues".
Fixes: 0307afc681 ("networkctl: add support to display Transmit/Recieve queue length (#12633)")
Currently, -Wall and -Wextra override previously passed flags like
-Wno-unused-parameter. This reorders them to be passed before any
optional flags. -Wsign-compare is part of -Wextra and therefore dropped.
-nostdlib is a link-stage flag and dropped as it is already part of
efi_ldflags.
This reverts dd51e725df and fixes bugs
introduced by 1624114d74.
Previously,
- On online scan, the syscall filter was a string Hashmap, but it
might contain syscall name with errno or error action. Hence, we need
to drop the errno or error action in the string.
- On offline scan, the syscall filter was a Hashmap of syscall ID, so
hashmap_contains() with syscall name did not work. We need to convert
syscall IDs to syscall names.
- If hashmap_contains() in syscall_names_in_filter() is true, then
the syscall is allowed when the list is an allow-list, and vice versa.
Hence, the condition in syscall_names_in_filter() was errnously
inverted by dd51e725df.
This makes syscalls are always stored with its name, instead of ID,
and also correct the condition.
Fixes#23663.
Note, if `n != SIZE_MAX`, we cannot check the existence of the specified
string in the set without duplicating the string. And, set_consume() also
checks the existence of the string. Hence, it is not necessary to call
set_contains() if `n != SIZE_MAX`.
Follow-up for 68acc1afbe.
Before the commit, SystemCallFilter bus property provides only allowed
syscalls if ExecContext.syscall_filter is an allow-list, and vice versa.
After the commit, if the list is allow-list, it contains allowed
syscalls with value `-1`, and denied syscalls with non-negative values.
To keep the backward compatibility, denied syscalls must be dropped in
SystemCallFilter bus property.
Fixes: CID#1469712
CID 1469712 (#1 of 1): Unused value (UNUSED_VALUE)
returned_value: Assigning value from sd_id128_from_string(word + 2, &boot_id) to r here,
but that stored value is overwritten before it can be used.
Increase the log severity in case of writing to a non existent sysctl
parameter as this can either be caused by a misspelling or a kernel mis-
configuration, e.g. in case YAMA does not get loaded due to a incomplete
lsm= override:
systemd-sysctl[354]: Couldn't write '1' to 'kernel/yama/ptrace_scope', ignoring: No such file or directory
dns_label_unescape() does not nul-terminate the buffer if it does not
have enough space. Hence, if a lable is enough long, then strjoin()
triggers buffer-overflow.
Fixes#23705.
Not only block subsystem, but also misc has device named "loop*", and
the test always said that the following device is newly found:
---
/* test_sd_device_enumerator_filter_subsystem */
New device found: subsystem:misc syspath:/sys/devices/virtual/misc/loop-control
1 new devices are found in re-scan
---
On ppc64le sanitizers disable ASLR (i.e. by setting ADDR_NO_RANDOMIZE),
which opinionated_personality() doesn't return. Let's tweak the current
personality ourselves in such cases.
See: 78f7a6eaa6Resolves: #23666
Device paths are a packed data structure and the UEFI spec is clear that
members may be misaligned.
In this case all accesses are aligned except for the signature. We can
simply memcpy it instead of making a whole (aligned) copy of the device
path.
This drops the unused xnew0 and xallocate_zero_pool as there is only two
users of it. _cleanup_freepool_ will be phased out once the types in the
declarations are changed/renamed.
A future commit will add support for unicode collation protocol that
allows case folding and comparing strings with locale awareness. But it
only operates on whole strings, so fnmatch cannot use those without a
heavy cost. Instead we just case fold the patterns instead (the IDs we
try to match are already lower case).
https://github.com/systemd/systemd/pull/23192 caused breakage in
Arch Linux's build tooling. Let's give users an opt-out aside from
reverting the patch. It's hardly any maintenance work on our side
and gives users an easy way to revert the locale change if needed.
Of course, by default we still pick C.UTF-8 if the option is not
specified.
New directive `DynamicUserNFTSet=` provides a method for integrating
configuration of dynamic users into firewall rules with NFT sets.
Example:
```
table inet filter {
set u {
typeof meta skuid
}
chain service_output {
meta skuid != @u drop
accept
}
}
```
```
/etc/systemd/system/dunft.service
[Service]
DynamicUser=yes
DynamicUserNFTSet=inet:filter:u
ExecStart=/bin/sleep 1000
[Install]
WantedBy=multi-user.target
```
```
$ sudo nft list set inet filter u
table inet filter {
set u {
typeof meta skuid
elements = { 64864 }
}
}
$ ps -n --format user,group,pid,command -p `pgrep sleep`
USER GROUP PID COMMAND
64864 64864 55158 /bin/sleep 1000
```
New directives `NFTSet=`, `IPv4NFTSet=` and `IPv6NFTSet=` provide a method for
integrating configuration of dynamic networks into firewall rules with NFT
sets.
/etc/systemd/network/eth.network
```
[DHCPv4]
...
NFTSet=netdev:filter:eth_ipv4_address
```
```
table netdev filter {
set eth_ipv4_address {
type ipv4_addr
flags interval
}
chain eth_ingress {
type filter hook ingress device "eth0" priority filter; policy drop;
ip saddr != @eth_ipv4_address drop
accept
}
}
```
```
sudo nft list set netdev filter eth_ipv4_address
table netdev filter {
set eth_ipv4_address {
type ipv4_addr
flags interval
elements = { 10.0.0.0/24 }
}
}
```
raise() won't propagate the siginfo information of the signal that's
re-raised. rt_sigqueueinfo() allows us to provide the original siginfo
struct which makes sure it is propagated to the next signal handler
(or to the coredump).
The lookup "works", but is not useful. It was introduced in
9c66f52813.
And printf will NULL args is invalid was introduced in
5d1ce25728 when support for fds was initally
added :(
Introduce rootpkglibdir for installing libsystemd-{shared,core}.so.
The benefit over using rootlibexecdir is that this path can be
multiarch aware, i.e. this path can be architecture qualified.
This is something we'd like to make use of in Debian/Ubuntu to make
libsystemd-shared co-installable, e.g. for i386 the path would be
/usr/lib/i386-linux-gnu/systemd/libsystemd-shared-*.so and for amd64
/usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-*.so.
This will allow for example to install and run systemd-boot/i386 on an
amd64 host. It also simplifies/enables cross-building/bootstrapping.
For more infos about Multi-Arch see https://wiki.debian.org/Multiarch.
See also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990547
New directive `NetLabel=` provides a method for integrating dynamic network
configuration into Linux NetLabel subsystem rules, used by Linux security
modules (LSMs) for network access control. The option expects a whitespace
separated list of NetLabel labels. The labels must conform to lexical
restrictions of LSM labels. When an interface is configured with IP addresses,
the addresses and subnetwork masks will be appended to the NetLabel Fallback
Peer Labeling rules. They will be removed when the interface is
deconfigured. Failures to manage the labels will be ignored.
Example:
```
[DHCP]
NetLabel=system_u:object_r:localnet_peer_t:s0
```
With the above rules for interface `eth0`, when the interface is configured with
an IPv4 address of 10.0.0.0/8, `systemd-networkd` performs the equivalent of
`netlabelctl` operation
```
$ sudo netlabelctl unlbl add interface eth0 address:10.0.0.0/8 label:system_u:object_r:localnet_peer_t:s0
```
Result:
```
$ sudo netlabelctl -p unlbl list
...
interface: eth0
address: 10.0.0.0/8
label: "system_u:object_r:localnet_peer_t:s0"
...
```
We already store the dlopen() stuff for other libraries in util headers
as well so let's do the same for pcre2. We also move the definition of
some trivial cleanup functions from journalctl.c to pcre2-util.h
IIUC, with MAX() we get a VLA and the size is "decided" at runtime,
even though the result is always the same, but with CONST_MAX() we
get a normal stack variable.
The general rule should be to be strict when parsing data, but lenient
when printing it. Or in other words, we should verify data in verification
functions, but not when printing things. It doesn't make sense to refuse
to print a value that we are using internally.
We were tripping ourselves in some of the print functions:
we want to report than an address was configured with too-long prefix, but
the log line would use "n/a" if the prefix was too long. This is not useful.
Most of the time, the removal of the check doesn't make any difference,
because we verified the prefix length on input.
IN6_ADDR_TO_STRING(…) always returns something, so we can simplify the code a
lot. Also, let's not do step-wise concatenation, but instead handle everything
with one str_extendf() call.
Since we don't need the error value, and the buffer is allocated with a fixed
size, the whole logic provided by in_addr_to_string() becomes unnecessary, so
it's enough to wrap inet_ntop() directly.
inet_ntop() can only fail with ENOSPC. But we specify a buffer that is supposed
to be large enough, so this should never fail. A bunch of tests of this are added.
This allows all the wrappers like strna(), strnull(), strempty() to be dropped.
The guard of 'if (DEBUG_LOGGING)' can be dropped from around log_debug(),
because log_debug() implements the check outside of the function call. But
log_link_debug() does not, so it we need it to avoid unnecessary evaluation of
the formatting.
This reverts commit 0bd292567a.
It isn't guaranteed anywhere that __builtin_dynamic_object_size can
always deduce the size of every object passed to it so systemd
can end up using either malloc_usable_size or
__builtin_dynamic_object_size when pointers are passed around,
which in turn can lead to actual segfaults like the one mentioned in
https://github.com/systemd/systemd/issues/23619.
Apparently __builtin_object_size can return different results for
pointers referring to the same memory as well but somehow it hasn't
caused any issues yet. Looks like this whole
malloc_usable_size/FORTIFY_SOURCE stuff should be revisited.
Closes https://github.com/systemd/systemd/issues/23619 and
https://github.com/systemd/systemd/issues/23150.
Reopens https://github.com/systemd/systemd/issues/22801
../src/journal-remote/microhttpd-util.c: In function ‘check_permissions’:
../src/journal-remote/microhttpd-util.c:301:5: error: function might be candidate for attribute ‘noreturn’ [-Werror=suggest-attribute=noreturn]
301 | int check_permissions(struct MHD_Connection *connection, int *code, char **hostname) {
| ^~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Fixes#23630.
In file included from ../src/basic/siphash24.h:11,
from ../src/basic/hash-funcs.h:6,
from ../src/basic/hashmap.h:8,
from ../src/shared/fdset.h:6,
from ../src/shared/bpf-program.h:9,
from ../src/core/unit.h:11,
from ../src/core/all-units.h:4,
from ../src/core/manager.c:23:
../src/basic/time-util.h: In function 'manager_dispatch_jobs_in_progress':
../src/basic/time-util.h:140:38: error: 'x' may be used uninitialized [-Werror=maybe-uninitialized]
140 | #define FORMAT_TIMESPAN(t, accuracy) format_timespan((char[FORMAT_TIMESPAN_MAX]){}, FORMAT_TIMESPAN_MAX, t, accuracy)
| ^~~~~~~~~~~~~~~
In function 'manager_print_jobs_in_progress',
inlined from 'manager_dispatch_jobs_in_progress' at ../src/core/manager.c:3007:9:
../src/core/manager.c:219:18: note: 'x' was declared here
219 | uint64_t x;
| ^
cc1: all warnings being treated as errors
For some reason this (false positive) warning starts appearing after
-ftrivial-auto-var-init is used.
When something goes awry, we would get identical log messages from all the
bpf subsystems. E.g. "Failed to load BPF object: %m" appeared 5 times in the
sources. But it is very important to know *which* object we failed to load.
This could be guessed, e.g. from surroudning messages or from filename/line
metadata, but when we get log messages in bug reports, this might not be
available. Let's make the messages distinguishable.
While at it, some messages were adjusted a bit. In particular, we shouldn't use
internal names like BPFProgram which have no meaning outside of the codebase.
Testing the error paths is very important. If we are not root, we should
try and get a failure, which we should report nicely and mark the test
as skipped. After those checks are removed, this is what seems to happen.
This way we can see what will happen e.g. in the user manager when we try
to perform some bpf ops.
$ build/test-socket-bind
...
libbpf: load bpf program failed: Operation not permitted
libbpf: failed to load program 'sd_bind4'
libbpf: failed to load object 'socket_bind_bpf'
libbpf: failed to load BPF skeleton 'socket_bind_bpf': -1
Failed to load BPF object: Operation not permitted
Now all lines with "libbpf:" are at debug level and will be hidden by
default.
Partially fixes https://bugzilla.redhat.com/show_bug.cgi?id=2084955#c14
(i.e. the error that was exposed when the initial error was fixed.)
find_socket_fd() does not expect the sender address, but the
listen-address. This is in fact the destination of the DNS packet.
Matching via sender address caused a fallback to the default stub
listener in manager_dns_stub_fd() as the sender address can never
match the proxy stub listen address.
Note that manager_dns_stub_fd() is only used for the default
listener stub and the proxy stub, that means *extra* listeners
stubs (DNSStubListenerExtra=…) have not been affected as
`struct DnsStubListenerExtra` provides a direct link to the event
source.
By using the correct fd we ensure the correct socket options
(like TTL) are used and prevent issues like #23495 in case ifindex
could not be determined.
Fixes: #23520
[zjs: I added the comment and tweaked the patch a bit.
The call to reset_scheduled_shutdown() is moved down a bit to allow the
callback to have access to information about the operation being cancelled.
This all happens within the same function, so there should be no observable
change in behaviour.]
I had a strange failure where the pager was hanging on invocation (gdm crashed
and the kernel got into a strange state where it was hanging on some tasks).
Based on the logs from 'SYSTEMCTL_LOG_LEVEL=debug journalctl', I couldn't even
tell which pager binary we're executing. So let's shorten the function a bit and
provide a bit more detail.
We had yet-another table of descriptive strings to use in error messages.
I started thinking how to synchronize them with the strings in logind, but
ultimately I think it's better to remove those altogether. Those strings
should almost never be used: normally if the call fails, logind will provide
an error message itself, which is probably more detailed than what we can
figure out on the client side. And the most important part that we want to
show here is what exactly we called, in particular RebootWithFlags vs. Reboot,
etc. By using the "descriptive strings" we were obfuscating this. So let's just
simplify our code and print the actual method name, since this is more useful
as an error statement that is googlable and unique.
While at it, let's print the correct method name ;)
Those messages simply *feel* dated: "The system is going for suspend NOW!".
Let's say "The system will suspend|power off|hibernate|… now!" instead.
The exclamation mark is enough to show the urgency.
Also, the "the" seemed out of place. We're not talking about a specific reboot.
DnsPacket.ifindex=1 (loopback) is normalized to 0 whenever a message is
received on the loopback iface, so for both listeners, 127.0.0.53 and
127.0.0.54, the ifindex will be set to 0 by manager_recv() for queries
that have a local origin.
Replies to such local messages need to set a proper ifindex in any
case, as the supplied source-address would otherwise be ignored in
manager_ipv4_send() (CMSG generation is skipped due to ifindex > 0 check).
Note that this change only forces `ifindex` to loopback if it was actually
normalized to `0` before (due to a loopback detection) in order to keep the
nat-to-127.0.0.54-from-another-interface usecase that was described in
a8d0906344 intact.
Also note that nat is not supported for the main stub 127.0.0.53 which is
why forcing LOOPBACK_IFINDEX was/is fine for that case.
Fixes#23495
Fixes#23520. Replaces #23555.
The problem started with cdf370626f and
90b1ec03b2 which together started printing the
wall message in more cases. The motivation for those change was reasonable, but
this clearly causes problems described in #23520: users are getting unexpected
wall messages. Xterm, urxvt, (anything using libutempter?), and tmux (in some
configurations), register local pty sessions in utmp.
So let's try to suppress the message for local pseudo-terminal logins. This
patch based on #23538, but instead of filtering just on /dev/pts, it uses the
.ut_addr_v6 to only filter out local entries.
utmpdump doesn't print all the details. Looking at the list if useful
when trying to tweak the wall filtering logic.
This doesn't do much, but at least it serves as a smoke test for the cleanup
functions.