The idea of nm_free_secret() is to clear the secrets from memory. That
surely is some layer of extra snake oil, because we tend to pass secrets
via D-Bus, where the memory gets passed down to (D-Bus) libraries which
have no idea to keep it private. Still...
But turns out, malloc_usable_size() might not actually be usable for
this. Read the discussion at [1].
Stop using malloc_usable_size(), which seems unfortunate.
There is probably no secret relevant data after the NUL byte anyway,
because we tend to create such strings once, and don't rewrite/truncate
them afterwards (which would leave secrets behind as garbage).
Note that systemd's erase_and_free() still uses malloc_usable_size()
([2]) but the macro foo to get that right is terrifying ([3]).
[1] https://github.com/systemd/systemd/issues/22801#issuecomment-1343041481
[2] 11c0f0659e/src/basic/memory-util.h (L101)
[3] 7929e180aa
Fixes: d63cd26e60 ('shared: improve nm_free_secret() to clear entire memory buffer')
During srv_shutdown() we do
p.stdin.close()
p.kill()
Usually, the kill wins and the service just drops off the bus:
libnm-dbus[3201919]: <debug> [438617.45324] nmclient[40f7938626f3f5f0]: name owner changed: ":1.1" -> (null)
libnm-dbus[3201919]: <debug> [438617.45332] nmclient[40f7938626f3f5f0]: release all
at which point all objects in NMClient get destroyed and the signals get
emitted in the order:
libnm-dbus[3201919]: <trace> [438617.45574] nmclient[40f7938626f3f5f0]: [nmclient] emit "device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
nmcli[out]: eth0: device removed
libnm-dbus[3201919]: <trace> [438617.45590] nmclient[40f7938626f3f5f0]: [nmclient] emit "any-device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
libnm-dbus[3201919]: <trace> [438617.45593] nmclient[40f7938626f3f5f0]: [nmclient] emit "connection-removed" signal for /org/freedesktop/NetworkManager/Settings/Connectio>
nmcli[out]: con-1: connection profile removed
However, sometimes the stub service notices that stdin was closed and it
sends signals about shutting down:
libnm-dbus[3201061]: <trace> [438226.44965] nmclient[401639659459c316]: interfaces-removed: [/org/freedesktop/NetworkManager/Settings] receive interface remove event for >
libnm-dbus[3201061]: <trace> [438226.44966] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: changed-type 0x01 linked
libnm-dbus[3201061]: <trace> [438226.44967] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: changed-type 0x01 consumed
libnm-dbus[3201061]: <trace> [438226.44968] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: changed-type 0x02 linked
libnm-dbus[3201061]: <trace> [438226.44969] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: unregister NMClient from D-Bus object
libnm-dbus[3201061]: <trace> [438226.44971] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: drop D-Bus instance
libnm-dbus[3201061]: <trace> [438226.44971] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: set D-Bus object state unlinked
libnm-dbus[3201061]: <trace> [438226.44972] nmclient[401639659459c316]: [nmclient] emit "connection-removed" signal for /org/freedesktop/NetworkManager/Settings/Connectio>
nmcli[out]: con-1: connection profile removed
libnm-dbus[3201061]: <trace> [438226.44992] nmclient[401639659459c316]: interfaces-removed: [/org/freedesktop/NetworkManager] receive interface remove event for interface>
libnm-dbus[3201061]: <trace> [438226.44994] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: changed-type 0x01 linked
libnm-dbus[3201061]: <trace> [438226.44995] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: changed-type 0x01 consumed
libnm-dbus[3201061]: <trace> [438226.44996] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: changed-type 0x02 linked
libnm-dbus[3201061]: <trace> [438226.44998] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: unregister NMClient from D-Bus object
libnm-dbus[3201061]: <trace> [438226.44999] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: drop D-Bus instance
libnm-dbus[3201061]: <trace> [438226.45000] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: set D-Bus object state unlinked
libnm-dbus[3201061]: <trace> [438226.45001] nmclient[401639659459c316]: [nmclient] emit "device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
nmcli[out]: eth0: device removed
libnm-dbus[3201061]: <trace> [438226.45005] nmclient[401639659459c316]: [nmclient] emit "any-device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
nmcli[out]: NetworkManager is stopped
libnm-dbus[3201061]: <debug> [438226.45545] nmclient[401639659459c316]: name owner changed: ":1.1" -> (null)
libnm-dbus[3201061]: <debug> [438226.45550] nmclient[401639659459c316]: release all
The fix is to accept the events in any order.
(cherry picked from commit 8548ab29ee)
The test stub service watches stdin, and if it gets closed the service
will shut down. Note that the service does not catch any signals, so
sending a signal will kill the service right away.
The previous code first closed stdin, and then killed the process.
That can result in different outcomes on D-Bus. Usually the signal
gets received first, and the test service just drops off the bus.
Sometimes it notices the closing of stdin and shuts actively down.
That can make a difference, especially for the test_monitor() test which
runs the monitor while stopping the service.
We could just always kill the stub service to get consistent behavior.
However, that doesn't seem very useful. Instead, randomize the behavior
to easier see how the behavior differs.
(cherry picked from commit fc282d5e05)
The main purpose is to simplify printf debugging and manual testing. We
can now trivially patch the code so that all output from nmcli gets
(additionally) written to a file. That is useful when debugging a unit
test in "test-client.py". Thereby we can duplicate all messages via
nm_utils_print(), which is in sync with the debug messages from libnm
and which honors LIBNM_CLIENT_DEBUG_FILE.
(cherry picked from commit b32e4c941a)
These will replace the direct calls to g_print()/g_printerr() in nmcli.
There are two purposes.
1) the new macros embody the concept of "printing something from nmcli".
It means, we can `git grep` for those functions, and find all the
relevant places, without hitting the irrelevant ones (e.g. tests that
also use g_print()).
2) by having one place, we can trivially change it. That is useful for
printf debugging. For example, "test-client.py" runs nmcli and
captures and compares the output. With libnm we can set
LIBNM_CLIENT_DEBUG and LIBNM_CLIENT_DEBUG_FILE to print libnm debug
messages to a file. But we cannot trivially synchronize the messages
from nmcli with that output (also because they are consumed by the test
and not immediately accessible). This would be easy, if we temporarily
could patch nmc_print*() to also log to nm_utils_print(). The new macros
will allow doing that at one place.
For example, patch the "#if 0" and run:
$ LIBNM_CLIENT_DEBUG=trace \
LIBNM_CLIENT_DEBUG_FILE='xxx.%p' \
NMTST_USE_VALGRIND=1 \
LIBTOOL="/bin/sh ./libtool"
./src/tests/client/test-client.sh -- -k monitor
(cherry picked from commit 4cf94f30c7)
nmc_print() will be used for something else. Rename. Also,
nmc_print_table() is the better name anyway because the function does a
lot of formatting and not simple printf().
(cherry picked from commit 4b2ded7a4a)
For debugging libnm, LIBNM_CLIENT_DEBUG can be very useful.
As the tests compare stdout/stderr from nmcli with expected output, just
enabling it will break the tests. However, in combination with
LIBNM_CLIENT_DEBUG_FILE this can be very useful.
Preserve and pass on the environment variables, if set.
(cherry picked from commit 1630009234)
With LIBNM_CLIENT_DEBUG we get debug logging for libnm, either to stdout
or to stderr.
"test-client.py" runs nmcli as a unit test. It thereby catches stdout
and stderr. That means, LIBNM_CLIENT_DEBUG would break the tests.
Now honor the LIBNM_CLIENT_DEBUG_FILE environment variable to specify a
file to which the debug logs get written. The pattern "%p" is replaced
by the process id.
As before, nm_utils_print(0, ...) also honors this environment variable
and uses the same logging destination.
(cherry picked from commit 6dbb215793)
Sort imports by name. Also avoid "from signal import SIGINT". I find
it ugly to import names in the current namespace. "SIGINT" should be
refered to by its full name, including the package/namespace.
(cherry picked from commit ee17346cee)
This will allow to find some memory leaks and memory corruptions.
The bulk of the nmcli calls are still not hooked up with valgrind.
Since we call nmcli a thousand time, we could not just run valgrind with
all of them. We would have instead to enable it randomly. This is
more work.
(cherry picked from commit debf78dbed)
The base class is not used, and it's not clear that it would be useful.
Sure, we could extend "test-client.py" will various non-nmcli tests. That
might make sense. And then it might make sense to have more unit test classes.
So far, we don't need that. Drop the unused base class NmTestBase.
(cherry picked from commit d1e6d53013)
Since glib 2.45, we are guaranteed that g_free() just calls free(), so
both can be used interchangeably. However, we still only depend on glib
2.40.
In any case, it's ugly to mix the two. Memory allocated by plain
malloc(), should be only freed with free(). The buffer in question comes
from readline, which allocates it using the system allocator.
Fixes: 995229181c ('cli: remove editor thread')
(cherry picked from commit 5dc07174d3)
Such leaks show up in valgrind, and are simply bugs.
Also, various callers (not all of them, which is another bug!) like to
take ownership of the returned string and free it. That means, we leave
a dangling pointer in the global variable, which is very ugly and error
prone.
Also, the callers like to free the string with g_free(), which is not
appropriate for the "rl_string" memory which was allocated by readline.
It must be freed with free(). Avoid that, by cloning the string using
the glib allocator.
Fixes: 995229181c ('cli: remove editor thread')
(cherry picked from commit 89734c7553)
During srv_shutdown() we do
p.stdin.close()
p.kill()
Usually, the kill wins and the service just drops off the bus:
libnm-dbus[3201919]: <debug> [438617.45324] nmclient[40f7938626f3f5f0]: name owner changed: ":1.1" -> (null)
libnm-dbus[3201919]: <debug> [438617.45332] nmclient[40f7938626f3f5f0]: release all
at which point all objects in NMClient get destroyed and the signals get
emitted in the order:
libnm-dbus[3201919]: <trace> [438617.45574] nmclient[40f7938626f3f5f0]: [nmclient] emit "device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
nmcli[out]: eth0: device removed
libnm-dbus[3201919]: <trace> [438617.45590] nmclient[40f7938626f3f5f0]: [nmclient] emit "any-device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
libnm-dbus[3201919]: <trace> [438617.45593] nmclient[40f7938626f3f5f0]: [nmclient] emit "connection-removed" signal for /org/freedesktop/NetworkManager/Settings/Connectio>
nmcli[out]: con-1: connection profile removed
However, sometimes the stub service notices that stdin was closed and it
sends signals about shutting down:
libnm-dbus[3201061]: <trace> [438226.44965] nmclient[401639659459c316]: interfaces-removed: [/org/freedesktop/NetworkManager/Settings] receive interface remove event for >
libnm-dbus[3201061]: <trace> [438226.44966] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: changed-type 0x01 linked
libnm-dbus[3201061]: <trace> [438226.44967] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: changed-type 0x01 consumed
libnm-dbus[3201061]: <trace> [438226.44968] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: changed-type 0x02 linked
libnm-dbus[3201061]: <trace> [438226.44969] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: unregister NMClient from D-Bus object
libnm-dbus[3201061]: <trace> [438226.44971] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: drop D-Bus instance
libnm-dbus[3201061]: <trace> [438226.44971] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager/Settings]: set D-Bus object state unlinked
libnm-dbus[3201061]: <trace> [438226.44972] nmclient[401639659459c316]: [nmclient] emit "connection-removed" signal for /org/freedesktop/NetworkManager/Settings/Connectio>
nmcli[out]: con-1: connection profile removed
libnm-dbus[3201061]: <trace> [438226.44992] nmclient[401639659459c316]: interfaces-removed: [/org/freedesktop/NetworkManager] receive interface remove event for interface>
libnm-dbus[3201061]: <trace> [438226.44994] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: changed-type 0x01 linked
libnm-dbus[3201061]: <trace> [438226.44995] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: changed-type 0x01 consumed
libnm-dbus[3201061]: <trace> [438226.44996] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: changed-type 0x02 linked
libnm-dbus[3201061]: <trace> [438226.44998] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: unregister NMClient from D-Bus object
libnm-dbus[3201061]: <trace> [438226.44999] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: drop D-Bus instance
libnm-dbus[3201061]: <trace> [438226.45000] nmclient[401639659459c316]: [/org/freedesktop/NetworkManager]: set D-Bus object state unlinked
libnm-dbus[3201061]: <trace> [438226.45001] nmclient[401639659459c316]: [nmclient] emit "device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
nmcli[out]: eth0: device removed
libnm-dbus[3201061]: <trace> [438226.45005] nmclient[401639659459c316]: [nmclient] emit "any-device-removed" signal for /org/freedesktop/NetworkManager/Devices/1
nmcli[out]: NetworkManager is stopped
libnm-dbus[3201061]: <debug> [438226.45545] nmclient[401639659459c316]: name owner changed: ":1.1" -> (null)
libnm-dbus[3201061]: <debug> [438226.45550] nmclient[401639659459c316]: release all
The fix is to accept the events in any order.
The test stub service watches stdin, and if it gets closed the service
will shut down. Note that the service does not catch any signals, so
sending a signal will kill the service right away.
The previous code first closed stdin, and then killed the process.
That can result in different outcomes on D-Bus. Usually the signal
gets received first, and the test service just drops off the bus.
Sometimes it notices the closing of stdin and shuts actively down.
That can make a difference, especially for the test_monitor() test which
runs the monitor while stopping the service.
We could just always kill the stub service to get consistent behavior.
However, that doesn't seem very useful. Instead, randomize the behavior
to easier see how the behavior differs.
The main purpose is to simplify printf debugging and manual testing. We
can now trivially patch the code so that all output from nmcli gets
(additionally) written to a file. That is useful when debugging a unit
test in "test-client.py". Thereby we can duplicate all messages via
nm_utils_print(), which is in sync with the debug messages from libnm
and which honors LIBNM_CLIENT_DEBUG_FILE.
These will replace the direct calls to g_print()/g_printerr() in nmcli.
There are two purposes.
1) the new macros embody the concept of "printing something from nmcli".
It means, we can `git grep` for those functions, and find all the
relevant places, without hitting the irrelevant ones (e.g. tests that
also use g_print()).
2) by having one place, we can trivially change it. That is useful for
printf debugging. For example, "test-client.py" runs nmcli and
captures and compares the output. With libnm we can set
LIBNM_CLIENT_DEBUG and LIBNM_CLIENT_DEBUG_FILE to print libnm debug
messages to a file. But we cannot trivially synchronize the messages
from nmcli with that output (also because they are consumed by the test
and not immediately accessible). This would be easy, if we temporarily
could patch nmc_print*() to also log to nm_utils_print(). The new macros
will allow doing that at one place.
For example, patch the "#if 0" and run:
$ LIBNM_CLIENT_DEBUG=trace \
LIBNM_CLIENT_DEBUG_FILE='xxx.%p' \
NMTST_USE_VALGRIND=1 \
LIBTOOL="/bin/sh ./libtool"
./src/tests/client/test-client.sh -- -k monitor
nmc_print() will be used for something else. Rename. Also,
nmc_print_table() is the better name anyway because the function does a
lot of formatting and not simple printf().
For debugging libnm, LIBNM_CLIENT_DEBUG can be very useful.
As the tests compare stdout/stderr from nmcli with expected output, just
enabling it will break the tests. However, in combination with
LIBNM_CLIENT_DEBUG_FILE this can be very useful.
Preserve and pass on the environment variables, if set.
With LIBNM_CLIENT_DEBUG we get debug logging for libnm, either to stdout
or to stderr.
"test-client.py" runs nmcli as a unit test. It thereby catches stdout
and stderr. That means, LIBNM_CLIENT_DEBUG would break the tests.
Now honor the LIBNM_CLIENT_DEBUG_FILE environment variable to specify a
file to which the debug logs get written. The pattern "%p" is replaced
by the process id.
As before, nm_utils_print(0, ...) also honors this environment variable
and uses the same logging destination.
Sort imports by name. Also avoid "from signal import SIGINT". I find
it ugly to import names in the current namespace. "SIGINT" should be
refered to by its full name, including the package/namespace.
This will allow to find some memory leaks and memory corruptions.
The bulk of the nmcli calls are still not hooked up with valgrind.
Since we call nmcli a thousand time, we could not just run valgrind with
all of them. We would have instead to enable it randomly. This is
more work.
The base class is not used, and it's not clear that it would be useful.
Sure, we could extend "test-client.py" will various non-nmcli tests. That
might make sense. And then it might make sense to have more unit test classes.
So far, we don't need that. Drop the unused base class NmTestBase.
Since glib 2.45, we are guaranteed that g_free() just calls free(), so
both can be used interchangeably. However, we still only depend on glib
2.40.
In any case, it's ugly to mix the two. Memory allocated by plain
malloc(), should be only freed with free(). The buffer in question comes
from readline, which allocates it using the system allocator.
Fixes: 995229181c ('cli: remove editor thread')
Such leaks show up in valgrind, and are simply bugs.
Also, various callers (not all of them, which is another bug!) like to
take ownership of the returned string and free it. That means, we leave
a dangling pointer in the global variable, which is very ugly and error
prone.
Also, the callers like to free the string with g_free(), which is not
appropriate for the "rl_string" memory which was allocated by readline.
It must be freed with free(). Avoid that, by cloning the string using
the glib allocator.
Fixes: 995229181c ('cli: remove editor thread')
The onlink flag is part of each next hop.
When NetworkManager configures ECMP routes, we won't support that. All
next hops of an ECMP route must share the same onlink flag. That is fine
and fixed by this commit.
What is not fine, is that we don't track the rtnh_flags flags in
NMPlatformIP4RtNextHop, and consequently our nmp_object_id_cmp() is
wrong.
Fixes: 5b5ce42682 ('nm-netns: track ECMP routes')
(cherry picked from commit 6ed966258c)
If the route with a next hop is already onlink, we don't need to add a
direct route to the gateway.
It also wouldn't work previously, because the onlink route to the
gateway that we would add, would have no gateway and the RTNH_F_ONLINK
set. Kernel would reject that with an error. We would have to clear the
RTNH_F_ONLINK flag, if there is no gateway.
(cherry picked from commit 93b46c8906)