It does not make sense to issue an error. This should be a helper function.
"NetworkManager[18341]: nm_clear_g_signal_handler: assertion 'G_IS_OBJECT (self)' failed"
error started since commit e6d7fee5a6 due to that.
The unmanaged-flag NM_UNMANAGED_EXTERNAL_DOWN is initially set during
nm_device_finish_init(). But it was only set if the device was down at
that point.
If due to a race the platform device was not yet initialized, a later
initialization in device_link_changed() would clear NM_UNMANAGED_PLATFORM_INIT.
If the device is not external-down (because it was already up during
nm_device_finish_init()), the device will be managed right away with
reason NM_DEVICE_STATE_REASON_NOW_MANAGED.
Together with commit e29ab54335, this
is a race that causes a failure to assume the external-down device.
https://bugzilla.redhat.com/show_bug.cgi?id=1269199
We get a lot of these debugging message, although the event is entirely
internal to NMLinuxPlatform and only interesting when debugging a problem
in platform itself.
Downgrade to TRACE level.
When setting the logging with omitting the domains, we would
use the previously set logging domains. That was wrong since
the addition of the 'KEEP' level:
(1) $ nmcli g l level INFO domains DNS,CORE
$ nmcli g l
LEVEL DOMAINS
INFO DNS,CORE
(2) $ nmcli g l level KEEP domains PPP:TRACE
$ nmcli g l
LEVEL DOMAINS
INFO PPP:TRACE,DNS,CORE
(3) $ nmcli g l level ERR
$ nmcli g l
LEVEL DOMAINS
ERR PPP:TRACE
with this change, command (3) effectively translates to:
$ nmcli g l level ERR domains PPP,DNS,CORE
$ nmcli g l
LEVEL DOMAINS
ERR PPP,DNS,CORE
"nm-logging.c" uses several global variables. As their name doesn't
indicate that they are global variables, this is quite confusing.
Pack them all into a struct @global, which effectively puts the
variables into a separate namespace.
When debug-logging for platform is enabled, every access to sysctl
is cached (to log the last values).
This cache can grow quite large if the system has a large number of
interfaces (e.g. docker creating veth pairs for each container).
We already used to clear the cache, when we were about to access
sysctl *and* logging was disabled in the meantime.
Now, when logging setup changes, immediately clear the cache.
Having "nm-logging.c" call into platform code is a bit of a hack
and a better design would be to have logging code emit a signal to
which platform would subscribe. But that seems to involve much
more code (especially, as no other users care about such a signal
and because nm-logging is not a GObject).
Also, log a warning when the cache grows large to inform the user
about the cache and what he can do to clear it. The extra effort to
clear the cache when changing logging setup is done so that we do
what we tell the user: changing the logging level, will clear the
cache -- right away, not some time later when the next message is
logged.
Without this, the user cannot configure only certain logging domains
without touching them all.
E.g.
# nmcli general logging level DEBUG domains PLATFORM
will disable all non-PLATFORM domains.
Well, the user can do:
# nmcli general logging level INFO domains PLATFORM:DEBUG
# nmcli general logging level DEBUG domains ALL:INFO,PLATFORM
but in this case all non-PLATFORM domains are reset explicitly.
Now the user can:
# nmcli general logging level KEEP domains PLATFORM:DEBUG
# nmcli general logging level DEBUG domains ALL:KEEP,PLATFORM
which will only change the platform domain.
When a VLAN has a bond as parent device the MAC address of the bond
may change when other devices are enslaved and then the VLAN would
have a MAC which is different from parent's one.
Let the VLAN device listen for changes in hw-address property of
parent and update its MAC address accordingly.
https://bugzilla.redhat.com/show_bug.cgi?id=1264322
Executing:
# brctl addbr lbr0
# ip addr add 10.1.1.1/24 dev lbr0
# ip link set lbr0 up
can result in a race so that NetworkManager would manage the device
(and clear the IP addresses).
It happens, when NetworkManager first receives platform signals that
the device is already up:
signal: link changed: 11: lbr0 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1500 arp 1 bridge* not-init addrgenmode eui64 addr D2:A1:B4:17:18:F2 driver bridge
Note that the device is still unknown via udev (not-init). The
unmanaged-state NM_UNMANAGED_EXTERNAL_DOWN gets cleared, but the
device still stays unmanaged.
Only afterwards the device is known in udev:
signal: link changed: 11: lbr0 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1500 arp 1 bridge* init addrgenmode eui64 addr D2:A1:B4:17:18:F2 driver bridge
At this point, we also clear NM_UNMANAGED_PLATFORM_INIT, making
the device managed with reason NM_DEVICE_STATE_REASON_NOW_MANAGED.
That results in managing the external device.
Fix that by only clearing NM_UNMANAGED_EXTERNAL_DOWN after the device
is no longer NM_UNMANAGED_PLATFORM_INIT.
https://bugzilla.redhat.com/show_bug.cgi?id=1269199
Dracut when faced with an ipv6 only setup during kickstart will generate a ifcfg
file that sets the ipv4 address things to null but sets BOOTPROTO=static. This
makes network manager screw up because it expects an ipv4 address to be set.
Instead deal with this case by checking if we have any ipv4 addrs set, and if
not just disable ipv4. This fixes our inability to kickstart in our ipv6 only
clusters. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
https://mail.gnome.org/archives/networkmanager-list/2015-October/msg00015.html
If the lease file doesn't exist sd_dhcp_lease_load() still indicates
success while not returning any lease, resulting in an assertion fail
when we try to generate an IP4Config:
#0 g_logv (log_domain=0x7f309b45dba0 "NetworkManager", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7ffc815c38e0) at gmessages.c:1046
#1 0x00007f3097d4fa3f in g_log (log_domain=log_domain@entry=0x7f309b45dba0 "NetworkManager", log_level=log_level@entry=G_LOG_LEVEL_CRITICAL, format=format@entry=0x7f3097dbd73d "%s: assertion '%s' failed")
at gmessages.c:1079
#2 0x00007f3097d4fa79 in g_return_if_fail_warning (log_domain=log_domain@entry=0x7f309b45dba0 "NetworkManager", pretty_function=pretty_function@entry=0x7f309b456b30 <__FUNCTION__.31435> "lease_to_ip4_config",
expression=expression@entry=0x7f309b456417 "lease != NULL") at gmessages.c:1088
#3 0x00007f309b35454a in lease_to_ip4_config (lease=0x0, options=options@entry=0x0, default_priority=default_priority@entry=100, log_lease=log_lease@entry=0, error=0x0) at dhcp-manager/nm-dhcp-systemd.c:230
#4 0x00007f309b3546a0 in nm_dhcp_systemd_get_lease_ip_configs (iface=<optimized out>, uuid=<optimized out>, ipv6=<optimized out>, default_route_metric=100) at dhcp-manager/nm-dhcp-systemd.c:397
#5 0x00007f309b35ed4a in find_ip4_lease_config (ext_ip4_config=0x7f309cbe8640, connection=0x7f309cb98250, self=0x7f309cbfb7b0) at devices/nm-device.c:7215
#6 capture_lease_config (ext_ip6_config=0x0, out_ip6_config=0x0, out_ip4_config=0x7f309cbfb5d0, ext_ip4_config=0x7f309cbe8640, self=0x7f309cbfb7b0) at devices/nm-device.c:7289
#7 update_ip4_config (self=self@entry=0x7f309cbfb7b0, initial=initial@entry=1) at devices/nm-device.c:7323
#8 0x00007f309b3608be in nm_device_capture_initial_config (self=self@entry=0x7f309cbfb7b0) at devices/nm-device.c:7428
#9 0x00007f309b3dbaad in get_existing_connection (out_generated=<synthetic pointer>, device=0x7f309cbfb7b0, manager=0x7f309cb7f150) at nm-manager.c:1550
#10 recheck_assume_connection (device=device@entry=0x7f309cbfb7b0, user_data=user_data@entry=0x7f309cb7f150) at nm-manager.c:1689
#11 0x00007f309b3dc62d in add_device (self=0x7f309cb7f150, device=0x7f309cbfb7b0, try_assume=1) at nm-manager.c:1875
#12 0x00007f309b3dcd10 in platform_link_added (self=self@entry=0x7f309cb7f150, ifindex=<optimized out>, plink=plink@entry=0x7f309cbcff40) at nm-manager.c:1984
#13 0x00007f309b3df7d4 in platform_query_devices (self=0x7f309cb7f150) at nm-manager.c:2056
#14 nm_manager_start (self=0x7f309cb7f150) at nm-manager.c:4220
#15 0x00007f309b341f2c in main (argc=1, argv=0x7ffc815c3f68) at main.c:494
schedule_stage3() used to set the firewall before really
scheduling the stage3. In that case, fw_change_zone_cb() would
then directly call:
activation_source_schedule (self, activate_stage3_ip_config_start, AF_INET);
This was different from all other places. Usually, only the
nm_device_schedule_*() functions would directly call to
activation_source_schedule().
Change this, to behave similar as when we wait for master-ready
while scheduling stage2. As such, it is more idiomatic, and
it would still work correctly even if there were multiple conditions
that would block scheduling-stage3.
Instead of adding explicit logging lines while scheduling and processing
the activation-stages, only log them in activation_source_schedule()
and activation_source_handle_cb().
Effectively, lines like:
"Activation: Stage 1 of 5 (Device Prepare) started..."
are now:
"activation-stage: invoke activate_stage1_device_prepare,2 (id x)"
The function name is logged, so change them to be more suitable:
- drop the "nm_device_" prefix. It is redundant in the logging,
and usually our static functions don't have an nm-prefix.
- have all functions now start with "activate_stagex_". The
stage number corresponds to what we are currently logging
during activation:
"Activation: Stage 1 of 5 (Device Prepare)..."
No other functions start with a "activate_stagex_" prefix,
so it's easy to grep.
Every activation-source handler first cleared the current
activation-source (activation_source_clear()) and later
returned FALSE/G_SOURCE_REMOVE.
Do not directly attach the handlers to g_idle_add().
Instead wrap the actual handler. This wrapper
- always clears the activation source first
- always returns G_SOURCE_REMOVE
- does trace logging about invoking the handler
This also avoids a possibility for a bug where the handler returns
G_SOURCE_REMOVE, without also clearing activation_source_clear().
Some initscripts variables can use "0" or "1" instead of more common
"yes", "no", for example REORDER_HDR.
And we also write REORDER_HDR=0|1 in writer.c, so we did not read REODER_HDR
correctly.
Fixes: ccea442504
The kernel defaults REORDER_HDR to 1 when creating a new VLAN, but
NetworkManager's VLAN flags property defaulted to 0. Thus REORDER_HDR was not
set for NM-created VLANs with default values.
We want to match the kernel default, so we change the default value for the
vlan.flags property. However, we do not want to change the flags for existing
connections if the property is missing in connection files. Thus we have to
update plugins for that. We also make sure that vlan.flags is always written
by 'keyfile' when the value is default. That way new connections have flags
property explicitly written and it will be loaded as expected.
https://bugzilla.redhat.com/show_bug.cgi?id=1250225
Don't handle master-ready at the beginning of stage2, but instead while
scheduling (and then possibly delaying the scheduling of stage2).
This seems more idiomatic:
When inside a stage and your part is done: call schedule-next-stage.
That is, always schedule the next stage, not the current one.
schedule-next-stage then might delay to really scheduling until the
device is ready for the next state.
Fixes: 85ac903bb8
During stage2, if the slave detected that it would need to wait for
the master, it would return FALSE (which removes the g-idle-handler).
However, it would not clear the activation-source, so later, when
the master becomes ready, its attempt to schedule stage2 again would
result in an error-log and the idle-handler would not be scheduled
again.
Fixes: 85ac903bb8https://bugzilla.redhat.com/show_bug.cgi?id=1268797https://bugzilla.redhat.com/show_bug.cgi?id=1183444
This is intentionally IPv4 specific since this is used for a quick fallback to
method=link-local -- something that's not needed for IPv6 since the link local
address is always there.
https://bugzilla.redhat.com/show_bug.cgi?id=1262922
Otherwise the uninitializeded objects could be prematurely signalled if their
paths are seen twice in quick succession. This happens when you have ethernet
hardware and add an ethernet connection -- it's immediatelly added to
AvialableConnections and the property reload signals the object addition
before the NMRemoteSettings's GetSettings() finishes:
# nmcli c add type ethernet autoconnect no ifname '*'
(process:4610): libnm-CRITICAL **: nm_connection_get_id: assertion 's_con != NULL' failed
Connection '(null)' ((null)) successfully added.
#
https://bugzilla.gnome.org/show_bug.cgi?id=754794