core: better handle ignore-carrier=no for bond/bridge/team devices

By default, bond/bridge/team devices ignore carrier, and so do their
ports. However, it can make sense to set '[device*].ignore-carrier' for
the controller device. Meaningfully support that.

This is a follow up to commit 8c91422954 ('device: handle carrier
changes for master device differently'), which didn't fully solve the
problem.

What already works is that when you set ignore-carrier=no for the
controller, then after loss of carrier and a carrier wait timeout, the
controller and ports go down. If both the controller and port profiles
have autoconnect disabled, they stay down and that's it. It works as
expected, but is not very useful, because when we want to automatically
react to carrier loss, we also want to automatically reconnect.

For controller profiles, carrier only makes sense when ports are
attached. However, we can (auto) activate controller profiles without
ports. So when the user enables autoconnect for the controller profile,
the profile will eagerly reconnect: after loss of carrier, the device
goes down and reconnects right away. So when configuring a bond with
ignore-carrier=no and autoconnect=yes, the sensible thing happens (an
immediate reconnect). That is just not a useful configuration.

To usefully configure ignore-carrier=no for a controller device,
autoconnect must be disabled on the controller profile while being
enabled on the ports. After all, it's the ports that will autoconnect
based on the carrier state and bring up the controller with them.
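
As a rough illustration of that setup, the fragments below show the
three pieces of configuration involved. The interface names (bond0,
eth0), the profile ids, and the "-bond0-carrier" section suffix are made
up for the example, and each commented part would live in its own file.

```ini
# Part 1: NetworkManager.conf (or a conf.d snippet).
# Stop ignoring carrier on the controller device.
[device-bond0-carrier]
match-device=interface-name:bond0
ignore-carrier=no

# Part 2: keyfile of the controller profile.
# Autoconnect is disabled so the controller does not eagerly reconnect.
[connection]
id=bond0
type=bond
interface-name=bond0
autoconnect=false

# Part 3: keyfile of one port profile.
# Autoconnect is enabled, so the port brings up the controller
# once it has carrier.
[connection]
id=bond0-port-eth0
type=ethernet
interface-name=eth0
master=bond0
slave-type=bond
autoconnect=true
```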

Note that at the moment when a port decides to autoconnect, the
controller profile is not yet selected. That only happens later during
_internal_activate_device(), after searching for it with find_master().
At that point, the port profile checks whether it should autoconnect
based on its own carrier state, and aborts if not.
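
The order of those checks can be sketched as a pure function. This is a
simplified model that follows the structure of _check_autoconnect_port()
from this commit's diff; the struct, its field names, and the function
name are hypothetical and stand in for the real NetworkManager objects.

```c
#include <stdbool.h>

/* Hypothetical, flattened view of the inputs to the port check. */
struct port_check {
    bool is_autoconnect;            /* activation reason is autoconnect */
    bool has_controller;            /* the profile is a port of a controller */
    bool device_is_real;            /* the device exists, carrier is known */
    bool has_ifindex;               /* the device has an ifindex */
    bool has_carrier;               /* the port device has carrier */
    bool controller_autoconnect;    /* controller profile has autoconnect=yes */
    bool controller_ignore_carrier; /* effective ignore-carrier of controller */
};

/* Returns true if activation may proceed, false if it should abort. */
static bool
port_may_autoconnect(const struct port_check *c)
{
    if (!c->is_autoconnect)
        return true; /* explicit activation */
    if (!c->has_controller)
        return true; /* not a port */
    if (!c->device_is_real || !c->has_ifindex)
        return true; /* carrier state is unknown */
    if (c->has_carrier)
        return true;
    if (c->controller_autoconnect)
        return true; /* ignore-carrier=no is only honored without it */
    if (c->controller_ignore_carrier)
        return true; /* default behavior: carrier is ignored */
    return false;
}
```

Only the last case aborts: an autoconnecting port without carrier whose
controller has autoconnect disabled and ignore-carrier=no.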

If autoconnect is aborted due to lack of carrier, the profile gets
blocked from autoconnect with reason "failed". Hence, when the carrier
returns, we need to clear any "failed" blocked reasons and schedule
another autoconnect check.
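
A minimal sketch of that idea, assuming the blocked reasons are kept as
a bit mask: clearing a reason reports whether anything changed, and a
recheck is wanted if either the device state already asked for one or a
"failed" block was actually lifted. All names here are hypothetical and
only loosely modeled on the carrier_changed() hunk in this commit.

```c
#include <stdbool.h>

/* Hypothetical subset of autoconnect blocked reasons. */
enum blocked_reason {
    BLOCKED_NONE         = 0,
    BLOCKED_USER_REQUEST = 1 << 0,
    BLOCKED_FAILED       = 1 << 1,
};

/* Sets or clears reason bits. Returns true only if the mask changed. */
static bool
blocked_reason_set(unsigned *blocked, unsigned reason, bool set)
{
    unsigned updated = set ? (*blocked | reason) : (*blocked & ~reason);

    if (updated == *blocked)
        return false;
    *blocked = updated;
    return true;
}

/* Called when carrier returns. Returns true if an autoconnect
 * recheck should be scheduled. */
static bool
carrier_up_needs_recheck(unsigned *blocked, bool state_requests_recheck)
{
    bool recheck = state_requests_recheck;

    /* Lift only the "failed" block. Other reasons stay in place. */
    if (blocked_reason_set(blocked, BLOCKED_FAILED, false))
        recheck = true;
    return recheck;
}
```

Collecting the result in one flag means the recheck is scheduled at most
once per carrier event, whichever condition triggered it.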

Note that this really only works if the port is itself a simple device,
like an Ethernet device. If the port is itself a software device (like a
bond or a VLAN), then the carrier state in _internal_activate_device()
is unknown, and we cannot avoid autoconnect. It's unclear how that could
make sense, if at all.

This setup can be combined with "connection.autoconnect-slaves=yes". In
that case, the first port that gets carrier autoconnects, bringing up
the controller too. Usually the other ports that don't have carrier
would not autoconnect, but with autoconnect-slaves they will. The effect
is that we autoconnect whenever any of the ports has carrier, and then
we immediately also bring up the ports that don't have carrier (which
we usually would not).
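
The combined effect can be illustrated with a toy model. It only encodes
the outcome described above (which ports end up active for a given set
of carrier states), not how NetworkManager gets there; the function name
and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fills active_out[i] with whether port i ends up active and returns
 * the number of active ports. Nothing comes up unless at least one
 * port has carrier. */
static size_t
ports_brought_up(const bool *has_carrier,
                 size_t      n,
                 bool        autoconnect_slaves,
                 bool       *active_out)
{
    bool   any_carrier = false;
    size_t count       = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (has_carrier[i])
            any_carrier = true;
    }

    for (i = 0; i < n; i++) {
        /* A port comes up on its own carrier, or is pulled in by the
         * controller when autoconnect-slaves is enabled. */
        active_out[i] = any_carrier && (has_carrier[i] || autoconnect_slaves);
        if (active_out[i])
            count++;
    }
    return count;
}
```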

https://bugzilla.redhat.com/show_bug.cgi?id=2156684

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1658
Thomas Haller 2023-06-13 12:52:20 +02:00
parent e57b3c8072
commit b9231a0e18
2 changed files with 81 additions and 1 deletions

@@ -6639,6 +6639,8 @@ carrier_changed(NMDevice *self, gboolean carrier)
     }
     if (carrier) {
+        gboolean recheck_auto_activate = FALSE;
         if (priv->state == NM_DEVICE_STATE_UNAVAILABLE) {
             nm_device_queue_state(self,
                                   NM_DEVICE_STATE_DISCONNECTED,
@@ -6649,8 +6651,18 @@ carrier_changed(NMDevice *self, gboolean carrier)
              * when the carrier appears, auto connections are rechecked for
              * the device.
              */
-            nm_device_recheck_auto_activate_schedule(self);
+            recheck_auto_activate = TRUE;
         }
+        if (nm_manager_devcon_autoconnect_blocked_reason_set(
+                nm_device_get_manager(self),
+                self,
+                NULL,
+                NM_SETTINGS_AUTOCONNECT_BLOCKED_REASON_FAILED,
+                FALSE))
+            recheck_auto_activate = TRUE;
+        if (recheck_auto_activate)
+            nm_device_recheck_auto_activate_schedule(self);
     } else {
         if (priv->state == NM_DEVICE_STATE_UNAVAILABLE) {
             if (priv->queued_state.id && priv->queued_state.state >= NM_DEVICE_STATE_DISCONNECTED)

@@ -5622,6 +5622,62 @@ active_connection_parent_active(NMActiveConnection *active,
     unmanaged_to_disconnected(device);
 }
+static gboolean
+_check_autoconnect_port(NMActiveConnection *active,
+                        NMSettingsConnection *master_connection,
+                        NMDevice *master_device,
+                        NMActiveConnection *master_ac)
+{
+    NMSettingConnection *s_con;
+    NMDevice *device;
+    if (nm_active_connection_get_activation_reason(active) != NM_ACTIVATION_REASON_AUTOCONNECT) {
+        /* This is an explicit activation. Proceed. */
+        return TRUE;
+    }
+    if (!master_connection) {
+        /* This is not a port. Proceed. */
+        return TRUE;
+    }
+    device = nm_active_connection_get_device(active);
+    if (!nm_device_is_real(device)) {
+        /* The device is not real. We don't know about the carrier. Proceed. */
+        return TRUE;
+    }
+    if (nm_device_get_ifindex(device) <= 0) {
+        /* The device has no ifindex. It has no concept of carrier. Proceed. */
+        return TRUE;
+    }
+    if (nm_device_has_carrier(device)) {
+        /* The device has carrier. Proceed. */
+        return TRUE;
+    }
+    s_con = nm_settings_connection_get_setting(master_connection, NM_META_SETTING_TYPE_CONNECTION);
+    if (nm_setting_connection_get_autoconnect(s_con)) {
+        /* The controller profile has autoconnect enabled. Here we want to honor
+         * "ignore-carrier=no", which -- as configuration -- only makes sense for
+         * controllers that have autoconnect disable. Proceed. */
+        return TRUE;
+    }
+    if (nm_config_data_get_ignore_carrier_for_port(
+            NM_CONFIG_GET_DATA,
+            nm_setting_connection_get_interface_name(s_con),
+            nm_setting_connection_get_connection_type(s_con))) {
+        /* We ignore carrier on the master (as we would do by default). Proceed. */
+        return TRUE;
+    }
+    return FALSE;
+}
 static gboolean
 _internal_activate_device(NMManager *self, NMActiveConnection *active, GError **error)
 {
@@ -5702,6 +5758,18 @@ _internal_activate_device(NMManager *self, NMActiveConnection *active, GError **
         return FALSE;
     }
+    if (!_check_autoconnect_port(active, master_connection, master_device, master_ac)) {
+        /* Usually, port and controller devices can (auto)connect without carrier. However,
+         * the controller has "ignore-carrier=no" configured. If the port autoconnects,
+         * has no carrier and the controller has ignore-carrier=no, then autoconnect
+         * is going to fail. */
+        g_set_error(error,
+                    NM_MANAGER_ERROR,
+                    NM_MANAGER_ERROR_DEPENDENCY_FAILED,
+                    "port has no carrier and controller does not ignore carrier");
+        return FALSE;
+    }
     /* Create any backing resources the device needs */
     if (!nm_device_is_real(device)) {
         NMDevice *parent;