ifcfg: better handle non-full-membership PKEY_ID with new PKEY_ID_NM variable

Infiniband profiles can have a p-key set. Both in kernel API
("create_child" sysctl) and in NetworkManager API, that key can range
from 0x0001 to 0xFFFF (0x8000 excluded). NetworkManager does not support
renaming the interface, so kernel always assigns the interface name
"$PHYSDEV.$PKEY_ID" (with $PKEY_ID as 4 character hex digits).

Note that the highest bit in the p-key (0x8000) is the full-membership
flag. Internally, kernel only supports full-membership so when we create
for example "ib0.00c1" and "ib0.80c1" interfaces, their actually used
p-key is in both cases 0x80c1 and you can see it with `ip -d link`.
Nonetheless, kernel and NetworkManager allow to configure the p-key
without the highest bit set, and the result differs in the interface
name.

Note that initscripts' ifup-ib0 would always internally coerce the
PKEY_ID variable to have the high bit set ([1]). It also would require
that the `DEVICE=` variable is specified and matches the expected
interface name. So both these configurations are identical and valid:

  DEVICE=ib0.80c1
  PHYSDEV=ib0
  PKEY_ID=0x80c1

and

  DEVICE=ib0.80c1
  PHYSDEV=ib0
  PKEY_ID=0x00c1

Historically, NetworkManager would also implement the same restrictions
([2], [3], [4]). That meant, not all valid NetworkManager infiniband
profiles could be expressed as  ifcfg file. For example, NetworkManager
allows to have "connection.interface-name" (`DEVICE=`) unset (which
ifup-ib and ifcfg reader did not allow). Also, NetworkManager would
allow configuring a "infiniband.p-key" without full membership flag, and
the reader would mangle that.

This caused various problems to the point that when you configure an
infiniband.p-key with a non-full-membership key, the ifcfg-rh written by
NetworkManager was invalid. Either, you could leave
"connection.interface-name" unset, but then the reader would complain
about missing `DEVICE=`. Or, we could write `DEVICE=ib0.00c1;
PKEY_ID=0x00c1`, which was invalid as we expected `DEVICE=ib0.80c1`.

This was addressed by rhbz 2122703 ([5]). The fix was to

  - not require a `DEVICE=` ([6]).
  - don't mangle the `PKEY_ID=` in the reader ([7]).

which happened in 1.41.2 and 1.40.2 (rhel-8.8).

With this change, we could persist any valid infiniband profile to ifcfg
format. We also could read back any valid ifcfg file that NetworkManager
would have written in the past (note that it could not write valid ifcfg
files previously, if the p-key didn't have the full-membership key set).

The problem is, that users were used to edit ifcfg files by hand, and
users would have files with:

  DEVICE=ib0.80c1
  PHYSDEV=ib0
  PKEY_ID=0x00c1

This files had worked before, but now failed to verify as we would
expect `DEVICE=ib0.00c1`. Also, there was a change in behavior that
PKEY_ID is now interpreted without the high bit set. This is reported as
rhbz 2209164 ([8]).

We will do several things to fix that:

1) we now normalize the "connection.interface-name" to be valid. It was
  not useful to set it anyway, as it was redundant. Complaining about a
  redundant setting, which makes little sense to configure, is not useful.
  This is done by [9].

2) we now again treat PKEY_ID= as if it had 0x8000 flag set. This was done by
  [10].

With step 1) and 2), we are able to read any existing ifcfg files out
there in the way we did before 1.41.2.

There is however one piece missing. When we now create a profile using
nmcli/libnm/D-Bus, which has a non-full-membership p-key, then the
profile gets mangled in the process.

If the user uses NetworkManager API to configure an interface and
chooses a non-full-membership p-key, then this should work the same as
with keyfile plugin (or on rhel-9, where keyfile is the default). Note
that before 1.41.2 it didn't work at all, when the user used ifcfg-rh
backend. Likely(?) there are no users who rely on creating such a profile
with nmcli/libnm/D-Bus and expect to automatically have the p-key
normalized. That didn't work before 1.41.2 and didn't behave that way
between 1.41.2 and now.

This patch fixes that by introducing a new key PKEY_ID_NM= for holding
the real p-key. Now ifcfg backend is consistent with handling infiniband
profiles, and old, hand-written ifcfg files still work as before.

There is of course change in behavior, that ifcfg files between 1.41.2
and now were interpreted differently. But that is bug 2209164 ([8]) and
what we fix here.

For now strong reasons, we keep writing the PKEY_ID to file too. It's
redundant, but that is what a human might expect there.

[1]  05333c3602/f/rdma.ifup-ib (_75)
[2]  https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/1.40.0/src/core/settings/plugins/ifcfg-rh/nms-ifcfg-rh-reader.c#L5386
[3]  cb5606cf1c (a7a78fccb2c8c945fd09038656ae734c1b0349ab_3493_3532)
[4]  cb5606cf1c (a7a78fccb2c8c945fd09038656ae734c1b0349ab_3493_3506)
[5]  https://bugzilla.redhat.com/show_bug.cgi?id=2122703
[6]  4c32dd9d25
[7]  a4fe16a426
[8]  https://bugzilla.redhat.com/show_bug.cgi?id=2209164
[9]  4610fd67e6
[10] f8e5e07355
This commit is contained in:
Thomas Haller 2023-05-30 17:51:02 +02:00
parent 0d0704eaa0
commit 5e3e38f291
No known key found for this signature in database
GPG key ID: 29C2366E4DFC5728
8 changed files with 42 additions and 68 deletions

View file

@ -5391,6 +5391,7 @@ parse_infiniband_p_key(shvarFile *ifcfg, int *out_p_key, char **out_parent, GErr
gs_free char *physdev = NULL;
gs_free char *pkey_id = NULL;
int id;
int fixup_id = 0;
physdev = svGetValueStr_cp(ifcfg, "PHYSDEV");
if (!physdev) {
@ -5401,7 +5402,14 @@ parse_infiniband_p_key(shvarFile *ifcfg, int *out_p_key, char **out_parent, GErr
return FALSE;
}
pkey_id = svGetValueStr_cp(ifcfg, "PKEY_ID");
pkey_id = svGetValueStr_cp(ifcfg, "PKEY_ID_NM");
if (!pkey_id) {
/* Only check for "$PKEY_ID". That key is interpreted as having the
* full membership flag set ("fixup_id"). */
fixup_id = 0x8000;
pkey_id = svGetValueStr_cp(ifcfg, "PKEY_ID");
}
if (!pkey_id) {
g_set_error(error,
NM_SETTINGS_ERROR,
@ -5420,23 +5428,7 @@ parse_infiniband_p_key(shvarFile *ifcfg, int *out_p_key, char **out_parent, GErr
return FALSE;
}
/* The highest bit 0x8000 indicates full membership, which kernel always
* automatically sets.
*
* NetworkManager supports p-keys without the high bit set. That affects
* the interface name (nm_net_devname_infiniband()) and is what
* we write to "create_child"/"delete_child" sysctl. Kernel will honor
* such p-keys for the interface name, but for other purposes it adds the
* highest bit. That makes using p-keys without the highest bit odd.
*
* Historically, /etc/sysconfig/network-scripts/ifup-ib would always add "|=0x8000".
* The reader does that too.
*
* Note that this means ifcfg cannot handle p-keys without the highest bit set,
* and when trying to store that to ifcfg format, the profile will be mangled/modified
* by the ifcg plugin (unlike keyfile backend, which preserves the original p-key value).
*/
id |= 0x8000;
id |= fixup_id;
*out_p_key = id;
*out_parent = g_steal_pointer(&physdev);

View file

@ -1033,6 +1033,7 @@ const NMSIfcfgKeyTypeInfo nms_ifcfg_well_known_keys[] = {
_KEY_TYPE("PHYSDEV", NMS_IFCFG_KEY_TYPE_IS_PLAIN),
_KEY_TYPE("PKEY", NMS_IFCFG_KEY_TYPE_IS_PLAIN),
_KEY_TYPE("PKEY_ID", NMS_IFCFG_KEY_TYPE_IS_PLAIN),
_KEY_TYPE("PKEY_ID_NM", NMS_IFCFG_KEY_TYPE_IS_PLAIN),
_KEY_TYPE("PMF", NMS_IFCFG_KEY_TYPE_IS_PLAIN),
_KEY_TYPE("PORTNAME", NMS_IFCFG_KEY_TYPE_IS_PLAIN),
_KEY_TYPE("POWERSAVE", NMS_IFCFG_KEY_TYPE_IS_PLAIN),

View file

@ -33,7 +33,7 @@ typedef struct {
NMSIfcfgKeyTypeFlags key_flags;
} NMSIfcfgKeyTypeInfo;
extern const NMSIfcfgKeyTypeInfo nms_ifcfg_well_known_keys[263];
extern const NMSIfcfgKeyTypeInfo nms_ifcfg_well_known_keys[264];
const NMSIfcfgKeyTypeInfo *nms_ifcfg_well_known_key_find_info(const char *key, gssize *out_idx);

View file

@ -1054,30 +1054,24 @@ write_infiniband_setting(NMConnection *connection,
p_key = nm_setting_infiniband_get_p_key(s_infiniband);
if (p_key != -1) {
/* The reader normalizes KKEY_ID with |=0x8000. Also do that when
* writing the profile so that what we write, is consistent with what
* we would read. */
p_key |= 0x8000;
svSetValueStr(ifcfg, "PKEY", "yes");
svSetValueInt64(ifcfg, "PKEY_ID", p_key);
if (!NM_FLAGS_HAS(p_key, 0x8000)) {
/* initscripts' ifup-ib used to always interpret the PKEY_ID with
* the full membership flag (0x8000) set. For compatibility, we do
* interpret PKEY_ID as having that flag set.
*
* However, now we want to persist a p-key which doesn't have the
* flag. Use a NetworkManager specific variable for that. This configuration
* is not supported by initscripts' ifup-ib.
*/
svSetValueInt64(ifcfg, "PKEY_ID_NM", p_key);
}
parent = nm_setting_infiniband_get_parent(s_infiniband);
svSetValueStr(ifcfg, "PHYSDEV", parent);
if (parent && nm_connection_get_interface_name(connection)) {
char name[NM_IFNAMSIZ];
/* The connection.interface-name depends on the p-key. Also,
* nm_connection_normalize() will automatically adjust the
* interface-name to match the p-key.
*
* As we patched the p-key above, also anticipate that change, and
* don't write a DEVICE= to the file, which would we normalize
* differently, when reading it back. */
nm_net_devname_infiniband(name, parent, p_key);
*out_interface_name = g_strdup(name);
}
}
svSetValueStr(ifcfg, "TYPE", TYPE_INFINIBAND);

View file

@ -8446,7 +8446,6 @@ test_write_infiniband(gconstpointer test_data)
const int TEST_IDX = GPOINTER_TO_INT(test_data);
nmtst_auto_unlinkfile char *testfile = NULL;
gs_unref_object NMConnection *connection = NULL;
gs_unref_object NMConnection *expected = NULL;
gs_unref_object NMConnection *reread = NULL;
gboolean reread_same = FALSE;
NMSettingConnection *s_con;
@ -8527,32 +8526,20 @@ test_write_infiniband(gconstpointer test_data)
nmtst_assert_connection_verifies(connection);
if (p_key != -1 && p_key < 0x8000) {
expected = nm_simple_connection_new_clone(connection);
g_object_set(nm_connection_get_setting(expected, NM_TYPE_SETTING_INFINIBAND),
NM_SETTING_INFINIBAND_P_KEY,
(int) (p_key | 0x8000),
NULL);
} else
expected = g_object_ref(connection);
_writer_new_connection_reread(connection,
TEST_SCRATCH_DIR,
&testfile,
NO_EXPECTED,
&reread,
&reread_same);
_assert_reread_same(expected, reread);
if (p_key == -1 || p_key > 0x8000)
g_assert(reread_same);
else
g_assert(!reread_same);
_assert_reread_same(connection, reread);
g_assert(reread_same);
g_assert_cmpstr(interface_name, ==, nm_connection_get_interface_name(reread));
g_assert_cmpint(nm_setting_infiniband_get_p_key(
_nm_connection_get_setting(reread, NM_TYPE_SETTING_INFINIBAND)),
==,
p_key == -1 ? -1 : (p_key | 0x8000));
p_key);
}
static void

View file

@ -423,31 +423,31 @@ nm_setting_infiniband_class_init(NMSettingInfinibandClass *klass)
/**
* NMSettingInfiniband:p-key:
*
* The InfiniBand P_Key to use for this device. A value of -1 means to use
* the default P_Key (aka "the P_Key at index 0"). Otherwise, it is a
* The InfiniBand p-key to use for this device. A value of -1 means to use
* the default p-key (aka "the p-key at index 0"). Otherwise, it is a
* 16-bit unsigned integer, whose high bit 0x8000 is set if it is a "full
* membership" P_Key. The values 0 and 0x8000 are not allowed.
* membership" p-key. The values 0 and 0x8000 are not allowed.
*
* With the p-key set, the interface name is always "$parent.$p_key".
* Setting "connection.interface-name" to another name is not supported.
*
* Note that kernel will internally always set the full membership bit,
* although the interface name does not reflect that. Thus, not setting
* the high bit is probably not useful.
*
* If the profile is stored in ifcfg-rh format, then the full membership
* bit is automatically added. To get consistent behavior, it is
* best to only use p-key values with the full membership bit set.
* although the interface name does not reflect that. Usually the user
* would want to configure a full membership p-key with 0x8000 flag set.
**/
/* ---ifcfg-rh---
* property: p-key
* variable: PKEY_ID (and PKEY=yes)
* variable: PKEY_ID or PKEY_ID_NM(*) (requires PKEY=yes)
* default: PKEY=no
* description: InfiniBand P_Key. The value can be a hex number prefixed with "0x"
* or a decimal number.
* When PKEY_ID is specified, PHYSDEV and DEVICE also must be specified.
* When PKEY_ID is specified, PHYSDEV must be specified.
* Note that ifcfg-rh format will always automatically set the full membership
* bit 0x8000. Other p-key cannot be stored.
* flag 0x8000 for the PKEY_ID variable. To express IDs without the full membership
* flag, use PKEY_ID_NM. Note that kernel internally treats the interface as
* having the full membership flag set, this mainly affects the interface name.
* For the ifcfg file to be supported by initscripts' ifup-ib, the DEVICE=
* must always be set. NetworkManager does not require that.
* example: PKEY=yes PKEY_ID=2 PHYSDEV=mlx4_ib0 DEVICE=mlx4_ib0.8002
* ---end---
*/

View file

@ -155,7 +155,7 @@
#define DESCRIBE_DOC_NM_SETTING_GSM_USERNAME N_("The username used to authenticate with the network, if required. Many providers do not require a username, or accept any username. But if a username is required, it is specified here.")
#define DESCRIBE_DOC_NM_SETTING_INFINIBAND_MAC_ADDRESS N_("If specified, this connection will only apply to the IPoIB device whose permanent MAC address matches. This property does not change the MAC address of the device (i.e. MAC spoofing).")
#define DESCRIBE_DOC_NM_SETTING_INFINIBAND_MTU N_("If non-zero, only transmit packets of the specified size or smaller, breaking larger packets up into multiple frames.")
#define DESCRIBE_DOC_NM_SETTING_INFINIBAND_P_KEY N_("The InfiniBand P_Key to use for this device. A value of -1 means to use the default P_Key (aka \"the P_Key at index 0\"). Otherwise, it is a 16-bit unsigned integer, whose high bit 0x8000 is set if it is a \"full membership\" P_Key. The values 0 and 0x8000 are not allowed. With the p-key set, the interface name is always \"$parent.$p_key\". Setting \"connection.interface-name\" to another name is not supported. Note that kernel will internally always set the full membership bit, although the interface name does not reflect that. Thus, not setting the high bit is probably not useful. If the profile is stored in ifcfg-rh format, then the full membership bit is automatically added. To get consistent behavior, it is best to only use p-key values with the full membership bit set.")
#define DESCRIBE_DOC_NM_SETTING_INFINIBAND_P_KEY N_("The InfiniBand p-key to use for this device. A value of -1 means to use the default p-key (aka \"the p-key at index 0\"). Otherwise, it is a 16-bit unsigned integer, whose high bit 0x8000 is set if it is a \"full membership\" p-key. The values 0 and 0x8000 are not allowed. With the p-key set, the interface name is always \"$parent.$p_key\". Setting \"connection.interface-name\" to another name is not supported. Note that kernel will internally always set the full membership bit, although the interface name does not reflect that. Usually the user would want to configure a full membership p-key with 0x8000 flag set.")
#define DESCRIBE_DOC_NM_SETTING_INFINIBAND_PARENT N_("The interface name of the parent device of this device. Normally NULL, but if the \"p_key\" property is set, then you must specify the base device by setting either this property or \"mac-address\".")
#define DESCRIBE_DOC_NM_SETTING_INFINIBAND_TRANSPORT_MODE N_("The IP-over-InfiniBand transport mode. Either \"datagram\" or \"connected\".")
#define DESCRIBE_DOC_NM_SETTING_IP4_CONFIG_ADDRESSES N_("A list of IPv4 addresses and their prefix length. Multiple addresses can be separated by comma. For example \"192.168.1.5/24, 10.1.0.5/24\". The addresses are listed in decreasing priority, meaning the first address will be the primary address.")

View file

@ -612,7 +612,7 @@
nmcli-description="The IP-over-InfiniBand transport mode. Either &quot;datagram&quot; or &quot;connected&quot;." />
<property name="p-key"
alias="p-key"
nmcli-description="The InfiniBand P_Key to use for this device. A value of -1 means to use the default P_Key (aka &quot;the P_Key at index 0&quot;). Otherwise, it is a 16-bit unsigned integer, whose high bit 0x8000 is set if it is a &quot;full membership&quot; P_Key. The values 0 and 0x8000 are not allowed. With the p-key set, the interface name is always &quot;$parent.$p_key&quot;. Setting &quot;connection.interface-name&quot; to another name is not supported. Note that kernel will internally always set the full membership bit, although the interface name does not reflect that. Thus, not setting the high bit is probably not useful. If the profile is stored in ifcfg-rh format, then the full membership bit is automatically added. To get consistent behavior, it is best to only use p-key values with the full membership bit set." />
nmcli-description="The InfiniBand p-key to use for this device. A value of -1 means to use the default p-key (aka &quot;the p-key at index 0&quot;). Otherwise, it is a 16-bit unsigned integer, whose high bit 0x8000 is set if it is a &quot;full membership&quot; p-key. The values 0 and 0x8000 are not allowed. With the p-key set, the interface name is always &quot;$parent.$p_key&quot;. Setting &quot;connection.interface-name&quot; to another name is not supported. Note that kernel will internally always set the full membership bit, although the interface name does not reflect that. Usually the user would want to configure a full membership p-key with 0x8000 flag set." />
<property name="parent"
alias="parent"
nmcli-description="The interface name of the parent device of this device. Normally NULL, but if the &quot;p_key&quot; property is set, then you must specify the base device by setting either this property or &quot;mac-address&quot;." />