NetworkManager/man
Thomas Haller fe80b2d1ec
cloud-setup: use suppress_prefixlength rule to honor non-default-routes in the main table
Background
==========

Imagine you run a container on your machine. Then the routing table
might look like:

    default via 10.0.10.1 dev eth0 proto dhcp metric 100
    10.0.10.0/28 dev eth0 proto kernel scope link src 10.0.10.5 metric 100
    [...]
    10.42.0.0/24 via 10.42.0.0 dev flannel.1 onlink
    10.42.1.2 dev cali02ad7e68ce1 scope link
    10.42.1.3 dev cali8fcecf5aaff scope link
    10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink
    10.42.3.0/24 via 10.42.3.0 dev flannel.1 onlink

That is, there are other interfaces with their own subnets and specific routes.

If nm-cloud-setup now configures rules:

    0:  from all lookup local
    30400:  from 10.0.10.5 lookup 30400
    32766:  from all lookup main
    32767:  from all lookup default

and

    default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
    10.0.10.1 dev eth0 table 30400 proto static scope link metric 10

then packets sent from 10.0.10.5 to these other subnets will also follow
the default route in table 30400, because rule 30400 is evaluated before
the main table.

This container example is just one case where this is a problem. In
general, if you have specific routes on another interface, then the
default route in the 30400+ table shadows them for traffic from the
matching source address.
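
The problem can be reproduced with a small model of policy routing. This is
only a sketch under simplifying assumptions: the names TABLES, RULES, lookup
and route are hypothetical, the "local" table and route metrics are left out,
and only a few of the routes from the example are included.

```python
import ipaddress

# Simplified copies of the tables from the example (prefix, next hop).
TABLES = {
    "main": [
        ("0.0.0.0/0", "via 10.0.10.1 dev eth0"),
        ("10.0.10.0/28", "dev eth0"),
        ("10.42.1.2/32", "dev cali02ad7e68ce1"),
        ("10.42.2.0/24", "via 10.42.2.0 dev flannel.1"),
    ],
    "30400": [
        ("0.0.0.0/0", "via 10.0.10.1 dev eth0"),
        ("10.0.10.1/32", "dev eth0"),
    ],
}

# (priority, source address or None for "from all", table)
RULES = [
    (30400, "10.0.10.5", "30400"),
    (32766, None, "main"),
]


def lookup(table, dst):
    """Longest-prefix match of dst in one table, or None if nothing matches."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, nexthop in TABLES[table]:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, nexthop)
    return best


def route(src, dst):
    """Walk the rules by ascending priority and return the first hit."""
    for prio, selector, table in sorted(RULES, key=lambda r: r[0]):
        if selector is not None and selector != src:
            continue
        hit = lookup(table, dst)
        if hit is not None:
            return prio, table, str(hit[0]), hit[1]
    return None


# From the secondary address, the container IP is caught by the default
# route of table 30400 instead of the /32 route in the main table.
print(route("10.0.10.5", "10.42.1.2"))
# From any other source address, the specific route in main is used.
print(route("10.42.1.1", "10.42.1.2"))
```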

The idea of nm-cloud-setup is to automatically configure the network for
secondary IP addresses. When the user has special requirements, then
they should disable nm-cloud-setup and configure whatever they want.
But the container use case is popular and important. It is not something
where the user actively configures the network. This case needs to work better,
out of the box. In general, nm-cloud-setup should work better with the
existing network configuration.

Change
======

Add new routing tables 30200+ with the individual subnets of the
interface:

    10.0.10.0/24 dev eth0 table 30200 proto static metric 10
    [...]
    default via 10.0.10.1 dev eth0 table 30400 proto static metric 10
    10.0.10.1 dev eth0 table 30400 proto static scope link metric 10

Also add routing rules with priority 30200+, which are evaluated before
the 30400+ rules and select these tables based on the source address:

    30200:  from 10.0.10.5 lookup 30200

These rules do source-based routing for the subnets configured on these
interfaces.

Then, add a rule with priority 30350

    30350:  lookup main suppress_prefixlength 0

which processes the routes from the main table but ignores default
routes, because suppress_prefixlength 0 rejects any match with a prefix
length of 0. The priority 30350 was chosen because it lies between the
30200+ and 30400+ rules, leaving a range for users to configure their own
rules.
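
The effect of suppress_prefixlength can be illustrated with a few lines of
Python. This is a sketch of the semantics as described in ip-rule(8) (reject
routing decisions with a prefix length of the given number or less), not the
kernel implementation. The function name lookup and the list MAIN are
hypothetical.

```python
import ipaddress


def lookup(routes, dst, suppress_prefixlength=None):
    """Longest-prefix match; the result is dropped if its prefix is too short."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    if best is None:
        return None
    if suppress_prefixlength is not None and best.prefixlen <= suppress_prefixlength:
        return None  # decision suppressed, the next rule gets its turn
    return str(best)


# A few routes from the main table of the example.
MAIN = ["0.0.0.0/0", "10.0.10.0/28", "10.42.2.0/24"]

print(lookup(MAIN, "10.42.2.7", suppress_prefixlength=0))  # specific route wins
print(lookup(MAIN, "8.8.8.8", suppress_prefixlength=0))    # default route is ignored
print(lookup(MAIN, "8.8.8.8"))                             # plain lookup finds it
```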

Then, as before, the rules 30400+ again look at the corresponding 30400+
table, to find a default route.

Finally, the regular rule 32766 processes the main table again, this time
honoring the default route. This handles packets whose source address
matches none of the rules above.

This change means that source-based routing is used for the subnets that
are configured on the interface and for the default route, whereas any
more specific routes in the main table are preferred over the default
route.
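
The resulting order of evaluation can be sketched as follows. This is a
simplified model, not nm-cloud-setup code: the names TABLES, RULES and resolve
are hypothetical, the "local" table and metrics are omitted, and only a subset
of the example routes is used.

```python
import ipaddress

TABLES = {
    "30200": ["10.0.10.0/24"],
    "30400": ["0.0.0.0/0", "10.0.10.1/32"],
    "main": ["0.0.0.0/0", "10.0.10.0/28", "10.42.1.2/32", "10.42.2.0/24"],
}

# (priority, source address or None for "from all", table, suppress_prefixlength)
RULES = [
    (30200, "10.0.10.5", "30200", None),
    (30350, None, "main", 0),
    (30400, "10.0.10.5", "30400", None),
    (32766, None, "main", None),
]


def resolve(src, dst):
    """Return (priority, table, prefix) of the first rule that yields a route."""
    addr = ipaddress.ip_address(dst)
    for prio, selector, table, suppress in sorted(RULES, key=lambda r: r[0]):
        if selector is not None and selector != src:
            continue
        nets = [ipaddress.ip_network(p) for p in TABLES[table]]
        nets = [n for n in nets if addr in n]
        if not nets:
            continue
        best = max(nets, key=lambda n: n.prefixlen)
        if suppress is not None and best.prefixlen <= suppress:
            continue  # e.g. the default route under suppress_prefixlength 0
        return prio, table, str(best)
    return None


print(resolve("10.0.10.5", "10.0.10.9"))  # subnet of the interface: table 30200
print(resolve("10.0.10.5", "10.42.1.2"))  # specific route in main via rule 30350
print(resolve("10.0.10.5", "8.8.8.8"))    # default route from table 30400
print(resolve("10.42.1.1", "8.8.8.8"))    # other source: default route in main
```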

Apparently Amazon Linux solves this differently, by not configuring a
routing table for addresses on interface "eth0". That might be an
alternative, but it's not clear to me what is special about eth0 to
warrant this treatment. It would also imply that we somehow recognize
this primary interface. In practice that would be doable by selecting
the interface with "iface_idx" zero.

Instead, choose this approach. It is remotely similar to what WireGuard does
for configuring the default route ([1]), although WireGuard uses fwmark to
match the packets instead of the source address.

[1] https://www.wireguard.com/netns/#improved-rule-based-routing
2021-09-16 17:30:25 +02:00
common.ent.in man: fix "no-auto-default" state dir in NetworkManager.conf manual 2018-10-25 15:24:38 +02:00
meson.build man: split NetworkManager-dispatcher(8) manual page out of NetworkManager(8) 2021-03-16 17:01:53 +01:00
NetworkManager-dispatcher.xml man: update URL for networkmanager.dev home page 2021-08-03 14:57:35 +02:00
NetworkManager.conf.xml core: introduce device 'allowed-connections' property 2021-07-27 17:43:45 +02:00
NetworkManager.xml man: update URL for networkmanager.dev home page 2021-08-03 14:57:35 +02:00
nm-cloud-setup.xml cloud-setup: use suppress_prefixlength rule to honor non-default-routes in the main table 2021-09-16 17:30:25 +02:00
nm-initrd-generator.xml nm-initrd-generator: include man entry for rd.ethtool options 2021-08-17 12:32:54 -03:00
nm-online.xml nm-online: allow configuring timeout via NM_ONLINE_TIMEOUT environment 2020-04-30 21:46:59 +02:00
nm-openvswitch.xml man: update nm-openswitch example 2019-07-09 12:05:32 +02:00
nm-settings-dbus.xsl docs: update documentation for nm-settings-nmcli manual 2020-06-11 10:53:49 +02:00
nm-settings-ifcfg-rh.xsl man: document NOZEROCONF in man nm-settings-ifcfg-rh 2021-03-16 13:45:39 +01:00
nm-settings-keyfile.xsl docs: unify "nm-property-infos-*.xml" and "nm-settings-docs-*.xml" (root element) 2020-06-11 10:53:50 +02:00
nm-settings-nmcli.xsl Support new attribute tag description-docbook 2021-06-23 08:59:45 -04:00
nmcli-examples.xml all: fix typo in man pages 2020-07-03 10:48:04 +02:00
nmcli.xml man/cli: mention nmcli device up|down instead of nmcli device connect|disconnect 2021-07-09 16:41:26 +02:00
nmtui.1.in man: update version number and dates in manual pages 2016-03-09 10:11:27 +01:00
nmtui.xml man: turn the manual page cross-references into links 2016-06-21 18:40:13 +02:00