linux/net
Vladimir Oltean 1461d212ab net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs
taprio can only operate as root qdisc, and to that end, there exists the
following check in taprio_init(), just as in mqprio:

	if (sch->parent != TC_H_ROOT)
		return -EOPNOTSUPP;

And indeed, when we try to attach taprio to an mqprio child, it fails as
expected:

$ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
	flags 0x0 clockid CLOCK_TAI
Error: sch_taprio: Can only be attached as root qdisc.

(extack message added by me)

But when we try to attach a taprio child to a taprio root qdisc,
surprisingly it doesn't fail:

$ tc qdisc replace dev swp0 root handle 1: taprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
	flags 0x0 clockid CLOCK_TAI
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
	flags 0x0 clockid CLOCK_TAI

This is because tc_modify_qdisc() behaves differently when mqprio is
root, vs when taprio is root.

In the mqprio case, it finds the parent qdisc through
p = qdisc_lookup(dev, TC_H_MAJ(clid)), and then the child qdisc through
q = qdisc_leaf(p, clid). This leaf qdisc q has handle 0, so it is
ignored according to the comment right below ("It may be default qdisc,
ignore it"). As a result, tc_modify_qdisc() goes through the
qdisc_create() code path, and this gives taprio_init() a chance to check
for sch->parent != TC_H_ROOT and error out.

Whereas in the taprio case, the returned q = qdisc_leaf(p, clid) is
different. It is not the default qdisc created for each netdev queue
(both taprio and mqprio call qdisc_create_dflt() and keep them in
a private q->qdiscs[], or priv->qdiscs[], respectively). Instead, taprio
makes qdisc_leaf() return the _root_ qdisc, aka itself.

When taprio does that, tc_modify_qdisc() goes through the qdisc_change()
code path, because the qdisc layer never finds out about the child qdisc
of the root. And in its ->change() op, taprio has no reason to check
whether its parent is root; that check exists only in ->init(), which
is not called on this path.
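
The two paths described above can be sketched as a small userspace
model. This is not the kernel code: the real tc_modify_qdisc() also
looks at netlink flags and the qdisc kind, and every model_* name below
is hypothetical. It only illustrates why a leaf with handle 0 reaches
the ->init() check while a leaf with a non-zero handle bypasses it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TC_H_ROOT 0xFFFFFFFFU	/* same value as in linux/pkt_sched.h */

/* Hypothetical stand-in for struct Qdisc: only the handle matters here. */
struct model_qdisc {
	uint32_t handle;
};

enum model_path { MODEL_PATH_CREATE, MODEL_PATH_CHANGE };

/* A leaf with handle 0 is treated as a default qdisc and ignored, so
 * the request is handled as a creation. Any other leaf is modified in
 * place. */
static enum model_path model_pick_path(const struct model_qdisc *leaf)
{
	if (leaf == NULL || leaf->handle == 0)
		return MODEL_PATH_CREATE;
	return MODEL_PATH_CHANGE;
}

/* Mirrors the root-only check quoted from taprio_init(). */
static int model_taprio_init(uint32_t parent)
{
	if (parent != TC_H_ROOT)
		return -EOPNOTSUPP;
	return 0;
}

/* The parent check is reachable only on the create path. */
static int model_modify(const struct model_qdisc *leaf, uint32_t parent)
{
	if (model_pick_path(leaf) == MODEL_PATH_CREATE)
		return model_taprio_init(parent);
	return 0;	/* change path: no parent check in this model */
}
```

Passing a leaf with handle 0 and parent 1:2 (0x00010002) to
model_modify() yields -EOPNOTSUPP, whereas passing the root itself
(handle 1:, i.e. 0x00010000) as the leaf yields 0.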

The problem is the taprio_leaf() implementation. Even though, code-wise,
it does the exact same thing as mqprio_leaf(), from which it was copied,
it works with different input data. This is because mqprio does not
attach itself (the root) to each device TX queue, but rather one of the
default qdiscs from its private array.

In fact, since commit 13511704f8 ("net: taprio offload: enforce qdisc
to netdev queue mapping"), taprio does this too, but just for the full
offload case. So if we tried to attach a taprio child to a fully
offloaded taprio root qdisc, it would properly fail too; just not to a
software root taprio.

To fix the problem, stop looking at the Qdisc that's attached to the TX
queue, and instead, always return the default qdiscs that we've
allocated (and to which we privately enqueue and dequeue, in software
scheduling mode).
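
A minimal userspace sketch of the difference, assuming a software-mode
root whose TX queues all point back at the root itself. The model_*
names, the fixed queue count and the "class N maps to queue N - 1"
convention are assumptions made for illustration, not the kernel
structures.

```c
#include <stddef.h>

#define MODEL_NUM_TXQ 8

/* Hypothetical stand-in for struct Qdisc. */
struct model_qdisc {
	unsigned int handle;
};

struct model_taprio {
	struct model_qdisc self;			/* the root qdisc */
	struct model_qdisc children[MODEL_NUM_TXQ];	/* private default qdiscs */
	struct model_qdisc *txq_attached[MODEL_NUM_TXQ];/* per-TX-queue pointer */
};

/* Software mode: the root attaches itself to every TX queue, while the
 * default children (handle 0) live only in the private array. */
static void model_taprio_setup_sw(struct model_taprio *t)
{
	t->self.handle = 0x00010000;	/* handle 1: */
	for (size_t i = 0; i < MODEL_NUM_TXQ; i++) {
		t->children[i].handle = 0;
		t->txq_attached[i] = &t->self;
	}
}

/* Old behaviour: return whatever is attached to the TX queue. */
static struct model_qdisc *model_leaf_old(struct model_taprio *t,
					  unsigned long cl)
{
	unsigned long ntx = cl - 1;

	if (ntx >= MODEL_NUM_TXQ)
		return NULL;
	return t->txq_attached[ntx];
}

/* Fixed behaviour: always return the private per-queue child. */
static struct model_qdisc *model_leaf_new(struct model_taprio *t,
					  unsigned long cl)
{
	unsigned long ntx = cl - 1;

	if (ntx >= MODEL_NUM_TXQ)
		return NULL;
	return &t->children[ntx];
}
```

In this model, model_leaf_old() for class 2 returns the root, which has
a non-zero handle, while model_leaf_new() returns a child with handle 0,
which is the kind of leaf that tc_modify_qdisc() ignores in favour of
the qdisc_create() path.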

Since Qdisc_class_ops :: leaf is only called from tc_modify_qdisc(),
the risk of unforeseen side effects introduced by this change is
minimal.

Fixes: 5a781ccbd1 ("tc: Add support for configuring the taprio scheduler")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:41:14 -07:00
6lowpan net: 6lowpan: constify lowpan_nhc structures 2022-06-09 21:53:28 +02:00
9p iov_iter stuff, part 2, rebased 2022-08-08 20:04:35 -07:00
802
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-14 15:27:35 -07:00
appletalk net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
atm net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
ax25 net: avoid overflow when rose /proc displays timer information. 2022-08-05 19:00:02 -07:00
batman-adv batman-adv: Fix hang up with small MTU hard-interface 2022-08-20 14:17:45 +02:00
bluetooth Bluetooth: hci_sync: Fix hci_read_buffer_size_sync 2022-09-02 14:01:28 -07:00
bpf bpf: Allow calling bpf_prog_test kfuncs in tracing programs 2022-08-09 18:46:11 -07:00
bpfilter uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
bridge netfilter: br_netfilter: Drop dst references before setting. 2022-08-31 12:12:32 +02:00
caif caif: Fix bitmap data type in "struct caifsock" 2022-07-22 12:51:45 +01:00
can can: j1939: j1939_session_destroy(): fix memory leak of skbs 2022-08-09 09:05:06 +02:00
ceph libceph: clean up ceph_osdc_start_request prototype 2022-08-03 14:05:39 +02:00
core net: core: fix flow symmetric hash 2022-09-09 12:48:00 +01:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-03 08:01:55 -08:00
dccp dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock 2022-08-01 12:11:56 -07:00
decnet dn_route: replace "jiffies-now>0" with "jiffies!=now" 2022-07-29 20:12:49 -07:00
dns_resolver
dsa net: dsa: hellcreek: Print warning only once 2022-08-31 19:54:04 -07:00
ethernet net: ethernet: set default assignment identifier to NET_NAME_ENUM 2022-04-07 21:04:03 -07:00
ethtool net: delete extra space and tab in blank line 2022-07-25 19:38:31 -07:00
hsr treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_30.RULE (part 2) 2022-06-10 14:51:35 +02:00
ieee802154 net/ieee802154: fix uninit value bug in dgram_sendmsg 2022-09-16 10:53:55 +01:00
ife
ipv4 ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section 2022-09-20 08:22:15 -07:00
ipv6 ipv6: Fix crash when IPv6 is administratively disabled 2022-09-20 11:27:32 -07:00
iucv net: keep sk->sk_forward_alloc as small as possible 2022-06-10 16:21:27 -07:00
kcm kcm: fix strp_init() order and cleanup 2022-08-31 12:16:44 -07:00
key Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2022-08-24 12:51:50 +01:00
l2tp l2tp: l2tp_debugfs: fix Clang -Wformat warnings 2022-07-08 12:14:36 +01:00
l3mdev l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu 2022-04-15 14:27:24 -07:00
lapb
llc net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
mac80211 We have a handful of fixes: 2022-09-04 11:23:11 +01:00
mac802154 net: mac802154: Fix a condition in the receive path 2022-08-29 11:10:22 +02:00
mctp Networking changes for 5.19. 2022-05-25 12:22:58 -07:00
mpls net: Use u64_stats_fetch_begin_irq() for stats fetch. 2022-08-29 13:02:27 +01:00
mptcp mptcp: fix fwd memory accounting on coalesce 2022-09-13 10:18:44 +02:00
ncsi net/ncsi: use proper "mellanox" DT vendor prefix 2022-06-23 20:51:06 -07:00
netfilter netfilter: nfnetlink_osf: fix possible bogus match in nf_osf_find() 2022-09-07 15:55:28 +02:00
netlabel netlabel: fix typo in comment 2022-08-10 09:24:41 +01:00
netlink net: genl: fix error path memory leak in policy dumping 2022-08-18 10:20:48 -07:00
netrom net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
nfc net: nfc: Directly use ida_alloc()/free() 2022-05-28 15:28:47 +01:00
nsh
openvswitch openvswitch: fix memory leak at failed datapath creation 2022-08-26 19:26:30 -07:00
packet net/af_packet: check len when min_header_len equals to 0 2022-07-29 12:09:27 +01:00
phonet net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
psample
qrtr net: qrtr: start MHI channel after endpoit creation 2022-08-15 11:21:42 +01:00
rds rds: add missing barrier to release_refill 2022-08-12 10:46:01 +01:00
rfkill rfkill: make new event layout opt-in 2022-03-18 13:09:17 +02:00
rose rose: check NULL rose_loopback_neigh->loopback 2022-08-22 14:24:54 +01:00
rxrpc rxrpc: Remove rxrpc_get_reply_time() which is no longer used 2022-09-01 11:44:13 +01:00
sched net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs 2022-09-20 11:41:14 -07:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-28 18:21:16 -07:00
smc net/smc: Fix possible access to freed memory in link clear 2022-09-07 16:00:48 +01:00
strparser strparser: pad sk_skb_cb to avoid straddling cachelines 2022-07-08 18:38:44 -07:00
sunrpc NFS client bugfixes for Linux 6.0 2022-08-22 11:40:01 -07:00
switchdev net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
tipc tipc: fix shift wrapping bug in map_get() 2022-09-02 12:26:29 +01:00
tls tls: rx: react to strparser initialization errors 2022-08-17 10:24:00 +01:00
unix Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-07-09 12:24:16 -07:00
vmw_vsock vsock: Set socket state back to SS_UNCONNECTED in vsock_connect_timeout() 2022-08-10 09:50:18 +01:00
wireless wifi: use struct_group to copy addresses 2022-09-03 16:40:06 +02:00
x25 net/x25: fix call timeouts in blocking connects 2022-08-08 20:48:51 -07:00
xdp xsk: Fix corrupted packets for XDP_SHARED_UMEM 2022-08-15 17:26:07 +02:00
xfrm net: Fix data-races around netdev_max_backlog. 2022-08-24 13:46:57 +01:00
compat.c net: clear msg_get_inq in __get_compat_msghdr() 2022-09-20 08:23:20 -07:00
devres.c
Kconfig page_pool: Add allocation stats 2022-03-03 09:55:28 +00:00
Kconfig.debug net: CONFIG_DEBUG_NET depends on CONFIG_NET 2022-06-02 10:15:05 -07:00
Makefile
socket.c net: Fix a data-race around sysctl_somaxconn. 2022-08-24 13:46:58 +01:00
sysctl_net.c