Commit graph

14289 commits

Author SHA1 Message Date
Pedro Tammela 3d5026fc5a selftests: tc-testing: use netns delete from pyroute2
When pyroute2 is available, use the native netns delete routine instead
of calling iproute2 to do it. As forks are expensive with some kernel
configs, minimize its usage to avoid kselftests timeouts.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-4-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:36 -08:00
Pedro Tammela 50a5988a7a selftests: tc-testing: move back to per test ns setup
Surprisingly in kernel configs with most of the debug knobs turned on,
pre-allocating the test resources makes tdc run much slower overall than
when allocating resources on a per test basis.

As these knobs are used in kselftests in downstream CIs, let's go back
to the old way of doing things to avoid kselftests timeouts.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202311161129.3b45ed53-oliver.sang@intel.com
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:36 -08:00
Pedro Tammela 025de7b6a6 selftests: tc-testing: cap parallel tdc to 4 cores
We have observed a lot of lock contention and test instability when running with >8 cores.
Enough to actually make the tests run slower than with fewer cores.

Cap the maximum cores of parallel tdc to 4 which showed in testing to
be a reasonable number for efficiency and stability in different kernel
config scenarios.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:35 -08:00
Willem de Bruijn a0bc96c0cd selftests: net: verify fq per-band packet limit
Commit 29f834aa32 ("net_sched: sch_fq: add 3 bands and WRR
scheduling") introduces multiple traffic bands, and per-band maximum
packet count.

Per-band limits ensures that packets in one class cannot fill the
entire qdisc and so cause DoS to the traffic in the other classes.

Verify this behavior:
  1. set the limit to 10 per band
  2. send 20 pkts on band A: verify that 10 are queued, 10 dropped
  3. send 20 pkts on band A: verify that  0 are queued, 20 dropped
  4. send 20 pkts on band B: verify that 10 are queued, 10 dropped

Packets must remain queued for a period to trigger this behavior.
Use SO_TXTIME to store packets for 100 msec.

The test reuses existing upstream test infra. The script is a fork of
cmsg_time.sh. The scripts call cmsg_sender.

The test extends cmsg_sender with two arguments:

* '-P' SO_PRIORITY
  There is a subtle difference between IPv4 and IPv6 stack behavior:
  PF_INET/IP_TOS        sets IP header bits and sk_priority
  PF_INET6/IPV6_TCLASS  sets IP header bits BUT NOT sk_priority

* '-n' num pkts
  Send multiple packets in quick succession.
  I first attempted a for loop in the script, but this is too slow in
  virtualized environments, causing flakiness as the 100ms timeout is
  reached and packets are dequeued.

Also do not wait for timestamps to be queued unless timestamps are
requested.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231116203449.2627525-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 17:48:36 -08:00
Robin Murphy 9859418194 iommufd/selftest: Fix _test_mock_dirty_bitmaps()
The ASSERT_EQ() macro sneakily expands to two statements, so the loop here
needs braces to ensure it captures both and actually terminates the test
upon failure. Where these tests are currently failing on my arm64 machine,
this reduces the number of logged lines from a rather unreasonable
~197,000 down to 10. While we're at it, we can also clean up the
tautologous "count" calculations whose assertions can never fail unless
mathematics and/or the C language become fundamentally broken.

Fixes: a9af47e382 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
Link: https://lore.kernel.org/r/90e083045243ef407dd592bb1deec89cd1f4ddf2.1700153535.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-11-20 21:29:31 -04:00
Andrii Nakryiko 57b97ecb40 selftests/bpf: reduce verboseness of reg_bounds selftest logs
Reduce verboseness of test_progs' output in reg_bounds set of tests with
two changes.

First, instead of each different operator (<, <=, >, ...) being it's own
subtest, combine all different ops for the same (x, y, init_t, cond_t)
values into single subtest. Instead of getting 6 subtests, we get one
generic one, e.g.:

  #192/53  reg_bounds_crafted/(s64)[0xffffffffffffffff; 0] (s64)<op> 0xffffffff00000000:OK

Second, for random generated test cases, treat all of them as a single
test to eliminate very verbose output with random values in them. So now
we'll just get one line per each combination of (init_t, cond_t),
instead of 6 x 25 = 150 subtests before this change:

  #225     reg_bounds_rand_consts_s32_s32:OK

Given we reduce verboseness so much, it makes sense to do a bit more
random testing, so we also bump default number of random tests to 100,
up from 25. This doesn't increase runtime significantly, especially in
parallelized mode.

With all the above changes we still make sure that we have all the
information necessary for reproducing test case if it happens to fail.
That includes reporting random seed and specific operator that is
failing. Those will only be printed to console if related test/subtest
fails, so it doesn't have any added verboseness implications.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231120180452.145849-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-20 12:54:04 -08:00
Daniel Borkmann adfeae2d24 selftests/bpf: Add netkit to tc_redirect selftest
Extend the existing tc_redirect selftest to also cover netkit devices
for exercising the bpf_redirect_peer() code paths, so that we have both
veth as well as netkit covered, all tests still pass after this change.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20231114004220.6495-9-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20 10:15:16 -08:00
Daniel Borkmann eee82da79f selftests/bpf: De-veth-ize the tc_redirect test case
No functional changes to the test case, but just renaming various functions,
variables, etc, to remove veth part of their name for making it more generic
and reusable later on (e.g. for netkit).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20231114004220.6495-8-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20 10:15:16 -08:00
Pedro Tammela 54293e4d6a selftests/tc-testing: add hashtable tests for u32
Add tests to specifically check for the refcount interactions of
hashtables created by u32. These tables should not be deleted when
referenced and the flush order should respect a tree like composition.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231114141856.974326-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18 19:38:23 -08:00
Andrii Nakryiko 0f8dbdbc64 bpf: smarter verifier log number printing logic
Instead of always printing numbers as either decimals (and in some
cases, like for "imm=%llx", in hexadecimals), decide the form based on
actual values. For numbers in a reasonably small range (currently,
[0, U16_MAX] for unsigned values, and [S16_MIN, S16_MAX] for signed ones),
emit them as decimals. In all other cases, even for signed values,
emit them in hexadecimals.

For large values hex form is often times way more useful: it's easier to
see an exact difference between 0xffffffff80000000 and 0xffffffff7fffffff,
than between 18446744071562067966 and 18446744071562067967, as one
particular example.

Small values representing small pointer offsets or application
constants, on the other hand, are way more useful to be represented in
decimal notation.

Adjust reg_bounds register state parsing logic to take into account this
change.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231118034623.3320920-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-18 11:39:59 -08:00
Andrii Nakryiko 1db747d75b bpf: omit default off=0 and imm=0 in register state log
Simplify BPF verifier log further by omitting default (and frequently
irrelevant) off=0 and imm=0 parts for non-SCALAR_VALUE registers. As can
be seen from fixed tests, this is often a visual noise for PTR_TO_CTX
register and even for PTR_TO_PACKET registers.

Omitting default values follows the rest of register state logic: we
omit default values to keep verifier log succinct and to highlight
interesting state that deviates from default one. E.g., we do the same
for var_off, when it's unknown, which gives no additional information.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231118034623.3320920-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-18 11:39:59 -08:00
Andrii Nakryiko 0c95c9fdb6 bpf: emit map name in register state if applicable and available
In complicated real-world applications, whenever debugging some
verification error through verifier log, it often would be very useful
to see map name for PTR_TO_MAP_VALUE register. Usually this needs to be
inferred from key/value sizes and maybe trying to guess C code location,
but it's not always clear.

Given verifier has the name, and it's never too long, let's just emit it
for ptr_to_map_key, ptr_to_map_value, and const_ptr_to_map registers. We
reshuffle the order a bit, so that map name, key size, and value size
appear before offset and immediate values, which seems like a more
logical order.

Current output:

  R1_w=map_ptr(map=array_map,ks=4,vs=8,off=0,imm=0)

But we'll get rid of useless off=0 and imm=0 parts in the next patch.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231118034623.3320920-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-18 11:39:59 -08:00
Ido Schimmel af51d6bd0b selftests: mlxsw: Add PCI reset test
Test that PCI reset works correctly by verifying that only the expected
reset methods are supported and that after issuing the reset the ifindex
of the port changes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18 17:38:51 +00:00
Linus Torvalds 12ee72fe01 Thirteen hotfixes. Seven are cc:stable and the remainder pertain to
post-6.6 issues or aren't considered suitable for backporting.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZVfj5gAKCRDdBJ7gKXxA
 juu8AP9JwxIvlL5h8r1BD1w3mSNIOt1lFVPnrdElGLh4KwxIKwEAnosxmewHtSzY
 DsF7MsSgw6xG383LQR4Yp4I0a6g0dQ8=
 =faay
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-11-17-14-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Thirteen hotfixes. Seven are cc:stable and the remainder pertain to
  post-6.6 issues or aren't considered suitable for backporting"

* tag 'mm-hotfixes-stable-2023-11-17-14-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: more ptep_get() conversion
  parisc: fix mmap_base calculation when stack grows upwards
  mm/damon/core.c: avoid unintentional filtering out of schemes
  mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors
  mm/damon/sysfs-schemes: handle tried region directory allocation failure
  mm/damon/sysfs-schemes: handle tried regions sysfs directory allocation failure
  mm/damon/sysfs: check error from damon_sysfs_update_target()
  mm: fix for negative counter: nr_file_hugepages
  selftests/mm: add hugetlb_fault_after_madv to .gitignore
  selftests/mm: restore number of hugepages
  selftests: mm: fix some build warnings
  selftests: mm: skip whole test instead of failure
  mm/damon/sysfs: eliminate potential uninitialized variable warning
2023-11-17 14:19:46 -08:00
Andrii Nakryiko ff8867af01 bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS
Rename verifier internal flag BPF_F_TEST_SANITY_STRICT to more neutral
BPF_F_TEST_REG_INVARIANTS. This is a follow up to [0].

A few selftests and veristat need to be adjusted in the same patch as
well.

  [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231112010609.848406-5-andrii@kernel.org/

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231117171404.225508-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-17 10:30:02 -08:00
Paolo Abeni 75a50c4f5b kselftest: rtnetlink: fix ip route command typo
The blamed commit below introduced a typo causing 'gretap' test-case
failures:

./rtnetlink.sh  -t kci_test_gretap -v
COMMAND: ip link add name test-dummy0 type dummy
COMMAND: ip link set test-dummy0 up
COMMAND: ip netns add testns
COMMAND: ip link help gretap 2>&1 | grep -q '^Usage:'
COMMAND: ip -netns testns link add dev gretap00 type gretap seq key 102 local 172.16.1.100 remote 172.16.1.200
COMMAND: ip -netns testns addr add dev gretap00 10.1.1.100/24
COMMAND: ip -netns testns link set dev gretap00 ups
    Error: either "dev" is duplicate, or "ups" is a garbage.
COMMAND: ip -netns testns link del gretap00
COMMAND: ip -netns testns link add dev gretap00 type gretap external
COMMAND: ip -netns testns link del gretap00
FAIL: gretap

Fix it by using the correct keyword.

Fixes: 9c2a19f715 ("kselftest: rtnetlink.sh: add verbose flag")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-17 02:47:19 +00:00
Lucas Karpinski 3bdd9fd29c selftests/net: synchronize udpgro tests' tx and rx connection
The sockets used by udpgso_bench_tx aren't always ready when
udpgso_bench_tx transmits packets. This issue is more prevalent in -rt
kernels, but can occur in both. Replace the hacky sleep calls with a
function that checks whether the ports in the namespace are ready for
use.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16 22:31:56 +00:00
Pedro Tammela 04fd47bf70 selftests: tc-testing: use parallel tdc in kselftests
Leverage parallel tests in kselftests using all the available cpus.
We tested this in tuxsuite and locally extensively and it seems it's ready for prime time.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16 22:30:10 +00:00
Pedro Tammela bb9623c337 selftests: tc-testing: preload all modules in kselftests
While running tdc tests in parallel it can race over the module loading
done by tc and fail the run with random errors.
So avoid this by preloading all modules before running tdc in kselftests.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16 22:30:10 +00:00
Pedro Tammela fa63d353dd selftests: tc-testing: rework namespaces and devices setup
As mentioned in the TC Workshop 0x17, our recent changes to tdc broke
downstream CI systems like tuxsuite. The issue is the classic problem
with rcu/workqueue objects where you can miss them if not enough wall time
has passed. The latter is subjective to the system and kernel config,
in my machine could be nanoseconds while in another could be microseconds
or more.

In order to make the suite deterministic, poll for the existence
of the objects in a reasonable manner. Talking netlink directly is the
the best solution in order to avoid paying the cost of multiple
'fork()' calls, so introduce a netlink based setup routine using
pyroute2. We leave the iproute2 one as a fallback when pyroute2 is not
available.

Also rework the iproute2 side to mimic the netlink routine where it
creates DEV0 as the peer of DEV1 and moves DEV1 into the net namespace.
This way when the namespace is deleted DEV0 is also deleted
automatically, leaving no margin for resource leaks.

Another bonus of this change is that our setup time sped up by a factor
of 2 when using netlink.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16 22:30:10 +00:00
Pedro Tammela 9ffa01cab0 selftests: tc-testing: drop '-N' argument from nsPlugin
This argument would bypass the net namespace creation and run the test in
the root namespace, even if nsPlugin was specified.
Drop it as it's the same as commenting out the nsPlugin from a test and adds
additional complexity to the plugin code.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16 22:30:10 +00:00
Linus Torvalds 7475e51b87 Including fixes from BPF and netfilter.
Current release - regressions:
 
  - core: fix undefined behavior in netdev name allocation
 
  - bpf: do not allocate percpu memory at init stage
 
  - netfilter: nf_tables: split async and sync catchall in two functions
 
  - mptcp: fix possible NULL pointer dereference on close
 
 Current release - new code bugs:
 
  - eth: ice: dpll: fix initial lock status of dpll
 
 Previous releases - regressions:
 
  - bpf: fix precision backtracking instruction iteration
 
  - af_unix: fix use-after-free in unix_stream_read_actor()
 
  - tipc: fix kernel-infoleak due to uninitialized TLV value
 
  - eth: bonding: stop the device in bond_setup_by_slave()
 
  - eth: mlx5:
    - fix double free of encap_header
    - avoid referencing skb after free-ing in drop path
 
  - eth: hns3: fix VF reset
 
  - eth: mvneta: fix calls to page_pool_get_stats
 
 Previous releases - always broken:
 
  - core: set SOCK_RCU_FREE before inserting socket into hashtable
 
  - bpf: fix control-flow graph checking in privileged mode
 
  - eth: ppp: limit MRU to 64K
 
  - eth: stmmac: avoid rx queue overrun
 
  - eth: icssg-prueth: fix error cleanup on failing initialization
 
  - eth: hns3: fix out-of-bounds access may occur when coalesce info is
  	      read via debugfs
 
  - eth: cortina: handle large frames
 
 Misc:
 
  - selftests: gso: support CONFIG_MAX_SKB_FRAGS up to 45
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmVV9akSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkICMP/1+QHUaD4JG1mW9oYc2zINPfQl3dqQt3
 2CGSE2yrtbQvyQl39BDa0WFzV5X6So6/U50twhTNM+UAJsCaOvxCUDvUP9eY9Dcm
 z2H4oITZimyP4CEb3l7JpL2PImvfImL7D/fCPPMUZVzNY6dkEFznaQrnawbJz4gg
 mZXDnjwIXq7OchoJy3dHzyOn4ZQj2Df5VcfBzkVMdMcwV55Sd5JezbhwJ6NOmnKA
 uoXlq4pFYj3ahAhEQfLWUwXmF3e6esHs/WUCMe5FR9YkanJlu4oHUmY3RLzfcdQA
 PPIPDRxOzthcXyymqvqs7gnZ3ruMUll4B7tGTVFpJch8ts+DwGdUyBIIoDd/1BUT
 gmjipP5HPia3Qdtk3Jc4vMkcf5AwoGo0hXku7YYJ1K7+4+t8ep3/hDbQc0PLWX6J
 afiQgqpnNXHSTqBO5zl91vSwhGr/AAtAkDlPnsQL/RDAxY4teIwxHuoMvwPWaHZJ
 sMo5ZcHXvNnBbGhpozFtmrnbf1nduUrQmW5LkJViCLf25Sj6pDYbo8WnhMuOKSnZ
 7an2YqniCgBtrX4MEVn2jsWgavI+SxndVIQR04u0uwqmP+dn8s9LUfjKKDtPWHsK
 +zMFtk+Op03TW5ur9w3+dgrGH0cLogPO3BJkho7xXKBfZ6/tN/pOef3/nV9xY6g8
 JjnBUdpZRTWI
 =VjWw
 -----END PGP SIGNATURE-----

Merge tag 'net-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from BPF and netfilter.

  Current release - regressions:

   - core: fix undefined behavior in netdev name allocation

   - bpf: do not allocate percpu memory at init stage

   - netfilter: nf_tables: split async and sync catchall in two
     functions

   - mptcp: fix possible NULL pointer dereference on close

  Current release - new code bugs:

   - eth: ice: dpll: fix initial lock status of dpll

  Previous releases - regressions:

   - bpf: fix precision backtracking instruction iteration

   - af_unix: fix use-after-free in unix_stream_read_actor()

   - tipc: fix kernel-infoleak due to uninitialized TLV value

   - eth: bonding: stop the device in bond_setup_by_slave()

   - eth: mlx5:
      - fix double free of encap_header
      - avoid referencing skb after free-ing in drop path

   - eth: hns3: fix VF reset

   - eth: mvneta: fix calls to page_pool_get_stats

  Previous releases - always broken:

   - core: set SOCK_RCU_FREE before inserting socket into hashtable

   - bpf: fix control-flow graph checking in privileged mode

   - eth: ppp: limit MRU to 64K

   - eth: stmmac: avoid rx queue overrun

   - eth: icssg-prueth: fix error cleanup on failing initialization

   - eth: hns3: fix out-of-bounds access may occur when coalesce info is
     read via debugfs

   - eth: cortina: handle large frames

  Misc:

   - selftests: gso: support CONFIG_MAX_SKB_FRAGS up to 45"

* tag 'net-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits)
  macvlan: Don't propagate promisc change to lower dev in passthru
  net: sched: do not offload flows with a helper in act_ct
  net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors
  net/mlx5e: Check return value of snprintf writing to fw_version buffer
  net/mlx5e: Reduce the size of icosq_str
  net/mlx5: Increase size of irq name buffer
  net/mlx5e: Update doorbell for port timestamping CQ before the software counter
  net/mlx5e: Track xmit submission to PTP WQ after populating metadata map
  net/mlx5e: Avoid referencing skb after free-ing in drop path of mlx5e_sq_xmit_wqe
  net/mlx5e: Don't modify the peer sent-to-vport rules for IPSec offload
  net/mlx5e: Fix pedit endianness
  net/mlx5e: fix double free of encap_header in update funcs
  net/mlx5e: fix double free of encap_header
  net/mlx5: Decouple PHC .adjtime and .adjphase implementations
  net/mlx5: DR, Allow old devices to use multi destination FTE
  net/mlx5: Free used cpus mask when an IRQ is released
  Revert "net/mlx5: DR, Supporting inline WQE when possible"
  bpf: Do not allocate percpu memory at init stage
  net: Fix undefined behavior in netdev name allocation
  dt-bindings: net: ethernet-controller: Fix formatting error
  ...
2023-11-16 07:51:26 -05:00
Jakub Kicinski a6a6a0a9fd Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2023-11-15

We've added 7 non-merge commits during the last 6 day(s) which contain
a total of 9 files changed, 200 insertions(+), 49 deletions(-).

The main changes are:

1) Do not allocate bpf specific percpu memory unconditionally, from Yonghong.

2) Fix precision backtracking instruction iteration, from Andrii.

3) Fix control flow graph checking, from Andrii.

4) Fix xskxceiver selftest build, from Anders.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Do not allocate percpu memory at init stage
  selftests/bpf: add more test cases for check_cfg()
  bpf: fix control-flow graph checking in privileged mode
  selftests/bpf: add edge case backtracking logic test
  bpf: fix precision backtracking instruction iteration
  bpf: handle ldimm64 properly in check_cfg()
  selftests: bpf: xskxceiver: ksft_print_msg: fix format type error
====================

Link: https://lore.kernel.org/r/20231115214949.48854-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-15 22:28:02 -08:00
Breno Leitao edf1454432 selftests/mm: add hugetlb_fault_after_madv to .gitignore
commit 116d57303a ("selftests/mm: add a new test for madv and hugetlb")
added a new test case, but, it didn't add the binary name in
tools/testing/selftests/mm/.gitignore.

Add hugetlb_fault_after_madv to tools/testing/selftests/mm/.gitignore.

Link: https://lkml.kernel.org/r/20231103173400.1608403-2-leitao@debian.org
Fixes: 116d57303a ("selftests/mm: add a new test for madv and hugetlb")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/all/662df57e-47f1-4c15-9b84-f2f2d587fc5c@arm.com/
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-15 15:30:09 -08:00
Breno Leitao dd9b35efd7 selftests/mm: restore number of hugepages
The test mm `hugetlb_fault_after_madv` selftest needs one and only one
huge page to run, thus it sets `/proc/sys/vm/nr_hugepages` to 1.

The problem is that further tests require the previous number of hugepages
allocated in order to succeed.

Save the number of huge pages before changing it, and restore it once the
test finishes, so, further tests could run successfully.

Link: https://lkml.kernel.org/r/20231103173400.1608403-1-leitao@debian.org
Fixes: 116d57303a ("selftests/mm: add a new test for madv and hugetlb")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/all/662df57e-47f1-4c15-9b84-f2f2d587fc5c@arm.com/
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-15 15:30:08 -08:00
Muhammad Usama Anjum 9297e5360c selftests: mm: fix some build warnings
Fix build warnings:
pagemap_ioctl.c:1154:38: warning: format `%s' expects a matching `char *' argument [-Wformat=]
pagemap_ioctl.c:1162:51: warning: format `%ld' expects argument of type `long int', but argument 2 has type `int' [-Wformat=]
pagemap_ioctl.c:1192:51: warning: format `%ld' expects argument of type `long int', but argument 2 has type `int' [-Wformat=]
pagemap_ioctl.c:1600:51: warning: format `%ld' expects argument of type `long int', but argument 2 has type `int' [-Wformat=]
pagemap_ioctl.c:1628:51: warning: format `%ld' expects argument of type `long int', but argument 2 has type `int' [-Wformat=]

Link: https://lkml.kernel.org/r/20231103182343.2874015-2-usama.anjum@collabora.com
Fixes: 46fd75d4a3 ("selftests: mm: add pagemap ioctl tests")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-15 15:30:08 -08:00
Muhammad Usama Anjum 019b277b68 selftests: mm: skip whole test instead of failure
Some architectures don't support userfaultfd.  Skip running the whole test
on them instead of registering the failure.

Link: https://lkml.kernel.org/r/20231103182343.2874015-1-usama.anjum@collabora.com
Fixes: 46fd75d4a3 ("selftests: mm: add pagemap ioctl tests")

Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/all/f8463381-2697-49e9-9460-9dc73452830d@arm.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-15 15:30:08 -08:00
Andrii Nakryiko 882e3d873c selftests/bpf: add iter test requiring range x range logic
Add a simple verifier test that requires deriving reg bounds for one
register from another register that's not a constant. This is
a realistic example of iterating elements of an array with fixed maximum
number of elements, but smaller actual number of elements.

This small example was an original motivation for doing this whole patch
set in the first place, yes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-14-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:03:43 -08:00
Andrii Nakryiko a5c57f81eb veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag
Add a new flag -r (--test-sanity), similar to -t (--test-states), to add
extra BPF program flags when loading BPF programs.

This allows to use veristat to easily catch sanity violations in
production BPF programs.

reg_bounds tests are also enforcing BPF_F_TEST_SANITY_STRICT flag now.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-13-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:03:43 -08:00
Andrii Nakryiko 8c5677f8b3 selftests/bpf: set BPF_F_TEST_SANITY_SCRIPT by default
Make sure to set BPF_F_TEST_SANITY_STRICT program flag by default across
most verifier tests (and a bunch of others that set custom prog flags).

There are currently two tests that do fail validation, if enforced
strictly: verifier_bounds/crossing_64_bit_signed_boundary_2 and
verifier_bounds/crossing_32_bit_signed_boundary_2. To accommodate them,
we teach test_loader a flag negation:

__flag(!<flagname>) will *clear* specified flag, allowing easy opt-out.

We apply __flag(!BPF_F_TEST_SANITY_STRICT) to these to tests.

Also sprinkle BPF_F_TEST_SANITY_STRICT everywhere where we already set
test-only BPF_F_TEST_RND_HI32 flag, for completeness.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-12-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:03:42 -08:00
Andrii Nakryiko dab16659c5 selftests/bpf: add randomized reg_bounds tests
Add random cases generation to reg_bounds.c and run them without
SLOW_TESTS=1 to increase a chance of BPF CI catching latent issues.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-11-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:03:42 -08:00
Andrii Nakryiko 2b0d204e36 selftests/bpf: add range x range test to reg_bounds
Now that verifier supports range vs range bounds adjustments, validate
that by checking each generated range against every other generated
range, across all supported operators (everything by JSET).

We also add few cases that were problematic during development either
for verifier or for selftest's range tracking implementation.

Note that we utilize the same trick with splitting everything into
multiple independent parallelizable tests, but init_t and cond_t. This
brings down verification time in parallel mode from more than 8 hours
down to less that 1.5 hours. 106 million cases were successfully
validate for range vs range logic, in addition to about 7 million range
vs const cases, added in earlier patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:03:42 -08:00
Andrii Nakryiko 774f94c5e7 selftests/bpf: adjust OP_EQ/OP_NE handling to use subranges for branch taken
Similar to kernel-side BPF verifier logic enhancements, use 32-bit
subrange knowledge for is_branch_taken() logic in reg_bounds selftests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231112010609.848406-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:03:42 -08:00
Andrii Nakryiko 8863238993 selftests/bpf: BPF register range bounds tester
Add test to validate BPF verifier's register range bounds tracking logic.

The main bulk is a lot of auto-generated tests based on a small set of
seed values for lower and upper 32 bits of full 64-bit values.
Currently we validate only range vs const comparisons, but the idea is
to start validating range over range comparisons in subsequent patch set.

When setting up initial register ranges we treat registers as one of
u64/s64/u32/s32 numeric types, and then independently perform conditional
comparisons based on a potentially different u64/s64/u32/s32 types. This
tests lots of tricky cases of deriving bounds information across
different numeric domains.

Given there are lots of auto-generated cases, we guard them behind
SLOW_TESTS=1 envvar requirement, and skip them altogether otherwise.
With current full set of upper/lower seed value, all supported
comparison operators and all the combinations of u64/s64/u32/s32 number
domains, we get about 7.7 million tests, which run in about 35 minutes
on my local qemu instance without parallelization. But we also split
those tests by init/cond numeric types, which allows to rely on
test_progs's parallelization of tests with `-j` option, getting run time
down to about 5 minutes on 8 cores. It's still something that shouldn't
be run during normal test_progs run.  But we can run it a reasonable
time, and so perhaps a nightly CI test run (once we have it) would be
a good option for this.

We also add a small set of tricky conditions that came up during
development and triggered various bugs or corner cases in either
selftest's reimplementation of range bounds logic or in verifier's logic
itself. These are fast enough to be run as part of normal test_progs
test run and are great for a quick sanity checking.

Let's take a look at test output to understand what's going on:

  $ sudo ./test_progs -t reg_bounds_crafted
  #191/1   reg_bounds_crafted/(u64)[0; 0xffffffff] (u64)< 0:OK
  ...
  #191/115 reg_bounds_crafted/(u64)[0; 0x17fffffff] (s32)< 0:OK
  ...
  #191/137 reg_bounds_crafted/(u64)[0xffffffff; 0x100000000] (u64)== 0:OK

Each test case is uniquely and fully described by this generated string.
E.g.: "(u64)[0; 0x17fffffff] (s32)< 0". This means that we
initialize a register (R6) in such a way that verifier knows that it can
have a value in [(u64)0; (u64)0x17fffffff] range. Another
register (R7) is also set up as u64, but this time a constant (zero in
this case). They then are compared using 32-bit signed < operation.
Resulting TRUE/FALSE branches are evaluated (including cases where it's
known that one of the branches will never be taken, in which case we
validate that verifier also determines this as a dead code). Test
validates that verifier's final register state matches expected state
based on selftest's own reg_state logic, implemented from scratch for
cross-checking purposes.

These test names can be conveniently used for further debugging, and if -vv
verboseness is requested we can get a corresponding verifier log (with
mark_precise logs filtered out as irrelevant and distracting). Example below is
slightly redacted for brevity, omitting irrelevant register output in
some places, marked with [...].

  $ sudo ./test_progs -a 'reg_bounds_crafted/(u32)[0; U32_MAX] (s32)< -1' -vv
  ...
  VERIFIER LOG:
  ========================
  func#0 @0
  0: R1=ctx(off=0,imm=0) R10=fp0
  0: (05) goto pc+2
  3: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
  4: (bc) w6 = w0                       ; R0_w=scalar() R6_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  5: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
  6: (bc) w7 = w0                       ; R0_w=scalar() R7_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  7: (b4) w1 = 0                        ; R1_w=0
  8: (b4) w2 = -1                       ; R2=4294967295
  9: (ae) if w6 < w1 goto pc-9
  9: R1=0 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  10: (2e) if w6 > w2 goto pc-10
  10: R2=4294967295 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  11: (b4) w1 = -1                      ; R1_w=4294967295
  12: (b4) w2 = -1                      ; R2_w=4294967295
  13: (ae) if w7 < w1 goto pc-13        ; R1_w=4294967295 R7=4294967295
  14: (2e) if w7 > w2 goto pc-14
  14: R2_w=4294967295 R7=4294967295
  15: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  16: (bc) w0 = w7                      ; [...] R7=4294967295
  17: (ce) if w6 s< w7 goto pc+3        ; R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) R7=4294967295
  18: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff))
  19: (bc) w0 = w7                      ; [...] R7=4294967295
  20: (95) exit

  from 17 to 21: [...]
  21: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=umin=umin32=2147483648,smax=umax=umax32=4294967294,smax32=-2,var_off=(0x80000000; 0x7fffffff))
  22: (bc) w0 = w7                      ; [...] R7=4294967295
  23: (95) exit

  from 13 to 1: [...]
  1: [...]
  1: (b7) r0 = 0                        ; R0_w=0
  2: (95) exit
  processed 24 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
  =====================

Verifier log above is for `(u32)[0; U32_MAX] (s32)< -1` use cases, where u32
range is used for initialization, followed by signed < operator. Note
how we use w6/w7 in this case for register initialization (it would be
R6/R7 for 64-bit types) and then `if w6 s< w7` for comparison at
instruction #17. It will be `if R6 < R7` for 64-bit unsigned comparison.
Above example gives a good impression of the overall structure of a BPF
programs generated for reg_bounds tests.

In the future, this "framework" can be extended to test not just
conditional jumps, but also arithmetic operations. Adding randomized
testing is another possibility.

Some implementation notes. We basically have our own generics-like
operations on numbers, where all the numbers are stored in u64, but how
they are interpreted is passed as runtime argument enum num_t. Further,
`struct range` represents a bounds range, and those are collected
together into a minimal `struct reg_state`, which collects range bounds
across all four numberical domains: u64, s64, u32, s64.

Based on these primitives and `enum op` representing possible
conditional operation (<, <=, >, >=, ==, !=), there is a set of generic
helpers to perform "range arithmetics", which is used to maintain struct
reg_state. We simulate what verifier will do for reg bounds of R6 and R7
registers using these range and reg_state primitives. Simulated
information is used to determine branch taken conclusion and expected
exact register state across all four number domains.

Implementation of "range arithmetics" is more generic than what verifier
is currently performing: it allows range over range comparisons and
adjustments. This is the intended end goal of this patch set overall and verifier
logic is enhanced in subsequent patches in this series to handle range
vs range operations, at which point selftests are extended to validate
these conditions as well. For now it's range vs const cases only.

Note that tests are split into multiple groups by their numeric types
for initialization of ranges and for comparison operation. This allows
to use test_progs's -j parallelization to speed up tests, as we now have
16 groups of parallel running tests. Overall reduction of running time
that allows is pretty good, we go down from more than 30 minutes to
slightly less than 5 minutes running time.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231112010609.848406-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:03:42 -08:00
Paolo Abeni 7cefbe5e1d selftests: mptcp: fix fastclose with csum failure
Running the mp_join selftest manually with the following command line:

  ./mptcp_join.sh -z -C

leads to some failures:

  002 fastclose server test
  # ...
  rtx                                 [fail] got 1 MP_RST[s] TX expected 0
  # ...
  rstrx                               [fail] got 1 MP_RST[s] RX expected 0

The problem is really in the wrong expectations for the RST checks
implied by the csum validation. Note that the same check is repeated
explicitly in the same test-case, with the correct expectation and
pass successfully.

Address the issue explicitly setting the correct expectation for
the failing checks.

Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-5-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-14 20:10:21 -08:00
Yafang Shao 360769233c selftests/bpf: Add selftests for cgroup1 hierarchy
Add selftests for cgroup1 hierarchy.
The result as follows,

  $ tools/testing/selftests/bpf/test_progs --name=cgroup1_hierarchy
  #36/1    cgroup1_hierarchy/test_cgroup1_hierarchy:OK
  #36/2    cgroup1_hierarchy/test_root_cgid:OK
  #36/3    cgroup1_hierarchy/test_invalid_level:OK
  #36/4    cgroup1_hierarchy/test_invalid_cgid:OK
  #36/5    cgroup1_hierarchy/test_invalid_hid:OK
  #36/6    cgroup1_hierarchy/test_invalid_cgrp_name:OK
  #36/7    cgroup1_hierarchy/test_invalid_cgrp_name2:OK
  #36/8    cgroup1_hierarchy/test_sleepable_prog:OK
  #36      cgroup1_hierarchy:OK
  Summary: 1/8 PASSED, 0 SKIPPED, 0 FAILED

Besides, I also did some stress test similar to the patch #2 in this
series, as follows (with CONFIG_PROVE_RCU_LIST enabled):

- Continuously mounting and unmounting named cgroups in some tasks,
  for example:

  cgrp_name=$1
  while true
  do
      mount -t cgroup -o none,name=$cgrp_name none /$cgrp_name
      umount /$cgrp_name
  done

- Continuously run this selftest concurrently,
  while true; do ./test_progs --name=cgroup1_hierarchy; done

They can ran successfully without any RCU warnings in dmesg.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-7-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-14 08:59:23 -08:00
Yafang Shao bf47300b18 selftests/bpf: Add a new cgroup helper get_cgroup_hierarchy_id()
A new cgroup helper function, get_cgroup1_hierarchy_id(), has been
introduced to obtain the ID of a cgroup1 hierarchy based on the provided
cgroup name. This cgroup name can be obtained from the /proc/self/cgroup
file.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-6-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-14 08:56:56 -08:00
Yafang Shao c1dcc050aa selftests/bpf: Add a new cgroup helper get_classid_cgroup_id()
Introduce a new helper function to retrieve the cgroup ID from a net_cls
cgroup directory.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-5-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-14 08:56:56 -08:00
Yafang Shao f744d35ecf selftests/bpf: Add parallel support for classid
Include the current pid in the classid cgroup path. This way, different
testers relying on classid-based configurations will have distinct classid
cgroup directories, enabling them to run concurrently. Additionally, we
leverage the current pid as the classid, ensuring unique identification.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-4-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-14 08:56:56 -08:00
Yafang Shao 4849775587 selftests/bpf: Fix issues in setup_classid_environment()
If the net_cls subsystem is already mounted, attempting to mount it again
in setup_classid_environment() will result in a failure with the error code
EBUSY. Despite this, tmpfs will have been successfully mounted at
/sys/fs/cgroup/net_cls. Consequently, the /sys/fs/cgroup/net_cls directory
will be empty, causing subsequent setup operations to fail.

Here's an error log excerpt illustrating the issue when net_cls has already
been mounted at /sys/fs/cgroup/net_cls prior to running
setup_classid_environment():

- Before that change

  $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  test_cgroup_v1v2:PASS:client_fd 0 nsec
  test_cgroup_v1v2:PASS:cgroup_fd 0 nsec
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec
  (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs
  (cgroup_helpers.c:540: errno: No such file or directory) Opening cgroup classid: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/net_cls.classid
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/cgroup.procs
  run_test:FAIL:join_classid unexpected error: 1 (errno 2)
  test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 2)
  (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs
  #44      cgroup_v1v2:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

- After that change
  $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2
  #44      cgroup_v1v2:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-3-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-14 08:56:56 -08:00
Jordan Rome 727a92d62f selftests/bpf: Add assert for user stacks in test_task_stack
This is a follow up to:
commit b8e3a87a62 ("bpf: Add crosstask check to __bpf_get_stack").

This test ensures that the task iterator only gets a single
user stack (for the current task).

Signed-off-by: Jordan Rome <linux@jordanrome.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20231112023010.144675-1-linux@jordanrome.com
2023-11-13 18:39:38 -08:00
Linus Torvalds 4eeee6636a LoongArch changes for v6.7
1, Support PREEMPT_DYNAMIC with static keys;
 2, Relax memory ordering for atomic operations;
 3, Support BPF CPU v4 instructions for LoongArch;
 4, Some build and runtime warning fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmVQWXgWHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImepDTEACS808EsgSNIM1+JwldhdqKOErt
 XDWlLuIddVpenInx8F+9GnZJzKBU+wl+Ow5ejcVarjcecIJDv5UhoVrbhpeOHkfv
 RszRXQR4p/ZNSFvdraYDjjJ9UX6bp5rq7vMUC2d9bLazMauAfwf7T/HJ5qj9OYZi
 RLlcwaKo2UQHYsT7nJicjh0qpH1YpZQBYTaUUCwzilzB6vAIOTf6X12vFmhtM/i+
 5RIPnesMA1IQSm2ywUODpDHCs7Pirvy8aJvx0CsYdi3xl1yg3pUS6u69Ms61uWlw
 29yYhNbWmVnDikTVLTNISDb/jwto5SAVB2KQKBhF1trF4ZBNE6r7sP4m2tfllYo9
 KXK9tm0U8McS5o46Qd5er6eEnxL7mEeAsc12tNKUYOMe3SIkmHJmj/rZQOtpsiBg
 zqQsYkGUfO2VAwMWiGke8dxPZElOYwZ3UCOpbEpXEXy3NW71VJTIuQFGmsYKJhdy
 3xaAtQxdffE5yUTt2j3Y8Mex2b2oSUBSF263imsZjzWOOxd480iaoejtamf1V779
 bElevzZjMDmbiQ7kiVSf96TWc7iYcSv33jhP4DorKIqnPseYPfrXEeD1xY7JV+IU
 kkvSlO0hAJzVMmQgu5n0PPT1wrVpuvwtbsfcRobIkr1vktZyLaKHRq7rh4R5HTRL
 ZUUm6c0kUDywGT+J4A==
 =bmFe
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch updates from Huacai Chen:

 - support PREEMPT_DYNAMIC with static keys

 - relax memory ordering for atomic operations

 - support BPF CPU v4 instructions for LoongArch

 - some build and runtime warning fixes

* tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  selftests/bpf: Enable cpu v4 tests for LoongArch
  LoongArch: BPF: Support signed mod instructions
  LoongArch: BPF: Support signed div instructions
  LoongArch: BPF: Support 32-bit offset jmp instructions
  LoongArch: BPF: Support unconditional bswap instructions
  LoongArch: BPF: Support sign-extension mov instructions
  LoongArch: BPF: Support sign-extension load instructions
  LoongArch: Add more instruction opcodes and emit_* helpers
  LoongArch/smp: Call rcutree_report_cpu_starting() earlier
  LoongArch: Relax memory ordering for atomic operations
  LoongArch: Mark __percpu functions as always inline
  LoongArch: Disable module from accessing external data directly
  LoongArch: Support PREEMPT_DYNAMIC with static keys
2023-11-12 10:58:08 -08:00
Yonghong Song 100888fb6d selftests/bpf: Fix pyperf180 compilation failure with clang18
With latest clang18 (main branch of llvm-project repo), when building bpf selftests,
    [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j

The following compilation error happens:
    fatal error: error in backend: Branch target out of insn range
    ...
    Stack dump:
    0.      Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
      -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
      -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
      -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
      /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
      -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
      -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
    1.      <eof> parser at end of file
    2.      Code generation
    ...

The compilation failure only happens to cpu=v2 and cpu=v3. cpu=v4 is okay
since cpu=v4 supports 32-bit branch target offset.

The above failure is due to upstream llvm patch [1] where some inlining behavior
are changed in clang18.

To workaround the issue, previously all 180 loop iterations are fully unrolled.
The bpf macro __BPF_CPU_VERSION__ (implemented in clang18 recently) is used to avoid
unrolling changes if cpu=v4. If __BPF_CPU_VERSION__ is not available and the
compiler is clang18, the unrollng amount is unconditionally reduced.

  [1] 1a2e77cf9e

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20231110193644.3130906-1-yonghong.song@linux.dev
2023-11-11 12:18:10 -08:00
Andrii Nakryiko e2e57d637a selftests/bpf: add more test cases for check_cfg()
Add a few more simple cases to validate proper privileged vs unprivileged
loop detection behavior. conditional_loop2 is the one reported by Hao
Sun that triggered this set of fixes.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Suggested-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110061412.2995786-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 22:57:25 -08:00
Andrii Nakryiko 10e14e9652 bpf: fix control-flow graph checking in privileged mode
When BPF program is verified in privileged mode, BPF verifier allows
bounded loops. This means that from CFG point of view there are
definitely some back-edges. Original commit adjusted check_cfg() logic
to not detect back-edges in control flow graph if they are resulting
from conditional jumps, which the idea that subsequent full BPF
verification process will determine whether such loops are bounded or
not, and either accept or reject the BPF program. At least that's my
reading of the intent.

Unfortunately, the implementation of this idea doesn't work correctly in
all possible situations. Conditional jump might not result in immediate
back-edge, but just a few unconditional instructions later we can arrive
at back-edge. In such situations check_cfg() would reject BPF program
even in privileged mode, despite it might be bounded loop. Next patch
adds one simple program demonstrating such scenario.

To keep things simple, instead of trying to detect back edges in
privileged mode, just assume every back edge is valid and let subsequent
BPF verification prove or reject bounded loops.

Note a few test changes. For unknown reason, we have a few tests that
are specified to detect a back-edge in a privileged mode, but looking at
their code it seems like the right outcome is passing check_cfg() and
letting subsequent verification to make a decision about bounded or not
bounded looping.

Bounded recursion case is also interesting. The example should pass, as
recursion is limited to just a few levels and so we never reach maximum
number of nested frames and never exhaust maximum stack depth. But the
way that max stack depth logic works today it falsely detects this as
exceeding max nested frame count. This patch series doesn't attempt to
fix this orthogonal problem, so we just adjust expected verifier failure.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Fixes: 2589726d12 ("bpf: introduce bounded loops")
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110061412.2995786-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 22:57:24 -08:00
Andrii Nakryiko 62ccdb11d3 selftests/bpf: add edge case backtracking logic test
Add a dedicated selftests to try to set up conditions to have a state
with same first and last instruction index, but it actually is a loop
3->4->1->2->3. This confuses mark_chain_precision() if verifier doesn't
take into account jump history.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110002638.4168352-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 20:11:20 -08:00
Andrii Nakryiko 3feb263bb5 bpf: handle ldimm64 properly in check_cfg()
ldimm64 instructions are 16-byte long, and so have to be handled
appropriately in check_cfg(), just like the rest of BPF verifier does.

This has implications in three places:
  - when determining next instruction for non-jump instructions;
  - when determining next instruction for callback address ldimm64
    instructions (in visit_func_call_insn());
  - when checking for unreachable instructions, where second half of
    ldimm64 is expected to be unreachable;

We take this also as an opportunity to report jump into the middle of
ldimm64. And adjust few test_verifier tests accordingly.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Fixes: 475fb78fbf ("bpf: verifier (add branch/goto checks)")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110002638.4168352-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 20:11:20 -08:00
Anders Roxell fe69a1b1b6 selftests: bpf: xskxceiver: ksft_print_msg: fix format type error
Crossbuilding selftests/bpf for architecture arm64, format specifies
type error show up like.

xskxceiver.c:912:34: error: format specifies type 'int' but the argument
has type '__u64' (aka 'unsigned long long') [-Werror,-Wformat]
 ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%d]\n",
                                                                ~~
                                                                %llu
                __func__, pkt->pkt_nb, meta->count);
                                       ^~~~~~~~~~~
xskxceiver.c:929:55: error: format specifies type 'unsigned long long' but
 the argument has type 'u64' (aka 'unsigned long') [-Werror,-Wformat]
 ksft_print_msg("Frag invalid addr: %llx len: %u\n", addr, len);
                                    ~~~~             ^~~~

Fixing the issues by casting to (unsigned long long) and changing the
specifiers to be %llu from %d and %u, since with u64s it might be %llx
or %lx, depending on architecture.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20231109174328.1774571-1-anders.roxell@linaro.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 19:18:12 -08:00
Dave Marchevsky e9ed8df718 selftests/bpf: Test bpf_refcount_acquire of node obtained via direct ld
This patch demonstrates that verifier changes earlier in this series
result in bpf_refcount_acquire(mapval->stashed_kptr) passing
verification. The added test additionally validates that stashing a kptr
in mapval and - in a separate BPF program - refcount_acquiring the kptr
without unstashing works as expected at runtime.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20231107085639.3016113-7-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 19:07:51 -08:00
Dave Marchevsky f460e7bdb0 selftests/bpf: Add test passing MAYBE_NULL reg to bpf_refcount_acquire
The test added in this patch exercises the logic fixed in the previous
patch in this series. Before the previous patch's changes,
bpf_refcount_acquire accepts MAYBE_NULL local kptrs; after the change
the verifier correctly rejects the such a call.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20231107085639.3016113-3-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 19:07:51 -08:00
Andrii Nakryiko 27007fae70 veristat: add ability to filter top N results
Add ability to filter top B results, both in replay/verifier mode and
comparison mode. Just adding `-n10` will emit only first 10 rows, or
less, if there is not enough rows.

This is not just a shortcut instead of passing veristat output through
`head`, though. Filtering out all the other rows influences final table
formatting, as table column widths are calculated based on actual
emitted test.

To demonstrate the difference, compare two "equivalent" forms below, one
using head and another using -n argument.

TOP N FEATURE
=============
[vmuser@archvm bpf]$ sudo ./veristat -C ~/baseline-results-selftests.csv ~/sanity2-results-selftests.csv -e file,prog,insns,states -s '|insns_diff|' -n10
File                                      Program                Insns (A)  Insns (B)  Insns (DIFF)  States (A)  States (B)  States (DIFF)
----------------------------------------  ---------------------  ---------  ---------  ------------  ----------  ----------  -------------
test_seg6_loop.bpf.linked3.o              __add_egr_x                12440      12360  -80 (-0.64%)         364         357    -7 (-1.92%)
async_stack_depth.bpf.linked3.o           async_call_root_check        145        145   +0 (+0.00%)           3           3    +0 (+0.00%)
async_stack_depth.bpf.linked3.o           pseudo_call_check            139        139   +0 (+0.00%)           3           3    +0 (+0.00%)
atomic_bounds.bpf.linked3.o               sub                            7          7   +0 (+0.00%)           0           0    +0 (+0.00%)
bench_local_storage_create.bpf.linked3.o  kmalloc                        5          5   +0 (+0.00%)           0           0    +0 (+0.00%)
bench_local_storage_create.bpf.linked3.o  sched_process_fork            22         22   +0 (+0.00%)           2           2    +0 (+0.00%)
bench_local_storage_create.bpf.linked3.o  socket_post_create            23         23   +0 (+0.00%)           2           2    +0 (+0.00%)
bind4_prog.bpf.linked3.o                  bind_v4_prog                 358        358   +0 (+0.00%)          33          33    +0 (+0.00%)
bind6_prog.bpf.linked3.o                  bind_v6_prog                 429        429   +0 (+0.00%)          37          37    +0 (+0.00%)
bind_perm.bpf.linked3.o                   bind_v4_prog                  15         15   +0 (+0.00%)           1           1    +0 (+0.00%)

PIPING TO HEAD
==============
[vmuser@archvm bpf]$ sudo ./veristat -C ~/baseline-results-selftests.csv ~/sanity2-results-selftests.csv -e file,prog,insns,states -s '|insns_diff|' | head -n12
File                                                   Program                                               Insns (A)  Insns (B)  Insns (DIFF)  States (A)  States (B)  States (DIFF)
-----------------------------------------------------  ----------------------------------------------------  ---------  ---------  ------------  ----------  ----------  -------------
test_seg6_loop.bpf.linked3.o                           __add_egr_x                                               12440      12360  -80 (-0.64%)         364         357    -7 (-1.92%)
async_stack_depth.bpf.linked3.o                        async_call_root_check                                       145        145   +0 (+0.00%)           3           3    +0 (+0.00%)
async_stack_depth.bpf.linked3.o                        pseudo_call_check                                           139        139   +0 (+0.00%)           3           3    +0 (+0.00%)
atomic_bounds.bpf.linked3.o                            sub                                                           7          7   +0 (+0.00%)           0           0    +0 (+0.00%)
bench_local_storage_create.bpf.linked3.o               kmalloc                                                       5          5   +0 (+0.00%)           0           0    +0 (+0.00%)
bench_local_storage_create.bpf.linked3.o               sched_process_fork                                           22         22   +0 (+0.00%)           2           2    +0 (+0.00%)
bench_local_storage_create.bpf.linked3.o               socket_post_create                                           23         23   +0 (+0.00%)           2           2    +0 (+0.00%)
bind4_prog.bpf.linked3.o                               bind_v4_prog                                                358        358   +0 (+0.00%)          33          33    +0 (+0.00%)
bind6_prog.bpf.linked3.o                               bind_v6_prog                                                429        429   +0 (+0.00%)          37          37    +0 (+0.00%)
bind_perm.bpf.linked3.o                                bind_v4_prog                                                 15         15   +0 (+0.00%)           1           1    +0 (+0.00%)

Note all the wasted whitespace in the "PIPING TO HEAD" variant.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231108051430.1830950-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 19:07:51 -08:00
Andrii Nakryiko 5d4a7aaca1 veristat: add ability to sort by stat's absolute value
Add ability to sort results by absolute values of specified stats. This
is especially useful to find biggest deviations in comparison mode. When
comparing verifier change effect against a large base of BPF object
files, it's necessary to see big changes both in positive and negative
directions, as both might be a signal for regressions or bugs.

The syntax is natural, e.g., adding `-s '|insns_diff|'^` will instruct
veristat to sort by absolute value of instructions difference in
ascending order.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231108051430.1830950-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 19:07:51 -08:00
Anders Roxell f2d2c7e1b7 selftests/bpf: Disable CONFIG_DEBUG_INFO_REDUCED in config.aarch64
Building an arm64 kernel and seftests/bpf with defconfig +
selftests/bpf/config and selftests/bpf/config.aarch64 the fragment
CONFIG_DEBUG_INFO_REDUCED is enabled in arm64's defconfig, it should be
disabled in file sefltests/bpf/config.aarch64 since if its not disabled
CONFIG_DEBUG_INFO_BTF wont be enabled.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231103220912.333930-1-anders.roxell@linaro.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 19:07:38 -08:00
Manu Bretelle b0cf0dcde8 selftests/bpf: Consolidate VIRTIO/9P configs in config.vm file
Those configs are needed to be able to run VM somewhat consistently.
For instance, ATM, s390x is missing the `CONFIG_VIRTIO_CONSOLE` which
prevents s390x kernels built in CI to leverage qemu-guest-agent.

By moving them to `config,vm`, we should have selftest kernels which are
equal in term of VM functionalities when they include this file.

The set of config unabled were picked using

    grep -h -E '(_9P|_VIRTIO)' config.x86_64 config | sort | uniq

added to `config.vm` and then
    grep -vE '(_9P|_VIRTIO)' config.{x86_64,aarch64,s390x}

as a side-effect, some config may have disappeared to the aarch64 and
s390x kernels, but they should not be needed. CI will tell.

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231031212717.4037892-1-chantr4@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 19:07:38 -08:00
Hou Tao 2f553b032c selftsets/bpf: Retry map update for non-preallocated per-cpu map
BPF CI failed due to map_percpu_stats_percpu_hash from time to time [1].
It seems that the failure reason is per-cpu bpf memory allocator may not
be able to allocate per-cpu pointer successfully and it can not refill
free llist timely, and bpf_map_update_elem() will return -ENOMEM.

So mitigate the problem by retrying the update operation for
non-preallocated per-cpu map.

[1]: https://github.com/kernel-patches/bpf/actions/runs/6713177520/job/18244865326?pr=5909

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231101032455.3808547-4-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 18:58:40 -08:00
Hou Tao b9b7955316 selftests/bpf: Export map_update_retriable()
Export map_update_retriable() to make it usable for other map_test
cases. These cases may only need retry for specific errno, so add
a new callback parameter to let map_update_retriable() decide whether or
not the errno is retriable.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231101032455.3808547-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 18:58:40 -08:00
Hou Tao d79924ca57 selftests/bpf: Use value with enough-size when updating per-cpu map
When updating per-cpu map in map_percpu_stats test, patch_map_thread()
only passes 4-bytes-sized value to bpf_map_update_elem(). The expected
size of the value is 8 * num_possible_cpus(), so fix it by passing a
value with enough-size for per-cpu map update.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231101032455.3808547-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 18:58:40 -08:00
Andrii Nakryiko f4c7e88732 selftests/bpf: satisfy compiler by having explicit return in btf test
Some compilers complain about get_pprint_mapv_size() not returning value
in some code paths. Fix with explicit return.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231102033759.2541186-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 18:58:38 -08:00
Andrii Nakryiko 2b62aa59d0 selftests/bpf: fix RELEASE=1 build for tc_opts
Compiler complains about malloc(). We also don't need to dynamically
allocate anything, so make the life easier by using statically sized
buffer.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231102033759.2541186-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 18:58:38 -08:00
Yuran Pereira bf4a64b932 selftests/bpf: Add malloc failure checks in bpf_iter
Since some malloc calls in bpf_iter may at times fail,
this patch adds the appropriate fail checks, and ensures that
any previously allocated resource is appropriately destroyed
before returning the function.

Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/DB3PR10MB6835F0ECA792265FA41FC39BE8A3A@DB3PR10MB6835.EURPRD10.PROD.OUTLOOK.COM
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 18:58:38 -08:00
Yuran Pereira fac85c291e selftests/bpf: Convert CHECK macros to ASSERT_* macros in bpf_iter
As it was pointed out by Yonghong Song [1], in the bpf selftests the use
of the ASSERT_* series of macros is preferred over the CHECK macro.
This patch replaces all CHECK calls in bpf_iter with the appropriate
ASSERT_* macros.

[1] https://lore.kernel.org/lkml/0a142924-633c-44e6-9a92-2dc019656bf2@linux.dev

Suggested-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/DB3PR10MB6835E9C8DFCA226DD6FEF914E8A3A@DB3PR10MB6835.EURPRD10.PROD.OUTLOOK.COM
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-09 18:58:38 -08:00
Linus Torvalds 89cdf9d556 Including fixes from netfilter and bpf.
Current release - regressions:
 
  - sched: fix SKB_NOT_DROPPED_YET splat under debug config
 
 Current release - new code bugs:
 
  - tcp: fix usec timestamps with TCP fastopen
 
  - tcp_sigpool: fix some off by one bugs
 
  - tcp: fix possible out-of-bounds reads in tcp_hash_fail()
 
  - tcp: fix SYN option room calculation for TCP-AO
 
  - bpf: fix compilation error without CGROUPS
 
  - ptp:
    - ptp_read() should not release queue
    - fix tsevqs corruption
 
 Previous releases - regressions:
 
  - llc: verify mac len before reading mac header
 
 Previous releases - always broken:
 
  - bpf:
    - fix check_stack_write_fixed_off() to correctly spill imm
    - fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END
    - check map->usercnt after timer->timer is assigned
 
  - dsa: lan9303: consequently nested-lock physical MDIO
 
  - dccp/tcp: call security_inet_conn_request() after setting IP addr
 
  - tg3: fix the TX ring stall due to incorrect full ring handling
 
  - phylink: initialize carrier state at creation
 
  - ice: fix direction of VF rules in switchdev mode
 
 Misc:
 
  - fill in a bunch of missing MODULE_DESCRIPTION()s, more to come
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmVNRnAACgkQMUZtbf5S
 IrsYaA/+IUoYi96/oLtvvrET6HIbXeMaLKef0UlytEicQKy8h5EWlhcTZPhQEY0g
 dtaKOemQsO0dQTma4eQBiBDHeCeSkitgD9p7fh0i+//QFYWSFqHrBiF2mlToc/ZQ
 T1p4BlVL7D2Xsr1Lki93zk+EhFGEy2KroYgrWbZc9TWE5ap9PtSVF9eqeHAVCmZ7
 ocre/eo4pqUM9rAHIAyhoL+0xtVQ59dBevbJC0qYcmflhafr82Gtdveo6pBBKuYm
 GhwbRrAXER3Neav9c6NHqat4zsMwGpC27SiN9dYWm6dlkeS9U9t2PUu71OkJGVfw
 VaSE+utkC/WmzGbuiUIjqQLBrRe372ItHCr78BfSRMshS+RBTHtoK7njeH8Iv67E
 RsMeCyVNj9dtGlOQG5JAv8IoCQ1WbMw9B36Yzw3ip/MmDX/ntXz7Dcr4ZMZ6VURS
 CHhHFZPnmMykMXkT6SIlxeAg2r8ELtESzkvLimdTVFPAlk3cPkibKJbh3F/tEqXS
 PDb3y0uoEgRQBAsWXXx9FQEvv9rTL6YrzbMhmJBIIEoNxppQYQ7FZBJX9utAVp5B
 1GdyqhR6IRTaKb9cMRj/K1xPwm2KgCw9xj9pjKdAA7QUMslXbFp8blv1rIkFGshg
 hiNXmPcI8wo0j+0lZYktEcIERL5y6c8BgK2NnPU6RULua96tuQ4=
 =k6Wk
 -----END PGP SIGNATURE-----

Merge tag 'net-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter and bpf.

  Current release - regressions:

   - sched: fix SKB_NOT_DROPPED_YET splat under debug config

  Current release - new code bugs:

   - tcp:
       - fix usec timestamps with TCP fastopen
       - fix possible out-of-bounds reads in tcp_hash_fail()
       - fix SYN option room calculation for TCP-AO

   - tcp_sigpool: fix some off by one bugs

   - bpf: fix compilation error without CGROUPS

   - ptp:
       - ptp_read() should not release queue
       - fix tsevqs corruption

  Previous releases - regressions:

   - llc: verify mac len before reading mac header

  Previous releases - always broken:

   - bpf:
       - fix check_stack_write_fixed_off() to correctly spill imm
       - fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END
       - check map->usercnt after timer->timer is assigned

   - dsa: lan9303: consequently nested-lock physical MDIO

   - dccp/tcp: call security_inet_conn_request() after setting IP addr

   - tg3: fix the TX ring stall due to incorrect full ring handling

   - phylink: initialize carrier state at creation

   - ice: fix direction of VF rules in switchdev mode

  Misc:

   - fill in a bunch of missing MODULE_DESCRIPTION()s, more to come"

* tag 'net-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
  net: ti: icss-iep: fix setting counter value
  ptp: fix corrupted list in ptp_open
  ptp: ptp_read should not release queue
  net_sched: sch_fq: better validate TCA_FQ_WEIGHTS and TCA_FQ_PRIOMAP
  net: kcm: fill in MODULE_DESCRIPTION()
  net/sched: act_ct: Always fill offloading tuple iifidx
  netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses
  netfilter: xt_recent: fix (increase) ipv6 literal buffer length
  ipvs: add missing module descriptions
  netfilter: nf_tables: remove catchall element in GC sync path
  netfilter: add missing module descriptions
  drivers/net/ppp: use standard array-copy-function
  net: enetc: shorten enetc_setup_xdp_prog() error message to fit NETLINK_MAX_FMTMSG_LEN
  virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt()
  r8169: respect userspace disabling IFF_MULTICAST
  selftests/bpf: get trusted cgrp from bpf_iter__cgroup directly
  bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg
  net: phylink: initialize carrier state at creation
  test/vsock: add dobule bind connect test
  test/vsock: refactor vsock_accept
  ...
2023-11-09 17:09:35 -08:00
Jakub Kicinski 942b8b38de bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZUsiDAAKCRDbK58LschI
 g9xXAQCaFjj55sXDpr1qKG2D3PMSDURx7SzmpzIay/A/dqVDPgEAlgU6XsMW6w6S
 poMN8KniDLtBgj6nIKfJEAgIXeIYTAs=
 =qXjW
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2023-11-08

We've added 16 non-merge commits during the last 6 day(s) which contain
a total of 30 files changed, 341 insertions(+), 130 deletions(-).

The main changes are:

1) Fix a BPF verifier issue in precision tracking for BPF_ALU | BPF_TO_BE |
   BPF_END where the source register was incorrectly marked as precise,
   from Shung-Hsi Yu.

2) Fix a concurrency issue in bpf_timer where the former could still have
   been alive after an application releases or unpins the map, from Hou Tao.

3) Fix a BPF verifier issue where immediates are incorrectly cast to u32
   before being spilled and therefore losing sign information, from Hao Sun.

4) Fix a misplaced BPF_TRACE_ITER in check_css_task_iter_allowlist which
   incorrectly compared bpf_prog_type with bpf_attach_type, from Chuyi Zhou.

5) Add __bpf_hook_{start,end} as well as __bpf_kfunc_{start,end}_defs macros,
   migrate all BPF-related __diag callsites over to it, and add a new
   __diag_ignore_all for -Wmissing-declarations to the macros to address
   recent build warnings, from Dave Marchevsky.

6) Fix broken BPF selftest build of xdp_hw_metadata test on architectures
   where char is not signed, from Björn Töpel.

7) Fix test_maps selftest to properly use LIBBPF_OPTS() macro to initialize
   the bpf_map_create_opts, from Andrii Nakryiko.

8) Fix bpffs selftest to avoid unmounting /sys/kernel/debug as it may have
   been mounted and used by other applications already, from Manu Bretelle.

9) Fix a build issue without CONFIG_CGROUPS wrt css_task open-coded
   iterators, from Matthieu Baerts.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: get trusted cgrp from bpf_iter__cgroup directly
  bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg
  selftests/bpf: Fix broken build where char is unsigned
  selftests/bpf: precision tracking test for BPF_NEG and BPF_END
  bpf: Fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END
  selftests/bpf: Add test for using css_task iter in sleepable progs
  selftests/bpf: Add tests for css_task iter combining with cgroup iter
  bpf: Relax allowlist for css_task iter
  selftests/bpf: fix test_maps' use of bpf_map_create_opts
  bpf: Check map->usercnt after timer->timer is assigned
  bpf: Add __bpf_hook_{start,end} macros
  bpf: Add __bpf_kfunc_{start,end}_defs macros
  selftests/bpf: fix test_bpffs
  selftests/bpf: Add test for immediate spilled to stack
  bpf: Fix check_stack_write_fixed_off() to correctly spill imm
  bpf: fix compilation error without CGROUPS
====================

Link: https://lore.kernel.org/r/20231108132448.1970-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-08 17:56:14 -08:00
Linus Torvalds d46392bbf5 RISC-V Patches for the 6.7 Merge Window, Part 1
* Support for cbo.zero in userspace.
 * Support for CBOs on ACPI-based systems.
 * A handful of improvements for the T-Head cache flushing ops.
 * Support for software shadow call stacks.
 * Various cleanups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmVJAJoTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWZrD/9ECV/0tuX5LbS56kA0ElkwiakyIVGu
 ZVuF26yGJ6w+XvwnHPhqKNVN0ReYR6s6CwH1WpHI5Du9QHZGQU3DKJ43dFMTP3Dn
 dQFli7QJ+tsNo1nre8NZWKj5Ac+Cu906F794qM0q0XrZmyb9DY3ojVYJAYy+dtoo
 /9gwbB7P0GLyDlURLn48oQyz36WQW3CkL5Jkfu+uYwnFe9DAFtfakIKq5mLlNuaH
 PgUk8pAVhSy2GdPOGFtnFFhdXMrTjpgxdo62ZIZC0lbsts26Dxp95oUygqMg51Iy
 ilaXkA2U1c1+gFQNpEove7BVZa5708Kaj6RLQ3/kAJblAzibszwQvIWlWOh7RVni
 3GQAS7/0D0+0cjDwXdWaPIaFFzLfi3bDxRYkc7n59p6nOz+GrxnSNsRPQJGgYxeU
 oTtJfaqWKntm72iutiHmXgx/pvAxWOHpqDnSTlDdtjvgzXCplqBbxZFF/azj30o5
 jplNW5YvdvD9fviYMAoGSOz03IwDeZF5rMlAhqu6vXlyD2//mID82yw/hBluIA3+
 /hLo5QfTLiUGs9nnijxMcfoyusN6AXsJOxwYdAJCIuJOr78YUj0S974gd9KvJXma
 KedrwRVwW7KE7CwY1jhrWBsZEpzl8YrtpMDN47y4gRtDZN8XJMQ+lHqd+BHT/DUO
 TGUCYi5xvr6Vlw==
 =hKWl
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for cbo.zero in userspace

 - Support for CBOs on ACPI-based systems

 - A handful of improvements for the T-Head cache flushing ops

 - Support for software shadow call stacks

 - Various cleanups and fixes

* tag 'riscv-for-linus-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits)
  RISC-V: hwprobe: Fix vDSO SIGSEGV
  riscv: configs: defconfig: Enable configs required for RZ/Five SoC
  riscv: errata: prefix T-Head mnemonics with th.
  riscv: put interrupt entries into .irqentry.text
  riscv: mm: Update the comment of CONFIG_PAGE_OFFSET
  riscv: Using TOOLCHAIN_HAS_ZIHINTPAUSE marco replace zihintpause
  riscv/mm: Fix the comment for swap pte format
  RISC-V: clarify the QEMU workaround in ISA parser
  riscv: correct pt_level name via pgtable_l5/4_enabled
  RISC-V: Provide pgtable_l5_enabled on rv32
  clocksource: timer-riscv: Increase rating of clock_event_device for Sstc
  clocksource: timer-riscv: Don't enable/disable timer interrupt
  lkdtm: Fix CFI_BACKWARD on RISC-V
  riscv: Use separate IRQ shadow call stacks
  riscv: Implement Shadow Call Stack
  riscv: Move global pointer loading to a macro
  riscv: Deduplicate IRQ stack switching
  riscv: VMAP_STACK overflow detection thread-safe
  RISC-V: cacheflush: Initialize CBO variables on ACPI systems
  RISC-V: ACPI: RHCT: Add function to get CBO block sizes
  ...
2023-11-08 09:21:18 -08:00
Hengqi Chen 1d375d6546 selftests/bpf: Enable cpu v4 tests for LoongArch
Enable the cpu v4 tests for LoongArch. Currently, we don't have BPF
trampoline in LoongArch JIT, so the fentry test `test_ptr_struct_arg`
still failed, will followup.

Test result attached below:

  # ./test_progs -t verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap
  #316/1   verifier_bswap/BSWAP, 16:OK
  #316/2   verifier_bswap/BSWAP, 16 @unpriv:OK
  #316/3   verifier_bswap/BSWAP, 32:OK
  #316/4   verifier_bswap/BSWAP, 32 @unpriv:OK
  #316/5   verifier_bswap/BSWAP, 64:OK
  #316/6   verifier_bswap/BSWAP, 64 @unpriv:OK
  #316     verifier_bswap:OK
  #330/1   verifier_gotol/gotol, small_imm:OK
  #330/2   verifier_gotol/gotol, small_imm @unpriv:OK
  #330     verifier_gotol:OK
  #338/1   verifier_ldsx/LDSX, S8:OK
  #338/2   verifier_ldsx/LDSX, S8 @unpriv:OK
  #338/3   verifier_ldsx/LDSX, S16:OK
  #338/4   verifier_ldsx/LDSX, S16 @unpriv:OK
  #338/5   verifier_ldsx/LDSX, S32:OK
  #338/6   verifier_ldsx/LDSX, S32 @unpriv:OK
  #338/7   verifier_ldsx/LDSX, S8 range checking, privileged:OK
  #338/8   verifier_ldsx/LDSX, S16 range checking:OK
  #338/9   verifier_ldsx/LDSX, S16 range checking @unpriv:OK
  #338/10  verifier_ldsx/LDSX, S32 range checking:OK
  #338/11  verifier_ldsx/LDSX, S32 range checking @unpriv:OK
  #338     verifier_ldsx:OK
  #349/1   verifier_movsx/MOV32SX, S8:OK
  #349/2   verifier_movsx/MOV32SX, S8 @unpriv:OK
  #349/3   verifier_movsx/MOV32SX, S16:OK
  #349/4   verifier_movsx/MOV32SX, S16 @unpriv:OK
  #349/5   verifier_movsx/MOV64SX, S8:OK
  #349/6   verifier_movsx/MOV64SX, S8 @unpriv:OK
  #349/7   verifier_movsx/MOV64SX, S16:OK
  #349/8   verifier_movsx/MOV64SX, S16 @unpriv:OK
  #349/9   verifier_movsx/MOV64SX, S32:OK
  #349/10  verifier_movsx/MOV64SX, S32 @unpriv:OK
  #349/11  verifier_movsx/MOV32SX, S8, range_check:OK
  #349/12  verifier_movsx/MOV32SX, S8, range_check @unpriv:OK
  #349/13  verifier_movsx/MOV32SX, S16, range_check:OK
  #349/14  verifier_movsx/MOV32SX, S16, range_check @unpriv:OK
  #349/15  verifier_movsx/MOV32SX, S16, range_check 2:OK
  #349/16  verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK
  #349/17  verifier_movsx/MOV64SX, S8, range_check:OK
  #349/18  verifier_movsx/MOV64SX, S8, range_check @unpriv:OK
  #349/19  verifier_movsx/MOV64SX, S16, range_check:OK
  #349/20  verifier_movsx/MOV64SX, S16, range_check @unpriv:OK
  #349/21  verifier_movsx/MOV64SX, S32, range_check:OK
  #349/22  verifier_movsx/MOV64SX, S32, range_check @unpriv:OK
  #349/23  verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK
  #349/24  verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK
  #349     verifier_movsx:OK
  #361/1   verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK
  #361/2   verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK
  #361/3   verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK
  #361/4   verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK
  #361/5   verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK
  #361/6   verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK
  #361/7   verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK
  #361/8   verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK
  #361/9   verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK
  #361/10  verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK
  #361/11  verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK
  #361/12  verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK
  #361/13  verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK
  #361/14  verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK
  #361/15  verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK
  #361/16  verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK
  #361/17  verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK
  #361/18  verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK
  #361/19  verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK
  #361/20  verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK
  #361/21  verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK
  #361/22  verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK
  #361/23  verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK
  #361/24  verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK
  #361/25  verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK
  #361/26  verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK
  #361/27  verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK
  #361/28  verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK
  #361/29  verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK
  #361/30  verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK
  #361/31  verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK
  #361/32  verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK
  #361/33  verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK
  #361/34  verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK
  #361/35  verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK
  #361/36  verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK
  #361/37  verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK
  #361/38  verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK
  #361/39  verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK
  #361/40  verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK
  #361/41  verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK
  #361/42  verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK
  #361/43  verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK
  #361/44  verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK
  #361/45  verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK
  #361/46  verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK
  #361/47  verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK
  #361/48  verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK
  #361/49  verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK
  #361/50  verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK
  #361/51  verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK
  #361/52  verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK
  #361/53  verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK
  #361/54  verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK
  #361/55  verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK
  #361/56  verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK
  #361/57  verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK
  #361/58  verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK
  #361/59  verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK
  #361/60  verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK
  #361/61  verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK
  #361/62  verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK
  #361/63  verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK
  #361/64  verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK
  #361/65  verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK
  #361/66  verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK
  #361/67  verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK
  #361/68  verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK
  #361/69  verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK
  #361/70  verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK
  #361/71  verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK
  #361/72  verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK
  #361/73  verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK
  #361/74  verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK
  #361/75  verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK
  #361/76  verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK
  #361/77  verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK
  #361/78  verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK
  #361/79  verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK
  #361/80  verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK
  #361/81  verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK
  #361/82  verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK
  #361/83  verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK
  #361/84  verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK
  #361/85  verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK
  #361/86  verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK
  #361/87  verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK
  #361/88  verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK
  #361/89  verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK
  #361/90  verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK
  #361/91  verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK
  #361/92  verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK
  #361/93  verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK
  #361/94  verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK
  #361/95  verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK
  #361/96  verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK
  #361/97  verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK
  #361/98  verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK
  #361/99  verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK
  #361/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK
  #361/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK
  #361/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK
  #361/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK
  #361/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK
  #361/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK
  #361/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK
  #361/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK
  #361/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK
  #361/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK
  #361/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK
  #361/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK
  #361/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK
  #361/113 verifier_sdiv/SDIV32, zero divisor:OK
  #361/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK
  #361/115 verifier_sdiv/SDIV64, zero divisor:OK
  #361/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK
  #361/117 verifier_sdiv/SMOD32, zero divisor:OK
  #361/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK
  #361/119 verifier_sdiv/SMOD64, zero divisor:OK
  #361/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK
  #361     verifier_sdiv:OK
  Summary: 5/163 PASSED, 0 SKIPPED, 0 FAILED

  # ./test_progs -t ldsx_insn
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116/2   ldsx_insn/ctx_member_sign_ext:OK
  #116/3   ldsx_insn/ctx_member_narrow_sign_ext:OK
  #116     ldsx_insn:FAIL

  All error logs:
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116     ldsx_insn:FAIL
  Summary: 0/2 PASSED, 0 SKIPPED, 1 FAILED

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-11-08 14:12:21 +08:00
Chuyi Zhou 3c5864ba9c selftests/bpf: get trusted cgrp from bpf_iter__cgroup directly
Commit f49843afde (selftests/bpf: Add tests for css_task iter combining
with cgroup iter) added a test which demonstrates how css_task iter can be
combined with cgroup iter. That test used bpf_cgroup_from_id() to convert
bpf_iter__cgroup->cgroup to a trusted ptr which is pointless now, since
with the previous fix, we can get a trusted cgroup directly from
bpf_iter__cgroup.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231107132204.912120-3-zhouchuyi@bytedance.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-07 15:28:06 -08:00
Filippo Storniolo d80f63f690 test/vsock: add dobule bind connect test
This add bind connect test which creates a listening server socket
and tries to connect a client with a bound local port to it twice.

Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-07 22:27:07 +00:00
Filippo Storniolo 84d5fb9741 test/vsock: refactor vsock_accept
This is a preliminary patch to introduce SOCK_STREAM bind connect test.
vsock_accept() is split into vsock_listen() and vsock_accept().

Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-07 22:27:07 +00:00
Filippo Storniolo bfada5a767 test/vsock fix: add missing check on socket creation
Add check on socket() return value in vsock_listen()
and vsock_connect()

Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-07 22:27:07 +00:00
Linus Torvalds b8cc56d041 cxl for v6.7
- Add support for RCH (Restricted CXL Host) Error recovery
 
 - Fix several region assembly bugs
 
 - Fix mem-device lifetime issues relative to the sanitize command and
   RCH topology.
 
 - Refactor ACPI table parsing for CDAT parsing re-use in preparation for
   CXL QOS support.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZUaowQAKCRDfioYZHlFs
 Z75rAP44azzLPwJtva7Ur60KpNsGuoZKhvWWdeI1/zo9k4pHbwEA/Vaf/GGo0U5k
 bMkoTmwPTd7YY79B5HNUQSZsqF9wlAc=
 =TEQ0
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "The main new functionality this time is work to allow Linux to
  natively handle CXL link protocol errors signalled via PCIe AER for
  current generation CXL platforms. This required some enlightenment of
  the PCIe AER core to workaround the fact that current generation RCH
  (Restricted CXL Host) platforms physically hide topology details and
  registers via a mechanism called RCRB (Root Complex Register Block).

  The next major highlight is reworks to address bugs in parsing region
  configurations for next generation VH (Virtual Host) topologies. The
  old broken algorithm is replaced with a simpler one that significantly
  increases the number of region configurations supported by Linux. This
  is again relevant for error handling so that forward and reverse
  address translation of memory errors can be carried out by Linux for
  memory regions instantiated by platform firmware.

  As for other cross-tree work, the ACPI table parsing code has been
  refactored for reuse parsing the "CDAT" structure which is an
  ACPI-like data structure that is reported by CXL devices. That work is
  in preparation for v6.8 support for CXL QoS. Think of this as dynamic
  generation of NUMA node topology information generated by Linux rather
  than platform firmware.

  Lastly, a number of internal object lifetime issues have been resolved
  along with misc. fixes and feature updates (decoders_committed sysfs
  ABI).

  Summary:

   - Add support for RCH (Restricted CXL Host) Error recovery

   - Fix several region assembly bugs

   - Fix mem-device lifetime issues relative to the sanitize command and
     RCH topology.

   - Refactor ACPI table parsing for CDAT parsing re-use in preparation
     for CXL QOS support"

* tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (50 commits)
  lib/fw_table: Remove acpi_parse_entries_array() export
  cxl/pci: Change CXL AER support check to use native AER
  cxl/hdm: Remove broken error path
  cxl/hdm: Fix && vs || bug
  acpi: Move common tables helper functions to common lib
  cxl: Add support for reading CXL switch CDAT table
  cxl: Add checksum verification to CDAT from CXL
  cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
  cxl: Add decoders_committed sysfs attribute to cxl_port
  cxl: Add cxl_decoders_committed() helper
  cxl/core/regs: Rework cxl_map_pmu_regs() to use map->dev for devm
  cxl/core/regs: Rename phys_addr in cxl_map_component_regs()
  PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling
  PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler
  cxl/pci: Disable root port interrupts in RCH mode
  cxl/pci: Add RCH downstream port error logging
  cxl/pci: Map RCH downstream AER registers for logging protocol errors
  cxl/pci: Update CXL error logging to use RAS register address
  PCI/AER: Refactor cper_print_aer() for use by CXL driver module
  cxl/pci: Add RCH downstream port AER register discovery
  ...
2023-11-04 16:20:36 -10:00
Linus Torvalds 136cc1e1f5 Landlock updates for v6.7-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYIAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZUOZKRAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSoaIBAMHG8wxzRcTMplddgQHXmbWPByFIjhA0hqqp
 +hEgLFfyAQCqLPi4fW49CokrkynATKXTLMIBfZ37EYZ3llJgveHTDw==
 =rPTd
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock updates from Mickaël Salaün:
 "A Landlock ruleset can now handle two new access rights:
  LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP. When
  handled, the related actions are denied unless explicitly allowed by a
  Landlock network rule for a specific port.

  The related patch series has been reviewed for almost two years, it
  has evolved a lot and we now have reached a decent design, code and
  testing. The refactored kernel code and the new test helpers also
  bring the foundation to support more network protocols.

  Test coverage for security/landlock is 92.4% of 710 lines according to
  gcc/gcov-13, and it was 93.1% of 597 lines before this series. The
  decrease in coverage is due to code refactoring to make the ruleset
  management more generic (i.e. dealing with inodes and ports) that also
  added new WARN_ON_ONCE() checks not possible to test from user space.

  syzkaller has been updated accordingly [4], and such patched instance
  (tailored to Landlock) has been running for a month, covering all the
  new network-related code [5]"

Link: https://lore.kernel.org/r/20231026014751.414649-1-konstantin.meskhidze@huawei.com [1]
Link: https://lore.kernel.org/r/CAHC9VhS1wwgH6NNd+cJz4MYogPiRV8NyPDd1yj5SpaxeUB4UVg@mail.gmail.com [2]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next-history.git/commit/?id=c8dc5ee69d3a [3]
Link: https://github.com/google/syzkaller/pull/4266 [4]
Link: https://storage.googleapis.com/syzbot-assets/82e8608dec36/ci-upstream-linux-next-kasan-gce-root-ab577164.html#security%2flandlock%2fnet.c [5]

* tag 'landlock-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/landlock: Add tests for FS topology changes with network rules
  landlock: Document network support
  samples/landlock: Support TCP restrictions
  selftests/landlock: Add network tests
  selftests/landlock: Share enforce_ruleset() helper
  landlock: Support network rules with TCP bind and connect
  landlock: Refactor landlock_add_rule() syscall
  landlock: Refactor layer helpers
  landlock: Move and rename layer helpers
  landlock: Refactor merge/inherit_ruleset helpers
  landlock: Refactor landlock_find_rule/insert_rule helpers
  landlock: Allow FS topology changes for domains without such rule type
  landlock: Make ruleset's access masks more generic
2023-11-03 09:28:53 -10:00
Linus Torvalds 31e5f934ff Tracing updates for v6.7:
- Remove eventfs_file descriptor
 
   This is the biggest change, and the second part of making eventfs
   create its files dynamically.
 
   In 6.6 the first part was added, and that maintained a one to one
   mapping between eventfs meta descriptors and the directories and
   file inodes and dentries that were dynamically created. The
   directories were represented by a eventfs_inode and the files
   were represented by a eventfs_file.
 
   In v6.7 the eventfs_file is removed. As all events have the same
   directory make up (sched_switch has an "enable", "id", "format",
   etc files), the handing of what files are underneath each leaf
   eventfs directory is moved back to the tracing subsystem via a
   callback. When a event is added to the eventfs, it registers
   an array of evenfs_entry's. These hold the names of the files and
   the callbacks to call when the file is referenced. The callback gets
   the name so that the same callback may be used by multiple files.
   The callback then supplies the filesystem_operations structure needed
   to create this file.
 
   This has brought the memory footprint of creating multiple eventfs
   instances down by 2 megs each!
 
 - User events now has persistent events that are not associated
   to a single processes. These are privileged events that hang around
   even if no process is attached to them.
 
 - Clean up of seq_buf.
   There's talk about using seq_buf more to replace strscpy() and friends.
   But this also requires some minor modifications of seq_buf to be
   able to do this.
 
 - Expand instance ring buffers individually
   Currently if boot up creates an instance, and a trace event is
   enabled on that instance, the ring buffer for that instance and the
   top level ring buffer are expanded (1.4 MB per CPU). This wastes
   memory as this happens when nothing is using the top level instance.
 
 - Other minor clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZUMrBBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6quzVAQCed/kPM7X9j2QZamJVDruMf2CmVxpu
 /TOvKvSKV584GgEAxLntf5VKx1Q98bc68y3Zkg+OCi8jSgORos1ROmURhws=
 =iIgb
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Remove eventfs_file descriptor

   This is the biggest change, and the second part of making eventfs
   create its files dynamically.

   In 6.6 the first part was added, and that maintained a one to one
   mapping between eventfs meta descriptors and the directories and file
   inodes and dentries that were dynamically created. The directories
   were represented by a eventfs_inode and the files were represented by
   a eventfs_file.

   In v6.7 the eventfs_file is removed. As all events have the same
   directory make up (sched_switch has an "enable", "id", "format", etc
   files), the handing of what files are underneath each leaf eventfs
   directory is moved back to the tracing subsystem via a callback.

   When an event is added to the eventfs, it registers an array of
   evenfs_entry's. These hold the names of the files and the callbacks
   to call when the file is referenced. The callback gets the name so
   that the same callback may be used by multiple files. The callback
   then supplies the filesystem_operations structure needed to create
   this file.

   This has brought the memory footprint of creating multiple eventfs
   instances down by 2 megs each!

 - User events now has persistent events that are not associated to a
   single processes. These are privileged events that hang around even
   if no process is attached to them

 - Clean up of seq_buf

   There's talk about using seq_buf more to replace strscpy() and
   friends. But this also requires some minor modifications of seq_buf
   to be able to do this

 - Expand instance ring buffers individually

   Currently if boot up creates an instance, and a trace event is
   enabled on that instance, the ring buffer for that instance and the
   top level ring buffer are expanded (1.4 MB per CPU). This wastes
   memory as this happens when nothing is using the top level instance

 - Other minor clean ups and fixes

* tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits)
  seq_buf: Export seq_buf_puts()
  seq_buf: Export seq_buf_putc()
  eventfs: Use simple_recursive_removal() to clean up dentries
  eventfs: Remove special processing of dput() of events directory
  eventfs: Delete eventfs_inode when the last dentry is freed
  eventfs: Hold eventfs_mutex when calling callback functions
  eventfs: Save ownership and mode
  eventfs: Test for ei->is_freed when accessing ei->dentry
  eventfs: Have a free_ei() that just frees the eventfs_inode
  eventfs: Remove "is_freed" union with rcu head
  eventfs: Fix kerneldoc of eventfs_remove_rec()
  tracing: Have the user copy of synthetic event address use correct context
  eventfs: Remove extra dget() in eventfs_create_events_dir()
  tracing: Have trace_event_file have ref counters
  seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()
  eventfs: Fix typo in eventfs_inode union comment
  eventfs: Fix WARN_ON() in create_file_dentry()
  powerpc: Remove initialisation of readpos
  tracing/histograms: Simplify last_cmd_set()
  seq_buf: fix a misleading comment
  ...
2023-11-03 07:41:18 -10:00
Hangbin Liu 63e201916b selftests: pmtu.sh: fix result checking
In the PMTU test, when all previous tests are skipped and the new test
passes, the exit code is set to 0. However, the current check mistakenly
treats this as an assignment, causing the check to pass every time.

Consequently, regardless of how many tests have failed, if the latest test
passes, the PMTU test will report a pass.

Fixes: 2a9d3716b8 ("selftests: pmtu.sh: improve the test result processing")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-03 09:15:42 +00:00
Linus Torvalds 8f6f76a6a2 As usual, lots of singleton and doubleton patches all over the tree and
there's little I can say which isn't in the individual changelogs.
 
 The lengthier patch series are
 
 - "kdump: use generic functions to simplify crashkernel reservation in
   arch", from Baoquan He.  This is mainly cleanups and consolidation of
   the "crashkernel=" kernel parameter handling.
 
 - After much discussion, David Laight's "minmax: Relax type checks in
   min() and max()" is here.  Hopefully reduces some typecasting and the
   use of min_t() and max_t().
 
 - A group of patches from Oleg Nesterov which clean up and slightly fix
   our handling of reads from /proc/PID/task/...  and which remove
   task_struct.therad_group.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZUQP9wAKCRDdBJ7gKXxA
 jmOAAQDh8sxagQYocoVsSm28ICqXFeaY9Co1jzBIDdNesAvYVwD/c2DHRqJHEiS4
 63BNcG3+hM9nwGJHb5lyh5m79nBMRg0=
 =On4u
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "As usual, lots of singleton and doubleton patches all over the tree
  and there's little I can say which isn't in the individual changelogs.

  The lengthier patch series are

   - 'kdump: use generic functions to simplify crashkernel reservation
     in arch', from Baoquan He. This is mainly cleanups and
     consolidation of the 'crashkernel=' kernel parameter handling

   - After much discussion, David Laight's 'minmax: Relax type checks in
     min() and max()' is here. Hopefully reduces some typecasting and
     the use of min_t() and max_t()

   - A group of patches from Oleg Nesterov which clean up and slightly
     fix our handling of reads from /proc/PID/task/... and which remove
     task_struct.thread_group"

* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
  scripts/gdb/vmalloc: disable on no-MMU
  scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
  .mailmap: add address mapping for Tomeu Vizoso
  mailmap: update email address for Claudiu Beznea
  tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
  .mailmap: map Benjamin Poirier's address
  scripts/gdb: add lx_current support for riscv
  ocfs2: fix a spelling typo in comment
  proc: test ProtectionKey in proc-empty-vm test
  proc: fix proc-empty-vm test with vsyscall
  fs/proc/base.c: remove unneeded semicolon
  do_io_accounting: use sig->stats_lock
  do_io_accounting: use __for_each_thread()
  ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
  ocfs2: fix a typo in a comment
  scripts/show_delta: add __main__ judgement before main code
  treewide: mark stuff as __ro_after_init
  fs: ocfs2: check status values
  proc: test /proc/${pid}/statm
  compiler.h: move __is_constexpr() to compiler.h
  ...
2023-11-02 20:53:31 -10:00
Linus Torvalds ecae0bd517 Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
 
 - Kemeng Shi has contributed some compation maintenance work in the
   series "Fixes and cleanups to compaction".
 
 - Joel Fernandes has a patchset ("Optimize mremap during mutual
   alignment within PMD") which fixes an obscure issue with mremap()'s
   pagetable handling during a subsequent exec(), based upon an
   implementation which Linus suggested.
 
 - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
   following patch series:
 
 	mm/damon: misc fixups for documents, comments and its tracepoint
 	mm/damon: add a tracepoint for damos apply target regions
 	mm/damon: provide pseudo-moving sum based access rate
 	mm/damon: implement DAMOS apply intervals
 	mm/damon/core-test: Fix memory leaks in core-test
 	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
 
 - In the series "Do not try to access unaccepted memory" Adrian Hunter
   provides some fixups for the recently-added "unaccepted memory' feature.
   To increase the feature's checking coverage.  "Plug a few gaps where
   RAM is exposed without checking if it is unaccepted memory".
 
 - In the series "cleanups for lockless slab shrink" Qi Zheng has done
   some maintenance work which is preparation for the lockless slab
   shrinking code.
 
 - Qi Zheng has redone the earlier (and reverted) attempt to make slab
   shrinking lockless in the series "use refcount+RCU method to implement
   lockless slab shrink".
 
 - David Hildenbrand contributes some maintenance work for the rmap code
   in the series "Anon rmap cleanups".
 
 - Kefeng Wang does more folio conversions and some maintenance work in
   the migration code.  Series "mm: migrate: more folio conversion and
   unification".
 
 - Matthew Wilcox has fixed an issue in the buffer_head code which was
   causing long stalls under some heavy memory/IO loads.  Some cleanups
   were added on the way.  Series "Add and use bdev_getblk()".
 
 - In the series "Use nth_page() in place of direct struct page
   manipulation" Zi Yan has fixed a potential issue with the direct
   manipulation of hugetlb page frames.
 
 - In the series "mm: hugetlb: Skip initialization of gigantic tail
   struct pages if freed by HVO" has improved our handling of gigantic
   pages in the hugetlb vmmemmep optimizaton code.  This provides
   significant boot time improvements when significant amounts of gigantic
   pages are in use.
 
 - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
   rationalization and folio conversions in the hugetlb code.
 
 - Yin Fengwei has improved mlock()'s handling of large folios in the
   series "support large folio for mlock"
 
 - In the series "Expose swapcache stat for memcg v1" Liu Shixin has
   added statistics for memcg v1 users which are available (and useful)
   under memcg v2.
 
 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
   prctl so that userspace may direct the kernel to not automatically
   propagate the denial to child processes.  The series is named "MDWE
   without inheritance".
 
 - Kefeng Wang has provided the series "mm: convert numa balancing
   functions to use a folio" which does what it says.
 
 - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
   makes is possible for a process to propagate KSM treatment across
   exec().
 
 - Huang Ying has enhanced memory tiering's calculation of memory
   distances.  This is used to permit the dax/kmem driver to use "high
   bandwidth memory" in addition to Optane Data Center Persistent Memory
   Modules (DCPMM).  The series is named "memory tiering: calculate
   abstract distance based on ACPI HMAT"
 
 - In the series "Smart scanning mode for KSM" Stefan Roesch has
   optimized KSM by teaching it to retain and use some historical
   information from previous scans.
 
 - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
   series "mm: memcg: fix tracking of pending stats updates values".
 
 - In the series "Implement IOCTL to get and optionally clear info about
   PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
   us to atomically read-then-clear page softdirty state.  This is mainly
   used by CRIU.
 
 - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
   - a bunch of relatively minor maintenance tweaks to this code.
 
 - Matthew Wilcox has increased the use of the VMA lock over file-backed
   page faults in the series "Handle more faults under the VMA lock".  Some
   rationalizations of the fault path became possible as a result.
 
 - In the series "mm/rmap: convert page_move_anon_rmap() to
   folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
   and folio conversions.
 
 - In the series "various improvements to the GUP interface" Lorenzo
   Stoakes has simplified and improved the GUP interface with an eye to
   providing groundwork for future improvements.
 
 - Andrey Konovalov has sent along the series "kasan: assorted fixes and
   improvements" which does those things.
 
 - Some page allocator maintenance work from Kemeng Shi in the series
   "Two minor cleanups to break_down_buddy_pages".
 
 - In thes series "New selftest for mm" Breno Leitao has developed
   another MM self test which tickles a race we had between madvise() and
   page faults.
 
 - In the series "Add folio_end_read" Matthew Wilcox provides cleanups
   and an optimization to the core pagecache code.
 
 - Nhat Pham has added memcg accounting for hugetlb memory in the series
   "hugetlb memcg accounting".
 
 - Cleanups and rationalizations to the pagemap code from Lorenzo
   Stoakes, in the series "Abstract vma_merge() and split_vma()".
 
 - Audra Mitchell has fixed issues in the procfs page_owner code's new
   timestamping feature which was causing some misbehaviours.  In the
   series "Fix page_owner's use of free timestamps".
 
 - Lorenzo Stoakes has fixed the handling of new mappings of sealed files
   in the series "permit write-sealed memfd read-only shared mappings".
 
 - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
   series "Batch hugetlb vmemmap modification operations".
 
 - Some buffer_head folio conversions and cleanups from Matthew Wilcox in
   the series "Finish the create_empty_buffers() transition".
 
 - As a page allocator performance optimization Huang Ying has added
   automatic tuning to the allocator's per-cpu-pages feature, in the series
   "mm: PCP high auto-tuning".
 
 - Roman Gushchin has contributed the patchset "mm: improve performance
   of accounted kernel memory allocations" which improves their performance
   by ~30% as measured by a micro-benchmark.
 
 - folio conversions from Kefeng Wang in the series "mm: convert page
   cpupid functions to folios".
 
 - Some kmemleak fixups in Liu Shixin's series "Some bugfix about
   kmemleak".
 
 - Qi Zheng has improved our handling of memoryless nodes by keeping them
   off the allocation fallback list.  This is done in the series "handle
   memoryless nodes more appropriately".
 
 - khugepaged conversions from Vishal Moola in the series "Some
   khugepaged folio conversions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
 jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
 FgeUPAD1oasg6CP+INZvCj34waNxwAc=
 =E+Y4
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Kemeng Shi has contributed some compation maintenance work in the
     series 'Fixes and cleanups to compaction'

   - Joel Fernandes has a patchset ('Optimize mremap during mutual
     alignment within PMD') which fixes an obscure issue with mremap()'s
     pagetable handling during a subsequent exec(), based upon an
     implementation which Linus suggested

   - More DAMON/DAMOS maintenance and feature work from SeongJae Park i
     the following patch series:

	mm/damon: misc fixups for documents, comments and its tracepoint
	mm/damon: add a tracepoint for damos apply target regions
	mm/damon: provide pseudo-moving sum based access rate
	mm/damon: implement DAMOS apply intervals
	mm/damon/core-test: Fix memory leaks in core-test
	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval

   - In the series 'Do not try to access unaccepted memory' Adrian
     Hunter provides some fixups for the recently-added 'unaccepted
     memory' feature. To increase the feature's checking coverage. 'Plug
     a few gaps where RAM is exposed without checking if it is
     unaccepted memory'

   - In the series 'cleanups for lockless slab shrink' Qi Zheng has done
     some maintenance work which is preparation for the lockless slab
     shrinking code

   - Qi Zheng has redone the earlier (and reverted) attempt to make slab
     shrinking lockless in the series 'use refcount+RCU method to
     implement lockless slab shrink'

   - David Hildenbrand contributes some maintenance work for the rmap
     code in the series 'Anon rmap cleanups'

   - Kefeng Wang does more folio conversions and some maintenance work
     in the migration code. Series 'mm: migrate: more folio conversion
     and unification'

   - Matthew Wilcox has fixed an issue in the buffer_head code which was
     causing long stalls under some heavy memory/IO loads. Some cleanups
     were added on the way. Series 'Add and use bdev_getblk()'

   - In the series 'Use nth_page() in place of direct struct page
     manipulation' Zi Yan has fixed a potential issue with the direct
     manipulation of hugetlb page frames

   - In the series 'mm: hugetlb: Skip initialization of gigantic tail
     struct pages if freed by HVO' has improved our handling of gigantic
     pages in the hugetlb vmmemmep optimizaton code. This provides
     significant boot time improvements when significant amounts of
     gigantic pages are in use

   - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
     rationalization and folio conversions in the hugetlb code

   - Yin Fengwei has improved mlock()'s handling of large folios in the
     series 'support large folio for mlock'

   - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
     added statistics for memcg v1 users which are available (and
     useful) under memcg v2

   - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
     prctl so that userspace may direct the kernel to not automatically
     propagate the denial to child processes. The series is named 'MDWE
     without inheritance'

   - Kefeng Wang has provided the series 'mm: convert numa balancing
     functions to use a folio' which does what it says

   - In the series 'mm/ksm: add fork-exec support for prctl' Stefan
     Roesch makes is possible for a process to propagate KSM treatment
     across exec()

   - Huang Ying has enhanced memory tiering's calculation of memory
     distances. This is used to permit the dax/kmem driver to use 'high
     bandwidth memory' in addition to Optane Data Center Persistent
     Memory Modules (DCPMM). The series is named 'memory tiering:
     calculate abstract distance based on ACPI HMAT'

   - In the series 'Smart scanning mode for KSM' Stefan Roesch has
     optimized KSM by teaching it to retain and use some historical
     information from previous scans

   - Yosry Ahmed has fixed some inconsistencies in memcg statistics in
     the series 'mm: memcg: fix tracking of pending stats updates
     values'

   - In the series 'Implement IOCTL to get and optionally clear info
     about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
     which permits us to atomically read-then-clear page softdirty
     state. This is mainly used by CRIU

   - Hugh Dickins contributed the series 'shmem,tmpfs: general
     maintenance', a bunch of relatively minor maintenance tweaks to
     this code

   - Matthew Wilcox has increased the use of the VMA lock over
     file-backed page faults in the series 'Handle more faults under the
     VMA lock'. Some rationalizations of the fault path became possible
     as a result

   - In the series 'mm/rmap: convert page_move_anon_rmap() to
     folio_move_anon_rmap()' David Hildenbrand has implemented some
     cleanups and folio conversions

   - In the series 'various improvements to the GUP interface' Lorenzo
     Stoakes has simplified and improved the GUP interface with an eye
     to providing groundwork for future improvements

   - Andrey Konovalov has sent along the series 'kasan: assorted fixes
     and improvements' which does those things

   - Some page allocator maintenance work from Kemeng Shi in the series
     'Two minor cleanups to break_down_buddy_pages'

   - In thes series 'New selftest for mm' Breno Leitao has developed
     another MM self test which tickles a race we had between madvise()
     and page faults

   - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
     and an optimization to the core pagecache code

   - Nhat Pham has added memcg accounting for hugetlb memory in the
     series 'hugetlb memcg accounting'

   - Cleanups and rationalizations to the pagemap code from Lorenzo
     Stoakes, in the series 'Abstract vma_merge() and split_vma()'

   - Audra Mitchell has fixed issues in the procfs page_owner code's new
     timestamping feature which was causing some misbehaviours. In the
     series 'Fix page_owner's use of free timestamps'

   - Lorenzo Stoakes has fixed the handling of new mappings of sealed
     files in the series 'permit write-sealed memfd read-only shared
     mappings'

   - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
     series 'Batch hugetlb vmemmap modification operations'

   - Some buffer_head folio conversions and cleanups from Matthew Wilcox
     in the series 'Finish the create_empty_buffers() transition'

   - As a page allocator performance optimization Huang Ying has added
     automatic tuning to the allocator's per-cpu-pages feature, in the
     series 'mm: PCP high auto-tuning'

   - Roman Gushchin has contributed the patchset 'mm: improve
     performance of accounted kernel memory allocations' which improves
     their performance by ~30% as measured by a micro-benchmark

   - folio conversions from Kefeng Wang in the series 'mm: convert page
     cpupid functions to folios'

   - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
     kmemleak'

   - Qi Zheng has improved our handling of memoryless nodes by keeping
     them off the allocation fallback list. This is done in the series
     'handle memoryless nodes more appropriately'

   - khugepaged conversions from Vishal Moola in the series 'Some
     khugepaged folio conversions'"

[ bcachefs conflicts with the dynamically allocated shrinkers have been
  resolved as per Stephen Rothwell in

     https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/

  with help from Qi Zheng.

  The clone3 test filtering conflict was half-arsed by yours truly ]

* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
  mm/damon/sysfs: update monitoring target regions for online input commit
  mm/damon/sysfs: remove requested targets when online-commit inputs
  selftests: add a sanity check for zswap
  Documentation: maple_tree: fix word spelling error
  mm/vmalloc: fix the unchecked dereference warning in vread_iter()
  zswap: export compression failure stats
  Documentation: ubsan: drop "the" from article title
  mempolicy: migration attempt to match interleave nodes
  mempolicy: mmap_lock is not needed while migrating folios
  mempolicy: alloc_pages_mpol() for NUMA policy without vma
  mm: add page_rmappable_folio() wrapper
  mempolicy: remove confusing MPOL_MF_LAZY dead code
  mempolicy: mpol_shared_policy_init() without pseudo-vma
  mempolicy trivia: use pgoff_t in shared mempolicy tree
  mempolicy trivia: slightly more consistent naming
  mempolicy trivia: delete those ancient pr_debug()s
  mempolicy: fix migrate_pages(2) syscall return nr_failed
  kernfs: drop shared NUMA mempolicy hooks
  hugetlbfs: drop shared NUMA mempolicy pretence
  mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
  ...
2023-11-02 19:38:47 -10:00
Linus Torvalds 6803bd7956 ARM:
* Generalized infrastructure for 'writable' ID registers, effectively
   allowing userspace to opt-out of certain vCPU features for its guest
 
 * Optimization for vSGI injection, opportunistically compressing MPIDR
   to vCPU mapping into a table
 
 * Improvements to KVM's PMU emulation, allowing userspace to select
   the number of PMCs available to a VM
 
 * Guest support for memory operation instructions (FEAT_MOPS)
 
 * Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
   bugs and getting rid of useless code
 
 * Changes to the way the SMCCC filter is constructed, avoiding wasted
   memory allocations when not in use
 
 * Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
   the overhead of errata mitigations
 
 * Miscellaneous kernel and selftest fixes
 
 LoongArch:
 
 * New architecture.  The hardware uses the same model as x86, s390
   and RISC-V, where guest/host mode is orthogonal to supervisor/user
   mode.  The virtualization extensions are very similar to MIPS,
   therefore the code also has some similarities but it's been cleaned
   up to avoid some of the historical bogosities that are found in
   arch/mips.  The kernel emulates MMU, timer and CSR accesses, while
   interrupt controllers are only emulated in userspace, at least for
   now.
 
 RISC-V:
 
 * Support for the Smstateen and Zicond extensions
 
 * Support for virtualizing senvcfg
 
 * Support for virtualized SBI debug console (DBCN)
 
 S390:
 
 * Nested page table management can be monitored through tracepoints
   and statistics
 
 x86:
 
 * Fix incorrect handling of VMX posted interrupt descriptor in KVM_SET_LAPIC,
   which could result in a dropped timer IRQ
 
 * Avoid WARN on systems with Intel IPI virtualization
 
 * Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs without
   forcing more common use cases to eat the extra memory overhead.
 
 * Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
   SBPB, aka Selective Branch Predictor Barrier).
 
 * Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
   creating the original vCPU would cause KVM to try to synchronize the vCPU's
   TSC and thus clobber the correct TSC being set by userspace.
 
 * Compute guest wall clock using a single TSC read to avoid generating an
   inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
 
 * "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
   about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
   Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to appease Windows Server
   2022.
 
 * Don't apply side effects to Hyper-V's synthetic timer on writes from
   userspace to fix an issue where the auto-enable behavior can trigger
   spurious interrupts, i.e. do auto-enabling only for guest writes.
 
 * Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
   without PML enabled.
 
 * Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
 
 * Harden the fast page fault path to guard against encountering an invalid
   root when walking SPTEs.
 
 * Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
 
 * Use the fast path directly from the timer callback when delivering Xen
   timer events, instead of waiting for the next iteration of the run loop.
   This was not done so far because previously proposed code had races,
   but now care is taken to stop the hrtimer at critical points such as
   restarting the timer or saving the timer information for userspace.
 
 * Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future flag.
 
 * Optimize injection of PMU interrupts that are simultaneous with NMIs.
 
 * Usual handful of fixes for typos and other warts.
 
 x86 - MTRR/PAT fixes and optimizations:
 
 * Clean up code that deals with honoring guest MTRRs when the VM has
   non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
 
 * Zap EPT entries when non-coherent DMA assignment stops/start to prevent
   using stale entries with the wrong memtype.
 
 * Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y.
   This was done as a workaround for virtual machine BIOSes that did not
   bother to clear CR0.CD (because ancient KVM/QEMU did not bother to
   set it, in turn), and there's zero reason to extend the quirk to
   also ignore guest PAT.
 
 x86 - SEV fixes:
 
 * Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts SHUTDOWN while
   running an SEV-ES guest.
 
 * Clean up the recognition of emulation failures on SEV guests, when KVM would
   like to "skip" the instruction but it had already been partially emulated.
   This makes it possible to drop a hack that second guessed the (insufficient)
   information provided by the emulator, and just do the right thing.
 
 Documentation:
 
 * Various updates and fixes, mostly for x86
 
 * MTRR and PAT fixes and optimizations:
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmVBZc0UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1LQf+NgsmZ1lkGQlKdSdijoQ856w+k0or
 l2SV1wUwiEdFPSGK+RTUlHV5Y1ni1dn/CqCVIJZKEI3ZtZ1m9/4HKIRXvbMwFHIH
 hx+E4Lnf8YUjsGjKTLd531UKcpphztZavQ6pXLEwazkSkDEra+JIKtooI8uU+9/p
 bd/eF1V+13a8CHQf1iNztFJVxqBJbVlnPx4cZDRQQvewskIDGnVDtwbrwCUKGtzD
 eNSzhY7si6O2kdQNkuA8xPhg29dYX9XLaCK2K1l8xOUm8WipLdtF86GAKJ5BVuOL
 6ek/2QCYjZ7a+coAZNfgSEUi8JmFHEqCo7cnKmWzPJp+2zyXsdudqAhT1g==
 =UIxm
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Generalized infrastructure for 'writable' ID registers, effectively
     allowing userspace to opt-out of certain vCPU features for its
     guest

   - Optimization for vSGI injection, opportunistically compressing
     MPIDR to vCPU mapping into a table

   - Improvements to KVM's PMU emulation, allowing userspace to select
     the number of PMCs available to a VM

   - Guest support for memory operation instructions (FEAT_MOPS)

   - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
     bugs and getting rid of useless code

   - Changes to the way the SMCCC filter is constructed, avoiding wasted
     memory allocations when not in use

   - Load the stage-2 MMU context at vcpu_load() for VHE systems,
     reducing the overhead of errata mitigations

   - Miscellaneous kernel and selftest fixes

  LoongArch:

   - New architecture for kvm.

     The hardware uses the same model as x86, s390 and RISC-V, where
     guest/host mode is orthogonal to supervisor/user mode. The
     virtualization extensions are very similar to MIPS, therefore the
     code also has some similarities but it's been cleaned up to avoid
     some of the historical bogosities that are found in arch/mips. The
     kernel emulates MMU, timer and CSR accesses, while interrupt
     controllers are only emulated in userspace, at least for now.

  RISC-V:

   - Support for the Smstateen and Zicond extensions

   - Support for virtualizing senvcfg

   - Support for virtualized SBI debug console (DBCN)

  S390:

   - Nested page table management can be monitored through tracepoints
     and statistics

  x86:

   - Fix incorrect handling of VMX posted interrupt descriptor in
     KVM_SET_LAPIC, which could result in a dropped timer IRQ

   - Avoid WARN on systems with Intel IPI virtualization

   - Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs
     without forcing more common use cases to eat the extra memory
     overhead.

   - Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
     SBPB, aka Selective Branch Predictor Barrier).

   - Fix a bug where restoring a vCPU snapshot that was taken within 1
     second of creating the original vCPU would cause KVM to try to
     synchronize the vCPU's TSC and thus clobber the correct TSC being
     set by userspace.

   - Compute guest wall clock using a single TSC read to avoid
     generating an inaccurate time, e.g. if the vCPU is preempted
     between multiple TSC reads.

   - "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which
     complain about a "Firmware Bug" if the bit isn't set for select
     F/M/S combos. Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to
     appease Windows Server 2022.

   - Don't apply side effects to Hyper-V's synthetic timer on writes
     from userspace to fix an issue where the auto-enable behavior can
     trigger spurious interrupts, i.e. do auto-enabling only for guest
     writes.

   - Remove an unnecessary kick of all vCPUs when synchronizing the
     dirty log without PML enabled.

   - Advertise "support" for non-serializing FS/GS base MSR writes as
     appropriate.

   - Harden the fast page fault path to guard against encountering an
     invalid root when walking SPTEs.

   - Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.

   - Use the fast path directly from the timer callback when delivering
     Xen timer events, instead of waiting for the next iteration of the
     run loop. This was not done so far because previously proposed code
     had races, but now care is taken to stop the hrtimer at critical
     points such as restarting the timer or saving the timer information
     for userspace.

   - Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future
     flag.

   - Optimize injection of PMU interrupts that are simultaneous with
     NMIs.

   - Usual handful of fixes for typos and other warts.

  x86 - MTRR/PAT fixes and optimizations:

   - Clean up code that deals with honoring guest MTRRs when the VM has
     non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.

   - Zap EPT entries when non-coherent DMA assignment stops/start to
     prevent using stale entries with the wrong memtype.

   - Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y

     This was done as a workaround for virtual machine BIOSes that did
     not bother to clear CR0.CD (because ancient KVM/QEMU did not bother
     to set it, in turn), and there's zero reason to extend the quirk to
     also ignore guest PAT.

  x86 - SEV fixes:

   - Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts
     SHUTDOWN while running an SEV-ES guest.

   - Clean up the recognition of emulation failures on SEV guests, when
     KVM would like to "skip" the instruction but it had already been
     partially emulated. This makes it possible to drop a hack that
     second guessed the (insufficient) information provided by the
     emulator, and just do the right thing.

  Documentation:

   - Various updates and fixes, mostly for x86

   - MTRR and PAT fixes and optimizations"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (164 commits)
  KVM: selftests: Avoid using forced target for generating arm64 headers
  tools headers arm64: Fix references to top srcdir in Makefile
  KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
  KVM: arm64: selftest: Perform ISB before reading PAR_EL1
  KVM: arm64: selftest: Add the missing .guest_prepare()
  KVM: arm64: Always invalidate TLB for stage-2 permission faults
  KVM: x86: Service NMI requests after PMI requests in VM-Enter path
  KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
  KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
  KVM: arm64: Refine _EL2 system register list that require trap reinjection
  arm64: Add missing _EL2 encodings
  arm64: Add missing _EL12 encodings
  KVM: selftests: aarch64: vPMU test for validating user accesses
  KVM: selftests: aarch64: vPMU register test for unimplemented counters
  KVM: selftests: aarch64: vPMU register test for implemented counters
  KVM: selftests: aarch64: Introduce vpmu_counter_access test
  tools: Import arm_pmuv3.h
  KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
  KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
  KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
  ...
2023-11-02 15:45:15 -10:00
Linus Torvalds 90a300dc05 libnvdimm updates for v6.7
- updates to deprecated and changed interfaces
 - bug/kdoc fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQSgX9xt+GwmrJEQ+euebuN7TNx1MQUCZULdfxQcaXJhLndlaW55
 QGludGVsLmNvbQAKCRCebuN7TNx1MaDDAP9sCcjx4o49EXWST/x2z/Si0EUMEq+y
 mtsVBRqYO32jXAD9HuAA4K/ECNjbT3ENQO8WWYQqZOcbYgBRFQKi8mmaKAY=
 =0WCP
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Ira Weiny:

 - updates to deprecated and changed interfaces

 - bug/kdoc fixes

* tag 'libnvdimm-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm: remove kernel-doc warnings:
  testing: nvdimm: make struct class structures constant
  libnvdimm: Annotate struct nd_region with __counted_by
  nd_btt: Make BTT lanes preemptible
  libnvdimm/of_pmem: Use devm_kstrdup instead of kstrdup and check its return value
  dax: refactor deprecated strncpy
2023-11-02 15:06:04 -10:00
Linus Torvalds edd8e84ae9 sound updates for 6.7
Most of changes at this time are for ASoC, spread over ASoC core and
 drivers due to the API prefix standardization.  Other than that, there
 have little change wrt API, rather lots of driver-specific updates and
 fixes.  Some highlight below:
 
 ASoC:
 - Standardization of API prefix
 - GPIO API usage improvements
 - Support for HDA patches
 - Lots of work on SOF, including crash dump support
 - Fixes for noise when stopping some Sounwire CODECs
 - Support for AMD platforms with es83xx, AMD ACP 6.3 and 7.0, Awinc
   AT87390 and AW88399, many Intel platforms, many Mediatek platforms,
   Qualcomm SM6115 and SC7180 platforms, Richtek RTQ9128 and Texas
   Instruments TAS575x
 
 HD-audio and USB-audio:
 - Deferred probe support of audio component binding
 - More fixes and enhancements for Cirrus subcodecs
 - USB Scarlett2 mixer and McIntosh DSD quirk
 
 Others:
 - More enhancement of snd-aloop driver
 - Update MAINTAINERS entry for linux-sound mailing list
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmVDqXcOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE+PbhAAsmiOxjXpCNVGbhy9tR+UB9p6gCIzFv5RQQIc
 0DzqnqssbJt86jtFMrv0VC6BNfnwcyBVzKxWinKz861xXiM9SRU5vTirUES3Cil/
 sZaQYC9ppTrZm3q2EGF4eGC349oZGzuLzk1EkEVPfNHiELwcO4R1NZbtgWKIc8LE
 bNz8RT3FyxHAv+5juW5suGs0Bq4Mwa9z+eM8xEPwxPL3gpIAb5EapIesHEgaa893
 w3jYwKRHCDq0ADtXIY9xI3ypqenfbeTge4nuBnw354sPRVdZInrRfCNkTJojb+tp
 5Pc2gpFmTUGy6T2wG6QP23VgeV14BJHrQD3Z1Wh2aQ8V+ARa92XvY1Xeg4vJ+NE0
 yhvlh028GjKrMIvhl7mmepV9mia2zA1TluqlzKEla3B5lIj0E1zvMA+vCzNAz3Ro
 lV2Q0dpJ3ENQ9ahGF/d37u3glrqXxISlG9uTGdY0UcF7U9Iyxb0jEnhQYl05b+zR
 Oaw/HApuvIUj4cdJWEYf0AnTTqeE8KSZ3wUlPPyuQAdAusCaQFMciWeO0EeLqEId
 KR/rbnSgVKS3zHLdNw5A67Sv36E9OG/E+EiJ6Tet15a69yq2Oyv4pwMMwbqsvBG5
 8kNbtBFGxOHnCrZgM6VV2/g3BP/IwyIFd5kkS2q13FXBTYRpY01dQDjHlmspasNK
 hYH69AA=
 =2/dI
 -----END PGP SIGNATURE-----

Merge tag 'sound-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "Most of changes at this time are for ASoC, spread over ASoC core and
  drivers due to the API prefix standardization.

  Other than that, there have little change wrt API, rather lots of
  driver-specific updates and fixes.

  Some highlight below:

  ASoC:
   - Standardization of API prefix
   - GPIO API usage improvements
   - Support for HDA patches
   - Lots of work on SOF, including crash dump support
   - Fixes for noise when stopping some Sounwire CODECs
   - Support for AMD platforms with es83xx, AMD ACP 6.3 and 7.0, Awinc
     AT87390 and AW88399, many Intel platforms, many Mediatek platforms,
     Qualcomm SM6115 and SC7180 platforms, Richtek RTQ9128 and Texas
     Instruments TAS575x

  HD-audio and USB-audio:
   - Deferred probe support of audio component binding
   - More fixes and enhancements for Cirrus subcodecs
   - USB Scarlett2 mixer and McIntosh DSD quirk

  Others:
   - More enhancement of snd-aloop driver
   - Update MAINTAINERS entry for linux-sound mailing list"

* tag 'sound-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (485 commits)
  ALSA: hda: cs35l41: Fix missing error code in cs35l41_smart_amp()
  ALSA: hda: cs35l41: mark cs35l41_verify_id() static
  ASoC: codecs: wsa883x: make use of new mute_unmute_on_trigger flag
  ASoC: soc-dai: add flag to mute and unmute stream during trigger
  ASoC: ams-delta.c: use component after check
  ASoC: amd: acp: select SND_SOC_AMD_ACP_LEGACY_COMMON for ACP63
  ASoC: codecs: aw88399: fix typo in Kconfig select
  ASoC: amd: acp: add ACPI dependency
  ASoC: Intel: avs: Add rt5514 machine board
  ASoC: Intel: avs: Add rt5514 machine board
  ALSA: scarlett2: Add missing check with firmware version control
  ALSA: virtio: use ack callback
  ALSA: scarlett2: Remap Level Meter values
  ALSA: scarlett2: Allow passing any output to line_out_remap()
  ALSA: scarlett2: Add support for reading firmware version
  ALSA: scarlett2: Rename Gen 3 config sets
  ALSA: scarlett2: Rename scarlett_gen2 to scarlett2
  ASoC: cs35l41: Detect CSPL errors when sending CSPL commands
  ALSA: hda: cs35l41: Check CSPL state after loading firmware
  ALSA: hda: cs35l41: Do not unload firmware before reset in system suspend
  ...
2023-11-02 14:34:14 -10:00
Linus Torvalds 4ea4ed22b5 for-linus-2023110101
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIVAwUAZUGPxqZi849r7WBJAQIMkhAAxZqCeGjJ2QsO/C41DodzxbVVyvfkPWwC
 zkNL0KORtTmaGI4UOhakh0447lyUwNN1GT17GXIC7BjD8BCm0vm/474FOI9hXsEb
 wHP4RsVUTb2zK5r6zeWvzurLmBmsryCKb95Co4o7yPXKY4DHSjyYgvAJIOEOl9ov
 rmk3SZ8SweFp6SlkxANjZ3jU1CF13EmUOL9yPviXfXSa0qYAeIVF3kMMJoNMsi61
 wl32r4NZJg6JpkUxyDUlWAayB0R9L8ean+t+UbH1sOzWhISOizn0Ceq6Nh+9X0AO
 DIulQv9z8GDgn9JkkwFExd5oU9Tknd6eSCeiAN4RlB4iqOhET13pgeUxcwayrQ+4
 WwBncKfJf/1gFu7nQCloQ0qEm+Ehq6CPVmeNa+xEBNPlD6nsRXldNnVqTG4I0rtF
 thB2Ep2w+7luS49wqN93R6sgmWBunSwGpRN7eo5hpLoJ1ybhFT5pn8PX7Brflkyn
 Ru934APRChMf1WKdTgZu9sGkLpgRBMDe1836kkTiLAO6aJu45ln5H7L5nBzQ4aUB
 cjj5kv7j2aE4TPWXOvvd0KwLPU66TJ6j/n6SrEqWUQicsJNp09sUpMAzfjhS7gZG
 ipJBEoiC6dBHRikaEy3+lLBrfEnq0ptfLUDBiOClzilsrgTGwCqgcNaDpKuqCCfN
 TEN9LoDIIs8=
 =Xcc6
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-2023110101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID updates from Jiri Kosina:

 - fixes for crashes detected by CONFIG_KUNIT_ALL_TESTS in hid-uclogic
   driver (Jinjie Ruan)

 - HID selftests fixes and improvements (Benjamin Tissoires)

 - probe error handling path fixes in hid-nvidia-shield driver
   (Christophe JAILLET)

 - cleanup of LED handling in hid-nintendo (Martino Fontana)

 - big cleanup of logitech-hidpp probe code (Hans de Goede)

 - Suspend/Resume fix for USB Thinkpad Compact Keyboard (Jamie Lentin)

 - firmware detection improvement for Lenovo cptkbd (Mikhail
   Khvainitski)

 - IRQ shutdown and workqueue initialization fixes for hid-cp2112 driver
   (Danny Kaehn)

 - #ifdef CONFIG_PM removal from HID code (Thomas Weißschuh)

 - other assorted device-ID additions and quirks

* tag 'for-linus-2023110101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (31 commits)
  HID: Add quirk for Dell Pro Wireless Keyboard and Mouse KM5221W
  HID: logitech-hidpp: Stop IO before calling hid_connect()
  HID: logitech-hidpp: Drop HIDPP_QUIRK_UNIFYING
  HID: logitech-hidpp: Drop delayed_work_cb()
  HID: logitech-hidpp: Fix connect event race
  HID: logitech-hidpp: Remove unused connected param from *_connect()
  HID: logitech-hidpp: Remove connected check for non-unifying devices
  HID: logitech-hidpp: Add hidpp_non_unifying_init() helper
  HID: logitech-hidpp: Move hidpp_overwrite_name() to before connect check
  HID: logitech-hidpp: Move g920_get_config() to just before hidpp_ff_init()
  HID: logitech-hidpp: Remove wtp_get_config() call from probe()
  HID: logitech-hidpp: Move get_wireless_feature_index() check to hidpp_connect_event()
  HID: logitech-hidpp: Revert "Don't restart communication if not necessary"
  HID: logitech-hidpp: Don't restart IO, instead defer hid_connect() only
  HID: rmi: remove #ifdef CONFIG_PM
  HID: multitouch: remove #ifdef CONFIG_PM
  HID: usbhid: remove #ifdef CONFIG_PM
  HID: core: remove #ifdef CONFIG_PM from hid_driver
  hid: lenovo: Resend all settings on reset_resume for compact keyboards
  HID: uclogic: Fix a work->entry not empty bug in __queue_work()
  ...
2023-11-02 14:29:10 -10:00
Björn Töpel d84b139f53 selftests/bpf: Fix broken build where char is unsigned
There are architectures where char is not signed. If so, the following
error is triggered:

  | xdp_hw_metadata.c:435:42: error: result of comparison of constant -1 \
  |   with expression of type 'char' is always true \
  |   [-Werror,-Wtautological-constant-out-of-range-compare]
  |   435 |         while ((opt = getopt(argc, argv, "mh")) != -1) {
  |       |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~
  | 1 error generated.

Correct by changing the char to int.

Fixes: bb6a88885f ("selftests/bpf: Add options and frags to xdp_hw_metadata")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Larysa Zaremba <larysa.zaremba@intel.com>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20231102103537.247336-1-bjorn@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02 07:57:21 -07:00
Shung-Hsi Yu 3c41971550 selftests/bpf: precision tracking test for BPF_NEG and BPF_END
As seen from previous commit that fix backtracking for BPF_ALU | BPF_TO_BE
| BPF_END, both BPF_NEG and BPF_END require special handling. Add tests
written with inline assembly to check that the verifier does not incorrecly
use the src_reg field of BPF_NEG and BPF_END (including bswap added in v4).

Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231102053913.12004-4-shung-hsi.yu@suse.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:54:28 -07:00
Chuyi Zhou d8234d47c4 selftests/bpf: Add test for using css_task iter in sleepable progs
This Patch add a test to prove css_task iter can be used in normal
sleepable progs.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231031050438.93297-4-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:49:20 -07:00
Chuyi Zhou f49843afde selftests/bpf: Add tests for css_task iter combining with cgroup iter
This patch adds a test which demonstrates how css_task iter can be combined
with cgroup iter and it won't cause deadlock, though cgroup iter is not
sleepable.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231031050438.93297-3-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:49:20 -07:00
Chuyi Zhou 3091b66749 bpf: Relax allowlist for css_task iter
The newly added open-coded css_task iter would try to hold the global
css_set_lock in bpf_iter_css_task_new, so the bpf side has to be careful in
where it allows to use this iter. The mainly concern is dead locking on
css_set_lock. check_css_task_iter_allowlist() in verifier enforced css_task
can only be used in bpf_lsm hooks and sleepable bpf_iter.

This patch relax the allowlist for css_task iter. Any lsm and any iter
(even non-sleepable) and any sleepable are safe since they would not hold
the css_set_lock before entering BPF progs context.

This patch also fixes the misused BPF_TRACE_ITER in
check_css_task_iter_allowlist which compared bpf_prog_type with
bpf_attach_type.

Fixes: 9c66dc94b6 ("bpf: Introduce css_task open-coded iterator kfuncs")
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231031050438.93297-2-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:49:20 -07:00
Andrii Nakryiko 9af3775962 selftests/bpf: fix test_maps' use of bpf_map_create_opts
Use LIBBPF_OPTS() macro to properly initialize bpf_map_create_opts in
test_maps' tests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231029011509.2479232-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:42:38 -07:00
Dave Marchevsky 15fb6f2b6c bpf: Add __bpf_hook_{start,end} macros
Not all uses of __diag_ignore_all(...) in BPF-related code in order to
suppress warnings are wrapping kfunc definitions. Some "hook point"
definitions - small functions meant to be used as attach points for
fentry and similar BPF progs - need to suppress -Wmissing-declarations.

We could use __bpf_kfunc_{start,end}_defs added in the previous patch in
such cases, but this might be confusing to someone unfamiliar with BPF
internals. Instead, this patch adds __bpf_hook_{start,end} macros,
currently having the same effect as __bpf_kfunc_{start,end}_defs, then
uses them to suppress warnings for two hook points in the kernel itself
and some bpf_testmod hook points as well.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231031215625.2343848-2-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:33:53 -07:00
Manu Bretelle cd60f410dd selftests/bpf: fix test_bpffs
Currently this tests tries to umount /sys/kernel/debug (TDIR) but the
system it is running on may have mounts below.

For example, danobi/vmtest [0] VMs have
    mount -t tracefs tracefs /sys/kernel/debug/tracing
as part of their init.

This change instead creates a "random" directory under /tmp and uses this
as TDIR.
If the directory already exists, ignore the error and keep moving on.

Test:

Originally:

    $ vmtest -k $KERNEL_REPO/arch/x86_64/boot/bzImage "./test_progs -vv -a test_bpffs"
    => bzImage
    ===> Booting
    ===> Setting up VM
    ===> Running command
    [    2.138818] bpf_testmod: loading out-of-tree module taints kernel.
    [    2.140913] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
    bpf_testmod.ko is already unloaded.
    Loading bpf_testmod.ko...
    Successfully loaded bpf_testmod.ko.
    test_test_bpffs:PASS:clone 0 nsec
    fn:PASS:unshare 0 nsec
    fn:PASS:mount / 0 nsec
    fn:FAIL:umount /sys/kernel/debug unexpected error: -1 (errno 16)
    bpf_testmod.ko is already unloaded.
    Loading bpf_testmod.ko...
    Successfully loaded bpf_testmod.ko.
    test_test_bpffs:PASS:clone 0 nsec
    test_test_bpffs:PASS:waitpid 0 nsec
    test_test_bpffs:FAIL:bpffs test  failed 255#282     test_bpffs:FAIL
    Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
    Successfully unloaded bpf_testmod.ko.
    Command failed with exit code: 1

After this change:

    $ vmtest -k $(make image_name) 'cd tools/testing/selftests/bpf && ./test_progs -vv -a test_bpffs'
    => bzImage
    ===> Booting
    ===> Setting up VM
    ===> Running command
    [    2.295696] bpf_testmod: loading out-of-tree module taints kernel.
    [    2.296468] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
    bpf_testmod.ko is already unloaded.
    Loading bpf_testmod.ko...
    Successfully loaded bpf_testmod.ko.
    test_test_bpffs:PASS:clone 0 nsec
    fn:PASS:unshare 0 nsec
    fn:PASS:mount / 0 nsec
    fn:PASS:mount tmpfs 0 nsec
    fn:PASS:mkdir /tmp/test_bpffs_testdir/fs1 0 nsec
    fn:PASS:mkdir /tmp/test_bpffs_testdir/fs2 0 nsec
    fn:PASS:mount bpffs /tmp/test_bpffs_testdir/fs1 0 nsec
    fn:PASS:mount bpffs /tmp/test_bpffs_testdir/fs2 0 nsec
    fn:PASS:reading /tmp/test_bpffs_testdir/fs1/maps.debug 0 nsec
    fn:PASS:reading /tmp/test_bpffs_testdir/fs2/progs.debug 0 nsec
    fn:PASS:creating /tmp/test_bpffs_testdir/fs1/a 0 nsec
    fn:PASS:creating /tmp/test_bpffs_testdir/fs1/a/1 0 nsec
    fn:PASS:creating /tmp/test_bpffs_testdir/fs1/b 0 nsec
    fn:PASS:create_map(ARRAY) 0 nsec
    fn:PASS:pin map 0 nsec
    fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/a) 0 nsec
    fn:PASS:renameat2(/fs1/a, /fs1/b, RENAME_EXCHANGE) 0 nsec
    fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/b) 0 nsec
    fn:PASS:b should have a's inode 0 nsec
    fn:PASS:access(/tmp/test_bpffs_testdir/fs1/b/1) 0 nsec
    fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/map) 0 nsec
    fn:PASS:renameat2(/fs1/c, /fs1/b, RENAME_EXCHANGE) 0 nsec
    fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/b) 0 nsec
    fn:PASS:b should have c's inode 0 nsec
    fn:PASS:access(/tmp/test_bpffs_testdir/fs1/c/1) 0 nsec
    fn:PASS:renameat2(RENAME_NOREPLACE) 0 nsec
    fn:PASS:access(/tmp/test_bpffs_testdir/fs1/b) 0 nsec
    bpf_testmod.ko is already unloaded.
    Loading bpf_testmod.ko...
    Successfully loaded bpf_testmod.ko.
    test_test_bpffs:PASS:clone 0 nsec
    test_test_bpffs:PASS:waitpid 0 nsec
    test_test_bpffs:PASS:bpffs test  0 nsec
    #282     test_bpffs:OK
    Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
    Successfully unloaded bpf_testmod.ko.

[0] https://github.com/danobi/vmtest

This is a follow-up of https://lore.kernel.org/bpf/20231024201852.1512720-1-chantr4@gmail.com/T/

v1 -> v2:
  - use a TDIR name that is related to test
  - use C-style comments

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20231031223606.2927976-1-chantr4@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:31:41 -07:00
Hao Sun 85eb035e6c selftests/bpf: Add test for immediate spilled to stack
Add a test to check if the verifier correctly reason about the sign
of an immediate spilled to stack by BPF_ST instruction.

Signed-off-by: Hao Sun <sunhao.th@gmail.com>
Link: https://lore.kernel.org/r/20231101-fix-check-stack-write-v3-2-f05c2b1473d5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-01 22:30:27 -07:00
Linus Torvalds 7dc0e9c7dd linux_kselftest-next-6.7-rc1
This kselftest update for Linux 6.7-rc1 consists of:
 
 -- kbuild kselftest-merge target fixes
 -- fixes to several tests
 -- resctrl test fixes and enhancements
 -- ksft_perror() helper and reporting improvements
 -- printf attribute to kselftest prints to improve reporting
 -- documentation and clang build warning fixes
 
 Bulk of the patches are for resctrl fixes and enhancements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmVCoHMACgkQCwJExA0N
 QxwzrA//ehiiLdV2lyghzPpDTVY8jKlB1xIpg3s0r0M3m/j6nAdnOgOe2gkapT7T
 gFGL0r7xL9crqFdymwDANLSvNWOeghqB1oIok9Ruw5Rl3FcLnkh920bE6tPsddJg
 9+/KqtZvL0Sr43l9OSgX2Uzqyw60wRQwpO0431hmgnKjblk8Rh4GZ7fUCLLNf4Ia
 yOq1s2/cdmEwRc96lDaBWZaOTusejwh/xy8tgAjozHipLsmsexbyyHVWJWkVhMOD
 ZklCtrq4lckRz+Vky6akvjoL6Mjl//7pg323e2fUcDCQxQvqwnCo2VqqyOVBnN2A
 6XHQ6yXwh0xzCKRFgAiFhWlsKOz3wEIDrdp4dmhDkg4lw4gGJcwNke1UyX5zXYKM
 1a6R1vbQS9qQOsWf34AYKZBHruFNtUt0FJYgI43SuH+fGc0D5cU91Rz+s9QIPCwj
 8tcr5RWin8BOziDz05lxSKWRHD+3oc5qmsmGYBJhilwtvY2wNbRZNDZjiO28kiIy
 3kUWXeCtHmZE1KHK1H5v6bMC8SqUU7ukvV5WebqGpxzJ2eFPbeXcek9/AWSWOFni
 7thUg6MG3e4c/zRk8JYbmqXS/GeTkdmc3+VMXApLhTB8uSOWsnVMfJS9Zc2A1tGg
 n6NRBJFQO8t9Wm1l9XvlnC9HA/8lO/3uih+SzKn/u8KvoN96HPM=
 =JZb+
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-next-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates from Shuah Khan:

 - kbuild kselftest-merge target fixes

 - fixes to several tests

 - resctrl test fixes and enhancements

 - ksft_perror() helper and reporting improvements

 - printf attribute to kselftest prints to improve reporting

 - documentation and clang build warning fixes

The bulk of the patches are for resctrl fixes and enhancements.

* tag 'linux_kselftest-next-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (51 commits)
  selftests/resctrl: Fix MBM test failure when MBA unavailable
  selftests/clone3: Report descriptive test names
  selftests:modify the incorrect print format
  selftests/efivarfs: create-read: fix a resource leak
  selftests/ftrace: Add riscv support for kprobe arg tests
  selftests/ftrace: add loongarch support for kprobe args char tests
  selftests/amd-pstate: Added option to provide perf binary path
  selftests/amd-pstate: Fix broken paths to run workloads in amd-pstate-ut
  selftests/resctrl: Move run_benchmark() to a more fitting file
  selftests/resctrl: Fix schemata write error check
  selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
  selftests/resctrl: Fix feature checks
  selftests/resctrl: Refactor feature check to use resource and feature name
  selftests/resctrl: Move _GNU_SOURCE define into Makefile
  selftests/resctrl: Remove duplicate feature check from CMT test
  selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
  selftests/resctrl: Fix uninitialized .sa_flags
  selftests/resctrl: Cleanup benchmark argument parsing
  selftests/resctrl: Remove ben_count variable
  selftests/resctrl: Make benchmark command const and build it with pointers
  ...
2023-11-01 17:08:10 -10:00
Linus Torvalds 463f46e114 iommufd for 6.7
This branch has three new iommufd capabilities:
 
  - Dirty tracking for DMA. AMD/ARM/Intel CPUs can now record if a DMA
    writes to a page in the IOPTEs within the IO page table. This can be used
    to generate a record of what memory is being dirtied by DMA activities
    during a VM migration process. A VMM like qemu will combine the IOMMU
    dirty bits with the CPU's dirty log to determine what memory to
    transfer.
 
    VFIO already has a DMA dirty tracking framework that requires PCI
    devices to implement tracking HW internally. The iommufd version
    provides an alternative that the VMM can select, if available. The two
    are designed to have very similar APIs.
 
  - Userspace controlled attributes for hardware page
    tables (HWPT/iommu_domain). There are currently a few generic attributes
    for HWPTs (support dirty tracking, and parent of a nest). This is an
    entry point for the userspace iommu driver to control the HW in detail.
 
  - Nested translation support for HWPTs. This is a 2D translation scheme
    similar to the CPU where a DMA goes through a first stage to determine
    an intermediate address which is then translated trough a second stage
    to a physical address.
 
    Like for CPU translation the first stage table would exist in VM
    controlled memory and the second stage is in the kernel and matches the
    VM's guest to physical map.
 
    As every IOMMU has a unique set of parameter to describe the S1 IO page
    table and its associated parameters the userspace IOMMU driver has to
    marshal the information into the correct format.
 
    This is 1/3 of the feature, it allows creating the nested translation
    and binding it to VFIO devices, however the API to support IOTLB and
    ATC invalidation of the stage 1 io page table, and forwarding of IO
    faults are still in progress.
 
 The series includes AMD and Intel support for dirty tracking. Intel
 support for nested translation.
 
 Along the way are a number of internal items:
 
  - New iommu core items: ops->domain_alloc_user(), ops->set_dirty_tracking,
    ops->read_and_clear_dirty(), IOMMU_DOMAIN_NESTED, and iommu_copy_struct_from_user
 
  - UAF fix in iopt_area_split()
 
  - Spelling fixes and some test suite improvement
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZUDu2wAKCRCFwuHvBreF
 YcdeAQDaBmjyGLrRIlzPyohF6FrombyWo2512n51Hs8IHR4IvQEA3oRNgQ2tsJRr
 1UPuOqnOD5T/oVX6AkUPRBwanCUQwwM=
 =nyJ3
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "This brings three new iommufd capabilities:

   - Dirty tracking for DMA.

     AMD/ARM/Intel CPUs can now record if a DMA writes to a page in the
     IOPTEs within the IO page table. This can be used to generate a
     record of what memory is being dirtied by DMA activities during a
     VM migration process. A VMM like qemu will combine the IOMMU dirty
     bits with the CPU's dirty log to determine what memory to transfer.

     VFIO already has a DMA dirty tracking framework that requires PCI
     devices to implement tracking HW internally. The iommufd version
     provides an alternative that the VMM can select, if available. The
     two are designed to have very similar APIs.

   - Userspace controlled attributes for hardware page tables
     (HWPT/iommu_domain). There are currently a few generic attributes
     for HWPTs (support dirty tracking, and parent of a nest). This is
     an entry point for the userspace iommu driver to control the HW in
     detail.

   - Nested translation support for HWPTs. This is a 2D translation
     scheme similar to the CPU where a DMA goes through a first stage to
     determine an intermediate address which is then translated trough a
     second stage to a physical address.

     Like for CPU translation the first stage table would exist in VM
     controlled memory and the second stage is in the kernel and matches
     the VM's guest to physical map.

     As every IOMMU has a unique set of parameter to describe the S1 IO
     page table and its associated parameters the userspace IOMMU driver
     has to marshal the information into the correct format.

     This is 1/3 of the feature, it allows creating the nested
     translation and binding it to VFIO devices, however the API to
     support IOTLB and ATC invalidation of the stage 1 io page table,
     and forwarding of IO faults are still in progress.

  The series includes AMD and Intel support for dirty tracking. Intel
  support for nested translation.

  Along the way are a number of internal items:

   - New iommu core items: ops->domain_alloc_user(),
     ops->set_dirty_tracking, ops->read_and_clear_dirty(),
     IOMMU_DOMAIN_NESTED, and iommu_copy_struct_from_user

   - UAF fix in iopt_area_split()

   - Spelling fixes and some test suite improvement"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (52 commits)
  iommufd: Organize the mock domain alloc functions closer to Joerg's tree
  iommufd/selftest: Fix page-size check in iommufd_test_dirty()
  iommufd: Add iopt_area_alloc()
  iommufd: Fix missing update of domains_itree after splitting iopt_area
  iommu/vt-d: Disallow read-only mappings to nest parent domain
  iommu/vt-d: Add nested domain allocation
  iommu/vt-d: Set the nested domain to a device
  iommu/vt-d: Make domain attach helpers to be extern
  iommu/vt-d: Add helper to setup pasid nested translation
  iommu/vt-d: Add helper for nested domain allocation
  iommu/vt-d: Extend dmar_domain to support nested domain
  iommufd: Add data structure for Intel VT-d stage-1 domain allocation
  iommu/vt-d: Enhance capability check for nested parent domain allocation
  iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs
  iommufd/selftest: Add nested domain allocation for mock domain
  iommu: Add iommu_copy_struct_from_user helper
  iommufd: Add a nested HW pagetable object
  iommu: Pass in parent domain with user_data to domain_alloc_user op
  iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED
  iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable
  ...
2023-11-01 16:44:56 -10:00
Linus Torvalds f5277ad1e9 for-6.7/io_uring-sockopt-2023-10-30
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmU/vdwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpr2rD/0astIsj/AACVSPzHARg9lnhkIvUeweMSSl
 CjifLTzK3a9E3R2IrC4sflObUKIEL3fste0Lva141eNULZvBJ6cQJDvY7Bp72Bkc
 CTPEwEQiwDJKLhTzQh3gY0H0+nFMWwEm1uc4dyeNAft/R9bPP/qOq62ttCoCp9+S
 1UoFmTlJE3bhejyS7fytoGZvKqhkpdR7rtbR4ya7CXWPoAG+v9amo8fputbxm0dj
 WECpKdd65JHWwYV4rbPA69T7jZ9V0oUsLen9RJ9BmjMLOFggHYqQdvEwG0Htirhw
 t5uaXqSvc8pXsJhKXMS3tXCrLNtBha5nlWHBpSE+6ovcmKiRzFjUaRXkRbcIrOAx
 ljIm0HHto1+xv0pDrNl3/lIjv5dpNOEauqqgMeYytQJIHa0JpSWbYzvjwQ8EZXQv
 WWDiRfH5Z0/3BsFdOCVqd8mTt4Pbksp2VFcxGkojRtSqSr4CML3mPZSmqGcs3nE6
 Fc16XXw7oLEWoF1tQYMP6KG0cVLem4on28c8CcVMJ/pRvcun3jBCif2gmMHJkWyA
 a6Uq116amqQ61f1p+EQ3ChqyTA5uALrXPmovu6Ne3Y/btW5yG4+Vu7AsPLjPHdFN
 oGHjOPV77XQzEqzUWRXmXPecZ+QifkcCV/8kbqtEHQqk5n+HUKQZmpC8+014ms3V
 Af6LYI/vYg==
 =sk8+
 -----END PGP SIGNATURE-----

Merge tag 'for-6.7/io_uring-sockopt-2023-10-30' of git://git.kernel.dk/linux

Pull io_uring {get,set}sockopt support from Jens Axboe:
 "This adds support for using getsockopt and setsockopt via io_uring.

  The main use cases for this is to enable use of direct descriptors,
  rather than first instantiating a normal file descriptor, doing the
  option tweaking needed, then turning it into a direct descriptor. With
  this support, we can avoid needing a regular file descriptor
  completely.

  The net and bpf bits have been signed off on their side"

* tag 'for-6.7/io_uring-sockopt-2023-10-30' of git://git.kernel.dk/linux:
  selftests/bpf/sockopt: Add io_uring support
  io_uring/cmd: Introduce SOCKET_URING_OP_SETSOCKOPT
  io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
  io_uring/cmd: return -EOPNOTSUPP if net is disabled
  selftests/net: Extract uring helpers to be reusable
  tools headers: Grab copy of io_uring.h
  io_uring/cmd: Pass compat mode in issue_flags
  net/socket: Break down __sys_getsockopt
  net/socket: Break down __sys_setsockopt
  bpf: Add sockptr support for setsockopt
  bpf: Add sockptr support for getsockopt
2023-11-01 11:16:34 -10:00
Itaru Kitayama 2ffc27b15b tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
On Ubuntu and probably other distros, ptrace permissions are tightend a
bit by default; i.e., /proc/sys/kernel/yama/ptrace_score is set to 1. 
This cases memfd_secret's ptrace attach test fails with a permission
error.  Set it to 0 piror to running the program.  

Link: https://lkml.kernel.org/r/20231030-selftest-v1-1-743df68bb996@linux.dev
Signed-off-by: Itaru Kitayama <itaru.kitayama@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01 12:46:59 -07:00
Swarup Laxman Kotiaklapudi bf5add391e proc: test ProtectionKey in proc-empty-vm test
Check ProtectionKey field in /proc/*/smaps output, if system supports
protection keys feature.

[adobriyan@gmail.com: test support in the beginning of the program, use syscall, not glibc pkey_alloc(3) which may not compile]

Link: https://lkml.kernel.org/r/ac05efa7-d2a0-48ad-b704-ffdd5450582e@p183
Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Swarup Laxman Kotikalapudi<swarupkotikalapudi@gmail.com>
Tested-by: Swarup Laxman Kotikalapudi<swarupkotikalapudi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01 12:46:59 -07:00
Alexey Dobriyan 20e34aa7e0 proc: fix proc-empty-vm test with vsyscall
* fix embarassing /proc/*/smaps test bug due to a typo in variable name
  it tested only the first line of the output if vsyscall is enabled:

  	ffffffffff600000-ffffffffff601000 r-xp ...

  so test passed but tested only VMA location and permissions.

* add "KSM" entry, unnoticed because (1)

* swap "r-xp" and "--xp" vsyscall test strings,
  also unnoticed because (1)

Link: https://lkml.kernel.org/r/76f42cce-b1ab-45ec-b6b2-4c64f0dccb90@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Tested-by: Swarup Laxman Kotikalapudi<swarupkotikalapudi@mail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01 12:46:59 -07:00
Nhat Pham 6479b29203 selftests: add a sanity check for zswap
We recently encountered a bug that makes all zswap store attempt fail. 
Specifically, after:

"141fdeececb3 mm/zswap: delay the initialization of zswap"

if we build a kernel with zswap disabled by default, then enabled after
the swapfile is set up, the zswap tree will not be initialized.  As a
result, all zswap store calls will be short-circuited.  We have to perform
another swapon to get zswap working properly again.

Fortunately, this issue has since been fixed by the patch that kills
frontswap:

"42c06a0e8ebe mm: kill frontswap"

which performs zswap_swapon() unconditionally, i.e always initializing
the zswap tree.

This test add a sanity check that ensure zswap storing works as
intended.

Link: https://lkml.kernel.org/r/20231020222009.2358953-1-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01 12:38:35 -07:00
Linus Torvalds 56ec8e4cd8 arm64 updates for 6.7:
* Major refactoring of the CPU capability detection logic resulting in
   the removal of the cpus_have_const_cap() function and migrating the
   code to "alternative" branches where possible
 
 * Backtrace/kgdb: use IPIs and pseudo-NMI
 
 * Perf and PMU:
 
   - Add support for Ampere SoC PMUs
 
   - Multi-DTC improvements for larger CMN configurations with multiple
     Debug & Trace Controllers
 
   - Rework the Arm CoreSight PMU driver to allow separate registration of
     vendor backend modules
 
   - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf
     driver; use device_get_match_data() in the xgene driver; fix NULL
     pointer dereference in the hisi driver caused by calling
     cpuhp_state_remove_instance(); use-after-free in the hisi driver
 
 * HWCAP updates:
 
   - FEAT_SVE_B16B16 (BFloat16)
 
   - FEAT_LRCPC3 (release consistency model)
 
   - FEAT_LSE128 (128-bit atomic instructions)
 
 * SVE: remove a couple of pseudo registers from the cpufeature code.
   There is logic in place already to detect mismatched SVE features
 
 * Miscellaneous:
 
   - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA
     bouncing is needed. The buffer is still required for small kmalloc()
     buffers
 
   - Fix module PLT counting with !RANDOMIZE_BASE
 
   - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move
     synchronisation code out of the set_ptes() loop
 
   - More compact cpufeature displaying enabled cores
 
   - Kselftest updates for the new CPU features
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmU7/QUACgkQa9axLQDI
 XvEx3xAAjICmHm+ryKJxS1IGXLYu2DXMcHUjeW6w1SxkK/vKhTMlHRx/CIWDze2l
 eENu7TcDLtTw+Gv9kqg30TSwzLfJhP9oFpX2T5TKkh5qlJlbz8fBtm+as14DTLCZ
 p2sra3J0w4B5JwTVqnj2RHOlEftMKvbyLGRkz3ve6wIUbsp5pXMkxAd/k3wOf0lC
 m6d9w1OMA2sOsw9YCgjcCNQGEzFMJk+13w7K+4w6A8Djn/Jxkt4fAFVn2ZlCiZzD
 NA2lTDWJqGmeGHo3iFdCTensWXmWTqjzxsNEf7PyBk5mBOdzDVxlTfEL7vnJg7gf
 BlTQ/nhIpra7rHQ9q2rwqEzbF+4Tn3uWlQfdDb7+/4goPjDh7tlBhEOYyOwTCEIT
 0t9cCSvBmSCKeXC3lKWWtJ+QJKhZHSmXN84EotTs65KyyfIsi4RuSezvV/+aIL86
 06sHYlYxETuujZP1cgOjf69Wsdsgizx0mqXJXf/xOjp22HFDcL4Bki6Rgi6t5OZj
 GEHG15kSE+eJ+RIpxpuAN8fdrlxYubsVLIksCqK7cZf9zXbQGIlifKAIrYiEx6kz
 FD+o+j/5niRWR6yJZCtCcGxqpSlwnYWPqc1Ds0GES8A/BphWMPozXUAZ0ll4Fnp1
 yyR2/Due/eBsCNESn579kP8989rashubB8vxvdx2fcWVtLC7VgE=
 =QaEo
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "No major architecture features this time around, just some new HWCAP
  definitions, support for the Ampere SoC PMUs and a few fixes/cleanups.

  The bulk of the changes is reworking of the CPU capability checking
  code (cpus_have_cap() etc).

   - Major refactoring of the CPU capability detection logic resulting
     in the removal of the cpus_have_const_cap() function and migrating
     the code to "alternative" branches where possible

   - Backtrace/kgdb: use IPIs and pseudo-NMI

   - Perf and PMU:

      - Add support for Ampere SoC PMUs

      - Multi-DTC improvements for larger CMN configurations with
        multiple Debug & Trace Controllers

      - Rework the Arm CoreSight PMU driver to allow separate
        registration of vendor backend modules

      - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf
        driver; use device_get_match_data() in the xgene driver; fix
        NULL pointer dereference in the hisi driver caused by calling
        cpuhp_state_remove_instance(); use-after-free in the hisi driver

   - HWCAP updates:

      - FEAT_SVE_B16B16 (BFloat16)

      - FEAT_LRCPC3 (release consistency model)

      - FEAT_LSE128 (128-bit atomic instructions)

   - SVE: remove a couple of pseudo registers from the cpufeature code.
     There is logic in place already to detect mismatched SVE features

   - Miscellaneous:

      - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA
        bouncing is needed. The buffer is still required for small
        kmalloc() buffers

      - Fix module PLT counting with !RANDOMIZE_BASE

      - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move
        synchronisation code out of the set_ptes() loop

      - More compact cpufeature displaying enabled cores

      - Kselftest updates for the new CPU features"

 * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
  arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer
  arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n
  arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper
  perf: hisi: Fix use-after-free when register pmu fails
  drivers/perf: hisi_pcie: Initialize event->cpu only on success
  drivers/perf: hisi_pcie: Check the type first in pmu::event_init()
  arm64: cpufeature: Change DBM to display enabled cores
  arm64: cpufeature: Display the set of cores with a feature
  perf/arm-cmn: Enable per-DTC counter allocation
  perf/arm-cmn: Rework DTC counters (again)
  perf/arm-cmn: Fix DTC domain detection
  drivers: perf: arm_pmuv3: Drop some unused arguments from armv8_pmu_init()
  drivers: perf: arm_pmuv3: Read PMMIR_EL1 unconditionally
  drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process
  clocksource/drivers/arm_arch_timer: limit XGene-1 workaround
  arm64: Remove system_uses_lse_atomics()
  arm64: Mark the 'addr' argument to set_ptes() and __set_pte_at() as unused
  drivers/perf: xgene: Use device_get_match_data()
  perf/amlogic: add missing MODULE_DEVICE_TABLE
  arm64/mm: Hoist synchronization out of set_ptes() loop
  ...
2023-11-01 09:34:55 -10:00
Linus Torvalds 8bc9e65151 Devicetree updates for 6.7:
- Add a kselftest to check for unprobed DT devices
 
 - Fix address translation for some 3 address cells cases
 
 - Refactor firmware node refcounting for AMBA bus
 
 - Add bindings for qcom,sm4450-pdc, Qualcomm Kryo 465 CPU, and Freescale
   QMC HDLC
 
 - Add Marantec vendor prefix
 
 - Convert qcom,pm8921-keypad, cnxt,cx92755-wdt, da9062-wdt,
   and atmel,at91rm9200-wdt bindings to DT schema
 
 - Several additionalProperties/unevaluatedProperties on child node
   schemas fixes
 
 - Drop reserved-memory bindings which now live in dtschema project
 
 - Fix a reference to rockchip,inno-usb2phy.yaml
 
 - Remove backlight nodes from display panel examples
 
 - Expand example for using DT_SCHEMA_FILES
 
 - Merge simple LVDS panel bindings to one binding doc
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmVBdEoACgkQ+vtdtY28
 YcObKw//ZkdPTh8t2m4ZH0kGSzcFGx1RiRxOOwVW9UTLovGDsxHixxu/j/9kerQw
 LHQH2UntlpmZhfIGgqlDf6QrPIuCAFLKTlx0+G2upq4TfHWUEOGcGCracDs65zJa
 XleEDw9Kt37fiVMUH/i+0mKTm98f+Zb//7IReSzGYtKW1alIr8TAUds26SbBckQ+
 /KClOJXuJmsqIWi3cJm3j59rzsSUcnLPR/GHEa03grazZXZ1MNHeaGB3+xZmSKMu
 0rhJrBX3PICxFx7FZevZFcHR4S4BQWmste72GTPZi+Htb3CtgjJFkzRdutoPByF7
 sSaLhs7f2msfcXhlgw2QoK3Wb2m33cZ+TaESXxx4YmVs/pRMD7kPGfODk7qf+vvJ
 kPN+bPh2THlp/L8x7S5EeqH+8NqJzXrdLf7CSUnOmkF/0GZ7/Id3Wt0rpoQeXLs3
 gi/v3K6qDyBKJ8cqEudftXMiYFcmSQJMvOA3x97j2J5iDAYltNFwI30hE07uXFhz
 WpNt/6wM8JLtQfL1IiMiL2I++0tEA4zCc8/aLfwcl6IkAjbP8KTGxtw3gFcyGaqt
 jzJQXr0j2xrfN6M/g55xXpPhN7R+2NaeiDETlDF9NggadrwnV7Nn9FFxASSXNomD
 BQU0jIECDo946NJv7/vw7RKxDJuzNdmqp54QZwoMlUPdxJgMw6g=
 =JCj5
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:

 - Add a kselftest to check for unprobed DT devices

 - Fix address translation for some 3 address cells cases

 - Refactor firmware node refcounting for AMBA bus

 - Add bindings for qcom,sm4450-pdc, Qualcomm Kryo 465 CPU, and
   Freescale QMC HDLC

 - Add Marantec vendor prefix

 - Convert qcom,pm8921-keypad, cnxt,cx92755-wdt, da9062-wdt, and
   atmel,at91rm9200-wdt bindings to DT schema

 - Several additionalProperties/unevaluatedProperties on child node
   schemas fixes

 - Drop reserved-memory bindings which now live in dtschema project

 - Fix a reference to rockchip,inno-usb2phy.yaml

 - Remove backlight nodes from display panel examples

 - Expand example for using DT_SCHEMA_FILES

 - Merge simple LVDS panel bindings to one binding doc

* tag 'devicetree-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (34 commits)
  dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add support for QMC HDLC
  dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add 'additionalProperties: false' in child nodes
  dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Fix example property name
  dt-bindings: arm,coresight-cti: Add missing additionalProperties on child nodes
  dt-bindings: arm,coresight-cti: Drop type for 'cpu' property
  dt-bindings: soundwire: Add reference to soundwire-controller.yaml schema
  dt-bindings: input: syna,rmi4: Make "additionalProperties: true" explicit
  media: dt-bindings: ti,ds90ub960: Add missing type for "i2c-alias"
  dt-bindings: input: qcom,pm8921-keypad: convert to YAML format
  of: overlay: unittest: overlay_bad_unresolved: Spelling s/ok/okay/
  of: address: Consolidate bus .map() functions
  of: address: Store number of bus flag cells rather than bool
  of: unittest: Add tests for address translations
  of: address: Remove duplicated functions
  of: address: Fix address translation when address-size is greater than 2
  dt-bindings: watchdog: cnxt,cx92755-wdt: convert txt to yaml
  dt-bindings: watchdog: da9062-wdt: convert txt to yaml
  dt-bindings: watchdog: fsl,scu-wdt: Document imx8dl
  dt-bindings: watchdog: atmel,at91rm9200-wdt: convert txt to yaml
  dt-bindings: usb: rockchip,dwc3: update inno usb2 phy binding name
  ...
2023-10-31 18:50:13 -10:00
Huacai Chen a6bdc082ad Merge 'bpf-next 2023-10-16' into loongarch-next
LoongArch architecture changes for 6.7 (BPF CPU v4 support) depend on
the bpf changes to fix conflictions in selftests and work, so merge them
to create a base.
2023-11-01 10:55:00 +08:00
Linus Torvalds 4ac4677fdb Thermal control updates for 6.7-rc1
- Untangle the initialization and updates of passive and active trip
    points in the ACPI thermal driver (Rafael Wysocki).
 
  - Reduce code duplication related to the initialization and updates
    of trip points in the ACPI thermal driver (Rafael Wysocki).
 
  - Use trip pointers for cooling device binding in the ACPI thermal
    driver (Rafael Wysocki).
 
  - Simplify critical and hot trips representation in the ACPI thermal
    driver (Rafael Wysocki).
 
  - Use trip pointers in thermal governors and in the related part of
    the thermal core (Rafael Wysocki).
 
  - Drop the trips_disabled bitmask that has become redundant from the
    thermal core (Rafael Wysocki).
 
  - Avoid updating trip points when the thermal zone temperature falls
    into a trip point's hysteresis range (ícolas F. R. A. Prado).
 
  - Add power floor notifications support to the int340x thermal control
    driver (Srinivas Pandruvada).
 
  - Rework updating trip points in the int340x thermal driver so that it
    does not access thermal zone internals directly (Rafael Wysocki).
 
  - Use param_get_byte() instead of param_get_int() as the max_idle module
    parameter .get() callback in the Intel powerclamp thermal driver to
    avoid possible out-of-bounds access (David Arcari).
 
  - Add workload hints support to the int340x thermal driver (Srinivas
    Pandruvada).
 
  - Add support for Mediatek LVTS MT8192 along with suspend/resume
    routines (Balsam Chihi).
 
  - Fix probe for THERMAL_V2 in the Mediatek LVTS driver (Markus
    Schneider-Pargmann).
 
  - Remove duplicate error message from the max76620 driver when
    thermal_of_zone_register() fails (Thierry Reding).
 
  - Add i.MX7D compatible bindings to fix a warning from dtbs_check for
    the imx6ul platform (Alexander Stein).
 
  - Add sa8775p compatible to the QCom tsens driver (Priyansh Jain).
 
  - Fix error check in lvts_debugfs_init() to be against PTR_ERR() in the
    LVTS Mediatek driver (Minjie Du).
 
  - Remove unused variable in thermal/tools (Kuan-Wei Chiu).
 
  - Document the imx8dl thermal sensor (Fabio Estevam).
 
  - Add variable names in callback prototypes to prevent warning from
    checkpatch.pl in the imx8mm driver (Bragatheswaran Manickavel).
 
  - Add missing unevaluatedProperties on child node schemas for tegra124
    (Rob Herring)
 
  - Add mt7988 support to the Mediatek LVTS driver (Frank Wunderlich).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmU6a+sSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx9s8P/A/u3d8jn5Dj7B4N3Y4uTL1IG3aleGnZ
 EDxoiW6ju723s9ZMGsJg6qzXoksJDjleUTGtdyGYgbPyFz44/6/bcYqI6H0GsHlH
 lah5jAgXEaJdbc9CwwhjBoVEN6wO5ATyDIJ1H+iU0x/6svZpXN+tlQp2dAFacAxD
 CyZkFrBrSrEercO59NxCTyA9n9hfAfHdX7hBd5e686SXtcR+wdLyLfrPiEoE4oYq
 vW9QkzTSK+nILrLTSGnrmtYvsOzbyRKqoIPrRhqSwgobvDHvYgkokdESYGzu/d3F
 k922eHV8C4RyFZmVT9o+dVaK27gZ3KQI3d8BfpitMYD/Tb0N4axmwkZ44rrA8OOy
 kpTFENvlICyUpfcTqwLYIZT9WK7rXKpxqjLeKMZvzrUWBoNm8tFsuJy3Hxd0IP6V
 F2X62UH/sA4QAtJHD0VP5mc7FRbuFBdpyQ6Tq9ZNUQGoOeFT6VxQaqzRzqWwV6Na
 lxxvX3FST4lRguNRg9NW3RrOxQOyLaQP1TIN3CaZ0rOs2dlGz8v2UN252uM/K1Y0
 JO0rXS2HoXdUqWWaIMnVvNqOsaYeMGhVfCNySEn1DoChpInAZF50Z0/fL5Vpb51L
 cl0VrNxReHwDHeuRtKcFgH1S8YrsehrRAZp3bDArmD/XszJQR+Rv5N6BSdURDLSU
 1mkfn37l8fKc
 =PSaG
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "These further rework the ACPI thermal driver, after the changes made
  to it in the previous cycle, to make it easier to grasp, get rid of
  redundant pieces of internal data structures and eliminate its
  reliance on a specific ordering of trip point objects in the thermal
  core, make thermal core adjustments needed for the ACPI thermal driver
  rework, modify the thermal governor interface so as to use trip
  pointers for representing trip points in it, switch over multiple
  thermal drivers to using void platform driver remove callbacks, add
  support for 2 hardware features to the Intel int340x thermal driver,
  add support for new hardware on ARM platforms, update documentation,
  fix problems, clean up code and update the MAINTAINERS record for
  thermal control.

  Specifics:

   - Untangle the initialization and updates of passive and active trip
     points in the ACPI thermal driver (Rafael Wysocki)

   - Reduce code duplication related to the initialization and updates
     of trip points in the ACPI thermal driver (Rafael Wysocki)

   - Use trip pointers for cooling device binding in the ACPI thermal
     driver (Rafael Wysocki)

   - Simplify critical and hot trips representation in the ACPI thermal
     driver (Rafael Wysocki)

   - Use trip pointers in thermal governors and in the related part of
     the thermal core (Rafael Wysocki)

   - Drop the trips_disabled bitmask that has become redundant from the
     thermal core (Rafael Wysocki)

   - Avoid updating trip points when the thermal zone temperature falls
     into a trip point's hysteresis range (ícolas F. R. A. Prado)

   - Add power floor notifications support to the int340x thermal
     control driver (Srinivas Pandruvada)

   - Rework updating trip points in the int340x thermal driver so that
     it does not access thermal zone internals directly (Rafael
     Wysocki)

   - Use param_get_byte() instead of param_get_int() as the max_idle
     module parameter .get() callback in the Intel powerclamp thermal
     driver to avoid possible out-of-bounds access (David Arcari)

   - Add workload hints support to the int340x thermal driver (Srinivas
     Pandruvada)

   - Add support for Mediatek LVTS MT8192 along with suspend/resume
     routines (Balsam Chihi)

   - Fix probe for THERMAL_V2 in the Mediatek LVTS driver (Markus
     Schneider-Pargmann)

   - Remove duplicate error message from the max76620 driver when
     thermal_of_zone_register() fails (Thierry Reding)

   - Add i.MX7D compatible bindings to fix a warning from dtbs_check for
     the imx6ul platform (Alexander Stein)

   - Add sa8775p compatible to the QCom tsens driver (Priyansh Jain)

   - Fix error check in lvts_debugfs_init() to be against PTR_ERR() in
     the LVTS Mediatek driver (Minjie Du)

   - Remove unused variable in thermal/tools (Kuan-Wei Chiu)

   - Document the imx8dl thermal sensor (Fabio Estevam)

   - Add variable names in callback prototypes to prevent warning from
     checkpatch.pl in the imx8mm driver (Bragatheswaran Manickavel)

   - Add missing unevaluatedProperties on child node schemas for
     tegra124 (Rob Herring)

   - Add mt7988 support to the Mediatek LVTS driver (Frank Wunderlich)"

* tag 'thermal-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (111 commits)
  thermal: ACPI: Include the right header file
  thermal: core: Don't update trip points inside the hysteresis range
  thermal: core: Pass trip pointer to governor throttle callback
  thermal: gov_step_wise: Fold update_passive_instance() into its caller
  thermal: gov_power_allocator: Use trip pointers instead of trip indices
  thermal: gov_fair_share: Rearrange get_trip_level()
  thermal: trip: Define for_each_trip() macro
  thermal: trip: Simplify computing trip indices
  thermal/qcom/tsens: Drop ops_v0_1
  thermal/drivers/mediatek/lvts_thermal: Update calibration data documentation
  thermal/drivers/mediatek/lvts_thermal: Add mt8192 support
  thermal/drivers/mediatek/lvts_thermal: Add suspend and resume
  dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for mt8192
  thermal/drivers/mediatek: Fix probe for THERMAL_V2
  thermal/drivers/max77620: Remove duplicate error message
  dt-bindings: timer: add imx7d compatible
  dt-bindings: net: microchip: Allow nvmem-cell usage
  dt-bindings: imx-thermal: Add #thermal-sensor-cells property
  dt-bindings: thermal: tsens: Add sa8775p compatible
  thermal/drivers/mediatek/lvts_thermal: Fix error check in lvts_debugfs_init()
  ...
2023-10-31 15:28:37 -10:00
Paolo Bonzini 45b890f768 KVM/arm64 updates for 6.7
- Generalized infrastructure for 'writable' ID registers, effectively
    allowing userspace to opt-out of certain vCPU features for its guest
 
  - Optimization for vSGI injection, opportunistically compressing MPIDR
    to vCPU mapping into a table
 
  - Improvements to KVM's PMU emulation, allowing userspace to select
    the number of PMCs available to a VM
 
  - Guest support for memory operation instructions (FEAT_MOPS)
 
  - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
    bugs and getting rid of useless code
 
  - Changes to the way the SMCCC filter is constructed, avoiding wasted
    memory allocations when not in use
 
  - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
    the overhead of errata mitigations
 
  - Miscellaneous kernel and selftest fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZUFJRgAKCRCivnWIJHzd
 FtgYAP9cMsc1Mhlw3jNQnTc6q0cbTulD/SoEDPUat1dXMqjs+gEAnskwQTrTX834
 fgGQeCAyp7Gmar+KeP64H0xm8kPSpAw=
 =R4M7
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.7

 - Generalized infrastructure for 'writable' ID registers, effectively
   allowing userspace to opt-out of certain vCPU features for its guest

 - Optimization for vSGI injection, opportunistically compressing MPIDR
   to vCPU mapping into a table

 - Improvements to KVM's PMU emulation, allowing userspace to select
   the number of PMCs available to a VM

 - Guest support for memory operation instructions (FEAT_MOPS)

 - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
   bugs and getting rid of useless code

 - Changes to the way the SMCCC filter is constructed, avoiding wasted
   memory allocations when not in use

 - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
   the overhead of errata mitigations

 - Miscellaneous kernel and selftest fixes
2023-10-31 16:37:07 -04:00
Dan Williams 7f946e6d83 Merge branch 'for-6.7/cxl-rch-eh' into cxl/next
Restricted CXL Host (RCH) Error Handling undoes the topology munging of
CXL 1.1 to enabled some AER recovery, and lands some base infrastructure
for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include
this long running series finally for v6.7.
2023-10-31 10:59:00 -07:00
Linus Torvalds 89ed67ef12 Networking changes for 6.7.
Core & protocols
 ----------------
 
  - Support usec resolution of TCP timestamps, enabled selectively by
    a route attribute.
 
  - Defer regular TCP ACK while processing socket backlog, try to send
    a cumulative ACK at the end. Increase single TCP flow performance
    on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
 
  - The Fair Queuing (FQ) packet scheduler:
    - add built-in 3 band prio / WRR scheduling
    - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
    - improve inactive flow reporting
    - optimize the layout of structures for better cache locality
 
  - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
    replacement for the old MD5 option.
 
  - Add more retransmission timeout (RTO) related statistics to TCP_INFO.
 
  - Support sending fragmented skbs over vsock sockets.
 
  - Make sure we send SIGPIPE for vsock sockets if socket was shutdown().
 
  - Add sysctl for ignoring lower limit on lifetime in Router
    Advertisement PIO, based on an in-progress IETF draft.
 
  - Add sysctl to control activation of TCP ping-pong mode.
 
  - Add sysctl to make connection timeout in MPTCP configurable.
 
  - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
    limit the number of wakeups.
 
  - Support netlink GET for MDB (multicast forwarding), allowing user
    space to request a single MDB entry instead of dumping the entire
    table.
 
  - Support selective FDB flushing in the VXLAN tunnel driver.
 
  - Allow limiting learned FDB entries in bridges, prevent OOM attacks.
 
  - Allow controlling via configfs netconsole targets which were created
    via the kernel cmdline at boot, rather than via configfs at runtime.
 
  - Support multiple PTP timestamp event queue readers with different
    filters.
 
  - MCTP over I3C.
 
 BPF
 ---
 
  - Add new veth-like netdevice where BPF program defines the logic
    of the xmit routine. It can operate in L3 and L2 mode.
 
  - Support exceptions - allow asserting conditions which should
    never be true but are hard for the verifier to infer.
    With some extra flexibility around handling of the exit / failure.
    https://lwn.net/Articles/938435/
 
  - Add support for local per-cpu kptr, allow allocating and storing
    per-cpu objects in maps. Access to those objects operates on
    the value for the current CPU. This allows to deprecate local
    one-off implementations of per-CPU storage like
    BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
 
  - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
    for systemd to re-implement the LogNamespace feature which allows
    running multiple instances of systemd-journald to process the logs
    of different services.
 
  - Enable open-coded task_vma iteration, after maple tree conversion
    made it hard to directly walk VMAs in tracing programs.
 
  - Add open-coded task, css_task and css iterator support.
    One of the use cases is customizable OOM victim selection via BPF.
 
  - Allow source address selection with bpf_*_fib_lookup().
 
  - Add ability to pin BPF timer to the current CPU.
 
  - Prevent creation of infinite loops by combining tail calls and
    fentry/fexit programs.
 
  - Add missed stats for kprobes to retrieve the number of missed kprobe
    executions and subsequent executions of BPF programs.
 
  - Inherit system settings for CPU security mitigations.
 
  - Add BPF v4 CPU instruction support for arm32 and s390x.
 
 Changes to common code
 ----------------------
 
  - overflow: add DEFINE_FLEX() for on-stack definition of structs
    with flexible array members.
 
  - Process doc update with more guidance for reviewers.
 
 Driver API
 ----------
 
  - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
    mutex in most places and remove a lot of smaller locks.
 
  - Create a common DPLL configuration API. Allow configuring
    and querying state of PLL circuits used for clock syntonization,
    in network time distribution.
 
  - Unify fragmented and full page allocation APIs in page pool code.
    Let drivers be ignorant of PAGE_SIZE.
 
  - Rework PHY state machine to avoid races with calls to phy_stop().
 
  - Notify DSA drivers of MAC address changes on user ports, improve
    correctness of offloads which depend on matching port MAC addresses.
 
  - Allow antenna control on injected WiFi frames.
 
  - Reduce the number of variants of napi_schedule().
 
  - Simplify error handling when composing devlink health messages.
 
 Misc
 ----
 
  - A lot of KCSAN data race "fixes", from Eric.
 
  - A lot of __counted_by() annotations, from Kees.
 
  - A lot of strncpy -> strscpy and printf format fixes.
 
  - Replace master/slave terminology with conduit/user in DSA drivers.
 
  - Handful of KUnit tests for netdev and WiFi core.
 
 Removed
 -------
 
  - AppleTalk COPS.
 
  - AppleTalk ipddp.
 
  - TI AR7 CPMAC Ethernet driver.
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - add a driver for the Intel E2000 IPUs
      - make CRC/FCS stripping configurable
      - cross-timestamping for E823 devices
      - basic support for E830 devices
      - use aux-bus for managing client drivers
      - i40e: report firmware versions via devlink
    - nVidia/Mellanox:
      - support 4-port NICs
      - increase max number of channels to 256
      - optimize / parallelize SF creation flow
    - Broadcom (bnxt):
      - enhance NIC temperature reporting
      - support PAM4 speeds and lane configuration
    - Marvell OcteonTX2:
      - PTP pulse-per-second output support
      - enable hardware timestamping for VFs
    - Solarflare/AMD:
      - conntrack NAT offload and offload for tunnels
    - Wangxun (ngbe/txgbe):
      - expose HW statistics
    - Pensando/AMD:
      - support PCI level reset
      - narrow down the condition under which skbs are linearized
    - Netronome/Corigine (nfp):
      - support CHACHA20-POLY1305 crypto in IPsec offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Synopsys (stmmac):
      - add Loongson-1 SoC support
      - enable use of HW queues with no offload capabilities
      - enable PPS input support on all 5 channels
      - increase TX coalesce timer to 5ms
    - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
    - xen: support SW packet timestamping
    - add drivers for implementations based on TI's PRUSS (AM64x EVM)
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - avoid poor HW resource use on Spectrum-4 by better block selection
      for IPv6 multicast forwarding and ordering of blocks in ACL region
 
  - Ethernet embedded switches:
    - Microchip:
      - support configuring the drive strength for EMI compliance
      - ksz9477: partial ACL support
      - ksz9477: HSR offload
      - ksz9477: Wake on LAN
    - Realtek:
      - rtl8366rb: respect device tree config of the CPU port
 
  - Ethernet PHYs:
    - support Broadcom BCM5221 PHYs
    - TI dp83867: support hardware LED blinking
 
  - CAN:
    - add support for Linux-PHY based CAN transceivers
    - at91_can: clean up and use rx-offload helpers
 
  - WiFi:
    - MediaTek (mt76):
      - new sub-driver for mt7925 USB/PCIe devices
      - HW wireless <> Ethernet bridging in MT7988 chips
      - mt7603/mt7628 stability improvements
    - Qualcomm (ath12k):
      - WCN7850:
        - enable 320 MHz channels in 6 GHz band
        - hardware rfkill support
        - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS
          to make scan faster
        - read board data variant name from SMBIOS
      - QCN9274: mesh support
    - RealTek (rtw89):
      - TDMA-based multi-channel concurrency (MCC)
    - Silicon Labs (wfx):
      - Remain-On-Channel (ROC) support
 
  - Bluetooth:
    - ISO: many improvements for broadcast support
    - mark BCM4378/BCM4387 as BROKEN_LE_CODED
    - add support for QCA2066
    - btmtksdio: enable Bluetooth wakeup from suspend
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmU8XsYACgkQMUZtbf5S
 Irv19RAAnud/24OOF5XMEJkIcYlnfqximh4XO6PujRSYkSkOUJdZTF6iJPgf3pSP
 YpwoHYbYKHYfeOf8+3bTNESiQNSnoVmvmvwiS6/7lZ3behHUrGLQzW9Htc3EZyWH
 2h6QkDZ5OOjfg0bwYSfp3vXkmMH2k8WE9Y0NvCkhcohqZi13Rmp14RnyPmNb2d1V
 yZRYDMSM133KqE6gnBr1Ct65IEvnKeGlCUN2mTGqOJgdn6DZMsyxvtt0y4rmN7Ab
 41+CgPU5SfxfbYpW+Dl2HJpgfte3WrC57KC6AM0PAPJzPmQWgeB/m9mjz/apj6Bg
 bhsEIo7FdvbCnQm3yWPhK2OgCAcSwLr8jfGMU+Q+W4VnL5SRRR3Rm0zjsze+kHNP
 OfqJgxzl3DpvoJqVBy1h5FGcZt0XHwhksm4cTxWqIahsF+veY0ECBXbuBBQx9XTF
 Y7INfI8ulg7wISJs+CJfIClYkgOibTw2u8taBS5ikbtgxNqp5D4QqODn7UefQap1
 PR/IDYODF+zRgmMJLeBqSa6fij6BkfOEDiOWak5kggBoZdtbtmeKI6tzze06CNdW
 lWv1WEhRufxnwK+IuWsEkjhiMbs2WGLvkJ5JbgQV9BfqHfIfiqBCrcWtT/WbQnGt
 lmU46CXh1t/FZEqbmK9h+8vsIIfrcDl6jb5npEiKPRG00vDKRTM=
 =46nS
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Support usec resolution of TCP timestamps, enabled selectively by a
     route attribute.

   - Defer regular TCP ACK while processing socket backlog, try to send
     a cumulative ACK at the end. Increase single TCP flow performance
     on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).

   - The Fair Queuing (FQ) packet scheduler:
       - add built-in 3 band prio / WRR scheduling
       - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
       - improve inactive flow reporting
       - optimize the layout of structures for better cache locality

   - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
     replacement for the old MD5 option.

   - Add more retransmission timeout (RTO) related statistics to
     TCP_INFO.

   - Support sending fragmented skbs over vsock sockets.

   - Make sure we send SIGPIPE for vsock sockets if socket was
     shutdown().

   - Add sysctl for ignoring lower limit on lifetime in Router
     Advertisement PIO, based on an in-progress IETF draft.

   - Add sysctl to control activation of TCP ping-pong mode.

   - Add sysctl to make connection timeout in MPTCP configurable.

   - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
     limit the number of wakeups.

   - Support netlink GET for MDB (multicast forwarding), allowing user
     space to request a single MDB entry instead of dumping the entire
     table.

   - Support selective FDB flushing in the VXLAN tunnel driver.

   - Allow limiting learned FDB entries in bridges, prevent OOM attacks.

   - Allow controlling via configfs netconsole targets which were
     created via the kernel cmdline at boot, rather than via configfs at
     runtime.

   - Support multiple PTP timestamp event queue readers with different
     filters.

   - MCTP over I3C.

  BPF:

   - Add new veth-like netdevice where BPF program defines the logic of
     the xmit routine. It can operate in L3 and L2 mode.

   - Support exceptions - allow asserting conditions which should never
     be true but are hard for the verifier to infer. With some extra
     flexibility around handling of the exit / failure:

          https://lwn.net/Articles/938435/

   - Add support for local per-cpu kptr, allow allocating and storing
     per-cpu objects in maps. Access to those objects operates on the
     value for the current CPU.

     This allows to deprecate local one-off implementations of per-CPU
     storage like BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.

   - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
     for systemd to re-implement the LogNamespace feature which allows
     running multiple instances of systemd-journald to process the logs
     of different services.

   - Enable open-coded task_vma iteration, after maple tree conversion
     made it hard to directly walk VMAs in tracing programs.

   - Add open-coded task, css_task and css iterator support. One of the
     use cases is customizable OOM victim selection via BPF.

   - Allow source address selection with bpf_*_fib_lookup().

   - Add ability to pin BPF timer to the current CPU.

   - Prevent creation of infinite loops by combining tail calls and
     fentry/fexit programs.

   - Add missed stats for kprobes to retrieve the number of missed
     kprobe executions and subsequent executions of BPF programs.

   - Inherit system settings for CPU security mitigations.

   - Add BPF v4 CPU instruction support for arm32 and s390x.

  Changes to common code:

   - overflow: add DEFINE_FLEX() for on-stack definition of structs with
     flexible array members.

   - Process doc update with more guidance for reviewers.

  Driver API:

   - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
     mutex in most places and remove a lot of smaller locks.

   - Create a common DPLL configuration API. Allow configuring and
     querying state of PLL circuits used for clock syntonization, in
     network time distribution.

   - Unify fragmented and full page allocation APIs in page pool code.
     Let drivers be ignorant of PAGE_SIZE.

   - Rework PHY state machine to avoid races with calls to phy_stop().

   - Notify DSA drivers of MAC address changes on user ports, improve
     correctness of offloads which depend on matching port MAC
     addresses.

   - Allow antenna control on injected WiFi frames.

   - Reduce the number of variants of napi_schedule().

   - Simplify error handling when composing devlink health messages.

  Misc:

   - A lot of KCSAN data race "fixes", from Eric.

   - A lot of __counted_by() annotations, from Kees.

   - A lot of strncpy -> strscpy and printf format fixes.

   - Replace master/slave terminology with conduit/user in DSA drivers.

   - Handful of KUnit tests for netdev and WiFi core.

  Removed:

   - AppleTalk COPS.

   - AppleTalk ipddp.

   - TI AR7 CPMAC Ethernet driver.

  Drivers:

   - Ethernet high-speed NICs:
      - Intel (100G, ice, idpf):
         - add a driver for the Intel E2000 IPUs
         - make CRC/FCS stripping configurable
         - cross-timestamping for E823 devices
         - basic support for E830 devices
         - use aux-bus for managing client drivers
         - i40e: report firmware versions via devlink
      - nVidia/Mellanox:
         - support 4-port NICs
         - increase max number of channels to 256
         - optimize / parallelize SF creation flow
      - Broadcom (bnxt):
         - enhance NIC temperature reporting
         - support PAM4 speeds and lane configuration
      - Marvell OcteonTX2:
         - PTP pulse-per-second output support
         - enable hardware timestamping for VFs
      - Solarflare/AMD:
         - conntrack NAT offload and offload for tunnels
      - Wangxun (ngbe/txgbe):
         - expose HW statistics
      - Pensando/AMD:
         - support PCI level reset
         - narrow down the condition under which skbs are linearized
      - Netronome/Corigine (nfp):
         - support CHACHA20-POLY1305 crypto in IPsec offload

   - Ethernet NICs embedded, slower, virtual:
      - Synopsys (stmmac):
         - add Loongson-1 SoC support
         - enable use of HW queues with no offload capabilities
         - enable PPS input support on all 5 channels
         - increase TX coalesce timer to 5ms
      - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
      - xen: support SW packet timestamping
      - add drivers for implementations based on TI's PRUSS (AM64x EVM)

   - nVidia/Mellanox Ethernet datacenter switches:
      - avoid poor HW resource use on Spectrum-4 by better block
        selection for IPv6 multicast forwarding and ordering of blocks
        in ACL region

   - Ethernet embedded switches:
      - Microchip:
         - support configuring the drive strength for EMI compliance
         - ksz9477: partial ACL support
         - ksz9477: HSR offload
         - ksz9477: Wake on LAN
      - Realtek:
         - rtl8366rb: respect device tree config of the CPU port

   - Ethernet PHYs:
      - support Broadcom BCM5221 PHYs
      - TI dp83867: support hardware LED blinking

   - CAN:
      - add support for Linux-PHY based CAN transceivers
      - at91_can: clean up and use rx-offload helpers

   - WiFi:
      - MediaTek (mt76):
         - new sub-driver for mt7925 USB/PCIe devices
         - HW wireless <> Ethernet bridging in MT7988 chips
         - mt7603/mt7628 stability improvements
      - Qualcomm (ath12k):
         - WCN7850:
            - enable 320 MHz channels in 6 GHz band
            - hardware rfkill support
            - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to
              make scan faster
            - read board data variant name from SMBIOS
        - QCN9274: mesh support
      - RealTek (rtw89):
         - TDMA-based multi-channel concurrency (MCC)
      - Silicon Labs (wfx):
         - Remain-On-Channel (ROC) support

   - Bluetooth:
      - ISO: many improvements for broadcast support
      - mark BCM4378/BCM4387 as BROKEN_LE_CODED
      - add support for QCA2066
      - btmtksdio: enable Bluetooth wakeup from suspend"

* tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1816 commits)
  net: pcs: xpcs: Add 2500BASE-X case in get state for XPCS drivers
  net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos()
  net: mana: Use xdp_set_features_flag instead of direct assignment
  vxlan: Cleanup IFLA_VXLAN_PORT_RANGE entry in vxlan_get_size()
  iavf: delete the iavf client interface
  iavf: add a common function for undoing the interrupt scheme
  iavf: use unregister_netdev
  iavf: rely on netdev's own registered state
  iavf: fix the waiting time for initial reset
  iavf: in iavf_down, don't queue watchdog_task if comms failed
  iavf: simplify mutex_trylock+sleep loops
  iavf: fix comments about old bit locks
  doc/netlink: Update schema to support cmd-cnt-name and cmd-max-name
  tools: ynl: introduce option to process unknown attributes or types
  ipvlan: properly track tx_errors
  netdevsim: Block until all devices are released
  nfp: using napi_build_skb() to replace build_skb()
  net: dsa: microchip: ksz9477: Fix spelling mistake "Enery" -> "Energy"
  net: dsa: microchip: Ensure Stable PME Pin State for Wake-on-LAN
  net: dsa: microchip: Refactor switch shutdown routine for WoL preparation
  ...
2023-10-31 05:10:11 -10:00
Paolo Bonzini f292dc8aad KVM x86 misc changes for 6.7:
- Add CONFIG_KVM_MAX_NR_VCPUS to allow supporting up to 4096 vCPUs without
    forcing more common use cases to eat the extra memory overhead.
 
  - Add IBPB and SBPB virtualization support.
 
  - Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
    creating the original vCPU would cause KVM to try to synchronize the vCPU's
    TSC and thus clobber the correct TSC being set by userspace.
 
  - Compute guest wall clock using a single TSC read to avoid generating an
    inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
 
  - "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
     about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
 
  - Don't apply side effects to Hyper-V's synthetic timer on writes from
    userspace to fix an issue where the auto-enable behavior can trigger
    spurious interrupts, i.e. do auto-enabling only for guest writes.
 
  - Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
    without PML enabled.
 
  - Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
 
  - Use octal notation for file permissions through KVM x86.
 
  - Fix a handful of typo fixes and warts.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmU8EugSHHNlYW5qY0Bn
 b29nbGUuY29tAAoJEGCRIgFNDBL5xS0P+gPTDO81CUZO70LrO2W4E7toRBf/F9x1
 /v5D/76p9hG32Z6+BJs/xxDxJFagw75MtoR5oKivtXiip3TxbfOyDOlaQkIRo85E
 /d95il/LRidL3Mv3TXRj1lykXnxSSz9tigAGEZti1Y9Fn9fXEIwurJH7dU5cBI1E
 fin5bsDaTNRjG4jjTiEUbnKPRTlD/S7CQJn4CaYvZhMv/eJkYDLyBBVy4VLoLzvD
 ctL6VJQLGPVxbxr9mEmulaqMrSuDIQQLkRVQJAViKyerBInTEc5d/GPCHuE8O3zi
 0r/QSJbMS9titWLz07NhJ1UH4VJNyaEhRlyJPSFhBW4h6dzUb3EXdUe0Hwa+JH/S
 H2cVqsANItTCIhvDtuEGIRDahu0eD+63h90InJ0gEVL1kSJS+UWZHB71PkUEQgAV
 2OsuT1D26fuxrv+0b9ioBZURycqKw++zGsrwyVhe77eBgqBJ12tbL4TAD+QNjaQ5
 HZTCe6YV83gZoOMeVkoTGSf96s9lGORgxsaAIXmFuLB9RVCVXhVh0ph2HZsnV8Hw
 ZXEXpBEFo7GUhb0NIvsk2W73QL87A3fLv15yITWc8KuC7/dXP9z6KpSKjFySS69X
 uWD1MVx6shhvbg97UzoJlXc3/z0aVzmdZJudE5d0gcFvAjIItqp6ICPOoKxfj8pT
 tqRZu3kVHd61
 =sfp8
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-misc-6.7' of https://github.com/kvm-x86/linux into HEAD

KVM x86 misc changes for 6.7:

 - Add CONFIG_KVM_MAX_NR_VCPUS to allow supporting up to 4096 vCPUs without
   forcing more common use cases to eat the extra memory overhead.

 - Add IBPB and SBPB virtualization support.

 - Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
   creating the original vCPU would cause KVM to try to synchronize the vCPU's
   TSC and thus clobber the correct TSC being set by userspace.

 - Compute guest wall clock using a single TSC read to avoid generating an
   inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.

 - "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
    about a "Firmware Bug" if the bit isn't set for select F/M/S combos.

 - Don't apply side effects to Hyper-V's synthetic timer on writes from
   userspace to fix an issue where the auto-enable behavior can trigger
   spurious interrupts, i.e. do auto-enabling only for guest writes.

 - Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
   without PML enabled.

 - Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.

 - Use octal notation for file permissions through KVM x86.

 - Fix a handful of typo fixes and warts.
2023-10-31 10:15:15 -04:00
Paolo Bonzini 957eedc703 KVM/riscv changes for 6.7
- Smstateen and Zicond support for Guest/VM
 - Virtualized senvcfg CSR for Guest/VM
 - Added Smstateen registers to the get-reg-list selftests
 - Added Zicond to the get-reg-list selftests
 - Virtualized SBI debug console (DBCN) for Guest/VM
 - Added SBI debug console (DBCN) to the get-reg-list selftests
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmU3dKgACgkQrUjsVaLH
 LAeh6g//RVlm3rLLwnWes3PwnBZDtkXYVVLy8oCEPjjjZGMJ+piwV8kcq4QpNear
 e1nrOJTnndkjPG9TdJIxFW0qGnPLe4D9Ag6NoFpJXwxZRy/Glr8iWbt3pOcrv6vu
 gk/9RP5gQpJpcP7D2scR+go6ryEzncCNY5GLtXoyFM9drQk+IROSrMWrWI5CXBJi
 c0qwQWsSWb3KTBTZ7Gsm0Y7E9WIi5u6bBxcLJvblUKszS+PpXO9IA+8qOGZS4Cts
 ocy5qqu+NNPECZjSa+kZti8ZSbdjNw9ORyB4OHlBUt55Uidp72pPFdTjGvSvs/jq
 6wF70i1qjZOtjVUNdlkF93EN1xu4f3Hus/6/Q1pcyq+k9vo7eeJQQRWR4bJZ9SJR
 ZkhYfOgqMUzMnrf2M20en9DygRDFstL2XW5mGDPmk9XLSNv1KwpKf96Xl2E20uPb
 21EVQSiybmpl80WkWs0txwk62qQYzYUx+Io8XKXl8MrqNN2HCnzUUy/fjy/kd7iM
 Oe1a8kNcV/eawpURBNVCcpld9qM1OPCmseqrr9tpvdR4SmAo1mAi/d7/I61umwpE
 a36fqmwIZ0ppMs86BOcMRcTaee6EjPeldXnRCTjCLqo6YvsdHRrwpNBHRMbUOKpy
 0688Xf+vjmu81IXhfaRTcs5dvKe1NI4puRtG5DUt7n1csSUyNWI=
 =TFKQ
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.7-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.7

- Smstateen and Zicond support for Guest/VM
- Virtualized senvcfg CSR for Guest/VM
- Added Smstateen registers to the get-reg-list selftests
- Added Zicond to the get-reg-list selftests
- Virtualized SBI debug console (DBCN) for Guest/VM
- Added SBI debug console (DBCN) to the get-reg-list selftests
2023-10-31 10:09:39 -04:00
Takashi Iwai 2dc15ff73b ASoC: Updates for v6.7
More updates for v6,7 following the early merge request:
 
   - Fixes for handling of component name prefixing when name prefixes
     are used by the machine driver.
   - Fixes for noise when stopping some Sounwire CODECs.
   - Support for AMD ACP 6.3 and 7.0, Awinc AW88399, more Intel
     platforms and more Qualcomm SC7180 platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmU/rAQACgkQJNaLcl1U
 h9AXVwf/SwWrxTus3+O2hS5rwusjqQBn1t2mzlnxsYNVCYGBcjOpMGL4HrJzn++e
 DJXoMyWis5FNKFWyPtKMGE1kZYdUUE/g7LpZOBew4P47nBv6SQWRvUxPfoq8mdOg
 Xb4+kjBzTq1dhwZnZPvNdsknvM7cLfx/lyo2vUJR4peDL0rzgnx72fhRZAjzh2OH
 CSz69aTjpliqikp+V7JVFYf2yma2LjTOCL2saiIF/PcxsqUUa73XTggg610EPY+R
 pOFb2MBRDbZuJrETbaiytLwtcPuvrpiHRHhxuClsjHGVTDbGfE7GzePS+yUu66y9
 LO8oAl7kJebw+WWffIOoL2IjXcG9tA==
 =nn11
 -----END PGP SIGNATURE-----

Merge tag 'asoc-v6.7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Updates for v6.7

More updates for v6,7 following the early merge request:

  - Fixes for handling of component name prefixing when name prefixes
    are used by the machine driver.
  - Fixes for noise when stopping some Sounwire CODECs.
  - Support for AMD ACP 6.3 and 7.0, Awinc AW88399, more Intel
    platforms and more Qualcomm SC7180 platforms.
2023-10-31 09:01:25 +01:00
Linus Torvalds 5a6a09e971 cgroup: Changes for v6.7
* cpuset now supports remote partitions where CPUs can be reserved for
   exclusive use down the tree without requiring all the intermediate nodes
   to be partitions. This makes it easier to use partitions without modifying
   existing cgroup hierarchy.
 
 * cpuset partition configuration behavior improvement.
 
 * cgroup_favordynmods= boot param added to allow setting the flag on boot on
   cgroup1.
 
 * Misc code and doc updates.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZUBUKA4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGWfMAP9WP+Z21qzzL2bY5I5kOxu+rD2fF9ORk7azILrI
 c3gXSQD/bdxNWcdtrCQMvs+ToNEJvqDFxrmgG5uRUF8Ea+FCtQ8=
 =gZj2
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - cpuset now supports remote partitions where CPUs can be reserved for
   exclusive use down the tree without requiring all the intermediate
   nodes to be partitions. This makes it easier to use partitions
   without modifying existing cgroup hierarchy.

 - cpuset partition configuration behavior improvement

 - cgroup_favordynmods= boot param added to allow setting the flag on
   boot on cgroup1

 - Misc code and doc updates

* tag 'cgroup-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  docs/cgroup: Add the list of threaded controllers to cgroup-v2.rst
  cgroup: use legacy_name for cgroup v1 disable info
  cgroup/cpuset: Cleanup signedness issue in cpu_exclusive_check()
  cgroup/cpuset: Enable invalid to valid local partition transition
  cgroup: add cgroup_favordynmods= command-line option
  cgroup/cpuset: Extend test_cpuset_prs.sh to test remote partition
  cgroup/cpuset: Documentation update for partition
  cgroup/cpuset: Check partition conflict with housekeeping setup
  cgroup/cpuset: Introduce remote partition
  cgroup/cpuset: Add cpuset.cpus.exclusive for v2
  cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2
  cgroup/cpuset: Fix load balance state in update_partition_sd_lb()
  cgroup: Avoid extra dereference in css_populate_dir()
  cgroup: Check for ret during cgroup1_base_files cft addition
2023-10-30 20:58:48 -10:00
Linus Torvalds befaa609f4 hardening updates for v6.7-rc1
- Add LKDTM test for stuck CPUs (Mark Rutland)
 
 - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)
 
 - Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva)
 
 - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh)
 
 - Convert group_info.usage to refcount_t (Elena Reshetova)
 
 - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)
 
 - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn)
 
 - Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook)
 
 - Fix strtomem() compile-time check for small sources (Kees Cook)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmU/3cUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsEoEACBGPSiOmfSWdH3TOnIG270PD24
 jGjg8KFv7RC/JTOdYmpLl0okdlGT9LvjN/ToSSDEw3PIayxoXUdhkbYy0MYtiV3m
 yz2ozDTzJuplQX/W2fPE+nXSzIwHao2zjPPFjHnT7lt8IIjhgjiOtLfZ2gGUkW99
 Mdu2aWh3u0r4tC8OS23++yN5ibRc5l72efsjDWjZ0aPXnxE1bjmLMiIPiizpndIf
 beasPuDBs98sJVYouemCwnsPXuXOPz3Q1Cpo/fTd+TMTJCLSemCQZCTuOBU0acI/
 ZjLCgCaJU1yIYKBMtrIN4G9kITZniXX3/Nm4o6NQMVlcCqMeNaHuflomqWoqWfhE
 UPbRo2eghZOaMNiCKLLvZDIqPrh1IcsiEl6Ef3W4hICc42GTK96IuGisIvDXwQ4N
 /SzTOupJuN42noh3z1M3XuZy5RoXJ99IYDNY5CTKf9IdqvA0bbGkU3nb1gZH/xw9
 BjTqKzR/7K1kTXuSgagDZ1Wceej9pZxhX7E3IHYsP8ZOvKug3EeL4yybVwQ3HRfq
 Qnzcp/qPB9cOkLSQXveRTFTsj2mX28Gixct/iDuc1jIYwGQlY1gI6dcUcqby6ptM
 BrQti7eR2NH2+T3aE2UVCIWsZVhx7NaSF+z8JxfAuu56jicc4xJVsi8zrNveWX5M
 m2VXyBl3121BVtKi4w==
 =0iVF
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "One of the more voluminous set of changes is for adding the new
  __counted_by annotation[1] to gain run-time bounds checking of
  dynamically sized arrays with UBSan.

   - Add LKDTM test for stuck CPUs (Mark Rutland)

   - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)

   - Refactor more 1-element arrays into flexible arrays (Gustavo A. R.
     Silva)

   - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem
     Shaikh)

   - Convert group_info.usage to refcount_t (Elena Reshetova)

   - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)

   - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas
     Bulwahn)

   - Fix randstruct GCC plugin performance mode to stay in groups (Kees
     Cook)

   - Fix strtomem() compile-time check for small sources (Kees Cook)"

* tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits)
  hwmon: (acpi_power_meter) replace open-coded kmemdup_nul
  reset: Annotate struct reset_control_array with __counted_by
  kexec: Annotate struct crash_mem with __counted_by
  virtio_console: Annotate struct port_buffer with __counted_by
  ima: Add __counted_by for struct modsig and use struct_size()
  MAINTAINERS: Include stackleak paths in hardening entry
  string: Adjust strtomem() logic to allow for smaller sources
  hardening: x86: drop reference to removed config AMD_IOMMU_V2
  randstruct: Fix gcc-plugin performance mode to stay in group
  mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by
  drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by
  irqchip/imx-intmux: Annotate struct intmux_data with __counted_by
  KVM: Annotate struct kvm_irq_routing_table with __counted_by
  virt: acrn: Annotate struct vm_memory_region_batch with __counted_by
  hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by
  sparc: Annotate struct cpuinfo_tree with __counted_by
  isdn: kcapi: replace deprecated strncpy with strscpy_pad
  isdn: replace deprecated strncpy with strscpy
  NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by
  nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by
  ...
2023-10-30 19:09:55 -10:00
Linus Torvalds 2656821f1f RCU pull request for v6.7
This pull request contains the following branches:
 
 rcu/torture: RCU torture, locktorture and generic torture infrastructure
 	updates that include various fixes, cleanups and consolidations.
 	Among the user visible things, ftrace dumps can now be found into
 	their own file, and module parameters get better documented and
 	reported on dumps.
 
 rcu/fixes: Generic and misc fixes all over the place. Some highlights:
 
 	* Hotplug handling has seen some light cleanups and comments.
 
 	* An RCU barrier can now be triggered through sysfs to serialize
 	memory stress testing and avoid OOM.
 
 	* Object information is now dumped in case of invalid callback
 	invocation.
 
 	* Also various SRCU issues, too hard to trigger to deserve urgent
 	pull requests, have been fixed.
 
 rcu/docs: RCU documentation updates
 
 rcu/refscale: RCU reference scalability test minor fixes and doc
 	improvements.
 
 rcu/tasks: RCU tasks minor fixes
 
 rcu/stall: Stall detection updates. Introduce RCU CPU Stall notifiers
 	that allows a subsystem to provide informations to help debugging.
 	Also cure some false positive stalls.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmU21h0ACgkQhSRUR1CO
 jHdUgA/+Myy5K5OxNrqlF/gIK+flOSg635RyZ0DBx8OMXZ/fAg9qRI+PKt5I4Lha
 eXAg6EtmwSgHmIbjcg8WzsvwniEsqqjOF+n1qil447fHUI2Qqw6c7fIm/MXQkeHJ
 qA7CODDRtsAnwnjmTteasmMeGV0bmXDENxhNrAZBFnVkRgTqfyDbFcn+nxOaPK6b
 fmbKvnB07WUg1KOV8/MbEtAZPb8QgHo58bXSZRKjKkiqRQWB/D3On+tShFK7SYJi
 wIqQ96MLyUXLaIWQ47v6xEO4PZO+3o1wAryvP1DRdb5UrPjO6yKFfQaoo5Mza92G
 zhBJhnXkVvCoNoCU7GKJIDV54SgDHaB6Sf1GN5cjwfujOkLuGCyg0CpKktCGm7uH
 n3X66PVep608Uj2Y/pAo/hv3Hbv7lCu4nfrERvVLG9YoxUvTJDsKmBv+SF/g2mxF
 rHqFa39HUPr1yHA5WjqOQS3lLdqCXEGKvNi6zXCvOceiDbHbiJFkBo6p8TVrbSMX
 FCOWZ3LoE+6uiLu/lLOEroTjeBd8GhDh1LgWgyVK7o0LhP1018DSBolrpcSwnmOo
 Q/E4G2x+aPWs+5NTOmMGOIPY70khKQIM3c8YZelSRffJBo6O3yV68h6X45NQxYvx
 keLvrDaza8h4hKwaof/QaX4ZJgTOZ0xjpawr1vR0hbK8LNtPrUw=
 =cVD7
 -----END PGP SIGNATURE-----

Merge tag 'rcu-next-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks

Pull RCU updates from Frederic Weisbecker:

 - RCU torture, locktorture and generic torture infrastructure updates
   that include various fixes, cleanups and consolidations.

   Among the user visible things, ftrace dumps can now be found into
   their own file, and module parameters get better documented and
   reported on dumps.

 - Generic and misc fixes all over the place. Some highlights:

     * Hotplug handling has seen some light cleanups and comments

     * An RCU barrier can now be triggered through sysfs to serialize
       memory stress testing and avoid OOM

     * Object information is now dumped in case of invalid callback
       invocation

     * Also various SRCU issues, too hard to trigger to deserve urgent
       pull requests, have been fixed

 - RCU documentation updates

 - RCU reference scalability test minor fixes and doc improvements.

 - RCU tasks minor fixes

 - Stall detection updates. Introduce RCU CPU Stall notifiers that
   allows a subsystem to provide informations to help debugging. Also
   cure some false positive stalls.

* tag 'rcu-next-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (56 commits)
  srcu: Only accelerate on enqueue time
  locktorture: Check the correct variable for allocation failure
  srcu: Fix callbacks acceleration mishandling
  rcu: Comment why callbacks migration can't wait for CPUHP_RCUTREE_PREP
  rcu: Standardize explicit CPU-hotplug calls
  rcu: Conditionally build CPU-hotplug teardown callbacks
  rcu: Remove references to rcu_migrate_callbacks() from diagrams
  rcu: Assume rcu_report_dead() is always called locally
  rcu: Assume IRQS disabled from rcu_report_dead()
  rcu: Use rcu_segcblist_segempty() instead of open coding it
  rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects
  srcu: Fix srcu_struct node grpmask overflow on 64-bit systems
  torture: Convert parse-console.sh to mktemp
  rcutorture: Traverse possible cpu to set maxcpu in rcu_nocb_toggle()
  rcutorture: Replace schedule_timeout*() 1-jiffy waits with HZ/20
  torture: Add kvm.sh --debug-info argument
  locktorture: Rename readers_bind/writers_bind to bind_readers/bind_writers
  doc: Catch-up update for locktorture module parameters
  locktorture: Add call_rcu_chains module parameter
  locktorture: Add new module parameters to lock_torture_print_module_parms()
  ...
2023-10-30 18:01:41 -10:00
Linus Torvalds c9049984f0 nolibc updates for v6.7
o	Add stdarg.h header and a few additional system-call upgrades.
 
 o	Add support for constructors and destructors.
 
 o	Add tests to verify the ability to link multiple .o files
 	against nolibc.
 
 o	Numerous string-function optimizations and improvements.
 
 o	Prevent redundant kernel relinks by avoiding embedding of
 	initramfs into the kernel image.
 
 o	Allow building i386 with multiarch compiler and make ppc64le
 	use qemu-system-ppc64.
 
 o	Miscellaneous fixups, including addition of -nostdinc for
 	nolibc-test, avoiding -Wstringop-overflow warnings, and avoiding
 	unused parameter warnings for ENOSYS fallbacks.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmU3A1ETHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jGyAEACRJX0s3JBvxRz5zKOA+l41sAN3DZ9z
 ygKIghnlNeDXQEFSqkZXb9Pa8BUPaVnFet3X5gUy8/0jcbXPPjIHU68U6EGQWk3f
 y5uTaxhTQkC+5gLyRhvq7FtjWxwYlg24D2e6ctrEw4pCt18PfkEhxof5PBhg/71K
 UVrZ55cRvXG7CTLSm5p1+jNkAOJuNfV+zD32QuV9V+7CwNLU088TZS9jGALFjKC0
 UyE8E5uvmTQ6QQOl64Z6GNhpQual/2BslIGDVtb/+/Ii5Ch2nA8gV2YiC8cVPzpz
 r8yxqSEwfmiTNDPFH6PRIAx/optfgV/uScyyCNEiLwh/gFcag04BjC9GpWy5jKzA
 akchr0n+7yfJTpzzNmM38OAoaqMgzcPedxW2RDP5Eeb4cw0AKoy7bD3WeBRfmpgl
 tAgd8Gl7vpvSjecQSZfCY1hJ4F/qS2CfnObL4/EbHxIOfyLo0A6eEKLIf9PP02bT
 w2YJkZVSprKi8CXvIaV5KAhxXUGp07FJ5PHYLFFjinez/e6ksb7AH/ltrnAULMoV
 Ig3aQCYJ4oOFFVjH+h9+fFrqbI87Xo13UfO6PtkJU3gV749prqRIAg6FCTU0HkIz
 TSvy/PEFAxaXvNSGsSXikxQxvx3Fph9laoxQ1dSgEZSXKY43v1sfnXp1h5vGDdKH
 GPWaEtEBfNVUdg==
 =UKBQ
 -----END PGP SIGNATURE-----

Merge tag 'nolibc.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull nolibc updates from Paul McKenney:

 - Add stdarg.h header and a few additional system-call upgrades

 - Add support for constructors and destructors

 - Add tests to verify the ability to link multiple .o files against
   nolibc

 - Numerous string-function optimizations and improvements

 - Prevent redundant kernel relinks by avoiding embedding of initramfs
   into the kernel image

 - Allow building i386 with multiarch compiler and make ppc64le use
   qemu-system-ppc64

 - Miscellaneous fixups, including addition of -nostdinc for
   nolibc-test, avoiding -Wstringop-overflow warnings, and avoiding
   unused parameter warnings for ENOSYS fallbacks

* tag 'nolibc.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  selftests/nolibc: add tests for multi-object linkage
  selftests/nolibc: use qemu-system-ppc64 for ppc64le
  tools/nolibc: add support for constructors and destructors
  tools/nolibc: drop test for getauxval(AT_PAGESZ)
  tools/nolibc: automatically detect necessity to use pselect6
  tools/nolibc: don't define new syscall number
  tools/nolibc: avoid unused parameter warnings for ENOSYS fallbacks
  selftests/nolibc: allow building i386 with multiarch compiler
  selftests/nolibc: don't embed initramfs into kernel image
  selftests/nolibc: libc-test: avoid -Wstringop-overflow warnings
  tools/nolibc: string: Remove the `_nolibc_memcpy_up()` function
  tools/nolibc: string: Remove the `_nolibc_memcpy_down()` function
  tools/nolibc: x86-64: Use `rep stosb` for `memset()`
  tools/nolibc: x86-64: Use `rep movsb` for `memcpy()` and `memmove()`
  selftests/nolibc: use -nostdinc for nolibc-test
  tools/nolibc: add stdarg.h header
2023-10-30 17:52:45 -10:00
Linus Torvalds f0d25b5d0f x86 MM handling code changes for v6.7:
- Add new NX-stack self-test
  - Improve NUMA partial-CFMWS handling
  - Fix #VC handler bugs resulting in SEV-SNP boot failures
  - Drop the 4MB memory size restriction on minimal NUMA nodes
  - Reorganize headers a bit, in preparation to header dependency reduction efforts
  - Misc cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU9Ek4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gIJQ/+Mg6mzMaThyNXqhJszeZJBmDaBv2sqjAB
 5tcferg1nJBdNBzX8bJ95UFt9fIqeYAcgH00qlQCYSmyzbC1TQTk9U2Pre1zbOw4
 042ONK8sygKSje1zdYleHoBeqwnxD2VNM0NwBElhGjumwHRng/tbLiI9wx6qiz+C
 VsFXavkBszHGA1pjy9wZLGixYIH5jCygMpH134Wp+CIhpS+C4nftcGdIL1D5Oil1
 6Tm2XeI6uyfiQhm9IOwDjfoYeC7gUjx1rp8rHseGUMJxyO/BX9q5j1ixbsVriqfW
 97ucYuRL9mza7ic516C9v7OlAA3AGH2xWV+SYOGK88i9Co4kYzP4WnamxXqOsD8+
 popxG55oa6QelhaouTBZvgERpZ4fWupSDs/UccsDaE9leMCerNEbGHEzt/Mm/2sw
 xopjMQ0y5Kn6/fS0dLv8U+XHu4ANkvXJkFd6Ny0h/WfgGefuQOOTG9ruYgfeqqB8
 dViQ4R7CO8ySjD45KawAZl/EqL86x1M/CI1nlt0YY4vNwUuOJbebL7Jn8w3Fjxm5
 FVfUlDmcPdhZfL9Vnrsi6MIou1cU1yJPw4D6sXJ4sg4s7A4ebBcRRrjayVQ4msjv
 Q7cvBOMnWEHhOV11pvP50FmQuj74XW3bUqiuWrnK1SypvnhHavF6kc1XYpBLs1xZ
 y8nueJW2qPw=
 =tT5F
 -----END PGP SIGNATURE-----

Merge tag 'x86-mm-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mm handling updates from Ingo Molnar:

 - Add new NX-stack self-test

 - Improve NUMA partial-CFMWS handling

 - Fix #VC handler bugs resulting in SEV-SNP boot failures

 - Drop the 4MB memory size restriction on minimal NUMA nodes

 - Reorganize headers a bit, in preparation to header dependency
   reduction efforts

 - Misc cleanups & fixes

* tag 'x86-mm-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size
  selftests/x86/lam: Zero out buffer for readlink()
  x86/sev: Drop unneeded #include
  x86/sev: Move sev_setup_arch() to mem_encrypt.c
  x86/tdx: Replace deprecated strncpy() with strtomem_pad()
  selftests/x86/mm: Add new test that userspace stack is in fact NX
  x86/sev: Make boot_ghcb_page[] static
  x86/boot: Move x86_cache_alignment initialization to correct spot
  x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach
  x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot
  x86_64: Show CR4.PSE on auxiliaries like on BSP
  x86/iommu/docs: Update AMD IOMMU specification document URL
  x86/sev/docs: Update document URL in amd-memory-encryption.rst
  x86/mm: Move arch_memory_failure() and arch_is_platform_page() definitions from <asm/processor.h> to <asm/pgtable.h>
  ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window
  x86/numa: Introduce numa_fill_memblks()
2023-10-30 15:40:57 -10:00
Oliver Upton 123f42f0ad Merge branch kvm-arm64/pmu_pmcr_n into kvmarm/next
* kvm-arm64/pmu_pmcr_n:
  : User-defined PMC limit, courtesy Raghavendra Rao Ananta
  :
  : Certain VMMs may want to reserve some PMCs for host use while running a
  : KVM guest. This was a bit difficult before, as KVM advertised all
  : supported counters to the guest. Userspace can now limit the number of
  : advertised PMCs by writing to PMCR_EL0.N, as KVM's sysreg and PMU
  : emulation enforce the specified limit for handling guest accesses.
  KVM: selftests: aarch64: vPMU test for validating user accesses
  KVM: selftests: aarch64: vPMU register test for unimplemented counters
  KVM: selftests: aarch64: vPMU register test for implemented counters
  KVM: selftests: aarch64: Introduce vpmu_counter_access test
  tools: Import arm_pmuv3.h
  KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
  KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
  KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
  KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
  KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
  KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handler
  KVM: arm64: PMU: Introduce helpers to set the guest's PMU

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:24:19 +00:00
Oliver Upton a87a36436c Merge branch kvm-arm64/writable-id-regs into kvmarm/next
* kvm-arm64/writable-id-regs:
  : Writable ID registers, courtesy of Jing Zhang
  :
  : This series significantly expands the architectural feature set that
  : userspace can manipulate via the ID registers. A new ioctl is defined
  : that makes the mutable fields in the ID registers discoverable to
  : userspace.
  KVM: selftests: Avoid using forced target for generating arm64 headers
  tools headers arm64: Fix references to top srcdir in Makefile
  KVM: arm64: selftests: Test for setting ID register from usersapce
  tools headers arm64: Update sysreg.h with kernel sources
  KVM: selftests: Generate sysreg-defs.h and add to include path
  perf build: Generate arm64's sysreg-defs.h and add to include path
  tools: arm64: Add a Makefile for generating sysreg-defs.h
  KVM: arm64: Document vCPU feature selection UAPIs
  KVM: arm64: Allow userspace to change ID_AA64ZFR0_EL1
  KVM: arm64: Allow userspace to change ID_AA64PFR0_EL1
  KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1
  KVM: arm64: Allow userspace to change ID_AA64ISAR{0-2}_EL1
  KVM: arm64: Bump up the default KVM sanitised debug version to v8p8
  KVM: arm64: Reject attempts to set invalid debug arch version
  KVM: arm64: Advertise selected DebugVer in DBGDIDR.Version
  KVM: arm64: Use guest ID register values for the sake of emulation
  KVM: arm64: Document KVM_ARM_GET_REG_WRITABLE_MASKS
  KVM: arm64: Allow userspace to get the writable masks for feature ID registers

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:21:09 +00:00
Oliver Upton 70c7b704ca KVM: selftests: Avoid using forced target for generating arm64 headers
The 'prepare' target that generates the arm64 sysreg headers had no
prerequisites, so it wound up forcing a rebuild of all KVM selftests
each invocation. Add a rule for the generated headers and just have
dependents use that for a prerequisite.

Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: 9697d84cc3 ("KVM: selftests: Generate sysreg-defs.h and add to include path")
Tested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Link: https://lore.kernel.org/r/20231027005439.3142015-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:20:39 +00:00
Oliver Upton 054056bf98 Merge branch kvm-arm64/misc into kvmarm/next
* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Put an upper bound on the number of I-cache invalidations by
  :    cacheline to avoid soft lockups
  :
  :  - Get rid of bogus refererence count transfer for THP mappings
  :
  :  - Do a local TLB invalidation on permission fault race
  :
  :  - Fixes for page_fault_test KVM selftest
  :
  :  - Add a tracepoint for detecting MMIO instructions unsupported by KVM
  KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
  KVM: arm64: selftest: Perform ISB before reading PAR_EL1
  KVM: arm64: selftest: Add the missing .guest_prepare()
  KVM: arm64: Always invalidate TLB for stage-2 permission faults
  KVM: arm64: Do not transfer page refcount for THP adjustment
  KVM: arm64: Avoid soft lockups due to I-cache maintenance
  arm64: tlbflush: Rename MAX_TLBI_OPS
  KVM: arm64: Don't use kerneldoc comment for arm64_check_features()

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:00 +00:00
Zenghui Yu 06899aa5dd KVM: arm64: selftest: Perform ISB before reading PAR_EL1
It looks like a mistake to issue ISB *after* reading PAR_EL1, we should
instead perform it between the AT instruction and the reads of PAR_EL1.

As according to DDI0487J.a IJTYVP,

"When an address translation instruction is executed, explicit
 synchronization is required to guarantee the result is visible to
 subsequent direct reads of PAR_EL1."

Otherwise all guest_at testcases fail on my box with

==== Test Assertion Failure ====
  aarch64/page_fault_test.c:142: par & 1 == 0
  pid=1355864 tid=1355864 errno=4 - Interrupted system call
     1	0x0000000000402853: vcpu_run_loop at page_fault_test.c:681
     2	0x0000000000402cdb: run_test at page_fault_test.c:730
     3	0x0000000000403897: for_each_guest_mode at guest_modes.c:100
     4	0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1105
     5	 (inlined by) main at page_fault_test.c:1131
     6	0x0000ffffb153c03b: ?? ??:0
     7	0x0000ffffb153c113: ?? ??:0
     8	0x0000000000401aaf: _start at ??:?
  0x1 != 0x0 (par & 1 != 0)

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-2-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:12:46 +00:00
Zenghui Yu beaf35b480 KVM: arm64: selftest: Add the missing .guest_prepare()
Running page_fault_test on a Cortex A72 fails with

Test: ro_memslot_no_syndrome_guest_cas
Testing guest mode: PA-bits:40,  VA-bits:48,  4K pages
Testing memory backing src type: anonymous
==== Test Assertion Failure ====
  aarch64/page_fault_test.c:117: guest_check_lse()
  pid=1944087 tid=1944087 errno=4 - Interrupted system call
     1	0x00000000004028b3: vcpu_run_loop at page_fault_test.c:682
     2	0x0000000000402d93: run_test at page_fault_test.c:731
     3	0x0000000000403957: for_each_guest_mode at guest_modes.c:100
     4	0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1108
     5	 (inlined by) main at page_fault_test.c:1134
     6	0x0000ffff868e503b: ?? ??:0
     7	0x0000ffff868e5113: ?? ??:0
     8	0x0000000000401aaf: _start at ??:?
  guest_check_lse()

because we don't have a guest_prepare stage to check the presence of
FEAT_LSE and skip the related guest_cas testing, and we end-up failing in
GUEST_ASSERT(guest_check_lse()).

Add the missing .guest_prepare() where it's indeed required.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:12:46 +00:00
Robert Richter f611d98a00 cxl/pci: Remove Component Register base address from struct cxl_dev_state
The Component Register base address @component_reg_phys is no longer
used after the rework of the Component Register setup which now uses
struct member @reg_map instead. Remove the base address.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20231018171713.1883517-9-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 20:13:37 -07:00
Vishal Verma 8f61d48c83 tools/testing/cxl: Slow down the mock firmware transfer
The cxl-cli unit test for firmware update does operations like starting
an asynchronous firmware update, making sure it is in progress, and
attempting to cancel it. In some cases, such as with no or minimal
dynamic debugging turned on, the firmware update completes too quickly,
not allowing the test to have a chance to verify it was in progress.
This caused a failure of the signature:

  expected fw_update_in_progress:true
  test/cxl-update-firmware.sh: failed at line 88

Fix this by adding a delay (~1.5 - 2 ms) to each firmware transfer
request handled by the mocked interface.

Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20231026-vv-fw_upd_test_fix-v2-1-5282fd193883@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 13:04:52 -07:00
Jim Harris 98a04c7ace cxl/region: Fix x1 root-decoder granularity calculations
Root decoder granularity must match value from CFWMS, which may not
be the region's granularity for non-interleaved root decoders.

So when calculating granularities for host bridge decoders, use the
region's granularity instead of the root decoder's granularity to ensure
the correct granularities are set for the host bridge decoders and any
downstream switch decoders.

Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.

Region created with 2048 granularity using following command line:

cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
		  -g 2048 -s 2048M

Use "cxl list -PDE | grep granularity" to get a view of the granularity
set at each level of the topology.

Before this patch:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
"interleave_granularity":256,

After:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
"interleave_granularity":2048,

Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in
[djbw: fixup the prebuilt cxl_test region]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 13:04:52 -07:00
Mickaël Salaün f12f8f8450
selftests/landlock: Add tests for FS topology changes with network rules
Add 2 tests to the layout1 fixture:
* topology_changes_with_net_only: Checks that FS topology
  changes are not denied by network-only restrictions.
* topology_changes_with_net_and_fs: Make sure that FS topology
  changes are still denied with FS and network restrictions.

This specifically test commit d722036403 ("landlock: Allow FS topology
changes for domains without such rule type").

Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231027154615.815134-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2023-10-27 17:53:31 +02:00
Geliang Tang 629b35a225 selftests: mptcp: display simult in extra_msg
Just like displaying "invert" after "Info: ", "simult" should be
displayed too when rm_subflow_nr doesn't match the expect value in
chk_rm_nr():

      syn                                 [ ok ]
      synack                              [ ok ]
      ack                                 [ ok ]
      add                                 [ ok ]
      echo                                [ ok ]
      rm                                  [ ok ]
      rmsf                                [ ok ] 3 in [2:4]
      Info: invert simult

      syn                                 [ ok ]
      synack                              [ ok ]
      ack                                 [ ok ]
      add                                 [ ok ]
      echo                                [ ok ]
      rm                                  [ ok ]
      rmsf                                [ ok ]
      Info: invert

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-10-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27 08:47:30 -07:00
Geliang Tang e71aab6777 selftests: mptcp: sockopt: drop mptcp_connect var
Global var mptcp_connect defined at the front of mptcp_sockopt.sh is
duplicate with local var mptcp_connect defined in do_transfer(), drop
this useless global one.

Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-9-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27 08:47:30 -07:00
Geliang Tang 9168ea02b8 selftests: mptcp: fix wait_rm_addr/sf parameters
The second input parameter of 'wait_rm_addr/sf $1 1' is misused. If it's
1, wait_rm_addr/sf will never break, and will loop ten times, then
'wait_rm_addr/sf' equals to 'sleep 1'. This delay time is too long,
which can sometimes make the tests fail.

A better way to use wait_rm_addr/sf is to use rm_addr/sf_count to obtain
the current value, and then pass into wait_rm_addr/sf.

Fixes: 4369c198e5 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-2-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27 08:47:07 -07:00
Geliang Tang f4a75e9d11 selftests: mptcp: run userspace pm tests slower
Some userspace pm tests failed are reported by CI:

112 userspace pm add & remove address
      syn                                 [ ok ]
      synack                              [ ok ]
      ack                                 [ ok ]
      add                                 [ ok ]
      echo                                [ ok ]
      mptcp_info subflows=1:1             [ ok ]
      subflows_total 2:2                  [ ok ]
      mptcp_info add_addr_signal=1:1      [ ok ]
      rm                                  [ ok ]
      rmsf                                [ ok ]
      Info: invert
      mptcp_info subflows=0:0             [ ok ]
      subflows_total 1:1                  [fail]
                         got subflows 0:0 expected 1:1
Server ns stats
TcpPassiveOpens                 2                  0.0
TcpInSegs                       118                0.0

This patch fixes them by changing 'speed' to 5 to run the tests much more
slowly.

Fixes: 4369c198e5 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-1-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27 08:47:07 -07:00
Ido Schimmel 0514dd0593 selftests: vxlan_mdb: Use MDB get instead of dump
Test the new MDB get functionality by converting dump and grep to MDB
get.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27 10:51:42 +01:00
Ido Schimmel e8bba9e83c selftests: bridge_mdb: Use MDB get instead of dump
Test the new MDB get functionality by converting dump and grep to MDB
get.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27 10:51:42 +01:00
Jakub Kicinski c6f9b7138b bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZTp12QAKCRDbK58LschI
 g8BrAQDifqp5liEEdXV8jdReBwJtqInjrL5tzy5LcyHUMQbTaAEA6Ph3Ct3B+3oA
 mFnIW/y6UJiJrby0Xz4+vV5BXI/5WQg=
 =pLCV
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-10-26

We've added 51 non-merge commits during the last 10 day(s) which contain
a total of 75 files changed, 5037 insertions(+), 200 deletions(-).

The main changes are:

1) Add open-coded task, css_task and css iterator support.
   One of the use cases is customizable OOM victim selection via BPF,
   from Chuyi Zhou.

2) Fix BPF verifier's iterator convergence logic to use exact states
   comparison for convergence checks, from Eduard Zingerman,
   Andrii Nakryiko and Alexei Starovoitov.

3) Add BPF programmable net device where bpf_mprog defines the logic
   of its xmit routine. It can operate in L3 and L2 mode,
   from Daniel Borkmann and Nikolay Aleksandrov.

4) Batch of fixes for BPF per-CPU kptr and re-enable unit_size checking
   for global per-CPU allocator, from Hou Tao.

5) Fix libbpf which eagerly assumed that SHT_GNU_verdef ELF section
   was going to be present whenever a binary has SHT_GNU_versym section,
   from Andrii Nakryiko.

6) Fix BPF ringbuf correctness to fold smp_mb__before_atomic() into
   atomic_set_release(), from Paul E. McKenney.

7) Add a warning if NAPI callback missed xdp_do_flush() under
   CONFIG_DEBUG_NET which helps checking if drivers were missing
   the former, from Sebastian Andrzej Siewior.

8) Fix missed RCU read-lock in bpf_task_under_cgroup() which was throwing
   a warning under sleepable programs, from Yafang Shao.

9) Avoid unnecessary -EBUSY from htab_lock_bucket by disabling IRQ before
   checking map_locked, from Song Liu.

10) Make BPF CI linked_list failure test more robust,
    from Kumar Kartikeya Dwivedi.

11) Enable samples/bpf to be built as PIE in Fedora, from Viktor Malik.

12) Fix xsk starving when multiple xsk sockets were associated with
    a single xsk_buff_pool, from Albert Huang.

13) Clarify the signed modulo implementation for the BPF ISA standardization
    document that it uses truncated division, from Dave Thaler.

14) Improve BPF verifier's JEQ/JNE branch taken logic to also consider
    signed bounds knowledge, from Andrii Nakryiko.

15) Add an option to XDP selftests to use multi-buffer AF_XDP
    xdp_hw_metadata and mark used XDP programs as capable to use frags,
    from Larysa Zaremba.

16) Fix bpftool's BTF dumper wrt printing a pointer value and another
    one to fix struct_ops dump in an array, from Manu Bretelle.

* tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (51 commits)
  netkit: Remove explicit active/peer ptr initialization
  selftests/bpf: Fix selftests broken by mitigations=off
  samples/bpf: Allow building with custom bpftool
  samples/bpf: Fix passing LDFLAGS to libbpf
  samples/bpf: Allow building with custom CFLAGS/LDFLAGS
  bpf: Add more WARN_ON_ONCE checks for mismatched alloc and free
  selftests/bpf: Add selftests for netkit
  selftests/bpf: Add netlink helper library
  bpftool: Extend net dump with netkit progs
  bpftool: Implement link show support for netkit
  libbpf: Add link-based API for netkit
  tools: Sync if_link uapi header
  netkit, bpf: Add bpf programmable net device
  bpf: Improve JEQ/JNE branch taken logic
  bpf: Fold smp_mb__before_atomic() into atomic_set_release()
  bpf: Fix unnecessary -EBUSY from htab_lock_bucket
  xsk: Avoid starving the xsk further down the list
  bpf: print full verifier states on infinite loop detection
  selftests/bpf: test if state loops are detected in a tricky case
  bpf: correct loop detection for iterators convergence
  ...
====================

Link: https://lore.kernel.org/r/20231026150509.2824-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-26 20:02:41 -07:00
Jakub Kicinski ec4c20ca09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/mac80211/rx.c
  91535613b6 ("wifi: mac80211: don't drop all unprotected public action frames")
  6c02fab724 ("wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value")

Adjacent changes:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c
  61471264c0 ("net: ethernet: apm: Convert to platform remove callback returning void")
  d2ca43f306 ("net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF")

net/vmw_vsock/virtio_transport.c
  64c99d2d6a ("vsock/virtio: support to send non-linear skb")
  53b08c4985 ("vsock/virtio: initialize the_virtio_vsock before using VQs")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-26 13:46:28 -07:00
Konstantin Meskhidze a549d055a2
selftests/landlock: Add network tests
Add 82 test suites to check edge cases related to bind() and connect()
actions. They are defined with 6 fixtures and their variants:

The "protocol" fixture is extended with 12 variants defined as a matrix
of: sandboxed/not-sandboxed, IPv4/IPv6/unix network domain, and
stream/datagram socket. 4 related tests suites are defined:
* bind: Tests bind action.
* connect: Tests connect action.
* bind_unspec: Tests bind action with the AF_UNSPEC socket family.
* connect_unspec: Tests connect action with the AF_UNSPEC socket family.

The "ipv4" fixture is extended with 4 variants defined as a matrix
of: sandboxed/not-sandboxed, and stream/datagram socket. 1 related test
suite is defined:
* from_unix_to_inet: Tests to make sure unix sockets' actions are not
  restricted by Landlock rules applied to TCP ones.

The "tcp_layers" fixture is extended with 8 variants defined as a matrix
of: IPv4/IPv6 network domain, and different number of landlock rule
layers. 2 related tests suites are defined:
* ruleset_overlap: Tests nested layers with less constraints.
* ruleset_expand: Tests nested layers with more constraints.

In the "mini" fixture 4 tests suites are defined:
* network_access_rights: Tests handling of known access rights.
* unknown_access_rights: Tests handling of unknown access rights.
* inval: Tests unhandled allowed access and zero access value.
* tcp_port_overflow: Tests with port values greater than 65535.

The "ipv4_tcp" fixture supports IPv4 network domain with stream socket.
2 tests suites are defined:
* port_endianness: Tests with big/little endian port formats.
* with_fs: Tests a ruleset with both filesystem and network
  restrictions.

The "port_specific" fixture is extended with 4 variants defined
as a matrix of: sandboxed/not-sandboxed, IPv4/IPv6 network domain,
and stream socket. 2 related tests suites are defined:
* bind_connect_zero: Tests with port 0.
* bind_connect_1023: Tests with port 1023.

Test coverage for security/landlock is 92.4% of 710 lines according to
gcc/gcov-13.

Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231026014751.414649-11-konstantin.meskhidze@huawei.com
[mic: Extend commit message, update test coverage, clean up capability
use, fix useless TEST_F_FORK, and improve ipv4_tcp.with_fs]
Co-developed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2023-10-26 21:07:16 +02:00
Konstantin Meskhidze 1fa335209f
selftests/landlock: Share enforce_ruleset() helper
Move enforce_ruleset() helper function to common.h so that it can be
used both by filesystem tests and network ones.

Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231026014751.414649-10-konstantin.meskhidze@huawei.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2023-10-26 21:07:15 +02:00
Konstantin Meskhidze fff69fb03d
landlock: Support network rules with TCP bind and connect
Add network rules support in the ruleset management helpers and the
landlock_create_ruleset() syscall. Extend user space API to support
network actions:
* Add new network access rights: LANDLOCK_ACCESS_NET_BIND_TCP and
  LANDLOCK_ACCESS_NET_CONNECT_TCP.
* Add a new network rule type: LANDLOCK_RULE_NET_PORT tied to struct
  landlock_net_port_attr. The allowed_access field contains the network
  access rights, and the port field contains the port value according to
  the controlled protocol. This field can take up to a 64-bit value
  but the maximum value depends on the related protocol (e.g. 16-bit
  value for TCP). Network port is in host endianness [1].
* Add a new handled_access_net field to struct landlock_ruleset_attr
  that contains network access rights.
* Increment the Landlock ABI version to 4.

Implement socket_bind() and socket_connect() LSM hooks, which enable
to control TCP socket binding and connection to specific ports.

Expand access_masks_t from u16 to u32 to be able to store network access
rights alongside filesystem access rights for rulesets' handled access
rights.

Access rights are not tied to socket file descriptors but checked at
bind() or connect() call time against the caller's Landlock domain. For
the filesystem, a file descriptor is a direct access to a file/data.
However, for network sockets, we cannot identify for which data or peer
a newly created socket will give access to. Indeed, we need to wait for
a connect or bind request to identify the use case for this socket.
Likewise a directory file descriptor may enable to open another file
(i.e. a new data item), but this opening is also restricted by the
caller's domain, not the file descriptor's access rights [2].

[1] https://lore.kernel.org/r/278ab07f-7583-a4e0-3d37-1bacd091531d@digikod.net
[2] https://lore.kernel.org/r/263c1eb3-602f-57fe-8450-3f138581bee7@digikod.net

Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231026014751.414649-9-konstantin.meskhidze@huawei.com
[mic: Extend commit message, fix typo in comments, and specify
endianness in the documentation]
Co-developed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2023-10-26 21:07:15 +02:00
Catalin Marinas 2baca17e6a Merge branch 'for-next/feat_lse128' into for-next/core
* for-next/feat_lse128:
  : HWCAP for FEAT_LSE128
  kselftest/arm64: add FEAT_LSE128 to hwcap test
  arm64: add FEAT_LSE128 HWCAP
2023-10-26 17:10:07 +01:00
Catalin Marinas 023113fe66 Merge branch 'for-next/feat_lrcpc3' into for-next/core
* for-next/feat_lrcpc3:
  : HWCAP for FEAT_LRCPC3
  selftests/arm64: add HWCAP2_LRCPC3 test
  arm64: add FEAT_LRCPC3 HWCAP
2023-10-26 17:10:05 +01:00
Catalin Marinas 2a3f8ce3bb Merge branch 'for-next/feat_sve_b16b16' into for-next/core
* for-next/feat_sve_b16b16:
  : Add support for FEAT_SVE_B16B16 (BFloat16)
  kselftest/arm64: Verify HWCAP2_SVE_B16B16
  arm64/sve: Report FEAT_SVE_B16B16 to userspace
2023-10-26 17:10:01 +01:00
Nicolin Chen 55a01657cb iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs
The IOMMU_HWPT_ALLOC ioctl now supports passing user_data to allocate a
user-managed domain for nested HWPTs. Add its coverage for that. Also,
update _test_cmd_hwpt_alloc() and add test_cmd/err_hwpt_alloc_nested().

Link: https://lore.kernel.org/r/20231026043938.63898-11-yi.l.liu@intel.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-26 11:15:57 -03:00
Yafang Shao 399f6185a1 selftests/bpf: Fix selftests broken by mitigations=off
When we configure the kernel command line with 'mitigations=off' and set
the sysctl knob 'kernel.unprivileged_bpf_disabled' to 0, the commit
bc5bc309db ("bpf: Inherit system settings for CPU security mitigations")
causes issues in the execution of `test_progs -t verifier`. This is
because 'mitigations=off' bypasses Spectre v1 and Spectre v4 protections.

Currently, when a program requests to run in unprivileged mode
(kernel.unprivileged_bpf_disabled = 0), the BPF verifier may prevent
it from running due to the following conditions not being enabled:

  - bypass_spec_v1
  - bypass_spec_v4
  - allow_ptr_leaks
  - allow_uninit_stack

While 'mitigations=off' enables the first two conditions, it does not
enable the latter two. As a result, some test cases in
'test_progs -t verifier' that were expected to fail to run may run
successfully, while others still fail but with different error messages.
This makes it challenging to address them comprehensively.

Moreover, in the future, we may introduce more fine-grained control over
CPU mitigations, such as enabling only bypass_spec_v1 or bypass_spec_v4.

Given the complexity of the situation, rather than fixing each broken test
case individually, it's preferable to skip them when 'mitigations=off' is
in effect and introduce specific test cases for the new 'mitigations=off'
scenario. For instance, we can introduce new BTF declaration tags like
'__failure__nospec', '__failure_nospecv1' and '__failure_nospecv4'.

In this patch, the approach is to simply skip the broken test cases when
'mitigations=off' is enabled. The result of `test_progs -t verifier` as
follows after this commit,

Before this commit
==================

- without 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 74/1336 PASSED, 0 SKIPPED, 0 FAILED    <<<<
- with 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 63/1276 PASSED, 0 SKIPPED, 11 FAILED   <<<< 11 FAILED

After this commit
=================

- without 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 74/1336 PASSED, 0 SKIPPED, 0 FAILED    <<<<
- with this patch, with 'mitigations=off'
  - kernel.unprivileged_bpf_disabled = 2
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
  - kernel.unprivileged_bpf_disabled = 0
    Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED   <<<< SKIPPED

Fixes: bc5bc309db ("bpf: Inherit system settings for CPU security mitigations")
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Closes: https://lore.kernel.org/bpf/CAADnVQKUBJqg+hHtbLeeC2jhoJAWqnmRAzXW3hmUCNSV9kx4sQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20231025031144.5508-1-laoar.shao@gmail.com
2023-10-26 15:42:03 +02:00
Daniel Borkmann ace15f91e5 selftests/bpf: Add selftests for netkit
Add a bigger batch of test coverage to assert correct operation of
netkit devices and their BPF program management:

  # ./test_progs -t tc_netkit
  [...]
  [    1.166267] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.166831] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.270957] tsc: Refined TSC clocksource calibration: 3407.988 MHz
  [    1.272579] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc932722, max_idle_ns: 440795381586 ns
  [    1.275336] clocksource: Switched to clocksource tsc
  #257     tc_netkit_basic:OK
  #258     tc_netkit_device:OK
  #259     tc_netkit_multi_links:OK
  #260     tc_netkit_multi_opts:OK
  #261     tc_netkit_neigh_links:OK
  Summary: 5/0 PASSED, 0 SKIPPED, 0 FAILED
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20231024214904.29825-8-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-24 16:07:43 -07:00
Daniel Borkmann 51f1892b52 selftests/bpf: Add netlink helper library
Add a minimal netlink helper library for the BPF selftests. This has been
taken and cut down and cleaned up from iproute2. This covers basics such
as netdevice creation which we need for BPF selftests / BPF CI given
iproute2 package cannot cover it yet.

Stanislav Fomichev suggested that this could be replaced in future by ynl
tool generated C code once it has RTNL support to create devices. Once we
get to this point the BPF CI would also need to add libmnl. If no further
extensions are needed, a second option could be that we remove this code
again once iproute2 package has support.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20231024214904.29825-7-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-24 16:07:39 -07:00
Raghavendra Rao Ananta 62708be351 KVM: selftests: aarch64: vPMU test for validating user accesses
Add a vPMU test scenario to validate the userspace accesses for
the registers PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} to ensure
that KVM honors the architectural definitions of these registers
for a given PMCR.N.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-13-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Reiji Watanabe e1cc872063 KVM: selftests: aarch64: vPMU register test for unimplemented counters
Add a new test case to the vpmu_counter_access test to check
if PMU registers or their bits for unimplemented counters are not
accessible or are RAZ, as expected.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-12-rananta@google.com
[Oliver: fix issues relating to exception return address]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Reiji Watanabe ada1ae6826 KVM: selftests: aarch64: vPMU register test for implemented counters
Add a new test case to the vpmu_counter_access test to check if PMU
registers or their bits for implemented counters on the vCPU are
readable/writable as expected, and can be programmed to count events.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-11-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Reiji Watanabe 8d0aebe1ca KVM: selftests: aarch64: Introduce vpmu_counter_access test
Introduce vpmu_counter_access test for arm64 platforms.
The test configures PMUv3 for a vCPU, sets PMCR_EL0.N for the vCPU,
and check if the guest can consistently see the same number of the
PMU event counters (PMCR_EL0.N) that userspace sets.
This test case is done with each of the PMCR_EL0.N values from
0 to 31 (With the PMCR_EL0.N values greater than the host value,
the test expects KVM_SET_ONE_REG for the PMCR_EL0 to fail).

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-10-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Swarup Laxman Kotiaklapudi 37a38e439d selftests: net: change ifconfig with ip command
Change ifconfig with ip command, on a system where ifconfig is
not used this script will not work correcly.

Test result with this patchset:

sudo make TARGETS="net" kselftest
....
TAP version 13
1..1
 timeout set to 1500
 selftests: net: route_localnet.sh
 run arp_announce test
 net.ipv4.conf.veth0.route_localnet = 1
 net.ipv4.conf.veth1.route_localnet = 1
 net.ipv4.conf.veth0.arp_announce = 2
 net.ipv4.conf.veth1.arp_announce = 2
 PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
  bytes of data.
 64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.038 ms
 64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.068 ms
 64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.068 ms
 64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.068 ms
 64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.068 ms

 --- 127.25.3.14 ping statistics ---
 5 packets transmitted, 5 received, 0% packet loss, time 4073ms
 rtt min/avg/max/mdev = 0.038/0.062/0.068/0.012 ms
 ok
 run arp_ignore test
 net.ipv4.conf.veth0.route_localnet = 1
 net.ipv4.conf.veth1.route_localnet = 1
 net.ipv4.conf.veth0.arp_ignore = 3
 net.ipv4.conf.veth1.arp_ignore = 3
 PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
  bytes of data.
 64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.032 ms
 64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.065 ms
 64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.066 ms
 64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.065 ms
 64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.065 ms

 --- 127.25.3.14 ping statistics ---
 5 packets transmitted, 5 received, 0% packet loss, time 4092ms
 rtt min/avg/max/mdev = 0.032/0.058/0.066/0.013 ms
 ok
ok 1 selftests: net: route_localnet.sh
...

Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231023123422.2895-1-swarupkotikalapudi@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24 13:53:39 -07:00
Linus Torvalds 4f82870119 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues
or aren't considered necessary for earlier kernel versions.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZTfz/QAKCRDdBJ7gKXxA
 joMyAP99hLaLYeJbjlf+4tLJzhlpbVoFra1ieun2D+ZgFE78xQD/T4T3PYrZhYqD
 WdrxGT9fiKOykXM5pmQRH9Zr4EvJBA0=
 =Obbk
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "20 hotfixes. 12 are cc:stable and the remainder address post-6.5
  issues or aren't considered necessary for earlier kernel versions"

* tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()
  selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
  mailmap: correct email aliasing for Oleksij Rempel
  mailmap: map Bartosz's old address to the current one
  mm/damon/sysfs: check DAMOS regions update progress from before_terminate()
  MAINTAINERS: Ondrej has moved
  kasan: disable kasan_non_canonical_hook() for HW tags
  kasan: print the original fault addr when access invalid shadow
  hugetlbfs: close race between MADV_DONTNEED and page fault
  hugetlbfs: extend hugetlb_vma_lock to private VMAs
  hugetlbfs: clear resv_map pointer if mmap fails
  mm: zswap: fix pool refcount bug around shrink_worker()
  mm/migrate: fix do_pages_move for compat pointers
  riscv: fix set_huge_pte_at() for NAPOT mappings when a swap entry is set
  riscv: handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of panicking
  mmap: fix error paths with dup_anon_vma()
  mmap: fix vma_iterator in error path of vma_merge()
  mm: fix vm_brk_flags() to not bail out while holding lock
  mm/mempolicy: fix set_mempolicy_home_node() previous VMA pointer
  mm/page_alloc: correct start page when guard page debug is enabled
2023-10-24 09:52:16 -10:00
Joao Martins 0795b305da iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag
Change test_mock_dirty_bitmaps() to pass a flag where it specifies the flag
under test. The test does the same thing as the GET_DIRTY_BITMAP regular
test. Except that it tests whether the dirtied bits are fetched all the
same a second time, as opposed to observing them cleared.

Link: https://lore.kernel.org/r/20231024135109.73787-19-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins ae36fe70ce iommufd/selftest: Test out_capabilities in IOMMU_GET_HW_INFO
Enumerate the capabilities from the mock device and test whether it
advertises as expected. Include it as part of the iommufd_dirty_tracking
fixture.

Link: https://lore.kernel.org/r/20231024135109.73787-18-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins a9af47e382 iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP
Add a new test ioctl for simulating the dirty IOVAs in the mock domain, and
implement the mock iommu domain ops that get the dirty tracking supported.

The selftest exercises the usual main workflow of:

1) Setting dirty tracking from the iommu domain
2) Read and clear dirty IOPTEs

Different fixtures will test different IOVA range sizes, that exercise
corner cases of the bitmaps.

Link: https://lore.kernel.org/r/20231024135109.73787-17-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins 7adf267d66 iommufd/selftest: Test IOMMU_HWPT_SET_DIRTY_TRACKING
Change mock_domain to supporting dirty tracking and add tests to exercise
the new SET_DIRTY_TRACKING API in the iommufd_dirty_tracking selftest
fixture.

Link: https://lore.kernel.org/r/20231024135109.73787-16-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins 266ce58989 iommufd/selftest: Test IOMMU_HWPT_ALLOC_DIRTY_TRACKING
In order to selftest the iommu domain dirty enforcing implement the
mock_domain necessary support and add a new dev_flags to test that the
hwpt_alloc/attach_device fails as expected.

Expand the existing mock_domain fixture with a enforce_dirty test that
exercises the hwpt_alloc and device attachment.

Link: https://lore.kernel.org/r/20231024135109.73787-15-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins e04b23c8d4 iommufd/selftest: Expand mock_domain with dev_flags
Expand mock_domain test to be able to manipulate the device capabilities.
This allows testing with mockdev without dirty tracking support advertised
and thus make sure enforce_dirty test does the expected.

To avoid breaking IOMMUFD_TEST UABI replicate the mock_domain struct and
thus add an input dev_flags at the end.

Link: https://lore.kernel.org/r/20231024135109.73787-14-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Eduard Zingerman 64870feebe selftests/bpf: test if state loops are detected in a tricky case
A convoluted test case for iterators convergence logic that
demonstrates that states with branch count equal to 0 might still be
a part of not completely explored loop.

E.g. consider the following state diagram:

               initial     Here state 'succ' was processed first,
                 |         it was eventually tracked to produce a
                 V         state identical to 'hdr'.
    .---------> hdr        All branches from 'succ' had been explored
    |            |         and thus 'succ' has its .branches == 0.
    |            V
    |    .------...        Suppose states 'cur' and 'succ' correspond
    |    |       |         to the same instruction + callsites.
    |    V       V         In such case it is necessary to check
    |   ...     ...        whether 'succ' and 'cur' are identical.
    |    |       |         If 'succ' and 'cur' are a part of the same loop
    |    V       V         they have to be compared exactly.
    |   succ <- cur
    |    |
    |    V
    |   ...
    |    |
    '----'

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231024000917.12153-7-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-23 21:49:32 -07:00
Eduard Zingerman 389ede06c2 selftests/bpf: tests with delayed read/precision makrs in loop body
These test cases try to hide read and precision marks from loop
convergence logic: marks would only be assigned on subsequent loop
iterations or after exploring states pushed to env->head stack first.
Without verifier fix to use exact states comparison logic for
iterators convergence these tests (except 'triple_continue') would be
errorneously marked as safe.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231024000917.12153-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-23 21:49:31 -07:00
Eduard Zingerman 2793a8b015 bpf: exact states comparison for iterator convergence checks
Convergence for open coded iterators is computed in is_state_visited()
by examining states with branches count > 1 and using states_equal().
states_equal() computes sub-state relation using read and precision marks.
Read and precision marks are propagated from children states,
thus are not guaranteed to be complete inside a loop when branches
count > 1. This could be demonstrated using the following unsafe program:

     1. r7 = -16
     2. r6 = bpf_get_prandom_u32()
     3. while (bpf_iter_num_next(&fp[-8])) {
     4.   if (r6 != 42) {
     5.     r7 = -32
     6.     r6 = bpf_get_prandom_u32()
     7.     continue
     8.   }
     9.   r0 = r10
    10.   r0 += r7
    11.   r8 = *(u64 *)(r0 + 0)
    12.   r6 = bpf_get_prandom_u32()
    13. }

Here verifier would first visit path 1-3, create a checkpoint at 3
with r7=-16, continue to 4-7,3 with r7=-32.

Because instructions at 9-12 had not been visitied yet existing
checkpoint at 3 does not have read or precision mark for r7.
Thus states_equal() would return true and verifier would discard
current state, thus unsafe memory access at 11 would not be caught.

This commit fixes this loophole by introducing exact state comparisons
for iterator convergence logic:
- registers are compared using regs_exact() regardless of read or
  precision marks;
- stack slots have to have identical type.

Unfortunately, this is too strict even for simple programs like below:

    i = 0;
    while(iter_next(&it))
      i++;

At each iteration step i++ would produce a new distinct state and
eventually instruction processing limit would be reached.

To avoid such behavior speculatively forget (widen) range for
imprecise scalar registers, if those registers were not precise at the
end of the previous iteration and do not match exactly.

This a conservative heuristic that allows to verify wide range of
programs, however it precludes verification of programs that conjure
an imprecise value on the first loop iteration and use it as precise
on the second.

Test case iter_task_vma_for_each() presents one of such cases:

        unsigned int seen = 0;
        ...
        bpf_for_each(task_vma, vma, task, 0) {
                if (seen >= 1000)
                        break;
                ...
                seen++;
        }

Here clang generates the following code:

<LBB0_4>:
      24:       r8 = r6                          ; stash current value of
                ... body ...                       'seen'
      29:       r1 = r10
      30:       r1 += -0x8
      31:       call bpf_iter_task_vma_next
      32:       r6 += 0x1                        ; seen++;
      33:       if r0 == 0x0 goto +0x2 <LBB0_6>  ; exit on next() == NULL
      34:       r7 += 0x10
      35:       if r8 < 0x3e7 goto -0xc <LBB0_4> ; loop on seen < 1000

<LBB0_6>:
      ... exit ...

Note that counter in r6 is copied to r8 and then incremented,
conditional jump is done using r8. Because of this precision mark for
r6 lags one state behind of precision mark on r8 and widening logic
kicks in.

Adding barrier_var(seen) after conditional is sufficient to force
clang use the same register for both counting and conditional jump.

This issue was discussed in the thread [1] which was started by
Andrew Werner <awerner32@gmail.com> demonstrating a similar bug
in callback functions handling. The callbacks would be addressed
in a followup patch.

[1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/

Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231024000917.12153-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-23 21:49:31 -07:00
Eric Dumazet 2a7c8d291f tcp: introduce tcp_clock_ms()
It delivers current TCP time stamp in ms unit, and is used
in place of confusing tcp_time_stamp_raw()

It is the same family than tcp_clock_ns() and tcp_clock_ms().

tcp_time_stamp_raw() will be replaced later for TSval
contexts with a more descriptive name.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-23 09:35:01 +01:00
Linus Torvalds 023cc83605 Probes fixes for v6.6-rc6.2:
- kprobe-events: Fix kprobe events to reject if the attached symbol
   is not unique name because it may not the function which the user
   want to attach to. (User can attach a probe to such symbol using
   the nearest unique symbol + offset.)
 
 - selftest: Add a testcase to ensure the kprobe event rejects non
   unique symbol correctly.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmUzdQobHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bMNAH/inFWv8e+rMm8F5Po6ZI
 CmBxuZbxy2l+KfYDjXqSHu7TLKngVd6Bhdb5H2K7fgdwiZxrS0i6qvdppo+Cxgop
 Yod06peDTM80IKavioCcOJOwLPGXXpZkMlK5fdC48HN6vrf9km4vws5ZAagfc1ng
 YhnYm1HHeXcIYwtLkE2dCr6HkwkaOebWTLdZ8c70d1OPw0L9rzxH+edjhKCq8uIw
 6WUg9ERxJYPUuCkQxOxVJrTdzNMRXsgf28FHc0LyYRm8kDpECT2BP6e/Y+TBbsX5
 2pN5cUY5qfI6t3Pc1HDs2KX8ui2QCmj0mCvT0VixhdjThdHpRf0VjIFFAANf3LNO
 XVA=
 =O1Aa
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-v6.6-rc6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - kprobe-events: Fix kprobe events to reject if the attached symbol is
   not unique name because it may not the function which the user want
   to attach to. (User can attach a probe to such symbol using the
   nearest unique symbol + offset.)

 - selftest: Add a testcase to ensure the kprobe event rejects non
   unique symbol correctly.

* tag 'probes-fixes-v6.6-rc6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  selftests/ftrace: Add new test case which checks non unique symbol
  tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols
2023-10-21 11:00:36 -07:00
Pedro Tammela ee3d122854 selftests: tc-testing: add test for 'rt' upgrade on hfsc
Add a test to check if inner rt curves are upgraded to sc curves.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-21 11:46:41 +01:00
Linus Torvalds 444ccf1b11 linux_kselftest_active-fixes-6.6-rc7
This Kselftest update for Linux 6.6-rc7 consists of one single fix
 to assert check in user_events abi_test to properly check bit value
 on Big Endian architectures. The current code treats the bit values
 as Little Endian and the check fails on Big Endian.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmUytR4ACgkQCwJExA0N
 QxzjoQ/+MWgtyggLZo/dP+qJ7AU+tG08YXWWuh99lkSxVs3xHvwl2EGaYf1PXN9c
 ZUXb/KGfls8G4tv20KU2+/uSRSirkf+CFLN6HaBBG+2cun8o0KpHlVKGfmvRjb13
 ZUX8UQJ5u9kTSTqV7gCxVbemV5iOhTazuwVQ1Aq7wFL6e/0oL4eolbgNP0ub76vy
 UsZl9j/8pFhtVfdqRJqorKQ9H5RgvW7CCmobkUruGJoXg7jFuFdeqXtS4U1ziyJT
 g42LtXuC053KrEnEmhj1EadZC4E1eXadffssanfyWolAXjRD3N+2vLYasJOiGDAO
 mT111kQLaRNvZLJBULlrnWITkbVOhfE9Cxu14idxvdLShQiUO8kRCjoN2TbLZ2ET
 n7u/4anHBu99ljO6uZVNe7JY5ZrwaM1o7Myvk9eOVRe4WKkgOXJKUM3lfwvfMK8Q
 sEWvfBY04gL5y697DDiQvJ4g0fRwwnoadpHFvJTJX977H5C8/c750YZw/Emuip9D
 zxyHGEVtncM6awbHP8bGBgg2f6XISWKs5qvzrwLmdXx42oi0VjU4z+cEVOUSx/rY
 K+B+LVzcdYc/6UhjQzmylNBUs5CO7K6D05/tes07QaZ1MCCyw3b+DAR+LJn2U2+C
 Y3+x/kQT16abMIIpTM8qVWqxswvaFSyZ+5hCkVrAsYwkM1fpB4c=
 =PJhV
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest_active-fixes-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fix from Shuah Khan:
 "One single fix to assert check in user_events abi_test to properly
  check bit value on Big Endian architectures. The code treated the bit
  values as Little Endian and the check failed on Big Endian"

* tag 'linux_kselftest_active-fixes-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/user_events: Fix abi_test for BE archs
2023-10-20 14:45:41 -07:00
Hou Tao d440ba91ca selftests/bpf: Add more test cases for bpf memory allocator
Add the following 3 test cases for bpf memory allocator:
1) Do allocation in bpf program and free through map free
2) Do batch per-cpu allocation and per-cpu free in bpf program
3) Do per-cpu allocation in bpf program and free through map free

For per-cpu allocation, because per-cpu allocation can not refill timely
sometimes, so test 2) and test 3) consider it is OK for
bpf_percpu_obj_new_impl() to return NULL.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20231020133202.4043247-8-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-20 14:15:13 -07:00
Kumar Kartikeya Dwivedi da1055b673 selftests/bpf: Make linked_list failure test more robust
The linked list failure test 'pop_front_off' and 'pop_back_off'
currently rely on matching exact instruction and register values.  The
purpose of the test is to ensure the offset is correctly incremented for
the returned pointers from list pop helpers, which can then be used with
container_of to obtain the real object. Hence, somehow obtaining the
information that the offset is 48 will work for us. Make the test more
robust by relying on verifier error string of bpf_spin_lock and remove
dependence on fragile instruction index or register number, which can be
affected by different clang versions used to build the selftests.

Fixes: 300f19dcdb ("selftests/bpf: Add BPF linked list API tests")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231020144839.2734006-1-memxor@gmail.com
2023-10-20 09:29:39 -07:00
Francis Laniel 03b80ff802 selftests/ftrace: Add new test case which checks non unique symbol
If name_show() is non unique, this test will try to install a kprobe on this
function which should fail returning EADDRNOTAVAIL.
On kernel where name_show() is not unique, this test is skipped.

Link: https://lore.kernel.org/all/20231020104250.9537-3-flaniel@linux.microsoft.com/

Cc: stable@vger.kernel.org
Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-10-20 22:11:49 +09:00
Anup Patel d9c00f44e5 KVM: riscv: selftests: Add SBI DBCN extension to get-reg-list test
We have a new SBI debug console (DBCN) extension supported by in-kernel
KVM so let us add this extension to get-reg-list test.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-10-20 16:50:39 +05:30
Chuyi Zhou 130e0f7af9 selftests/bpf: Add tests for open-coded task and css iter
This patch adds 4 subtests to demonstrate these patterns and validating
correctness.

subtest1:

1) We use task_iter to iterate all process in the system and search for the
current process with a given pid.

2) We create some threads in current process context, and use
BPF_TASK_ITER_PROC_THREADS to iterate all threads of current process. As
expected, we would find all the threads of current process.

3) We create some threads and use BPF_TASK_ITER_ALL_THREADS to iterate all
threads in the system. As expected, we would find all the threads which was
created.

subtest2:

We create a cgroup and add the current task to the cgroup. In the
BPF program, we would use bpf_for_each(css_task, task, css) to iterate all
tasks under the cgroup. As expected, we would find the current process.

subtest3:

1) We create a cgroup tree. In the BPF program, we use
bpf_for_each(css, pos, root, XXX) to iterate all descendant under the root
with pre and post order. As expected, we would find all descendant and the
last iterating cgroup in post-order is root cgroup, the first iterating
cgroup in pre-order is root cgroup.

2) We wse BPF_CGROUP_ITER_ANCESTORS_UP to traverse the cgroup tree starting
from leaf and root separately, and record the height. The diff of the
hights would be the total tree-high - 1.

subtest4:

Add some failure testcase when using css_task, task and css iters, e.g,
unlock when using task-iters to iterate tasks.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Link: https://lore.kernel.org/r/20231018061746.111364-9-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-19 17:02:47 -07:00
Chuyi Zhou ddab78cbb5 selftests/bpf: rename bpf_iter_task.c to bpf_iter_tasks.c
The newly-added struct bpf_iter_task has a name collision with a selftest
for the seq_file task iter's bpf skel, so the selftests/bpf/progs file is
renamed in order to avoid the collision.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231018061746.111364-8-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-19 17:02:47 -07:00
Chuyi Zhou 7251d0905e bpf: Introduce css open-coded iterator kfuncs
This Patch adds kfuncs bpf_iter_css_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_css in open-coded iterator
style. These kfuncs actually wrapps css_next_descendant_{pre, post}.
css_iter can be used to:

1) iterating a sepcific cgroup tree with pre/post/up order

2) iterating cgroup_subsystem in BPF Prog, like
for_each_mem_cgroup_tree/cpuset_for_each_descendant_pre in kernel.

The API design is consistent with cgroup_iter. bpf_iter_css_new accepts
parameters defining iteration order and starting css. Here we also reuse
BPF_CGROUP_ITER_DESCENDANTS_PRE, BPF_CGROUP_ITER_DESCENDANTS_POST,
BPF_CGROUP_ITER_ANCESTORS_UP enums.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20231018061746.111364-5-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-19 17:02:46 -07:00
Chuyi Zhou c68a78ffe2 bpf: Introduce task open coded iterator kfuncs
This patch adds kfuncs bpf_iter_task_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_task in open-coded iterator
style. BPF programs can use these kfuncs or through bpf_for_each macro to
iterate all processes in the system.

The API design keep consistent with SEC("iter/task"). bpf_iter_task_new()
accepts a specific task and iterating type which allows:

1. iterating all process in the system (BPF_TASK_ITER_ALL_PROCS)

2. iterating all threads in the system (BPF_TASK_ITER_ALL_THREADS)

3. iterating all threads of a specific task (BPF_TASK_ITER_PROC_THREADS)

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Link: https://lore.kernel.org/r/20231018061746.111364-4-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-19 17:02:46 -07:00
Chuyi Zhou 9c66dc94b6 bpf: Introduce css_task open-coded iterator kfuncs
This patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_css_task in open-coded
iterator style. These kfuncs actually wrapps css_task_iter_{start,next,
end}. BPF programs can use these kfuncs through bpf_for_each macro for
iteration of all tasks under a css.

css_task_iter_*() would try to get the global spin-lock *css_set_lock*, so
the bpf side has to be careful in where it allows to use this iter.
Currently we only allow it in bpf_lsm and bpf iter-s.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20231018061746.111364-3-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-19 17:02:46 -07:00
Breno Leitao b9ec913212 selftests/bpf/sockopt: Add io_uring support
Expand the sockopt test to use also check for io_uring {g,s}etsockopt
commands operations.

This patch starts by marking each test if they support io_uring support
or not.

Right now, io_uring cmd getsockopt() has a limitation of only
accepting level == SOL_SOCKET, otherwise it returns -EOPNOTSUPP. Since
there aren't any test exercising getsockopt(level == SOL_SOCKET), this
patch changes two tests to use level == SOL_SOCKET, they are
"getsockopt: support smaller ctx->optlen" and "getsockopt: read
ctx->optlen".
There is no limitation for the setsockopt() part.

Later, each test runs using regular {g,s}etsockopt systemcalls, and, if
liburing is supported, execute the same test (again), but calling
liburing {g,s}setsockopt commands.

This patch also changes the level of two tests to use SOL_SOCKET for the
following two tests. This is going to help to exercise the io_uring
subsystem:
 * getsockopt: read ctx->optlen
 * getsockopt: support smaller ctx->optlen

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20231016134750.1381153-12-leitao@debian.org
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-10-19 16:42:04 -06:00
Breno Leitao ba6e0e5cb5 selftests/net: Extract uring helpers to be reusable
Instead of defining basic io_uring functions in the test case, move them
to a common directory, so, other tests can use them.

This simplify the test code and reuse the common liburing
infrastructure. This is basically a copy of what we have in
io_uring_zerocopy_tx with some minor improvements to make checkpatch
happy.

A follow-up test will use the same helpers in a BPF sockopt test.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20231016134750.1381153-8-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-10-19 16:42:03 -06:00
Jakub Kicinski 041c3466f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

net/mac80211/key.c
  02e0e426a2 ("wifi: mac80211: fix error path key leak")
  2a8b665e6b ("wifi: mac80211: remove key_mtx")
  7d6904bf26 ("Merge wireless into wireless-next")
https://lore.kernel.org/all/20231012113648.46eea5ec@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/ti/Kconfig
  a602ee3176 ("net: ethernet: ti: Fix mixed module-builtin object")
  98bdeae950 ("net: cpmac: remove driver to prepare for platform removal")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19 13:29:01 -07:00
Linus Torvalds ce55c22ec8 Including fixes from bluetooth, netfilter, WiFi.
Feels like an up-tick in regression fixes, mostly for older releases.
 The hfsc fix, tcp_disconnect() and Intel WWAN fixes stand out as fairly
 clear-cut user reported regressions. The mlx5 DMA bug was causing strife
 for 390x folks. The fixes themselves are not particularly scary, tho.
 No open investigations / outstanding reports at the time of writing.
 
 Current release - regressions:
 
  - eth: mlx5: perform DMA operations in the right locations,
    make devices usable on s390x, again
 
  - sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner curve,
    previous fix of rejecting invalid config broke some scripts
 
  - rfkill: reduce data->mtx scope in rfkill_fop_open, avoid deadlock
 
  - revert "ethtool: Fix mod state of verbose no_mask bitset",
    needs more work
 
 Current release - new code bugs:
 
  - tcp: fix listen() warning with v4-mapped-v6 address
 
 Previous releases - regressions:
 
  - tcp: allow tcp_disconnect() again when threads are waiting,
    it was denied to plug a constant source of bugs but turns out
    .NET depends on it
 
  - eth: mlx5: fix double-free if buffer refill fails under OOM
 
  - revert "net: wwan: iosm: enable runtime pm support for 7560",
    it's causing regressions and the WWAN team at Intel disappeared
 
  - tcp: tsq: relax tcp_small_queue_check() when rtx queue contains
    a single skb, fix single-stream perf regression on some devices
 
 Previous releases - always broken:
 
  - Bluetooth:
    - fix issues in legacy BR/EDR PIN code pairing
    - correctly bounds check and pad HCI_MON_NEW_INDEX name
 
  - netfilter:
    - more fixes / follow ups for the large "commit protocol" rework,
      which went in as a fix to 6.5
    - fix null-derefs on netlink attrs which user may not pass in
 
  - tcp: fix excessive TLP and RACK timeouts from HZ rounding
    (bless Debian for keeping HZ=250 alive)
 
  - net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation, prevent
    letting frankenstein UDP super-frames from getting into the stack
 
  - net: fix interface altnames when ifc moves to a new namespace
 
  - eth: qed: fix the size of the RX buffers
 
  - mptcp: avoid sending RST when closing the initial subflow
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmUxawUACgkQMUZtbf5S
 Irt75w/+MC2ilnFQhfoQqpCxL2uHWOKzhenUz2PoNKR60OKgLLmTq8YBYcpyCVAQ
 wBQaNzNu1/yjF5BG7aNS/j0suGeJsOfhfoahQvXlLaat9NuDqxTpaeoT5FZ7eNQw
 RZ8+CJug3BRDV0TkaJH9UDVC/nfJTsnsGWNIhNXYGPsuveqAUun+xrnN8ZbvZIrn
 6D9rMF+u9SdVO+ANCquXBC7+CWEWiJS1ljUrU7BRNiv/9FSlnPQtjOdpuKleeBO8
 4usMS7TezHNgRdiAKC8GjSGUiIkIIMJT4y4wuczBEQAD4Pkki9UpBrui97ozOj7h
 W4N7UOuPlUBIardvKNoYz9rZyiFXBcPPm0GruHiuCqpxyqmzoFgv2XJsb/6KfzNn
 Dyro+lvh8smtbFHvFqiwaNu5y8ucfClaowvR4gjSe2KcB7hIpwNkh6vWC6OMGJK3
 hiKHnDrnXBQMbnP1YfiJ4feLmm3UYCG8eFdv/ULZT0a9TzZ7fKfzAfywUwD+/O8Y
 +S28Hr9srdDCHO7ih/gF3Wq9wtnnLy8QEkpt7cpXjRDj0uWH8JkHU3YEIGF2814Y
 LNVGmX9y6RcgrHNp03K1PmUcgAzhTTuV9QRoQKEucuBT5AK9ALDQ8YomnWDWDgrp
 UOdJPi1RUTsmqslADF15wZ9W5Ki/cDnUJsE4HU/MtnM2w95C49Q=
 =z1F6
 -----END PGP SIGNATURE-----

Merge tag 'net-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, netfilter, WiFi.

  Feels like an up-tick in regression fixes, mostly for older releases.
  The hfsc fix, tcp_disconnect() and Intel WWAN fixes stand out as
  fairly clear-cut user reported regressions. The mlx5 DMA bug was
  causing strife for 390x folks. The fixes themselves are not
  particularly scary, tho. No open investigations / outstanding reports
  at the time of writing.

  Current release - regressions:

   - eth: mlx5: perform DMA operations in the right locations, make
     devices usable on s390x, again

   - sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner
     curve, previous fix of rejecting invalid config broke some scripts

   - rfkill: reduce data->mtx scope in rfkill_fop_open, avoid deadlock

   - revert "ethtool: Fix mod state of verbose no_mask bitset", needs
     more work

  Current release - new code bugs:

   - tcp: fix listen() warning with v4-mapped-v6 address

  Previous releases - regressions:

   - tcp: allow tcp_disconnect() again when threads are waiting, it was
     denied to plug a constant source of bugs but turns out .NET depends
     on it

   - eth: mlx5: fix double-free if buffer refill fails under OOM

   - revert "net: wwan: iosm: enable runtime pm support for 7560", it's
     causing regressions and the WWAN team at Intel disappeared

   - tcp: tsq: relax tcp_small_queue_check() when rtx queue contains a
     single skb, fix single-stream perf regression on some devices

  Previous releases - always broken:

   - Bluetooth:
      - fix issues in legacy BR/EDR PIN code pairing
      - correctly bounds check and pad HCI_MON_NEW_INDEX name

   - netfilter:
      - more fixes / follow ups for the large "commit protocol" rework,
        which went in as a fix to 6.5
      - fix null-derefs on netlink attrs which user may not pass in

   - tcp: fix excessive TLP and RACK timeouts from HZ rounding (bless
     Debian for keeping HZ=250 alive)

   - net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation, prevent
     letting frankenstein UDP super-frames from getting into the stack

   - net: fix interface altnames when ifc moves to a new namespace

   - eth: qed: fix the size of the RX buffers

   - mptcp: avoid sending RST when closing the initial subflow"

* tag 'net-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
  Revert "ethtool: Fix mod state of verbose no_mask bitset"
  selftests: mptcp: join: no RST when rm subflow/addr
  mptcp: avoid sending RST when closing the initial subflow
  mptcp: more conservative check for zero probes
  tcp: check mptcp-level constraints for backlog coalescing
  selftests: mptcp: join: correctly check for no RST
  net: ti: icssg-prueth: Fix r30 CMDs bitmasks
  selftests: net: add very basic test for netdev names and namespaces
  net: move altnames together with the netdevice
  net: avoid UAF on deleted altname
  net: check for altname conflicts when changing netdev's netns
  net: fix ifname in netlink ntf during netns move
  net: ethernet: ti: Fix mixed module-builtin object
  net: phy: bcm7xxx: Add missing 16nm EPHY statistics
  ipv4: fib: annotate races around nh->nh_saddr_genid and nh->nh_saddr
  tcp_bpf: properly release resources on error paths
  net/sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner curve
  net: mdio-mux: fix C45 access returning -EIO after API change
  tcp: tsq: relax tcp_small_queue_check() when rtx queue contains a single skb
  octeon_ep: update BQL sent bytes before ringing doorbell
  ...
2023-10-19 12:08:18 -07:00
Matthieu Baerts 2cfaa8b3b7 selftests: mptcp: join: no RST when rm subflow/addr
Recently, we noticed that some RST were wrongly generated when removing
the initial subflow.

This patch makes sure RST are not sent when removing any subflows or any
addresses.

Fixes: c2b2ae3925 ("mptcp: handle correctly disconnect() failures")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231018-send-net-20231018-v1-5-17ecb002e41d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19 09:10:00 -07:00
Matthieu Baerts b134a58054 selftests: mptcp: join: correctly check for no RST
The commit mentioned below was more tolerant with the number of RST seen
during a test because in some uncontrollable situations, multiple RST
can be generated.

But it was not taking into account the case where no RST are expected:
this validation was then no longer reporting issues for the 0 RST case
because it is not possible to have less than 0 RST in the counter. This
patch fixes the issue by adding a specific condition.

Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231018-send-net-20231018-v1-1-17ecb002e41d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19 09:09:59 -07:00
Jakub Kicinski 3920431d98 selftests: net: add very basic test for netdev names and namespaces
Add selftest for fixes around naming netdevs and namespaces.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-19 15:51:16 +02:00
Eric Dumazet 878d951c67 inet: lock the socket in ip_sock_set_tos()
Christoph Paasch reported a panic in TCP stack [1]

Indeed, we should not call sk_dst_reset() without holding
the socket lock, as __sk_dst_get() callers do not all rely
on bare RCU.

[1]
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 12bad6067 P4D 12bad6067 PUD 12bad5067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 2750 Comm: syz-executor.5 Not tainted 6.6.0-rc4-g7a5720a344e7 #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:tcp_get_metrics+0x118/0x8f0 net/ipv4/tcp_metrics.c:321
Code: c7 44 24 70 02 00 8b 03 89 44 24 48 c7 44 24 4c 00 00 00 00 66 c7 44 24 58 02 00 66 ba 02 00 b1 01 89 4c 24 04 4c 89 7c 24 10 <49> 8b 0f 48 8b 89 50 05 00 00 48 89 4c 24 30 33 81 00 02 00 00 69
RSP: 0018:ffffc90000af79b8 EFLAGS: 00010293
RAX: 000000000100007f RBX: ffff88812ae8f500 RCX: ffff88812b5f8f01
RDX: 0000000000000002 RSI: ffffffff8300f080 RDI: 0000000000000002
RBP: 0000000000000002 R08: 0000000000000003 R09: ffffffff8205eca0
R10: 0000000000000002 R11: ffff88812b5f8f00 R12: ffff88812a9e0580
R13: 0000000000000000 R14: ffff88812ae8fbd2 R15: 0000000000000000
FS: 00007f70a006b640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000012bad7003 CR4: 0000000000170ee0
Call Trace:
<TASK>
tcp_fastopen_cache_get+0x32/0x140 net/ipv4/tcp_metrics.c:567
tcp_fastopen_cookie_check+0x28/0x180 net/ipv4/tcp_fastopen.c:419
tcp_connect+0x9c8/0x12a0 net/ipv4/tcp_output.c:3839
tcp_v4_connect+0x645/0x6e0 net/ipv4/tcp_ipv4.c:323
__inet_stream_connect+0x120/0x590 net/ipv4/af_inet.c:676
tcp_sendmsg_fastopen+0x2d6/0x3a0 net/ipv4/tcp.c:1021
tcp_sendmsg_locked+0x1957/0x1b00 net/ipv4/tcp.c:1073
tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1336
__sock_sendmsg+0x83/0xd0 net/socket.c:730
__sys_sendto+0x20a/0x2a0 net/socket.c:2194
__do_sys_sendto net/socket.c:2206 [inline]

Fixes: e08d0b3d17 ("inet: implement lockless IP_TOS")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231018090014.345158-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-19 13:13:13 +02:00
Pedro Tammela 35027c7909 selftests: tc-testing: move auxiliary scripts to a dedicated folder
Some taprio tests need auxiliary scripts to wait for workqueue events to
process. Move them to a dedicated folder in order to package them for
the kselftests tarball.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231017152309.3196320-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:07:51 -07:00
Pedro Tammela f157b73d51 selftests: tc-testing: add missing Kconfig options to 'config'
Make sure CI builds using just tc-testing/config can run all tdc tests.
Some tests were broken because of missing knobs.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231017152309.3196320-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:07:51 -07:00
Jing Zhang 54a9ea7352 KVM: arm64: selftests: Test for setting ID register from usersapce
Add tests to verify setting ID registers from userspace is handled
correctly by KVM. Also add a test case to use ioctl
KVM_ARM_GET_REG_WRITABLE_MASKS to get writable masks.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:37:33 +00:00
Jing Zhang 0359c946b1 tools headers arm64: Update sysreg.h with kernel sources
The users of sysreg.h (perf, KVM selftests) are now generating the
necessary sysreg-defs.h; sync sysreg.h with the kernel sources and
fix the KVM selftests that use macros which suffered a rename.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:36:25 +00:00
Oliver Upton 9697d84cc3 KVM: selftests: Generate sysreg-defs.h and add to include path
Start generating sysreg-defs.h for arm64 builds in anticipation of
updating sysreg.h to a version that depends on it.

Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:36:25 +00:00
Swarup Laxman Kotiaklapudi 6e79b375ad proc: test /proc/${pid}/statm
My original comment lied, output can be "0 A A B 0 0 0\n"
(see comment in the code).

I don't quite understand why

	get_mm_counter(mm, MM_FILEPAGES) + get_mm_counter(mm, MM_SHMEMPAGES)

can stay positive but get_mm_counter(mm, MM_ANONPAGES) is always 0 after
everything is unmapped but that's just me.

[adobriyan@gmail.com: more or less rewritten]
Link: https://lkml.kernel.org/r/0721ca69-7bb4-40aa-8d01-0c5f91e5f363@p183
Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:43:22 -07:00
Nhat Pham c0dddb7aa5 selftests: add a selftest to verify hugetlb usage in memcg
This patch add a new kselftest to demonstrate and verify the new hugetlb
memcg accounting behavior.

Link: https://lkml.kernel.org/r/20231006184629.155543-5-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:18 -07:00
Breno Leitao 116d57303a selftests/mm: add a new test for madv and hugetlb
Create a selftest that exercises the race between page faults and
madvise(MADV_DONTNEED) in the same huge page. Do it by running two
threads that touches the huge page and madvise(MADV_DONTNEED) at the same
time.

In case of a SIGBUS coming at pagefault, the test should fail, since we
hit the bug.

The test doesn't have a signal handler, and if it fails, it fails like
the following

  ----------------------------------
  running ./hugetlb_fault_after_madv
  ----------------------------------
  ./run_vmtests.sh: line 186: 595563 Bus error    (core dumped) "$@"
  [FAIL]

This selftest goes together with the fix of the bug[1] itself.

[1] https://lore.kernel.org/all/20231001005659.2185316-1-riel@surriel.com/#r

Link: https://lkml.kernel.org/r/20231005163922.87568-3-leitao@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Tested-by: Rik van Riel <riel@surriel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:16 -07:00
Breno Leitao c8b9073142 selftests/mm: export get_free_hugepages()
Patch series "New selftest for mm", v2.

This is a simple test case that reproduces an mm problem[1], where a page
fault races with madvise(), and it is not trivial to reproduce and debug.

This test-case aims to avoid such race problems from happening again,
impacting workloads that leverages external allocators, such as tcmalloc,
jemalloc, etc.

[1] https://lore.kernel.org/all/20231001005659.2185316-1-riel@surriel.com/#r


This patch (of 2):

get_free_hugepages() is helpful for other hugepage tests.  Export it to
the common file (vm_util.c) to be reused.

Link: https://lkml.kernel.org/r/20231005163922.87568-1-leitao@debian.org
Link: https://lkml.kernel.org/r/20231005163922.87568-2-leitao@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:16 -07:00
Liam R. Howlett 7771dcf019 radix tree test suite: fix allocation calculation in kmem_cache_alloc_bulk()
The bulk allocation is iterating through an array and storing enough
memory for the entire bulk allocation instead of a single array entry. 
Only allocate an array element of the size set in the kmem_cache.

Link: https://lkml.kernel.org/r/20230929201359.2857583-1-Liam.Howlett@oracle.com
Fixes: cc86e0c2f3 ("radix tree test suite: add support for slab bulk APIs")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:13 -07:00
Muhammad Usama Anjum 46fd75d4a3 selftests: mm: add pagemap ioctl tests
Add pagemap ioctl tests. Add several different types of tests to judge
the correction of the interface.

Link: https://lkml.kernel.org/r/20230821141518.870589-7-usama.anjum@collabora.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Miroslaw <emmir@google.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Paul Gofman <pgofman@codeweavers.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:13 -07:00
Andrew Morton 5ef8f1b2b4 Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. 2023-10-18 14:32:58 -07:00
Ilpo Järvinen 5247e6dbed selftests/resctrl: Fix MBM test failure when MBA unavailable
Commit 20d96b25cc ("selftests/resctrl: Fix schemata write error
check") exposed a problem in feature detection logic in MBM selftest.
If schemata does not support MB:x=x entries, the schemata write to
initialize 100% memory bandwidth allocation in mbm_setup() will now
fail with -EINVAL due to the error handling corrected by the commit
20d96b25cc ("selftests/resctrl: Fix schemata write error check").
That commit just uncovers the failed write, it is not wrong itself.

If MB:x=x is not supported by schemata, it is safe to assume 100%
memory bandwidth is always set. Therefore, the previously ignored error
does not make the MBM test itself wrong.

Restore the previous behavior of MBM test by checking MB support before
attempting to write it into schemata which results in behavior
equivalent to ignoring the write error.

Fixes: 20d96b25cc ("selftests/resctrl: Fix schemata write error check")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-18 14:33:44 -06:00
Mark Brown 34dce23f7e selftests/clone3: Report descriptive test names
The clone3() selftests currently report test results in a format that does
not mesh entirely well with automation. They log output for each test such
as:

  # [1382411] Trying clone3() with flags 0 (size 0)
  # I am the parent (1382411). My child's pid is 1382412
  # I am the child, my PID is 1382412
  # [1382411] clone3() with flags says: 0 expected 0
  ok 1 [1382411] Result (0) matches expectation (0)

This is not ideal for automated parsers since the text after the "ok 1" is
treated as the test name when comparing runs by a lot of automation (tests
routinely get renumbered due to things like new tests being added based on
logical groupings). The PID means that the test names will frequently vary
and the rest of the name being a description of results means several tests
have identical text there.

Address this by refactoring things so that we have a static descriptive
name for each test which we use when logging passes, failures and skips
and since we now have a stable name for the test to hand log that before
starting the test to address the common issue reading logs where the test
name is only printed after any diagnostics. The result is:

 # Running test 'simple clone3()'
 # [1562777] Trying clone3() with flags 0 (size 0)
 # I am the parent (1562777). My child's pid is 1562778
 # I am the child, my PID is 1562778
 # [1562777] clone3() with flags says: 0 expected 0
 ok 1 simple clone3()

In order to handle skips a bit more neatly this is done in a moderately
invasive fashion where we move from a sequence of function calls to having
an array of test parameters. This hopefully also makes it a little easier
to see what the tests are doing when looking at both the source and the
logs.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-18 14:33:44 -06:00
zhujun2 ecc4185a4d selftests:modify the incorrect print format
when the argument type is 'unsigned int',printf '%u'
in format string. Problem found during code reading.

Update commit log with information on how the problem
was found:
Shuah Khan <skhan@linuxfoundation.org>

Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-18 14:33:44 -06:00
zhujun2 3f6f8a8c5e selftests/efivarfs: create-read: fix a resource leak
The opened file should be closed in main(), otherwise resource
leak will occur that this problem was discovered by code reading

Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-18 14:33:44 -06:00
Yu Liao 11df28854b selftests/ftrace: Add riscv support for kprobe arg tests
This is the riscv variant of commit 9855c4626c ("selftests/ftrace:
Add ppc support for kprobe args tests").

Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-18 14:33:44 -06:00
Yu Liao 2eadb32992 selftests/ftrace: add loongarch support for kprobe args char tests
Add loongarch support for the recently added kprobe args tests.

Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-18 14:33:43 -06:00
Tiezhu Yang fc7f04dc23 selftests/clone3: Fix broken test under !CONFIG_TIME_NS
When execute the following command to test clone3 under !CONFIG_TIME_NS:

  # make headers && cd tools/testing/selftests/clone3 && make && ./clone3

we can see the following error info:

  # [7538] Trying clone3() with flags 0x80 (size 0)
  # Invalid argument - Failed to create new process
  # [7538] clone3() with flags says: -22 expected 0
  not ok 18 [7538] Result (-22) is different than expected (0)
  ...
  # Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0

This is because if CONFIG_TIME_NS is not set, but the flag
CLONE_NEWTIME (0x80) is used to clone a time namespace, it
will return -EINVAL in copy_time_ns().

If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time
will be not exist, and then we should skip clone3() test with
CLONE_NEWTIME.

With this patch under !CONFIG_TIME_NS:

  # make headers && cd tools/testing/selftests/clone3 && make && ./clone3
  ...
  # Time namespaces are not supported
  ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME
  ...
  # Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0

Link: https://lkml.kernel.org/r/1689066814-13295-1-git-send-email-yangtiezhu@loongson.cn
Fixes: 515bddf0ec ("selftests/clone3: test clone3 with CLONE_NEWTIME")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 12:12:41 -07:00
Samasth Norway Ananda e2de156b0d selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
Definition for MREMAP_DONTUNMAP is not present in glibc older than 2.32
thus throwing an undeclared error when running make on mm.  Including
linux/mman.h solves the build error for people having older glibc.

Link: https://lkml.kernel.org/r/20231012155257.891776-1-samasth.norway.ananda@oracle.com
Fixes: 0183d777c2 ("selftests: mm: remove duplicate unneeded defines")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/linux-mm/CA+G9fYvV-71XqpCr_jhdDfEtN701fBdG3q+=bafaZiGwUXy_aA@mail.gmail.com/
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 12:12:41 -07:00
Phil Sutter 2e2d9c7d4d selftests: netfilter: Run nft_audit.sh in its own netns
Don't mess with the host's firewall ruleset. Since audit logging is not
per-netns, add an initial delay of a second so other selftests' netns
cleanups have a chance to finish.

Fixes: e8dbde59ca ("selftests: netfilter: Test nf_tables audit logging")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 13:47:08 +02:00
Phil Sutter 1baf0152f7 netfilter: nf_tables: audit log object reset once per table
When resetting multiple objects at once (via dump request), emit a log
message per table (or filled skb) and resurrect the 'entries' parameter
to contain the number of objects being logged for.

To test the skb exhaustion path, perform some bulk counter and quota
adds in the kselftest.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com> (Audit)
Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 13:43:40 +02:00
Phil Sutter c4eee56e14 net: skb_find_text: Ignore patterns extending past 'to'
Assume that caller's 'to' offset really represents an upper boundary for
the pattern search, so patterns extending past this offset are to be
rejected.

The old behaviour also was kind of inconsistent when it comes to
fragmentation (or otherwise non-linear skbs): If the pattern started in
between 'to' and 'from' offsets but extended to the next fragment, it
was not found if 'to' offset was still within the current fragment.

Test the new behaviour in a kselftest using iptables' string match.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: f72b948dcb ("[NET]: skb_find_text ignores to argument")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 11:09:55 +01:00
Larysa Zaremba bb6a88885f selftests/bpf: Add options and frags to xdp_hw_metadata
This is a follow-up to the commit 9b2b86332a ("bpf: Allow to use kfunc
XDP hints and frags together").

The are some possible implementations problems that may arise when providing
metadata specifically for multi-buffer packets, therefore there must be a
possibility to test such option separately.

Add an option to use multi-buffer AF_XDP xdp_hw_metadata and mark used XDP
program as capable to use frags.

As for now, xdp_hw_metadata accepts no options, so add simple option
parsing logic and a help message.

For quick reference, also add an ingress packet generation command to the
help message. The command comes from [0].

Example of output for multi-buffer packet:

  xsk_ring_cons__peek: 1
  0xead018: rx_desc[15]->addr=10000000000f000 addr=f100 comp_addr=f000
  rx_hash: 0x5789FCBB with RSS type:0x29
  rx_timestamp:  1696856851535324697 (sec:1696856851.5353)
  XDP RX-time:   1696856843158256391 (sec:1696856843.1583)
  	delta sec:-8.3771 (-8377068.306 usec)
  AF_XDP time:   1696856843158413078 (sec:1696856843.1584)
  	delta sec:0.0002 (156.687 usec)
  0xead018: complete idx=23 addr=f000
  xsk_ring_cons__peek: 1
  0xead018: rx_desc[16]->addr=100000000008000 addr=8100 comp_addr=8000
  0xead018: complete idx=24 addr=8000
  xsk_ring_cons__peek: 1
  0xead018: rx_desc[17]->addr=100000000009000 addr=9100 comp_addr=9000 EoP
  0xead018: complete idx=25 addr=9000

Metadata is printed for the first packet only.

  [0] https://lore.kernel.org/all/20230119221536.3349901-18-sdf@google.com/

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20231017162800.24080-1-larysa.zaremba@intel.com
2023-10-18 10:08:28 +02:00
Johannes Nixdorf 6f84090333 selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest
Add a suite covering the fdb_n_learned and fdb_max_learned bridge
features, touching all special cases in accounting at least once.

Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
Link: https://lore.kernel.org/r/20231016-fdb_limit-v5-5-32cddff87758@avm.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-17 17:39:02 -07:00
Beau Belgrave cf5a103c98 selftests/user_events: Fix abi_test for BE archs
The abi_test currently uses a long sized test value for enablement
checks. On LE this works fine, however, on BE this results in inaccurate
assert checks due to a bit being used and assuming it's value is the
same on both LE and BE.

Use int type for 32-bit values and long type for 64-bit values to ensure
appropriate behavior on both LE and BE.

Fixes: 60b1af8de8 ("tracing/user_events: Add ABI self-test")
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-17 15:07:19 -06:00
Daniel Borkmann 24516309e3 selftests/bpf: Add additional mprog query test coverage
Add several new test cases which assert corner cases on the mprog query
mechanism, for example, around passing in a too small or a larger array
than the current count.

  ./test_progs -t tc_opts
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_query:OK
  #269     tc_opts_query_attach:OK
  #270     tc_opts_replace:OK
  #271     tc_opts_revision:OK
  Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20231017081728.24769-1-daniel@iogearbox.net
2023-10-17 12:57:43 -07:00
Yafang Shao 44cb03f19b selftests/bpf: Add selftest for bpf_task_under_cgroup() in sleepable prog
The result is as follows:

  $ tools/testing/selftests/bpf/test_progs --name=task_under_cgroup
  #237     task_under_cgroup:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Without the previous patch, there will be RCU warnings in dmesg when
CONFIG_PROVE_RCU is enabled. While with the previous patch, there will
be no warnings.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20231007135945.4306-2-laoar.shao@gmail.com
2023-10-17 18:31:27 +02:00
Jakub Kicinski a3c2dd9648 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZS1d4wAKCRDbK58LschI
 g4DSAP441CdKh8fd+wNKUSKHFbpCQ6EvocR6Nf+Sj2DFUx/w/QEA7mfju7Abqjc3
 xwDEx0BuhrjMrjV5MmEpxc7lYl9XcQU=
 =vuWk
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-10-16

We've added 90 non-merge commits during the last 25 day(s) which contain
a total of 120 files changed, 3519 insertions(+), 895 deletions(-).

The main changes are:

1) Add missed stats for kprobes to retrieve the number of missed kprobe
   executions and subsequent executions of BPF programs, from Jiri Olsa.

2) Add cgroup BPF sockaddr hooks for unix sockets. The use case is
   for systemd to reimplement the LogNamespace feature which allows
   running multiple instances of systemd-journald to process the logs
   of different services, from Daan De Meyer.

3) Implement BPF CPUv4 support for s390x BPF JIT, from Ilya Leoshkevich.

4) Improve BPF verifier log output for scalar registers to better
   disambiguate their internal state wrt defaults vs min/max values
   matching, from Andrii Nakryiko.

5) Extend the BPF fib lookup helpers for IPv4/IPv6 to support retrieving
   the source IP address with a new BPF_FIB_LOOKUP_SRC flag,
   from Martynas Pumputis.

6) Add support for open-coded task_vma iterator to help with symbolization
   for BPF-collected user stacks, from Dave Marchevsky.

7) Add libbpf getters for accessing individual BPF ring buffers which
   is useful for polling them individually, for example, from Martin Kelly.

8) Extend AF_XDP selftests to validate the SHARED_UMEM feature,
   from Tushar Vyavahare.

9) Improve BPF selftests cross-building support for riscv arch,
   from Björn Töpel.

10) Add the ability to pin a BPF timer to the same calling CPU,
   from David Vernet.

11) Fix libbpf's bpf_tracing.h macros for riscv to use the generic
   implementation of PT_REGS_SYSCALL_REGS() to access syscall arguments,
   from Alexandre Ghiti.

12) Extend libbpf to support symbol versioning for uprobes, from Hengqi Chen.

13) Fix bpftool's skeleton code generation to guarantee that ELF data
    is 8 byte aligned, from Ian Rogers.

14) Inherit system-wide cpu_mitigations_off() setting for Spectre v1/v4
    security mitigations in BPF verifier, from Yafang Shao.

15) Annotate struct bpf_stack_map with __counted_by attribute to prepare
    BPF side for upcoming __counted_by compiler support, from Kees Cook.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (90 commits)
  bpf: Ensure proper register state printing for cond jumps
  bpf: Disambiguate SCALAR register state output in verifier logs
  selftests/bpf: Make align selftests more robust
  selftests/bpf: Improve missed_kprobe_recursion test robustness
  selftests/bpf: Improve percpu_alloc test robustness
  selftests/bpf: Add tests for open-coded task_vma iter
  bpf: Introduce task_vma open-coded iterator kfuncs
  selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
  bpf: Don't explicitly emit BTF for struct btf_iter_num
  bpf: Change syscall_nr type to int in struct syscall_tp_t
  net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set
  bpf: Avoid unnecessary audit log for CPU security mitigations
  selftests/bpf: Add tests for cgroup unix socket address hooks
  selftests/bpf: Make sure mount directory exists
  documentation/bpf: Document cgroup unix socket address hooks
  bpftool: Add support for cgroup unix socket address hooks
  libbpf: Add support for cgroup unix socket address hooks
  bpf: Implement cgroup sockaddr hooks for unix sockets
  bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf
  bpf: Propagate modified uaddrlen from cgroup sockaddr programs
  ...
====================

Link: https://lore.kernel.org/r/20231016204803.30153-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16 21:05:33 -07:00
Linus Torvalds 86d6a628a2 ARM:
- Fix the handling of the phycal timer offset when FEAT_ECV
   and CNTPOFF_EL2 are implemented.
 
 - Restore the functionnality of Permission Indirection that
   was broken by the Fine Grained Trapping rework
 
 - Cleanup some PMU event sharing code
 
 MIPS:
 
 - Fix W=1 build.
 
 s390:
 
 - One small fix for gisa to avoid stalls.
 
 x86:
 
 - Truncate writes to PMU counters to the counter's width to avoid spurious
   overflows when emulating counter events in software.
 
 - Set the LVTPC entry mask bit when handling a PMI (to match Intel-defined
   architectural behavior).
 
 - Treat KVM_REQ_PMI as a wake event instead of queueing host IRQ work to
   kick the guest out of emulated halt.
 
 - Fix for loading XSAVE state from an old kernel into a new one.
 
 - Fixes for AMD AVIC
 
 selftests:
 
 - Play nice with %llx when formatting guest printf and assert statements.
 
 - Clean up stale test metadata.
 
 - Zero-initialize structures in memslot perf test to workaround a suspected
   "may be used uninitialized" false positives from GCC.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmUtvXgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOE3gf/Q0Xvi/oU/+iDMuvfCbMZg/nhbrsa
 WmE/zXLrCF0DknppAsWulkhLGL2ceL6X+f37f2vWpBdG9SVDG/vSAg+QQDwsXiKN
 hTJoaybtMMPZM64emPZKeLMrq3A/U32qIMmWMJkoQRyz6dftUhGqZEuy1jw8oomJ
 n9idRDCMkbo+bick4URt0FEuI3Tf6dPIlG7P5hObFTw+nenzzxTjoxWZ432Mgyod
 yqveEke4hcQ+6K1zdAcDNZqT9ZhxeTxAO4yrBAYfnFoPLhUXKDUumkqAQPNOhKTo
 YN+b29kHBm+HvYkHN785FQla/13wjE1aq5TUj5J7NEDv4uRXDefDq2OAeg==
 =b9AY
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix the handling of the phycal timer offset when FEAT_ECV and
     CNTPOFF_EL2 are implemented

   - Restore the functionnality of Permission Indirection that was
     broken by the Fine Grained Trapping rework

   - Cleanup some PMU event sharing code

  MIPS:

   - Fix W=1 build

  s390:

   - One small fix for gisa to avoid stalls

  x86:

   - Truncate writes to PMU counters to the counter's width to avoid
     spurious overflows when emulating counter events in software

   - Set the LVTPC entry mask bit when handling a PMI (to match
     Intel-defined architectural behavior)

   - Treat KVM_REQ_PMI as a wake event instead of queueing host IRQ work
     to kick the guest out of emulated halt

   - Fix for loading XSAVE state from an old kernel into a new one

   - Fixes for AMD AVIC

  selftests:

   - Play nice with %llx when formatting guest printf and assert
     statements

   - Clean up stale test metadata

   - Zero-initialize structures in memslot perf test to workaround a
     suspected 'may be used uninitialized' false positives from GCC"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
  KVM: arm64: timers: Correctly handle TGE flip with CNTPOFF_EL2
  KVM: arm64: POR{E0}_EL1 do not need trap handlers
  KVM: arm64: Add nPIR{E0}_EL1 to HFG traps
  KVM: MIPS: fix -Wunused-but-set-variable warning
  KVM: arm64: pmu: Drop redundant check for non-NULL kvm_pmu_events
  KVM: SVM: Fix build error when using -Werror=unused-but-set-variable
  x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested()
  x86: KVM: SVM: add support for Invalid IPI Vector interception
  x86: KVM: SVM: always update the x2avic msr interception
  KVM: selftests: Force load all supported XSAVE state in state test
  KVM: selftests: Load XSAVE state into untouched vCPU during state test
  KVM: selftests: Touch relevant XSAVE state in guest for state test
  KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
  x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
  KVM: selftests: Zero-initialize entire test_result in memslot perf test
  KVM: selftests: Remove obsolete and incorrect test case metadata
  KVM: selftests: Treat %llx like %lx when formatting guest printf
  KVM: x86/pmu: Synthesize at most one PMI per VM-exit
  KVM: x86: Mask LVTPC when handling a PMI
  KVM: x86/pmu: Truncate counter value to allowed width on write
  ...
2023-10-16 18:34:17 -07:00
Stefan Roesch 0374af1da0 mm/ksm: test case for prctl fork/exec workflow
This adds a new test case to the ksm functional tests to make sure that
the KSM setting is inherited by the child process when doing a fork/exec.

Link: https://lkml.kernel.org/r/20230922211141.320789-3-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Carl Klemm <carl@uvos.xyz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-16 15:44:38 -07:00
Swapnil Sapkal 0996e67423 selftests/amd-pstate: Added option to provide perf binary path
In selftests/amd-pstate, distro `perf` is used to capture `perf stat`
while running microbenchmarks. Distro `perf` is not working with
upstream kernel. Fix this by providing an option to give the perf
binary path.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-16 13:06:58 -06:00
Swapnil Sapkal 27aabb2c43 selftests/amd-pstate: Fix broken paths to run workloads in amd-pstate-ut
In selftests/amd-pstate, tbench and gitsource microbenchmarks are
used to compare the performance with different governors. In current
implementation the relative path to run `amd_pstate_tracer.py` is broken.
Fix this by using absolute paths.

Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-16 13:06:50 -06:00
Nicolin Chen 266dcae34d iommufd/selftest: Rework TEST_LENGTH to test min_size explicitly
TEST_LENGTH passing ".size = sizeof(struct _struct) - 1" expects -EINVAL
from "if (ucmd.user_size < op->min_size)" check in iommufd_fops_ioctl().
This has been working when min_size is exactly the size of the structure.

However, if the size of the structure becomes larger than min_size, i.e.
the passing size above is larger than min_size, that min_size sanity no
longer works.

Since the first test in TEST_LENGTH() was to test that min_size sanity
routine, rework it to support a min_size calculation, rather than using
the full size of the structure.

Link: https://lore.kernel.org/r/20231015074648.24185-1-nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-16 11:05:51 -03:00
Andrii Nakryiko 72f8a1de4a bpf: Disambiguate SCALAR register state output in verifier logs
Currently the way that verifier prints SCALAR_VALUE register state (and
PTR_TO_PACKET, which can have var_off and ranges info as well) is very
ambiguous.

In the name of brevity we are trying to eliminate "unnecessary" output
of umin/umax, smin/smax, u32_min/u32_max, and s32_min/s32_max values, if
possible. Current rules are that if any of those have their default
value (which for mins is the minimal value of its respective types: 0,
S32_MIN, or S64_MIN, while for maxs it's U32_MAX, S32_MAX, S64_MAX, or
U64_MAX) *OR* if there is another min/max value that as matching value.
E.g., if smin=100 and umin=100, we'll emit only umin=10, omitting smin
altogether. This approach has a few problems, being both ambiguous and
sort-of incorrect in some cases.

Ambiguity is due to missing value could be either default value or value
of umin/umax or smin/smax. This is especially confusing when we mix
signed and unsigned ranges. Quite often, umin=0 and smin=0, and so we'll
have only `umin=0` leaving anyone reading verifier log to guess whether
smin is actually 0 or it's actually -9223372036854775808 (S64_MIN). And
often times it's important to know, especially when debugging tricky
issues.

"Sort-of incorrectness" comes from mixing negative and positive values.
E.g., if umin is some large positive number, it can be equal to smin
which is, interpreted as signed value, is actually some negative value.
Currently, that smin will be omitted and only umin will be emitted with
a large positive value, giving an impression that smin is also positive.

Anyway, ambiguity is the biggest issue making it impossible to have an
exact understanding of register state, preventing any sort of automated
testing of verifier state based on verifier log. This patch is
attempting to rectify the situation by removing ambiguity, while
minimizing the verboseness of register state output.

The rules are straightforward:
  - if some of the values are missing, then it definitely has a default
  value. I.e., `umin=0` means that umin is zero, but smin is actually
  S64_MIN;
  - all the various boundaries that happen to have the same value are
  emitted in one equality separated sequence. E.g., if umin and smin are
  both 100, we'll emit `smin=umin=100`, making this explicit;
  - we do not mix negative and positive values together, and even if
  they happen to have the same bit-level value, they will be emitted
  separately with proper sign. I.e., if both umax and smax happen to be
  0xffffffffffffffff, we'll emit them both separately as
  `smax=-1,umax=18446744073709551615`;
  - in the name of a bit more uniformity and consistency,
  {u32,s32}_{min,max} are renamed to {s,u}{min,max}32, which seems to
  improve readability.

The above means that in case of all 4 ranges being, say, [50, 100] range,
we'd previously see hugely ambiguous:

    R1=scalar(umin=50,umax=100)

Now, we'll be more explicit:

    R1=scalar(smin=umin=smin32=umin32=50,smax=umax=smax32=umax32=100)

This is slightly more verbose, but distinct from the case when we don't
know anything about signed boundaries and 32-bit boundaries, which under
new rules will match the old case:

    R1=scalar(umin=50,umax=100)

Also, in the name of simplicity of implementation and consistency, order
for {s,u}32_{min,max} are emitted *before* var_off. Previously they were
emitted afterwards, for unclear reasons.

This patch also includes a few fixes to selftests that expect exact
register state to accommodate slight changes to verifier format. You can
see that the changes are pretty minimal in common cases.

Note, the special case when SCALAR_VALUE register is a known constant
isn't changed, we'll emit constant value once, interpreted as signed
value.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20231011223728.3188086-5-andrii@kernel.org
2023-10-16 13:49:18 +02:00
Andrii Nakryiko cde7851428 selftests/bpf: Make align selftests more robust
Align subtest is very specific and finicky about expected verifier log
output and format. This is often completely unnecessary as in a bunch of
situations test actually cares about var_off part of register state. But
given how exact it is right now, any tiny verifier log changes can lead
to align tests failures, requiring constant adjustment.

This patch tries to make this a bit more robust by making logic first
search for specified register and then allowing to match only portion of
register state, not everything exactly. This will come handly with
follow up changes to SCALAR register output disambiguation.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20231011223728.3188086-4-andrii@kernel.org
2023-10-16 13:49:18 +02:00
Andrii Nakryiko 08a7078fea selftests/bpf: Improve missed_kprobe_recursion test robustness
Given missed_kprobe_recursion is non-serial and uses common testing
kfuncs to count number of recursion misses it's possible that some other
parallel test can trigger extraneous recursion misses. So we can't
expect exactly 1 miss. Relax conditions and expect at least one.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20231011223728.3188086-3-andrii@kernel.org
2023-10-16 13:49:18 +02:00
Andrii Nakryiko 2d78928c9c selftests/bpf: Improve percpu_alloc test robustness
Make these non-serial tests filter BPF programs by intended PID of
a test runner process. This makes it isolated from other parallel tests
that might interfere accidentally.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20231011223728.3188086-2-andrii@kernel.org
2023-10-16 13:49:18 +02:00
Binbin Wu 2906063341 selftests/x86/lam: Zero out buffer for readlink()
Zero out the buffer for readlink() since readlink() does not append a
terminating null byte to the buffer.  Also change the buffer length
passed to readlink() to 'PATH_MAX - 1' to ensure the resulting string
is always null terminated.

Fixes: 833c12ce0f ("selftests/x86/lam: Add inherit test cases for linear-address masking")
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20231016062446.695-1-binbin.wu@linux.intel.com
2023-10-16 11:39:57 +02:00
zhujun2 3c4fe89878 selftests: net: remove unused variables
These variables are never referenced in the code, just remove them

Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16 09:20:08 +01:00
Xabier Marquiegui 26285e689c ptp: add testptp mask test
Add option to test timestamp event queue mask manipulation in testptp.

Option -F allows the user to specify a single channel that will be
applied on the mask filter via IOCTL.

The test program will maintain the file open until user input is
received.

This allows checking the effect of the IOCTL in debugfs.

eg:

Console 1:
```
Channel 12 exclusively enabled. Check on debugfs.
Press any key to continue
```

Console 2:
```
0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000
```

Signed-off-by: Xabier Marquiegui <reibax@gmail.com>
Suggested-by: Richard Cochran <richardcochran@gmail.com>
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 20:07:52 +01:00
Xabier Marquiegui 403376ddb4 ptp: add debugfs interface to see applied channel masks
Use debugfs to be able to view channel mask applied to every timestamp
event queue.

Every time the device is opened, a new entry is created in
`$DEBUGFS_MOUNTPOINT/ptpN/$INSTANCE_ADDRESS/mask`.

The mask value can be viewed grouped in 32bit decimal values using cat,
or converted to hexadecimal with the included `ptpchmaskfmt.sh` script.
32 bit values are listed from least significant to most significant.

Signed-off-by: Xabier Marquiegui <reibax@gmail.com>
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 20:07:52 +01:00
Aaron Conole 8eff0e0622 selftests: openvswitch: Fix the ct_tuple for v4
The ct_tuple v4 data structure decode / encode routines were using
the v6 IP address decode and relying on default encode. This could
cause exceptions during encode / decode depending on how a ct4
tuple would appear in a netlink message.

Caught during code review.

Fixes: e52b07aa1a ("selftests: openvswitch: add flow dump support")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 20:02:51 +01:00
Aaron Conole 76035fd12c selftests: openvswitch: Skip drop testing on older kernels
Kernels that don't have support for openvswitch drop reasons also
won't have the drop counter reasons, so we should skip the test
completely.  It previously wasn't possible to build a test case
for this without polluting the datapath, so we introduce a mechanism
to clear all the flows from a datapath allowing us to test for
explicit drop actions, and then clear the flows to build the
original test case.

Fixes: 4242029164 ("selftests: openvswitch: add explicit drop testcase")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 20:02:51 +01:00
Aaron Conole af846afad5 selftests: openvswitch: Catch cases where the tests are killed
In case of fatal signal, or early abort at least cleanup the current
test case.

Fixes: 25f16c873f ("selftests: add openvswitch selftest suite")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 20:02:51 +01:00
Aaron Conole 92e37f20f2 selftests: openvswitch: Add version check for pyroute2
Paolo Abeni reports that on some systems the pyroute2 version isn't
new enough to run the test suite.  Ensure that we support a minimum
version of 0.6 for all cases (which does include the existing ones).
The 0.6.1 version was released in May of 2021, so should be
propagated to most installations at this point.

The alternative that Paolo proposed was to only skip when the
add-flow is being run.  This would be okay for most cases, except
if a future test case is added that needs to do flow dump without
an associated add (just guessing).  In that case, it could also be
broken and we would need additional skip logic anyway.  Just draw
a line in the sand now.

Fixes: 25f16c873f ("selftests: add openvswitch selftest suite")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Closes: https://lore.kernel.org/lkml/8470c431e0930d2ea204a9363a60937289b7fdbe.camel@redhat.com/
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 20:02:51 +01:00
Paolo Bonzini 2b3f2325e7 KVM selftests fixes for 6.6:
- Play nice with %llx when formatting guest printf and assert statements.
 
  - Clean up stale test metadata.
 
  - Zero-initialize structures in memslot perf test to workaround a suspected
    "may be used uninitialized" false positives from GCC.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmUp1RISHHNlYW5qY0Bn
 b29nbGUuY29tAAoJEGCRIgFNDBL5I+wP/1hpxu+wC29SxT6PxLcMFCuIy+XV+jSV
 mnFRKhu4Np4E2Th8jFs94qnVN/gdPSNF+kbY//4kqfenf6V6VpNj4rUER2xZY+HV
 0Ben08AzuoDdzn9UikG59GJeyXuZZDqqOdjM2xgefXDqyRYvFhwQleVCULzoGBeB
 iwUD7g2MOJfFjqUhJAV8HDQTzhMgDIc9J+d+oO6J8FRXd64jQ4CBpaM1FDyjmpFp
 RiKMRPkOuVkTbiScCGGwrMdpOXpT1tY5DcIP0N7O6EafyohItQ21hZWth5/1DmIS
 wL1rMTad7uRcdQUdTDLUTegcFRQUjtUo5NS5l92MNFWGlW+J+C5GNVR8znMQCGq1
 ZgovBR72FpFHaup2P2G4mxE9A0M7kiyhLqJFyaM6dvQQTyYSw0gPxKEjUmdljru1
 Gf4SVZPdTAe543xpFJy3ANB4lWsc5uYzWxbIuNNSNl1p6l2em+wvvDroMJujVYU5
 GnKxEtuwEnAUN094a8MMvWR6Q1m1Bfi5brFaPYmUveoZjGb7z8LxebQ4Tg3L08Ga
 rH2Ri/Dt5mF/Pr5hje7eWDOYyDopPIcUWzRpIVA0/ORScqMXOV/oqEVufA+wwlSj
 k0vcRkieRa2cxsKV+F00BSxKXEtdH51k3TEZA7dKFIKDuQk9q46+g5JXr4DF29dl
 eXVDD03hHAFa
 =LJ3q
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-selftests-6.6-fixes' of https://github.com/kvm-x86/linux into HEAD

KVM selftests fixes for 6.6:

 - Play nice with %llx when formatting guest printf and assert statements.

 - Clean up stale test metadata.

 - Zero-initialize structures in memslot perf test to workaround a suspected
   "may be used uninitialized" false positives from GCC.
2023-10-15 08:25:18 -04:00
Arseniy Krasnov 8d211285c6 test/vsock: io_uring rx/tx tests
This adds set of tests which use io_uring for rx/tx. This test suite is
implemented as separated util like 'vsock_test' and has the same set of
input arguments as 'vsock_test'. These tests only cover cases of data
transmission (no connect/bind/accept etc).

Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 13:19:43 +01:00
Arseniy Krasnov e846d679ad test/vsock: MSG_ZEROCOPY support for vsock_perf
To use this option pass '--zerocopy' parameter:

./vsock_perf --zerocopy --sender <cid> ...

With this option MSG_ZEROCOPY flag will be passed to the 'send()' call.

Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 13:19:42 +01:00
Arseniy Krasnov bc36442ef3 test/vsock: MSG_ZEROCOPY flag tests
This adds three tests for MSG_ZEROCOPY feature:
1) SOCK_STREAM tx with different buffers.
2) SOCK_SEQPACKET tx with different buffers.
3) SOCK_STREAM test to read empty error queue of the socket.

Patch also works as preparation for the next patches for tools in this
patchset: vsock_perf and vsock_uring_test:
1) Adds several new functions to util.c - they will be also used by
   vsock_uring_test.
2) Adds two new functions for MSG_ZEROCOPY handling to a new source
   file - such source will be shared between vsock_test, vsock_perf and
   vsock_uring_test, thus avoiding code copy-pasting.

Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 13:19:42 +01:00
Dave Marchevsky e0e1a7a5fc selftests/bpf: Add tests for open-coded task_vma iter
The open-coded task_vma iter added earlier in this series allows for
natural iteration over a task's vmas using existing open-coded iter
infrastructure, specifically bpf_for_each.

This patch adds a test demonstrating this pattern and validating
correctness. The vma->vm_start and vma->vm_end addresses of the first
1000 vmas are recorded and compared to /proc/PID/maps output. As
expected, both see the same vmas and addresses - with the exception of
the [vsyscall] vma - which is explained in a comment in the prog_tests
program.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231013204426.1074286-5-davemarchevsky@fb.com
2023-10-13 15:48:58 -07:00
Dave Marchevsky 45b38941c8 selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
Further patches in this series will add a struct bpf_iter_task_vma,
which will result in a name collision with the selftest prog renamed in
this patch. Rename the selftest to avoid the collision.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231013204426.1074286-3-davemarchevsky@fb.com
2023-10-13 15:48:58 -07:00
Ido Schimmel aa13e5241a selftests: fib_tests: Count all trace point invocations
The tests rely on the IPv{4,6} FIB trace points being triggered once for
each forwarded packet. If receive processing is deferred to the
ksoftirqd task these invocations will not be counted and the tests will
fail. Fix by specifying the '-a' flag to avoid perf from filtering on
the mausezahn task.

Before:

 # ./fib_tests.sh -t ipv4_mpath_list

 IPv4 multipath list receive tests
     TEST: Multipath route hit ratio (.68)                               [FAIL]

 # ./fib_tests.sh -t ipv6_mpath_list

 IPv6 multipath list receive tests
     TEST: Multipath route hit ratio (.27)                               [FAIL]

After:

 # ./fib_tests.sh -t ipv4_mpath_list

 IPv4 multipath list receive tests
     TEST: Multipath route hit ratio (1.00)                              [ OK ]

 # ./fib_tests.sh -t ipv6_mpath_list

 IPv6 multipath list receive tests
     TEST: Multipath route hit ratio (.99)                               [ OK ]

Fixes: 8ae9efb859 ("selftests: fib_tests: Add multipath list receive tests")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/netdev/202309191658.c00d8b8-oliver.sang@intel.com/
Tested-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Tested-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Link: https://lore.kernel.org/r/20231010132113.3014691-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 14:28:22 -07:00
Ido Schimmel dbb13378ba selftests: fib_tests: Disable RP filter in multipath list receive test
The test relies on the fib:fib_table_lookup trace point being triggered
once for each forwarded packet. If RP filter is not disabled, the trace
point will be triggered twice for each packet (for source validation and
forwarding), potentially masking actual bugs. Fix by explicitly
disabling RP filter.

Before:

 # ./fib_tests.sh -t ipv4_mpath_list

 IPv4 multipath list receive tests
     TEST: Multipath route hit ratio (1.99)                              [ OK ]

After:

 # ./fib_tests.sh -t ipv4_mpath_list

 IPv4 multipath list receive tests
     TEST: Multipath route hit ratio (.99)                               [ OK ]

Fixes: 8ae9efb859 ("selftests: fib_tests: Add multipath list receive tests")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/netdev/202309191658.c00d8b8-oliver.sang@intel.com/
Tested-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Tested-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Link: https://lore.kernel.org/r/20231010132113.3014691-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 14:28:22 -07:00
Maciej Wieczor-Retman 508934b5d1 selftests/resctrl: Move run_benchmark() to a more fitting file
resctrlfs.c contains mostly functions that interact in some way with
resctrl FS entries while functions inside resctrl_val.c deal with
measurements and benchmarking.

run_benchmark() is located in resctrlfs.c even though it's purpose
is not interacting with the resctrl FS but to execute cache checking
logic.

Move run_benchmark() to resctrl_val.c just before resctrl_val() that
makes use of run_benchmark(). Make run_benchmark() static since it's
not used between multiple files anymore.

Remove return comment from kernel-doc since the function is type void.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:59:06 -06:00
Maciej Wieczor-Retman 20d96b25cc selftests/resctrl: Fix schemata write error check
Writing bitmasks to the schemata can fail when the bitmask doesn't
adhere to constraints defined by what a particular CPU supports.
Some example of constraints are max length or having contiguous bits.
The driver should properly return errors when any rule concerning
bitmask format is broken.

Resctrl FS returns error codes from fprintf() only when fclose() is
called. Current error checking scheme allows invalid bitmasks to be
written into schemata file and the selftest doesn't notice because the
fclose() error code isn't checked.

Substitute fopen(), flose() and fprintf() with open(), close() and
write() to avoid error code buffering between fprintf() and fclose().

Remove newline character from the schema string after writing it to
the schemata file so it prints correctly before function return.

Pass the string generated with strerror() to the "reason" buffer so
the error message is more verbose. Extend "reason" buffer so it can hold
longer messages.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:59:01 -06:00
Ilpo Järvinen ef43c30858 selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
The initial value of 5% chosen for the maximum allowed percentage
difference between resctrl mbm value and IMC mbm value in

commit 06bd03a57f ("selftests/resctrl: Fix MBA/MBM results reporting
       format") was "randomly chosen value" (as admitted by the changelog).

When running tests in our lab across a large number platforms, 5%
difference upper bound for success seems a bit on the low side for the
MBA and MBM tests. Some platforms produce outliers that are slightly
above that, typically 6-7%, which leads MBA/MBM test frequently
failing.

Replace the "randomly chosen value" with a success bound that is based
on those measurements across large number of platforms by relaxing the
MBA/MBM success bound to 8%. The relaxed bound removes the failures due
the frequent outliers.

Fixed commit description style error during merge:
Shuah Khan <skhan@linuxfoundation.org>

Fixes: 06bd03a57f ("selftests/resctrl: Fix MBA/MBM results reporting format")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:54:38 -06:00
Ilpo Järvinen 06035f0194 selftests/resctrl: Fix feature checks
The MBA and CMT tests expect support of other features to be able to
run.

When platform only supports MBA but not MBM, MBA test will fail with:
Failed to open total bw file: No such file or directory

When platform only supports CMT but not CAT, CMT test will fail with:
Failed to open bit mask file '/sys/fs/resctrl/info/L3/cbm_mask': No such file or directory

It leads to the test reporting test fail (even if no test was run at
all).

Extend feature checks to cover these two conditions to show these tests
were skipped rather than failed.

Fixes: ee0415681e ("selftests/resctrl: Use resctrl/info for feature detection")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> # selftests/resctrl: Refactor feature check to use resource and feature name
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:54:34 -06:00
Ilpo Järvinen d56e5da0e0 selftests/resctrl: Refactor feature check to use resource and feature name
Feature check in validate_resctrl_feature_request() takes in the test
name string and maps that to what to check per test.

Pass resource and feature names to validate_resctrl_feature_request()
directly rather than deriving them from the test name inside the
function which makes the feature check easier to extend for new test
cases.

Use !! in the return statement to make the boolean conversion more
obvious even if it is not strictly necessary from correctness point of
view (to avoid it looking like the function is returning a freed
pointer).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> # selftests/resctrl: Remove duplicate feature check from CMT test
Cc: <stable@vger.kernel.org> # selftests/resctrl: Move _GNU_SOURCE define into Makefile
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:54:27 -06:00
Ilpo Järvinen 3a1e4a91aa selftests/resctrl: Move _GNU_SOURCE define into Makefile
_GNU_SOURCE is defined in resctrl.h. Defining _GNU_SOURCE has a large
impact on what gets defined when including headers either before or
after it. This can result in compile failures if .c file decides to
include a standard header file before resctrl.h.

It is safer to define _GNU_SOURCE in Makefile so it is always defined
regardless of in which order includes are done.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:54:21 -06:00
Ilpo Järvinen 030b48fb2c selftests/resctrl: Remove duplicate feature check from CMT test
The test runner run_cmt_test() in resctrl_tests.c checks for CMT
feature and does not run cmt_resctrl_val() if CMT is not supported.
Then cmt_resctrl_val() also check is CMT is supported.

Remove the duplicated feature check for CMT from cmt_resctrl_val().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:54:16 -06:00
Ilpo Järvinen 3aff514644 selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
Unmounting resctrl FS has been moved into the per test functions in
resctrl_tests.c by commit caddc0fbe4 ("selftests/resctrl: Move
resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
SIGTERM, or SIGHUP) is received, the running selftest is aborted by
ctrlc_handler() which then unmounts resctrl fs before exiting. The
current section between signal_handler_register() and
signal_handler_unregister(), however, does not cover the entire
duration when resctrl FS is mounted.

Move signal_handler_register() and signal_handler_unregister() calls
from per test files into resctrl_tests.c to properly unmount resctrl
fs. In order to not add signal_handler_register()/unregister() n times,
create helpers test_prepare() and test_cleanup().

Do not call ksft_exit_fail_msg() in test_prepare() but only in the per
test function to keep the control flow cleaner without adding calls to
exit() deep into the call chain.

Adjust child process kill() call in ctrlc_handler() to only be invoked
if the child was already forked.

Fixes: caddc0fbe4 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:54:09 -06:00
Ilpo Järvinen beb7f47184 selftests/resctrl: Fix uninitialized .sa_flags
signal_handler_unregister() calls sigaction() with uninitializing
sa_flags in the struct sigaction.

Make sure sa_flags is always initialized in signal_handler_unregister()
by initializing the struct sigaction when declaring it. Also add the
initialization to signal_handler_register() even if there are no know
bugs in there because correctness is then obvious from the code itself.

Fixes: 73c55fa5ab ("selftests/resctrl: Commonize the signal handler register/unregister for all tests")
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:54:04 -06:00
Ilpo Järvinen f23c7925e9 selftests/resctrl: Cleanup benchmark argument parsing
Benchmark argument is handled by custom argument parsing code which is
more complicated than it needs to be.

Process benchmark argument within the normal getopt() handling and drop
unnecessary ben_ind and has_ben variables. When -b is given, terminate
the argument processing as -b consumes all remaining arguments.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:29:08 -06:00
Ilpo Järvinen 149ff72953 selftests/resctrl: Remove ben_count variable
ben_count is only used to write the terminator for the list. It is
enough to use i from the loop so no need for another variable.

Remove ben_count variable as it is not needed.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:29:02 -06:00
Ilpo Järvinen e33cb5702a selftests/resctrl: Make benchmark command const and build it with pointers
Benchmark command is used in multiple tests so it should not be
mutated by the tests but CMT test alters span argument. Due to the
order of tests (CMT test runs last), mutating the span argument in CMT
test does not trigger any real problems currently.

Mark benchmark_cmd strings as const and setup the benchmark command
using pointers. Because the benchmark command becomes const, the input
arguments can be used directly. Besides being simpler, using the input
arguments directly also removes the internal size restriction.

CMT test has to create a copy of the benchmark command before altering
the benchmark command.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:28:55 -06:00
Ilpo Järvinen 47809eb70c selftests/resctrl: Reorder resctrl FS prep code and benchmark_cmd init
Benchmark command is initialized before resctrl FS check and
preparation code that can call ksft_exit_skip(). There is no strong
reason why the resctrl FS support check and unmounting it (if already
mounted), has to be done after the benchmark command initialization.

Move benchmark command initialization such that it is done not until
right before the tests commence. This simplifies rollback handling when
benchmark command initialization starts to use dynamic allocation (in a
change following this).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:28:49 -06:00
Ilpo Järvinen b1a901e078 selftests/resctrl: Simplify span lifetime
struct resctrl_val_param contains span member. resctrl_val(), however,
never uses it because the value of span is embedded into the default
benchmark command and parsed from it by run_benchmark().

Remove span from resctrl_val_param. Provide DEFAULT_SPAN for the code
that needs it. CMT and CAT tests communicate span that is different
from the DEFAULT_SPAN between their internal functions which is
converted into passing it directly as a parameter.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:28:44 -06:00
Ilpo Järvinen 47e36f16c7 selftests/resctrl: Remove bw_report and bm_type from main()
bw_report is always set to "reads" and bm_type is set to "fill_buf" but
is never used.

Set bw_report directly to "reads" in MBA/MBM test and remove bm_type.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:28:38 -06:00
Ilpo Järvinen 5eb6360eee selftests/resctrl: Correct benchmark command help
Benchmark command must be the last argument because it consumes all the
remaining arguments but help misleadingly shows it as the first
argument. The benchmark command is also shown in quotes but it does not
match with the code.

Correct -b argument place in the help message and remove the quotes.
Tweak also how the options are presented by using ... notation.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:28:33 -06:00
Ilpo Järvinen 4a28c7665c selftests/resctrl: Ensure the benchmark commands fits to its array
Benchmark command is copied into an array in the stack. The array is
BENCHMARK_ARGS items long but the command line could try to provide a
longer command. Argument size is also fixed by BENCHMARK_ARG_SIZE (63
bytes of space after fitting the terminating \0 character) and user
could have inputted argument longer than that.

Return error in case the benchmark command does not fit to the space
allocated for it.

Fixes: ecdbb911f2 ("selftests/resctrl: Add MBM test")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:28:26 -06:00
Maciej Wieczor-Retman 27c734f440 selftests/resctrl: Fix wrong format specifier
Compiling resctrl selftest after adding a __printf() attribute to
ksft_print_msg() exposes -Wformat warning in show_cache_info().
The format specifier used expects a variable of type int but a long
unsigned int variable is passed instead.

Change the format specifier to match the passed variable.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:08:42 -06:00
Maciej Wieczor-Retman d3772e7bad selftests/mm: Substitute attribute with a macro
Compiling mm selftest after adding a __printf() attribute to
ksft_print_msg() exposes -Wformat warning in remap_region().

Fix the wrong format specifier causing the warning.

The mm selftest uses the printf attribute in its full form. Since the
header file that uses it also includes kselftests.h it can use the macro
defined there.

Use __printf() included with kselftests.h instead of the full attribute.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:08:36 -06:00
Maciej Wieczor-Retman 07bd3c3880 selftests/kvm: Replace attribute with macro
The __printf() macro is used in many tools in the linux kernel to
validate the format specifiers in functions that use printf. The kvm
selftest uses it without putting it in a macro definition while it
also imports the kselftests.h header where the macro attribute is
defined.

Use __printf() from kselftests.h instead of the full attribute.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:08:31 -06:00
Maciej Wieczor-Retman a8cfb03611 selftests/sigaltstack: Fix wrong format specifier
Compiling sigaltstack selftest after adding a __printf() attribute to
ksft_print_msg() exposes -Wformat warning in main().
The format specifier inside ksft_print_msg() expects a long
unsigned int but the passed variable is of unsigned int type.

Fix the format specifier so it matches the passed variable.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:08:26 -06:00
Maciej Wieczor-Retman 4d7f4e8158 selftests/pidfd: Fix ksft print formats
Compiling pidfd selftest after adding a __printf() attribute to
ksft_print_msg() and ksft_test_result_pass() exposes -Wformat warnings
in error_report(), test_pidfd_poll_exec_thread(),
child_poll_exec_test(), test_pidfd_poll_leader_exit_thread(),
child_poll_leader_exit_test().

The ksft_test_result_pass() in error_report() expects a string but
doesn't provide any argument after the format string. All the other
calls to ksft_print_msg() in the functions mentioned above have format
strings that don't match with other passed arguments.

Fix format specifiers so they match the passed variables.

Add a missing variable to ksft_test_result_pass() inside
error_report() so it matches other cases in the switch statement.

Fixes: 2def297ec7 ("pidfd: add tests for NSpid info in fdinfo")

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-13 14:08:21 -06:00