Commit graph

1030415 commits

Author SHA1 Message Date
Mike Rapoport af64237461 mm/secretmem: wire up ->set_page_dirty
Make secretmem up to date with the changes done in commit 0af573780b
("mm: require ->set_page_dirty to be explicitly wired up") so that
an unconditional call to this method won't cause crashes.

Link: https://lkml.kernel.org/r/20210716063933.31633-1-rppt@kernel.org
Fixes: 0af573780b ("mm: require ->set_page_dirty to be explicitly wired up")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Roman Gushchin 593311e85b writeback, cgroup: do not reparent dax inodes
The inode switching code is not suited for dax inodes.  An attempt to
switch a dax inode to a parent writeback structure (as a part of a
writeback cleanup procedure) results in a panic like this:

  run fstests generic/270 at 2021-07-15 05:54:02
  XFS (pmem0p2): EXPERIMENTAL big timestamp feature in use.  Use at your own risk!
  XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
  XFS (pmem0p2): EXPERIMENTAL inode btree counters feature in use. Use at your own risk!
  XFS (pmem0p2): Mounting V5 Filesystem
  XFS (pmem0p2): Ending clean mount
  XFS (pmem0p2): Quotacheck needed: Please wait.
  XFS (pmem0p2): Quotacheck: Done.
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  BUG: unable to handle page fault for address: 0000000005b0f669
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 13 PID: 10479 Comm: kworker/13:16 Not tainted 5.14.0-rc1-master-8096acd7442e+ #8
  Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/13/2016
  Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
  RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85
  RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  Call Trace:
   inode_switch_wbs_work_fn+0xb6/0x2a0
   process_one_work+0x1e6/0x380
   worker_thread+0x53/0x3d0
   kthread+0x10f/0x130
   ret_from_fork+0x22/0x30
  Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit iTCO_wdt irqbypass drm_kms_helper iTCO_vendor_support acpi_ipmi rapl syscopyarea sysfillrect intel_cstate ipmi_si sysimgblt ioatdma dax_pmem_compat fb_sys_fops ipmi_devintf device_dax i2c_i801 pcspkr intel_uncore hpilo nd_pmem cec dax_pmem_core dca i2c_smbus acpi_tad lpc_ich ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel tg3 ghash_clmulni_intel serio_raw hpsa hpwdt scsi_transport_sas wmi dm_mirror dm_region_hash dm_log dm_mod
  CR2: 0000000005b0f669
  ---[ end trace ed2105faff8384f3 ]---
  RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85
  RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: 0x15200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  ---[ end Kernel panic - not syncing: Fatal exception ]---

The crash happens on an attempt to iterate over attached pagecache pages
and check the dirty flag: a dax inode's xarray contains PFNs instead of
generic struct page pointers.

This happens for DAX and not for other kinds of non-page entries in the
inodes because it's a tagged iteration, and shadow/swap entries are
never tagged; only DAX entries get tagged.

Fix the problem by bailing out (with the false return value) of
inode_prepare_wbs_switch() if a dax inode is passed.
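
The guard described above can be pictured with a small userspace model.
This is only a sketch in Python, not kernel code: `Inode`, `is_dax`,
`prepare_switch` and `count_dirty_pages` are hypothetical stand-ins, and
the real check lives in the kernel's C code.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    # Hypothetical stand-in for struct inode; not the kernel structure.
    is_dax: bool = False
    # Model of tagged page cache entries: dict "pages" for normal inodes,
    # raw PFN integers for DAX inodes.
    dirty_entries: list = field(default_factory=list)


def prepare_switch(inode: Inode) -> bool:
    """Return False (do not switch) for DAX inodes, True otherwise."""
    if inode.is_dax:
        # DAX entries are PFNs, not page pointers, so the later walk
        # over tagged entries must never run on them.
        return False
    return True


def count_dirty_pages(inode: Inode) -> int:
    """Walk tagged entries, treating each one as a page object."""
    if not prepare_switch(inode):
        return 0
    return sum(1 for page in inode.dirty_entries if page["dirty"])
```

Without the early return, the walk would index into an integer PFN as if
it were a page, which is the model's analogue of the page fault above.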

[willy@infradead.org: changelog addition]

Link: https://lkml.kernel.org/r/20210719171350.3876830-1-guro@fb.com
Fixes: c22d70a162 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Murphy Zhou <jencce.kernel@gmail.com>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Roman Gushchin b43a9e76b4 writeback, cgroup: remove wb from offline list before releasing refcnt
Boyang reported that the commit c22d70a162 ("writeback, cgroup:
release dying cgwbs by switching attached inodes") causes the kernel to
crash while running xfstests generic/256 on ext4 on aarch64 and ppc64le.

  run fstests generic/256 at 2021-07-12 05:41:40
  EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: . Quota mode: none.
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
  Mem abort info:
     ESR = 0x96000005
     EC = 0x25: DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
     FSC = 0x05: level 1 translation fault
  Data abort info:
     ISV = 0, ISS = 0x00000005
     CM = 0, WnR = 0
  user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
  [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
  Internal error: Oops: 96000005 [#1] SMP
  Modules linked in: dm_flakey dm_snapshot dm_bufio dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2 drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_blk virtio_net net_failover virtio_console failover virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
  CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
  Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  Workqueue: events_unbound cleanup_offline_cgwbs_workfn
  pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
  pc : cleanup_offline_cgwbs_workfn+0x320/0x394
  lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
  sp : ffff80001554fd10
  x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
  x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
  x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
  x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
  x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
  x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
  x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
  x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
  x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
  Call trace:
     cleanup_offline_cgwbs_workfn+0x320/0x394
     process_one_work+0x1f4/0x4b0
     worker_thread+0x184/0x540
     kthread+0x114/0x120
     ret_from_fork+0x10/0x18
  Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
  ---[ end trace e250fe289272792a ]---
  Kernel panic - not syncing: Oops: Fatal exception
  SMP: stopping secondary CPUs
  SMP: failed to stop secondary CPUs 0-2
  Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
  PHYS_OFFSET: 0xfff0defca0000000
  CPU features: 0x00200251,23200840
  Memory Limit: none
  ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

The problem happens when cgwb_release_workfn() races with
cleanup_offline_cgwbs_workfn(): wb_tryget() in
cleanup_offline_cgwbs_workfn() can be called after percpu_ref_exit() in
cgwb_release_workfn(), which is basically a use-after-free error.

Fix the problem by removing the writeback structure from the
offline list before releasing the percpu reference counter.  It will
guarantee that cleanup_offline_cgwbs_workfn() will not see and not
access writeback structures which are about to be released.
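
The ordering can be illustrated with a minimal Python model (a sketch
under stated assumptions, not the kernel implementation). `Writeback`,
`offline_list`, `release` and `cleanup_offline` are hypothetical names;
a boolean stands in for the percpu reference counter, and touching it
after teardown models the use-after-free.

```python
import threading


class Writeback:
    """Hypothetical stand-in for a cgroup writeback structure."""

    def __init__(self):
        self.refcount_alive = True  # models a live percpu reference counter

    def tryget(self) -> bool:
        # Touching a torn-down counter models the use-after-free.
        if not self.refcount_alive:
            raise RuntimeError("use-after-free: refcount already torn down")
        return True


offline_list = []
offline_lock = threading.Lock()


def release(wb: Writeback) -> None:
    # Fixed ordering: unlink from the offline list first...
    with offline_lock:
        if wb in offline_list:
            offline_list.remove(wb)
    # ...and only then tear down the reference counter.
    wb.refcount_alive = False


def cleanup_offline() -> int:
    """Visit every wb still on the offline list; return how many were visited."""
    visited = 0
    with offline_lock:
        for wb in list(offline_list):
            if wb.tryget():
                visited += 1
    return visited
```

Because the unlink happens under the same lock the cleanup walk takes,
the walk can no longer reach a structure whose counter is gone.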

Link: https://lkml.kernel.org/r/20210716201039.3762203-1-guro@fb.com
Fixes: c22d70a162 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Boyang Xue <bxue@redhat.com>
Suggested-by: Jan Kara <jack@suse.cz>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Mike Rapoport 79e482e9c3 memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions
Commit b10d6bca87 ("arch, drivers: replace for_each_membock() with
for_each_mem_range()") didn't take into account that when there is
movable_node parameter in the kernel command line, for_each_mem_range()
would skip ranges marked with MEMBLOCK_HOTPLUG.

The page table setup code in POWER uses for_each_mem_range() to create
the linear mapping of the physical memory and since the regions marked
as MEMBLOCK_HOTPLUG are skipped, they never make it to the linear map.

A later access to the memory in those ranges will fail:

  BUG: Unable to handle kernel data access on write at 0xc000000400000000
  Faulting instruction address: 0xc00000000008a3c0
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 0 PID: 53 Comm: kworker/u2:0 Not tainted 5.13.0 #7
  NIP:  c00000000008a3c0 LR: c0000000003c1ed8 CTR: 0000000000000040
  REGS: c000000008a57770 TRAP: 0300   Not tainted  (5.13.0)
  MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 84222202  XER: 20040000
  CFAR: c0000000003c1ed4 DAR: c000000400000000 DSISR: 42000000 IRQMASK: 0
  GPR00: c0000000003c1ed8 c000000008a57a10 c0000000019da700 c000000400000000
  GPR04: 0000000000000280 0000000000000180 0000000000000400 0000000000000200
  GPR08: 0000000000000100 0000000000000080 0000000000000040 0000000000000300
  GPR12: 0000000000000380 c000000001bc0000 c0000000001660c8 c000000006337e00
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000040000000 0000000020000000 c000000001a81990 c000000008c30000
  GPR24: c000000008c20000 c000000001a81998 000fffffffff0000 c000000001a819a0
  GPR28: c000000001a81908 c00c000001000000 c000000008c40000 c000000008a64680
  NIP clear_user_page+0x50/0x80
  LR __handle_mm_fault+0xc88/0x1910
  Call Trace:
    __handle_mm_fault+0xc44/0x1910 (unreliable)
    handle_mm_fault+0x130/0x2a0
    __get_user_pages+0x248/0x610
    __get_user_pages_remote+0x12c/0x3e0
    get_arg_page+0x54/0xf0
    copy_string_kernel+0x11c/0x210
    kernel_execve+0x16c/0x220
    call_usermodehelper_exec_async+0x1b0/0x2f0
    ret_from_kernel_thread+0x5c/0x70
  Instruction dump:
  79280fa4 79271764 79261f24 794ae8e2 7ca94214 7d683a14 7c893a14 7d893050
  7d4903a6 60000000 60000000 60000000 <7c001fec> 7c091fec 7c081fec 7c051fec
  ---[ end trace 490b8c67e6075e09 ]---

Making for_each_mem_range() include MEMBLOCK_HOTPLUG regions in the
traversal fixes this issue.
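
As a rough illustration of the difference, here is a Python model of
the traversal (a sketch only). `Region`, the `include_hotplug` switch
and the flag values are assumptions made for this example; the real
iterator is a C macro over the memblock region arrays.

```python
from dataclasses import dataclass

MEMBLOCK_NONE = 0x0
MEMBLOCK_HOTPLUG = 0x1  # illustrative flag value for this model


@dataclass
class Region:
    base: int
    size: int
    flags: int = MEMBLOCK_NONE


def for_each_mem_range(regions, movable_node=False, include_hotplug=True):
    """Yield (start, end) for each region the traversal visits.

    include_hotplug=False models the behaviour described as broken:
    with movable_node set, hotpluggable regions are skipped.
    """
    for r in regions:
        hotplug = bool(r.flags & MEMBLOCK_HOTPLUG)
        if movable_node and hotplug and not include_hotplug:
            continue
        yield (r.base, r.base + r.size)
```

A caller that builds a linear mapping from the yielded ranges would miss
the second region below when `include_hotplug` is False:

    regions = [Region(0x0, 0x400000000),
               Region(0x400000000, 0x10000000, MEMBLOCK_HOTPLUG)]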

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976100
Link: https://lkml.kernel.org/r/20210712071132.20902-1-rppt@kernel.org
Fixes: b10d6bca87 ("arch, drivers: replace for_each_membock() with for_each_mem_range()")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>	[5.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Sergei Trofimovich 69e5d322a2 mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction
To reproduce the failure we need the following system:

 - kernel command: page_poison=1 init_on_free=0 init_on_alloc=0

 - kernel config:
    * CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
    * CONFIG_INIT_ON_FREE_DEFAULT_ON=y
    * CONFIG_PAGE_POISONING=y

Resulting in:

    0000000085629bdd: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0000000022861832: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000000c597f5b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    CPU: 11 PID: 15195 Comm: bash Kdump: loaded Tainted: G     U     O      5.13.1-gentoo-x86_64 #1
    Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2801 01/13/2021
    Call Trace:
     dump_stack+0x64/0x7c
     __kernel_unpoison_pages.cold+0x48/0x84
     post_alloc_hook+0x60/0xa0
     get_page_from_freelist+0xdb8/0x1000
     __alloc_pages+0x163/0x2b0
     __get_free_pages+0xc/0x30
     pgd_alloc+0x2e/0x1a0
     mm_init+0x185/0x270
     dup_mm+0x6b/0x4f0
     copy_process+0x190d/0x1b10
     kernel_clone+0xba/0x3b0
     __do_sys_clone+0x8f/0xb0
     do_syscall_64+0x68/0x80
     entry_SYSCALL_64_after_hwframe+0x44/0xae

Before commit 51cba1ebc6 ("init_on_alloc: Optimize static branches"),
init_on_alloc never enabled the static branch by default.  It could only
be enabled explicitly by init_mem_debugging_and_hardening().

But after commit 51cba1ebc6, a static branch could already be enabled
by default.  There was no code to ever disable it.  That caused the
page_poison=1 / init_on_free=1 conflict.

This change extends init_mem_debugging_and_hardening() to also disable
the static branch.
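
The difference between "enable only" and "enable or disable" can be
modelled in a few lines of Python. This is a hedged sketch: the function
names and parameters are hypothetical, and a plain boolean stands in for
the static branch.

```python
def branch_state_enable_only(default_on: bool, requested: bool) -> bool:
    """Model of the old behaviour: the branch can be turned on, never off."""
    branch = default_on          # compile-time default
    if requested:
        branch = True            # explicit enable
    return branch                # a default-on branch stays on


def branch_state_enable_or_disable(default_on: bool, requested: bool) -> bool:
    """Model of the extended behaviour: the final state always follows
    the resolved request, whatever the compile-time default was."""
    branch = default_on
    if requested:
        branch = True
    else:
        branch = False           # the explicit disable that was missing
    return branch
```

With a default-on build and `requested=False` (for example because page
poisoning takes precedence), only the second function ends up with the
branch disabled.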

Link: https://lkml.kernel.org/r/20210714031935.4094114-1-keescook@chromium.org
Link: https://lore.kernel.org/r/20210712215816.1512739-1-slyfox@gentoo.org
Fixes: 51cba1ebc6 ("init_on_alloc: Optimize static branches")
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Reported-by: Mikhail Morfikov <mmorfikov@gmail.com>
Reported-by: <bowsingbetee@pm.me>
Tested-by: <bowsingbetee@protonmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Christoph Hellwig d9a42b53bd mm: use kmap_local_page in memzero_page
The commit introducing the global memzero_page explicitly mentions
switching to kmap_local_page in its commit log, but doesn't actually do
that.

Link: https://lkml.kernel.org/r/20210713055231.137602-3-hch@lst.de
Fixes: 28961998f8 ("iov_iter: lift memzero_page() to highmem.h")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Christoph Hellwig 8dad53a11f mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
memcpy_to_page and memzero_page can write to arbitrary pages, which
could be in the page cache or in high memory, so call
flush_dcache_page to flush the dcache.

This is a problem when using these helpers on dcache-challenged
architectures.  Right now there are just a few users, so chances are
that no one has used the PC floppy driver, the aha1542 driver for an ISA
SCSI HBA, or a few advanced and optional btrfs and ext4 features on
those platforms since the conversion.

Link: https://lkml.kernel.org/r/20210713055231.137602-2-hch@lst.de
Fixes: bb90d4bc7b ("mm/highmem: Lift memcpy_[to|from]_page to core")
Fixes: 28961998f8 ("iov_iter: lift memzero_page() to highmem.h")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Alexander Potapenko 236e9f1538 kfence: skip all GFP_ZONEMASK allocations
Allocation requests outside ZONE_NORMAL (MOVABLE, HIGHMEM or DMA) cannot
be fulfilled by KFENCE, because KFENCE memory pool is located in a zone
different from the requested one.

Because callers of kmem_cache_alloc() may actually rely on the
allocation to reside in the requested zone (e.g.  memory allocations
done with __GFP_DMA must be DMAable), skip all allocations done with
GFP_ZONEMASK and/or respective SLAB flags (SLAB_CACHE_DMA and
SLAB_CACHE_DMA32).
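
A minimal sketch of such a filter, written as a Python model rather than
kernel C. The bit values below are illustrative assumptions for this
example, and `kfence_may_serve` is a hypothetical helper name.

```python
# Illustrative bit values; the real definitions live in the kernel headers.
GFP_DMA = 0x01
GFP_HIGHMEM = 0x02
GFP_DMA32 = 0x04
GFP_MOVABLE = 0x08
GFP_ZONEMASK = GFP_DMA | GFP_HIGHMEM | GFP_DMA32 | GFP_MOVABLE

SLAB_CACHE_DMA = 0x4000     # illustrative
SLAB_CACHE_DMA32 = 0x8000   # illustrative


def kfence_may_serve(gfp_flags: int, cache_flags: int) -> bool:
    """Return False for any allocation that asks for a specific zone."""
    if gfp_flags & GFP_ZONEMASK:
        return False
    if cache_flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32):
        return False
    return True
```

A request carrying any zone bit, or coming from a DMA cache, falls
through to the regular allocator in this model.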

Link: https://lkml.kernel.org/r/20210714092222.1890268-2-glider@google.com
Fixes: 0ce20dd840 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Alexander Potapenko 235a85cb32 kfence: move the size check to the beginning of __kfence_alloc()
Check the allocation size before toggling kfence_allocation_gate.

This way allocations that can't be served by KFENCE will not result in
waiting for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating
anything.
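
The effect of the reordering can be shown with a small Python model (a
sketch, not the kernel code). `Gate`, `kfence_alloc` and the 4096-byte
`PAGE_SIZE` are assumptions for this example; the gate models
kfence_allocation_gate, which stays closed until the next sample
interval once it has been taken.

```python
PAGE_SIZE = 4096  # assumed page size for this model


class Gate:
    """Hypothetical model of the allocation gate: 0 means open."""

    def __init__(self):
        self.value = 0

    def try_enter(self) -> bool:
        if self.value:
            return False
        self.value = 1  # closed until the next sample interval reopens it
        return True


def kfence_alloc(gate: Gate, size: int):
    # Size check first: an unservable request must not consume the gate.
    if size > PAGE_SIZE:
        return None
    if not gate.try_enter():
        return None
    return bytearray(size)
```

With the checks in the opposite order, the oversized request would close
the gate and the following small request would be turned away.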

Link: https://lkml.kernel.org/r/20210714092222.1890268-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Suggested-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Weizhao Ouyang 32ae8a0669 kfence: defer kfence_test_init to ensure that kunit debugfs is created
kfence_test_init and kunit_init both use the same late_initcall level,
which means that if kfence_test_init is linked ahead of kunit_init,
kfence_test_init will get a NULL debugfs_rootdir as its parent dentry.
Both kfence_test_init and kfence_debugfs_init then create a debugfs node
named "kfence" under debugfs_mount->mnt_root, which fails with EEXIST
and prints "debugfs: Directory 'kfence' with parent '/' already
present!".  So kfence_test_init should be deferred.

Link: https://lkml.kernel.org/r/20210714113140.2949995-1-o451686892@gmail.com
Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
Tested-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Peter Collingbourne 0db282ba2c selftest: use mmap instead of posix_memalign to allocate memory
This test passes pointers obtained from anon_allocate_area to the
userfaultfd and mremap APIs.  This causes a problem if the system
allocator returns tagged pointers because with the tagged address ABI
the kernel rejects tagged addresses passed to these APIs, which would
end up causing the test to fail.  To make this test compatible with such
system allocators, stop using the system allocator to allocate memory in
anon_allocate_area, and instead just use mmap.
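
For illustration, an anonymous mapping obtained directly from mmap looks
like this in Python. The selftest itself is written in C; this is only
an analogue, and the function name is borrowed from the text above for
readability. The parameters are assumptions for this sketch.

```python
import mmap


def anon_allocate_area(nr_pages: int, page_size: int = mmap.PAGESIZE) -> mmap.mmap:
    """Return a zero-filled anonymous mapping of nr_pages pages.

    Passing -1 as the file descriptor requests anonymous memory, so the
    memory comes from the mmap call itself rather than from a heap
    allocator.
    """
    return mmap.mmap(-1, nr_pages * page_size)
```

Usage: `area = anon_allocate_area(4)` gives a writable buffer of four
pages; call `area.close()` when done.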

Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com
Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b8fc241
Fixes: c47174fc36 ("userfaultfd: selftest")
Co-developed-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alistair Delva <adelva@google.com>
Cc: William McVicker <willmcvicker@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: <stable@vger.kernel.org>	[5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Peter Collingbourne e71e2ace57 userfaultfd: do not untag user pointers
Patch series "userfaultfd: do not untag user pointers", v5.

If a user program uses userfaultfd on ranges of heap memory, it may end
up passing a tagged pointer to the kernel in the range.start field of
the UFFDIO_REGISTER ioctl.  This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].

When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg.  However, from the application's perspective, the tagged
address *is* the memory address, so if the application is unaware of
memory tags, it may get confused by receiving an address that is, from
its point of view, outside of the bounds of the allocation.  We observed
this behavior in the kselftest for userfaultfd [2] but other
applications could have the same problem.

Address this by not untagging pointers passed to the userfaultfd ioctls.
Instead, let the system call fail.  Also change the kselftest to use
mmap so that it doesn't encounter this problem.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c

This patch (of 2):

Do not untag pointers passed to the userfaultfd ioctls.  Instead, let
the system call fail.  This will provide an early indication of problems
with tag-unaware userspace code instead of letting the code get confused
later, and is consistent with how we decided to handle brk/mmap/mremap
in commit dcde237319 ("mm: Avoid creating virtual address aliases in
brk()/mmap()/mremap()"), as well as being consistent with the existing
tagged address ABI documentation relating to how ioctl arguments are
handled.

The code change is a revert of commit 7d0325749a ("userfaultfd: untag
user pointers") plus some fixups to some additional calls to
validate_range that have appeared since then.
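
To make the behaviour concrete, here is a Python model of a range check
that rejects tagged addresses instead of stripping the tag. It is a
sketch under assumptions: the tag is taken to live in the top byte, the
user address space is taken to be 48 bits, pages are taken to be 4096
bytes, and this `validate_range` is a simplified stand-in, not the
kernel function of the same name.

```python
import errno

TAG_SHIFT = 56           # assumed: the tag occupies the top byte
TASK_SIZE = 1 << 48      # assumed: 48-bit user address space
PAGE_SIZE = 4096         # assumed page size


def untagged_addr(addr: int) -> int:
    """Strip the tag byte (what the check below deliberately does not do)."""
    return addr & ((1 << TAG_SHIFT) - 1)


def validate_range(start: int, length: int) -> int:
    """Return 0 if the range is acceptable, -EINVAL otherwise."""
    if length == 0 or start % PAGE_SIZE or length % PAGE_SIZE:
        return -errno.EINVAL
    # A tagged pointer has bits set above TASK_SIZE, so it fails here.
    if start >= TASK_SIZE or length > TASK_SIZE - start:
        return -errno.EINVAL
    return 0
```

In this model a tag-unaware caller gets an immediate error at
registration time rather than a fault address it cannot match to its
allocation later.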

Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com
Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b
Fixes: 63f0c60379 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Delva <adelva@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William McVicker <willmcvicker@google.com>
Cc: <stable@vger.kernel.org>	[5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
Jisheng Zhang 76f5dfacfb riscv: stacktrace: pin the task's stack in get_wchan
Pin the task's stack before calling walk_stackframe() in get_wchan().
This fixes the panic reported by Andreas when CONFIG_VMAP_STACK=y:

[   65.609696] Unable to handle kernel paging request at virtual address ffffffd0003bbde8
[   65.610460] Oops [#1]
[   65.610626] Modules linked in: virtio_blk virtio_mmio rtc_goldfish btrfs blake2b_generic libcrc32c xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs
[   65.611670] CPU: 2 PID: 1 Comm: systemd Not tainted 5.14.0-rc1-1.g34fe32a-default #1 openSUSE Tumbleweed (unreleased) c62f7109153e5a0897ee58ba52393ad99b070fd2
[   65.612334] Hardware name: riscv-virtio,qemu (DT)
[   65.613008] epc : get_wchan+0x5c/0x88
[   65.613334]  ra : get_wchan+0x42/0x88
[   65.613625] epc : ffffffff800048a4 ra : ffffffff8000488a sp : ffffffd00021bb90
[   65.614008]  gp : ffffffff817709f8 tp : ffffffe07fe91b80 t0 : 00000000000001f8
[   65.614411]  t1 : 0000000000020000 t2 : 0000000000000000 s0 : ffffffd00021bbd0
[   65.614818]  s1 : ffffffd0003bbdf0 a0 : 0000000000000001 a1 : 0000000000000002
[   65.615237]  a2 : ffffffff81618008 a3 : 0000000000000000 a4 : 0000000000000000
[   65.615637]  a5 : ffffffd0003bc000 a6 : 0000000000000002 a7 : ffffffe27d370000
[   65.616022]  s2 : ffffffd0003bbd90 s3 : ffffffff8071a81e s4 : 0000000000003fff
[   65.616407]  s5 : ffffffffffffc000 s6 : 0000000000000000 s7 : ffffffff81618008
[   65.616845]  s8 : 0000000000000001 s9 : 0000000180000040 s10: 0000000000000000
[   65.617248]  s11: 000000000000016b t3 : 000000ff00000000 t4 : 0c6aec92de5e3fd7
[   65.617672]  t5 : fff78f60608fcfff t6 : 0000000000000078
[   65.618088] status: 0000000000000120 badaddr: ffffffd0003bbde8 cause: 000000000000000d
[   65.618621] [<ffffffff800048a4>] get_wchan+0x5c/0x88
[   65.619008] [<ffffffff8022da88>] do_task_stat+0x7a2/0xa46
[   65.619325] [<ffffffff8022e87e>] proc_tgid_stat+0xe/0x16
[   65.619637] [<ffffffff80227dd6>] proc_single_show+0x46/0x96
[   65.619979] [<ffffffff801ccb1e>] seq_read_iter+0x190/0x31e
[   65.620341] [<ffffffff801ccd70>] seq_read+0xc4/0x104
[   65.620633] [<ffffffff801a6bfe>] vfs_read+0x6a/0x112
[   65.620922] [<ffffffff801a701c>] ksys_read+0x54/0xbe
[   65.621206] [<ffffffff801a7094>] sys_read+0xe/0x16
[   65.621474] [<ffffffff8000303e>] ret_from_syscall+0x0/0x2
[   65.622169] ---[ end trace f24856ed2b8789c5 ]---
[   65.622832] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-23 17:29:03 -07:00
Jens Axboe 991468dcf1 io_uring: explicitly catch any illegal async queue attempt
Catch an illegal case to queue async from an unrelated task that got
the ring fd passed to it. This should not be possible to hit, but
better be proactive and catch it explicitly. io-wq is extended to
check for early IO_WQ_WORK_CANCEL being set on a work item as well,
so it can run the request through the normal cancelation path.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-23 16:44:51 -06:00
Jens Axboe 3c30ef0f78 io_uring: never attempt iopoll reissue from release path
There are two reasons why this shouldn't be done:

1) Ring is exiting, so any request should be canceled anyway. In
   theory, this could iterate a number of times if someone else is
   also driving the target block queue into request starvation;
   however, the likelihood of this happening is minuscule.

2) If the original task decided to pass the ring to another task, then
   we don't want to be reissuing from this context as it may be an
   unrelated task or context. No assumptions should be made about
   the context in which ->release() is run. This can only happen for pure
   read/write, and we'll get -EFAULT on them anyway.

Link: https://lore.kernel.org/io-uring/YPr4OaHv0iv0KTOc@zeniv-ca.linux.org.uk/
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-23 16:32:48 -06:00
David S. Miller 5aa1959d18 Merge branch 'ionic-fixes'
Shannon Nelson says:

====================
ionic: bug fixes

Fix a thread race in rx_mode, remove unnecessary log message,
fix dynamic coalescing issues, and count all csum_none cases.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 21:57:52 +01:00
Shannon Nelson f07f9815b7 ionic: count csum_none when offload enabled
Be sure to count the csum_none cases when csum offload is
enabled.

Fixes: 0f3154e6bc ("ionic: Add Tx and Rx handling")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 21:57:41 +01:00
Shannon Nelson 76ed8a4a00 ionic: fix up dim accounting for tx and rx
We need to count the correct Tx and/or Rx packets for dynamic
interrupt moderation, depending on which we're processing on
the queue interrupt.

Fixes: 04a834592b ("ionic: dynamic interrupt moderation")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 21:57:41 +01:00
Shannon Nelson a6ff85e0a2 ionic: remove intr coalesce update from napi
Move the interrupt coalesce value update out of the napi
thread and into the dim_work thread and set it only when it
has actually changed.

Fixes: 04a834592b ("ionic: dynamic interrupt moderation")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 21:57:41 +01:00
Shannon Nelson f79eef711e ionic: catch no ptp support earlier
If PTP configuration is attempted on ports that don't support
it, such as VF ports, the driver will return an error status
of -95 (EOPNOTSUPP) and print an error message:
    enp98s0: hwstamp set failed: -95

Because some daemons can retry every few seconds, this can end
up filling the dmesg log and pushing out other more useful
messages.

We can catch this issue earlier in our handling and return
the error without a log message.

Fixes: 829600ce5e ("ionic: add ts_config replay")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 21:57:41 +01:00
Shannon Nelson 6840e17b8e ionic: make all rx_mode work threadsafe
Move the bulk of the code from ionic_set_rx_mode(), which
can be called from atomic context, into ionic_lif_rx_mode()
which is a safe context.

A call from the stack will get pushed off into a work thread,
but it is also possible to simultaneously have a call driven
by a queue reconfig request from an ethtool command or fw
recovery event.  We add a mutex around the rx_mode work to be
sure they don't collide.
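
The deferral-plus-mutex arrangement described above can be modelled roughly as follows. This is only a toy sketch in Python, not the driver code: the `Lif` class, `run_deferred()`, and the `can_sleep` parameter are hypothetical stand-ins chosen for illustration, and a plain queue takes the place of the kernel work thread.

```python
import queue
import threading


class Lif:
    """Toy stand-in for a logical interface (hypothetical, not the ionic API)."""

    def __init__(self):
        self.rx_mode_lock = threading.Lock()  # serializes all rx_mode work
        self.deferred = queue.Queue()         # work pushed off from atomic context
        self.applied = []                     # record of rx_mode updates applied

    def lif_rx_mode(self, flags):
        # Sleepable context: the mutex keeps concurrent callers from colliding.
        with self.rx_mode_lock:
            self.applied.append(flags)

    def set_rx_mode(self, flags, can_sleep):
        if can_sleep:
            # e.g. a queue reconfig or recovery path that may block
            self.lif_rx_mode(flags)
        else:
            # atomic context: only queue the request, never block here
            self.deferred.put(flags)

    def run_deferred(self):
        # Work-thread body: drain queued requests in a sleepable context.
        while True:
            try:
                flags = self.deferred.get_nowait()
            except queue.Empty:
                return
            self.lif_rx_mode(flags)
```

Both paths end up in `lif_rx_mode()`, so a single lock there is enough to keep a deferred update and a direct update from interleaving.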

Fixes: 81dbc24147 ("ionic: change set_rx_mode from_ndo to can_sleep")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 21:57:41 +01:00
David S. Miller 0506c93fba Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-07-23

This series contains updates to i40e driver only.

Arkadiusz corrects the order of calls for disabling queues to resolve
a false error message and adds a better message to the user when
transitioning FW LLDP back on while the firmware is still processing
the off request.

Lukasz adds additional information regarding possible incorrect cable
use when a PHY type error occurs.

Jedrzej adds ndo_select_queue support to resolve incorrect queue
selection when SW DCB is used and adds a warning when there are not
enough queues for desired TC configuration.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 21:21:42 +01:00
Linus Torvalds f0fddcec6b for-5.14-rc2-tag

Merge tag 'for-5.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few fixes and one patch to help some block layer API cleanups:

   - skip missing device when running fstrim

   - fix unpersisted i_size on fsync after expanding truncate

   - fix lock inversion problem when doing qgroup extent tracing

   - replace bdgrab/bdput usage, replace gendisk by block_device"

* tag 'for-5.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: store a block_device in struct btrfs_ordered_extent
  btrfs: fix lock inversion problem when doing qgroup extent tracing
  btrfs: check for missing device in btrfs_trim_fs
  btrfs: fix unpersisted i_size on fsync after expanding truncate
2021-07-23 12:49:07 -07:00
Linus Torvalds 704f4cba43 rbd fixes for a -rc1 regression and a subtle deadlock on lock_rwsem
(marked for stable).  Also included a rare WARN condition tweak.

Merge tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A subtle deadlock on lock_rwsem (marked for stable) and rbd fixes for
  a -rc1 regression.

  Also included a rare WARN condition tweak"

* tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-client:
  rbd: resurrect setting of disk->private_data in rbd_init_disk()
  ceph: don't WARN if we're still opening a session to an MDS
  rbd: don't hold lock_rwsem while running_list is being drained
  rbd: always kick acquire on "acquired" and "released" notifications
2021-07-23 11:30:12 -07:00
Linus Torvalds 05daae0fb0 Tracing fixes for 5.14-rc2:
- Fix deadloop in ring buffer because of using stale "read" variable
 
 - Fix synthetic event use of field_pos as boolean and not an index
 
 - Fixed histogram special var "cpu" overriding event fields called "cpu"
 
 - Cleaned up error prone logic in alloc_synth_event()
 
 - Removed call to synchronize_rcu_tasks_rude() when not needed
 
 - Removed redundant initialization of a local variable "ret"
 
 - Fixed kernel crash when updating tracepoint callbacks of different
   priorities.

Merge tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix deadloop in ring buffer because of using stale "read" variable

 - Fix synthetic event use of field_pos as boolean and not an index

 - Fixed histogram special var "cpu" overriding event fields called
   "cpu"

 - Cleaned up error prone logic in alloc_synth_event()

 - Removed call to synchronize_rcu_tasks_rude() when not needed

 - Removed redundant initialization of a local variable "ret"

 - Fixed kernel crash when updating tracepoint callbacks of different
   priorities.

* tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracepoints: Update static_call before tp_funcs when adding a tracepoint
  ftrace: Remove redundant initialization of variable ret
  ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary
  tracing: Clean up alloc_synth_event()
  tracing/histogram: Rename "cpu" to "common_cpu"
  tracing: Synthetic event field_pos is an index not a boolean
  tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.
2021-07-23 11:25:21 -07:00
Linus Torvalds 1af09ed5ae m68k updates for v5.14 (take two)
- Fix a Mac defconfig regression due to the IDE -> ATA switch.

Merge tag 'm68k-for-v5.14-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k fix from Geert Uytterhoeven:

 - Fix a Mac defconfig regression due to the IDE -> ATA switch

* tag 'm68k-for-v5.14-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: MAC should select HAVE_PATA_PLATFORM
2021-07-23 11:19:57 -07:00
Linus Torvalds ec6badfbe1 ACPI fixes for 5.14-rc3
- Fix recently broken Kconfig dependency for the ACPI table
    override via built-in initrd (Robert Richter).
 
  - Fix ACPI device reference counting in the for_each_acpi_dev_match()
    helper macro to avoid use-after-free (Andy Shevchenko).

Merge tag 'acpi-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix a recently broken Kconfig dependency and ACPI device
  reference counting in an iterator macro.

  Specifics:

   - Fix recently broken Kconfig dependency for the ACPI table override
     via built-in initrd (Robert Richter)

   - Fix ACPI device reference counting in the for_each_acpi_dev_match()
     helper macro to avoid use-after-free (Andy Shevchenko)"

* tag 'acpi-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: utils: Fix reference counting in for_each_acpi_dev_match()
  ACPI: Kconfig: Fix table override from built-in initrd
2021-07-23 11:08:06 -07:00
Linus Torvalds 1d597682d3 Driver core fixes for 5.14-rc3
Here are 2 small driver core fixes to resolve some reported problems for
 5.14-rc3.  They include:
 	- aux bus memory leak fix
 	- unneeded warning message removed when removing a device link.
 
 Both have been in linux-next with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'driver-core-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two small driver core fixes to resolve some reported problems
  for 5.14-rc3. They include:

   - aux bus memory leak fix

   - unneeded warning message removed when removing a device link.

  Both have been in linux-next with no reported problems"

* tag 'driver-core-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: Prevent warning when removing a device link from unregistered consumer
  driver core: auxiliary bus: Fix memory leak when driver_register() fail
2021-07-23 10:20:15 -07:00
Linus Torvalds 8072911b2f Char/Misc fixes for 5.14-rc3
Here are some small char/misc driver fixes for 5.14-rc3.
 
 Included in here are:
 	- MAINTAINERS file updates for 2 changes in different driver
 	  subsystems.
 	- mhi bus bugfixes
 	- nds32 bugfix that resolves a reported problem.
 
 All have been in linux-next with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'char-misc-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
 "Here are some small char/misc driver fixes for 5.14-rc3.

  Included in here are:

   - MAINTAINERS file updates for two changes in different driver
     subsystems

   - mhi bus bugfixes

   - nds32 bugfix that resolves a reported problem

  All have been in linux-next with no reported problems"

* tag 'char-misc-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  nds32: fix up stack guard gap
  MAINTAINERS: Change ACRN HSM driver maintainer
  MAINTAINERS: Update for VMCI driver
  bus: mhi: pci_generic: Fix inbound IPCR channel
  bus: mhi: core: Validate channel ID when processing command completions
  bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean
2021-07-23 10:14:56 -07:00
Linus Torvalds 74738c556d USB fixes for 5.14-rc3
Here are some USB fixes for 5.14-rc3 to resolve a bunch of tiny problems
 reported.  Included in here are:
 	- dtsi revert to resolve a problem which broke android systems
 	  that relied on the dts name to find the USB controller device.
 	  People are still working out the "real" solution for this, but
 	  for now the revert is needed.
 	- core USB fix for pipe calculation found by syzbot
 	- typec fixes
 	- gadget driver fixes
 	- new usb-serial device ids
 	- new USB quirks
 	- xhci fixes
 	- usb hub fixes for power management issues reported
 	- other tiny fixes
 
 All have been in linux-next with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'usb-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some USB fixes for 5.14-rc3 to resolve a bunch of tiny
  problems reported. Included in here are:

   - dtsi revert to resolve a problem which broke android systems that
     relied on the dts name to find the USB controller device.

     People are still working out the "real" solution for this, but for
     now the revert is needed.

   - core USB fix for pipe calculation found by syzbot

   - typec fixes

   - gadget driver fixes

   - new usb-serial device ids

   - new USB quirks

   - xhci fixes

   - usb hub fixes for power management issues reported

   - other tiny fixes

  All have been in linux-next with no reported problems"

* tag 'usb-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (27 commits)
  USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
  Revert "USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem"
  usb: cdc-wdm: fix build error when CONFIG_WWAN_CORE is not set
  Revert "arm64: dts: qcom: Harmonize DWC USB3 DT nodes name"
  usb: dwc2: gadget: Fix sending zero length packet in DDMA mode.
  usb: dwc2: Skip clock gating on Samsung SoCs
  usb: renesas_usbhs: Fix superfluous irqs happen after usb_pkt_pop()
  usb: dwc2: gadget: Fix GOUTNAK flow for Slave mode.
  usb: phy: Fix page fault from usb_phy_uevent
  usb: xhci: avoid renesas_usb_fw.mem when it's unusable
  usb: gadget: u_serial: remove WARN_ON on null port
  usb: dwc3: avoid NULL access of usb_gadget_driver
  usb: max-3421: Prevent corruption of freed memory
  usb: gadget: Fix Unbalanced pm_runtime_enable in tegra_xudc_probe
  MAINTAINERS: repair reference in USB IP DRIVER FOR HISILICON KIRIN 970
  usb: typec: stusb160x: Don't block probing of consumer of "connector" nodes
  usb: typec: stusb160x: register role switch before interrupt registration
  USB: usb-storage: Add LaCie Rugged USB3-FW to IGNORE_UAS
  usb: ehci: Prevent missed ehci interrupts with edge-triggered MSI
  usb: hub: Disable USB 3 device initiated lpm if exit latency is too high
  ...
2021-07-23 10:09:27 -07:00
Linus Torvalds e7562a00c1 sound fixes for 5.14-rc3
A collection of small fixes, mostly covering device-specific
 regressions and bugs over ASoC, HD-audio and USB-audio, while the
 ALSA PCM core received a few additional fixes for the possible
 (new and old) regressions.

Merge tag 'sound-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes, mostly covering device-specific
  regressions and bugs over ASoC, HD-audio and USB-audio, while
  the ALSA PCM core received a few additional fixes for the
  possible (new and old) regressions"

* tag 'sound-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
  ALSA: usb-audio: Add registration quirk for JBL Quantum headsets
  ALSA: hda/hdmi: Add quirk to force pin connectivity on NUC10
  ALSA: pcm: Fix mmap without buffer preallocation
  ALSA: pcm: Fix mmap capability check
  ALSA: hda: intel-dsp-cfg: add missing ElkhartLake PCI ID
  ASoC: ti: j721e-evm: Check for not initialized parent_clk_id
  ASoC: ti: j721e-evm: Fix unbalanced domain activity tracking during startup
  ALSA: hda/realtek: Fix pop noise and 2 Front Mic issues on a machine
  ALSA: hdmi: Expose all pins on MSI MS-7C94 board
  ALSA: sb: Fix potential ABBA deadlock in CSP driver
  ASoC: rt5682: Fix the issue of garbled recording after powerd_dbus_suspend
  ASoC: amd: reverse stop sequence for stoneyridge platform
  ASoC: soc-pcm: add a flag to reverse the stop sequence
  ASoC: codecs: wcd938x: setup irq during component bind
  ASoC: dt-bindings: renesas: rsnd: Fix incorrect 'port' regex schema
  ALSA: usb-audio: Add missing proc text entry for BESPOKEN type
  ASoC: codecs: wcd938x: make sdw dependency explicit in Kconfig
  ASoC: SOF: Intel: Update ADL descriptor to use ACPI power states
  ASoC: rt5631: Fix regcache sync errors on resume
  ALSA: pcm: Call substream ack() method upon compat mmap commit
  ...
2021-07-23 09:58:23 -07:00
David S. Miller 1f22cf1349 Couple of fixes:
* fix aggregation on mesh
  * fix late enabling of 4-addr mode
  * leave monitor SKBs with some headroom
  * limit band information for old applications
  * fix virt-wifi WARN_ON
  * fix memory leak in cfg80211 BSS list maintenance

Merge tag 'mac80211-for-net-2021-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Couple of fixes:
 * fix aggregation on mesh
 * fix late enabling of 4-addr mode
 * leave monitor SKBs with some headroom
 * limit band information for old applications
 * fix virt-wifi WARN_ON
 * fix memory leak in cfg80211 BSS list maintenance
2021-07-23 17:57:09 +01:00
Paul Jakma 15bbf8bb4d NIU: fix incorrect error return, missed in previous revert
Commit 7930742d6, reverting 26fd962, missed out on reverting an incorrect
change to a return value.  The niu_pci_vpd_scan_props(..) == 1 case appears
to be a normal path - treating it as an error and returning -EINVAL was
breaking VPD_SCAN and causing the driver to fail to load.

Fix, so my Neptune card works again.

Cc: Kangjie Lu <kjlu@umn.edu>
Cc: Shannon Nelson <shannon.lee.nelson@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Fixes: 7930742d ('Revert "niu: fix missing checks of niu_pci_eeprom_read"')
Signed-off-by: Paul Jakma <paul@jakma.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 17:48:52 +01:00
Pavel Skripkin 52f3456a96 net: qrtr: fix memory leaks
Syzbot reported a memory leak in qrtr. The problem was a struct sock
reference that was never put. The qrtr_local_enqueue() function calls
qrtr_port_lookup(), which takes a sock reference if the port was found.
Then there is the following check:

if (!ipc || &ipc->sk == skb->sk) {
	...
	return -ENODEV;
}

Since we should drop the reference before returning from this function and
ipc can be non-NULL inside this if, we should add qrtr_port_put() inside
this if.

A similar corner case is in qrtr_endpoint_post(), as Manivannan
reported. If sock_queue_rcv_skb() fails, we need to put the
port reference to avoid leaking the struct sock pointer.
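
The reference-counting rule behind this fix can be illustrated with a small model. This is a Python sketch only, not the kernel code: `Sock`, `port_lookup()`, `port_put()`, and `local_enqueue()` are hypothetical names that mirror the shape of the check quoted above.

```python
import errno


class Sock:
    """Stand-in for a reference-counted socket (hypothetical)."""

    def __init__(self):
        self.refs = 1


def port_lookup(ports, port):
    # A successful lookup takes a reference on behalf of the caller.
    sk = ports.get(port)
    if sk is not None:
        sk.refs += 1
    return sk


def port_put(sk):
    sk.refs -= 1


def local_enqueue(ports, port, sender):
    ipc = port_lookup(ports, port)
    if ipc is None or ipc is sender:
        # The lookup may have succeeded even though we are bailing out,
        # so the reference has to be dropped on this path as well.
        if ipc is not None:
            port_put(ipc)
        return -errno.ENODEV
    # ... deliver the message to ipc here ...
    port_put(ipc)
    return 0
```

Without the `port_put()` on the early-return path, sending to one's own port would leave the count one higher on every call, which is the leak pattern the message describes.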

Fixes: e04df98adf ("net: qrtr: Remove receive worker")
Fixes: bdabad3e36 ("net: Add Qualcomm IPC router")
Reported-and-tested-by: syzbot+35a511c72ea7356cdcf3@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 17:48:06 +01:00
David S. Miller 200bd5668c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Memleak in commit audit error path, from Dongliang Mu.

2) Avoid possible false sharing for flowtable timeout updates
   and nft_last use.

3) Adjust conntrack timestamp due to garbage collection delay,
   from Florian Westphal.

4) Fix nft_nat without layer 3 address for the inet family.

5) Fix compilation warning in nfnl_hook when ingress support
   is disabled, from Arnd Bergmann.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 17:46:05 +01:00
Subbaraya Sundeep 9986066d94 octeontx2-af: Fix uninitialized variables in rvu_switch
Get the number of VFs of a PF correctly by calling
rvu_get_pf_numvfs in rvu_switch_disable function.
Also hwvf is not required hence remove it.

Fixes: 23109f8dd0 ("octeontx2-af: Introduce internal packet switching")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 17:43:31 +01:00
Tetsuo Handa 3ce6e1f662 loop: reintroduce global lock for safe loop_validate_file() traversal
Commit 6cc8e74308 ("loop: scale loop device by introducing per
device lock") re-opened a race window for NULL pointer dereference at
loop_validate_file() which commit 310ca162d7 ("block/loop: Use
global lock for ioctl() operation.") had closed.

Although we need to guarantee that other loop devices will not change
during traversal, we can't take remote "struct loop_device"->lo_mutex
inside loop_validate_file() in order to avoid AB-BA deadlock. Therefore,
introduce a global lock dedicated to loop_validate_file() which is
conditionally taken before local "struct loop_device"->lo_mutex is taken.
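
The lock ordering described above can be sketched as follows. This is a toy Python model, not the loop driver: `validate_lock`, `LoopDev`, `device_locked()`, and `set_backing()` are hypothetical names, and the assumption that the traversal looks for a cycle in a chain of stacked devices is an illustration that the commit message does not spell out.

```python
import threading
from contextlib import contextmanager

# Hypothetical global lock dedicated to the validation traversal.
validate_lock = threading.Lock()


class LoopDev:
    def __init__(self, name, backing=None):
        self.name = name
        self.lock = threading.Lock()  # per-device lock
        self.backing = backing        # another LoopDev, or None


@contextmanager
def device_locked(dev, is_global):
    # Fixed order: the global lock (when needed) always comes before the
    # local per-device lock, so two threads cannot take them in opposite order.
    if is_global:
        validate_lock.acquire()
    dev.lock.acquire()
    try:
        yield
    finally:
        dev.lock.release()
        if is_global:
            validate_lock.release()


def validate_chain(dev, new_backing):
    # Walks other devices WITHOUT taking their per-device locks; the
    # caller holds validate_lock, which keeps the chain stable.
    cur = new_backing
    while cur is not None:
        if cur is dev:
            return False
        cur = cur.backing
    return True


def set_backing(dev, new_backing):
    with device_locked(dev, is_global=True):
        if not validate_chain(dev, new_backing):
            return False
        dev.backing = new_backing
        return True
```

Paths that never traverse other devices would call `device_locked(dev, is_global=False)` and so would not contend on the global lock.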

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 6cc8e74308 ("loop: scale loop device by introducing per device lock")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-23 10:18:25 -06:00
Loic Poulain 68d1f1d4af wwan: core: Fix missing RTM_NEWLINK event for default link
A wwan link created via the wwan_create_default_link procedure is
never notified to the user (RTM_NEWLINK), causing issues with user
tools relying on such event to track network links (NetworkManager).

This is because the procedure misses a call to rtnl_configure_link(),
which sets the link as initialized and notifies the new link (cf
proper usage in __rtnl_newlink()).

Cc: stable@vger.kernel.org
Fixes: ca374290aa ("wwan: core: support default netdev creation")
Suggested-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 17:14:00 +01:00
Vladimir Oltean c92c74131a net: dsa: mv88e6xxx: silently accept the deletion of VID 0 too
The blamed commit modified the driver to accept the addition of VID 0
without doing anything, but deleting that VID still fails:

[   32.080780] mv88e6085 d0032004.mdio-mii:10 lan8: failed to kill vid 0081/0

Modify mv88e6xxx_port_vlan_leave() to do the same thing as the addition.

Fixes: b8b79c414e ("net: dsa: mv88e6xxx: Fix adding vlan 0")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 17:13:02 +01:00
Kangmin Park 46c7655f0b ipv6: decrease hop limit counter in ip6_forward()
Decrease the hop limit counter when delivering an skb to the NDP proxy.

Signed-off-by: Kangmin Park <l4stpr0gr4m@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:40:41 +01:00
Gilad Naaman 227adfb2b1 net: Set true network header for ECN decapsulation
In cases where the header straight after the tunnel header was
another ethernet header (TEB), instead of the network header,
the ECN decapsulation code would treat the ethernet header as if
it was an IP header, resulting in mishandling and possible
wrong drops or corruption of the IP header.

In this case, ECT(1) is sent, so IP_ECN_decapsulate tries to copy it to the
inner IPv4 header, and correct its checksum.

The offset of the ECT bits in an IPv4 header corresponds to the
lower 2 bits of the second octet of the destination MAC address
in the ethernet header.
The IPv4 checksum corresponds to the end of the source MAC address.

In order to reproduce:

    $ ip netns add A
    $ ip netns add B
    $ ip -n A link add _v0 type veth peer name _v1 netns B
    $ ip -n A link set _v0 up
    $ ip -n A addr add dev _v0 10.254.3.1/24
    $ ip -n A route add default dev _v0 scope global
    $ ip -n B link set _v1 up
    $ ip -n B addr add dev _v1 10.254.1.6/24
    $ ip -n B route add default dev _v1 scope global
    $ ip -n B link add gre1 type gretap local 10.254.1.6 remote 10.254.3.1 key 0x49000000
    $ ip -n B link set gre1 up

    # Now send an IPv4/GRE/Eth/IPv4 frame where the outer header has ECT(1),
    # and the inner header has no ECT bits set:

    $ cat send_pkt.py
        #!/usr/bin/env python3
        from scapy.all import *

        pkt = IP(b'E\x01\x00\xa7\x00\x00\x00\x00@/`%\n\xfe\x03\x01\n\xfe\x01\x06 \x00eXI\x00'
                 b'\x00\x00\x18\xbe\x92\xa0\xee&\x18\xb0\x92\xa0l&\x08\x00E\x00\x00}\x8b\x85'
                 b'@\x00\x01\x01\xe4\xf2\x82\x82\x82\x01\x82\x82\x82\x02\x08\x00d\x11\xa6\xeb'
                 b'3\x1e\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12'
                 b'\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./01234'
                 b'56789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ')

        send(pkt)
    $ sudo ip netns exec B tcpdump -neqlllvi gre1 icmp & sleep 1
    $ sudo ip netns exec A python3 send_pkt.py

In the original packet, the source/destination MAC addresses are
dst=18:be:92:a0:ee:26 src=18:b0:92:a0:6c:26

In the received packet, they are
dst=18:bd:92:a0:ee:26 src=18:b0:92:a0:6c:27
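
The MAC address change above can be reproduced with a few lines of byte arithmetic. This is a simplified Python model, not the kernel code: it assumes the decapsulation path rewrites an inner ECT(0) to ECT(1) and compensates the checksum by one, and `misapplied_set_ect1()` is a hypothetical name for that rewrite applied at IPv4 offsets to an Ethernet header.

```python
ECN_MASK = 0x03  # low two bits of the IPv4 TOS byte
ECT_0 = 0x02


def mac(b: bytes) -> str:
    return ":".join(f"{x:02x}" for x in b)


def misapplied_set_ect1(eth_header: bytes) -> bytes:
    """Apply an ECT(0) -> ECT(1) rewrite at IPv4 offsets to an Ethernet header."""
    buf = bytearray(eth_header)
    # IPv4 byte 1 is the TOS field; here it overlaps destination MAC octet 2.
    if (buf[1] & ECN_MASK) != ECT_0:
        return bytes(buf)
    buf[1] ^= ECN_MASK  # 0b10 -> 0b01, which lowers the byte by one
    # IPv4 bytes 10-11 hold the header checksum; here they overlap the last
    # two octets of the source MAC. Lowering TOS by one is compensated by
    # adding one to the checksum, with end-around carry.
    check = int.from_bytes(buf[10:12], "big") + 1
    check = (check & 0xFFFF) + (check >> 16)
    buf[10:12] = check.to_bytes(2, "big")
    return bytes(buf)


# dst MAC, src MAC, EtherType (IPv4) from the original packet above
original = bytes.fromhex("18be92a0ee26" "18b092a06c26" "0800")
mangled = misapplied_set_ect1(original)
print("dst=" + mac(mangled[0:6]), "src=" + mac(mangled[6:12]))
# prints: dst=18:bd:92:a0:ee:26 src=18:b0:92:a0:6c:27
```

The second destination octet 0xbe ends in the bit pattern for ECT(0), so it becomes 0xbd, and the "checksum" 0x6c26 at the tail of the source MAC becomes 0x6c27, matching the received addresses.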

Thanks to Lahav Schlesinger <lschlesinger@drivenets.com> and Isaac Garzon <isaac@speed.io>
for helping me pinpoint the origin.

Fixes: b723748750 ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040")
Cc: David S. Miller <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:38:57 +01:00
Hoang Le d237a7f117 tipc: fix sleeping in tipc accept routine
release_sock() is a blocking function, so the state may change
after sleeping. In order to evaluate the stated condition outside
the socket lock context, switch to using wait_woken() instead.

Fixes: 6398e23cdb ("tipc: standardize accept routine")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:36:37 +01:00
Xin Long f8dd60de19 tipc: fix implicit-connect for SYN+
For implicit-connect, when it's either SYN- or SYN+, an ACK should
be sent back to the client immediately. It's not appropriate for
the client to enter established state only after receiving data
from the server.

On client side, after the SYN is sent out, tipc_wait_for_connect()
should be called to wait for the ACK if timeout is set.

This patch also restricts __tipc_sendstream() to call __sendmsg()
only when it's in TIPC_OPEN state, so that the client can program
in a single loop doing both connecting and data sending like:

  for (...)
      sendmsg(dest, buf);

This makes the implicit-connect more implicit.

Fixes: b97bf3fd8f ("[TIPC] Initial merge")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:33:54 +01:00
Rafael J. Wysocki 0b8a53a844 Merge branch 'acpi-utils'
* acpi-utils:
  ACPI: utils: Fix reference counting in for_each_acpi_dev_match()
2021-07-23 17:06:15 +02:00
Sunil Goutham d72e91efca octeontx2-af: Remove unnecessary devm_kfree
Remove devm_kfree of memory where VLAN entry to RVU PF mapping
info is saved. This will be freed anyway at driver exit.
Having this could result in a warning from devm_kfree() if
the memory is not allocated due to errors in rvu_nix_block_init()
before nix_setup_txvlan().

Fixes: 9a946def26 ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:01:53 +01:00
Jedrzej Jagielski ea52faae1d i40e: Fix log TC creation failure when max num of queues is exceeded
Fix missing failure message if the driver does not have enough queues to
complete the TC command. Without this fix no message is displayed in dmesg.

Fixes: a9ce82f744 ("i40e: Enable 'channel' mode in mqprio for TC configs")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Imam Hassan Reza Biswas <imam.hassan.reza.biswas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-07-23 07:44:48 -07:00
Jedrzej Jagielski 89ec1f0886 i40e: Fix queue-to-TC mapping on Tx
In SW DCB mode the packets sent receive incorrect UP tags. They are
constructed correctly and put into tx_ring, but UP is later remapped by
HW on the basis of TCTUPR register contents according to Tx queue
selected, and BW used is consistent with the new UP values. This is
caused by Tx queue selection in kernel not taking into account DCB
configuration. This patch fixes the issue by implementing the
ndo_select_queue NDO callback.

Fixes: fd0a05ce74 ("i40e: transmit, receive, and NAPI")
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Imam Hassan Reza Biswas <imam.hassan.reza.biswas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-07-23 07:44:48 -07:00
Lukasz Cieplicki dc614c4617 i40e: Add additional info to PHY type error
When a PHY type error occurred, the message was too generic.
Add additional info to the PHY type error indicating that the wrong
cable may be connected.

Fixes: 124ed15bf1 ("i40e: Add dual speed module support")
Signed-off-by: Lukasz Cieplicki <lukaszx.cieplicki@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-07-23 07:44:48 -07:00
Arkadiusz Kubalewski 71d6fdba4b i40e: Fix firmware LLDP agent related warning
Make warning meaningful for the user.

Previously the trace:
"Starting FW LLDP agent failed: error: I40E_ERR_ADMIN_QUEUE_ERROR, I40E_AQ_RC_EAGAIN"
was produced when the user tried to start the Firmware LLDP agent
just after it was stopped with the sequence:
ethtool --set-priv-flags <dev> disable-fw-lldp on
ethtool --set-priv-flags <dev> disable-fw-lldp off
(without any delay between the commands)
At that point the firmware is still processing the stop command, so the
behavior is expected.

Fixes: c1041d0704 ("i40e: Missing response checks in driver when starting/stopping FW LLDP")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Imam Hassan Reza Biswas <imam.hassan.reza.biswas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-07-23 07:44:48 -07:00
Arkadiusz Kubalewski 65662a8dcd i40e: Fix logic of disabling queues
Correct the message flow between driver and firmware when disabling
queues.

Previously, in case of a PF reset (due to a required reinit after reconfig),
an error like "VSI seid 397 Tx ring 60 disable timeout" could show up
occasionally. The error was not a real hardware or firmware issue;
it was caused by a wrong sequence of messages invoked by the driver.

Fixes: 41c445ff0f ("i40e: main driver core")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-07-23 07:44:48 -07:00