linux

mirror of https://github.com/torvalds/linux synced 2024-09-20 19:17:24 +00:00

Author	SHA1	Message	Date
Anton Eidelman	af8fd04247	nvme-multipath: fix possible io hang after ctrl reconnect The following scenario results in an IO hang: 1) ctrl completes a request with NVME_SC_ANA_TRANSITION. NVME_NS_ANA_PENDING bit in ns->flags is set and ana_work is triggered. 2) ana_work: nvme_read_ana_log() tries to get the ANA log page from the ctrl. This fails because ctrl disconnects. Therefore nvme_update_ns_ana_state() is not called and NVME_NS_ANA_PENDING bit in ns->flags is not cleared. 3) ctrl reconnects: nvme_mpath_init(ctrl,...) calls nvme_read_ana_log(ctrl, groups_only=true). However, nvme_update_ana_state() does not update namespaces because nr_nsids = 0 (due to groups_only mode). 4) scan_work calls nvme_validate_ns() finds the ns and re-validates OK. Result: The ctrl is now live but NVME_NS_ANA_PENDING bit in ns->flags is still set. Consequently ctrl will never be considered a viable path by __nvme_find_path(). IO will hang if ctrl is the only or the last path to the namespace. More generally, while ctrl is reconnecting, its ANA state may change. And because nvme_mpath_init() requests ANA log in groups_only mode, these changes are not propagated to the existing ctrl namespaces. This may result in a mal-function or an IO hang. Solution: nvme_mpath_init() will nvme_read_ana_log() with groups_only set to false. This will not harm the new ctrl case (no namespaces present), and will make sure the ANA state of namespaces gets updated after reconnect. Note: Another option would be for nvme_mpath_init() to invoke nvme_parse_ana_log(..., nvme_set_ns_ana_state) for each existing namespace. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-10-29 08:55:00 -06:00
Nicholas Piggin	7d6475051f	powerpc/powernv: Fix CPU idle to be called with IRQs disabled Commit `e78a7614f3` ("idle: Prevent late-arriving interrupts from disrupting offline") changes arch_cpu_idle_dead to be called with interrupts disabled, which triggers the WARN in pnv_smp_cpu_kill_self. Fix this by fixing up irq_happened after hard disabling, rather than requiring there are no pending interrupts, similarly to what was done done until commit `2525db04d1` ("powerpc/powernv: Simplify lazy IRQ handling in CPU offline"). Fixes: `e78a7614f3` ("idle: Prevent late-arriving interrupts from disrupting offline") Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add unexpected_mask rather than checking for known bad values, change the WARN_ON() to a WARN_ON_ONCE()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191022115814.22456-1-npiggin@gmail.com	2019-10-29 21:47:01 +11:00
Valentin Schneider	e284df705c	sched/topology: Allow sched_asym_cpucapacity to be disabled While the static key is correctly initialized as being disabled, it will remain forever enabled once turned on. This means that if we start with an asymmetric system and hotplug out enough CPUs to end up with an SMP system, the static key will remain set - which is obviously wrong. We should detect this and turn off things like misfit migration and capacity aware wakeups. As Quentin pointed out, having separate root domains makes this slightly trickier. We could have exclusive cpusets that create an SMP island - IOW, the domains within this root domain will not see any asymmetry. This means we can't just disable the key on domain destruction, we need to count how many asymmetric root domains we have. Consider the following example using Juno r0 which is 2+4 big.LITTLE, where two identical cpusets are created: they both span both big and LITTLE CPUs: asym0 asym1 [ ][ ] L L B L L B $ cgcreate -g cpuset:asym0 $ cgset -r cpuset.cpus=0,1,3 asym0 $ cgset -r cpuset.mems=0 asym0 $ cgset -r cpuset.cpu_exclusive=1 asym0 $ cgcreate -g cpuset:asym1 $ cgset -r cpuset.cpus=2,4,5 asym1 $ cgset -r cpuset.mems=0 asym1 $ cgset -r cpuset.cpu_exclusive=1 asym1 $ cgset -r cpuset.sched_load_balance=0 . (the CPU numbering may look odd because on the Juno LITTLEs are CPUs 0,3-5 and bigs are CPUs 1-2) If we make one of those SMP (IOW remove asymmetry) by e.g. hotplugging its big core, we would end up with an SMP cpuset and an asymmetric cpuset - the static key must remain set, because we still have one asymmetric root domain. With the above example, this could be done with: $ echo 0 > /sys/devices/system/cpu/cpu2/online Which would result in: asym0 asym1 [ ][ ] L L B L L When both SMP and asymmetric cpusets are present, all CPUs will observe sched_asym_cpucapacity being set (it is system-wide), but not all CPUs observe asymmetry in their sched domain hierarchy: per_cpu(sd_asym_cpucapacity, <any CPU in asym0>) == <some SD at DIE level> per_cpu(sd_asym_cpucapacity, <any CPU in asym1>) == NULL Change the simple key enablement to an increment, and decrement the key counter when destroying domains that cover asymmetric CPUs. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hannes@cmpxchg.org Cc: lizefan@huawei.com Cc: morten.rasmussen@arm.com Cc: qperret@google.com Cc: tj@kernel.org Cc: vincent.guittot@linaro.org Fixes: `df054e8445` ("sched/topology: Add static_key for asymmetric CPU capacity optimizations") Link: https://lkml.kernel.org/r/20191023153745.19515-3-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-29 09:58:46 +01:00
Valentin Schneider	cd1cb33505	sched/topology: Don't try to build empty sched domains Turns out hotplugging CPUs that are in exclusive cpusets can lead to the cpuset code feeding empty cpumasks to the sched domain rebuild machinery. This leads to the following splat: Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23 Hardware name: ARM Juno development board (r0) (DT) Workqueue: events cpuset_hotplug_workfn pstate: 60000005 (nZCv daif -PAN -UAO) pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) lr : build_sched_domains (kernel/sched/topology.c:1966) Call trace: build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) partition_sched_domains_locked (kernel/sched/topology.c:2250) rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019) rebuild_sched_domains (kernel/cgroup/cpuset.c:1032) cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2)) process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274) worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416) kthread (kernel/kthread.c:255) ret_from_fork (arch/arm64/kernel/entry.S:1167) Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853) The faulty line in question is: cap = arch_scale_cpu_capacity(cpumask_first(cpu_map)); and we're not checking the return value against nr_cpu_ids (we shouldn't have to!), which leads to the above. Prevent generate_sched_domains() from returning empty cpumasks, and add some assertion in build_sched_domains() to scream bloody murder if it happens again. The above splat was obtained on my Juno r0 with the following reproducer: $ cgcreate -g cpuset:asym $ cgset -r cpuset.cpus=0-3 asym $ cgset -r cpuset.mems=0 asym $ cgset -r cpuset.cpu_exclusive=1 asym $ cgcreate -g cpuset:smp $ cgset -r cpuset.cpus=4-5 smp $ cgset -r cpuset.mems=0 smp $ cgset -r cpuset.cpu_exclusive=1 smp $ cgset -r cpuset.sched_load_balance=0 . $ echo 0 > /sys/devices/system/cpu/cpu4/online $ echo 0 > /sys/devices/system/cpu/cpu5/online Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hannes@cmpxchg.org Cc: lizefan@huawei.com Cc: morten.rasmussen@arm.com Cc: qperret@google.com Cc: tj@kernel.org Cc: vincent.guittot@linaro.org Fixes: `05484e0984` ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection") Link: https://lkml.kernel.org/r/20191023153745.19515-2-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-29 09:58:45 +01:00
Alan Stern	54f83b8c8e	USB: gadget: Reject endpoints with 0 maxpacket value Endpoints with a maxpacket length of 0 are probably useless. They can't transfer any data, and it's not at all unlikely that a UDC will crash or hang when trying to handle a non-zero-length usb_request for such an endpoint. Indeed, dummy-hcd gets a divide error when trying to calculate the remainder of a transfer length by the maxpacket value, as discovered by the syzbot fuzzer. Currently the gadget core does not check for endpoints having a maxpacket value of 0. This patch adds a check to usb_ep_enable(), preventing such endpoints from being used. As far as I know, none of the gadget drivers in the kernel tries to create an endpoint with maxpacket = 0, but until now there has been nothing to prevent userspace programs under gadgetfs or configfs from doing it. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: syzbot+8ab8bf161038a8768553@syzkaller.appspotmail.com CC: <stable@vger.kernel.org> Acked-by: Felipe Balbi <balbi@kernel.org> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281052370.1485-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-29 09:56:18 +01:00
Thiago Jung Bauermann	05d9a95283	powerpc/prom_init: Undo relocation before entering secure mode The ultravisor will do an integrity check of the kernel image but we relocated it so the check will fail. Restore the original image by relocating it back to the kernel virtual base address. This works because during build vmlinux is linked with an expected virtual runtime address of KERNELBASE. Fixes: `6a9c930bd7` ("powerpc/prom_init: Add the ESM call to prom_init") Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Tested-by: Michael Anderson <andmike@linux.ibm.com> [mpe: Add IS_ENABLED() to fix the CONFIG_RELOCATABLE=n build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190911163433.12822-1-bauerman@linux.ibm.com	2019-10-29 15:12:17 +11:00
Nicholas Piggin	d3566abb1a	scsi: qla2xxx: stop timer in shutdown path In shutdown/reboot paths, the timer is not stopped: qla2x00_shutdown pci_device_shutdown device_shutdown kernel_restart_prepare kernel_restart sys_reboot This causes lockups (on powerpc) when firmware config space access calls are interrupted by smp_send_stop later in reboot. Fixes: `e30d175648` ("[SCSI] qla2xxx: Addition of shutdown callback handler.") Link: https://lore.kernel.org/r/20191024063804.14538-1-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 21:58:01 -04:00
Nicolin Chen	2ccb4f16d0	hwmon: (ina3221) Fix read timeout issue After introducing "samples" to the calculation of wait time, the driver might timeout at the regmap_field_read_poll_timeout call, because the wait time could be longer than the 100000 usec limit due to a large "samples" number. So this patch sets the timeout limit to 2 times of the wait time in order to fix this issue. Fixes: `5c090abf94` ("hwmon: (ina3221) Add averaging mode support") Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com> Link: https://lore.kernel.org/r/20191022005922.30239-1-nicoleotsuka@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2019-10-28 18:46:55 -07:00
David S. Miller	55793d2a43	Here are two batman-adv bugfixes: * Fix free/alloc race for OGM and OGMv2, by Sven Eckelmann (2 patches) -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAl2yvoAWHHN3QHNpbW9u d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoVosEAC7Ei09QJKa2Wh6by4c8OQKvloK ecei39e0FjMUx4/XETSR/kdHrRFwzMPvIhqe+DUCski7p5x9sHWrkJwIWWirGrDf mAsF8mhA1E0pfsQmQRqu4h/UbDyLLhsbfEf95B75Uf6Nq2cIG0Ko7rb2ao3DR7c2 YwvSjzT+zwlS8Q6Hj1YsJKSS3BEk5NQ9r9soD9WhwWmYynn9TxTWy8o9z3lBCXtV j+mOm0qNebxsDKhr7jbfujjIaepmTOQJHwqPzm1Vo/v6CMEdJzd6j8vbsxCUyXvI /Cg3V+8INEGmRmCDT96paO5kYkHptr8i467Raj3DAdwlh3BuoMo382i3D3GOXgEK VtieWhuOplAJrbc+GFifvFSAA75+B8rZ/IV8CDFIbWHhEjZ/fdtyl1panqQkKSqw n5m/nAyKmfTZ9CH0kZ/zKU8EKPaMNh0BUTbjhU/8Zn0lDI+loBs024K5a00TWtD8 GnjCGKi87HhtCn/3hxasbaVEyJIrMvPlzyYRylzFfSml55GCBuA8i8l/7ql0QKCe xq78J08x/n3RjAj+aguLQ1v4Vz/EgOvzj4RdfHS2CGSI9m37z7wz5oFTKcqi4tX+ 4Phpl7onRnMDx6AiZHMGxwrti/AdFO3YZmOiIh/IEBjUBhmABuj6Eq40thl/8Hkv DfoOTKf98HhmaW6d/g== =usWf -----END PGP SIGNATURE----- Merge tag 'batadv-net-for-davem-20191025' of git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here are two batman-adv bugfixes: * Fix free/alloc race for OGM and OGMv2, by Sven Eckelmann (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 16:39:07 -07:00
Daniel Wagner	0a29ac5bd3	net: usb: lan78xx: Disable interrupts before calling generic_handle_irq() lan78xx_status() will run with interrupts enabled due to the change in `ed194d1367` ("usb: core: remove local_irq_save() around ->complete() handler"). generic_handle_irq() expects to be run with IRQs disabled. [ 4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts [ 4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168 [ 4.896294] 000: Modules linked in: [ 4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39 [ 4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT) [ 4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO) [ 4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168 [ 4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168 [ 4.896339] 000: sp : ffff000010003cc0 [ 4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060 [ 4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b [ 4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400 [ 4.896373] 000: x23: 000000000000004f x22: ffff000010003d64 [ 4.896381] 000: x21: 0000000000000000 x20: 0000000000000002 [ 4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060 [ 4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb [ 4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465 [ 4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0 [ 4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08 [ 4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920 [ 4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00 [ 4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510 [ 4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000 [ 4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000 [ 4.896466] 000: Call trace: [ 4.896471] 000: __handle_irq_event_percpu+0x154/0x168 [ 4.896481] 000: handle_irq_event_percpu+0x50/0xb0 [ 4.896489] 000: handle_irq_event+0x40/0x98 [ 4.896497] 000: handle_simple_irq+0xa4/0xf0 [ 4.896505] 000: generic_handle_irq+0x24/0x38 [ 4.896513] 000: intr_complete+0xb0/0xe0 [ 4.896525] 000: __usb_hcd_giveback_urb+0x58/0xd8 [ 4.896533] 000: usb_giveback_urb_bh+0xd0/0x170 [ 4.896539] 000: tasklet_action_common.isra.0+0x9c/0x128 [ 4.896549] 000: tasklet_hi_action+0x24/0x30 [ 4.896556] 000: __do_softirq+0x120/0x23c [ 4.896564] 000: irq_exit+0xb8/0xd8 [ 4.896571] 000: __handle_domain_irq+0x64/0xb8 [ 4.896579] 000: bcm2836_arm_irqchip_handle_irq+0x60/0xc0 [ 4.896586] 000: el1_irq+0xb8/0x140 [ 4.896592] 000: arch_cpu_idle+0x10/0x18 [ 4.896601] 000: do_idle+0x200/0x280 [ 4.896608] 000: cpu_startup_entry+0x20/0x28 [ 4.896615] 000: rest_init+0xb4/0xc0 [ 4.896623] 000: arch_call_rest_init+0xc/0x14 [ 4.896632] 000: start_kernel+0x454/0x480 Fixes: `ed194d1367` ("usb: core: remove local_irq_save() around ->complete() handler") Cc: Woojung Huh <woojung.huh@microchip.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Stefan Wahren <wahrenst@gmx.net> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Miller <davem@davemloft.net> Signed-off-by: Daniel Wagner <dwagner@suse.de> Tested-by: Stefan Wahren <wahrenst@gmx.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 16:35:04 -07:00
Arnd Bergmann	5d294fc483	net: dsa: sja1105: improve NET_DSA_SJA1105_TAS dependency An earlier bugfix introduced a dependency on CONFIG_NET_SCH_TAPRIO, but this missed the case of NET_SCH_TAPRIO=m and NET_DSA_SJA1105=y, which still causes a link error: drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_setup_tc_taprio': sja1105_tas.c:(.text+0x5c): undefined reference to `taprio_offload_free' sja1105_tas.c:(.text+0x3b4): undefined reference to `taprio_offload_get' drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_tas_teardown': sja1105_tas.c:(.text+0x6ec): undefined reference to `taprio_offload_free' Change the dependency to only allow selecting the TAS code when it can link against the taprio code. Fixes: `a8d570de0c` ("net: dsa: sja1105: Add dependency for NET_DSA_SJA1105_TAS") Fixes: `317ab5b86c` ("net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 16:33:42 -07:00
Benjamin Herrenschmidt	88824e3bf2	net: ethernet: ftgmac100: Fix DMA coherency issue with SW checksum We are calling the checksum helper after the dma_map_single() call to map the packet. This is incorrect as the checksumming code will touch the packet from the CPU. This means the cache won't be properly flushes (or the bounce buffering will leave us with the unmodified packet to DMA). This moves the calculation of the checksum & vlan tags to before the DMA mapping. This also has the side effect of fixing another bug: If the checksum helper fails, we goto "drop" to drop the packet, which will not unmap the DMA mapping. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Fixes: `05690d633f` ("ftgmac100: Upgrade to NETIF_F_HW_CSUM") Reviewed-by: Vijay Khemka <vijaykhemka@fb.com> Tested-by: Vijay Khemka <vijaykhemka@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 16:22:50 -07:00
Tejun Heo	20eb4f29b6	net: fix sk_page_frag() recursion from memory reclaim sk_page_frag() optimizes skb_frag allocations by using per-task skb_frag cache when it knows it's the only user. The condition is determined by seeing whether the socket allocation mask allows blocking - if the allocation may block, it obviously owns the task's context and ergo exclusively owns current->task_frag. Unfortunately, this misses recursion through memory reclaim path. Please take a look at the following backtrace. [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10 ... tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 sock_xmit.isra.24+0xa1/0x170 [nbd] nbd_send_cmd+0x1d2/0x690 [nbd] nbd_queue_rq+0x1b5/0x3b0 [nbd] __blk_mq_try_issue_directly+0x108/0x1b0 blk_mq_request_issue_directly+0xbd/0xe0 blk_mq_try_issue_list_directly+0x41/0xb0 blk_mq_sched_insert_requests+0xa2/0xe0 blk_mq_flush_plug_list+0x205/0x2a0 blk_flush_plug_list+0xc3/0xf0 [1] blk_finish_plug+0x21/0x2e _xfs_buf_ioapply+0x313/0x460 __xfs_buf_submit+0x67/0x220 xfs_buf_read_map+0x113/0x1a0 xfs_trans_read_buf_map+0xbf/0x330 xfs_btree_read_buf_block.constprop.42+0x95/0xd0 xfs_btree_lookup_get_block+0x95/0x170 xfs_btree_lookup+0xcc/0x470 xfs_bmap_del_extent_real+0x254/0x9a0 __xfs_bunmapi+0x45c/0xab0 xfs_bunmapi+0x15/0x30 xfs_itruncate_extents_flags+0xca/0x250 xfs_free_eofblocks+0x181/0x1e0 xfs_fs_destroy_inode+0xa8/0x1b0 destroy_inode+0x38/0x70 dispose_list+0x35/0x50 prune_icache_sb+0x52/0x70 super_cache_scan+0x120/0x1a0 do_shrink_slab+0x120/0x290 shrink_slab+0x216/0x2b0 shrink_node+0x1b6/0x4a0 do_try_to_free_pages+0xc6/0x370 try_to_free_mem_cgroup_pages+0xe3/0x1e0 try_charge+0x29e/0x790 mem_cgroup_charge_skmem+0x6a/0x100 __sk_mem_raise_allocated+0x18e/0x390 __sk_mem_schedule+0x2a/0x40 [0] tcp_sendmsg_locked+0x8eb/0xe10 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x26d/0x2b0 __sys_sendmsg+0x57/0xa0 do_syscall_64+0x42/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 In [0], tcp_send_msg_locked() was using current->page_frag when it called sk_wmem_schedule(). It already calculated how many bytes can be fit into current->page_frag. Due to memory pressure, sk_wmem_schedule() called into memory reclaim path which called into xfs and then IO issue path. Because the filesystem in question is backed by nbd, the control goes back into the tcp layer - back into tcp_sendmsg_locked(). nbd sets sk_allocation to (GFP_NOIO \| __GFP_MEMALLOC) which makes sense - it's in the process of freeing memory and wants to be able to, e.g., drop clean pages to make forward progress. However, this confused sk_page_frag() called from [2]. Because it only tests whether the allocation allows blocking which it does, it now thinks current->page_frag can be used again although it already was being used in [0]. After [2] used current->page_frag, the offset would be increased by the used amount. When the control returns to [0], current->page_frag's offset is increased and the previously calculated number of bytes now may overrun the end of allocated memory leading to silent memory corruptions. Fix it by adding gfpflags_normal_context() which tests sleepable && !reclaim and use it to determine whether to use current->task_frag. v2: Eric didn't like gfp flags being tested twice. Introduce a new helper gfpflags_normal_context() and combine the two tests. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 16:17:31 -07:00
Eric Dumazet	a793183caa	udp: fix data-race in udp_set_dev_scratch() KCSAN reported a data-race in udp_set_dev_scratch() [1] The issue here is that we must not write over skb fields if skb is shared. A similar issue has been fixed in commit `89c22d8c3b` ("net: Fix skb csum races when peeking") While we are at it, use a helper only dealing with udp_skb_scratch(skb)->csum_unnecessary, as this allows udp_set_dev_scratch() to be called once and thus inlined. [1] BUG: KCSAN: data-race in udp_set_dev_scratch / udpv6_recvmsg write to 0xffff888120278317 of 1 bytes by task 10411 on cpu 1: udp_set_dev_scratch+0xea/0x200 net/ipv4/udp.c:1308 __first_packet_length+0x147/0x420 net/ipv4/udp.c:1556 first_packet_length+0x68/0x2a0 net/ipv4/udp.c:1579 udp_poll+0xea/0x110 net/ipv4/udp.c:2720 sock_poll+0xed/0x250 net/socket.c:1256 vfs_poll include/linux/poll.h:90 [inline] do_select+0x7d0/0x1020 fs/select.c:534 core_sys_select+0x381/0x550 fs/select.c:677 do_pselect.constprop.0+0x11d/0x160 fs/select.c:759 __do_sys_pselect6 fs/select.c:784 [inline] __se_sys_pselect6 fs/select.c:769 [inline] __x64_sys_pselect6+0x12e/0x170 fs/select.c:769 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffff888120278317 of 1 bytes by task 10413 on cpu 0: udp_skb_csum_unnecessary include/net/udp.h:358 [inline] udpv6_recvmsg+0x43e/0xe90 net/ipv6/udp.c:310 inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 10413 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: `2276f58ac5` ("udp: use a separate rx queue for packet reception") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:53:40 -07:00
Nishad Kamdar	7de4344f2a	net: dpaa2: Use the correct style for SPDX License Identifier This patch corrects the SPDX License Identifier style in header files related to DPAA2 Ethernet driver supporting Freescale SoCs with DPAA2. For C header files Documentation/process/license-rules.rst mandates C-like comments (opposed to C source files where C++ style should be used) Changes made by using a script provided by Joe Perches here: https://lkml.org/lkml/2019/2/7/46. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:40:17 -07:00
David S. Miller	2024305863	Merge branch 'net-avoid-KCSAN-splats' Eric Dumazet says: ==================== net: avoid KCSAN splats Often times we use skb_queue_empty() without holding a lock, meaning that other cpus (or interrupt) can change the queue under us. This is fine, but we need to properly annotate the lockless intent to make sure the compiler wont over optimize things. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:33:41 -07:00
Eric Dumazet	7c422d0ce9	net: add READ_ONCE() annotation in __skb_wait_for_more_packets() __skb_wait_for_more_packets() can be called while other cpus can feed packets to the socket receive queue. KCSAN reported : BUG: KCSAN: data-race in __skb_wait_for_more_packets / __udp_enqueue_schedule_skb write to 0xffff888102e40b58 of 8 bytes by interrupt on cpu 0: __skb_insert include/linux/skbuff.h:1852 [inline] __skb_queue_before include/linux/skbuff.h:1958 [inline] __skb_queue_tail include/linux/skbuff.h:1991 [inline] __udp_enqueue_schedule_skb+0x2d7/0x410 net/ipv4/udp.c:1470 __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline] udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057 udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074 udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233 __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300 udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955 read to 0xffff888102e40b58 of 8 bytes by task 13035 on cpu 1: __skb_wait_for_more_packets+0xfa/0x320 net/core/datagram.c:100 __skb_recv_udp+0x374/0x500 net/ipv4/udp.c:1683 udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 13035 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:33:41 -07:00
Eric Dumazet	3f926af3f4	net: use skb_queue_empty_lockless() in busy poll contexts Busy polling usually runs without locks. Let's use skb_queue_empty_lockless() instead of skb_queue_empty() Also uses READ_ONCE() in __skb_try_recv_datagram() to address a similar potential problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:33:41 -07:00
Eric Dumazet	3ef7cf57c7	net: use skb_queue_empty_lockless() in poll() handlers Many poll() handlers are lockless. Using skb_queue_empty_lockless() instead of skb_queue_empty() is more appropriate. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:33:41 -07:00
Eric Dumazet	137a0dbe34	udp: use skb_queue_empty_lockless() syzbot reported a data-race [1]. We should use skb_queue_empty_lockless() to document that we are not ensuring a mutual exclusion and silence KCSAN. [1] BUG: KCSAN: data-race in __skb_recv_udp / __udp_enqueue_schedule_skb write to 0xffff888122474b50 of 8 bytes by interrupt on cpu 0: __skb_insert include/linux/skbuff.h:1852 [inline] __skb_queue_before include/linux/skbuff.h:1958 [inline] __skb_queue_tail include/linux/skbuff.h:1991 [inline] __udp_enqueue_schedule_skb+0x2c1/0x410 net/ipv4/udp.c:1470 __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline] udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057 udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074 udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233 __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300 udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955 read to 0xffff888122474b50 of 8 bytes by task 8921 on cpu 1: skb_queue_empty include/linux/skbuff.h:1494 [inline] __skb_recv_udp+0x18d/0x500 net/ipv4/udp.c:1653 udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 8921 Comm: syz-executor.4 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:33:41 -07:00
Eric Dumazet	d7d16a8935	net: add skb_queue_empty_lockless() Some paths call skb_queue_empty() without holding the queue lock. We must use a barrier in order to not let the compiler do strange things, and avoid KCSAN splats. Adding a barrier in skb_queue_empty() might be overkill, I prefer adding a new helper to clearly identify points where the callers might be lockless. This might help us finding real bugs. The corresponding WRITE_ONCE() should add zero cost for current compilers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-10-28 13:33:41 -07:00
Linus Torvalds	8005803a2c	ARC fixes for 5.4-rc6 - perf fix for Big Endian build [Alexey] - hadk platform enable soem peripherals [Eugeniy] -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEOXpuCuR6hedrdLCJadfx3eKKwl4FAl23ISQACgkQadfx3eKK wl6xCA/+NRJcy7Iwu3uBOkSrQd55BEEpi5QPnf2VfHFFhYUjqT9KgVDK0F5+Xj+l MQ9Jrbybr7JD9g19FfKq9TzajfLf3R2QU1anZ9AEpBxsQnz+xp/KPgsVajUhJvMp nw/E2NeEbji9MeAEhKHQKWzxyfB9bHwGaorgIAm0lvq5NZ9pucAkHZlLlZ7rgo3J OhFc24XbOuxBdnb9Pnt9YBJduWxVYJV91MzYrQ/uUzFxAx+HFC6JU/+fsyA9NLeo bg5VkiCgnMm7DgsDoKdwb9v7vhMPBUvIUaM3lXqbsRahu2Xn84dOuXrFdfZ/Je25 uTDwXBZDC9YIbkHOOqbLW5brpoqWOoyd8qbjRiKid43jQK7zzlGHE6FbVX6vGy43 Oym6rBaUdQxghyZxau3T3yTQcm4QZiv4jI9592bbCRdk/gYoDY0d4V0cDm/1i4Rb USJkjslp5jQ3NqCn1siN4hfB/mnj8KJ9yHDIgY0H6YPqtKx/ZwTnWUWf9lCK6NaE rZ3vd2wiePs6mZmCGG3sWdYWJfkK1gBv/8pwTQroKhfW7mLNwXNnr3Wp2fZQQHWT WMFE51iCsQxkB71rFqNjy3oVIcTovElfEJYQv+XTz7/OuvAgWXrGVbjSXQQNfaRc ZZ2ImmSD02/g9wZYFBViVU4gJ6CkSyQgzIGGJ+8htN8WdXg2UnQ= =LNQX -----END PGP SIGNATURE----- Merge tag 'arc-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: "Small fixes for ARC: - perf fix for Big Endian build [Alexey] - hadk platform enable soem peripherals [Eugeniy]" * tag 'arc-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: perf: Accommodate big-endian CPU ARC: [plat-hsdk]: Enable on-boardi SPI ADC IC ARC: [plat-hsdk]: Enable on-board SPI NOR flash IC	2019-10-28 21:05:03 +01:00
Lijun Ou	b681a05299	RDMA/hns: Prevent memory leaks of eq->buf_list eq->buf_list->buf and eq->buf_list should also be freed when eqe_hop_num is set to 0, or there will be memory leaks. Fixes: `a5073d6054` ("RDMA/hns: Add eq support of hip08") Link: https://lore.kernel.org/r/1572072995-11277-3-git-send-email-liweihang@hisilicon.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-10-28 15:06:38 -03:00
Yash Shah	00a5bf3a8c	RISC-V: Add PCIe I/O BAR memory mapping For legacy I/O BARs (non-MMIO BARs) to work correctly on RISC-V Linux, we need to establish a reserved memory region for them, so that drivers that wish to use the legacy I/O BARs can issue reads and writes against a memory region that is mapped to the host PCIe controller's I/O BAR mapping. Signed-off-by: Yash Shah <yash.shah@sifive.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-10-28 10:43:32 -07:00
Potnuri Bharat Teja	d4934f4569	RDMA/iw_cxgb4: Avoid freeing skb twice in arp failure case _put_ep_safe() and _put_pass_ep_safe() free the skb before it is freed by process_work(). fix double free by freeing the skb only in process_work(). Fixes: `1dad0ebeea` ("iw_cxgb4: Avoid touch after free error in ARP failure handlers") Link: https://lore.kernel.org/r/1572006880-5800-1-git-send-email-bharat@chelsio.com Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-10-28 14:13:33 -03:00
Jason Gunthorpe	1524b12a6e	RDMA/mlx5: Use irq xarray locking for mkey_table The mkey_table xarray is touched by the reg_mr_callback() function which is called from a hard irq. Thus all other uses of xa_lock must use the _irq variants. WARNING: inconsistent lock state 5.4.0-rc1 #12 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30 {IN-HARDIRQ-W} state was registered at: lock_acquire+0xe1/0x200 _raw_spin_lock_irqsave+0x35/0x50 reg_mr_callback+0x2dd/0x450 [mlx5_ib] mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core] mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core] [..] Possible unsafe locking scenario: CPU0 ---- lock(&(&xa->xa_lock)->rlock#3); <Interrupt> lock(&(&xa->xa_lock)->rlock#3); * DEADLOCK * 2 locks held by python3/343: #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs] #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs] stack backtrace: CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x86/0xca print_usage_bug.cold.50+0x2e5/0x355 mark_lock+0x871/0xb50 ? match_held_lock+0x20/0x250 ? check_usage_forwards+0x240/0x240 __lock_acquire+0x7de/0x23a0 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xae/0xb50 ? mark_held_locks+0xb0/0xb0 ? find_held_lock+0xca/0xf0 lock_acquire+0xe1/0x200 ? xa_erase+0x12/0x30 _raw_spin_lock+0x2a/0x40 ? xa_erase+0x12/0x30 xa_erase+0x12/0x30 mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib] uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs] uverbs_free_mw+0x1a/0x20 [ib_uverbs] destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs] [..] Fixes: `0417791536` ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases") Link: https://lore.kernel.org/r/20191024234910.GA9038@ziepe.ca Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-10-28 14:08:48 -03:00
Alan Stern	1186f86a71	UAS: Revert commit `3ae62a4209` ("UAS: fix alignment of scatter/gather segments") Commit `3ae62a4209` ("UAS: fix alignment of scatter/gather segments"), copying a similar commit for usb-storage, attempted to solve a problem involving scatter-gather I/O and USB/IP by setting the virt_boundary_mask for mass-storage devices. However, it now turns out that the analogous change in usb-storage interacted badly with commit `09324d32d2` ("block: force an unlimited segment size on queues with a virt boundary"), which was added later. A typical error message is: ehci-pci 0000:00:13.2: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 97 (slots) There is no longer any reason to keep the virt_boundary_mask setting in the uas driver. It was needed in the first place only for handling devices with a block size smaller than the maxpacket size and where the host controller was not capable of fully general scatter-gather operation (that is, able to merge two SG segments into a single USB packet). But: High-speed or slower connections never use a bulk maxpacket value larger than 512; The SCSI layer does not handle block devices with a block size smaller than 512 bytes; All the host controllers capable of SuperSpeed operation can handle fully general SG; Since commit `ea44d19076` ("usbip: Implement SG support to vhci-hcd and stub driver") was merged, the USB/IP driver can also handle SG. Therefore all supported device/controller combinations should be okay with no need for any special virt_boundary_mask. So in order to head off potential problems similar to those affecting usb-storage, this patch reverts commit `3ae62a4209`. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Oliver Neukum <oneukum@suse.com> CC: <stable@vger.kernel.org> Acked-by: Christoph Hellwig <hch@lst.de> Fixes: `3ae62a4209` ("UAS: fix alignment of scatter/gather segments") Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910231132470.1878-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:53:39 +01:00
Alan Stern	9a97694961	usb-storage: Revert commit `747668dbc0` ("usb-storage: Set virt_boundary_mask to avoid SG overflows") Commit `747668dbc0` ("usb-storage: Set virt_boundary_mask to avoid SG overflows") attempted to solve a problem involving scatter-gather I/O and USB/IP by setting the virt_boundary_mask for mass-storage devices. However, it now turns out that this interacts badly with commit `09324d32d2` ("block: force an unlimited segment size on queues with a virt boundary"), which was added later. A typical error message is: ehci-pci 0000:00:13.2: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 97 (slots) There is no longer any reason to keep the virt_boundary_mask setting for usb-storage. It was needed in the first place only for handling devices with a block size smaller than the maxpacket size and where the host controller was not capable of fully general scatter-gather operation (that is, able to merge two SG segments into a single USB packet). But: High-speed or slower connections never use a bulk maxpacket value larger than 512; The SCSI layer does not handle block devices with a block size smaller than 512 bytes; All the host controllers capable of SuperSpeed operation can handle fully general SG; Since commit `ea44d19076` ("usbip: Implement SG support to vhci-hcd and stub driver") was merged, the USB/IP driver can also handle SG. Therefore all supported device/controller combinations should be okay with no need for any special virt_boundary_mask. So in order to fix the swiotlb problem, this patch reverts commit `747668dbc0`. Reported-and-tested-by: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de> Link: https://marc.info/?l=linux-usb&m=157134199501202&w=2 Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Seth Bollinger <Seth.Bollinger@digi.com> CC: <stable@vger.kernel.org> Fixes: `747668dbc0` ("usb-storage: Set virt_boundary_mask to avoid SG overflows") Acked-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910211145520.1673-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:52:44 +01:00
Suwan Kim	d4d8257754	usbip: Fix free of unallocated memory in vhci tx iso_buffer should be set to NULL after use and free in the while loop. In the case of isochronous URB in the while loop, iso_buffer is allocated and after sending it to server, buffer is deallocated. And then, if the next URB in the while loop is not a isochronous pipe, iso_buffer still holds the previously deallocated buffer address and kfree tries to free wrong buffer address. Fixes: `ea44d19076` ("usbip: Implement SG support to vhci-hcd and stub driver") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com> Reviewed-by: Julia Lawall <julia.lawall@lip6.fr> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20191022093017.8027-1-suwan.kim027@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:51:06 +01:00
GwanYeong Kim	28df0642ab	usbip: tools: Fix read_usb_vudc_device() error path handling This isn't really accurate right. fread() doesn't always return 0 in error. It could return < number of elements and set errno. Signed-off-by: GwanYeong Kim <gy741.kim@gmail.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20191018032223.4644-1-gy741.kim@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:51:06 +01:00
Ben Dooks (Codethink)	d5501d5c29	usb: xhci: fix __le32/__le64 accessors in debugfs code It looks like some of the xhci debug code is passing u32 to functions directly from __le32/__le64 fields. Fix this by using le{32,64}_to_cpu() on these to fix the following sparse warnings; xhci-debugfs.c:205:62: warning: incorrect type in argument 1 (different base types) xhci-debugfs.c:205:62: expected unsigned int [usertype] field0 xhci-debugfs.c:205:62: got restricted __le32 xhci-debugfs.c:206:62: warning: incorrect type in argument 2 (different base types) xhci-debugfs.c:206:62: expected unsigned int [usertype] field1 xhci-debugfs.c:206:62: got restricted __le32 ... [Trim down commit message, sparse warnings were similar -Mathias] Cc: <stable@vger.kernel.org> # 4.15+ Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1572013829-14044-4-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:47:08 +01:00
Samuel Holland	bfa3dbb343	usb: xhci: fix Immediate Data Transfer endianness The arguments to queue_trb are always byteswapped to LE for placement in the ring, but this should not happen in the case of immediate data; the bytes copied out of transfer_buffer are already in the correct order. Add a complementary byteswap so the bytes end up in the ring correctly. This was observed on BE ppc64 with a "Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller [104c:8241]" as a ch341 usb-serial adapter ("1a86:7523 QinHeng Electronics HL-340 USB-Serial adapter") always transmitting the same character (generally NUL) over the serial link regardless of the key pressed. Cc: <stable@vger.kernel.org> # 5.2+ Fixes: `33e39350eb` ("usb: xhci: add Immediate Data Transfer support") Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1572013829-14044-3-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:47:07 +01:00
Mathias Nyman	18b74067ac	xhci: Fix use-after-free regression in xhci clear hub TT implementation commit `ef513be0a9` ("usb: xhci: Add Clear_TT_Buffer") schedules work to clear TT buffer, but causes a use-after-free regression at the same time Make sure hub_tt_work finishes before endpoint is disabled, otherwise the work will dereference already freed endpoint and device related pointers. This was triggered when usb core failed to read the configuration descriptor of a FS/LS device during enumeration. xhci driver queued clear_tt_work while usb core freed and reallocated a new device for the next enumeration attempt. EHCI driver implents ehci_endpoint_disable() that makes sure clear_tt_work has finished before it returns, but xhci lacks this support. usb core will call hcd->driver->endpoint_disable() callback before disabling endpoints, so we want this in xhci as well. The added xhci_endpoint_disable() is based on ehci_endpoint_disable() Fixes: `ef513be0a9` ("usb: xhci: Add Clear_TT_Buffer") Cc: <stable@vger.kernel.org> # v5.3 Reported-by: Johan Hovold <johan@kernel.org> Suggested-by: Johan Hovold <johan@kernel.org> Reviewed-by: Johan Hovold <johan@kernel.org> Tested-by: Johan Hovold <johan@kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1572013829-14044-2-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:47:07 +01:00
Johan Hovold	52403cfbc6	USB: ldusb: fix control-message timeout USB control-message timeouts are specified in milliseconds, not jiffies. Waiting 83 minutes for a transfer to complete is a bit excessive. Fixes: `2824bd250f` ("[PATCH] USB: add ldusb driver") Cc: stable <stable@vger.kernel.org> # 2.6.13 Reported-by: syzbot+a4fbb3bb76cda0ea4e58@syzkaller.appspotmail.com Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191022153127.22295-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:46:24 +01:00
Johan Hovold	88f6bf3846	USB: ldusb: use unsigned size format specifiers A recent info-leak bug manifested itself along with warning about a negative buffer overflow: ldusb 1-1:0.28: Read buffer overflow, -131383859965943 bytes dropped when it was really a rather large positive one. A sanity check that prevents this has now been put in place, but let's fix up the size format specifiers, which should all be unsigned. Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191022143203.5260-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:46:23 +01:00
Johan Hovold	d98ee2a19c	USB: ldusb: fix ring-buffer locking The custom ring-buffer implementation was merged without any locking or explicit memory barriers, but a spinlock was later added by commit `9d33efd9a7` ("USB: ldusb bugfix"). The lock did not cover the update of the tail index once the entry had been processed, something which could lead to memory corruption on weakly ordered architectures or due to compiler optimisations. Specifically, a completion handler running on another CPU might observe the incremented tail index and update the entry before ld_usb_read() is done with it. Fixes: `2824bd250f` ("[PATCH] USB: add ldusb driver") Fixes: `9d33efd9a7` ("USB: ldusb bugfix") Cc: stable <stable@vger.kernel.org> # 2.6.13 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191022143203.5260-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:46:23 +01:00
Alan Stern	d482c7bb05	USB: Skip endpoints with 0 maxpacket length Endpoints with a maxpacket length of 0 are probably useless. They can't transfer any data, and it's not at all unlikely that an HCD will crash or hang when trying to handle an URB for such an endpoint. Currently the USB core does not check for endpoints having a maxpacket value of 0. This patch adds a check, printing a warning and skipping over any endpoints it catches. Now, the USB spec does not rule out endpoints having maxpacket = 0. But since they wouldn't have any practical use, there doesn't seem to be any good reason for us to accept them. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281050420.1485-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-28 17:46:22 +01:00
Greg Kroah-Hartman	4ae8beac0a	USB: fixes for v5.4-rc5 Not much here, only 14 commits in different drivers. As for the specifics, Roger Quadros fixed an important bug in cdns3 where the driver was making decisions about data pull-up management behind the UDC framework's back. The Atmel UDC got a fix for interrupt storm in FIFO mode, this was done by Cristian Brisan. Apart from these, we have the usual set of non-critical fixes. Signed-off-by: Felipe Balbi <balbi@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEElLzh7wn96CXwjh2IzL64meEamQYFAl21QfQRHGJhbGJpQGtl cm5lbC5vcmcACgkQzL64meEamQY5KA/+NDC0DFqODUQiiQoyfu64S/6cfrNGtsfE TtkwYyZ9ps/IAy0CQmi0df4gTSrmUHSeqy1GtCA+l/JmaDdYferRr8rFJ8UAvi8R Ozf5cpCFeym+zKYtQcwF/KgTYmh7H34nX6vJb+WUfyTE/Z8r2Con3zv0xtj7eXtJ pBdgRIEdeyhI/6LddT9XdGI+6CYws/WEBKoBleC+LKpyOCPLlCJ64qmqOyxiQiW4 pcTjdcDzF6AKKsx+DefhpNMqAFrmkfRDsutFs0SupR0U5yOQ09/eOVht/Xg8buKi RaYsxPzm1lf8LcCdjtRMBoTYc8nFZ7UL9Wp9Lkin5tWOLqF3KPJ2fUMYfXDB3K5Z SGkmbaFrfyIHalaPuQ4ZFuZ3s6AjC+TDuoxYdXDK/slgRa0EWksym4Y36vy9EKhq 41lDGZ361Ult4etXTpnBf+TJTlpcrEqxsZDGCU64hvWwXRILD0swob0D0aI2fBE8 aKpsgeu3a7dY1kmzC8MqIjhUmY4E03MZr/qY34sVirXbOynUBZGpD3vuEGj2Inie 9zuAiek+huyKOoKLTk3mRRACApOyxcpzzeXUZcfXu1S/Ue2c6uzV8UGHwci8qnCk IlYTVqs4YPEMondTM2fpJmw1HJS59JbpwE0YUxmLnouMAQFCMr7TmKeAS/U8PKSI 3v6/GpEcDkg= =ci7a -----END PGP SIGNATURE----- Merge tag 'fixes-for-v5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus Felipe writes: USB: fixes for v5.4-rc5 Not much here, only 14 commits in different drivers. As for the specifics, Roger Quadros fixed an important bug in cdns3 where the driver was making decisions about data pull-up management behind the UDC framework's back. The Atmel UDC got a fix for interrupt storm in FIFO mode, this was done by Cristian Brisan. Apart from these, we have the usual set of non-critical fixes. Signed-off-by: Felipe Balbi <balbi@kernel.org> * tag 'fixes-for-v5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: usb: cdns3: gadget: Don't manage pullups usb: dwc3: remove the call trace of USBx_GFLADJ usb: gadget: configfs: fix concurrent issue between composite APIs usb: dwc3: pci: prevent memory leak in dwc3_pci_probe usb: gadget: composite: Fix possible double free memory bug usb: gadget: udc: atmel: Fix interrupt storm in FIFO mode. usb: renesas_usbhs: fix type of buf usb: renesas_usbhs: Fix warnings in usbhsg_recip_handler_std_set_device() usb: gadget: udc: renesas_usb3: Fix __le16 warnings usb: renesas_usbhs: fix __le16 warnings usb: cdns3: include host-export,h for cdns3_host_init usb: mtu3: fix missing include of mtu3_dr.h usb: fsl: Check memory resource before releasing it usb: dwc3: select CONFIG_REGMAP_MMIO	2019-10-28 17:28:59 +01:00
Jens Axboe	044c1ab399	io_uring: don't touch ctx in setup after ring fd install syzkaller reported an issue where it looks like a malicious app can trigger a use-after-free of reading the ctx ->sq_array and ->rings value right after having installed the ring fd in the process file table. Defer ring fd installation until after we're done reading those values. Fixes: `75b28affdd` ("io_uring: allocate the two rings together") Reported-by: syzbot+6f03d895a6cd0d06187f@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-10-28 09:15:33 -06:00
Linus Torvalds	0365fb6bae	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - HID++ device support regression fixes (race condition during cleanup, device detection fix, opps fix) from Andrey Smirnov - disable PM on i2c-hid, as it's causing problems with a lot of devices; other OSes apparently don't implement/enable it either; from Kai-Heng Feng - error handling fix in intel-ish driver, from Zhang Lixu - syzbot fuzzer fix for HID core code from Alan Stern - a few other tiny fixups (printk message cleanup, new device ID) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: i2c-hid: add Trekstor Primebook C11B to descriptor override HID: logitech-hidpp: do all FF cleanup in hidpp_ff_destroy() HID: logitech-hidpp: rework device validation HID: logitech-hidpp: split g920_get_config() HID: i2c-hid: Remove runtime power management HID: intel-ish-hid: fix wrong error handling in ishtp_cl_alloc_tx_ring() HID: google: add magnemite/masterball USB ids HID: Fix assumption that devices have inputs HID: prodikeys: make array keys static const, makes object smaller HID: fix error message in hid_open_report()	2019-10-28 14:26:33 +01:00
Linus Torvalds	9e5eefba3d	virtio: fixes Some minor fixes Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJdtqXyAAoJECgfDbjSjVRpJ3kH/008+Y5aDaWcWP4//49epuu/ dldFu073fppcNeYjRLOLEGPLHok3w2uh4zsBRZ+hrK8qiH7yjZDxxzLrbBWKINRE HpZRp5Xdtmdu7QRWUMBiGHTdU/CIY/tUpGRy2d+k8LxeTBcrZIqj/9ixrPRW6sh0 0iXPQJtHtDAbbMjbs86s9ScwfcY/mwxLFf23168DnwE44pk86faFGASstWx+XSKl DU14t3eQnYX33R089lDMP+ohKLsqMXyD160WWZDPg+Ml2GhzyGGWixcDmZn3k42q NBCOsfLL8uxdATZlWMlv4xX1ykiNcOfijBnp8eutsLx4JyIWFxqVF0bXO3GsztM= =xHx4 -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio fixes from Michael Tsirkin: "Some minor fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vringh: fix copy direction of vringh_iov_push_kern() vsock/virtio: remove unused 'work' field from 'struct virtio_vsock_pkt' virtio_ring: fix stalls for packed rings	2019-10-28 12:47:22 +01:00
Takashi Iwai	1a7f60b9df	Revert "ALSA: hda: Flush interrupts on disabling" This reverts commit `caa8422d01`. It turned out that this commit caused a regression at shutdown / reboot, as the synchronize_irq() calls seems blocking the whole shutdown. Also another part of the change about shuffling the call order looks suspicious; the azx_stop_chip() call disables the CORB / RIRB while the others may still need the CORB/RIRB update. Since the original commit itself was a cargo-fix, let's revert the whole patch. Fixes: `caa8422d01` ("ALSA: hda: Flush interrupts on disabling") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205333 BugLinK: https://bugs.freedesktop.org/show_bug.cgi?id=111174 Signed-off-by: Takashi Iwai <tiwai@suse.de> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://lore.kernel.org/r/20191028081056.22010-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2019-10-28 11:47:59 +01:00
Geert Uytterhoeven	652521d460	perf/headers: Fix spelling s/EACCESS/EACCES/, s/privilidge/privilege/ As per POSIX, the correct spelling of the error code is EACCES: include/uapi/asm-generic/errno-base.h:#define EACCES 13 /* Permission denied */ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Kosina <trivial@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/20191024122904.12463-1-geert+renesas@glider.be Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-28 11:02:01 +01:00
Kan Liang	75be6f703a	perf/x86/uncore: Fix event group support The events in the same group don't start or stop simultaneously. Here is the ftrace when enabling event group for uncore_iio_0: # perf stat -e "{uncore_iio_0/event=0x1/,uncore_iio_0/event=0xe/}" <idle>-0 [000] d.h. 8959.064832: read_msr: a41, value b2b0b030 //Read counter reg of IIO unit0 counter0 <idle>-0 [000] d.h. 8959.064835: write_msr: a48, value 400001 //Write Ctrl reg of IIO unit0 counter0 to enable counter0. <------ Although counter0 is enabled, Unit Ctrl is still freezed. Nothing will count. We are still good here. <idle>-0 [000] d.h. 8959.064836: read_msr: a40, value 30100 //Read Unit Ctrl reg of IIO unit0 <idle>-0 [000] d.h. 8959.064838: write_msr: a40, value 30000 //Write Unit Ctrl reg of IIO unit0 to enable all counters in the unit by clear Freeze bit <------Unit0 is un-freezed. Counter0 has been enabled. Now it starts counting. But counter1 has not been enabled yet. The issue starts here. <idle>-0 [000] d.h. 8959.064846: read_msr: a42, value 0 //Read counter reg of IIO unit0 counter1 <idle>-0 [000] d.h. 8959.064847: write_msr: a49, value 40000e //Write Ctrl reg of IIO unit0 counter1 to enable counter1. <------ Now, counter1 just starts to count. Counter0 has been running for a while. Current code un-freezes the Unit Ctrl right after the first counter is enabled. The subsequent group events always loses some counter values. Implement pmu_enable and pmu_disable support for uncore, which can help to batch hardware accesses. No one uses uncore_enable_box and uncore_disable_box. Remove them. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-drivers-review@eclists.intel.com Cc: linux-perf@eclists.intel.com Fixes: `087bfbb032` ("perf/x86: Add generic Intel uncore PMU support") Link: https://lkml.kernel.org/r/1572014593-31591-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-28 11:02:01 +01:00
Kim Phillips	e431e79b60	perf/x86/amd/ibs: Handle erratum #420 only on the affected CPU family (10h) This saves us writing the IBS control MSR twice when disabling the event. I searched revision guides for all families since 10h, and did not find occurrence of erratum #420, nor anything remotely similar: so we isolate the secondary MSR write to family 10h only. Also unconditionally update the count mask for IBS Op implementations that have read & writeable current count (CurCnt) fields in addition to the MaxCnt field. These bits were reserved on prior implementations, and therefore shouldn't have negative impact. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: `c9574fe0bd` ("perf/x86-ibs: Implement workaround for IBS erratum #420") Link: https://lkml.kernel.org/r/20191023150955.30292-2-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-28 11:02:00 +01:00
Kim Phillips	317b96bb14	perf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise RIP validity The loop that reads all the IBS MSRs into *buf stopped one MSR short of reading the IbsOpData register, which contains the RipInvalid status bit. Fix the offset_max assignment so the MSR gets read, so the RIP invalid evaluation is based on what the IBS h/w output, instead of what was left in memory. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: `d47e8238cd` ("perf/x86-ibs: Take instruction pointer from ibs sample") Link: https://lkml.kernel.org/r/20191023150955.30292-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-28 11:01:59 +01:00
Alexander Shishkin	8c7e975667	perf/core: Start rejecting the syscall with attr.__reserved_2 set Commit: `1a59413124` ("perf: Add wakeup watermark control to the AUX area") added attr.__reserved_2 padding, but forgot to add an ABI check to reject attributes with this field set. Fix that. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: adrian.hunter@intel.com Cc: mathieu.poirier@linaro.org Link: https://lkml.kernel.org/r/20191025121636.75182-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-28 11:01:59 +01:00
Jason Wang	b3683dee84	vringh: fix copy direction of vringh_iov_push_kern() We want to copy from iov to buf, so the direction was wrong. Note: no real user for the helper, but it will be used by future features. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-28 04:25:04 -04:00
Stefano Garzarella	6771596169	vsock/virtio: remove unused 'work' field from 'struct virtio_vsock_pkt' The 'work' field was introduced with commit `06a8fc7836` ("VSOCK: Introduce virtio_vsock_common.ko") but it is never used in the code, so we can remove it to save memory allocated in the per-packet 'struct virtio_vsock_pkt' Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-28 04:25:04 -04:00
Marvin Liu	40ce7919d8	virtio_ring: fix stalls for packed rings When VIRTIO_F_RING_EVENT_IDX is negotiated, virtio devices can use virtqueue_enable_cb_delayed_packed to reduce the number of device interrupts. At the moment, this is the case for virtio-net when the napi_tx module parameter is set to false. In this case, the virtio driver selects an event offset and expects that the device will send a notification when rolling over the event offset in the ring. However, if this roll-over happens before the event suppression structure update, the notification won't be sent. To address this race condition the driver needs to check wether the device rolled over the offset after updating the event suppression structure. With VIRTIO_F_RING_PACKED, the virtio driver did this by reading the flags field of the descriptor at the specified offset. Unfortunately, checking at the event offset isn't reliable: if descriptors are chained (e.g. when INDIRECT is off) not all descriptors are overwritten by the device, so it's possible that the device skipped the specific descriptor driver is checking when writing out used descriptors. If this happens, the driver won't detect the race condition and will incorrectly expect the device to send a notification. For virtio-net, the result will be a TX queue stall, with the transmission getting blocked forever. With the packed ring, it isn't easy to find a location which is guaranteed to change upon the roll-over, except the next device descriptor, as described in the spec: Writes of device and driver descriptors can generally be reordered, but each side (driver and device) are only required to poll (or test) a single location in memory: the next device descriptor after the one they processed previously, in circular order. while this might be sub-optimal, let's do exactly this for now. Cc: stable@vger.kernel.org Cc: Jason Wang <jasowang@redhat.com> Fixes: `f51f982682` ("virtio_ring: leverage event idx in packed ring") Signed-off-by: Marvin Liu <yong.liu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-28 04:24:46 -04:00

... 3 4 5 6 7 ...

872770 commits