linux

mirror of https://github.com/torvalds/linux synced 2024-10-17 08:46:50 +00:00

Author	SHA1	Message	Date
Taehee Yoo	bd5cd35b78	gtp: use __GFP_NOWARN to avoid memalloc warning gtp hashtable size is received by user-space. So, this hashtable size could be too large. If so, kmalloc will internally print a warning message. This warning message is actually not necessary for the gtp module. So, this patch adds __GFP_NOWARN to avoid this message. Splat looks like: [ 2171.200049][ T1860] WARNING: CPU: 1 PID: 1860 at mm/page_alloc.c:4713 __alloc_pages_nodemask+0x2f3/0x740 [ 2171.238885][ T1860] Modules linked in: gtp veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv] [ 2171.262680][ T1860] CPU: 1 PID: 1860 Comm: gtp-link Not tainted 5.5.0+ #321 [ 2171.263567][ T1860] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 2171.264681][ T1860] RIP: 0010:__alloc_pages_nodemask+0x2f3/0x740 [ 2171.265332][ T1860] Code: 64 fe ff ff 65 48 8b 04 25 c0 0f 02 00 48 05 f0 12 00 00 41 be 01 00 00 00 49 89 47 0 [ 2171.267301][ T1860] RSP: 0018:ffff8880b51af1f0 EFLAGS: 00010246 [ 2171.268320][ T1860] RAX: ffffed1016a35e43 RBX: 0000000000000000 RCX: 0000000000000000 [ 2171.269517][ T1860] RDX: 0000000000000000 RSI: 000000000000000b RDI: 0000000000000000 [ 2171.270305][ T1860] RBP: 0000000000040cc0 R08: ffffed1018893109 R09: dffffc0000000000 [ 2171.275973][ T1860] R10: 0000000000000001 R11: ffffed1018893108 R12: 1ffff11016a35e43 [ 2171.291039][ T1860] R13: 000000000000000b R14: 000000000000000b R15: 00000000000f4240 [ 2171.292328][ T1860] FS: 00007f53cbc83740(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000 [ 2171.293409][ T1860] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2171.294586][ T1860] CR2: 000055f540014508 CR3: 00000000b49f2004 CR4: 00000000000606e0 [ 2171.295424][ T1860] Call Trace: [ 2171.295756][ T1860] ? mark_held_locks+0xa5/0xe0 [ 2171.296659][ T1860] ? __alloc_pages_slowpath+0x21b0/0x21b0 [ 2171.298283][ T1860] ? gtp_encap_enable_socket+0x13e/0x400 [gtp] [ 2171.298962][ T1860] ? alloc_pages_current+0xc1/0x1a0 [ 2171.299475][ T1860] kmalloc_order+0x22/0x80 [ 2171.299936][ T1860] kmalloc_order_trace+0x1d/0x140 [ 2171.300437][ T1860] __kmalloc+0x302/0x3a0 [ 2171.300896][ T1860] gtp_newlink+0x293/0xba0 [gtp] [ ... ] Fixes: `459aa660eb` ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 12:38:50 +01:00
Ridge Kennedy	0d0d9a388a	l2tp: Allow duplicate session creation with UDP In the past it was possible to create multiple L2TPv3 sessions with the same session id as long as the sessions belonged to different tunnels. The resulting sessions had issues when used with IP encapsulated tunnels, but worked fine with UDP encapsulated ones. Some applications began to rely on this behaviour to avoid having to negotiate unique session ids. Some time ago a change was made to require session ids to be unique across all tunnels, breaking the applications making use of this "feature". This change relaxes the duplicate session id check to allow duplicates if both of the colliding sessions belong to UDP encapsulated tunnels. Fixes: `dbdbc73b44` ("l2tp: fix duplicate session creation") Signed-off-by: Ridge Kennedy <ridge.kennedy@alliedtelesis.co.nz> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 12:35:49 +01:00
Kai-Heng Feng	b4b771fd51	r8152: Add MAC passthrough support to new device Device 0xa387 also supports MAC passthrough, therefore add it to the whitelst. BugLink: https://bugs.launchpad.net/bugs/1827961/comments/30 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 11:58:10 +01:00
Cong Wang	599be01ee5	net_sched: fix an OOB access in cls_tcindex As Eric noticed, tcindex_alloc_perfect_hash() uses cp->hash to compute the size of memory allocation, but cp->hash is set again after the allocation, this caused an out-of-bound access. So we have to move all cp->hash initialization and computation before the memory allocation. Move cp->mask and cp->shift together as cp->hash may need them for computation too. Reported-and-tested-by: syzbot+35d4dea36c387813ed31@syzkaller.appspotmail.com Fixes: `331b72922c` ("net: sched: RCU cls_tcindex") Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 11:41:36 +01:00
YueHaibing	83b4304530	qed: Remove set but not used variable 'p_link' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/qlogic/qed/qed_cxt.c: In function 'qed_qm_init_pf': drivers/net/ethernet/qlogic/qed/qed_cxt.c:1401:29: warning: variable 'p_link' set but not used [-Wunused-but-set-variable] commit `92fae6fb23` ("qed: FW 8.42.2.0 Queue Manager changes") leave behind this unused variable. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 09:33:36 +01:00
David S. Miller	9afe2322cb	Merge branch 'unbreak-basic-and-bpf-tdc-testcases' Davide Caratti says: ==================== unbreak 'basic' and 'bpf' tdc testcases - patch 1/2 fixes tdc failures with 'bpf' action on fresch clones of the kernel tree - patch 2/2 allow running tdc for the 'basic' classifier without tweaking tdc_config.py ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 09:30:00 +01:00
Davide Caratti	e9ed4fa7b4	tc-testing: add missing 'nsPlugin' to basic.json since tdc tests for cls_basic need $DEV1, use 'nsPlugin' so that the following command can be run without errors: [root@f31 tc-testing]# ./tdc.py -c basic Fixes: `4717b05328` ("tc-testing: Introduced tdc tests for basic filter") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 09:30:00 +01:00
Davide Caratti	7145fcfffe	tc-testing: fix eBPF tests failure on linux fresh clones when the following command is done on a fresh clone of the kernel tree, [root@f31 tc-testing]# ./tdc.py -c bpf test cases that need to build the eBPF sample program fail systematically, because 'buildebpfPlugin' is unable to install the kernel headers (i.e, the 'khdr' target fails). Pass the correct environment to 'make', in place of ENVIR, to allow running these tests. Fixes: `4c2d39bd40` ("tc-testing: use a plugin to build eBPF program") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 09:30:00 +01:00
Eric Dumazet	2b5b8251bc	net: hsr: fix possible NULL deref in hsr_handle_frame() hsr_port_get_rcu() can return NULL, so we need to be careful. general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] CPU: 1 PID: 10249 Comm: syz-executor.5 Not tainted 5.5.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline] RIP: 0010:hsr_addr_is_self+0x86/0x330 net/hsr/hsr_framereg.c:44 Code: 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 6b ff 94 f9 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 75 02 00 00 48 8b 43 30 49 39 c6 49 89 47 c0 0f RSP: 0018:ffffc90000da8a90 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff87e0cc33 RDX: 0000000000000006 RSI: ffffffff87e035d5 RDI: 0000000000000000 RBP: ffffc90000da8b20 R08: ffff88808e7de040 R09: ffffed1015d2707c R10: ffffed1015d2707b R11: ffff8880ae9383db R12: ffff8880a689bc5e R13: 1ffff920001b5153 R14: 0000000000000030 R15: ffffc90000da8af8 FS: 00007fd7a42be700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32338000 CR3: 00000000a928c000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> hsr_handle_frame+0x1c5/0x630 net/hsr/hsr_slave.c:31 __netif_receive_skb_core+0xfbc/0x30b0 net/core/dev.c:5099 __netif_receive_skb_one_core+0xa8/0x1a0 net/core/dev.c:5196 __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5312 process_backlog+0x206/0x750 net/core/dev.c:6144 napi_poll net/core/dev.c:6582 [inline] net_rx_action+0x508/0x1120 net/core/dev.c:6650 __do_softirq+0x262/0x98c kernel/softirq.c:292 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082 </IRQ> Fixes: `c5a7591172` ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-04 09:27:07 +01:00
Jakub Kicinski	a444ad1432	Merge branch 'netdevsim-fix-several-bugs-in-netdevsim-module' Taehee Yoo says: ===================== netdevsim: fix several bugs in netdevsim module This patchset fixes several bugs in netdevsim module. 1. The first patch fixes using uninitialized resources This patch fixes two similar problems, which is to use uninitialized resources. a) In the current code, {new/del}_device_store() use resource, they are initialized by __init(). But, these functions could be called before __init() is finished. So, accessing uninitialized data could occur and it eventually makes panic. b) In the current code, {new/del}_port_store() uses resource, they are initialized by new_device_store(). But thes functions could be called before new_device_store() is finished. 2. The second patch fixes another race condition. The main problem is a race condition in {new/del}_port() and devlink reload function. These functions would allocate and remove resources. So these functions should not be executed concurrently. 3. The third patch fixes a panic in nsim_dev_take_snapshot_write(). nsim_dev_take_snapshot_write() uses nsim_dev and nsim_dev->dummy_region. But these data could be removed by both reload routine and del_device_store(). And these functions could be executed concurrently. 4. The fourth patch fixes stack-out-of-bound in nsim_dev_debugfs_init(). nsim_dev_debugfs_init() provides only 16bytes for name pointer. But, there are some case the name length is over 16bytes. So, stack-out-of-bound occurs. 5. The fifth patch uses IS_ERR instead of IS_ERR_OR_NULL. debugfs_create_{dir/file} doesn't return NULL. So, IS_ERR() is more correct. 6. The sixth patch avoids kmalloc warning. When too large memory allocation is requested by user-space, kmalloc internally prints warning message. That warning message is not necessary. In order to avoid that, it adds __GFP_NOWARN. 7. The last patch removes an unused sdev.c file Change log: v2 -> v3: - Use smp_load_acquire() and smp_store_release() for flag variables. - Change variable names. - Fix deadlock in second patch. - Update lock variable comment. - Add new patch for fixing panic in snapshot_write(). - Include Reviewed-by tags. - Update some log messages and comment. v1 -> v2: - Splits a fixing race condition patch into two patches. - Fix incorrect Fixes tags. - Update comments - Fix use-after-free - Add a new patch, which removes an unused sdev.c file. - Remove a patch, which tries to avoid debugfs warning. ===================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:38:50 -08:00
Taehee Yoo	245311637f	netdevsim: remove unused sdev code sdev.c code is merged into dev.c and is not used anymore. it would be removed. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:32:20 -08:00
Taehee Yoo	83cf4213ba	netdevsim: use __GFP_NOWARN to avoid memalloc warning vfnum buffer size and binary_len buffer size is received by user-space. So, this buffer size could be too large. If so, kmalloc will internally print a warning message. This warning message is actually not necessary for the netdevsim module. So, this patch adds __GFP_NOWARN. Test commands: modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device echo 1000000000 > /sys/devices/netdevsim1/sriov_numvfs Splat looks like: [ 357.847266][ T1000] WARNING: CPU: 0 PID: 1000 at mm/page_alloc.c:4738 __alloc_pages_nodemask+0x2f3/0x740 [ 357.850273][ T1000] Modules linked in: netdevsim veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrx [ 357.852989][ T1000] CPU: 0 PID: 1000 Comm: bash Tainted: G B 5.5.0-rc5+ #270 [ 357.854334][ T1000] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 357.855703][ T1000] RIP: 0010:__alloc_pages_nodemask+0x2f3/0x740 [ 357.856669][ T1000] Code: 64 fe ff ff 65 48 8b 04 25 c0 0f 02 00 48 05 f0 12 00 00 41 be 01 00 00 00 49 89 47 0 [ 357.860272][ T1000] RSP: 0018:ffff8880b7f47bd8 EFLAGS: 00010246 [ 357.861009][ T1000] RAX: ffffed1016fe8f80 RBX: 1ffff11016fe8fae RCX: 0000000000000000 [ 357.861843][ T1000] RDX: 0000000000000000 RSI: 0000000000000017 RDI: 0000000000000000 [ 357.862661][ T1000] RBP: 0000000000040dc0 R08: 1ffff11016fe8f67 R09: dffffc0000000000 [ 357.863509][ T1000] R10: ffff8880b7f47d68 R11: fffffbfff2798180 R12: 1ffff11016fe8f80 [ 357.864355][ T1000] R13: 0000000000000017 R14: 0000000000000017 R15: ffff8880c2038d68 [ 357.865178][ T1000] FS: 00007fd9a5b8c740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000 [ 357.866248][ T1000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 357.867531][ T1000] CR2: 000055ce01ba8100 CR3: 00000000b7dbe005 CR4: 00000000000606f0 [ 357.868972][ T1000] Call Trace: [ 357.869423][ T1000] ? lock_contended+0xcd0/0xcd0 [ 357.870001][ T1000] ? __alloc_pages_slowpath+0x21d0/0x21d0 [ 357.870673][ T1000] ? _kstrtoull+0x76/0x160 [ 357.871148][ T1000] ? alloc_pages_current+0xc1/0x1a0 [ 357.871704][ T1000] kmalloc_order+0x22/0x80 [ 357.872184][ T1000] kmalloc_order_trace+0x1d/0x140 [ 357.872733][ T1000] __kmalloc+0x302/0x3a0 [ 357.873204][ T1000] nsim_bus_dev_numvfs_store+0x1ab/0x260 [netdevsim] [ 357.873919][ T1000] ? kernfs_get_active+0x12c/0x180 [ 357.874459][ T1000] ? new_device_store+0x450/0x450 [netdevsim] [ 357.875111][ T1000] ? kernfs_get_parent+0x70/0x70 [ 357.875632][ T1000] ? sysfs_file_ops+0x160/0x160 [ 357.876152][ T1000] kernfs_fop_write+0x276/0x410 [ 357.876680][ T1000] ? __sb_start_write+0x1ba/0x2e0 [ 357.877225][ T1000] vfs_write+0x197/0x4a0 [ 357.877671][ T1000] ksys_write+0x141/0x1d0 [ ... ] Reviewed-by: Jakub Kicinski <kuba@kernel.org> Fixes: `7957922056` ("netdevsim: add SR-IOV functionality") Fixes: `82c93a87bf` ("netdevsim: implement couple of testing devlink health reporters") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:32:20 -08:00
Taehee Yoo	6556ff32f1	netdevsim: use IS_ERR instead of IS_ERR_OR_NULL for debugfs Debugfs APIs return valid pointer or error pointer. it doesn't return NULL. So, using IS_ERR is enough, not using IS_ERR_OR_NULL. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:32:20 -08:00
Taehee Yoo	6fb8852b12	netdevsim: fix stack-out-of-bounds in nsim_dev_debugfs_init() When netdevsim dev is being created, a debugfs directory is created. The variable "dev_ddir_name" is 16bytes device name pointer and device name is "netdevsim<dev id>". The maximum dev id length is 10. So, 16bytes for device name isn't enough. Test commands: modprobe netdevsim echo "1000000000 0" > /sys/bus/netdevsim/new_device Splat looks like: [ 249.622710][ T900] BUG: KASAN: stack-out-of-bounds in number+0x824/0x880 [ 249.623658][ T900] Write of size 1 at addr ffff88804c527988 by task bash/900 [ 249.624521][ T900] [ 249.624830][ T900] CPU: 1 PID: 900 Comm: bash Not tainted 5.5.0+ #322 [ 249.625691][ T900] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 249.626712][ T900] Call Trace: [ 249.627103][ T900] dump_stack+0x96/0xdb [ 249.627639][ T900] ? number+0x824/0x880 [ 249.628173][ T900] print_address_description.constprop.5+0x1be/0x360 [ 249.629022][ T900] ? number+0x824/0x880 [ 249.629569][ T900] ? number+0x824/0x880 [ 249.630105][ T900] __kasan_report+0x12a/0x170 [ 249.630717][ T900] ? number+0x824/0x880 [ 249.631201][ T900] kasan_report+0xe/0x20 [ 249.631723][ T900] number+0x824/0x880 [ 249.632235][ T900] ? put_dec+0xa0/0xa0 [ 249.632716][ T900] ? rcu_read_lock_sched_held+0x90/0xc0 [ 249.633392][ T900] vsnprintf+0x63c/0x10b0 [ 249.633983][ T900] ? pointer+0x5b0/0x5b0 [ 249.634543][ T900] ? mark_lock+0x11d/0xc40 [ 249.635200][ T900] sprintf+0x9b/0xd0 [ 249.635750][ T900] ? scnprintf+0xe0/0xe0 [ 249.636370][ T900] nsim_dev_probe+0x63c/0xbf0 [netdevsim] [ ... ] Reviewed-by: Jakub Kicinski <kuba@kernel.org> Fixes: `ab1d0cc004` ("netdevsim: change debugfs tree topology") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:32:20 -08:00
Taehee Yoo	8526ad9646	netdevsim: fix panic in nsim_dev_take_snapshot_write() nsim_dev_take_snapshot_write() uses nsim_dev and nsim_dev->dummy_region. So, during this function, these data shouldn't be removed. But there is no protecting stuff in this function. There are two similar cases. 1. reload case reload could be called during nsim_dev_take_snapshot_write(). When reload is being executed, nsim_dev_reload_down() is called and it calls nsim_dev_reload_destroy(). nsim_dev_reload_destroy() calls devlink_region_destroy() to destroy nsim_dev->dummy_region. So, during nsim_dev_take_snapshot_write(), nsim_dev->dummy_region() would be removed. At this point, snapshot_write() would access freed pointer. In order to fix this case, take_snapshot file will be removed before devlink_region_destroy(). The take_snapshot file will be re-created by ->reload_up(). 2. del_device_store case del_device_store() also could call nsim_dev_reload_destroy() during nsim_dev_take_snapshot_write(). If so, panic would occur. This problem is actually the same problem with the first case. So, this problem will be fixed by the first case's solution. Test commands: modprobe netdevsim while : do echo 1 > /sys/bus/netdevsim/new_device & echo 1 > /sys/bus/netdevsim/del_device & devlink dev reload netdevsim/netdevsim1 & echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/take_snapshot & done Splat looks like: [ 45.564513][ T975] general protection fault, probably for non-canonical address 0xdffffc000000003a: 0000 [#1] SMP DEI [ 45.566131][ T975] KASAN: null-ptr-deref in range [0x00000000000001d0-0x00000000000001d7] [ 45.566135][ T975] CPU: 1 PID: 975 Comm: bash Not tainted 5.5.0+ #322 [ 45.569020][ T975] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 45.569026][ T975] RIP: 0010:__mutex_lock+0x10a/0x14b0 [ 45.570518][ T975] Code: 08 84 d2 0f 85 7f 12 00 00 44 8b 0d 10 23 65 02 45 85 c9 75 29 49 8d 7f 68 48 b8 00 00 00 0f [ 45.570522][ T975] RSP: 0018:ffff888046ccfbf0 EFLAGS: 00010206 [ 45.572305][ T975] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 45.572308][ T975] RDX: 000000000000003a RSI: ffffffffac926440 RDI: 00000000000001d0 [ 45.576843][ T975] RBP: ffff888046ccfd70 R08: ffffffffab610645 R09: 0000000000000000 [ 45.576847][ T975] R10: ffff888046ccfd90 R11: ffffed100d6360ad R12: 0000000000000000 [ 45.578471][ T975] R13: dffffc0000000000 R14: ffffffffae1976c0 R15: 0000000000000168 [ 45.578475][ T975] FS: 00007f614d6e7740(0000) GS:ffff88806c400000(0000) knlGS:0000000000000000 [ 45.581492][ T975] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.582942][ T975] CR2: 00005618677d1cf0 CR3: 000000005fb9c002 CR4: 00000000000606e0 [ 45.584543][ T975] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.586633][ T975] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.589889][ T975] Call Trace: [ 45.591445][ T975] ? devlink_region_snapshot_create+0x55/0x4a0 [ 45.601250][ T975] ? mutex_lock_io_nested+0x1380/0x1380 [ 45.602817][ T975] ? mutex_lock_io_nested+0x1380/0x1380 [ 45.603875][ T975] ? mark_held_locks+0xa5/0xe0 [ 45.604769][ T975] ? _raw_spin_unlock_irqrestore+0x2d/0x50 [ 45.606147][ T975] ? __mutex_unlock_slowpath+0xd0/0x670 [ 45.607723][ T975] ? crng_backtrack_protect+0x80/0x80 [ 45.613530][ T975] ? wait_for_completion+0x390/0x390 [ 45.615152][ T975] ? devlink_region_snapshot_create+0x55/0x4a0 [ 45.616834][ T975] devlink_region_snapshot_create+0x55/0x4a0 [ ... ] Fixes: `4418f862d6` ("netdevsim: implement support for devlink region and snapshots") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:32:20 -08:00
Taehee Yoo	6ab63366e1	netdevsim: disable devlink reload when resources are being used devlink reload destroys resources and allocates resources again. So, when devices and ports resources are being used, devlink reload function should not be executed. In order to avoid this race, a new lock is added and new_port() and del_port() call devlink_reload_disable() and devlink_reload_enable(). Thread0 Thread1 {new/del}_port() {new/del}_port() devlink_reload_disable() devlink_reload_disable() devlink_reload_enable() //here devlink_reload_enable() Before Thread1's devlink_reload_enable(), the devlink is already allowed to execute reload because Thread0 allows it. devlink reload disable/enable variable type is bool. So the above case would exist. So, disable/enable should be executed atomically. In order to do that, a new lock is used. Test commands: modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device while : do echo 1 > /sys/devices/netdevsim1/new_port & echo 1 > /sys/devices/netdevsim1/del_port & devlink dev reload netdevsim/netdevsim1 & done Splat looks like: [ 23.342145][ T932] DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) [ 23.342159][ T932] WARNING: CPU: 0 PID: 932 at kernel/locking/mutex-debug.c:103 mutex_destroy+0xc7/0xf0 [ 23.344182][ T932] Modules linked in: netdevsim openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_dx [ 23.346485][ T932] CPU: 0 PID: 932 Comm: devlink Not tainted 5.5.0+ #322 [ 23.347696][ T932] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 23.348893][ T932] RIP: 0010:mutex_destroy+0xc7/0xf0 [ 23.349505][ T932] Code: e0 07 83 c0 03 38 d0 7c 04 84 d2 75 2e 8b 05 00 ac b0 02 85 c0 75 8b 48 c7 c6 00 5e 07 96 40 [ 23.351887][ T932] RSP: 0018:ffff88806208f810 EFLAGS: 00010286 [ 23.353963][ T932] RAX: dffffc0000000008 RBX: ffff888067f6f2c0 RCX: ffffffff942c4bd4 [ 23.355222][ T932] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff96dac5b4 [ 23.356169][ T932] RBP: ffff888067f6f000 R08: fffffbfff2d235a5 R09: fffffbfff2d235a5 [ 23.357160][ T932] R10: 0000000000000001 R11: fffffbfff2d235a4 R12: ffff888067f6f208 [ 23.358288][ T932] R13: ffff88806208fa70 R14: ffff888067f6f000 R15: ffff888069ce3800 [ 23.359307][ T932] FS: 00007fe2a3876740(0000) GS:ffff88806c000000(0000) knlGS:0000000000000000 [ 23.360473][ T932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 23.361319][ T932] CR2: 00005561357aa000 CR3: 000000005227a006 CR4: 00000000000606f0 [ 23.362323][ T932] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 23.363417][ T932] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 23.364414][ T932] Call Trace: [ 23.364828][ T932] nsim_dev_reload_destroy+0x77/0xb0 [netdevsim] [ 23.365655][ T932] nsim_dev_reload_down+0x84/0xb0 [netdevsim] [ 23.366433][ T932] devlink_reload+0xb1/0x350 [ 23.367010][ T932] genl_rcv_msg+0x580/0xe90 [ ...] [ 23.531729][ T1305] kernel BUG at lib/list_debug.c:53! [ 23.532523][ T1305] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 23.533467][ T1305] CPU: 2 PID: 1305 Comm: bash Tainted: G W 5.5.0+ #322 [ 23.534962][ T1305] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 23.536503][ T1305] RIP: 0010:__list_del_entry_valid+0xe6/0x150 [ 23.538346][ T1305] Code: 89 ea 48 c7 c7 00 73 1e 96 e8 df f7 4c ff 0f 0b 48 c7 c7 60 73 1e 96 e8 d1 f7 4c ff 0f 0b 44 [ 23.541068][ T1305] RSP: 0018:ffff888047c27b58 EFLAGS: 00010282 [ 23.542001][ T1305] RAX: 0000000000000054 RBX: ffff888067f6f318 RCX: 0000000000000000 [ 23.543051][ T1305] RDX: 0000000000000054 RSI: 0000000000000008 RDI: ffffed1008f84f61 [ 23.544072][ T1305] RBP: ffff88804aa0fca0 R08: ffffed100d940539 R09: ffffed100d940539 [ 23.545085][ T1305] R10: 0000000000000001 R11: ffffed100d940538 R12: ffff888047c27cb0 [ 23.546422][ T1305] R13: ffff88806208b840 R14: ffffffff981976c0 R15: ffff888067f6f2c0 [ 23.547406][ T1305] FS: 00007f76c0431740(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 23.548527][ T1305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 23.549389][ T1305] CR2: 00007f5048f1a2f8 CR3: 000000004b310006 CR4: 00000000000606e0 [ 23.550636][ T1305] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 23.551578][ T1305] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 23.552597][ T1305] Call Trace: [ 23.553004][ T1305] mutex_remove_waiter+0x101/0x520 [ 23.553646][ T1305] __mutex_lock+0xac7/0x14b0 [ 23.554218][ T1305] ? nsim_dev_port_del+0x4e/0x140 [netdevsim] [ 23.554908][ T1305] ? mutex_lock_io_nested+0x1380/0x1380 [ 23.555570][ T1305] ? _parse_integer+0xf0/0xf0 [ 23.556043][ T1305] ? kstrtouint+0x86/0x110 [ 23.556504][ T1305] ? nsim_dev_port_del+0x4e/0x140 [netdevsim] [ 23.557133][ T1305] nsim_dev_port_del+0x4e/0x140 [netdevsim] [ 23.558024][ T1305] del_port_store+0xcc/0xf0 [netdevsim] [ ... ] Fixes: `75ba029f3c` ("netdevsim: implement proper devlink reload") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:32:20 -08:00
Taehee Yoo	f5cd21605e	netdevsim: fix using uninitialized resources When module is being initialized, __init() calls bus_register() and driver_register(). These functions internally create various resources and sysfs files. The sysfs files are used for basic operations(add/del device). /sys/bus/netdevsim/new_device /sys/bus/netdevsim/del_device These sysfs files use netdevsim resources, they are mostly allocated and initialized in ->probe() function, which is nsim_dev_probe(). But, sysfs files could be executed before ->probe() is finished. So, accessing uninitialized data would occur. Another problem is very similar. /sys/bus/netdevsim/new_device internally creates sysfs files. /sys/devices/netdevsim<id>/new_port /sys/devices/netdevsim<id>/del_port These sysfs files also use netdevsim resources, they are mostly allocated and initialized in creating device routine, which is nsim_bus_dev_new(). But they also could be executed before nsim_bus_dev_new() is finished. So, accessing uninitialized data would occur. To fix these problems, this patch adds flags, which means whether the operation is finished or not. The flag variable 'nsim_bus_enable' means whether netdevsim bus was initialized or not. This is protected by nsim_bus_dev_list_lock. The flag variable 'nsim_bus_dev->init' means whether nsim_bus_dev was initialized or not. This could be used in {new/del}_port_store() with no lock. Test commands: #SHELL1 modprobe netdevsim while : do echo "1 1" > /sys/bus/netdevsim/new_device echo "1 1" > /sys/bus/netdevsim/del_device done #SHELL2 while : do echo 1 > /sys/devices/netdevsim1/new_port echo 1 > /sys/devices/netdevsim1/del_port done Splat looks like: [ 47.508954][ T1008] general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 I [ 47.510793][ T1008] KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f] [ 47.511963][ T1008] CPU: 2 PID: 1008 Comm: bash Not tainted 5.5.0+ #322 [ 47.512823][ T1008] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 47.514041][ T1008] RIP: 0010:__mutex_lock+0x10a/0x14b0 [ 47.514699][ T1008] Code: 08 84 d2 0f 85 7f 12 00 00 44 8b 0d 10 23 65 02 45 85 c9 75 29 49 8d 7f 68 48 b8 00 00 00 0f [ 47.517163][ T1008] RSP: 0018:ffff888059b4fbb0 EFLAGS: 00010206 [ 47.517802][ T1008] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 47.518941][ T1008] RDX: 0000000000000021 RSI: ffffffff85926440 RDI: 0000000000000108 [ 47.519732][ T1008] RBP: ffff888059b4fd30 R08: ffffffffc073fad0 R09: 0000000000000000 [ 47.520729][ T1008] R10: ffff888059b4fd50 R11: ffff88804bb38040 R12: 0000000000000000 [ 47.521702][ T1008] R13: dffffc0000000000 R14: ffffffff871976c0 R15: 00000000000000a0 [ 47.522760][ T1008] FS: 00007fd4be05a740(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 47.523877][ T1008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.524627][ T1008] CR2: 0000561c82b69cf0 CR3: 0000000065dd6004 CR4: 00000000000606e0 [ 47.527662][ T1008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.528604][ T1008] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.529531][ T1008] Call Trace: [ 47.529874][ T1008] ? nsim_dev_port_add+0x50/0x150 [netdevsim] [ 47.530470][ T1008] ? mutex_lock_io_nested+0x1380/0x1380 [ 47.531018][ T1008] ? _kstrtoull+0x76/0x160 [ 47.531449][ T1008] ? _parse_integer+0xf0/0xf0 [ 47.531874][ T1008] ? kernfs_fop_write+0x1cf/0x410 [ 47.532330][ T1008] ? sysfs_file_ops+0x160/0x160 [ 47.532773][ T1008] ? kstrtouint+0x86/0x110 [ 47.533168][ T1008] ? nsim_dev_port_add+0x50/0x150 [netdevsim] [ 47.533721][ T1008] nsim_dev_port_add+0x50/0x150 [netdevsim] [ 47.534336][ T1008] ? sysfs_file_ops+0x160/0x160 [ 47.534858][ T1008] new_port_store+0x99/0xb0 [netdevsim] [ 47.535439][ T1008] ? del_port_store+0xb0/0xb0 [netdevsim] [ 47.536035][ T1008] ? sysfs_file_ops+0x112/0x160 [ 47.536544][ T1008] ? sysfs_kf_write+0x3b/0x180 [ 47.537029][ T1008] kernfs_fop_write+0x276/0x410 [ 47.537548][ T1008] ? __sb_start_write+0x215/0x2e0 [ 47.538110][ T1008] vfs_write+0x197/0x4a0 [ ... ] Fixes: `f9d9db47d3` ("netdevsim: add bus attributes to add new and delete devices") Fixes: `794b2c05ca` ("netdevsim: extend device attrs to support port addition and deletion") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:32:20 -08:00
Jakub Kicinski	2b5ea2947f	Merge branch 'bnxt_en-Bug-fixes' Michael Chan says: ===================== bnxt_en: Bug fixes 3 patches that fix some issues in the firmware reset logic, starting with a small patch to refactor the code that re-enables SRIOV. The last patch fixes a TC queue mapping issue. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:07:26 -08:00
Michael Chan	18e4960c18	bnxt_en: Fix TC queue mapping. The driver currently only calls netdev_set_tc_queue when the number of TCs is greater than 1. Instead, the comparison should be greater than or equal to 1. Even with 1 TC, we need to set the queue mapping. This bug can cause warnings when the number of TCs is changed back to 1. Fixes: `7809592d3e` ("bnxt_en: Enable MSIX early in bnxt_init_one().") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:06:45 -08:00
Vasundhara Volam	d407302895	bnxt_en: Fix logic that disables Bus Master during firmware reset. The current logic that calls pci_disable_device() in __bnxt_close_nic() during firmware reset is flawed. If firmware is still alive, we're disabling the device too early, causing some firmware commands to not reach the firmware. Fix it by moving the logic to bnxt_reset_close(). If firmware is in fatal condition, we call pci_disable_device() before we free any of the rings to prevent DMA corruption of the freed rings. If firmware is still alive, we call pci_disable_device() after the last firmware message has been sent. Fixes: `3bc7d4a352` ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:06:45 -08:00
Michael Chan	12de2eadf8	bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset. bnxt_ulp_start() needs to be called before SRIOV is re-enabled after firmware reset. Re-enabling SRIOV may consume all the resources and may cause the RDMA driver to fail to get MSIX and other resources. Fix it by calling bnxt_ulp_start() first before calling bnxt_reenable_sriov(). We re-arrange the logic so that we call bnxt_ulp_start() and bnxt_reenable_sriov() in proper sequence in bnxt_fw_reset_task() and bnxt_open(). The former is the normal coordinated firmware reset sequence and the latter is firmware reset while the function is down. This new logic is now more straight forward and will now fix both scenarios. Fixes: `f3a6d206c2` ("bnxt_en: Call bnxt_ulp_stop()/bnxt_ulp_start() during error recovery.") Reported-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:06:45 -08:00
Michael Chan	c16d4ee0e3	bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected. Put the current logic in bnxt_open() to re-enable SRIOV after detecting firmware reset into a new function bnxt_reenable_sriov(). This call needs to be invoked in the firmware reset path also in the next patch. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:06:45 -08:00
Nicolin Chen	14b41a2959	net: stmmac: Delete txtimer in suspend() When running v5.5 with a rootfs on NFS, memory abort may happen in the system resume stage: Unable to handle kernel paging request at virtual address dead00000000012a [dead00000000012a] address between user and kernel address ranges pc : run_timer_softirq+0x334/0x3d8 lr : run_timer_softirq+0x244/0x3d8 x1 : ffff800011cafe80 x0 : dead000000000122 Call trace: run_timer_softirq+0x334/0x3d8 efi_header_end+0x114/0x234 irq_exit+0xd0/0xd8 __handle_domain_irq+0x60/0xb0 gic_handle_irq+0x58/0xa8 el1_irq+0xb8/0x180 arch_cpu_idle+0x10/0x18 do_idle+0x1d8/0x2b0 cpu_startup_entry+0x24/0x40 secondary_start_kernel+0x1b4/0x208 Code: f9000693 a9400660 f9000020 b4000040 (f9000401) ---[ end trace bb83ceeb4c482071 ]--- Kernel panic - not syncing: Fatal exception in interrupt SMP: stopping secondary CPUs SMP: failed to stop secondary CPUs 2-3 Kernel Offset: disabled CPU features: 0x00002,2300aa30 Memory Limit: none ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- It's found that stmmac_xmit() and stmmac_resume() sometimes might run concurrently, possibly resulting in a race condition between mod_timer() and setup_timer(), being called by stmmac_xmit() and stmmac_resume() respectively. Since the resume() runs setup_timer() every time, it'd be safer to have del_timer_sync() in the suspend() as the counterpart. Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 15:01:22 -08:00
Jakub Kicinski	3d80c653f9	RxRPC fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl439W0ACgkQ+7dXa6fL C2uT5RAAl+rytJS768Aa6c/S9TPBaNOKzFQmgNaWzAtgC5BAtfZf1Bl0okFOba2V QtcFmTogRl7BEnf9S3w3z25qj7Nv5udjIaAbH7cO6j1vuIwRpvtEZ7WXlE3L4hxA DscSysqI9L5ISsnzfloUvA/biA9azsH2ckgMxFo++YmLHvagXW0lmkE4yZw+aZ/T DstFAMAnFQwEAyu+Et3bo32/382aJh+gBGgV/2wuHBPgzQMhiWdosKdNnEZtZ3V2 HvgVuAU2V4hOf8OuaPuBnZrJPondTq5e1W+5mQiYLYzTCTOs6rZ1gUGfEXth3d1H ABJ0FoyPgN783msb0yL6OOLMWiTc9USiLB8a3U2vJOD+hQTkPSYSGsCObeIGVW/Y dbBd5fwudF78LIP9TjYWbau3vHtV73N4Mc1UT219jew0+Hi1ik5VIj8q0wdBIkzm vKcVznXnPReOUBVKmqA2scnXs8EA4w6bTCjyd3JYUBK2qLchz366s+0oYyIK8Nq0 dy8TVISaCSfohI4nAYAv90AUIDa+zdHKuttBsAD90yVM/wrnokqc/RonfWjHs32k SerM0RclPidool3Hkc/w0uvPywDQg/S5sKAW6E6lpfTppOwIY0wq/yZc5N2d9FZV UrPZVGhQszC7AM/CBKXDT0ZMrPnAfGO8PMPLyFrAcv2p9O7XuWA= =kHIG -----END PGP SIGNATURE----- Merge tag 'rxrpc-fixes-20200203' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== RxRPC fixes Here are a number of fixes for AF_RXRPC: (1) Fix a potential use after free in rxrpc_put_local() where it was accessing the object just put to get tracing information. (2) Fix insufficient notifications being generated by the function that queues data packets on a call. This occasionally causes recvmsg() to stall indefinitely. (3) Fix a number of packet-transmitting work functions to hold an active count on the local endpoint so that the UDP socket doesn't get destroyed whilst they're calling kernel_sendmsg() on it. (4) Fix a NULL pointer deref that stemmed from a call's connection pointer being cleared when the call was disconnected. Changes: v2: Removed a couple of BUG() statements that got added. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-03 10:26:23 -08:00
David Howells	5273a191dc	rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect When a call is disconnected, the connection pointer from the call is cleared to make sure it isn't used again and to prevent further attempted transmission for the call. Unfortunately, there might be a daemon trying to use it at the same time to transmit a packet. Fix this by keeping call->conn set, but setting a flag on the call to indicate disconnection instead. Remove also the bits in the transmission functions where the conn pointer is checked and a ref taken under spinlock as this is now redundant. Fixes: `8d94aa381d` ("rxrpc: Calls shouldn't hold socket refs") Signed-off-by: David Howells <dhowells@redhat.com>	2020-02-03 10:25:30 +00:00
Jakub Kicinski	83d0585f91	Merge branch 'Fix-reconnection-latency-caused-by-FIN-ACK-handling-race' SeongJae Park says: ==================== Fix reconnection latency caused by FIN/ACK handling race The first patch fixes the problem by adjusting the first resend delay of the SYN in the case. The second one adds a user space test to reproduce this problem. From v2 (https://lore.kernel.org/linux-kselftest/20200201071859.4231-1-sj38.park@gmail.com/) - Use TCP_TIMEOUT_MIN as reduced delay (Neal Cardwall) - Add Reviewed-by and Signed-off-by from Eric Dumazet From v1 (https://lore.kernel.org/linux-kselftest/20200131122421.23286-1-sjpark@amazon.com/) - Drop the trivial comment fix patch (Eric Dumazet) - Limit the delay adjustment to only the first SYN resend (Eric Dumazet) - selftest: Avoid use of hard-coded port number (Eric Dumazet) - Explain RST/ACK and FIN/ACK has no big difference (Neal Cardwell) ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-02 13:45:05 -08:00
SeongJae Park	af8c8a450b	selftests: net: Add FIN_ACK processing order related latency spike test This commit adds a test for FIN_ACK process races related reconnection latency spike issues. The issue has described and solved by the previous commit ("tcp: Reduce SYN resend delay if a suspicous ACK is received"). The test program is configured with a server and a client process. The server creates and binds a socket to a port that dynamically allocated, listen on it, and start a infinite loop. Inside the loop, it accepts connection, reads 4 bytes from the socket, and closes the connection. The client is constructed as an infinite loop. Inside the loop, it creates a socket with LINGER and NODELAY option, connect to the server, send 4 bytes data, try read some data from server. After the read() returns, it measure the latency from the beginning of this loop to this point and if the latency is larger than 1 second (spike), print a message. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-02 13:33:21 -08:00
SeongJae Park	9603d47bad	tcp: Reduce SYN resend delay if a suspicous ACK is received When closing a connection, the two acks that required to change closing socket's status to FIN_WAIT_2 and then TIME_WAIT could be processed in reverse order. This is possible in RSS disabled environments such as a connection inside a host. For example, expected state transitions and required packets for the disconnection will be similar to below flow. 00 (Process A) (Process B) 01 ESTABLISHED ESTABLISHED 02 close() 03 FIN_WAIT_1 04 ---FIN--> 05 CLOSE_WAIT 06 <--ACK--- 07 FIN_WAIT_2 08 <--FIN/ACK--- 09 TIME_WAIT 10 ---ACK--> 11 LAST_ACK 12 CLOSED CLOSED In some cases such as LINGER option applied socket, the FIN and FIN/ACK will be substituted to RST and RST/ACK, but there is no difference in the main logic. The acks in lines 6 and 8 are the acks. If the line 8 packet is processed before the line 6 packet, it will be just ignored as it is not a expected packet, and the later process of the line 6 packet will change the status of Process A to FIN_WAIT_2, but as it has already handled line 8 packet, it will not go to TIME_WAIT and thus will not send the line 10 packet to Process B. Thus, Process B will left in CLOSE_WAIT status, as below. 00 (Process A) (Process B) 01 ESTABLISHED ESTABLISHED 02 close() 03 FIN_WAIT_1 04 ---FIN--> 05 CLOSE_WAIT 06 (<--ACK---) 07 (<--FIN/ACK---) 08 (fired in right order) 09 <--FIN/ACK--- 10 <--ACK--- 11 (processed in reverse order) 12 FIN_WAIT_2 Later, if the Process B sends SYN to Process A for reconnection using the same port, Process A will responds with an ACK for the last flow, which has no increased sequence number. Thus, Process A will send RST, wait for TIMEOUT_INIT (one second in default), and then try reconnection. If reconnections are frequent, the one second latency spikes can be a big problem. Below is a tcpdump results of the problem: 14.436259 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644 14.436266 IP 127.0.0.1.4242 > 127.0.0.1.45150: Flags [.], ack 5, win 512 14.436271 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [R], seq 2541101298 /* ONE SECOND DELAY */ 15.464613 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644 This commit mitigates the problem by reducing the delay for the next SYN if the suspicous ACK is received while in SYN_SENT state. Following commit will add a selftest, which can be also helpful for understanding of this issue. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-02 13:33:21 -08:00
Lukas Bulwahn	dff6bc1bfd	MAINTAINERS: correct entries for ISDN/mISDN section Commit `6d97985072` ("isdn: move capi drivers to staging") cleaned up the isdn drivers and split the MAINTAINERS section for ISDN, but missed to add the terminal slash for the two directories mISDN and hardware. Hence, all files in those directories were not part of the new ISDN/mISDN SUBSYSTEM, but were considered to be part of "THE REST". Rectify the situation, and while at it, also complete the section with two further build files that belong to that subsystem. This was identified with a small script that finds all files belonging to "THE REST" according to the current MAINTAINERS file, and I investigated upon its output. Fixes: `6d97985072` ("isdn: move capi drivers to staging") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-02 12:40:08 -08:00
Jakub Kicinski	b7c3a17c60	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix suspicious RCU usage in ipset, from Jozsef Kadlecsik. 2) Use kvcalloc, from Joe Perches. 3) Flush flowtable hardware workqueue after garbage collection run, from Paul Blakey. 4) Missing flowtable hardware workqueue flush from nf_flow_table_free(), also from Paul. 5) Restore NF_FLOW_HW_DEAD in flow_offload_work_del(), from Paul. 6) Flowtable documentation fixes, from Matteo Croce. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-01 12:38:20 -08:00
Eric Dumazet	cb3c0e6bdf	cls_rsvp: fix rsvp_policy NLA_BINARY can be confusing, since .len value represents the max size of the blob. cls_rsvp really wants user space to provide long enough data for TCA_RSVP_DST and TCA_RSVP_SRC attributes. BUG: KMSAN: uninit-value in rsvp_get net/sched/cls_rsvp.h:258 [inline] BUG: KMSAN: uninit-value in gen_handle net/sched/cls_rsvp.h:402 [inline] BUG: KMSAN: uninit-value in rsvp_change+0x1ae9/0x4220 net/sched/cls_rsvp.h:572 CPU: 1 PID: 13228 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 rsvp_get net/sched/cls_rsvp.h:258 [inline] gen_handle net/sched/cls_rsvp.h:402 [inline] rsvp_change+0x1ae9/0x4220 net/sched/cls_rsvp.h:572 tc_new_tfilter+0x31fe/0x5010 net/sched/cls_api.c:2104 rtnetlink_rcv_msg+0xcb7/0x1570 net/core/rtnetlink.c:5415 netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330 ___sys_sendmsg net/socket.c:2384 [inline] __sys_sendmsg+0x451/0x5f0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45b349 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f269d43dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f269d43e6d4 RCX: 000000000045b349 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000000009c2 R14: 00000000004cb338 R15: 000000000075bfd4 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2774 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4382 __kmalloc_reserve net/core/skbuff.c:141 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209 alloc_skb include/linux/skbuff.h:1049 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline] netlink_sendmsg+0x7d3/0x14d0 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330 ___sys_sendmsg net/socket.c:2384 [inline] __sys_sendmsg+0x451/0x5f0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `6fa8c0144b` ("[NET_SCHED]: Use nla_policy for attribute validation in classifiers") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-01 12:25:06 -08:00
Sven Eckelmann	e8d5bb4dfa	MAINTAINERS: Orphan HSR network protocol The current maintainer Arvid Brodin <arvid.brodin@alten.se> hasn't contributed to the kernel since 2015-02-27. His company mail address is also bouncing and the company confirmed (2020-01-31) that no Arvid Brodin is working for them: > Vi har dessvärre ingen Arvid Brodin som arbetar på ALTEN. A MIA person cannot be the maintainer. It is better to mark is as orphaned until some other person can jump in and take over the responsibility for HSR. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-01 11:45:57 -08:00
Dan Carpenter	d32a06f543	qed: Fix a error code in qed_hw_init() If the qed_fw_overlay_mem_alloc() then we should return -ENOMEM instead of success. Fixes: `30d5f85895` ("qed: FW 8.42.2.0 Add fw overlay feature") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-01 11:35:17 -08:00
Dan Carpenter	08ff78182f	octeontx2-pf: Fix an IS_ERR() vs NULL bug The otx2_mbox_get_rsp() function never returns NULL, it returns error pointers on error. Fixes: `34bfe0ebed` ("octeontx2-pf: MTU, MAC and RX mode config support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-02-01 11:32:43 -08:00
Eric Dumazet	784f8344de	tcp: clear tp->segs_{in\|out} in tcp_disconnect() tp->segs_in and tp->segs_out need to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: `2efd055c53` ("tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Marcelo Ricardo Leitner <mleitner@redhat.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 22:12:37 -08:00
Eric Dumazet	db7ffee6f3	tcp: clear tp->data_segs{in\|out} in tcp_disconnect() tp->data_segs_in and tp->data_segs_out need to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: `a44d6eacda` ("tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 22:12:22 -08:00
Eric Dumazet	2fbdd56251	tcp: clear tp->delivered in tcp_disconnect() tp->delivered needs to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: `ddf1af6fa0` ("tcp: new delivery accounting") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 22:12:18 -08:00
Eric Dumazet	c13c48c00a	tcp: clear tp->total_retrans in tcp_disconnect() total_retrans needs to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: SeongJae Park <sjpark@amazon.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 22:11:35 -08:00
Matteo Croce	78e06cf430	netfilter: nf_flowtable: fix documentation In the flowtable documentation there is a missing semicolon, the command as is would give this error: nftables.conf:5:27-33: Error: syntax error, unexpected devices, expecting newline or semicolon hook ingress priority 0 devices = { br0, pppoe-data }; ^^^^^^^ nftables.conf:4:12-13: Error: invalid hook (null) flowtable ft { ^^ Fixes: `19b351f16f` ("netfilter: add flowtable documentation") Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-01-31 19:31:42 +01:00
Paul Blakey	c22208b7ce	netfilter: flowtable: Fix setting forgotten NF_FLOW_HW_DEAD flag During the refactor this was accidently removed. Fixes: `ae29045018` ("netfilter: flowtable: add nf_flow_offload_tuple() helper") Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-01-31 19:31:42 +01:00
Paul Blakey	0f34f30a1b	netfilter: flowtable: Fix missing flush hardware on table free If entries exist when freeing a hardware offload enabled table, we queue work for hardware while running the gc iteration. Execute it (flush) after queueing. Fixes: `c29f74e0df` ("netfilter: nf_flow_table: hardware offload support") Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-01-31 19:31:41 +01:00
Paul Blakey	91bfaa15a3	netfilter: flowtable: Fix hardware flush order on nf_flow_table_cleanup On netdev down event, nf_flow_table_cleanup() is called for the relevant device and it cleans all the tables that are on that device. If one of those tables has hardware offload flag, nf_flow_table_iterate_cleanup flushes hardware and then runs the gc. But the gc can queue more hardware work, which will take time to execute. Instead first add the work, then flush it, to execute it now. Fixes: `c29f74e0df` ("netfilter: nf_flow_table: hardware offload support") Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-01-31 19:31:40 +01:00
Joe Perches	b9e0102a57	netfilter: Use kvcalloc Convert the uses of kvmalloc_array with __GFP_ZERO to the equivalent kvcalloc. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-01-31 19:30:54 +01:00
Nathan Chancellor	91a7d4bf3e	mlxsw: spectrum_qdisc: Fix 64-bit division error in mlxsw_sp_qdisc_tbf_rate_kbps When building arm32 allmodconfig: ERROR: "__aeabi_uldivmod" [drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko] undefined! rate_bytes_ps has type u64, we need to use a 64-bit division helper to avoid a build error. Fixes: `a44f58c41b` ("mlxsw: spectrum_qdisc: Support offloading of TBF Qdisc") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 08:52:41 -08:00
Shannon Nelson	b5ce31b5e1	ionic: fix rxq comp packet type mask Be sure to include all the packet type bits in the mask. Fixes: `fbfb803153` ("ionic: Add hardware init and device commands") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 08:43:05 -08:00
Michael Walle	2318ca8aef	net: phy: at803x: disable vddio regulator The probe() might enable a VDDIO regulator, which needs to be disabled again before calling regulator_put(). Add a remove() function. Fixes: `2f664823a4` ("net: phy: at803x: add device tree binding") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 07:54:48 -08:00
Michael Walle	2e1bf3a765	net: mii_timestamper: fix static allocation by PHY driver If phydev->mii_ts is set by the PHY driver, it will always be overwritten in of_mdiobus_register_phy(). Fix it. Also make sure, that the unregister() doesn't do anything if the mii_timestamper was provided by the PHY driver. Fixes: `1dca22b184` ("net: mdio: of: Register discovered MII time stampers.") Signed-off-by: Michael Walle <michael@walle.cc> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 07:46:11 -08:00
Michael Walle	0e0daf6ac3	net: mdio: of: fix potential NULL pointer derefernce of_find_mii_timestamper() returns NULL if no timestamper is found. Therefore, guard the unregister_mii_timestamper() calls. Fixes: `1dca22b184` ("net: mdio: of: Register discovered MII time stampers.") Signed-off-by: Michael Walle <michael@walle.cc> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-01-31 07:46:11 -08:00
David Howells	04d36d748f	rxrpc: Fix missing active use pinning of rxrpc_local object The introduction of a split between the reference count on rxrpc_local objects and the usage count didn't quite go far enough. A number of kernel work items need to make use of the socket to perform transmission. These also need to get an active count on the local object to prevent the socket from being closed. Fix this by getting the active count in those places. Also split out the raw active count get/put functions as these places tend to hold refs on the rxrpc_local object already, so getting and putting an extra object ref is just a waste of time. The problem can lead to symptoms like: BUG: kernel NULL pointer dereference, address: 0000000000000018 .. CPU: 2 PID: 818 Comm: kworker/u9:0 Not tainted 5.5.0-fscache+ #51 ... RIP: 0010:selinux_socket_sendmsg+0x5/0x13 ... Call Trace: security_socket_sendmsg+0x2c/0x3e sock_sendmsg+0x1a/0x46 rxrpc_send_keepalive+0x131/0x1ae rxrpc_peer_keepalive_worker+0x219/0x34b process_one_work+0x18e/0x271 worker_thread+0x1a3/0x247 kthread+0xe6/0xeb ret_from_fork+0x1f/0x30 Fixes: `730c5fd42c` ("rxrpc: Fix local endpoint refcounting") Signed-off-by: David Howells <dhowells@redhat.com>	2020-01-30 21:50:41 +00:00
David Howells	f71dbf2fb2	rxrpc: Fix insufficient receive notification generation In rxrpc_input_data(), rxrpc_notify_socket() is called if the base sequence number of the packet is immediately following the hard-ack point at the end of the function. However, this isn't sufficient, since the recvmsg side may have been advancing the window and then overrun the position in which we're adding - at which point rx_hard_ack >= seq0 and no notification is generated. Fix this by always generating a notification at the end of the input function. Without this, a long call may stall, possibly indefinitely. Fixes: `248f219cb8` ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com>	2020-01-30 21:50:41 +00:00

1 2 3 4 5 ...

896528 commits