Commit graph

245426 commits

Author SHA1 Message Date
Simon J. Gerraty 18e2fbc0d8 Initialize verbosity and debug level from env
For EFI at least, we can seed the environment
with VE_VERBOSE etc.

Reviewed by:	stevek imp
Sponsored by:	Juniper Networks
MFC after:	1 week
Differential Revision:  https://reviews.freebsd.org/D22135
2019-10-24 19:50:18 +00:00
Bjoern A. Zeeb e5fffe9a69 frag6.c: do not leak packet queue entry in error case
When we are checking for the maximum reassembled packet size of the
fragmentable part and run into the error case (packet too big),
we are leaking the packet queue enntry if this was a first fragment
to arrive.
Properly cleanup, removing the queue entry from the bucket, decrementing
counters, and freeing the memory.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 19:47:32 +00:00
Kirk McKusick 7792f70137 Soft updates needs to keep an on-disk linked list of inodes that
have been unlinked, but are still referenced by open file descriptors.
These inodes cannot be freed until the final file descriptor reference
has been closed. If the system crashes while they are still being
referenced, these inodes and their referenced blocks need to be
freed by fsck. By having them on a linked list with the head pointer
in the superblock, fsck can quickly find and process them rather
than having to check every inode in the filesystem to see if it is
unreferenced.

When updating the head pointer of this list of unlinked inodes in
the superblock, the superblock check-hash was not getting updated.
If the system crashed with the incorrect superblock check-hash, the
superblock would appear to be corrupted. This patch ensures that
the superblock check-hash is updated when updating the head pointer
of the unlinked inodes list.

There is no need to MFC as superblock check hashes first appeared in
13.0.

Tested by:    Peter Holm
Sponsored by: Netflix
2019-10-24 19:47:18 +00:00
Andrew Gallatin 0dc59d7632 Add a tunable to set the pgcache zone's maxcache
When it is set to 0 (the default), a heavy Netflix-style web workload
suffers from heavy lock contention on the vm page free queue called from
vm_page_zone_{import,release}() as the buckets are frequently drained.
When setting the maxcache, this contention goes away.

We should eventually try to autotune this, as well as make this
zone eligable for uma_reclaim().

Reviewed by:	alc, markj
Not Objected to by: jeff
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22112
2019-10-24 18:39:05 +00:00
John Baldwin 7d29eb9a91 Use a counter with a random base for explicit IVs in GCM.
This permits constructing the entire TLS header in ktls_frame() rather
than ktls_seq().  This also matches the approach used by OpenSSL which
uses an incrementing nonce as the explicit IV rather than the sequence
number.

Reviewed by:	gallatin
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22117
2019-10-24 18:13:26 +00:00
Bjoern A. Zeeb 30809ba9e3 frag6: leave a note about upper layer header checks TBD
Per sepcification the upper layer header needs to be within the first
fragment.  The check was not done so far and there is an open review for
related work, so just leave a note as to where to put it.
Move the extraction of frag offset up to this as it is needed to determine
whether this is a first fragment or not.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 12:16:15 +00:00
Bjoern A. Zeeb 7715d794ef frag6: check global limits before hash and lock
Check whether we are accepting more fragments (based on global limits)
before doing expensive operations of calculating the hash and taking the
bucket lock.   This slightly increases a "race" between check time and
incrementing counters (which is already there) possibly allowing a few
more fragments than the maximum limits.  However, when under attack,
we rather save this CPU time for other packets/work.

MFC after:		3 weeks
Sponsored by:		Netflix
2019-10-24 11:58:24 +00:00
Michael Tuexen 9f36ec8bba Store a handle for the event handler. This will be used when unloading the
SCTP as a module.

Obtained from:		markj@
2019-10-24 09:22:23 +00:00
Bjoern A. Zeeb efdfee93c0 frag6: small improvements
Rather than walking the mbuf chain manually use m_last() which doing
exactly that for us.
Defer initializing srcifp for longer as there are multiple exit paths
out of the function which do not need it set.  Initialize before taking
the lock though.
Rename the mtx lock to match the type better.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 08:15:40 +00:00
Bjoern A. Zeeb da89a0fe94 frag6: remove IP6_REASS_MBUF macro
The IP6_REASS_MBUF() macro did some pointer gynmastics to end up with the
same type as it gets in [*(cast **)&].  Spelling it out instead saves all
this and makes the code more readable and less obfuscated directly using
the structure field.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 07:53:10 +00:00
Toomas Soome 96b2f9c996 userboot/test should use PRIx64 as one would expect from prefix 0x
Test is printing decimal value after prefix 0x.
2019-10-24 07:49:33 +00:00
Randall Stewart 9992c365b6 Fix a small bug in bbr when running under a VM. Basically what
happens is we are more delayed in the pacer calling in so
we remove the stack from the pacer and recalculate how
much time is left after all data has been acknowledged. However
the comparision was backwards so we end up with a negative
value in the last_pacing_delay time which causes us to
add in a huge value to the next pacing time thus stalling
the connection.

Reported by:	vm2.finance@gmail.com
2019-10-24 05:54:30 +00:00
Justin Hibbits 6087140822 powerpc/booke: Simplify the MPC85XX PCIe root complex driver
Summary:
Due to bugs in the enumeration code, fsl_pcib_init() was not configuring
sub-bridges properly, so devices hanging off a separate bridge would not
be found.  Since the generic PCI code already supports probing child
buses, just delete this code and initialize only the device itself,
letting the generic code handle all the additional probing and
initializing.

This also deletes setup for some PCI peripherals found on some MPC85XX
evaluation boards.  The code can be resurrected if needed, but overly
complicated this code in the first place.

Reviewed by:	bdragon
Differential Revision:	https://reviews.freebsd.org/D22050
2019-10-24 03:51:33 +00:00
Eric Joyner 1558015e3e iflib: call ether_ifdetach and netmap_detach before stop
From Jake:
Calling ether_ifdetach after iflib_stop leads to a potential race where
a stale ifp pointer can remain in the route entry list for IPv6 traffic.
This will potentially cause a page fault or other system instability if
the ifp pointer is accessed.

Move both iflib_netmap_detach and ether_ifdetach to be called prior to
iflib_stop. This avoids the race above, and helps ensure that other ifp
references are removed before stopping the interface.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	erj@, gallatin@, jhb@
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D22071
2019-10-23 23:20:49 +00:00
Bjoern A. Zeeb f1664f3258 frag6: add "big picture"
Add some ASCII relation of how the bits plug together.  The terminology
difference of "fragmented packets" and "fragment packets" is subtle.
While here clear up more whitespace and comments.

No functional change.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-23 23:10:12 +00:00
Bjoern A. Zeeb 21f08a074d frag6: replace KAME hand-rolled queues with queue(9) TAILQs
Remove the KAME custom circular queue for fragments and fragmented packets
and replace them with a standard TAILQ.
This make the code a lot more understandable and maintainable and removes
further hand-rolled code from the the tree using a standard interface instead.

Hide the still public structures under #ifdef _KERNEL as there is no
use for them in user space.
The naming is a bit confusing now as struct ip6q and the ip6q[] buckets
array are not the same anymore;  sadly struct ip6q is also used by the
MAC framework and we cannot rename it.

Submitted by:	jtl (initally)
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D16847 (jtl's original)
2019-10-23 23:01:18 +00:00
Mark Johnston be2c561003 Modify release_page() to handle a missing fault page.
r353890 introduced a case where we may call release_page() with
fs.m == NULL, since the fault handler may now lock the vnode prior
to allocating a page for a page-in.

Reported by:	jhb
Reviewed by:	kib
MFC with:	r353890
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22120
2019-10-23 20:39:21 +00:00
Bjoern A. Zeeb 3c7165b35e frag6: whitespace changes
Remove trailing white space, add a blank line, and compress a comment.
No functional changes.

MFC after:	10 days
Sponsored by:	Netflix
2019-10-23 20:37:15 +00:00
Ed Maste 4ad0475f03 arm64: enable options NUMA in GENERIC
As with amd64 NUMA is required for reasonable operation on big-iron
arm64 systems and is expected to have no significant impact on small
systems.  Enable it now for wider testing in advance of FreeBSD 13.0.

You can use the 'vm.ndomains' sysctl to see if multiple domains are in
use - for example (from Cavium/Marvell ThunderX2):

# sysctl vm.ndomains
vm.ndomains: 2

No objection:	manu
Sponsored by:	The FreeBSD Foundation
2019-10-23 19:35:26 +00:00
Warner Losh dd376a963c exit requires stdlib.h to be included to use.
FreeBSD 10.3 requires this, and dtc is a bootstrap tool so it needs to compile
there.
2019-10-23 19:23:31 +00:00
Mateusz Guzik 08ded448cf amd64 pmap: per-domain pv chunk list
This significantly reduces contention since chunks get created and removed
all the time. See the review for sample results.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21976
2019-10-23 19:17:10 +00:00
Conrad Meyer 639ec13157 amd64: Add CFI directives for libc syscall stubs
No functional change (in program code).  Additional DWARF metadata is
generated in the .eh_frame section.  Also, it is now a compile-time
requirement that machine/asm.h ENTRY() and END() macros are paired.  (This
is subject to ongoing discussion and may change.)

This DWARF metadata allows llvm-libunwind to unwind program stacks when the
program is executing the function.  The goal is to collect accurate
userspace stacktraces when programs have entered syscalls.

(The motivation for "Call Frame Information," or CFI for short -- not to be
confused with Control Flow Integrity -- is to sufficiently annotate assembly
functions such that stack unwinders can unwind out of the local frame
without the requirement of a dedicated framepointer register; i.e.,
-fomit-frame-pointer.  This is necessary for C++ exception handling or
collecting backtraces.)

For the curious, a more thorough description of the metadata and some
examples may be found at [1] and documentation at [2].  You can also look at
'cc -S -o - foo.c | less' and search for '.cfi_' to see the CFI directives
generated by your C compiler.

[1]: https://www.imperialviolet.org/2017/01/18/cfi.html
[2]: https://sourceware.org/binutils/docs/as/CFI-directives.html

Reviewed by:	emaste, kib (with reservations)
Differential Revision:	https://reviews.freebsd.org/D22122
2019-10-23 19:03:03 +00:00
Conrad Meyer 5ffc069a3a libthr: Add missing END() directive for umtx_op_err (amd64)
Like r353929, related to D22122.  No functional change.

Reviewed by:	emaste, kib (earlier version both)
2019-10-23 18:27:30 +00:00
Mark Johnston 2f81c92e55 Check for bogus_page in vnode_pager_generic_getpages_done().
We now assert that a page is busy when updating its validity-tracking
state, but bogus_page is not busied during a getpages operation.

Reported by:	syzkaller
Reviewed by:	alc, kib
Discussed with:	jeff
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22124
2019-10-23 18:00:22 +00:00
Mark Johnston e6f1a58082 Verify identity after checking for WAITFAIL in vm_page_busy_acquire().
A caller that does not guarantee that a page's identity won't change
while sleeping for a busy lock must specify either NOWAIT or WAITFAIL.

Reported by:	syzkaller
Reviewed by:	alc, kib
Discussed with:	jeff
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22124
2019-10-23 17:58:19 +00:00
Ryan Stone 92a15f946b Add missing M_NOWAIT flag
The LinuxKPI linux_dma code calls PCTRIE_INSERT with a
mutex held, but does not set M_NOWAIT when allocating
nodes, leading to a potential panic.  All of this code
can handle an allocation failure here, so prefer an
allocation failure to sleeping on memory.

Also fix a related case where NOWAIT/WAITOK was not
specified.  In this case it's not clear whether sleeping
is allowed so be conservative and assume not.  There are
a lot of other paths in this code that can fail due to
a lack of memory anyway.

Differential Revision: https://reviews.freebsd.org/D22127
Reviewed by: imp
Sponsored by: Dell EMC Isilon
MFC After: 1 week
2019-10-23 17:20:20 +00:00
Dimitry Andric 6ab18ea64d Build toolchain components as dynamically linked executables by default
Summary:
Historically, we have built toolchain components such as cc, ld, etc as
statically linked executables.  One of the reasons being that you could
sometimes save yourself from botched upgrades, by e.g. recompiling a
"known good" libc and reinstalling it.

In this day and age, we have boot environments, virtual machine
snapshots, cloud backups, and other much more reliable methods to
restore systems to working order.  So I think the time is ripe to flip
this default, and link the toolchain components dynamically, just like
almost all other executables on FreeBSD.

Maybe at some point they can even become PIE executables by default! :)

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D22061
2019-10-23 17:02:45 +00:00
Dimitry Andric 653fac7d1a Bump clang's default target CPU for the i386 architecture (aka "x86") to
i686, as per the discussion on the freebsd-arch mailing list.  Earlier
in r352030, I had already bumped it to i586, to work around missing
atomic 64 bit functions for the i386 architecture.

Relnotes:	yes
2019-10-23 16:57:11 +00:00
Mark Johnston 87382b222f Set OBJ_NOSPLIT on the ksyms(4) VM object.
The object does not provide anonymous memory.

Reported by:	kib
Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22123
2019-10-23 16:53:37 +00:00
Conrad Meyer 65366903c3 Prevent a panic when a driver provides bogus debugnet parameters
This is just a bandaid; we should fix the driver(s) too.  Introduced in
r353685.

PR:		241403
X-MFC-With:	r353685
Reported by:	np and others
2019-10-23 16:48:22 +00:00
Dimitry Andric de19b521ee Slightly expand description of WITH_SHARED_TOOLCHAIN, add a
corresponding WITHOUT_SHARED_TOOLCHAIN description, and regenerate
src.conf(5).

MFC after:	 3 days
2019-10-23 16:48:17 +00:00
John Baldwin 4196949c01 Strip "sf" suffix when generating a target triple.
This fixes the target triple used when compiling riscv64sf with clang.

Discussed with:	mhorne
MFC after:	2 weeks
Sponsored by:	DARPA
2019-10-23 16:43:51 +00:00
John Baldwin b96562eb86 Fix atomic_*cmpset32 on riscv64 with clang.
The lr.w instruction used to read the value from memory sign-extends
the value read from memory.  GCC sign-extends the 32-bit comparison
value passed in whereas clang currently does not.  As a result, if the
value being compared has the MSB set, the comparison fails for
matching 32-bit values when compiled with clang.

Use a cast to explicitly sign-extend the unsigned comparison value.
This works with both GCC and clang.

There is commentary in the RISC-V spec that suggests that GCC's
approach is more correct, but it is not clear if the commentary in the
RISC-V spec is binding.

Reviewed by:	mhorne
Obtained from:	Axiado
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D22084
2019-10-23 16:41:31 +00:00
Konstantin Belousov c92f130498 Fix undefined behavior.
Create a sequence point by ending a full expression for call to
vspace() and use of the globals which are modified by vspace().

Reported and reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D22126
2019-10-23 16:06:47 +00:00
Conrad Meyer 168f19a57b libm: Add missing END() directives for amd64 routines
No functional change.  Related to D22122.

Reviewed by:	emaste, kib (earlier version both)
2019-10-23 16:05:52 +00:00
Konstantin Belousov 8076c4e7d1 vn_printf(): Decode VI_TEXT_REF.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-10-23 15:51:26 +00:00
Andrew Turner 9d0a6b83ca Stop enabling interrupts when reentering kdb on arm64
When we raise a data abort from the kernel we need to enable interrupts,
however we shouldn't be doing this when in the kernel debugger. In this
case interrupts can lead to a further panic as they don't expect to be
run from such a context.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-10-23 13:21:15 +00:00
Emmanuel Vadot a601368040 regulator: Add a regnode_set_constraint function
This method check that boot_on or always_on is set to 1 and if it
is it will try to enable the regulator.
The binding docs aren't clear on what to do but Linux enable the regulator
if any of those properties is set so we want to do the same.
The function first check the status to see if the regulator is
already enabled it then get the voltage to check if it is in a acceptable
range and then enables it.
This will be either called from the regnode_init method (if it's needed by the platform)
or by a SYSINIT at SI_SUB_LAST

Reviewed by:	mmel
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22106
2019-10-23 09:56:53 +00:00
Emmanuel Vadot f66cb502db axp81x: Use the default regnode_init method
MFC after:	1 week
2019-10-23 09:54:50 +00:00
Emmanuel Vadot a24735546b regulator: Add a regnode_method_init
This is a default init method for regulator that don't really
need one.

MFC after:	1 week
2019-10-23 09:54:12 +00:00
Konstantin Belousov 16b0c09225 Assert that vm_fault_lock_vnode() returns locked saved vnode.
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D22113
2019-10-23 07:36:26 +00:00
Kyle Evans b3bec79d36 cap_sysctl: correct typo from r347534-ish
operation & ~limit where limit is a bool is clearly not what was intended,
given the line prior. Correct it to use the calculated mask for validation.

The cap_sysctl tests should now be functional again.
2019-10-23 03:23:14 +00:00
Kyle Evans f7810883d4 tuntap(4): Fix NOINET build after r353741
Shuffle headers around to more appropriate #ifdef OPTION blocks (INET vs.
INET6) -- double checked LINT-{NOINET,NOINET6,NOIP}, all seem good.

Reported by:	cem
2019-10-23 02:15:15 +00:00
Kyle Evans e735aa5a74 libcasper/services: include <src.opts.mk> to hook tests to build
Note that the cap_sysctl tests are currently failing and need some
attention.
2019-10-23 01:50:41 +00:00
Greg Lehey 67fd12c006 Correct spelling, apply appropriate respect. 2019-10-23 01:11:25 +00:00
Justin Hibbits dc2b5bb497 powerpc/booke: Fix Book-E boot post-minidump
r353489 added minidump support for powerpc64, but it added a dependency on
the dump_avail array.  Leaving it uninitialized caused breakage in late
boot.  Initialize dump_avail, even though the 64-bit booke pmap doesn't yet
support minidumps, but will in the future.
2019-10-23 00:31:19 +00:00
Jung-uk Kim a8630f59f9 Belatedly remove stale debug symbols after r339270.
Reported by:	danfe
MFC after:	3 days
2019-10-23 00:05:29 +00:00
Mateusz Guzik 61b8430f38 amd64 pmap: conditionalize per-superpage locks on NUMA
Instead of superpages use. The current code employs superpage-wide locking
regardless and the better locking granularity is welcome with NUMA enabled
even when superpage support is not used.

Requested by:	alc
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21982
2019-10-22 22:55:46 +00:00
Mateusz Guzik 15e33b5493 amd64 pmap: fixup invlgen lookup for fictitious mappings
Similarly to r353438, use dummy entry.

Reported and tested by:	Neel Chauhan
Sponsored by:	The FreeBSD Foundation
2019-10-22 22:54:41 +00:00
Mateusz Guzik 2e310f6f72 pseudofs: hashed vncache
Vast majority of uses the cache are just checking if there is an entry
present on process exit (and evicting it if so). Both checking and
eviction process are very expensive and put the lock protecting it high
up on the profile during poudriere -j 104.

Convert the linked list into a hash. This allows to almost always avoid
taking the lock in the first place (and consequently almost removes it
from the profile). Note only one lock is preserved as a split did not
meaningfully impact contention.

Should the cache be used for something it will still run into contention
issues. The code needs a rewrite, but should someone want to tidy it up
further the following can be done:

1) per-chain locks (or at least an array)
2) hashing by something else than just pid

Sponsored by:	The FreeBSD Foundation
2019-10-22 22:52:53 +00:00