Commit graph

122772 commits

Author SHA1 Message Date
Ed Maste e248c23702 arm64: add arm64 linux.h based on i386 linuxulator and Linux headers
Sponsored by:	Turing Robotic Industries
2018-06-15 19:09:17 +00:00
Conrad Meyer e6e49c7df4 Retain offset compatibility with pre-12.0 dumps
As a follow-up to r324965, which adds support for compressed kernel dumps,
readjust dump header members slightly to mostly preserve ABI with earlier
(11.x and older) dumps.

Reviewed by:	markj
X-MFC-With:	r324965
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15829
2018-06-15 19:02:53 +00:00
Justin Hibbits 6f3fcd7693 Check for a 'pci' prefix rather than a full match in get_addr_props
Summary:
Newer OPAL device trees, such as those on POWER9 systems, use 'pciex' for
device_type, not 'pci'.  Rather than enumerating all possible variants, just
check for a 'pci' prefix.

Reviewed by:	nwhitehorn, breno.leitao_gmail.com
Differential Revision: https://reviews.freebsd.org/D15817
2018-06-15 18:55:02 +00:00
Navdeep Parhar 6ddad9de86 cxgbe(4): sysctls to display the local and intr CPUs for the adapter.
The driver assumes the list can change (even though it does't right now)
and queries it every time the sysctl runs.

sysctl dev.<nexus>.<inst>.local_cpus
sysctl dev.<nexus>.<inst>.intr_cpus

sysctl dev.t6nex.0.local_cpus
sysctl dev.t6nex.0.intr_cpus

Sponsored by:	Chelsio Communications
2018-06-15 18:04:44 +00:00
Kyle Evans e937d05124 extres/regulator: Switch boot_on/always_on sysctl to uint8
These are represented as booleans on the kernel-side, but were being exposed
as int. This was causing some funky things to happen when read later with
sysctl(8), e.g. randomly reading super-high when the value was actually
'0'/false.

Reviewed by:	manu
2018-06-15 17:29:32 +00:00
Ed Maste 2f75f4134c Correct kern.pre.mk comment: objcopy, not objdump, copies objects.
PR:		229046
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-06-15 16:32:18 +00:00
Ruslan Bukin 1147facec3 Make virtio queue re-initialization steps to be similar to
original initialization, so we don't miss few registers to
configure.

This fixes vtnet(4) operation with QEMU's virtio-net-device.

Tested in QEMU with FreeBSD/RISC-V.

Reviewed by:	bryanv
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15821
2018-06-15 16:19:10 +00:00
Chuck Tuffli d1fa346fac Add linprocfs support for min_free_kbytes
This adds linprocfs support for proc/sys/vm/min_free_kbytes which the
free program requires for correct operation. The approach mirrors the
approach used in illumos.

Reviewed by:	imp (mentor), emaste
Approved by:	imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15563
2018-06-15 15:22:27 +00:00
Ed Maste 931e2a1a6e linuxulator: do not include legacy syscalls on arm64
Existing linuxulator platforms (i386, amd64) support legacy syscalls,
such as non-*at ones like open, but arm64 and other new platforms do
not.

Wrap these in #ifdef LINUX_LEGACY_SYSCALLS, #defined in the MD linux.h
files.  We may need finer grained control in the future but this is
sufficient for now.

Reviewed by:	andrew
Sponsored by:	Turing Robotic Industries
Differential Revision:	https://reviews.freebsd.org/D15237
2018-06-15 14:41:51 +00:00
Ed Maste 4b842782da Correct debug control for linuxulator faccessat
The Linuxulator provides per-syscall debug control via the
compat.linux.debug sysctl.  There's generally a 1:1 mapping between
sysctl setting and syscall, but faccessat was controlled by the access
setting, perhaps due to copy-paste.

Sponsored by:	Turing Robotic Industries
2018-06-15 14:29:41 +00:00
Konstantin Belousov 1f3bc15763 linprocfs: add TracerPid to /proc/pid/status.
Also fix the value of parent pid if the process is traced.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:	1 week
2018-06-15 13:56:58 +00:00
Ed Maste 6a1d1e0ca3 Add stubbed arm64 linuxulator /proc/cpuinfo handler
Sponsored by:	Turing Robotic Industries
2018-06-15 13:53:37 +00:00
Kyle Evans e5f97ff2cc Revert r335173 at request of mmel@
This was the wrong solution to the problem; regulator_shutdown invokes
regnode_stop. regulator_stop is not a refcounting method, but it invokes
regnode_enable, which is.

mmel@ has a proposed patch/solution to instead provide regnode_fixed_stop
behavior that properly takes shared GPIO pins into account.
2018-06-15 13:14:45 +00:00
Michael Tuexen 43b223f42e When retransmitting TCP SYN-ACK segments with the TCP timestamp option
enabled use an updated timestamp instead of reusing the one used in
the initial TCP SYN-ACK segment.

This patch ensures that an updated timestamp is used when sending the
SYN-ACK from the syncache code. It was already done if the
SYN-ACK was retransmitted from the generic code.

This makes the behaviour consistent and also conformant with
the TCP specification.

Reviewed by:		jtl@, Jason Eggleston
MFC after:		1 month
Sponsored by:		Neflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D15634
2018-06-15 12:28:43 +00:00
Emmanuel Vadot cfba8dec89 allwinner: ccung: Fully subclass the clock drivers
Each clock drivers if now fully subclassed, this have the advantage that
we can control the probe order.
Some clocks can have parents from other drivers, for example clocks in the
sun8i_r driver uses clocks from the main clock driver.
This worked before because the sun8i_r node is after the main ccu node in the
dtb and driver are probed in DTB order. This cannot work with the Display
Engine clocks as it is the first node in the DTB.

Tested on:    A83T, H5 A64
Tested on:    A20 (kevans)
2018-06-15 08:36:21 +00:00
Justin Hibbits ec0646391d ofw_reg_to_paddr(): Fix minor typo in KASSERT message 2018-06-15 03:28:05 +00:00
Matt Macy 2e3f84e5b7 Quiet coretemp probe
Only the first device will print
coretemp0: <CPU On-Die Thermal Sensors> numa-domain 0 on cpu0
instead of all hyper threads

Submitted by:	kbowling
Reviewed by:	imp, sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15727
2018-06-15 02:28:36 +00:00
Bryan Drewery 03bd1b693e proc0_post: Fix some locking issues
- Filter out PRS_NEW procs as rufetch() tries taking the thread lock
  which may not yet be initialized.
- Hold PROC_LOCK to ensure stability of iterating the threads.
- p_rux fields are protected by the process statlock as well.

MFC after:	2 weeks
Reviewed by:	kib
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D15809
2018-06-15 00:36:41 +00:00
Olivier Houchard 78bcf87e3e Use M_EXEC when calling malloc() to allocate the memory to store the module,
as it'll contain executable code.
2018-06-14 23:10:10 +00:00
Gleb Smirnoff 9293873e83 TCPOUTFLAGS no longer exists since r334843. 2018-06-14 22:25:10 +00:00
Michael Tuexen 33ef123090 Provide the ip6_plen in network byte order when calling ip6_output().
This is not strictly required by ip6_output(), since it overrides it,
but it is needed for upcoming dtrace support.
2018-06-14 21:30:52 +00:00
Brooks Davis 7d87c005da Regen after 335177 (rename sys_obreak to sys_break). 2018-06-14 21:29:31 +00:00
Brooks Davis 9da5364ed9 Name the implementation of brk and sbrk sys_break().
The break() system call was renamed (several times) starting in v3
AT&T UNIX when C was invented and break was a language keyword. The
last vestage of a need for it to be called something else (eg obreak)
was removed in r225617 which consistantly prefixed all syscall
implementations.

Reviewed by:	emaste, kib (older version)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15638
2018-06-14 21:27:25 +00:00
Michael Tuexen 8d86bd564f Whitespace changes. 2018-06-14 21:22:14 +00:00
Kyle Evans cba506f2c1 extres/regulator: Properly refcount gpio regulators
regnode::enable_cnt is generally used to refcount regulator nodes. For
GPIOs, the refcount was done on the gpio_entry since more than one regulator
can share a GPIO.

GPIO regulators were not taking part in the node refcount, since they had
their own mechanism. This caused some fallout after manu started disabling
everybody's unused regulators in r331989.

Refcount it.

Glanced over by:	manu
2018-06-14 20:37:25 +00:00
Konstantin Belousov b7b8a09658 Handle the race between fork/vm_object_split() and faults.
If fault started before vmspace_fork() locked the map, and then during
fork, vm_map_copy_entry()->vm_object_split() is executed, it is
possible that the fault instantiate the page into the original object
when the page was already copied into the new object (see
vm_map_split() for the orig/new objects terminology). This can happen
if split found a busy page (e.g. from the fault) and slept dropping
the objects lock, which allows the swap pager to instantiate
read-behind pages for the fault.  Then the restart of the scan can see
a page in the scanned range, where it was already copied to the upper
object.

Fix it by instantiating the read-ahead pages before
swap_pager_getpages() method drops the lock to allocate pbuf.  The
object scan would see the whole range prefilled with the busy pages
and not proceed the range.

Note that vm_fault rechecks the map generation count after the object
unlock, so that it restarts the handling if raced with split, and
re-lookups the right page from the upper object.

In collaboration with:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-06-14 19:41:02 +00:00
Emmanuel Vadot 7de871aa81 mx25l: compat_data is only defined when FDT is
Reported by:	O. Hartmann <ohartmann@walstatt.org>
2018-06-14 19:01:40 +00:00
Kyle Evans 2ac5ef02d4 a10_ahci: Correct clock indices for new bindings
r329104 imported 4.15 DTS which brought CCU to a10/a20. In the process, they
swapped the ordering of 'clocks' for allwinner,sun4i-a10-ahci on both
sun4i-a10 and sun7i-a20 from PLL, Gate to Gate, PLL.

Swap it in the driver.
2018-06-14 18:34:02 +00:00
Kyle Evans dcc1299f0b aw_ccung: Add a10/a20 support
Note: At this time, this has only been tested on a single board from one of
the supported SoCs. This is enough to boot the board from MMC and have
functional USB- which is still an improvement over where we were at just
before with no functional clocks.

Differential Revision:	https://reviews.freebsd.org/D15810
2018-06-14 17:50:29 +00:00
Kyle Evans 85f58288a7 aw_ccung: Support clock factors where factor=0, factor is effectively 1
This happens in two cases for a20 clocks:

pll_core for 'n' factor:
factor=0, val=1
factor=n, val=n

ahb divisor:
factor=0,val=/2
factor=n,val=/2^n

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D15806
2018-06-14 17:36:02 +00:00
Emmanuel Vadot aab8510031 arm timer: Use the default get_cntxct by default
Reported by:	kevans
2018-06-14 17:32:23 +00:00
Justin Hibbits ebf95d96d9 Split the PowerISA 3.0 HPT implementation from historic
PowerISA 3.0 makes several changes to not only the format of the HPT but
also the behavior surrounding it.  For instance, TLBIE no longer requires
serialization.  Removing this lock cuts buildworld time in half on a
18-core/72-thread POWER9 system, demonstrating that this lock is highly
contended on such a system.

There was odd behavior observed trying to make this change in a
backwards-compatible manner in moea64_native.c, so the best option was to
fully split it, and largely revert the original changes adding POWER9
support to the original file.

Suggested by:	nwhitehorn
2018-06-14 17:23:51 +00:00
Emmanuel Vadot 50d3578f0b mx25l: Add pnp info 2018-06-14 17:21:09 +00:00
Emmanuel Vadot 64b507e5fe spi: Add SPIBUS_PNP_INFO macro
The PNP info string is the same as the SIMPLEBUS one but driver should
depend on spibus and not simplebus
2018-06-14 17:20:47 +00:00
Emmanuel Vadot ba03ef5e21 aw_spi: Add pnp info 2018-06-14 17:19:44 +00:00
Emmanuel Vadot 38d3befe9c arm timer: Add workaround for Allwinner A64 timer
The timer present in allwinner A64 SoC is unstable, value can jump backward
or forward.
It was found that when bit 11 and upper roll over the low bits can sometimes
being read as all as 1 or all as 0.
Simply ignore the values for those cases.
2018-06-14 17:18:15 +00:00
Kenneth D. Merry e4b58dfe33 Fix da(4) locking when probing SMR drives.
Probing host aware and host managed SMR drives got broken in revision
330796.

The added cam_periph_lock() calls were in areas in dadone() where
the peripheral lock was already held.

Since then, dadone() has been split into separate functions that are
dedicated to each probe state.

The result is that when probing a host aware drive, I ran into a recursive
lock acquisition in dadone_probeatalogdir(). I would have run into the
same problem in dadone_probeataiddir(), and in dadone_probeatasup() and
dadone_probeatazone() in the error paths had the probe continued.

The solution is to take out all of the extra cam_periph_lock() calls. I
also added cam_periph_assert(periph, MA_OWNED) near the top of each of
the dadone_* calls. These make it clear to anyone coming along in the
the future that the lock is held in the probe done functions.

Also add a locking assert in daprobedone(), to make it clear that it must
be called with the periph lock held.

Sponsored by:	Spectra Logic
Differential Revision:	https://reviews.freebsd.org/D15764
2018-06-14 17:08:44 +00:00
Justin Hibbits 402c7806cb Fix CTR formatting for moea64_native bootstrap
On very large memory systems 'size' can become 2GB or larger, resulting in a
negative value being formatted.  Also, moea64_pteg_count is already a long, so
format it as such.
2018-06-14 16:01:11 +00:00
Andrey V. Elsukov 9597ff83b8 Add missing BPF_MTAP2() for outbound packets. 2018-06-14 15:04:30 +00:00
Andrey V. Elsukov 2addcba7d5 Convert if_me(4) driver to use encap_lookup_t method and be lockless on
data path.
2018-06-14 14:53:24 +00:00
Konstantin Belousov 459ccd3c5f linuxolator/amd64: Don't mangle %r10 on return from syscall for EJUSTRETURN.
This fixes the %r10 content for rt_sigreturn.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:	1 week
2018-06-14 12:35:57 +00:00
Andrey V. Elsukov eb548a1a5c In m_megapullup() use m_getjcl() to allocate 9k or 16k mbuf when requested.
It is better to try allocate a big mbuf, than just silently drop a big
packet. A better solution could be reworking of libalias modules to be
able use m_copydata()/m_copyback() instead of requiring the single
contiguous buffer.

PR:		229006
MFC after:	1 week
2018-06-14 11:15:39 +00:00
Konstantin Belousov 5803d744c7 Reorganize code flow in fpudna()/npxdna() to highlight the critical
section scope.  Sprinkle __predict_false() for conditions known to
never occur or occur only on rare platforms.

Sponsored by:	The FreeBSD Foundation
2018-06-14 11:09:51 +00:00
Konstantin Belousov fa7fad8ab9 Remove printf() in #NM handler.
Give up and remove the almost useless informational message reporting
that device not available exception occured while our state tracking
indicates the current CPU has FPU context loaded for the current
thread.

It seems that this is recurring bug with some VM monitors.

Sponsored by:	The FreeBSD Foundation
2018-06-14 10:33:26 +00:00
Rick Macklem c338c94d20 Move four functions in nfscl.ko to nfscommon.ko.
Four functions nfscl_reqstart(), nfscl_fillsattr(), nfsm_stateidtom()
and nfsmnt_mdssession() are now called from within the nfsd.
As such, they needed to be moved from nfscl.ko to nfscommon.ko so that
nfsd.ko would load when nfscl.ko wasn't loaded.

Reported by:	herbert@gojira.at
2018-06-14 10:00:19 +00:00
Andrey V. Elsukov a4f647571f Add NULL check like the rest of code has.
It is possible that ifma_protospec becomes NULL in this function for
some entry, but it is still referenced and thus it will not unlinked
from the list. Then "restart" condition triggers and this entry with
NULL ifma_protospec will lead to page fault.

PR:		228982
2018-06-14 09:36:25 +00:00
Andrey V. Elsukov e36c281fc9 Remove stale comment. in6_ifdetach() can be called from places
where addresses are not removed yet.
2018-06-14 09:29:39 +00:00
Hans Petter Selasky 3ea947d615 Revert r335094 and properly fix OFED build after r335053.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-06-14 07:55:10 +00:00
Emmanuel Vadot 63e12418fd dts: Update our copy to Linux 4.17 2018-06-14 07:12:10 +00:00
Emmanuel Vadot c0fc404789 Add modules/rockchip
Build rockchip modules as part of buildkernel.
Add the i2c controller module.
2018-06-14 06:40:59 +00:00
Emmanuel Vadot 3de61a6883 rk_i2c: Add driver for the I2C controller present in RockChip SoC
This controller have a special mode for RX to help with smbus-like transfer
when the controller will automatically send the slave address, register address
and read the data. Use it when possible.
The same mode for TX is describe is the datasheet but is broken and have been
since ~10 years of presence of this controller in RockChip SoCs.

Attach this driver early at we need it to communicate with the PMIC early in the
boot.
Do not hook it to the kernel build for now.
2018-06-14 06:39:33 +00:00
Emmanuel Vadot e3fc845c91 rk3328: Add support for the i2c clocks 2018-06-14 06:34:27 +00:00
Emmanuel Vadot 3476304a69 if_dwc_rk: Add DesignWare driver for RockChip SoCs.
Add driver for the designware ethernet controller found in some RockChip SoCs.
The driver still rely on a lot of things setup by the bootloader like clocks
and phy mode.
But since netbooting is the only/easiest way to boot rockchip board at the
moment add the driver so other people can test/dev on thoses boards.
2018-06-14 06:28:09 +00:00
Emmanuel Vadot 282d1ef778 rk_armclk: Add the write mask to the register mux value
This was omitted in r334112 and r334996 which cause the PLL to not correctly
reparent, leaving the armclk to be derived from the APLL instead of the NPLL.
The arm core clock is now correctly set to 600Mhz via the assigned-clock present
in the DTB.
2018-06-14 05:46:57 +00:00
Emmanuel Vadot 1e7af4cc7a rk_pll: Add support for mode
RockChip PLL have two modes controlled by a register, a "slow mode" (the
default one) where the frequency is derived from the 24Mhz oscillator on the
board, and a "normal" one when the pll take it's input from the real PLL output.

Default the mode to normal for all the PLLs.
2018-06-14 05:43:45 +00:00
Emmanuel Vadot b1b521b1d5 rk_pinctrl: Only add gpio subnode
This is the only node we are interested in so do not waste time to test
creating device that will be either unused or fail as most of the nodes
don't have a compatible string.
2018-06-14 05:41:16 +00:00
Randall Stewart 4aec110f70 This fixes several bugs that Larry Rosenman helped me find in
Rack with respect to its handling of TCP Fast Open. Several
fixes all related to TFO are included in this commit:
1) Handling of non-TFO retransmissions
2) Building the proper send-map when we are doing TFO
3) Dealing with the ack that comes back that includes the
   SYN and data.

It appears that with this commit TFO now works :-)

Thanks Larry for all your help!!

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D15758
2018-06-14 03:27:42 +00:00
Navdeep Parhar 2ea5b0f54a cxgbe(4): Catch up with recent changes in the kernel -- it no longer
holds non-sleepable locks around any of the driver ioctls.

Sponsored by:	Chelsio Communications
2018-06-14 01:27:35 +00:00
Matt Macy 67b3b4d245 fix OFED build after r335053 2018-06-13 23:30:54 +00:00
Matt Macy feeef8509b Fix PCBGROUPS build post CK conversion of pcbinfo 2018-06-13 23:19:54 +00:00
Konstantin Belousov fc3e80c322 Enable eager FPU context switch by default on i386 too, based on
amd64 r335072.

Security:	CVE-2018-3665
Sponsored by:	The FreeBSD Foundation
2018-06-13 21:10:23 +00:00
Warner Losh 75b6758daa Add PNP info to PCI attachment of ae driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:36 +00:00
Warner Losh ae8b178b0a Add PNP info to PCI attachments of age driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:32 +00:00
Warner Losh 542f4c5c92 Add PNP info to PCI attachment of amr driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:27 +00:00
Warner Losh ef50201a2e Add PNP info to PCI attachment of ale driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:23 +00:00
Warner Losh fc7449ce6a Add PNP info to PCI attachment of bwi driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:18 +00:00
Warner Losh 96b523613c Add PNP info to PCI attachment of bwn driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:13 +00:00
Warner Losh 769ac9e65b Add PNP info to PCI attachment of an driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:09 +00:00
Warner Losh 491589f3af Add PNP info to the PCI attachment of the ahci driver
Mark the PNP table, but still need to handle the CLASS / SUBCLASS /
REVID matching.

Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:04 +00:00
Warner Losh c3f3f3e648 Add PNP info to the PCI attachment of the aacraid driver.
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:00 +00:00
Warner Losh 791a8cbbe6 Add PNP info to the PCI attachment of the ncr driver.
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:24:49 +00:00
Ryan Libby a7be368aec i386: copyin/copyout error is EFAULT
Discussed with:	kib
MFC with:	r332489
Sponsored by:	Dell EMC Isilon
2018-06-13 19:57:03 +00:00
Konstantin Belousov d1a07e31e5 Enable eager FPU context switch by default on amd64.
With compilers making increasing use of vector instructions the
performance benefit of lazily switching FPU state is no longer a
desirable tradeoff.  Linux switched to eager FPU context switch some
time ago, and the idea was floated on the FreeBSD-current mailing list
some years ago[1].

Enable eager FPU context switch by default on amd64, with a tunable/sysctl
available to turn it back off.

[1] https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055198.html

Reviewed by:	jhb
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2018-06-13 17:55:09 +00:00
Jonathan T. Looney 0766f278d8 Make UMA and malloc(9) return non-executable memory in most cases.
Most kernel memory that is allocated after boot does not need to be
executable.  There are a few exceptions.  For example, kernel modules
do need executable memory, but they don't use UMA or malloc(9).  The
BPF JIT compiler also needs executable memory and did use malloc(9)
until r317072.

(Note that a side effect of r316767 was that the "small allocation"
path in UMA on amd64 already returned non-executable memory.  This
meant that some calls to malloc(9) or the UMA zone(9) allocator could
return executable memory, while others could return non-executable
memory.  This change makes the behavior consistent.)

This change makes malloc(9) return non-executable memory unless the new
M_EXEC flag is specified.  After this change, the UMA zone(9) allocator
will always return non-executable memory, and a KASSERT will catch
attempts to use the M_EXEC flag to allocate executable memory using
uma_zalloc() or its variants.

Allocations that do need executable memory have various choices.  They
may use the M_EXEC flag to malloc(9), or they may use a different VM
interfact to obtain executable pages.

Now that malloc(9) again allows executable allocations, this change also
reverts most of r317072.

PR:		228927
Reviewed by:	alc, kib, markj, jhb (previous version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D15691
2018-06-13 17:04:41 +00:00
Warner Losh a971acbc25 Implement a 'car limit' for bioq.
Allow one to implement a 'car limit' for
bioq_disksort. debug.bioq_batchsize sets the size of car limit. Every
time we queue that many requests, we start over so that we limit the
latency for requests when the software queue depths are large. A value
of '0', the default, means to revert to the old behavior.

Sponsored by: Netflix
2018-06-13 16:48:07 +00:00
Andrew Turner 4e050d14e0 Add ThunderX2 to the list of CPUs we need to apply the branch predictor
hardening to.

Sponsored by:	DARPA, AFRL
2018-06-13 15:58:33 +00:00
Andrew Turner 3c4dad8812 Switch to the SMCCC function for branch predictor hardening. The previous
method may not have worked as the firmware checks for the ARCH_WORKAROUND_1
function ID.

Sponsored by:	DARPA, AFRL
2018-06-13 15:56:24 +00:00
Andrew Turner 09d1a08ddc Add the SMCCC return codes from ARM DEN 0070A.
While here add a comment with the document the function IDs come from.

Sponsored by:	DARPA, AFRL
2018-06-13 15:41:22 +00:00
Andrew Turner f651b52527 Add support for the ARM SMC Calling Convention (SMCCC). This is a method
to call into the firmware in a similar way to the existing PSCI, and used
PSCI to detect when SMCCC is enabled.

There is a function ID space we can use. Currently we only support 3
functions in the ARM Architecture Calls region, however it is expected we
will expend these in the future.

Sponsored by:	DARPA, AFRL
2018-06-13 15:32:00 +00:00
Andrew Turner 9e8cb3d226 Move psci_call to a header file so we can use it in other files to
communicate with the firmware.

Sponsored by:	DARPA, AFRL
2018-06-13 15:24:07 +00:00
Alan Somers ebc0d5599d audit(4): fix the definition of ARG_TERMID_ADDR
Due to a copy/paste error in r168688, ARG_TERMID_ADDR has the same
definition as ARG_SADDRUNIX.  Fix it.

The header change, while publicly visible, is guarded by #ifdef KERNEL, and
I can't find any kmod ports that use it.  So I'm not bumping
__FreeBSD_version.

PR:		228820
Submitted by:	aniketp
Sponsored by:	Google, Inc. (GSoC 2018)
Differential Revision:	https://reviews.freebsd.org/D15702
2018-06-13 14:55:31 +00:00
Bruce Evans 407a812657 Oops, r335053 had an old version of the comment about 16-bit linux dev_t
translation.
2018-06-13 12:44:45 +00:00
Andrew Turner 5add83935a Add a handler for the PSCI_FEATURES function. This needs PSCI 1.0, so
check for this, returning an error if the version is too old.

Sponsored by:	DARPA, AFRL
2018-06-13 12:33:47 +00:00
Andrew Turner 4493861b9a Find and cache the PSCI version on driver attach.
Sponsored by:	DARPA, AFRL
2018-06-13 12:32:04 +00:00
Andrew Turner 811880f2a4 Add the PSCI_FEATURES function ID. This is found in PSCI 1.0 and is used
to query if a given function is implemented and its features.

Sponsored by:	DARPA, AFRL
2018-06-13 12:26:37 +00:00
Bruce Evans ab35e1c71b Fix the encoding of major and minor numbers in 64-bit dev_t by restoring
the old encodings for the lower 16 and 32 bits and only using the
higher 32 bits for unusually large major and minor numbers.  This
change breaks compatibility with the previous encoding (which was only
used in -current).

Fix truncation to (essentially) 16-bit dev_t in newnfs v3.

Any encoding of device numbers gives an ABI, so it can't be changed
without translations for compatibility.  Extra bits give the much
larger complication that the translations need to compress into fewer
bits.  Fortunately, more than 32 bits are rarely needed, so
compression is rarely needed except for 16-bit linux dev_t where it
was always needed but never done.

The previous encoding moved the major number into the top 32 bits.
Almost no translation code handled this, so the major number was blindly
truncated away in most 32-bit encodings.  E.g., for ffs, mknod(8) with
major = 1 and minor = 2 gave dev_t = 0x10000002; ffs cannot represent
this and blindly truncated it to 2.  But if this mknod was run on any
released version of FreeBSD, it gives dev_t = 0x102.  ffs can represent
this, but in the previous encoding it was not decoded, giving major = 0,
minor = 0x102.

The presence of bugs was most obvious for exporting dev_t's from an
old system to -current, since bugs in newnfs augment them.  I fixed
oldnfs to support 32-bit dev_t in 1996 (r16634), but this regressed
to 16-bit dev_t in newnfs, first to the old 16-bit encoding and then
further in -current.  E.g., old ad0 with major = 234, minor = 0x10002
had the correct (major, minor) number on the wire, but newnfs truncated
this to (234, 2) and then the previous encoding shifted the major
number into oblivion as seen by ffs or old applications.

I first tried to fix this by translating on every ABI/API boundary, but
there are too many boundaries and too many sloppy translations by blind
truncation.  So use the old encoding for the low 32 bits so that sloppy
translations work no worse than before provided the high 32 bits are
not set.  Add some error checking for when bits are lost.  Keep not
doing any error checking for translations for almost everything in
compat/linux.

compat/freebsd32/freebsd32_misc.c:
Optionally check for losing bits after possibly-truncating assignments as
before.

compat/linux/linux_stats.c:
Depend on the representation being compatible with Linux's (or just with
itself for local use) and spell some of the translations as assignments in
a macro that hides the details.

fs/nfsclient/nfs_clcomsubs.c:
Essentially the same fix as in 1996, except there is now no possible
truncation in makedev() itself.  Also fix nearby style bugs.

kern/vfs_syscalls.c:
As for freebsd32.  Also update the sysctl description to include file
numbers, and change it to describe device ids as device numbers.

sys/types.h:
Use inline functions (wrapped by macros) since the expressions are now
a bit too complicated for plain macros.  Describe the encoding and
some of the reasons for it.  16-bit compatibility didn't leave many
reasonable choices for the 32-bit encoding, and 32-bit compatibility
doesn't leave many reasonable choices for the 64-bit encoding.  My
choice is to put the 8 new minor bits in the low 8 bits of the top 32
bits.  This minimizes discontiguities.

Reviewed by:	kib (except for rewrite of the comment in linux_stats.c)
2018-06-13 12:22:00 +00:00
Andrew Turner 8b47c1ae54 Rename the ThunderX CPU identification macros to include the X. This is the
name people know the product by, and is consistent with the later SoC ID
macros.

Sponsored by:	DARPA, AFRL
2018-06-13 12:17:11 +00:00
Andrew Turner 0014ef8a04 Add more Cavium CPU part numbers.
While here split the lists by vendor.

Sponsored by:	DARPA, AFRL
2018-06-13 11:58:41 +00:00
Andrey V. Elsukov a5185adeb6 Rework if_gre(4) to use encap_lookup_t method to speedup lookup
of needed interface when many gre interfaces are present.

Remove rmlock from gre_softc, use epoch(9) and CK_LIST instead.
Move more AF-related code into AF-related locations. Use hash table to
speedup lookup of needed softc.
2018-06-13 11:11:33 +00:00
Ruslan Bukin b626c976dc Don't jump to VA space until kernel is ready.
This fixes the race when first core sets up the pagetables, while
secondary cores do translating the address of __riscv_boot_ap.

This now allows us to smpboot in QEMU with 8 cores just fine.

Sponsored by:	DARPA, AFRL
2018-06-13 10:32:21 +00:00
Bruce Evans 372639f944 Fix some bugs found while fixing the representation and translation
of 64-bit dev_t's (but not ones involving dev_t's).

st_size was supposed to be clamped in cvtstat() and linux's copy_stat(),
but the clamping code wasn't aware that st_size is signed, and also had
an obfuscated off-by-1 value for the unsigned limit, so its effect was
to produce a bizarre negative size instead of clamping.

Change freebsd32's copy_ostat() to be no worse than cvtstat().  It was
missing clamping and bzero()ing of padding.

Reviewed by:	kib (except a final fix of the clamp to the signed maximum)
2018-06-13 08:50:43 +00:00
Dimitry Andric 2b6fe1b2da Fix build of liquidio with base gcc on i386
Some casts from pointers to uint64_t and back in lio_main.c cause base
gcc on i386 to warn "cast from pointer to integer of different size",
and vice versa.  Add additional casts to uintptr_t to suppress these.

Reviewed by:	sbruno
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D15754
2018-06-13 07:55:57 +00:00
Marcelo Araujo ebc3c37c6f Add SPDX tags to vmm(4).
MFC after:	4 weeks.
Sponsored by:	iXsystems Inc.
2018-06-13 07:02:58 +00:00
Matt Macy 483305b99c Handle INP_FREED when looking up an inpcb
When hash table lookups are not serialized with in_pcbfree it will be
possible for callers to find an inpcb that has been marked free. We
need to check for this and return NULL.
2018-06-13 04:23:49 +00:00
Randall Stewart c9b4ac7587 This fixes missing VNET sets in the hpts system. Basically
without this and running vnets with a TCP stack that uses
some of the features is a recipe for panic (without this commit).

Reported by:	Larry Rosenman
Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D15757
2018-06-12 23:54:08 +00:00
Matt Macy 700e893c34 Defer inpcbport free in in_pcbremlists as well 2018-06-12 23:26:25 +00:00
Jung-uk Kim 6362b1a6b1 Fix number of auxargs entries to copy out for 32-bit Linuxulator.
PR:		228790
2018-06-12 22:54:48 +00:00
Rick Macklem 7e4595e82b Version bump since r334930 changed the interface between the NFS modules,
so they all need to be rebuilt.
2018-06-12 22:48:19 +00:00
Matt Macy f09ee4fc01 Defer inpcbport free until after a grace period has elapsed
This is a dependency for inpcbinfo rlock conversion to epoch
2018-06-12 22:18:27 +00:00
Matt Macy b872626dbe mechanical CK macro conversion of inpcbinfo lists
This is a dependency for converting the inpcbinfo hash and info rlocks
to epoch.
2018-06-12 22:18:20 +00:00