A few of these omitted a shift of 0, but this is more consistent.
Reviewed by: chuck
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43602
The current macro always builds a full mask for a named field, so use
the M suffix for mask.
Reviewed by: chuck, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43601
This is a preparation for adding dtrace hooks in a follow-up commit,
which are missing in the code path, where packets are directly queued
to the tcpcb. The dtrace hooks expect the fields to be in host byte
order. This only applies when TCP HPTS is used.
No functional change intended.
Reviewed by: rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D43594
hash_add_rcu asserted that the node's prev pointer was NULL in an
attempt to detect addition of a node already on a list, but the caller
is not required to provide a zeroed node.
Reported in https://github.com/freebsd/drm-kmod/issues/282
Reviewed by: bz, manu
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43645
When we route-to the state should be bound to the route-to interface,
not the default route interface. However, we should only do so for
outbound traffic, because inbound traffic should bind on the arriving
interface, not the one we eventually transmit on.
Explicitly check for this in BOUND_IFACE().
We must also extend pf_find_state(), because subsequent packets within
the established state will attempt to match the original interface, not
the route-to interface.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43589
pfil hooks (i.e. firewalls) may pass, modify or free the mbuf passed
to them. (E.g. when rejecting a packet, or when gathering up packets
for reassembly).
If the hook returns PFIL_PASS the mbuf must still be present. Assert
this in pfil_mem_common() and ensure that ipfilter follows this
convention. pf and ipfw already did.
Similarly, if the hook returns PFIL_DROPPED or PFIL_CONSUMED the mbuf
must have been freed (or now be owned by the firewall for further
processing, like packet scheduling or reassembly).
This allows us to remove a few extraneous NULL checks.
Suggested by: tuexen
Reviewed by: tuexen, zlei
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43617
The RFC6675 pipe calculation (sack.revised, enabled
by default since D28702), uses outdated information,
while the previous default calculated it correctly
with up-to-date information from the incoming ACK.
This difference can become as large as the receive
window (not the congestion window previously),
potentially triggering a massive burst of new packets.
MFC after: 1 week
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43520
If the header that we add to the packet's size is 0 % 4 and we're
strictly aligning, then we need to adjust where we store the header so
the packet that follows will have it's struct ip header properly
aligned. We do this on allocation (and when we check the length of the
mbufs in the lro_nomrg case). We can't just adjust the clustersz in the
softc, because it's also used to allocate the mbufs and it needs to be
the proper size for that. Since we otherwise use the size of the mbuf
(or sometimes the smaller size of the received packet) to compute how
much we can buffer, this ensures no overflows. The 2 byte adjustment
also does not affect how many packets we can receive in the lro_nomrg
case.
PR: 271288
Sponsored by: Netflix
Reviewed by: bryanv
Differential Revision: https://reviews.freebsd.org/D43224
Let the user choose a period (interrupt cadence in samples), in the
official RME drivers this setting is available as "Buffer Size".
Override the period propagated through blocksize by pcm channel latency
settings (see sound(4)), since these are unreliable and differ between
playback and recording channels.
Differential Revision: https://reviews.freebsd.org/D43527
The rest of the sound system supports 7+1 maximum and is not aware of 8+0.
I believe that these messages are caused by 8+0:
kernel: feeder_init(0xfffff801f935d680) on feeder_matrix returned 22
kernel: pcm0: feeder_build_matrix(): can't add feeder_matrix
This should disable allocation of the second per-CPU principal buffer
which is never used. This will also enable additional asserts
for buffers that are never switched.
For example, shutdown_panic wants to produce some output and maybe take
some input before a system is actually reset.
The change should only make difference for the case of system reset
(reboot), poweroff and halt should not be affected.
The change makes difference only if hw.acpi.handle_reboot is set. It
used to default to zero until r213755 / ac731af567.
First of all and unlike I2C, it's not the master that dictates how many
bytes to read in block read operation. It's the device that informs the
master how many bytes it's sending back.
Thus, for ichsmb_bread() the count parameter is purely an output
parameter. The code has been changed to reflect that.
The sanity checking of the response length is now done once it (the
first byte of the response) is received.
While here, handling of ICH_HST_STA_FAILED status bit has been added.
Plus some code style improvements and some new code comments in the
vicinity of the changed code.
That is, wd_shutdown_countdown value is ignored when halting.
A halted system should remain halted for as long as needed until
a power cycle, so the watchdog should not reset the system.
Power off or reset may be activated either by low or high signal or by an
edge. So, try everything.
Also, the driver now supports DTS properties for timings.
Finally, the driver does not change the pin configuration during attach.
It is assumed that the pin is already in a state that does not trigger
the power event (otherwise we wouldn't be running).
Fall-through to non-FDT probe code if no matching device node is found.
While here, fix indentation of the switch statement.
Also, make the device description for the hints-based attachment the
same as for FDT attachment.
Fixes: 2486b446db ds1307: add support for the EPSON RX-8035SA I2C RTC
On several systems we've noticed that when NTB link goes down, the
Physical Layer User Test Pattern registers we use as additional
scratchpad registers (that is explicitly allowed by the chip specs)
become read-only for about 100us. I see no explanation for this in
the chip specs, neither why it was not seen before, may be a race.
Since we do need these registers, workaround it by repeating writes
until we succeed or 1ms timeout expire.
MFC after: 1 week
A commit from 2012 (5d7380f8e3, r228424) introduced
'td_stopsched', on the ground that a global variable would cause all
CPUs to have a copy of it in their cache, and consequently of all other
variables sharing the same cache line.
This is really a problem only if that cache line sees relatively
frequent modifications. This was unlikely to be the case back then
because nearby variables are almost never modified as well. In any
case, today we have a new tool at our disposal to ensure that this
variable goes into a read-mostly section containing frequently-accessed
variables ('__read_frequently'). Most of the cache lines covering this
section are likely to always be in every CPU cache. This makes the
second reason stated in the commit message (ensuring the field is in the
same cache line as some lock-related fields, since these are accessed in
close proximity) moot, as well as the second order effect of requiring
an additional line to be present in the cache (the one containing the
new 'scheduler_stopped' boolean, see below).
From a pure logical point of view, whether the scheduler is stopped is
a global state and is certainly not a per-thread quality.
Consequently, remove 'td_stopsched', which immediately frees a byte in
'struct thread'. Currently, the latter's size (and layout) stays
unchanged, but some of the later re-orderings will probably benefit from
this removal. Available bytes at the original position for
'td_stopsched' have been made explicit with the addition of the
'_td_pad0' member.
Store the global state in the new 'scheduler_stopped' boolean, which is
annotated with '__read_frequently'.
Replace uses of SCHEDULER_STOPPED_TD() with SCHEDULER_STOPPER() and
remove the former as it is now unnecessary.
Reviewed by: markj, kib
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43572
It's not an assertion, so doesn't logically belong to 'kassert.h'.
Moreover, a subsequent commit will make it rely on a variable whose
declaration also belongs to 'systm.h'.
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43571
While here, put such annotation after the variable for 'dumping', since
it concerns the variable and not the type.
Reviewed by: markj
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43570
Currently, no performance-critical path tests for a panic. Moreover, we
today have KERNEL_PANICKED() which wraps the test into
__predict_false(), already catering to those (potential) use cases.
Also, in practice we don't support 64-bit architectures without caches,
so reading an 'int' instead of a pointer doesn't (directly) save any
memory access. Finally, 'panicked' is redundant with 'panicstr' (and
wastes a tiny amount of memory).
Consequently:
1. Use again 'panicstr' as a flag indicating that the system is
panicking. To this end:
- Modify panic() so that it ensures this pointer is set to some
non-NULL value even if the caller didn't pass any panic string.
- Modify KERNEL_PANICKED() to test for 'panicstr'.
- Remove 'panicked'.
2. Annotate 'panicstr' with '__read_mostly' (instead of using
'__read_frequently' as for 'panicked'). This may have to be changed if,
in the future, some performance-intensive path needs to test it.
3. Convert a few more direct tests of 'panicstr' to using
KERNEL_PANICKED().
Reviewed by: kib, markj, emaste
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43569
pmap_san_bootstrap() doesn't really do much, and it was hard-coding the
the bootstrap stack size defined in locore.S. Moreover, the name is a
bit confusing given the existence of pmap_bootstrap_san(). Just remove
it and call kasan_init_early() directly like we do on amd64. It will
not be used by KMSAN in a forthcoming patch series.
No functional change intended.
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43403
Use SEQ_SUB instead of a plain subtraction, for an implict
type conversion and prevention of a possible overflow.
Use curly brackets in stacked if statements throughout.
Use of the ? operator to enhance readability when clearing
the FIN flag in tcp_output().
None of the above change the function.
Reviewed By: tuexen, cc, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43539
Currently there are a few output sections left as implicitly using
:kernel :dynamic before :kernel on its own is used again, which means
they end up in both the PT_LOAD and the PT_DYNAMIC segments, an unusual
situation which the new libelf-based kldxref initially treated as
invalid. Thus, hoist the :kernel to the very next section to ensure only
.dynamic is in PT_DYNAMIC, as is more normal.
Whilst here, sync ldscript.powerpc64le with ldscript.powerpc64 to pick
up various fixes that were presumably made between the start of the
powerpc64le port and it being committed and got missed.
Reviewed by: jhibbits, jhb
Differential Revision: https://reviews.freebsd.org/D43066
Shifting bits is quicker than checking header flag bits
one by one. Also improve readability by the use of switch
statements.
No change in behaviour.
Reviewed By: glebius, tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43560
Account for SACK retransmitted bytes once the actual length
is known. This prevents a call to tcp_maxseg() and prepares
for TSO support when transmitting from the SACK scoreboard.
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43447
When booting a KMSAN kernel on an Ampere Altra, I've seen some boot time
hangs when the XHCI controller driver attempts to allocate memory for
32-bit DMA. The system boots fine with a GENERIC kernel; I believe that
the additional memory requirements of KMSAN push it over the edge. The
system has a bit less than 2GB of RAM below the 4GB boundary.
Allocate a new freelist to segregate memory below 4GB, as we do on
amd64, so that such memory allocation failures are less likely to occur.
Reviewed by: alc
MFC after: 1 month
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43503
Introduce a verbose output mode to pflowctl, and expose the status of
the socket to userspace. This can be helpful in debugging configuration
errors.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Currently the local interrupt controller implementation is based on
pre-INTRNG arm/arm64 code, using hand-rolled event code rather than
INTRNG. This then interacts weirdly with the PLIC, and other future
interrupt controllers like the APLIC and IMSICs in the upcoming AIA
specification, since they become the root PIC despite not being the
logical root. Instead, use a real newbus device for it and register
it as the root PIC.
This also adapts the IPI code to make use of the newly-added INTRNG
generic IPI handling framework, adding a new sbi_ipi as the PIC. In
future there will be alternative devices for sending IPIs that will
register with higher priorities, such as the proposed AIA IMSIC and
ACLINT SSWI.
Reviewed by: mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35901
This approach is based on the Arm PSCI driver, though that makes more
extensive use of its softc than we do here. This will be used to extract
the SBI IPI code as a real PIC.
Reviewed by: mhorne, imp
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35900
On RISC-V, the root PIC (whether the PLIC or, as will be the case in
future, the local interrupt controller) cannot send IPIs, relying on
another means to trigger the necessary software interrupts (firmware
calls), but there are upcoming standard devices that will be able to
inject them, so we can't just put the firmware calls in the root PIC
driver.
Thus, split out a new intr_ipi_dev from intr_irq_root_dev to use for
sending IPIs. New devices can be registered with a given priority up
until the first IPI is set up, when the best device seen so far gets
frozen as the IPI device to use.
Reviewed by: mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35899
The arm and arm64 implementations of dispatching IPIs via PIC_IPI_SEND
are almost identical, and entirely MI with the lone exception of a
single store barrier on arm64 (that is likely either redundant or needed
on arm too). Thus, de-duplicate this code by moving it to INTRNG as a
generic IPI glue framework. The ipi_* functions remain declared in MD
smp.h headers and implemented in MD code, but are trivial wrappers
around intr_ipi_send that could be made MI, at least for INTRNG ports,
at a later date.
Note that, whilst both arm and arm64 had an ii_send member in intr_ipi
to abstract over how to send interrupts,, they were always ultimately
using PIC_IPI_SEND, and so this complexity has been removed. A follow-up
commit will re-introduce the same flexibility by instead allowing a
device other than the root PIC to be registered as the IPI sender.
As part of this, strengthen a MAXCPU assertion that was missed in commit
2f0b059eea ("intrng: switch from MAXCPU to mp_ncpus") (which itself is
mis-titled).
Reviewed by: mmel, mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35898
The static irq_root_ipicount variable is only ever written to (with the
value passed to irq_root_ipicount), never read. Moreover, the bcm2836
driver, as used by the Raspberry Pi 2B and 3A/B (but not 4, which uses a
GIC-400, though does have the legacy interrupt controller present too)
passes 0 as ipicount, despite implementing IPIs. It's thus inaccurate
and serves no purpose, so should be removed.
Reviewed by: mmel, imp, mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35897
`ti->ti_begin` is actually the offset within the first block that is
unread, so we must use that for our lower bound.
Moving to the previous block has to be done at the end of the loop in
order to correctly handle the case of ti_begin == TTYINQ_DATASIZE. At
that point, lastblock is still the last one with data written and the
next write into the queue would advance lastblock. If we move to the
previous block at the beginning, then we're essentially off by one block
for the entire scan and run the risk of running off the end of the block
queue.
The ti_begin == 0 case is still handled correctly, as we skip the loop
entirely and the linestart gets recorded as the first byte available for
writing. The bit after the loop about moving to the next block is also
still correct, even with both previous fixes in mind: we skipped moving
to the previous block if we hit ti_begin, and `off + 1` would in-fact be
a member of the next block from where we're reading if it falls on a
block boundary.
Reported by: dim
Fixes: 522083ffbd ("kern: tty: recanonicalize the buffer on [...]")