changes DELAY to use the TSC once it has been calibrated. This does NOT
use the TSC for long-term timekeeping. It only uses it to bound the
DELAY() spinloop. This should not be affected by the Athlon64 X2 TSC
quirks because the cpu is not halted while we use DELAY().
compilation of kernels without ns8250 support but using the uart framework.
These kernels will be for machines where size matters more, so including code
that can never be executed is undesriable...
this can happen under certain conditions when scanning. This logic
will eventually go away with the new scanning code.
While here de-inline the routine.
MFC after: 1 week
in the 802.11 layer: we send a directed probe request frame to the
current ap bmiss_max times (w/o answer) before scanning for a new ap.
MFC after: 2 weeks
working at all and only saw "nve0: device timeout (N)" messages.
- Setup PHY before handing control to NVidia API setting
speed, duplex, enabling interrupts, etc.
- Add restriction of MAXADDR_32BIT for high address to contigmalloc
to make the driver work on machines with 4+GB of memory.
PR: kern/85583, kern/88045
Tested by: scottl, others earlier version
MFC after: 10 days
- Implement cv_wait_unlock() method which has semantics compatible
with the sv_wait() method in IRIX. For cv_wait_unlock(), the lock
must be held before entering the function, but is not held when the
function is exited.
- Implement the existing cv_wait() function in terms of cv_wait_unlock().
Submitted by: kan
Feedback from: jhb, trhodes, Christoph Hellwig <hch at infradead dot org>
being hold by current thread or ignored by current process,
otherwise, it is very possible the thread will enter an infinite loop
and lead to an administrator's nightmare.
- number of read I/O requests,
- number of write I/O requests,
- number of read bytes,
- number of written bytes.
Add 'reset' subcommand for resetting statistics.
transmitted bits was between 8.6180us and 8.6200us when we used a RCLK
of 16.500MHz. This is a little low (should be 8.6805us). This error
is exactly the error one would expect if it actually had a 16.384MHz
watch oscillator (as suggested by garrett) instead of using the PCI
RCLK. Assume that the pci clock therefore wasn't really used, but
instead the cheap 16.384MH watch quartz oscillator. This gives bits
in the 8.6800us to 8.6810us ranage, which matches theoretical.
Submitted by: garrett
- Move PUSH_FRAME and POP_FRAME to asmacros.h and use PUSH_FRAME in
atpic entry points.
- Move PCPU_* asm macros out of the middle of the asm profiling macros.
- Pass IRQ vector argument as an int rather than void * to reduce diffs
with i386.
- EOI the lapic in C for the lapic timer handler.
- GC unused Xcpuast function.
- Split IPI_STOP handling code of ipi_nmi_handler() out into a
cpustop_handler() function and call it from Xcpustop rather than
duplicating all the logic in assembly.
- Fixup the list of symbols with interrupt frames in ddb traces.
Xatpic_fastintr* have never existed on amd64, and the lapic timer
handler and various IPI handlers were missing.
- Use trapframe instead of intrframe for interrupt entry points (on amd64
the interrupt vector was already a separate argument, so the two frames
were already identical) and GC intrframe.
Submitted by: peter (3)
cluster allocator, that wasn't MPSAFE. Instead, utilize our new generic
UMA jumbo cluster allocator. Since UMA gives us a 9k piece that is contigous
in virtual memory, but isn't contigous in physical memory we need to handle
a few segments. To deal with this we utilize Tigon chip feature - extended
RX descriptors, that can handle up to four DMA segments for one frame.
Details:
o Remove bge_alloc_jumbo_mem(), bge_free_jumbo_mem(),
bge_jalloc(), bge_jfree() functions.
o Remove SLIST heads, bge_jumbo_tag, bge_jumbo_map from softc.
o Use extended RX BDs for Jumbo receive producer ring, and
initialize it appropriately.
o New bge_newbuf_jumbo():
- Allocate an mbuf with Jumbo cluster with help of m_cljget().
- Load the cluster for DMA with help of bus_dmamap_load_mbuf_sg().
- Assert that we got 3 segments in the DMA mapping.
- Fill in these 3 segments into the extended RX descriptor.
2) rework link state detection code & use it in POLLING mode
3) fix 2 bugs in link state detection code:
a) driver unable to detect link loss on bcm5721
b) on bcm570x chips (tested on bcm5700 bcm5701 bcm5702) driver fails
to detect link loss with probability 1/6 (solved in brgphy.c)
Devices working in TBI mode should not be affected by this change.
Approved by: glebius (mentor)
MFC after: 1 month
4k clusters in addition to 9k and 16k ones.
struct mbuf *m_getjcl(int how, short type, int flags, int size)
void *m_cljget(struct mbuf *m, int how, int size)
m_getjcl() returns an mbuf with a cluster of the specified size attached
like m_getcl() does for 2k clusters.
m_cljget() is different from m_clget() as it can allocate clusters
without attaching them to an mbuf. In that case the return value
is the pointer to the cluster of the requested size. If an mbuf was
specified, it gets the cluster attached to it and the return value
can be safely ignored.
For size both take MCLBYTES, MJUM4BYTES, MJUM9BYTES, MJUM16BYTES.
Reviewed by: glebius
Tested by: glebius
Sponsored by: TCP/IP Optimization Fundraise 2005
"done" method so that for non-repeat operations we have completely
finished with the transfer by the time the callback is invoked.
This makes it possible to recycle a transfer from within the callback
routine for the same transfer. Previously this almost worked, but
with OHCI controllers calling the "done" method after the callback
would zero out some important fields needed by the recycled transfer.
Only some usb peripheral drivers such as ucom appear to rely on the
ability to reuse a transfer from its callback.
MFC after: 1 week
- Move vtophys() macros next to vtopte() where vtopte() exists to match
comments above vtopte().
- Remove references to the alternate address space in the comment above
vtopte(). amd64 never had the alternate address space, and i386 lost it
prior to PAE support being added.
- s/entires/entries/ in comments.
Reviewed by: alc
KTR_* class macros via genassym.c. Together with sys/sys/ktr.h
rev. 1.34 this has the desired side-effect of providing a default
value for KTR_COMPILE. Thus this fixes warnings from -Wundef
regarding KTR_COMPILE not being defined for .S files.
Requested by: ru
Reviewed by: ru
ktr_tracepoint() and the macros using it. This allows this header
to be included in .S files for obtaining the KTR_* class macros
directly and providing a default value for KTR_COMPILE in case it's
not specified in the kernel config file including defining it to 0
when not using 'options KTR' at all.
Requested by: ru
Reviewed by: ru
MACHINE_ARCH and MACHINE). Their purpose was to be able to test
in cpp(1), but cpp(1) only understands integer type expressions.
Using such unsupported expressions introduced a number of subtle
bugs, which were discovered by compiling with -Wundef.
_MACHINE == i386 test always succeeds, even on non-i386 (both
sides of expressions become 0). Remove the comment since
_MACHINE and _MACHINE_ARCH are going away.
of the radix lookup tables. Since several rnh_lookup() can run in
parallel on the same table, we can piggyback on the shared locking
provided by ipfw(4).
However, the single entry cache in the ip_fw_table can't be used lockless,
so it is removed. This pessimizes two cases: processing of bursts of similar
packets and matching one packet against the same table several times during
one ipfw_chk() lookup. To optimize the processing of similar packet bursts
administrator should use stateful firewall. To optimize the second problem
a solution will be provided soon.
Details:
o Since we piggyback on the ipfw(4) locking, and the latter is per-chain,
the tables are moved from the global declaration to the
struct ip_fw_chain.
o The struct ip_fw_table is shrunk to one entry and thus vanished.
o All table manipulating functions are extended to accept the struct
ip_fw_chain * argument.
o All table modifing functions use IPFW_WLOCK_ASSERT().
(suggested by alfred@)
o Reuse si_band field in struct __siginfo, add a mqd member which will
be used by mqueue.
o Add code SI_KERNEL to indicate a signal is queued by kernel.
Use the following kernel configuration option to enable:
options BPF_JITTER
If you want to use bpf_filter() instead (e. g., debugging), do:
sysctl net.bpf.jitter.enable=0
to turn it off.
Currently BIOCSETWF and bpf_mtap2() are unsupported, and bpf_mtap() is
partially supported because 1) no need, 2) avoid expensive m_copydata(9).
Obtained from: WinPcap 3.1 (for i386)
time ago appears to be based not on the typical 1.8432MHz clock, or
the other more typical multiple of 8 of this (14.7456MHz), but instead
it appears to be 1/2 the PCI clock rate or 16.50000MHz. I'm not 100%
sure that this is right, but since I did the original entry, I'm going
to go ahead and modify it. With the 14.7456MHz value, I was getting
bits that were ~7.3us instead of ~8.6us like they are supposed to be.
My measuring gear for today is a stupid handheld scope with two
signficant digits. So I don't know if it is 33.000000/2 MHz or some
other value close to 16.5MHz, but 16.5MHz works well enough for me to
use a couple of different devices at 115200 baud, and is a nice even
multiple of a well known clock frequency...
rather than embedding it in the intrframe as if_vec. This reduces diffs
with amd64 somewhat.
- Remove cf_vec from clockframe (it wasn't used anyway) and stop pushing
dummy vector arguments for ipi_bitmap_handler() and lapic_handle_timer()
since clockframe == trapframe now.
- Fix ddb to handle stack traces across interrupt entry points that just
have a trapframe on their stack and not a trapframe + vector.
- Change intr_execute_handlers() to take a trapframe rather than an
intrframe pointer.
- Change lapic_handle_intr() and atpic_handle_intr() to take a vector and
trapframe rather than an intrframe.
- GC struct intrframe now that nothing uses it anymore.
- GC CLOCK_TO_TRAPFRAME() and INTR_TO_TRAPFRAME().
Reviewed by: bde
Requested by: peter
ipi_nmi_handler() and into a new cpustop_handler() function. Change
the Xcpustop IPI_STOP handler to call this function instead of
duplicating all the same logic in assembly.
- EOI the local APIC for the lapic timer interrupt in C rather than
assembly.
- Bump the lazypmap IPI counter if COUNT_IPIS is defined in C rather than
assembly.
commit to atpic.c) there may not be an IRQ 13. Instead, just keep going.
If the INT16 interface doesn't work then we will eventually panic anyway.
FWIW: We could probably just axe the support for IRQ 13 altogether at this
point. The only thing we'd lose support for are 486sx systems with
external 487 FPUs.
MFC after: 1 week
working IRQ0 with APIC anymore. Previously, it was possible to have
some other ATPIC IRQS "leak" through in a few edge cases. For example, on
my x86 test machine, ACPI re-routes the SCI (IRQ 9) to intpin 13 on the
first I/O APIC. This leaves a hole for IRQ 13 (since the APIC doesn't
provide a source for IRQ 13 in that case) with the result that the ATPIC
IRQ13 source was registered instead. This changes the 8259A drivers to
only register their interrupt sources if none of the 16 ISA IRQs have an
interrupt source already installed.
MFC after: 1 week
- Add a new SET_KERNEL_SREGS macro that sets up %ds and %es to point to
kernel data and %fs to point to per-CPU data and use the new macro
in several kernel entry points including trap and interrupt handlers.
- Convert the IPI_STOP handler Xcpustop to push a standard trap frame
rather than an application frame.
- Make the TRAP() macro private to exception.s since it is only used
there.
- Move the PCPU_*() macros in asmacros.h out of the middle of the
profiling macros.
Reviewed by: bde
Requested by: bde (4, 5)
acquired anywhere in the driver now.
- Axe the spin mutex used for the nve_oslock*() routines. The driver lock
already provides sufficient synchronization.
- Don't mess around with IFF_UP when the link state changes. IFF_UP is
an administrative flag, not a link status indicator.
MFC after: 1 week
lock object (and thus off of each mutex and sx lock):
- Rename the all_locks list to pending_locks and only put locks initialized
before SI_SUB_WITNESS on the list so that the SI_SUB_WITNESS can add them
to witness once it starts up.
- Now that pending_locks is only used during early startup, change it from
a TAILQ to an STAILQ. This removes a pointer from the STAILQ_ENTRY in
struct lock_object.
- Since the pending_locks list is only used during the single-threaded
early boot it no longer needs to be protected by a mutex, so remove
all_mtx.
- Since the lo_list member of struct lock_object is now only used during
early boot before witness is running, collapse lo_list and lo_witness
into a union. This shaves the second pointer off of struct lock_object.
- Axe lock_cur_cnt and lock_max_cnt.
With these changes, struct mtx shrinks from 36 to 28 bytes on 32-bit
platforms and from 72 to 56 bytes on 64-bit platforms. Note that this
commit will completely and utterly destroy the kernel ABI, so no MFC.
Tested on: alpha, amd64, i386, sparc64
immediately from acpi_pci_link_route_interrupt() since we aren't going
to have a valid pci_link device to talk to try to route interrupts. This
fixes a page fault if you disable just pci_link. Note that trying to use
ACPI without pci_link is probably not advised however.
MFC after: 1 week
Tested by: Eugene Grosbein eugen at kuzbass dot ru
since mount_smbfs(8) assumed long name mounting by default unless "-n long"
was explicitly specified.
Rather than supplying a "long" option in mount_smbfs(8), this commit brings
back the original behaviour by associating SMBFS_MOUNT_NO_LONG with the
"nolong" option. This should fix the broken long file names on smbfs people
observed recently.
Reported by: Vladimir Grebenschikov <vova at fbsd dot ru>
Reviewed by: phk
Tested by: Slawa Olhovchenkov <slw at zxy dot spb dot ru>
the eaddr array (introduced in rev. 1.174) prior to writing to it. As
dc_read_eeprom() is told to write only 3 16-bit words to eaddr but eaddr
in fact is somewhat larger removal of the zeroing defeated the check
whether the MAC address is all zero as there can be some random garbage
in eaddr past the 3 words written to it and the check verifys all bits
in eaddr. Solve this by changing the check to verify only the 3 words
(happenning to be ETHER_ADDR_LEN bytes) written to eaddr.
- While here change the notation of "FCode" in a nearby comment to the
official way.
Ok'ed by: marcel, ru
o plug memory leak in adhoc mode: on rx the sender may be the
current master so simply checking against ic_bss is not enough
to identify if the packet comes from an unknown sender; must
also check the mac address
o split neighbor node creation into two routines and fillin state
of nodes faked up on xmit when a beacon or probe response frame
is later received; this ensures important state like the rate set
and advertised capabilities are correct
Obtained from: netbsd
MFC after: 1 week
polarity. Some machines route PCI IRQs to an ISA IRQ but fail to include
an interrupt override entry to set the polarity and trigger of the given
ISA IRQ in their MADT table.
PR: usb/74989
Reported by: Julien Gabel jpeg at thilelli dot net
MFC after: 1 week
from sys/sparc64/include/ofw_upa.h to sys/sparc64/pci/ofw_pci.h and
rename them to struct ofw_pci_ranges and OFW_PCI_RANGE_* respectively.
This ranges struct only applies to host-PCI bridges but no to other
bridges found on UPA. At the same time it applies to all host-PCI
bridges regardless of whether the interconnection bus is Fireplane/
Safari, JBus or UPA.
- While here rename the PCI_CS_* macros in sys/sparc64/pci/ofw_pci.h
to OFW_PCI_CS_* in order to be consistent and change this header to
use uintXX_t instead of u_intXX_t.
the bridge (PCI bus A or B) we are attaching to rather than registering
both handlers at once when attaching to the first half we encounter.
This is a bit cleaner as it corresponds to which PCI bus error interrupt
actually is assigned to the respective half by the OFW and allows to
collapse both PCI bus error interrupt handlers into one function easily.
- Use the actual RID of the respective interrupt resource as index into
sc_irq_res and also use it when allocating the resource. For now this
is a bit cleaner and will be mandatory later on.
- According to OpenSolaris the spare hardware interrupt is used as the
over-temperature interrupt in systems with Psycho bridges. Unlike as
with the SBus-based workstations I didn't manage to trigger it when
covering the fan outlets of an U60 but better be safe than sorry and
register a handler anyway.
MFC after: 1 month
bug by explaining what the problem is and how the workaround works.
- Fix some cosmetics nits, mainly properly terminate sentences in comments,
which I missed when backporting the style changes to psycho(4) in psycho.c
rev. 1.54 due to lack of corresponding code.
- The "USIIe version of the Sabre bridge" actually is termed "Hummingbird";
name it as such in comments and messages.
and some fixes from Motomichi Matsuzaki. Testing involved many people, but the
final, successful testing was from rwatson who endured several rounds of "it
crashes at XYZ stage" "oh, please correct this typo and try again." The Linux
driver, and to a small extent the limited specs, were both used as a reference
for how to program the chipset.
PR: kern/80396
Submitted by: Martin Mersberger
the base rcorder. This is accomplished by running rcorder twice,
first to get all the disks mounted (through mountcritremote),
then again to include the local_startup directories.
This dramatically changes the behavior of rc.d/localpkg, as
all "local" scripts that have the new rc.d semantics are now
run in the base rcorder, so only scripts that have not been
converted yet will run in rc.d/localpkg.
Make a similar change in rc.shutdown, and add some functions in
rc.subr to support these changes.
Bump __FreeBSD_version to reflect this change.
revision 1.179 to correctly set/clear execute permission on the mapping
it creates. Thus, mmap(2)ing a memory resident file will not result in
the file being mapped with execute permission when execute permission was
not requested.
Eliminate an unneeded Instruction Memory Barrier (IMB) in
pmap_enter_quick(). Since there was no previous (instruction) mapping
for the given virtual address prior to pmap_enter_quick(), there can be
no instructions from the given virtual address in the pipeline that need
flushing.
MFC after: 1 week
Update Intel MatrixRAID support to be able to pick up RAID0+1 (RAID10)
and RAID5 arrays without panic'ing.
This has the side effect of now also supporting multiple volumes on
MatrixRAID's now I have the metadata better understood..
HW sponsored by: Mullet Scandinavia AB
commit. Copy the ethernet address into a local buffer, which we know
is sufficiently aligned for the width of the memory accesses that we
do. This also eliminates all suspicious and potentionally harmful
casts.
In collaboration with: ru
plain file bsdlabel(8) always writes label at a fixed offset from
its beginning (512 bytes), regardless of the sector size. At the same
time, bsdlabel geom class expects label to be available at the very
beginning of the second sector.
As a result, images prepared in userland for media with sector size
different from 512 bytes (i.e. 2k for cdroms) are not recognized by
the tasting mechanism.
Solve the problem by always looking for the label at 512-byte offset
if we can't find it at the beginning of the second sector and sector
size is not 512 bytes.
o The only indication of error condition is NULL value returned by
the function;
o value pointed to by error argument is undefined in the case when
operation completes successfully.
Discussed with: phk
only now) symbolic links in the kernel compile directory, rather
than relying on config(8) to do this. (The changes to config(8)
will be committed separately.) This is aimed towards making the
config(8) as lightweight as possible.
Idea by: bde (all bugs are mine)
sosend(). Robert accidentally changed the snderr() macro to jump to the
out label which assumes the lock is already released rather than the
release label which drops the lock in his previous change to sosend().
This should fix the recent panics about returning from write(2) with the
socket lock held and the most recent LOR on current@.
The sys/sys/stddef.h is here for some time now to fulfil the
kernel needs. It also was not reliable due to the exists(@)
check: in an empty module directory, "make depend; mv .depend
.depend~; make depend" ran mkdep(1) with different arguments.
o Do not use ipfw_insn_pipe->pipe_ptr in locate_flowset(). The
_ipfw_insn_pipe isn't touched by this commit to preserve ABI
compatibility.
o To optimize the lookup of the pipe/flowset in locate_flowset()
introduce hashes for pipes and queues:
- To preserve ABI compatibility utilize the place of global list
pointer for SLIST_ENTRY.
- Introduce locate_flowset(queue nr) and locate_pipe(pipe nr).
o Rework all the dummynet code to deal with the hashes, not global
lists. Also did some style(9) changes in the code blocks that were
touched by this sweep:
- Be conservative about flowset and pipe variable names on stack,
use "fs" and "pipe" everywhere.
- Cleanup whitespaces.
- Sort variables.
- Give variables more meaningful names.
- Uppercase and dots in comments.
- ENOMEM when malloc(9) failed.
- S3 Savage driver ported.
- Added support for ATI_fragment_shader registers for r200.
- Improved r300 support, needed for latest r300 DRI driver.
- (possibly) r300 PCIE support, needs X.Org server from CVS.
- Added support for PCI Matrox cards.
- Software fallbacks fixed for Rage 128, which used to render badly or hang.
- Some issues reported by WITNESS are fixed.
- i915 module Makefile added, as the driver may now be working, but is untested.
- Added scripts for copying and preprocessing DRM CVS for inclusion in the
kernel. Thanks to Daniel Stone for getting me started on that.
rare case of a stray interrupt to an unregistered source (such as a stray
interrupt from the 8259As when using APIC), this could result in a page
fault when it tried to walk the list of interrupt handlers to execute
INTR_FAST handlers. This bug was introduced with the intr_event changes,
so it's not present in 5.x or 6.x.
Submitted by: Mark Tinguely tinguely at casselton dot net
process as over the limit when its time is >= to the limit rather than >
the limit. Technically, if p->p_rux.rux_runtime.sec == p->p_pcpulimit
and p->p_rux.rux_runtime.frac == 0, the process hasn't exceeded the limit
yet. However, having the fraction exactly equal to 0 is rather rare, and
it is not worth the overhead to handle that edge case. With just the >
comparison, the process would have to exceed its limit by almost a second
before it was killed.
PR: kern/83192
Submitted by: Maciej Zawadzinski mzawadzinski at gmail dot com
Reviewed by: bde
MFC after: 1 week
chains and copying in mbufs from the body of the send logic, creating
a new function sosend_copyin(). This changes makes sosend() almost
readable, and will allow the same logic to be used by tailored socket
send routines.
MFC after: 1 month
Reviewed by: andre, glebius
appeared to rely on all kinds of non-guaranteed behaviours: the
transfer abort code assumed that TDs with no interrupt timeout
configured would end up on the done queue within 20ms, the done
queue processing assumed that all TDs from a transfer would appear
at the same time, and there were access-after-free bugs triggered
on failed transfers.
Attempt to fix these problems by the following changes:
- Use a maximum (6-frame) interrupt delay instead of no interrupt
delay to ensure that the 20ms wait in ohci_abort_xfer() is enough
for the TDs to have been taken off the hardware done queue.
- Defer cancellation of timeouts and freeing of TDs until we either
hit an error or reach the final TD.
- Remove TDs from the done queue before freeing them so that it
is safe to continue traversing the done queue.
This appears to fix a hang that was reproducable with revision 1.67
or 1.68 of ulpt.c (earlier revisions had a different transfer
pattern). With certain HP printers, the command "true > /dev/ulpt0"
would cause ohci_add_done() to spin because the done queue had a
loop. The list corruption was caused by a 3-TD transfer where the
first TD completed but remained on the internal host controller
done queue because it had no interrupt timeout. When the transfer
timed out, the TD got freed and reused, so it caused a loop in the
done queue when it was inserted a second time from a different
transfer.
Reported by: Alex Pivovarov
MFC after: 1 week
application wishes to request high precision time stamps be returned:
Alias Existing
CLOCK_REALTIME_PRECISE CLOCK_REALTIME
CLOCK_MONOTONIC_PRECISE CLOCK_MONOTONIC
CLOCK_UPTIME_PRECISE CLOCK_UPTIME
Add experimental low-precision clockid_t names corresponding to these
clocks, but implemented using cached timestamps in kernel rather than
a full time counter query. This offers a minimum update rate of 1/HZ,
but in practice will often be more frequent due to the frequency of
time stamping in the kernel:
New clockid_t name Approximates existing clockid_t
CLOCK_REALTIME_FAST CLOCK_REALTIME
CLOCK_MONOTONIC_FAST CLOCK_MONOTONIC
CLOCK_UPTIME_FAST CLOCK_UPTIME
Add one additional new clockid_t, CLOCK_SECOND, which returns the
current second without performing a full time counter query or cache
lookup overhead to make sure the cached timestamp is stable. This is
intended to support very low granularity consumers, such as time(3).
The names, visibility, and implementation of the above are subject
to change, and will not be MFC'd any time soon. The goal is to
expose lower quality time measurement to applications willing to
sacrifice accuracy in performance critical paths, such as when taking
time stamps for the purpose of rescheduling select() and poll()
timeouts. Future changes might include retrofitting the time counter
infrastructure to allow the "fast" time query mechanisms to use a
different time counter, rather than a cached time counter (i.e.,
TSC).
NOTE: With different underlying time mechanisms exposed, using
different time query mechanisms in the same application may result in
relative non-monoticity or the appearance of clock stalling for a
single clockid_t, as a cached time stamp queried after a precision
time stamp lookup may be "before" the time returned by the earlier
live time counter query.
This shouldn't happen as far as the self-id buffer is vaild but
some people have this problem.
PR: kern/83999
Submitted by: Markus Wild <fbsd-lists@dudes.ch>
MFC after: 3 days
- In ifc_name2unit(), disallow leading zeroes in a unit.
Exploit: ifconfig lo01 create
- In ifc_name2unit(), properly handle overflows. Otherwise,
either of two local panic()'s can occur, either because
no interface with such a name could be found after it was
successfully created, or because the code will bogusly
assume that it's a wildcard (unit < 0 due to overflow).
Exploit: ifconfig lo<overflowed_integer> create
- Previous revision made the following sequence trigger
a KASSERT() failure in queue(3):
Exploit: ifconfig lo0 destroy; ifconfig lo0 destroy
This is because IFC_IFLIST_REMOVE() is always called
before ifc->ifc_destroy() has been run, not accounting
for the fact that the latter can fail and leave the
interface operating (like is the case for "lo0").
So we ended up calling LIST_REMOVE() twice. We cannot
defer IFC_IFLIST_REMOVE() until after a call to
ifc->ifc_destroy() because the ifnet may have been
removed and its memory has been freed, so recover from
this by re-inserting the ifnet in the cloned interfaces
list if ifc->ifc_destroy() indicates a failure.