An mbuf packet chain with the M_PROMISC flag set contains a unicast packet
received by the link layer, which does not correspond to any configured
link layer address in the local system.
It is copied when copying m_pkthdr. It is not cleared when crossing layers.
As such, it is defined to have a flag value which is outside of the
M_PROTO* range, like M_VLANTAG has.
Reviewed by: andre
Obtained from: NetBSD
been set at the socket layer, in our somewhat convoluted IPv4 source
selection logic in ip_output().
IP_ONESBCAST is actually a special case of SO_DONTROUTE, as 255.255.255.255
must always be delivered on a local link with a TTL of 1.
If IP_ONESBCAST has been set at the socket layer, also perform destination
interface lookup for point-to-point interfaces based on the destination
address of the link; previously it was not possible to use the option with
such interfaces; also, the destination/broadcast address fields map to the
same field within struct ifnet, which doesn't help matters.
One more valid fix going forward for these issues is to treat 255.255.255.255
as a destination in its own right in the forwarding trie. Other
implementations do this. It fits with the use of multiple paths, though
it then becomes necessary to specify interface preference.
This hack will eventually go away when that comes to pass.
Reviewed by: andre
MFC after: 1 week
and optimize away unused stack values. The 48 bytes that the lock_profile_object
adds to the stack evidently has a measurable performance impact on certain workloads.
uipc_send in cases where only a global read lock is held by breaking
them out and avoiding the unpcb lock acquire in the common case. This
avoids deadlocks which manifested with X11, and should also marginally
further improve performance.
Reported by: sepotvin, brooks
Add macro EVL_APPLY_VLID() which may be used to apply an 802.1q VLAN ID
to the M_VLANTAG field in an mbuf packet header non-destructively.
This will be used by net80211 to begin with.
Add macro EVL_APPLY_PRI() which may be used to apply an 802.1p priority
class to the M_VLANTAG field in an mbuf packet header non-destructively.
Add other macros for manipulating tags and the CFI bit.
Submitted by: Boris Kovalenko (EVL_CFIOFTAG(), EVL_MAKETAG())
to a READ_CAPACITY request rather than the maximum sector (off by one
problem). This causes a huge cascade of errors as the geom tasting
code tries to read the last sector (which isn't really there in the
face of this error). automated tools that manipulate disk labels and
such also have issues.
Create a new quirk READ_CAPACITY_OFFBY1 and add a quirk for the
SanDISK ImageMate that I have that suffers from this problem (the
SDDR-31). It intercepts the READ_CAPACITY response and adjusts it
from number of sectors to max sector for devices with this quirk.
Reading the Linux source suggests that there are a host of
other devices with this issue, including iPods and some popular
cameras. I've not added quirks for them, since I don't have the
devices in front of me to test.
it is initialized; use path instead.
This change fixes a panic when using atapicam in conjunction with CAMDEBUG,
which has been described under kern/103602.
Thanks to Josh Carroll <josh.carroll@gmail.com> for providing the traces
that allowed identifying this problem.
PR: kern/103602
MFC after: 1 week
- Fix missing initialization in kern_rwlock.c causing bogus times to be collected
- Move updates to the lock hash to after the lock is released for spin mutexes,
sleep mutexes, and sx locks
- Add new kernel build option LOCK_PROFILE_FAST - only update lock profiling
statistics when an acquisition is contended. This reduces the overhead of
LOCK_PROFILING to increasing system time by 20%-25% which on
"make -j8 kernel-toolchain" on a dual woodcrest is unmeasurable in terms
of wall-clock time. Contrast this to enabling lock profiling without
LOCK_PROFILE_FAST and I see a 5x-6x slowdown in wall-clock time.
arrangement that has no intrinsic internal knowledge of whether devices
it is given are truly multipath devices. As such, this is a simplistic
approach, but still a useful one.
The basic approach is to (at present- this will change soon) use camcontrol
to find likely identical devices and and label the trailing sector of the
first one. This label contains both a full UUID and a name. The name is
what is presented in /dev/multipath, but the UUID is used as a true
distinguishor at g_taste time, thus making sure we don't have chaos
on a shared SAN where everyone names their data multipath as "Fred".
The first of N identical devices (and N *may* be 1!) becomes the active
path until a BIO request is failed with EIO or ENXIO. When this occurs,
the active disk is ripped away and the next in a list is picked to
(retry and) continue with.
During g_taste events new disks that meet the match criteria for existing
multipath geoms get added to the tail end of the list.
Thus, this active/passive setup actually does work for devices which
go away and come back, as do (now) mpt(4) and isp(4) SAN based disks.
There is still a lot to do to improve this- like about 5 of the 12
recommendations I've received about it, but it's been functional enough
for a while that it deserves a broader test base.
Reviewed by: pjd
Sponsored by: IronPort Systems
MFC: 2 months
Linux does not check file descriptor when MAP_ANONYMOUS is set.
This should fix recent LTP test regressions.
Reported by: Scot Hetzel (swhetzel at gmail dot com)
netchild
case where it asynchronously exits burst mode on its own. Handle different
values of hz in sleep loop. Provide more debugging options to tune EC
behavior. These tunables/sysctls may be temporary and are not for user
access if the EC is working properly. Burst mode is now on by default for
testing and the poll interval has been increased from 100 to 500 us and
total timeout from 100 to 500 ms.
Hopefully this should be the first step of addressing reports of timeout
errors during battery or thermal access, especially on HP/Compaq laptops.
It is reasonably stable and should not cause a loss of functionality or
performance on systems that were previously working. Testing shows an
increase of responsiveness by ~75% on one system.
PR: kern/98171
potential issues where the peer does not close, potentially leaving
thousands of connections in FIN_WAIT_2. This is controlled by a new sysctl
fast_finwait2_recycle, which is disabled by default.
Reviewed by: gnn, silby.
- BIOCGDIRECTION and BIOCSDIRECTION get or set the setting determining
whether incoming, outgoing, or all packets on the interface should be
returned by BPF. Set to BPF_D_IN to see only incoming packets on the
interface. Set to BPF_D_INOUT to see packets originating locally and
remotely on the interface. Set to BPF_D_OUT to see only outgoing
packets on the interface. This setting is initialized to BPF_D_INOUT
by default. BIOCGSEESENT and BIOCSSEESENT are obsoleted by these but
kept for backward compatibility.
- BIOCFEEDBACK sets packet feedback mode. This allows injected packets
to be fed back as input to the interface when output via the interface is
successful. When BPF_D_INOUT direction is set, injected outgoing packet
is not returned by BPF to avoid duplication. This flag is initialized to
zero by default.
Note that libpcap has been modified to support BPF_D_OUT direction for
pcap_setdirection(3) and PCAP_D_OUT direction is functional now.
Reviewed by: rwatson
concurrency:
- Add per-unpcb mutexes protecting unpcb connection state, fields, etc.
- Replace global UNP mutex with a global UNP rwlock, which will protect the
UNIX domain socket connection topology, v_socket, and be acquired
exclusively before acquiring more than per-unpcb at a time in order to
avoid lock order issues.
In performance measurements involving MySQL, this change has little or no
overhead on UP (+/- 1%), but leads to a significant (5%-30%) improvement in
multi-processor measurements using the sysbench and supersmack benchmarks.
Much testing by: kris
Approved by: re (kensmith)
determine if it holds an exclusive rwlock reference or not. This is
non-ideal, but recursion scenarios in the network stack currently
require it.
Approved by: jhb
call which can easily lock up a system otherwise; instead,
return ENOBUFS as documented in a manpage, thus reverting
us to the FreeBSD 4.x behavior.
Reviewed by: rwatson
MFC after: 2 weeks
- only collect timestamps when a lock is contested - this reduces the overhead
of collecting profiles from 20x to 5x
- remove unused function from subr_lock.c
- generalize cnt_hold and cnt_lock statistics to be kept for all locks
- NOTE: rwlock profiling generates invalid statistics (and most likely always has)
someone familiar with that should review
attribute. Also define some macros to manipulate one of these
structures. Explain their use in the extattr.9 manual page.
The next step will be to make a sweep through the kernel replacing
the old pointer manipulation code. To get an idea of how they would
be used, the ffs_findextattr() function in ufs/ffs/ffs_vnops.c is
currently written as follows:
/*
* Vnode operating to retrieve a named extended attribute.
*
* Locate a particular EA (nspace:name) in the area (ptr:length), and return
* the length of the EA, and possibly the pointer to the entry and to the data.
*/
static int
ffs_findextattr(u_char *ptr, u_int length, int nspace, const char *name,
u_char **eap, u_char **eac)
{
u_char *p, *pe, *pn, *p0;
int eapad1, eapad2, ealength, ealen, nlen;
uint32_t ul;
pe = ptr + length;
nlen = strlen(name);
for (p = ptr; p < pe; p = pn) {
p0 = p;
bcopy(p, &ul, sizeof(ul));
pn = p + ul;
/* make sure this entry is complete */
if (pn > pe)
break;
p += sizeof(uint32_t);
if (*p != nspace)
continue;
p++;
eapad2 = *p++;
if (*p != nlen)
continue;
p++;
if (bcmp(p, name, nlen))
continue;
ealength = sizeof(uint32_t) + 3 + nlen;
eapad1 = 8 - (ealength % 8);
if (eapad1 == 8)
eapad1 = 0;
ealength += eapad1;
ealen = ul - ealength - eapad2;
p += nlen + eapad1;
if (eap != NULL)
*eap = p0;
if (eac != NULL)
*eac = p;
return (ealen);
}
return(-1);
}
After applying the structure and macros, it would look like this:
/*
* Vnode operating to retrieve a named extended attribute.
*
* Locate a particular EA (nspace:name) in the area (ptr:length), and return
* the length of the EA, and possibly the pointer to the entry and to the data.
*/
static int
ffs_findextattr(u_char *ptr, u_int length, int nspace, const char *name,
u_char **eapp, u_char **eac)
{
struct extattr *eap, *eaend;
eaend = (struct extattr *)(ptr + length);
for (eap = (struct extattr *)ptr; eap < eaend; eap = EXTATTR_NEXT(eap)){
/* make sure this entry is complete */
if (EXTATTR_NEXT(eap) > eaend)
break;
if (eap->ea_namespace != nspace ||
eap->ea_namelength != length ||
bcmp(eap->ea_name, name, length))
continue;
if (eapp != NULL)
*eapp = eap;
if (eac != NULL)
*eac = EXTATTR_CONTENT(eap);
return (EXTATTR_CONTENT_SIZE(eap));
}
return(-1);
}
Not only is it considerably shorter, but it hopefully more readable :-)
PRIO_USER case, possibly also other places that deferences
p_ucred.
In the past, we insert a new process into the allproc list right
after PID allocation, and release the allproc_lock sx. Because
most content in new proc's structure is not yet initialized,
this could lead to undefined result if we do not handle PRS_NEW
with care.
The problem with PRS_NEW state is that it does not provide fine
grained information about how much initialization is done for a
new process. By defination, after PRIO_USER setpriority(), all
processes that belongs to given user should have their nice value
set to the specified value. Therefore, if p_{start,end}copy
section was done for a PRS_NEW process, we can not safely ignore
it because p_nice is in this area. On the other hand, we should
be careful on PRS_NEW processes because we do not allow non-root
users to lower their nice values, and without a successful copy
of the copy section, we can get stale values that is inherted
from the uninitialized area of the process structure.
This commit tries to close the race condition by grabbing proc
mutex *before* we release allproc_lock xlock, and do copy as
well as zero immediately after the allproc_lock xunlock. This
guarantees that the new process would have its p_copy and p_zero
sections, as well as user credential informaion initialized. In
getpriority() case, instead of grabbing PROC_LOCK for a PRS_NEW
process, we just skip the process in question, because it does
not affect the final result of the call, as the p_nice value
would be copied from its parent, and we will see it during
allproc traverse.
Other potential solutions are still under evaluation.
Discussed with: davidxu, jhb, rwatson
PR: kern/108071
MFC after: 2 weeks
semi-automatic style(9)
The futex stuff already differs a lot (only a small part does not differ)
from NetBSD, so we are already way off and can't apply changes from NetBSD
automatically. As we need to merge everything by hand already, we can even
make the files comply to our world order.
(external) microphone pin tend to screw it. Internal microphone (found
on several laptops) still need high VRef.
Tested by: Pietro Cerutti <pietro.cerutti@gmail.com>
lenix <irc.freenode.net>
immediately flag any page that is allocated to a OBJT_PHYS object as
unmanaged in vm_page_alloc() rather than waiting for a later call to
vm_page_unmanage(). This allows for the elimination of some uses of
the page queues lock.
Change the type of the kernel and kmem objects from OBJT_DEFAULT to
OBJT_PHYS. This allows us to take advantage of the above change to
simplify the allocation of unmanaged pages in kmem_alloc() and
kmem_malloc().
Remove vm_page_unmanage(). It is no longer used.
It is built in the same module as IPv4 multicast forwarding, i.e. ip_mroute.ko,
if and only if IPv6 support is enabled for loadable modules.
Export IPv6 forwarding structs to userland netstat(1) via sysctl(9).
- Dont "return" in linux_clone() after we forked the new process in a case
of problems.
- Move the copyout of p2->p_pid outside the emul_lock coverage in
linux_clone().
- Cache the em->pdeath_signal in a local variable and move the copyout
out of the emul_lock coverage.
- Move the free() out of the emul_shared_lock coverage in a preparation
to switch emul_lock to non-sleepable lock (mutex).
Submitted by: rdivacky
an ICB. This shows up on card restarts, and usually for
2200-2300 cards. What happens is that we start up,
attempting to acquire a hard address. We end up instead
being an F-port topology, which reports out a loop id
of 0xff (or 0xffff for 2K Login f/w). Then, if we restart,
we end up telling the card to go off an acquire this loop
address, which the card then rejects. Bah.
Compilation fixes from Solaris port.
I created and tested this with a custom FreeSBIE cd-image.
PR: i386/96452
Submitted by: Yuichiro Goto <y7goto at gmail dot com>
MFC after: 3 days
Approved by: imp (mentor)
inode's i_flag.
It's possible that after ufs_infactive() calls softdep_releasefile(),
i_nlink stays >0 for a considerable amount of time (> 60 seconds here).
During this period, any ffs allocation routines that alter di_blocks
must also account for the blocks in the filesystem's fs_pendingblocks
value.
This change fixes an eventual df/du discrepency that will happen as
the result of fs_pendingblocks being reduced to <0.
The only manifestation of this that people may recognise is the
following message on boot:
/somefs: update error: blocks -N files M
at which point the negative pending block count is adjusted to zero.
Reviewed by: tegge
MFC after: 3 weeks
freshly-loaded kernel module. To avoid various unload races, hide linker
files whose sysinit's are being run from userland so that they can't be
kldunloaded until after all the sysinit's have finished.
Tested by: gallatin
changes. This should ease the job of maintaining codebase since much
of the regression tests are done across os versions.
- bus_setup_intr() -> snd_setup_intr().
triggers a KASSERT) or local variables. In the case of kern_ndis, the
tsleep() actually used a common sleep address (curproc) making it
susceptible to a premature wakeup.
want an equivalent of DELAY(9) that sleeps instead of spins. It accepts
a wmesg and a timeout and is not interrupted by signals. It uses a private
wait channel that should never be woken up by wakeup(9) or wakeup_one(9).
Glanced at by: phk
Use bus_get_dma_tag() to obtain the parent DMA tag to make the drivers
a little bit more non-ia32/amd64 friendly.
There is no man page for bus_get_dma_tag, so this is modelled after
rev. 1.62 of src/sys/dev/sound/pci/es137x.c by marius.
Inspired by: commit by marius
attachment of new devices that arrive (and we notice them
via async Fibre Channel events). We've always had the
right thing (of sorts) happen when devices go away- this
is the corollary function that makes multipath failover
actually work.
MFC after: 2 weeks
rescan requests. The purpose of this is to allow a SIM
(or other entities) to request a bus rescan and have it
then fielded in a different (process) context from the
caller.
There are probably better ways to accomplish this, but
it's a very small change that helps solve a number of
problems.
Reviewed by: Justin, Ken and Scott.
MFC after: 2 weeks
early, we haven't set board type, so we can't correctly check for
some options. Fix this by splitting option setting/getting into
generic, pci and then later board specific, option setting/getting.
This was noticed when setting 'iid' (or 'hard loop id') didn't work
all of a sudden.
Noticed by: Mike Drangula (thanks!) via Jung-uk Kim (thanks!)
incoming packets have had their 802.1Q tags processed by the
hardware, resulting in them being stripped from the packets, and
placed on the mbuf. This fixes the processing of 802.1Q tags when
hardware offload of 802.1Q tags is enabled.
check that the subject has read/write access to the vnode using the
vnode MAC check.
MFC after: 3 weeks
Submitted by: Spencer Minear <spencer_minear at securecomputing dot com>
Obtained from: TrustedBSD Project
a link-layer multicast group membership.
Such memberships are needed in order to support protocols such as
IS-IS without putting the interface into PROMISC or ALLMULTI modes.
sa_equal() is not OK for comparing sockaddr_dl as it has deeper structure
than a simple byte array, so add sa_dl_equal() and use that instead.
Reviewed by: rwatson
Verified with: /usr/sbin/mtest
Bug found by: Jouke Witteveen
MFC after: 2 weeks
addition of SerDes support. According to the docs, the 5706C and 5708C
phys are supposed to use the same MII model that is separate from the
SerDes parts, but the 5706C actually uses the MII model of the SerDes
parts. To fix this, readd the old 5706C entry to miidevs and add a
special check in brgphy_probe() for phys that match the 5706C ID. If
the phy is supported by the gentbi(4) driver, then it's a SerDes phy, so
we fail the probe and let gentbi(4) grab it. Otherwise, it's a 5706C phy,
so we let brgphy(4) grab it.
In coordination with: dwhite
treated as multicast frames and filtered, but when only when "adopting"
running firmware. By "adopting", I mean using pre-existing firmware
loaded from eeprom at PCI reset, rather than firmware loaded by the
driver.
non consecutively numbered ports.
This should fix current SATA problems.
Support AHCI chips where the ports are not consecutively numbered as in
some incarnations of the ICH8 chip.
flash card reader.
Also remove an 'Opened da0 -> <random number>' which is not needed on a daily
basis (available through bootverbose).
Reviewed by: phk, ken
MFC after: 1 week
and make it print under debug.iwi control same as other debugging stuff.
Remove the device_printf() in iwi_ioctl() and replace with this:
/*
* wait until pending iwi_cmd() are completed, to avoid races
* that could cause problems.
*/
while (sc->flags & IWI_FLAG_BUSY)
msleep(sc, &sc->sc_mtx, 0, "iwiioctl", hz);
This at least prevents what has become an almost systematic failure for my
system, presumably due to a previous iwi_cmd() not complete yet by the
time iwi_ioctl() is called.
It has been pointed to my attention that the real problem could be
calling ieee80211_ioctl() with the lock held. If that is true,
there might still be a possibility for a race condition e.g. an
interrupt coming while the ioctl is sleeping.
Need to investigate further on what changes are required to release
the lock before calling ieee80211_ioctl
+ do not release the dma-ble region used for downloading firmware.
This should fix the problems that some people were seeing, due to
memory becoming too fragmented which prevented subsequent allocations
of a suitable contiguous region of memory;
+ document the firmware format and usage in if_iwivar.h
+ use a loop to allocate the four tx rings, instead of replicating
the body of the loop.
+ add debugging code IWI_LOCK_ASSERT() to detect missing locks.
These only do a printf, and should go away once we figure out why
the driver sometimes freezes the system due to a (yet unidentified)
race condition.
+ add a device_printf() in iwi_ioctl() in certain conditions
(see comment in the code). This helps preventing the race condition
mentioned above, and makes the system survive. This printf will
also go away once fixing this bug is completed.
+ change iwi_getfw() to return 0 on success, 1 on error, consistently
with other functions.
+ fix the argument of a sizeof() in iwi_get_firmware()
+ use le32toh() to access little-endian fields
+ simplify error handling in iwi_load_firmware() and iwi_init_locked()
The bugs fixed by this commit (the freezing one especially) are serious
enough to call for a quick MFC
MFC after: 3 days
garbage collection complications from general discussion of UNIX domain
sockets.
Staticize unp_addsockcred().
Remove XXX comment regarding Giant and v_socket -- v_socket is protected
by the global UNIX domain socket lock.
System V shared memory, now believed fixed in sysv_shm.c:1.109:
date: 2006/11/06 13:42:01; author: rwatson; state: Exp; lines: +65 -37
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.
Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>
This restores fine-grained privilege support to System V IPC.
PR: 106078
ignored on other systems I investigated when accessing an existing
memory segment rather than creating a new one. This call to ipcperm()
is the only one to pass in a complete mode flag to the permission
checks rather than a simple access request mask, and caused problems
for the revised ipcperm() based on the priv(9) interface, which can
now be restored.
PR: 106078
VFS privilege namespace: exceedquota, getquota, and setquota. Leave
UFS-specific quota configuration privileges in the UFS name space.
This renumbers VFS and UFS privileges, so requires rebuilding modules
if you are using security policies aware of privilege identifiers.
This is likely no one at this point since none of the committed MAC
policies use the privilege checks.
set real-time priority on a thread. It looks like this suser(9)
call was introduced after my first pass through replacing superuser
checks with named privilege checks.
As consequence, getdirentries() no longer needs to drop/reacquire
directory vnode lock, that would allow it to be reclaimed in between.
Reported and tested by: Peter Holm
Approved by: rodrigc (unionfs)
MFC after: 1 week
Rounding addr upwards to next 2M boundary in pmap_growkernel() could
cause addr to become 0, resulting in an early return without populating
the last PDE.
Reported and tested by: kris
Suggested by: alc
MFC after: 1 week
mapped at, and LOADERRAMADDR, the address at which the loader maps the ram at
at the time the kernel is booted.
They are used to detect if the kernel is booted from the onboard flash.
Define those for the IQ31244
(1) change debounce period from 1s to 250ms. This appears to be fine and
speeds things up a little.
(2) In the middle of cbb_pcic_power_disable_socket we write 0 to the EXCA_INTR
register to put the card into reset. However, this turns off CSC
interrupts for TI bridges (and maybe others). So no further card
insertion events would be noticed. To compensate, after we've gone
through the entire power down sequence, turn on EXCA_INTR_ENABLE so
that CSC events happen.
#2 should fix the 'dead slot' problem that has been reported after
card ejection (but only 16-bit cards).
This way we may support multiple structures in v_data vnode field within
one file system without using black magic.
Vnode-to-file-handle should be VOP in the first place, but was made VFS
operation to keep interface as compatible as possible with SUN's VFS.
BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.
VFS_VPTOFH() was left for API backward compatibility, but is marked for
removal before 8.0-RELEASE.
Approved by: mckusick
Discussed with: many (on IRC)
Tested with: ufs, msdosfs, cd9660, nullfs and zfs
a version that i posted earlier on the -current mailing list,
and subsequent feedback received.
The core of the change is just in sys/firmware.h and kern/subr_firmware.c,
while other files are just adaptation of the clients to the ABI change
(const-ification of some parameters and hiding of internal info,
so this is fully compatible at the binary level).
In detail:
- reduce the amount of information exported to clients in struct firmware,
and constify the pointer;
- internally, document and simplify the implementation of the various
functions, and make sure error conditions are dealt with properly.
The diffs are large, but the code is really straightforward now (i hope).
Note also that there is a subtle issue with the implementation of
firmware_register(): currently, as in the previous version, we just
store a reference to the 'imagename' argument, but we should rather
copy it because there is no guarantee that this is a static string.
I realised this while testing this code, but i prefer to fix it in
a later commit -- there is no regression with respect to the past.
Note, too, that the version in RELENG_6 has various bugs including
missing locks around the module release calls, mishandling of modules
loaded by /boot/loader, and so on, so an MFC is absolutely necessary
there. I was just postponing it until this cleanup to avoid doing
things twice.
MFC after: 1 week
of the special handling for ".." and perform an ISDOTDOT VOP_LOOKUP()
for a filesystem root vnode. Handle this case inside lookup().
Submitted by: tegge
PR: 92785
MFC after: 1 week
device pointers. They don't change as the children device drivers
come and go. Rather, check to see if the device is attached where we
would have checked ! NULL. This solves many asymmetries in the code
that likely could lead to crashes when loading/unloading cbb without
one or more of the expected children's driver not present.
o When detaching all children, try really hard to get all the children
list before giving up. This is based on an observation by hans petter
selasky in his usb p4 branch.
o When rescanning devices after a driver is added, abort if we can't get
the child list with a message.
o when rescanning devices, if the reprobe/attach is successful, save the
device for cardbus/pccard.
Unlike other GigEs Yukon II always set VLAN bit when it detects VLAN
tagged packet regardless of H/W VLAN processing configuration state.
So it need to check IFCAP_VLAN_HWTAGGING bit to know whether driver
is configured to take advantage of H/W VLAN processing. If H/W VLAN
processing was disabled don't adjust received packet length such that
subsequent validation logic works for software VLAN processing.
Reported by: bms
Tested by: bms
vm_page_free_toq() to account for recent changes that allow
vm_page_free_toq() to be called on some pages without the page queues lock
being held, specifically, pages that are not contained in a vm object and
not a member of a page queue. (Examples of such pages include page table
pages, pv entry pages, and uma small alloc pages.)
- PROT_READ, PROT_WRITE, or PROT_EXEC implies PROT_READ and PROT_EXEC.
Linux/ia64's i386 emulation layer does this and it complies with Linux
header files. This fixes mmap05 LTP test case on amd64.
- Do not adjust stack size when failure has occurred.
- Synchronize i386 mmap/mprotect with amd64.
blacklist a bunch of old chipsets. If a system contains a PCI-PCI bridge
that supports PCI-X, assume the chipset supports PCI-X. If a system
contains a PCI-express root port, assume the chipset supports PCI-express.
If the chipset doesn't support either PCI-X or PCI-express, then blacklist
it by default. We should now only need to explicitly blacklist PCI-X or
PCI-express chipsets that don't properly handle MSI.
broke the method as all the MSI-X table indices were off by one in
the backend MD code.
- Fix a cosmetic nit in the bootverbose printf in pci_alloc_msix_method().
sonewconn() in unp_connect(). This avoids a race that occurs due to
v_socket being an uncounted reference, as the lock was being released in
order to call sonewconn(), which otherwise recurses into the UNIX domain
socket code via pru_attach, as well as holding the lock over a sleeping
memory allocation in uipc_attach(). Switch to a non-sleeping memory
allocation during UNIX domain socket attach.
This fix non-ideal in that it requires enabling recursion, but is a much
smaller change than moving to using true references for v_socket. The
reported panic occurs in unp_connect() following the return of
sonewconn().
Update copyright year.
Panic reported by: jhb
is actually being added to the hold queue, not the free queue. At the same
time, avoid unnecessary tests to wake up threads waiting for free memory
and the idle thread that zeroes free pages. (These tests will be performed
later when the page finally moves from the hold queue to the free queue.)
observation here is that it doesn't matter what garbage accumulates in
bits which we're going to end up masking away anyway, as long as the
garbage doesn't overflow into bits which we care about.
This improved version may not be the fastest possible on all systems,
but it's certainly going to be better than what was here before.
- Add sigacts locking.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Create and log events via the CTRx macros.
Reviewed by: cognet
doing a CLEARFILE option. Do a vrele instead. This prevents
a panic later due to v_writecount being negative when the vnode
is taken off the freelist.
Submitted by: jhb
- ZONE get now also take a type cast so it does the
cast like mtod does.
- New macro SCTP_LIST_EMPTY, which in bsd is just
LIST_EMPTY
- Removal of const in some of the static hmac functions
(not needed)
- Store length changes to allow for new fields in auth
- Auth code updated to current draft (this should be the
RFC version we think).
- use uint8_t instead of u_char in LOOPBACK address comparison
- Some u_int32_t converted to uint32_t (in crc code)
- A bug was found in the mib counts for ordered/unordered
count, this was fixed (was referencing a freed mbuf).
- SCTP_ASOCLOG_OF_TSNS added (code will probably disappear
after my testing completes. It allows us to keep a
small log on each assoc of the last 40 TSN's in/out and
stream assignment. It is NOT in options and so is only
good for private builds.
- Some CMT changes in prep for Jana fixing his problem
with reneging when CMT is enabled (Concurrent Multipath
Transfer = CMT).
- Some missing mib stats added.
- Correction to number of open assoc's count in mib
- Correction to os_bsd.h to get right sha2 macros
- Add of special AUTH_04 flags so you can compile the code
with the old format (in case the peer does not yet support
the latest auth code).
- Nonce sum was incorrectly being set in when ecn_nonce was
NOT on.
- LOR in listen with implicit bind found and fixed.
- Moved away from using mbuf's for socket options to using
just data pointers. The mbufs were used to harmonize
NetBSD code since both Net and Open used this method. We
have decided to move away from that and more conform to
FreeBSD style (which makes more sense).
- Very very nasty bug found in some of my "debug" code. The
cookie_how collision case tracking had an endless loop in
it if you got a second retransmission of a cookie collision
case. This would lock up a CPU .. ugly..
- auth function goes to using size_t instead of int which
conforms to socketapi better
- Found the nasty bug that happens after 9 days of testing.. you
get the data chunk, deliver it and due to the reference to a ch->
that every now and then has been deleted (depending on the postion
in the mbuf) you have an invalid ch->ch.flags.. and thus you don't
advance the stream sequence number.. so you block the stream
permanently. The fix is to make local variables of these guys
and set them up before you have any chance of trimming the
mbuf.
- style fix in sctp_util.h, not sure how this got bad maybe in
the last patch? (aka it may not be in the real source).
- Found interesting bug when using the extended snd/rcv info where
we would get an error on receiving with this. Thats because
it was NOT padded to the same size as the snd_rcv info. We
increase (add the pad) so the two structs are the same size
in sctp_uio.h
- In sctp_usrreq.c one of the most common things we did for
socket options was to cast the pointer and validate the size.
This as been macro-ized to help make the code more readable.
- in sctputil.c two things, the socketapi class found a missing
flag type (the next msg is a notification) and a missing
scope recovery was also fixed.
Reviewed by: gnn
to become negative. This will detect the underflow when it
happens, instead of having it discovered when the vnode is
taken off the freelist, long after the offending process is long
gone.
boot by MD code to indicated detected alignment preference. Rather than
cache alignment being encoded in UMA consumers by defining a global
alignment value of (16 - 1) in UMA_ALIGN_CACHE, UMA_ALIGN_CACHE is now
a special value (-1) that causes UMA to look at registered alignment. If
no preferred alignment has been selected by MD code, a default alignment
of (16 - 1) will be used.
Currently, no hardware platforms specify alignment; architecture
maintainers will need to modify MD startup code to specify an alignment
if desired. This must occur before initialization of UMA so that all UMA
zones pick up the requested alignment.
Reviewed by: jeff, alc
Submitted by: attilio
vm_page_alloc() from within a critical section in pmap_growkernel().
Since the need for a critical section may never have existed in the
first place, simply get rid of it.
Discussed with: alc@
IGMPMSG_WHOLEPKT notifications to the userland PIM routing daemon,
as an optimization to mitigate the effects of high multicast
forwarding load.
This is an experimental change, therefore it must be explicitly enabled by
setting the sysctl/tunable net.inet.pim.squelch_wholepkt to a non-zero value.
The tunable may be set from the loader or from within the kernel environment
when loading ip_mroute.ko as a module.
Submitted by: edrt <edrt at citiz.net>
See also: http://mailman.icsi.berkeley.edu/pipermail/xorp-users/2005-June/000639.html
Make PIM dynamically loadable by using encap_attach_func().
PIM may now be loaded into a GENERIC kernel.
Tested with: ports/net/pimdd && tcpreplay && wireshark
Reviewed by: Pavlin Radoslavov
it isn't used in the access control decision. This became visible to
Coverity with the change to a function call retrieving label values.
Coverity CID: 1723
tries to drop the reference count after our close routine returns.
A more correct fix is to defer the destroy_dev() to a taskqueue(either
in devfs or locally).
Reminded by: jhb
addressing if a packet is later re-encapsulated and sent to a
non-broadcast, non-multicast destination after being received on the
ng_ksocket input hook.
PR: 106999
Submitted by: Kevin Lahey
MFC after: 4 weeks
device specific d_close(), which makes subsequent destroy_dev() being
blocked in the "devdrn" loop.
This bandaid should fix the smbfs hang/crashing observed on -CURRENT since
the introduction of sys/kern/kern_conf.c:1.199:
# mount_smbfs -I server //server/share /mnt
Password:
[hang]
Reviewed by: bp
See also: http://lists.freebsd.org/pipermail/cvs-src/2006-November/071379.html
by the token bucket filter will result in EINVAL being returned.
If you want to rate-limit traffic in future, use ALTQ or dummynet; this
isn't a general purpose QoS engine.
Preserve the now unused fields in struct vif so as to avoid having to
recompile netstat(1) and other tools.
Reviewed by: Pavlin Radslavov, Bill Fenner
Bugfix for the Realtek PHY driver... an RTL8201L standalone PHY
needs different handling than the integrated ones in terms of
speed detection. There was a bogus test based on the parent
device driver name string controlling which speed register to
query. That test began failing when the rl driver was split into
separate rl and re drivers some time ago. Apparently nobody ever
noticed because the buggy code only executes if NWAY negotiation
failed. Since we happen to be testing with an ancient dumb hub
rather than a modern switch, we found it.
To fix it all, have the attach() routine notice whether we're
dealing with an integrated PHY or an RTL8201L and store that info
in a struct accessible to the status() routine that needs to know
which register to query.
I touched up the fixes because they were relative to RELENG_6 and to
bring a few nits into line with style(9).
MFC After: 2 weeks
Submitted by: Ian Lepore
tunable allowing automatic parsing of VPD data to be disabled. The
default is left as-is; if you are having problems with hard hangs at boot
due to VPD, try setting hw.pci.enable_vpd=0. A proper architectural
solution has been under discussion for some time, but this allows me to
boot my test machines in the mean time.
Submitted by: bz
Head nod: jmg
- Fix these types in ULE as well. This fixes bugs in priority index
calculations in certain edge cases. (int)-1 % 64 != (uint)-1 % 64.
Reported by: kkenn using pho's stress2.
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).
The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
- Restore support for fetching swap information from crash dumps via
kvm_get_swapinfo(3) to fix pstat -T/-s on crash dumps.
Reviewed by: arch@, phk
MFC after: 1 week
never used them; with mrouted, their functionality may be replaced by
explicitly configuring gif(4) instances and specifying them with the
'phyint' keyword.
Bump __FreeBSD_version to 700030, and update UPDATING.
A doc update is forthcoming.
Discussed on: net
Reviewed by: fenner
MFC after: 3 months
the value of p_textvp. This way, we always unlock the locked vnode.
While there, vhold() the vnode around the vn_lock().
Reported and tested by: Guy Helmer (ghelmer palisadesys com)
Approved by: des (procfs maintainer)
MFC after: 1 week
variable to avoid invalid constraints in dead code. Use an array of
u_char's (inside a struct) instead of a char/short/int/long variable so
that the variable and its accesses can be spelled in the same way in all
cases and code doesn't need to be cloned just to hold the spelling
differences.
Fixed strict-aliasing errors in PCPU_SET() and in the amd64 PCPU_GET().
Cast to (void *) as in rev.1.37 of the i386 version where the errors
were fixed for the i386 PCPU_GET() only. It would be more correct to
copy to and from the temp. variable using memcpy(), but then an
ifdef tangle would be required to ensure using the builtin memcpy().
We depend on fairly aggressive optimization to put the temp. variable
only in a register despite it being copied using
*(type *)(void *)&anothertype and could depend on this when using
memcpy() too. This seems to work right even for -O0, but the -O0 case
has not been completely tested.
This change gives identical object code for all object files in LINT
on amd64 (except for one file with a __TIME__ stamp). For LINT on
i386 it gives unimportant differences in instruction order and padding
in a few object files. This was only tested for -O.
This change (actually a previous version of it) gives the following
reductions in the number of object files in LINT that fail to compile
with -O2 but without the -fno-strict-aliasing kludge:
- amd64: 29 (down from 211)
- i386: 36 (down from 47)
gcc-3.4.6 actually allows the invalid constraints that result from not
using the temp. variable, at least with -O[1-2], but gcc-3.3.3 crashes
on them and I don't want to depend on compiler bugs.
avoid holding the UNIX domain socket subsystem lock over soooptcopyin()
and sooptcopyout(). This problem was introduced when LOCAL_CREDS, and
LOCAL_CONNWAIT support were added.
Reviewed by: mdodd
LABEL_TO_SLOT() macro used by policy modules to query and set label data
in struct label. Instead of using a union, store an intptr_t, simplifying
the API.
Update policies: in most cases this required only small tweaks to current
wrapper macros. In two cases, a single wrapper macros had to be split into
separate get and set macros.
Move struct label definition from _label.h to mac_internal.h and remove
_label.h. With this change, policies may now treat struct label * as
opaque, allowing us to change the layout of struct label without breaking
the policy module ABI. For example, we could make the maximum number of
policies with labels modifiable at boot-time rather than just at
compile-time.
Obtained from: TrustedBSD Project
Don't perform a nested include of _label.h in mac.h, as mac.h now
describes only the user API to MAC, and _label.h defines the in-kernel
representation of MAC labels.
Remove mac.h includes from policies and MAC framework components that do
not use userspace MAC API definitions.
Add _KERNEL inclusion checks to mac_internal.h and mac_policy.h, as these
are kernel-only include files
Obtained from: TrustedBSD Project
sleep lock missed the witness code, and the system will panic
immediately on boot if WITNESS is enabled.
Changed the witness definition to the new type.
register takes 16 characters (64-bit register in hex). In practice this
is a slight bit of overkill as 7 of the 56 registers are only 32-bit, but
having the buffer too small results in remote kgdb trashing kernel memory
when it connects.
PR: amd64/108673
Submitted by: Ravi Murty, Nikhil Rao @ Intel
MFC after: 3 days
that of the tun instance even for the !AF_INET case, and properly
remove configured addresses by calling if_purgeaddrs().
Maintain the TUN_DSTADDR behaviour for compatibility with the OS/390
emulator.
MFC after: 3 weeks
PR: 100080
Reviewed by: bz
The patch from the PR was a little outdated w/regards to the
Vodafone vendor string.
PR: kern/106033
Submitted by: Volker Werth <volker_AT_vwsoft.com>
MFC in: 3 days
Make devfs cloning a sysctl/tunable which defaults to on.
If devfs cloning is enabled, only the super-user may create
tun(4)/tap(4)/vmnet(4) instances. Devfs cloning is still enabled by
default; it may be disabled from the loader or via sysctl with
"net.link.tap.devfs_cloning" and "net.link.tun.devfs_cloning".
Disabling its use affects potentially all tun(4)/tap(4) consumers
including OpenSSH, OpenVPN and VMware.
PR: 105228 (potentially also 90413, 105570)
Submitted by: Landon Fuller
Tested by: Andrej Tobola
Approved by: core (rwatson)
MFC after: 4 weeks
to set_controller_command_byte() call; by issueing a Read Mode Byte
command, the touchpad is in Absolute Mode again.
This problem occursed at least on Asus V6V laptops.
approval, change the copyright statement to point at him instead of
"FreeBSD, Inc".
Encouraged by: rwatson
Reviewed by: imp
Discussed with and approved by: orion