Always use sys/conf/kern.mk when building kernel/modules.
<bsd.kern.mk> is only preserved for sys/boot/pc98/boot2
for now, but this will be fixed. If there are other
users of <bsd.kern.mk>, please let me know.
Reminded by: bde
Clear the LQICRC_NLQ status should it pop up after we have
already handled the SCSIPERR. During some streaming operations
this status can be delayed until the stream ends. Without this
change, the driver would complain about a "Missing case in
ahd_handle_scsiint".
In the LQOBUSFREE handler...
Don't return the LQOMGR back to the idle state until after
we have cleaned up ENSELO and any status related to this
selection. The last thing we need is the LQO manager starting
another select-out before we have updated the execution queue.
It is not clear whether the LQOMGR would, or would not
start a new selection early.
Make sure ENSELO is off prior to clearing SELDO by flushing
device writes.
Move assignment of the next target SCB pointer inside of
an if to make the code clearer. The effect is the same.
Dump card state in both "Unexpected PKT busfree" paths.
In ahd_reset(), set the chip to SCSI mode before reading SXFRCTL1.
That register only exists in the SCSI mode. Also set the mode
explicitly to the SCSI mode after chip reset due to paranoia.
Re-arrange code so that SXFRCTL1 is restored as quickly after the
chip reset as possible.
S/G structurs must be 8byte aligned. Make this official by saying
so in our DMA tag.
Disable CIO bus stretch on MDFFSTAT if SHVALID is about to come
true. This can cause a CIO bus lockup if a PCI or PCI-X error
occurs while the stretch is occurring - the host cannot service
the PCI-X error since the CIO bus is locked out and SHVALID will
never resolve. The stretch was added in the Rev B to simplify the
wait for SHVALID to resolve, but the code to do this in the open
source sequencer is so simple it was never removed.
Consistently use MAX_OFFSET for the user max syncrate set from
non-volatile storage. This ensures that the offset does not
conflict with AH?_OFFSET_UNKNOWN.
Have ahd_pause_and_flushwork set the mode to ensure that it has
access to the registers it checks. Also modify the checking of
intstat so that the check against 0xFF can actually succeed if
the INT_PEND mask is something other than 0xFF. Although there
are no cardbus U320 controllers, this check may be needed to
recover from a hot-plug PCI removal that occurs without informing
the driver.
Fix a typo. sg_prefetch_cnt -> sg_prefetch_align. This fixes
an infinite loop at card initialization if the cacheline size is 0.
aic79xx.h:
Add AHD_EARLY_REQ_BUG bug flag.
Fix spelling errors.
Include the CDB's length just after the CDB pointer in the DMA'ed
CDB case.
Change AH?_OFFSET_UNKNOWN to 0xFF. This is a value that the
curr->offset can never be, unlike '0' which we previously used.
This fixes code that only checks for a non-zero offset to
determine if a sync negotiation is required since it will fire
in the unknown case even if the goal is async.
aic79xx.reg:
Add comments for LQISTAT bits indicating their names in the 7902
data book. We use slightly different and more descriptive names
in the firmware.
Fix spelling errors.
Include the CDB's length just after the CDB pointer in the DMA'ed
CDB case.
aic79xx.seq:
Update comments regarding rundown of the GSFIFO to reflect reality.
Fix spelling errors.
Since we use an 8byte address and 1 byte length, shorten the size
of a block move for the legacy DMA'ed CDB case from 11 to 9 bytes.
Remove code that, assuming the abort pending feature worked, would
set MK_MESSAGE in the SCB's control byte on completion to catch
invalid reselections. Since we don't see interrupts for completed
selections, this status update could occur prior to us noticing the
SELDO. The "select-out" queue logic will get confused by the
MK_MESSAGE bit being set as this is used to catch packatized
connections where we select-out with ATN. Since the abort pending
feature doesn't work on any released controllers yet, this code was
never executed.
Add support for the AHD_EARLY_REQ_BUG. Don't ignore persistent REQ
assertions just because they were asserted within the bus settle delay
window. This allows us to tolerate devices like the GEM318 that
violate the SCSI spec.
Remove unintentional settnig of SG_CACHE_AVAIL. Writing this bit
should have no effect, but who knows...
On the Rev A, we must wait for HDMAENACK before loading additional
segments to avoid clobbering the address of the first segment in
the S/G FIFO. This resolves data-corruption issues with certain
IBM (now Hitachi) and Fujitsu U320 drives.
Rearrange calc_residual to avoid an extra jmp instruction.
On RevA Silicon, if the target returns us to data-out after we
have already trained for data-out, it is possible for us to
transition the free running clock to data-valid before the required
100ns P1 setup time (8 P1 assertions in fast-160 mode). This will
only happen if this L-Q is a continuation of a data transfer for
which we have already prefetched data into our FIFO (LQ/Data
followed by LQ/Data for the same write transaction). This can
cause some target implementations to miss the first few data
transfers on the bus. We detect this situation by noticing that
this is the first data transfer after an LQ (LQIWORKONLQ true),
that the data transfer is a continuation of a transfer already
setup in our FIFO (SAVEPTRS interrupt), and that the transaction
is a write (DIRECTION set in DFCNTRL). The delay is performed by
disabling SCSIEN until we see the first REQ from the target.
Only compile in snapshot savepointers handler for RevA silicon
where it is enabled.
Handle the cfg4icmd packetized interrupt. We just need to load
the address and count, start the DMA, and CLRCHN once the transfer
is complete.
Fix an oversight in the overrun handler for packetized status
operations. We need to wait for either CTXTDONE or an overrun
when checking for an overrun. The previous code did not wait
and thus could decide that no overrun had occurred even though
an overrun will occur on the next data-valid req. Add some
comment to this section for clarity.
Use LAST_SEG_DONE instead of LASTSDONE for testing transfer
completion in the packetized status case. LASTSDONE may come up
more quickly since it only records completion on the SCSI side,
but since LAST_SEG_DONE is used everywhere else (and needs to be),
this is less confusing.
Add a missing invalidation of the longjmp address in the non-pack
handler. This code needs additional review.
aic79xx_inline.h:
Fix spelling error.
aic79xx_osm.c:
Set the cdb length for CDBs dma'ed from host memory.
Add a comment indicating that, should CAM start supporting cdbs
larger than 16bytes, the driver could store the CDB in the status
buffer.
aic79xx_pci.c:
Add a table entry for the 39320A.
Added a missing comma to an error string table.
Fix spelling errors.
Move <sys/conf.h> before <sys/disk.h>.
No need for raidread()/raidwrite(), we have generic code for that.
Remove non-functional dump code.
Make raidinit() return the softc, not the dev_t.
Move to "struct disk*" centric API.
Fix printfs' to get name from struct disk instead of dev_t.
OK'ed by: scottl
don't end up freezing the box. This makes VTY locking useless
in the DDB case but a box which is supposed to be physically
secure shouldn't compile DDB anyway.
Reviewed by: silence on -audit
To do this, initialize the d_maj member of the cdevsw to MAJOR_AUTO.
When the cdevsw is first passed to make_dev() a free major number
will be assigned.
Until we have a bit more experience with this a printf will announce
this fact.
Major numbers are not reclaimed, so loading/unloading the same
device driver which uses MAJOR_AUTO will eventually deplete the
pool of free major numbers and the system will panic when it can
not allocate one. Still undecided who to invonvenience with the
solution to this.
Improve SBP device probeing:
- Wait 2 sec before issuing LOGIN ORB expecting the reconnection
hold timer expires.
- Serialize management ORB and scanning LUN by CAM on each target.
This should fix the problem for devices which have multiple LUNs.
Test device is donated by: Jaye Mathisen <mrcpu@internetcds.com>
- Freeze SIM queue for 2 sec after BUS RESET.
- Retry with LOGIN rather than RECONNECT after LOGIN is not completed for
BUS RESET.
- Use appropriate CAM status for BUS RESET and DEVICE RESET.
- Let CAM to scan targets after BUS REST.
- Implement CAM scan target function.
- Keep our own devq freeze count.
- Let CAM to know that SBP does tagged queuing.
These should be merged to RELENG_4 before 4.8-RELEASE.
a correctly aligned address in this block we really want to check, that the
part of the chunk that starts at the aligned address is large enough with
regard to the original request. Comparing it to 0 makes no sense, because this
is always true except in the rare case, that the aligned address is just at
the end of the chunk.
Approved by: jake (mentor)
td_wmesg field in the thread structure points to the description string of
the condition variable or mutex. If the condvar or the mutex had been
initialized from a loadable module that was unloaded in the meantime,
td_wmesg may now point to invalid memory. Retrieving the process table now
may panic the kernel (or access junk). Setting the td_wmesg field to NULL
after unblocking on the condvar/mutex prevents this panic.
PR: kern/47408
Approved by: jake (mentor)
Fixed memory leak in the "nodevice" option implementation.
Use these instead of sed(1) in MD NOTES.
Use a single makefile (sys/conf/makeLINT.mk) to generate
LINT for all architectures. (Previous versions missed
the LINT dependency on Makefile, and i386 version also
missed the dependency on ${NOTES}.)
Fixed bugs in the previous NOTES conversion using the
"nodevice" token and sed(1):
- i386 LINT lost "device pst".
- pc98 LINT lost SC_*, MAXCONS and KBD_DISABLE_KEYMAP_LOAD
options, and got needless DPT_* options.
- Added nooptions PPC_DEBUG, PPC_PROBE_CHIPSET, KBD_INSTALL_CDEV
to sparc64 LINT so that it has a chance to config(8).
This basically returns us to where we were before.
the fxp driver. This is enabled only for the 82550/82551 chips
(PCI revision code 12 or 13). RX and TX checksum offload are
both supported. Transmit offload is limited to TCP and UDP only
right now: there seems to be a problem with IP header checksumming
on transmit in some cases.
This chip has hardware VLAN support as well. I hope to enable
support for this eventually.
- Make transmission of packets work again. This stopped working because
ether_ifattach() was forcing ifp->if_output to be ether_output() and
clobbering our attempt to override this vector with a pointer to
ng_fec_output(). Move the overriding of ifp->if_output to after
ether_ifattach().
- Abandon the use of the netgraph ng_ether_input_p hook for snagging
incoming frames, and instead override the ifp->if_input vector for
interfaces that have been aggregated into our bundle. (I would have
loved to have written things this way in the first place, but I
didn't want to have to be the one to implement the if_input hook
and change all the drivers.) This avoids collisions with the ng_ether
module, which uses the same hook. Each aggregated device now calls
ng_fec_input() directly, which then fakes up the rcvif pointer
before invoking ifp->if_input itself.
This module should actually work now.
that matches snd_max, then do not respond with an ack, just drop the
segment. This fixes a problem where a simultaneous close results in
an ack loop between two time-wait states.
Test case supplied by: Tim Robbins <tjr@FreeBSD.ORG>
Sponsored by: DARPA, NAI Labs
introduce a preprocessor define for it. The larger block size
significantly speeds up the loading of the kernel.
Submitted by: Arun Sharma <arun.sharma@intel.com>
changed since this code was written:
- The ng_ether_input_p hook only accepts two arguments now: the pointer
to the ether header structure is gone.
- It's no longer necessary to cons up a fake ether header before passing
incoming packets to BPF_MTAP().
ng_fec_input() has been modified to account for these two changes.
Running tcpdump on fec0 should work now.
PR: kern/46720
- the mutex aac_io_lock protects the main codepaths which handle queues and
hardware registers. Only one acquire/release is done in the top-half and
the taskqueue. This mutex also applies to the userland command path and
CAM data path.
- Move the taskqueue to the new Giant-free version.
- Register the disk device with DISKFLAG_NOGIANT so the top-half processing
runs without Giant.
- Move the dynamic command allocator to the worker thread to avoid locking
issues with bus_dmamem_alloc().
This gives about 20% improvement in most of my benchmarks.
turns runs its tasks free of Giant too. It is intended that as drivers
become locked down, they will move out of the old, Giant-bound taskqueue
and into this new one. The old taskqueue has been renamed to
taskqueue_swi_giant, and the new one keeps the name taskqueue_swi.
an if clause was true. Break the two clauses out into seperate statements
since they require different actions.
Reported/Tested by: jake
Spotted by: jhb
delta 1.371) we must ensure that we do not get ourselves into a
recursive trap endlessly trying to clean up after ourselves.
Reported by: Attila Nagy <bra@fsn.hu>
Sponsored by: DARPA & NAI Labs.
from the filesystem size field to the filesystem maximum blocksize
field. The problem is that older versions of growfs updated only the
new size field and not the old size field. This resulted in the old
(smaller) size field being copied up to the new size field which
caused the filesystem to appear to fsck to be badly trashed.
This also adds a sanity check to ensure that the superblock is not
being updated when the filesystem is mounted read-only. Obviously
such an update should never happen.
Reported by: Nate Lawson <nate@root.org>
Sponsored by: DARPA & NAI Labs.
o Always check for null when dereferencing the filename component.
o Implement a try-and-backoff method for allocating memory to
dump stats to avoid a spin-lock -> sleep-lock mutex lock order
panic with WITNESS.
Approved by: des, markm (mentor)
Not objected: jhb
for testing and setting the current and alternate address spaces.
- Changed PTDpde and APTDpde to arrays to support multiple page directory
pages.
ponsored by: DARPA, Network Associates Laboratories
tcpcb is NULL, but also its connected inpcb, since we now allow
elements of a TCP connection to hang around after other state, such
as the socket, has been recycled.
Tested by: dcs
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
track of the number of dirty buffers held by a vnode. When a
bdwrite is done on a buffer, check the existing number of dirty
buffers associated with its vnode. If the number rises above
vfs.dirtybufthresh (currently 90% of vfs.hidirtybuffers), one
of the other (hopefully older) dirty buffers associated with
the vnode is written (using bawrite). In the event that this
approach fails to curb the growth in it the vnode's number of
dirty buffers (due to soft updates rollback dependencies),
the more drastic approach of doing a VOP_FSYNC on the vnode
is used. This code primarily affects very large and actively
written files such as snapshots. This change should eliminate
hanging when taking snapshots or doing background fsck on
very large filesystems.
Hopefully, one day it will be possible to cache filesystem
metadata in the VM cache as is done with file data. As it
stands, only the buffer cache can be used which limits total
metadata storage to about 20Mb no matter how much memory is
available on the system. This rather small memory gets badly
thrashed causing a lot of extra I/O. For example, taking a
snapshot of a 1Tb filesystem minimally requires about 35,000
write operations, but because of the cache thrashing (we only
have about 350 buffers at our disposal) ends up doing about
237,540 I/O's thus taking twenty-five minutes instead of four
if it could run entirely in the cache.
Reported by: Attila Nagy <bra@fsn.hu>
Sponsored by: DARPA & NAI Labs.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
these fields instead.
- Hold the vnode interlock while locking bufs on the clean/dirty queues.
This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
BUF_LOCK with a LK_TIMEFAIL to a single lock.
Reviewed by: arch, mckusick
- Get rid of the useless atop() / pmap_phys_address() detour. The
device mmap handlers must now give back the physical address
without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned
int. Now we properly pass a vm_offset_t * and expect it to be
filled by the mmap handler when the mapping was successful. The
mmap handler must now return 0 when successful, any other value
is considered as an error. Previously, returning -1 was the only
way to fail. This change thus accidentally fixes some devices
which were bogusly returning errno constants which would have been
considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's
no longer used.
- Convert all the d_mmap_t consumers to the new API.
I'm still not sure wheter we need a __FreeBSD_version bump for this,
since and we didn't guarantee API/ABI stability until 5.1-RELEASE.
Discussed with: alc, phk, jake
Reviewed by: peter
Compile-tested on: LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on: i386
time and there's no indication that it will improve anytime soon.
By removing support for SimOS it is possible to build LINT on
Alpha, which is considered more important at the moment.
Not objected to on: alpha@
- Changed VM_MAXUSER_ADDRESS to be defined in terms of PTDPTDI. In order for
assumptions about the recursive page table map to work it must be the base
of the recursive map. Any pte offset that's not NPTEPG will break these
assumptions.
Sponsored by: DARPA, Network Associates Laboratories
flag that can be marked on each symmetric op
o eliminate hw.ubsec.maxbatch and hw.ubsec.maxaggr since they are not
needed anymore
o change ubsec_feed to return void instead of int since zero is always
returned and noone ever looked at the return value
instead of allocating another page of kva and mapping it in again. This was
likely an oversight in revision 1.174 (cut and paste from pmap_pinit).
Discussed with: peter, tegge
Sponsored by: DARPA, Network Associates Laboratories
Reading the PCI config space with the wrong (larger) size is not
a problem in this case, but writing can be as it clobbers unrelated
registers. In this case the clobbering is for reserved fields, which
too is mostly harmless... for now. Hence, this change is mostly
preventive in nature.
page directory.
- Use these instead of the magic constants 1 or PAGE_SIZE where appropriate.
There are still numerous assumptions that the page directory is exactly
1 page.
Sponsored by: DARPA, Network Associates Laboratories
without waiting, since they are called from a system-call context only.
This appears to fix all sorts of problems with open("/dev/dsp", O_WRONLY)
randomly returning ENXIO.
Found by: cognet
Security improvements:
- Increase the size of each syncookie secret from 32 to 128 bits
in order to make brute force attacks on the secrets much more
difficult.
- Always return the lowest order dword from the MD5 hash; this
allows us to expose 2 more bits of the cookie and makes ACK
floods which seek to guess the cookie value more difficult.
Performance improvements:
- Increase the lifetime of each syncookie from 4 seconds to 16
seconds. This increases the usefulness of syncookies during
an attack.
- From Yahoo!: Reduce the number of calls to MD5Update; this
results in a ~17% increase in cookie generation time here.
Reviewed by: hsu, jayanth, jlemon, nectar
MFC After: 15 seconds
in massive locking issues on diskless systems.
It is also not clear that this sysctl is non-dangerous in its
requirements for locked down memory on large RAM systems.
type M_STRING, now defined in malloc.h. Useful when string parsing
must occur using the kernel strsep() and we want to avoid toasting
the source string.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
should be done in crypto_done rather than in the callback thread
o use this flag to mark operations from /dev/crypto since the callback
routine just does a wakeup; this eliminates the last unneeded ctx switch
o change CRYPTO_F_NODELAY to CRYPTO_F_BATCH with an inverted meaning
so "0" becomes the default/desired setting (needed for user-mode
compatibility with openbsd)
o change crypto_dispatch to honor CRYPTO_F_BATCH instead of always
dispatching immediately
o remove uses of CRYPTO_F_NODELAY
o define COP_F_BATCH for ops submitted through /dev/crypto and pass
this on to the op that is submitted
Similar changes and more eventually coming for asymmetric ops.
MFC if re gives approval.
branch targets that are too far apart for the BRADDR relocation.
This is caused by the branch prediction optimizationi in the atomic
inlines here, because they jump across sections.
The workaround is to suppress jumping to a different section when
compiling LINT. To generate correct code in that case, the section
directives are replaced by a branch and a label to deal with the
fall-through case. Reasonably good C compilers will optimize this
away anyway, so the end result isn't really that bad.
packets coming out of a GIF tunnel are re-processed by ipfw, et. al.
By default they are not reprocessed. With the option they are.
This reverts 1.214. Prior to that change packets were not re-processed.
After they were which caused problems because packets do not have
distinguishing characteristics (like a special network if) that allows
them to be filtered specially.
This is really a stopgap measure designed for immediate MFC so that
4.8 has consistent handling to what was in 4.7.
PR: 48159
Reviewed by: Guido van Rooij <guido@gvr.org>
MFC after: 1 day
o Make DXS3 the primary playback channel. It may be the only
universally supported channel with the assorted revisions of this
chipset.
o Add sysctl and handler for enabling s/pdif output from DXS3.
as opposed to one after the other. This is faster in both -CURRENT
and -STABLE. Additionally, there is less code duplication for
error-checking.
One thing to note is that this code seems to return(1) when no buffers
are available; perhaps ENOBUFS should be the correct return value?
Partially submitted & tested by: Hiten Pandya <hiten@unixdaemons.com>
MFC after: 1 week
and enable it by default, with a limit of 16.
At the same time, tweak maxfragpackets downward so that in the worst
possible case, IP reassembly can use only 1/2 of all mbuf clusters.
MFC after: 3 days
Reviewed by: hsu
Liked by: bmah
a snapshot. As part of taking a snapshot of a filesystem, the kernel
builds up a list of the filesystem metadata (such as the cylinder
group bitmaps) that are contained in the snapshot. When doing a
copy-on-write check, the list is first consulted. If the block being
written is found on the list, then the full snapshot lookup can be
avoided. Besides providing an important performance speedup this
check also avoids a potential deadlock between the code creating
the snapshot and the bufdaemon trying to cleanup snapshot related
buffers. This fix creates a temporary list containing the key
metadata blocks that can cause the deadlock. This temporary list
is used between the time that the snapshot is first enabled and the
time that the fully complete list is built.
Reported by: Attila Nagy <bra@fsn.hu>
Sponsored by: DARPA & NAI Labs.
is being taken from panicing with either "freeing free block" or
"freeing free inode". The problem arises when the snapshot code
is scanning the filesystem looking for inodes with a reference
count of zero (e.g., unlinked but still open) so that it can
expunge them from its view. If it encounters a reclaimed vnode
and has to restart its scan, then it will panic if it encounters
and tries to free an inode that it has already processed. The fix
is to check each candidate inode to see if it has already been
processed before trying to delete it from the snapshot image.
Sponsored by: DARPA & NAI Labs.
that they convert to 64-bit values before shifting rather than
afterwards. Once fixed, they can be used rather than inline expanded.
Sponsored by: DARPA & NAI Labs.
Retire the "d_dump_t" and use the "dumper_t" type instead.
Dumper_t takes a void * as first arg which is more general than the
dev_t taken by d_dump_t. (Remember: we could have net-dumpers if
somebody wrote us one!)
Define the convention for GEOM controlled disk devices to be that the
first argument to the dumper function is the struct disk pointer.
Change device drivers accordingly.
Change the argument to disk_destroy() to be the same struct disk * as
disk_create() takes.
This enables drivers to ignore the (now) bogus dev_t which disk_create()
returns.
a number of related problems along the way.
- Automatically detect CDROM drives that can't handle 6 byte mode
sense and mode select, and adjust our command size accordingly.
We have to handle this in the cd(4) driver (where the buffers are
allocated), since the parameter list length is different for the
6 and 10 byte mode sense commands.
- Remove MODE_SENSE and MODE_SELECT translation removed in ATAPICAM
and in the umass(4) driver, since there's no way for that to work
properly.
- Add a quirk entry for CDROM drives that just hang when they get a 6
byte mode sense or mode select. The reason for the quirk must be
documented in a PR, and all quirks must be approved by
ken@FreeBSD.org. This is to make sure that we fully understand why
each quirk is needed. Once the CAM_NEW_TRAN_CODE is finished, we
should be able to remove any such quirks, since we'll know what
protocol the drive speaks (SCSI, ATAPI, etc.) and therefore whether
we should use 6 or 10 byte mode sense/select commands.
- Change the way the da(4) handles the no_6_byte sysctl. There is
now a per-drive sysctl to set the minimum command size for that
particular disk. (Since you could have multiple disks with
multiple requirements in one system.)
- Loader tunable support for all the sysctls in the da(4) and cd(4)
drivers.
- Add a CDIOCCLOSE ioctl for cd(4) (bde pointed this out a long
time ago).
- Add a media validation routine (cdcheckmedia()) to the cd(4)
driver, to fix some problems bde pointed out a long time ago. We
now allow open() to succeed no matter what, but if we don't detect
valid media, the user can only issue CDIOCCLOSE or CDIOCEJECT
ioctls.
- The media validation routine also reads the table of contents off
the drive. We use the table of contents to implement the
CDIOCPLAYTRACKS ioctl using the PLAY AUDIO MSF command. The
PLAY AUDIO TRACK INDEX command that we previously used was
deprecated after SCSI-2. It works in every SCSI CDROM I've tried,
but doesn't seem to work on ATAPI CDROM drives. We still use the
play audio track index command if we don't have a valid TOC, but
I suppose it'll fail anyway in that case.
- Add _len() versions of scsi_mode_sense() and scsi_mode_select() so
that we can specify the minimum command length.
- Fix a couple of formatting problems in the sense printing code.
MFC after: 4 weeks
OSes has probably caused more problems than it ever solved. Allow the
user to retire the old behavior by specifying their own privileged
range with,
net.inet.ip.portrange.reservedhigh default = IPPORT_RESERVED - 1
net.inet.ip.portrange.reservedlo default = 0
Now you can run that webserver without ever needing root at all. Or
just imagine, an ftpd that can really drop privileges, rather than
just set the euid, and still do PORT data transfers from 20/tcp.
Two edge cases to note,
# sysctl net.inet.ip.portrange.reservedhigh=0
Opens all ports to everyone, and,
# sysctl net.inet.ip.portrange.reservedhigh=65535
Locks all network activity to root only (which could actually have
been achieved before with ipfw(8), but is somewhat more
complicated).
For those who stick to the old religion that 0-1023 belong to root and
root alone, don't touch the knobs (or even lock them by raising
securelevel(8)), and nothing changes.
Some drives seem to be confused by simultaneous probes.
Tested by: marcel
As a side effect, logical units whose lun is greater than 0 might not be
probed correctly if the lun of 0 doesn't exist in the target because
CAM doesn't scan such luns.
I have a SCSI-FireWire bridge which maps SCSI-ID to LUN and it is an
example of such targets.
dev_t to the method functions.
The dev_t can still be found at struct consdev *->cn_dev.
Add a void *cn_arg element to struct consdev which the drivers can use
for retrieving their softc.
This moves all chipset specific code to a new file 'ata-chipset.c'.
Extensive use of tables and pointers to avoid having the same switch
on chipset type in several places, and to allow substituting various
functions for different HW arch needs.
Added PIO mode setup and all DMA modes.
Support for all known SiS chipsets. Thanks to Christoph Kukulies for
sponsoring a nice ASUS P4S8X SiS648 based board for this work!
Tested on: i386, PC98, alpha and sparc64
This moves all chipset specific code to a new file 'ata-chipset.c'.
Extensive use of tables and pointers to avoid having the same switch
on chipset type in several places, and to allow substituting various
functions for different HW arch needs.
Added PIO mode setup and all DMA modes.
Support for all known SiS chipsets. Thanks to Christoph Kukulies for
sponsoring a nice ASUS P4S8X SiS648 based board for this work!
Tested on: i386, PC98, alpha and sparc64
In devsw() return dead_cdevsw instead of NULL in case the dev_t does not
have a si_devsw.
This may improve our survival chances with devices which go away unexpectedly.
compile-time constants). That is, a "bucket" now is not necessarily
a page-worth of mbufs or clusters, but it is MBUF_BUCK_SZ, CLUS_BUCK_SZ
worth of mbufs, clusters.
o Rename {mbuf,clust}_limit to {mbuf,clust}_hiwm and introduce
{mbuf,clust}_lowm, which currently has no effect but will be used
to set the low watermarks.
o Fix netstat so that it can deal with the differently-sized buckets
and teach it about the low watermarks too.
o Make sure the per-cpu stats for an absent CPU has mb_active set to 0,
explicitly.
o Get rid of the allocate refcounts from mbuf map mess. Instead,
just malloc() the refcounts in one shot from mbuf_init()
o Clean up / update comments in subr_mbuf.c
used to share resource limits between rfork threads, but never was.
Removing it makes resource limit locking much simpler -- only the current
process can change the contents of the structure that p_limit points to.
support for Tekram DC395U2W cards.
Add a fix submitted by joerg@ to correctly report some errors to CAM.
Use bus_dma instead of the remaining vtophys().
reference counter array for mbuf clusters. I don't know
how this got past early testing nor how it survived so long
without getting caught. If anyone was seeing really really
bizarre memory corruption in a few mbufs this would be why.
field for holding driver-dependant data. Instead of putting the pointer
to the driver command struct in there, take advantage of these structs
being a (virtually) contiguous array and just put the array index in the
field.
control block. Allow the socket and tcpcb structures to be freed
earlier than inpcb. Update code to understand an inp w/o a socket.
Reviewed by: hsu, silby, jayanth
Sponsored by: DARPA, NAI Labs
routine does not require a tcpcb to operate. Since we no longer keep
template mbufs around, move pseudo checksum out of this routine, and
merge it with the length update.
Sponsored by: DARPA, NAI Labs
that a command completion happened, all further processing is deferred to
a taskqueue. The taskqueue itself runs implicetely under Giant, but we
already used a taskqueue for the biodone() processing, so this at least
saves the contesting of Giant in the interrupt handler.
retain symetry with aac_alloc_commans(). Since aac_alloc_commands()
allocates fib maps and places them onto the fib lists, aac_free_commands()
should reverse those operations.
o Combine two ifs with the same body with an ||.
o Switch from uintptr_t to uint32_t for fib map load operations.
The target is a uint32_t so using this type for the map load call
avoids an extra cast. uintptr_t should only be used when you need
an "int sized the same as the machine's poitner size" which is not
the case here.
o Removed the commented out M_WAITOK flag in the allocation in
aac_alloc_commands(). The kernel will only block in the allocator
if it can grow the size of the kernel. This usually results in a
page-out which could involve this aac device. Thus, sleeping here
could deadlock the machine. Assuming this operation is occurring outside
of attach time, we have enough fibs to operate anyway, so waiting for
fibs to free up is okay if not optimal.
o In aac_alloc_commands(), if we cannot dmamem_alloc additional fib
space, free the fib map.
o In aac_alloc_commands(), if we cannot create per-command dmamaps, don't
lose track of the fib map that is mapping all of the commands that we
have already released into the free pool. Instead, just cut out of
the loop and modify aac_free_commands to not attempt to free maps that
have not been allocated.
o Don't use a magic number when pre-allocating fibs.
o Use PAGE_SIZE to allocate in page sized chunks instead of an
architecture specific constant.
Submitted by: gibbs
- delay acks for T/TCP regardless of delack setting
- fix bug where a single pass through tcp_input might not delay acks
- use callout_active() instead of callout_pending()
Sponsored by: DARPA, NAI Labs
wish the busdma APIs were more consistent accross architectures.
We should probably move all the other DMA map creations in
xl_attach() where we can really handle them failing, since
xl_init() is void and shouldn't fail.
Pointy hat to: mux
Tested by: Anders Andersson <anders@hack.org>
One of the vnodes is on different mount and is possibly on a different
kind of filesystem; treating it as an smbfs vnode then writing to it
will probably corrupt it.
PR: 48381
MFC after: 1 month
group number properly based on the board id. Perform dummy reads of
registers after writing to flush the hardware write buffers.
This gets the soon to be committed zs attachment working.
Minor CIS resource allocation code cleanup
Remove some fairly useless debug writes.
This finishes the work to move as much cardbus code as possible into
pci. We wind up removing 800-odd lines from cardbus.c: we go from
1285 to 400 lines.
Reviewed by: mdodd
bus_dmamap_load() it.
- Make it so reusing mbufs when we can't allocate (or map) new ones
actually works. We were previously trying to reuse a mbuf which
was already bus_dmamap_unload()'ed.
Reviewed by: silby
- Fix memory leak in detaching.
- Initialize fc->status to other than FWBUSREST.
* fwohci.c
- Ignore BUS reset events while BUS reset phase. We can't clear that flag
during bus reset phase.
UltraSPARCs, and an eeprom attachment for fhc, which allows the date
to be set properly on these machines. Central is a wierd bus which
seems to only ever have 1 fhc attached to it. FHC (FireHose Controller)
is another wierd bus with various things on it depending where its attached.
The fhc attached to central has eeprom and zs, and the fhcs which attach
directly to nexus have simm-status, environment and other nodes, none of
which I'll probably ever have documentation for.
Thanks to Ade Lovett for providing access to an 8 cpu e4500.
#if'ed out for a while. Complete the deed and tidy up some other bits.
We need to be able to call this stuff from outer edges of interrupt
handlers for devices that have the ISR bits in pci config space. Making
the bios code mpsafe was just too hairy. We had also stubbed it out some
time ago due to there simply being too much brokenness in too many systems.
This adds a leaf lock so that it is safe to use pci_read_config() and
pci_write_config() from interrupt handlers. We still will use pcibios
to do interrupt routing if there is no acpi.. [yes, I tested this]
Briefly glanced at by: imp
is encoded in the PCI BAR. The latter is more reliable.
This allows the sio/modem function of the Xircom RealPort ethernet+modem
card to work. Note that there still seem to be issues with sio_pci not
releasing resources on detach.
pci busses implement this.
Also minor comment smithing in cardbus. Fix copyright to this year
with my name on it since I've been doing a lot to this file.
Reviewed by: jhb
on the fly and read into userland one at a time, this costs very
little total memory. The pnpinfo sizes of pccard is more than 64
bytes due to the length of the strings that man cards have in their
CIS.
- Don't initiate bus reset even if probe failed for some nodes to prevent
infinite bus reset loop.
Problem Reported by: Pierre Beyssac <pb@fasterix.frmug.org>
- Protect timeout routine with splfw() for 4-stable.
* sbp.c
- Make sure to release devq when start request.
cr_uid.
Note: we do not have socheckuid() in RELENG_4, ip_fw2.c uses its
own macro for a similar purpose that is why ipfw2 in RELENG_4 processes
uid rules correctly. I will MFC the diff for code consistency.
Reported by: Oleg Baranov <ol@csa.ru>
Reviewed by: luigi
MFC after: 1 month
sched_lock around accesses to p_stats->p_timer[] to avoid a potential
race with hardclock. getitimer(), setitimer() and the realitexpire()
callout are now Giant-free.
add a signal to a mailbox's pending set.
- Add a new function, thread_signal_upcall(), this causes the current thread
to upcall so that we can deliver pending signals.
Reviewed by: mini
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.
Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@
Wearing said pointy hat, correct the oversight and hope nobody
notices.
# this should make xircom modems happier to detach once other bugs with
# the cardbus layer are fixed.
Noticed by: scottl
Conical Hat to: imp
queue lock already held.
- In getblk() and flushbufqueues() use bremfreel() while we still have the
buf queue lock held to keep the lists consistent.
- Add LK_NOWAIT to two cases where we're essentially asserting that the bufs
are not locked while acquiring the locks. This will make sure that we get
the appropriate panic() and not another one for sleeping with a lock held.
o Use the common pci_* routines in preference to the copied and hacked
routines from an ancient pci.c.
This saves 509 lines in cardbus.c. More savings to follow when I
convert the resource code over. In the past when I've done this the
resource code conversion breaks cardbus in subtle ways so I'm doing a
1/2 way checkpoint this time. cardbus still works for me the same as
it did before.
It also looks like cardbus devices now show up as pci bus devices to
pciconf -l, but maybe that was happening before.
Inspired by a patch from Justin Gibbs many moons ago. When he
finishes his kobj multiple inheritance work, we can transition the
finished version of this work to that fairly easily.
- Mark the process leader as having an advisory lock
- Check if process leader is marked as having advisory lock when
closing file
- Check that file is still open after lock has been obtained
- Don't allow file descriptor table sharing between processes
with different leaders
PR: 10265
Reviewed by: alfred
is already in pages, so we should not convert from bytes to pages.
The result of this bug was bad scaling of the VHPT relative to the
available memory.
Submitted by: Arun Sharma <arun@sharma-home.net>
It's unnecessary for two reasons: (1) Giant is at present already held in
such cases and (2) our various implementations of pmap_growkernel() look to
be MP safe. (For example, for sparc64 the proof of (2) is trivial.)
freebsd4_sigaction() and osigaction() instead of around the whole
body of those functions. They now no longer hold Giant around calls
to copyin() and copyout(), and it is slightly more obvious what
Giant is protecting.
barrier between free'ing filedesc structures. Basically if you want to
access another process's filedesc, you want to hold this mutex over the
entire operation.
values for the initial inode generation numbers in newfs and for
newly allocated inode generation numbers in the kernel.
Submitted by: Theo de Raadt <deraadt@cvs.openbsd.org>
Sponsored by: DARPA & NAI Labs.
opposed to returning the top of the old chain when there was one and
the top of the newly allocated chain if there was no old chain.
Actually, it should be noted that prior to this fix, although the
comment above m_getm() advertised that m_getm() would return the
top of the old chain (if an old chain was being passed in) it
actually [wrongly] was returning the tail mbuf in the old chain
instead. This is a bug but since the one use of m_getm() in
the tree luckily did not depend on the behavior, it happened
to work out without notice.
Harti Brandt pointed out that the advertised behavior was actually
not the real behavior and so this change makes m_getm() ALWAYS
return the newly allocated chain (and fixes the comment). This
is less confusing and is the best course of action as then the
caller is always able to have both a reference to the top of
the original chain (because it's passing it in in the call) and
a reference to the newly attached chain. Although the API is
slightly modified, I don't think that any third-party code uses
m_getm() and if it does, it surely can't be working properly
because the old behavior was bogus.
API bug pointed out by: Harti Brandt <brandt@fokus.fraunhofer.de>
To fix scsi, don't wait for ithreads if we're dumping, it makes the
debugger sad.
To fix ata, use what appears to be a polling method if we're dumping,
I stole this from tmm but added code to ensure that this change is
only in effect while dumping.
Tested by: des
for the agp module, and add agp to the list of modules to compile for alpha.
Add an alpha_mb() to agp_flush_cache for alpha -- it's not correct but may
improve the situation, and it's what linux and NetBSD do.
we can have additional different types of bridges.
o remove now bogus comment.
o Don't clear CARD_OK when we can't attach a card.
o minor style nits
# this make kldload of cardbus drivers work for me when the card is
# present on boot.
o chip_name arrays ifdef'd out.
o use the OLDCARD-like get/put functions so we can support differnt types
of mappings.
o Write the beggings of is this a valid exca device and introduce more
chipset support.
# this is partially a wip, but also needed because some other cahnges I've
# made require some of these changes.
previous revision fixed the panic, I found the problem exits in
another part of the function by investigating the crom dump sent by him.
The search was started in the middle of bus info block and the
routine misunderstood the EUI64 as a crom entry. This problem is fixed.
PR: kern/48129
Fix incorrect type mask included in a logical unit number and check
the validity of the lun.