Commit graph

23027 commits

Author SHA1 Message Date
Robert Watson cee313c431 o Export dmmax ("Maximum size of a swap block") using SYSCTL_INT.
This removes a reason that systat requires setgid kmem.  More to
  come.
2000-11-20 00:39:04 +00:00
Scott Long affec73ebd Disable calling timeout() when doing bio. It was causing more prolems than
solving.  This will be fixed for real soon.
2000-11-19 23:46:21 +00:00
David Malone 32af0d74f0 Make sbcompress use the new M_WRITABLE macro. Previously sbcompress
could not compress into clusters. This could result in lots of
wasted clusters while recieving small packets from an interface
that uses clusters for all it's packets.

Patch is partially from BSDi (limiting the size of the copy) and
based on a patch for 4.1 by Ian Dowse <iedowse@maths.tcd.ie> and
myself.

Reviewed by:	bmilekic
Obtained From:	BSDi
Submitted by:	iedowse
2000-11-19 22:22:47 +00:00
Doug Rabson 35cd89e7e5 Convert various calls to splhigh() to disable_intr() since splhigh() is
now a no-op.
2000-11-19 12:28:42 +00:00
Doug Rabson bcc542cb4e We don't need <stddef.h> for offsetof() any more. 2000-11-19 12:26:14 +00:00
Jake Burkholder fa2fbc3dac - Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and
  softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
  CALLOUT_MPSAFE, which specifies that a callout handler does not
  need giant.  There is still no way to set this flag when
  regstering a callout.

Reviewed by:	-smp@, jlemon
2000-11-19 06:02:32 +00:00
Matthew Dillon 936524aa02 Implement a low-memory deadlock solution.
Removed most of the hacks that were trying to deal with low-memory
    situations prior to now.

    The new code is based on the concept that I/O must be able to function in
    a low memory situation.  All major modules related to I/O (except
    networking) have been adjusted to allow allocation out of the system
    reserve memory pool.  These modules now detect a low memory situation but
    rather then block they instead continue to operate, then return resources
    to the memory pool instead of cache them or leave them wired.

    Code has been added to stall in a low-memory situation prior to a vnode
    being locked.

    Thus situations where a process blocks in a low-memory condition while
    holding a locked vnode have been reduced to near nothing.  Not only will
    I/O continue to operate, but many prior deadlock conditions simply no
    longer exist.

Implement a number of VFS/BIO fixes

	(found by Ian): in biodone(), bogus-page replacement code, the loop
        was not properly incrementing loop variables prior to a continue
        statement.  We do not believe this code can be hit anyway but we
        aren't taking any chances.  We'll turn the whole section into a
        panic (as it already is in brelse()) after the release is rolled.

	In biodone(), the foff calculation was incorrectly
        clamped to the iosize, causing the wrong foff to be calculated
        for pages in the case of an I/O error or biodone() called without
        initiating I/O.  The problem always caused a panic before.  Now it
        doesn't.  The problem is mainly an issue with NFS.

	Fixed casts for ~PAGE_MASK.  This code worked properly before only
        because the calculations use signed arithmatic.  Better to properly
        extend PAGE_MASK first before inverting it for the 64 bit masking
        op.

	In brelse(), the bogus_page fixup code was improperly throwing
        away the original contents of 'm' when it did the j-loop to
        fix the bogus pages.  The result was that it would potentially
        invalidate parts of the *WRONG* page(!), leading to corruption.

	There may still be cases where a background bitmap write is
        being duplicated, causing potential corruption.  We have identified
        a potentially serious bug related to this but the fix is still TBD.
        So instead this patch contains a KASSERT to detect the problem
  	and panic the machine rather then continue to corrupt the filesystem.
	The problem does not occur very often..  it is very hard to
	reproduce, and it may or may not be the cause of the corruption
	people have reported.

Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>)
Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>
2000-11-18 23:06:26 +00:00
Matthew Dillon ef0646f9d8 Add the splvm()'s suggested in PR 20609 to protect vm_pager_page_unswapped().
The remainder of the PR is still open.

PR: kern/20609 (partial fix)
2000-11-18 21:11:23 +00:00
Matthew Dillon 279d722604 This patchset fixes a large number of file descriptor race conditions.
Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
David Malone ca89ee278e Further use of M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
Approved by:	msmith
2000-11-18 15:21:22 +00:00
David Malone 99cdf4ccb2 Add the use of M_ZERO to netgraph.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
Submitted by:	archie
Approved by:	archie
2000-11-18 15:17:43 +00:00
Søren Schmidt 374940a184 Fix a braino .. 2000-11-18 12:14:35 +00:00
Cameron Grant 612276f48c do not blindly assume 8khz is supported on open(). try for 8khz but respect
minspeed/maxspeed specified by the hw driver.

Submitted by:	Andrew Gordon <arg@arg1.demon.co.uk>
2000-11-18 03:43:04 +00:00
Boris Popov f2b1e0d206 Use vop_defaultop() instead of ntfs_bypass().
PR:		kern/22756
2000-11-18 02:47:12 +00:00
John Baldwin b6b55e27a4 Release sched_lock very briefly to give interrupts a chance to fire if we
are in softclock() for a long time.  The old code already did an
splx()/slphigh() pair here, I just missed adding in the equivalent mutex
operations on sched_lock earlier.
2000-11-18 00:21:00 +00:00
Tor Egge e5c5b82950 Don't attempt to cluster write buffers where the VMIO flag isn't set. 2000-11-17 23:40:08 +00:00
Dag-Erling Smørgrav cd085280b5 Make sure we don't cross stripe boundaries when reviving striped plexes.
This makes crash recovery work for stripe sizes that are not multiples of
DEFAULT_REVIVE_BLOCKSIZE (currently 64 kB).
While we're here, fix a few cosmetic nits.

Reviewed by:	grog
Sponsored by:	Enitel ASA (http://www.enitel.no/)
2000-11-17 23:40:01 +00:00
David E. O'Brien 363ea5eadf Fix the `make -jX' (X>1) breakage.
Based on patch submitted by:	Makoto MATSUSHITA <matusita@jp.freebsd.org>
Reviewed by:	marcel, bde
2000-11-17 21:25:15 +00:00
Jake Burkholder 7da6f97772 - Split the run queue and sleep queue linkage, so that a process
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
John Baldwin 835a748f30 - Change extra sanity checks in cpu_switch() to be conditional on INVARIANTS
instead of DIAGNOSTIC.
- Remove the p_wchan check as it no longer applies since a process may be
  switched out during CURSIG() within msleep() or mawait().
- Remove an extra sanity check only needed during the early SMPng work.
2000-11-17 17:37:43 +00:00
Ruslan Ermilov 251c176f41 mdoc(7) police: use certified section headers wherever possible. 2000-11-17 11:44:16 +00:00
Mike Smith 1d6dc22916 The default kernel filename is "kernel" again, not "kernel.ko".
Submitted by:	mckusick
2000-11-17 04:43:56 +00:00
Mike Smith 0fd18bb21b Add the 'gdt' and 'gdtd' devices for the ICP Vortex RAID controller family. 2000-11-17 01:36:34 +00:00
Brian Somers 27121ab1a4 Go back to using data_len in struct ngpppoe_init_data after discussions
with Julian and Archie.

Implement a new ``sizedstring'' parse type for dealing with field pairs
consisting of a uint16_t followed by a data field of that size, and use
this to deal with the data_len and data fields.

Written by:		Archie with some input by me
Agreed in principle by:	julian
2000-11-16 23:14:53 +00:00
John Baldwin cb799bfef9 The recent changes to msleep() and mawait() resulted in timeout() and
untimeout() not being called with Giant in those functions.  For now,
use the sched_lock to protect the callout wheel in softclock() and in
the various timeout and callout functions.

Noticed by:	tegge
2000-11-16 21:20:52 +00:00
Bill Paul da6e4b77bb When checking the device code in the probe routine, leave the chip in
16-bit mode. Technically, pcn_probe() is destructive because once the
chip goes into 32-bit mode, the only way to get it out again is a
hardware reset. And once the device is in 32-bit mode, the lnc driver
won't be able to talk to it. So if pcn_probe() is called before the
lnc probe routine, and pcn_probe() rejects the chip as one it doesn't
support, the lnc driver will be SOL.

I don't like this. I think it's a design flaw that you can't switch
the chip out of 32-bit mode once it's selected. The only 'right'
solution is for the pcn driver to support all of the PCI devices
in 32-bit mode, however I don't have samples of all the PCnet series
cards for testing.
2000-11-16 19:56:09 +00:00
Archie Cobbs 7d7a5b89a7 Add kernel option NETGRAPH_ONE2MANY. 2000-11-16 16:59:26 +00:00
Warner Losh c9527183c9 vx is now optional rather than taking a count. Reflect that in the
files.  Also a minor white space nit.

Submitted by: bde
2000-11-16 15:16:41 +00:00
Søren Schmidt 18434d5214 Put the probe verboseness behind bootverbose 2000-11-16 10:52:00 +00:00
Archie Cobbs 6e8d625628 New netgraph node type ng_one2many(4). 2000-11-16 05:58:33 +00:00
John Baldwin 20cdcc5b73 Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
Andrew Gallatin 088638dae4 remove redundant declaration of bsd_to_linux_sigset()
reviewed by: marcel
2000-11-16 02:08:40 +00:00
Andrew Gallatin b595ab370b fix glaring bugs in rt signals -- copyout the right signal mask in
linux_rt_sendsig() and restore the same signal mask linux does
in rt_sigreturn().  This gets us saving/restoring all 64-bits of the
linux sigset_t in rt signals.

Reviewed by: marcel
2000-11-16 02:07:05 +00:00
John Baldwin 92c79c7e3e Argh, add in a missing release of the sched_lock. 2000-11-16 01:16:54 +00:00
John Baldwin 95de685572 CURSIG() calls functions that acquire sleep mutexes, so it is not a good
idea to be holding the sched_lock while we are calling it.  As such,
release sched_lock before calling CURSIG() in msleep() and mawait() and
reacquire it after CURSIG() returns.

Submitted by:	witness
2000-11-16 01:07:19 +00:00
Andrew Gallatin 930a65fe47 Use the linux_connect() on alpha rather than passing directly through
to our native connect().  This is required to deal with the differences
in the way linux handles connects on non-blocking sockets.

This gets the private beta of the Compaq Linux/alpha JDK working
on FreeBSD/alpha

Approved by: marcel
2000-11-16 01:05:53 +00:00
Andrew Gallatin e652fd8417 make the fcntl() flags match what the linux/alpha port uses, not
what linux/i386 uses
2000-11-16 00:58:07 +00:00
John Baldwin b84988521c - Rename await() to mawait(). mawait() is to await() as msleep() is to
tsleep().  Namely, mawait() takes an extra argument which is a mutex
  to drop when going to sleep.  Just as with msleep(), if the priority
  argument includes the PDROP flag, then the mutex will be dropped and will
  not be reacquired when the process wakes up.
- Add in a backwards compatible macro await() that passes in NULL as the
  mutex argument to mawait().
2000-11-15 22:39:35 +00:00
John Baldwin 3ae4dd935b - Replace a KASSERT() that knew too much about mutex internals with a
mtx_assert() that ensures the mutex we release during msleep() is both
  not recursed and owned by the current process.
2000-11-15 22:30:48 +00:00
John Baldwin f33a072eb9 - Convert references from tsleep() -> msleep()
- Fix a buglet in a comment above await()
2000-11-15 22:27:38 +00:00
John Baldwin 9cce2a0c0d - Add a new macro DROP_GIANT_NOSWITCH() that is similar to DROP_GIANT()
except that it uses the MTX_NOSWITCH flag while it releases Giant via
  mtx_exit().
- Add a mtx_recursed() primitive.  This primitive should only be used on
  a mutex owned by the current process.  It will return non-zero if the
  mutex is recursively owned, or zero otherwise.
- Add two new flags MA_RECURSED and MA_NOTRECURSED that can be used in
  conjuction with MA_OWNED to control the assertion checked by mtx_assert().
- Fix some of the KTR tracepoint strings to use %p when displaying the lock
  field of a mutex, which is a uintptr_t.
2000-11-15 22:12:33 +00:00
John Baldwin 9c36c934a1 Include the right headers to get the DDB #define and the db_active variable. 2000-11-15 22:08:16 +00:00
John Baldwin 896c2303d4 - Replace some instances of sched_ithd with sched_swi in KTR tracepoints.
- Assert that Giant is not owned during the main loop of sithd_loop().
2000-11-15 22:05:23 +00:00
John Baldwin 7c06c69188 Assert that Giant is not owned during the main loop of ithd_loop(). 2000-11-15 22:03:26 +00:00
John Baldwin 59f857e4ea Declare the 'witness_spin_check' properly as a per-CPU variable in the
non-SMP case.
2000-11-15 22:02:05 +00:00
John Baldwin ecbd8e3710 Don't perform witness checks in witness_enter() during a panic. 2000-11-15 22:00:31 +00:00
John Baldwin 4b2c46fab1 Add the 'witness_spin_check' per-CPU variable. 2000-11-15 21:58:02 +00:00
John Baldwin 651c378316 - Don't acquire/release Giant during an interrupt context for machine
checks, clock interrupts, and device interrupts.
- Assert that Giant is not owned during the main loop of ithd_loop().
2000-11-15 21:56:50 +00:00
John Baldwin 22f1b34223 Make ktr_verbose a bit more useful:
- On SMP systems display the cpu number with each message
- If ktr_verbose > 1, then include the filename and line number with each
  trace message
2000-11-15 21:51:53 +00:00
Kirk McKusick 324d6bacc3 Bug fix for revision 1.14 on the replacement of CIRCLEQ with TAILQ.
Submitted by:	Warner Losh <imp@village.org>
2000-11-15 20:07:16 +00:00