Commit graph

4932 commits

Author SHA1 Message Date
Kelly Yancey 3316a80bd1 Convert hit and miss counters to unsigned values. Surely negative values
for either does not make sense.

PR:		(one small part of) 19720
2002-06-10 22:40:26 +00:00
John Baldwin d0c149fce8 We no longer need to acqure Giant in ast() for ktrpsig() in postsig() now
that ktrace no longer needs Giant.
2002-06-07 05:43:40 +00:00
John Baldwin 374a15aa55 - trapsignal() no longer needs to acquire Giant for ktrpsig().
- Catch up to new ktrace API.
2002-06-07 05:43:02 +00:00
John Baldwin af300f2367 - Proper locking for p_tracep and p_traceflag.
- Catch up to new ktrace API.
2002-06-07 05:42:25 +00:00
John Baldwin 6c84de02e0 Properly lock accesses to p_tracep and p_traceflag. Also make a few
ktrace-only things #ifdef KTRACE that were not before.
2002-06-07 05:41:27 +00:00
John Baldwin 9ba7fe1b76 - Catch up to new ktrace API.
- ktrace trace points in msleep() and cv_wait() no longer need Giant.
2002-06-07 05:39:16 +00:00
John Baldwin 60a9bb197d Catch up to changes in ktrace API. 2002-06-07 05:37:18 +00:00
John Baldwin ea3fc8e4cd Overhaul the ktrace subsystem a bit. For the most part, the actual vnode
operations to dump a ktrace event out to an output file are now handled
asychronously by a ktrace worker thread.  This enables most ktrace events
to not need Giant once p_tracep and p_traceflag are suitably protected by
the new ktrace_lock.

There is a single todo list of pending ktrace requests.  The various
ktrace tracepoints allocate a ktrace request object and tack it onto the
end of the queue.  The ktrace kernel thread grabs requests off the head of
the queue and processes them using the trace vnode and credentials of the
thread triggering the event.

Since we cannot assume that the user memory referenced when doing a
ktrgenio() will be valid and since we can't access it from the ktrace
worker thread without a bit of hassle anyways, ktrgenio() requests are
still handled synchronously.  However, in order to ensure that the requests
from a given thread still maintain relative order to one another, when a
synchronous ktrace event (such as a genio event) is triggered, we still put
the request object on the todo list to synchronize with the worker thread.
The original thread blocks atomically with putting the item on the queue.
When the worker thread comes across an asynchronous request, it wakes up
the original thread and then blocks to ensure it doesn't manage to write a
later event before the original thread has a chance to write out the
synchronous event.  When the original thread wakes up, it writes out the
synchronous using its own context and then finally wakes the worker thread
back up.  Yuck.  The sychronous events aren't pretty but they do work.

Since ktrace events can be triggered in fairly low-level areas (msleep()
and cv_wait() for example) the ktrace code is designed to use very few
locks when posting an event (currently just the ktrace_mtx lock and the
vnode interlock to bump the refcoun on the trace vnode).  This also means
that we can't allocate a ktrace request object when an event is triggered.
Instead, ktrace request objects are allocated from a pre-allocated pool
and returned to the pool after a request is serviced.

The size of this pool defaults to 100 objects, which is about 13k on an
i386 kernel.  The size of the pool can be adjusted at compile time via the
KTRACE_REQUEST_POOL kernel option, at boot time via the
kern.ktrace_request_pool loader tunable, or at runtime via the
kern.ktrace_request_pool sysctl.

If the pool of request objects is exhausted, then a warning message is
printed to the console.  The message is rate-limited in that it is only
printed once until the size of the pool is adjusted via the sysctl.

I have tested all kernel traces but have not tested user traces submitted
by utrace(2), though they should work fine in theory.

Since a ktrace request has several properties (content of event, trace
vnode, details of originating process, credentials for I/O, etc.), I chose
to drop the first argument to the various ktrfoo() functions.  Currently
the functions just assume the event is posted from curthread.  If there is
a great desire to do so, I suppose I could instead put back the first
argument but this time make it a thread pointer instead of a vnode pointer.

Also, KTRPOINT() now takes a thread as its first argument instead of a
process.  This is because the check for a recursive ktrace event is now
per-thread instead of process-wide.

Tested on:	i386
Compiles on:	sparc64, alpha
2002-06-07 05:32:59 +00:00
John Baldwin 48849938e8 Change the all locks list from a STAILQ to a TAILQ. This bloats struct
lock_object by another pointer (though all of lock_object should be
conditional on LOCK_DEBUG anyways) in exchange for an O(1) TAILQ_REMOVE()
in witness_destroy() (called for every mtx_destroy() and sx_destroy())
instead of an O(n) STAILQ_REMOVE.  Since WITNESS is so dog slow as it is,
the speed-up is worth the space cost.

Suggested by:	iedowse
2002-06-06 20:51:04 +00:00
Chad David ca18d53eae s/!SIGNOTEMPY/SIGISEMPTY/
Reviewed by: marcel, jhb, alfred
2002-06-06 19:12:41 +00:00
John Baldwin 8dcb900b62 Handle "dead" witnesses better in the situation of several short term locks
being created and destroyed without a single long-term one around to ensure
the witness associated with that group of locks stays alive.  The pipe
mutexes are an example of this group.  For a dead witness we no longer
clear the witness name.  Instead, when looking up the witness for a lock,
if a dead witness' (a witness with a refcount of 0) w_name pointer is
identical to the witness name of the lock then we revive that witness
instead of using a new witness for the lock.  This results in far fewer
dead witness objects and also better preserves locking orders over the long
term resulting in more correct lock order checking.  Note that we can't
ever derefence w_name of a dead witness since we don't know if the string
it is pointing to has been free()'d or kldunload()'d out from under us.
2002-06-06 19:04:38 +00:00
Dag-Erling Smørgrav edad3af28d Move some sysctls from the debug tree to the vfs tree. 2002-06-06 15:50:22 +00:00
Dag-Erling Smørgrav 4a357a32e0 Gratuitous whitespace cleanup. 2002-06-06 15:46:38 +00:00
Poul-Henning Kamp 53e645d90e Use "bwrbg" as description when we sleep for background writing,
"biord" was misleading in every possible way.
2002-06-06 08:56:10 +00:00
Bruce Evans 6438c894da Fixed overflow in the bounds checking in dscheck(). It assumed that
daadr_t is no larger than a long, and some other relatively harmless
things (*blush*).  Overflow for subtracting a daddr_t from a u_long
caused "truncation" of the i/o for attempts to access blocks beyond
the end of the actually cause expansion of the i/o to a preposterous
size.
2002-06-06 00:35:07 +00:00
John Baldwin 6a95e08f2f Replace thread_runnable() with thread_running() as the latter is more
accurate.

Suggested by:	julian
2002-06-04 22:36:24 +00:00
John Baldwin 7fcca6096f Optimize the adaptive mutex spin a bit. Use a simple while loop with
simple reads (and on IA32, a "pause" instruction for each interation of the
loop) to spin until either the mutex owner field changes, or the lock owner
stops executing.

Suggested by:	tanimura
Tested on:	i386
2002-06-04 21:53:48 +00:00
John Baldwin 5853d37d3b Add a private thread_runnable() macro to make the code more readable and
make the KSE diff easier to maintain.
2002-06-04 21:50:02 +00:00
Dag-Erling Smørgrav f16a5176ad ANSIfy the one remaining K&R function. 2002-06-02 21:57:28 +00:00
Dag-Erling Smørgrav e89efc02e0 Whitespace nits. 2002-06-02 21:55:58 +00:00
Dag-Erling Smørgrav 3b1f7e7de0 Add support for 'j' flag. Simplify the size modifier code and reduce code
duplication.  Also add support for 'n' specifier.

Reviewed by:	bde
2002-06-02 21:54:55 +00:00
Jens Schweikhardt 21dc7d4f57 Fix typo in the BSD copyright: s/withough/without/
Spotted and suggested by:	des
MFC after:	3 weeks
2002-06-02 20:05:59 +00:00
Mike Barcroft 6ee093fb8f Add POSIX.1-2001 WCONTINUED option for waitpid(2). A proc flag
(P_CONTINUED) is set when a stopped process receives a SIGCONT and
cleared after it has notified a parent process that has requested
notification via waitpid(2) with WCONTINUED specified in its options
operand.  The status value can be checked with the new WIFCONTINUED()
macro.

Reviewed by:	jake
2002-06-01 18:37:46 +00:00
Archie Cobbs 48d183faca Fix a bug in m_split(): the "m->m_ext.ext_size" field of an mbuf was being
set to zero. This field indicates the total space in the external buffer
and therefore should not be modified after the external buffer is added.

Add a comment warning that the mbufs returned by m_split() might be read-only.

Fix M_TRAILINGSPACE() to return zero if !M_WRITABLE(m).

Reviewed by:	freebsd-net
Obtained from:	Vernier Networks, Inc.
MFC after:	1 week
2002-05-31 22:09:57 +00:00
Dag-Erling Smørgrav 7aa57dca57 Nit: kern.ttys is of type S,xtty, not S,tty. 2002-05-31 16:11:49 +00:00
Seigo Tanimura 4cc20ab1f0 Back out my lats commit of locking down a socket, it conflicts with hsu's work.
Requested by:	hsu
2002-05-31 11:52:35 +00:00
Robert Drehmel 280759e75e - Replace the bandaid introduced in revision 1.110 with
a better solution.
 - Add braces for a ``for'' statement containing a single
   multi-line statement.
2002-05-31 09:41:09 +00:00
Poul-Henning Kamp eef633a71f Mistyped and lost a '&' in previous commit. 2002-05-30 16:26:39 +00:00
Poul-Henning Kamp fe71224650 Don't forget to factor in the boottime when we calculate PPS timestamps.
Submitted by:	Akira Watanabe <akira@myaw.ei.meisei-u.ac.jp>
2002-05-30 10:34:01 +00:00
Jeff Roberson 7181624aaa Record the file, line, and pid of the last successful shared lock holder. This
is useful as a last effort in debugging file system deadlocks.  This is enabled
via 'options DEBUG_LOCKS'
2002-05-30 05:55:22 +00:00
Julian Elischer 628855e758 CURSIG() is not a macro so rename it cursig().
Obtained from:	KSE tree
2002-05-29 23:44:32 +00:00
Julian Elischer 2d0231f5da diff reduction from KSE to keep WW-III from happenning on -current 2002-05-29 20:40:50 +00:00
Dag-Erling Smørgrav 6b658142fd Add some checks to prevent NULL dereferences.
Submitted by:	jhay
2002-05-28 14:29:56 +00:00
Maxime Henrion 8eb0098f4c Remove a duplicated vfs_freeopts() that I introduced in last
revision.
2002-05-28 13:27:55 +00:00
Dag-Erling Smørgrav 6c533ac713 Add NAI copyright. 2002-05-28 06:53:41 +00:00
Marcel Moolenaar 52183d0145 Add uuidgen(2) and uuidgen(1).
The uuidgen command, by means of the uuidgen syscall, generates one
or more Universally Unique Identifiers compatible with OSF/DCE 1.1
version 1 UUIDs.

From the Perforce logs (change 11995):

Round of cleanups:
o  Give uuidgen() the correct prototype in syscalls.master
o  Define struct uuid according to DCE 1.1 in sys/uuid.h
o  Use struct uuid instead of uuid_t. The latter is defined
   in sys/uuid.h but should not be used in kernel land.
o  Add snprintf_uuid(), printf_uuid() and sbuf_printf_uuid()
   to kern_uuid.c for use in the kernel (currently geom_gpt.c).
o  Rename the non-standard struct uuid in kern/kern_uuid.c
   to struct uuid_private and give it a slightly better definition
   for better byte-order handling. See below.
o  In sys/gpt.h, fix the broken uuid definitions to match the now
   compliant struct uuid definition. See below.
o  In usr.bin/uuidgen/uuidgen.c catch up with struct uuid change.

A note about byte-order:
        The standard failed to provide a non-conflicting and
unambiguous definition for the binary representation. My initial
implementation always wrote the timestamp as a 64-bit little-endian
(2s-complement) integral. The clock sequence was always written
as a 16-bit big-endian (2s-complement) integral. After a good
nights sleep and couple of Pan Galactic Gargle Blasters (not
necessarily in that order :-) I reread the spec and came to the
conclusion that the time fields are always written in the native
by order, provided the the low, mid and hi chopping still occurs.
The spec mentions that you "might need to swap bytes if you talk
to a machine that has a different byte-order". The clock sequence
is always written in big-endian order (as is the IEEE 802 address)
because its division is resulting in bytes, making the ordering
unambiguous.
2002-05-28 06:16:08 +00:00
Marcel Moolenaar 494eefd86b Add syscall uuidgen() for generating Univerally Unique Identifiers
(UUIDs). On ia64 UUIDs, aka GUIDs, are used by EFI and the firmware
among others. To create GUID Partition Tables (GPTs), we need to
be able to generate UUIDs.
2002-05-28 05:58:06 +00:00
Dag-Erling Smørgrav 1a149fcd67 Introduce struct xtty, used when exporting tty information to userland.
Make kern.ttys export a struct xtty rather than struct tty.  Since struct
tty is no longer exposed to userland, remove the dev_t / udev_t hack.

Sponsored by:	DARPA, NAI Labs
2002-05-28 05:40:53 +00:00
Alan Cox a739e09c6e o Remove some unnecessary casting from and add some necessary casting to
aio_suspend() and lio_listio().

Submitted by:	bde
2002-05-25 18:39:42 +00:00
Dag-Erling Smørgrav 4b4c18f861 ANSIfy (significant portions were already partly ANSIfied) 2002-05-25 15:52:53 +00:00
Dag-Erling Smørgrav b7457aabf6 Remove register. 2002-05-25 15:44:38 +00:00
Dag-Erling Smørgrav dedf14f521 Automated whitespace cleanup. 2002-05-25 15:43:06 +00:00
Jake Burkholder d2ac231616 Make the run queue parameters machine dependent. Optimize 64 bit
architectures by using a 64 bit word for the bit array which keeps
track of non-empty queues.

Reviewed by:	peter
2002-05-25 01:12:23 +00:00
Peter Wemm 34e3110c70 Fix warnings. Also, removed an unused variable that I found that was just
initialized and never used afterwards.
2002-05-24 06:06:18 +00:00
Maxime Henrion 2274ec995c Style nit, no functional changes. 2002-05-23 23:22:22 +00:00
Maxime Henrion 9ee6bf717f Slightly change the way we pass mount options to the filesystem
VFS_NMOUNT operations.

Reviewed by:	phk
2002-05-23 23:02:19 +00:00
Hajimu UMEMOTO 4b562eede1 In m_aux_delete, no need to chase beyond victim.
Submitted by:	archie
Obtained from:	KAME
2002-05-23 15:59:48 +00:00
John Baldwin cc5d39f81e Minor nit: get p pointer in msleep() from td->td_proc (where
td == curthread) rather than from curproc.
2002-05-23 04:14:18 +00:00
John Baldwin a79c98fa98 Whitespace: trim a trailing tab. 2002-05-23 04:12:28 +00:00
Dag-Erling Smørgrav db586c8b7c Make the counters uintmax_ts, and use %ju rather than %llu. 2002-05-23 03:08:42 +00:00