Commit graph

6491 commits

Author SHA1 Message Date
Robert Watson 6b42f0a2eb Prefer the vop_rmextattr() vnode operation for removing extended
attributes from objects over vop_setextattr() with a NULL uio; if
the file system doesn't support the vop_rmextattr() method, fall
back to the vop_setextattr() method.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-22 23:03:07 +00:00
Robert Watson 77533ed2aa Expose vop_rmextattr as an explicit operation at the vnode operation
interface, rather than relying on a NULL uio for the deletion
operation.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-22 22:45:24 +00:00
Robert Watson 4b090e41ff Add an explicit credential argument to alq_open() to allow the caller to
specify what credential to use when authorizing vn_open() and later
write operations, rather than curthread->td_ucred.

When writing KTR traces to an ALQ, specify the credential of the thread
generating the sysctl request.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-22 22:28:56 +00:00
Poul-Henning Kamp 3b6d965263 Add a f_vnode field to struct file.
Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.

By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.

At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.
2003-06-22 08:41:43 +00:00
Ian Dowse adef9265ef When DDB is active, always send printf() output directly to the
console, even if there is a TIOCCONS console tty. We were already
doing this after a panic, but it's also useful when entering DDB
for some other reason too.
2003-06-22 03:20:24 +00:00
Ian Dowse d29bf12ff8 Use a new message buffer `consmsgbuf' to forward messages to a
TIOCCONS console (e.g. xconsole) via a timeout routine instead of
calling into the tty code directly from printf(). This fixes a
number of cases where calling printf() at the wrong time (such as
with locks held) would cause a panic if xconsole is running.

The TIOCCONS message buffer is 8k in size by default, but this can
be changed with the kern.consmsgbuf_size sysctl. By default, messages
are checked for 5 times per second. The timer runs and the buffer
memory remains allocated only at times when a TIOCCONS console is
active.

Discussed on:	freebsd-arch
2003-06-22 02:54:33 +00:00
Ian Dowse 4784a46912 Replace the code for reading and writing the kernel message buffer
with a new implementation that has a mostly reentrant "addchar"
routine, supports multiple message buffers in the kernel, and hides
the implementation details from callers.

The new code uses a kind of sequence number to represend the current
read and write positions in the buffer. This approach (suggested
mainly by bde) permits the read and write pointers to be maintained
separately, which reduces the number of atomic operations that are
required. The "mostly reentrant" above refers to the way that while
it is now always safe to have any number of concurrent writers,
readers could see the message buffer after a writer has advanced
the pointers but before it has witten the new character.

Discussed on:	freebsd-arch
2003-06-22 02:18:31 +00:00
Jeff Roberson 1a7a9d0ec2 - lticks was erroneously being updated in sched_pctcpu(). This was causing
us to skip the pctcpu_update() call which lead to inaccurate cpu usage
   statistics for processes that didn't run often.
2003-06-21 02:31:49 +00:00
Jeff Roberson 665cb285a8 - Don't allow nice to have such a large effect on priority. This was
causing poor interactive performance while unnice processes were running.
   The new scheme still allows nice to have an effect on priority but it is
   not as dramatic as the effect of the interactivity score.
2003-06-21 02:22:47 +00:00
Bosko Milekic b2b417bb41 Fix a divide-by-zero on kern.log_wakeups_per_second tunable.
Submitted by: Christian S.J. Peron <maneo@bsdpro.com>
PR: kern/53557
2003-06-20 22:18:38 +00:00
Stefan Eßer c2ef4dd48a Add comment about **vpp being special-cased in vnode_if.awk (1.38) 2003-06-20 12:24:06 +00:00
David Xu ab78d4d641 cpu_set_upcall_kse needs to access userspace, release schedule lock
before calling it for bound thread. To avoid this problem, change
thread_schedule_upcall to not put new thread on run queue, let caller
do it, so we can tweak the new thread before setting it to run.

Reported by: pho
2003-06-20 09:12:12 +00:00
Poul-Henning Kamp 166400b7e6 Don't put callout_lock under #ifdef DIAGNOSTIC despite the fact that it
works anyway.
2003-06-20 08:39:04 +00:00
Poul-Henning Kamp 568733688b Initialize b_saveaddr when we hand out buffers 2003-06-20 08:26:38 +00:00
Poul-Henning Kamp ce6912c420 Crude but efficient:
#ifdef DIAGNOSTIC hold a mutex while calling callout's so that we hear
about it if they sleep.
2003-06-20 08:07:15 +00:00
Poul-Henning Kamp eaaca5deee Don't (re)initialize f_gcflag to zero.
Move initialization of DTYPE_VNODE specific field f_seqcount into
the DTYPE_VNODE specific code.
2003-06-20 08:02:30 +00:00
David Xu 062cf543fc When a STOP signal is being sent to a process, it is possible all
threads in the process have already masked the signal, so job control
is delayed. But later a thread unmasking the STOP signal should enable
job control, so in issignal(), scanning all threads in process to see
if we can direct suspend some of them, not just suspend current thread.
2003-06-20 03:36:45 +00:00
David Xu 8b56079e2b Fix typo. td should be td0. 2003-06-20 01:56:28 +00:00
Alfred Perlstein bab88630ba Unlock the struct file lock before aquiring Giant, otherwise
we can deadlock because of lock order reversals.  This was not
caught because Witness ignores pool mutexes right now.

Diagnosis and help: truckman
Noticed by: pho
2003-06-19 18:13:07 +00:00
Mike Silbersack b083ea5114 Add a ratelimited message of the form
"maxproc limit exceeded by uid %i, please see tuning(7) and login.conf(5)."

Which will be triggered whenever a user hits his/her maxproc limit or
the systemwide maxproc limit is reached.

MFC after:	1 week
2003-06-19 05:57:25 +00:00
Don Lewis 6084b6c9d5 FILE_LOCK() uses a pool mutex, as does the vnode v_vnlock. Since pool
mutexes are supposed to only be used as leaf mutexes, and what appear
to be separate pool mutexes could be aliased together, it is bad idea
for a thread to attempt to hold two pool mutexes at the same time.

Slightly rearrange the code in kern_open() so that FILE_UNLOCK() is
called before calling VOP_GETVOBJECT(), which will grab the v_vnlock
mutex.
2003-06-19 04:10:56 +00:00
Mike Silbersack 4d7dfc31b8 Add a rate limited message reporting when kern.maxfiles is exceeded,
reporting who did it.

Also, fix a style bug introduced in the previous change.

MFC after:	1 week
2003-06-19 04:07:12 +00:00
Don Lewis 8d5f9131fc VOP_GETVOBJECT() wants to be called with the vnode lock held. 2003-06-19 03:55:01 +00:00
Poul-Henning Kamp 2db4b023bb Introduce a new flag on a file descriptor: DFLAG_SEEKABLE and use that
rather than assume that only DTYPE_VNODE is seekable.
2003-06-18 19:53:59 +00:00
Mike Silbersack 438f085b2f Reserve the last 5% of file descriptors for root use. This should allow
systems to fail more gracefully when a file descriptor exhaustion situation
occurs.

Original patch by:	David G. Andersen <dga@lcs.mit.edu>
PR:			45353
MFC after:		1 week
2003-06-18 18:57:58 +00:00
Poul-Henning Kamp 7c2d2efd58 Initialize struct fileops with C99 sparse initialization. 2003-06-18 18:16:40 +00:00
Jeff Roberson d07ac847ef - Use a more robust mechanism for determining whether or not a kse is on a
kseq.
2003-06-17 19:49:18 +00:00
Scott Long 04d2f20f6b Drop the proc lock around SYSCTL_OUT in the no-threads case.
Submitted by:	truckman
2003-06-17 19:14:00 +00:00
Jeff Roberson 7cd0f83355 - Temporarily patch a problem where the interact score could be negative
because the run time exceeds the largest value a signed int can hold.
   The real solution involves calculating how far we are over the limit.
   To quickly solve this problem we loop removing 1/5th of the current value
   until it falls below the limit.  The common case requires no passes.
2003-06-17 10:21:34 +00:00
Jeff Roberson 4b60e3242e - Add a new function "sched_interact_update()" that scales back the sleep
and run time.
 - Scale the sleep and run time back via sched_interact_update() in more
   places.  This is to keep the statistic more accurate.
 - Charge a parent one tick for forking a child.
 - Add only the run time and not the sleep time to the parents kg when a
   thread exits.  This allows us to give a penalty for having an expensive
   thread exit but does not give a bonus for having an interactive thread
   exit.
 - Change the SLP_RUN_THROTTLE to limit us to 4/5th and not 1/2.
 - Change the SLP_RUN_MAX to two seconds.  This keeps bursty interactive
   applications like mozilla and openoffice in the interactive range even
   through expensive tasks.
 - Recalculate the slice after every sleep.  This ensures that once a task
   has been marked interactive it only has a slice of 1 at the risk of
   giving tasks that sleep for a very brief period a longer time slice.
2003-06-17 06:39:51 +00:00
Mike Silbersack 51710a4597 Hide the m_defrag* statistics under MBUF_STRESS_TEST, there seems
to be no need to see them in the general case (and they aren't
smp-safe anyway.)

Suggested by:	hmp
MFC after:	1 week
2003-06-17 02:34:40 +00:00
David Xu 4184d79115 Forgot to commit code to disable creating a bound thread in same
group again except first kse_create syscall.

Noticed by: julian
2003-06-16 23:46:41 +00:00
David Xu 075102cc4e Reset ncpus to 1 for bound thread group since there is only one
thread in such group.
Change message text from kse_rel to kserel, it is better displayed
in top.
2003-06-16 13:14:52 +00:00
Poul-Henning Kamp e725c18c3a Get rid of the b_spc specialty field in struct buf by using an already
available caller private field.
2003-06-16 07:18:39 +00:00
Poul-Henning Kamp 2a0f8aeb52 I have not had any reports of trouble for a long time, so remove the
gentle versions of the vop_strategy()/vop_specstrategy() mismatch methods
and use vop_panic() instead.
2003-06-15 19:49:14 +00:00
Robert Watson 2bceb0f2b2 Various cr*() calls believed to be MPSAFE, since the uidinfo
code is locked down.
2003-06-15 15:57:42 +00:00
David Xu cd4f6ebb13 1. Add code to support bound thread. when blocked, a bound thread never
schedules an upcall. Signal delivering to a bound thread is same as
   non-threaded process. This is intended to be used by libpthread to
   implement PTHREAD_SCOPE_SYSTEM thread.
2. Simplify kse_release() a bit, remove sleep loop.
2003-06-15 12:51:26 +00:00
Ian Dowse 4f1b457770 Don't overwrite the static panicstr buffer for secondary and further
panics. Before revision 1.38, we used to just point panicstr at the
format string if panicstr was NULL, but since we now use a static
buffer for the formatted panic message, we have to be careful to
only write to it during the first panic.

Pointed out by:	bde
2003-06-15 11:43:00 +00:00
Jeff Roberson 3c12473229 - Increase the ksegrp's cpu time history buffer to 250ms.
- Decrease the history buffer divisor to 2 so that we remember more of the
   old behavior.
2003-06-15 04:14:25 +00:00
David Xu 1d5a24bec6 1. Migrate TDF_UPCALLING from td_flags to td_pflags.
2. Add a flag TDF_SA, it will be used to distinguish SA
   based thread from bound thread.
2003-06-15 03:18:58 +00:00
Jeff Roberson b41f3d22cc - Cap the growth of sleep and run time in sched_exit_kse(). 2003-06-15 02:52:29 +00:00
Jeff Roberson 210491d3d9 - Fix the maximum slice value. I accidentally checked in a value of '2'
which meant no process would run for longer than 20ms.
 - Slightly redo the interactivity scorer.  It follows the same algorithm but
   in a slightly more correct way.  Previously values above half were
   incorrect.
 - Lower the interactivity threshold to 20.  It seems that in testing non-
   interactive tasks are hardly ever near there and expensive interactive
   tasks can sometimes surpass it.  This area needs more testing.
 - Remove an unnecessary KTR.
 - Fix a case where an idle thread that had an elevated priority due to
   priority prop. would be placed back on the idle queue.
 - Delay setting NEEDRESCHED until userret() for threads that haad their
   priority elevated while in kernel.  This gives us the same context switch
   optimization as SCHED_4BSD.
 - Limit the child's slice to 1 in sched_fork_kse() so we detect its behavior
   more quickly.
 - Inhert some of the run/slp time from the child in sched_exit_ksegrp().
 - Redo some of the priority comparisons so they are more clear.
 - Throttle the frequency of sched_pctcpu_update() so that rounding errors
   do not make it invalid.
2003-06-15 02:18:29 +00:00
David Xu 0e2a4d3aeb Rename P_THREADED to P_SA. P_SA means a process is using scheduler
activations.
2003-06-15 00:31:24 +00:00
Alan Cox 49a2507bd1 Migrate the thread stack management functions from the machine-dependent
to the machine-independent parts of the VM.  At the same time, this
introduces vm object locking for the non-i386 platforms.

Two details:

1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES.  The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES.  To disable guard page, set
KSTACK_GUARD_PAGES to 0.

2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new.  In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().
2003-06-14 23:23:55 +00:00
Alan Cox 89f4fca265 Move the *_new_altkstack() and *_dispose_altkstack() functions out of the
various pmap implementations into the machine-independent vm.  They were
all identical.
2003-06-14 06:20:25 +00:00
Maxime Henrion fca737117a Style(9). 2003-06-13 19:39:21 +00:00
Dag-Erling Smørgrav c2935410f6 Make the VFS cache use zones instead of malloc(9). This results in a
small but noticeable increase in performance for name lookup operations.

The code uses two zones, one for short names (less than 32 characters)
and one for long names (up to NAME_MAX).  Since most file names are
fairly short, this saves a considerable amount of space that would
otherwise be wasted if we always allocated NAME_MAX bytes.  The cutoff
value of 32 characters was picked arbitrarily and may benefit from some
tweaking; it could also be made into a tunable.

Submitted by:	hmp
2003-06-13 08:46:13 +00:00
Alan Cox 8630c1173e Add vm object locking to various pagers' "get pages" methods, i386 stack
management functions, and a u area management function.
2003-06-13 03:02:28 +00:00
Poul-Henning Kamp 7652131bee Initialize struct vfsops C99-sparsely.
Submitted by:   hmp
Reviewed by:	phk
2003-06-12 20:48:38 +00:00
Dag-Erling Smørgrav 633f506489 Document some sysctl variables.
Submitted by:	hmp
2003-06-12 19:46:51 +00:00
Scott Long 30c6f34e00 Add support to sysctl_kern_proc to return all threads in a proc, not just the
first one.  The old behaviour can be switched by specifying KERN_PROC_PROC.

Submitted by: julian, tweaks and added functionality by myself
2003-06-12 16:41:50 +00:00
Alan Cox c10c537816 Finish the vm object locking in sendfile(2). More generally,
the vm locking in sendfile(2) is complete.
2003-06-12 05:52:09 +00:00
Alan Cox 2ab3670aad Lock the vm object when removing a page. 2003-06-11 21:23:04 +00:00
Alan Cox f717a9d063 Lock the vm object when removing a page. 2003-06-11 16:37:33 +00:00
Dag-Erling Smørgrav ffe92432e3 Whitespace cleanup. 2003-06-11 07:35:56 +00:00
Alan Cox c40f7377a4 Add vm object locking. 2003-06-11 06:43:48 +00:00
David E. O'Brien f4636c5959 Use __FBSDID(). 2003-06-11 06:34:30 +00:00
Paul Saab 1795d0cdec Don't overflow when calculating vm_kmem_size. This fixes kmem_map
too small panics on PAE machines which have odd > 4GB sizes (4.5 gig
would render a 20MB of KVA for kmem_map instead of 200MB).

Submitted by:	John Cagle <john.cagle@hp.com>, jeff
Reviewed by:	jeff, peter, scottl, lots of USENIX folks
2003-06-11 05:18:59 +00:00
David Xu 7677ce18b8 Fix error in my last commit. Correctly maintain p_maxthrwaits and unlock
sched_lock.
2003-06-11 01:08:33 +00:00
David E. O'Brien 677b542ea2 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
David Xu 36407bec4f If there are signals delivered to current thread, breaks out of loop,
userret() will be called again by ast() and thread_userret() will be
called again by userret().

Reported by: tegge
2003-06-10 02:21:32 +00:00
Maxime Henrion 0ca5dc1c3e style(9). 2003-06-09 21:57:48 +00:00
John Baldwin 5499ea019d Wait for the real interval timer callout handler to finish executing if it
is currently executing when we try to remove it in exit1().  Without this,
it was possible for the callout to bogusly rearm itself and eventually
refire after the process had been free'd resulting in a panic.

PR:		kern/51964
Reported by:	Jilles Tjoelker <jilles@stack.nl>
Reviewed by:	tegge, bde
2003-06-09 21:46:22 +00:00
John Baldwin 8bccf7034e The issetugid() function is MPSAFE. 2003-06-09 21:34:19 +00:00
Alan Cox 06fa71cdcc Update the vm object and page locking in exec_map_first_page(). Mark the
one still anticipated change with XXX.  Otherwise, this function is done.
2003-06-09 19:37:14 +00:00
Alan Cox 4412dc5468 - Add vm object locking to vm_pgmoveco().
- Add a comment to vm_pgmoveco() describing what remains to be done
   for vm locking.
2003-06-09 19:23:03 +00:00
Juli Mallett c02d762181 Attempt to fix Alpha build by renaming ident[] to kern_ident[]. 2003-06-09 18:19:33 +00:00
John Baldwin 5e26dcb560 - Add a td_pflags field to struct thread for private flags accessed only by
curthread.  Unlike td_flags, this field does not need any locking.
- Replace the td_inktr and td_inktrace variables with equivalent private
  thread flags.
- Move TDF_OLDMASK over to the private flags field so it no longer requires
  sched_lock.
2003-06-09 17:38:32 +00:00
Juli Mallett da1186f2c7 Expose kern.ident by way of OID_AUTO.
Requested by:	phk
2003-06-09 10:54:23 +00:00
Jeff Roberson 356500a306 - Add a simple CPU load balancing algorithm. This works by executing once a
second and equalizing the load between the two most imbalanced CPU.  This
   is intended to clear up long term load imbalances that would not be handled
   by the 'pull' method in sched_choose().
 - Pull out some bits of sched_choose() into a kseq_move() function that moves
   an arbitrary thread from one kseq to another.
2003-06-09 00:39:09 +00:00
Alan Cox fd0cc9a862 Lock the vm object when performing vm_page_grab(). 2003-06-08 07:14:30 +00:00
Jeff Roberson b90816f188 - When a new thread is added to a kseq the load is incremented prior to
adding it to the nice tables.  Therefore, in kseq_add_nice, we should
   keep in mind that the load will be 1 if we are the only thread, and not
   0.
 - Assert that the sched lock is held in all the appropriate places.
 - Increase the scope of the sched lock in sched_pctcpu_update().
 - Hold the sched lock in sched_runnable().  It is not held by the caller.
2003-06-08 00:47:33 +00:00
Poul-Henning Kamp 84c080a85e Improve the root-dev prompt facility for printing devices which could
possibly be a root filesystem.
2003-06-07 15:46:53 +00:00
David Xu b0bd5f38a6 thread_signal_add now is called with ps_mtx held, unlock it before
calling copyin.
2003-06-06 02:17:38 +00:00
Robert Watson 777621799b If a system call comes in requesting to retrieve an attribute named
"", temporarily map it to a call to extattr_list_vp() to provide
compatibility for older applications using the "" API to retrieve
EA lists.

Use VOP_LISTEXTATTR() to support extattr_list_vp() rather than
VOP_GETEXTATTR(..., "", ...).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Asssociates Laboratories
2003-06-05 05:55:34 +00:00
Robert Watson a6f1342ff6 Add vop_listextattr(), similar to vop_getextattr() but without a
specific attribute name.  It will have the same semantics as the
older vop_getextattr() "retrieve the names" hack, returning
a buffer with ASCII nul-seperated names.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 05:53:35 +00:00
Marcel Moolenaar 11e0f8e16d Change the second (and last) argument of cpu_set_upcall(). Previously
we were passing in a void* representing the PCB of the parent thread.
Now we pass a pointer to the parent thread itself.
The prime reason for this change is to allow cpu_set_upcall() to copy
(parts of) the trapframe instead of having it done in MI code in each
caller of cpu_set_upcall(). Copying the trapframe cannot always be
done with a simply bcopy() or may not always be optimal that way. On
ia64 specifically the trapframe contains information that is specific
to an entry into the kernel and can only be used by the corresponding
exit from the kernel. A trapframe copied verbatim from another frame
is in most cases useless without some additional normalization.

Note that this change removes the assignment to td->td_frame in some
implementations of cpu_set_upcall(). The assignment is redundant.
A previous call to cpu_thread_setup() already did the exact same
assignment. An added benefit of removing the redundant assignment is
that we can now change td_pcb without nasty side-effects.

This change officially marks the ability on ia64 for 1:1 threading.

Not tested on: amd64, powerpc
Compile & boot tested on: alpha, sparc64
Functionally tested on: i386, ia64
2003-06-04 21:13:21 +00:00
Poul-Henning Kamp 22ee8c4f50 Add instrumentation which tells us how much work softclock() does
per invocation.
2003-06-04 05:25:58 +00:00
Robert Watson 8bebbb1a32 Implementations of extattr_list_fd(), extattr_list_file(), and
extattr_list_link() system calls, which return a least of extended
attributes defined for a vnode referenced by a file descriptor
or path name.  Currently, we just invoke VOP_GETEXTATTR() since
it will convert a request for an empty name into a query for a
name list, which was the old (more hackish) API.  At some point
in the near future, we'll push the distinction between get and
list down to the vnode operation layer, but this provides access
to the new API for applications in the short term.

Pointed out by:	Dominic Giampaolo <dbg@apple.com>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-04 03:57:28 +00:00
Robert Watson 31d13e2a29 Regen from syscalls.master:1.149, addition of extended attribute
list system calls for fd, file, link.
2003-06-04 03:50:20 +00:00
Robert Watson 9e18f27730 Add system calls to explicitly list extended attributes on a
file/directory/link, rather than using a less explicit hack on
the extattr retrieval API:

  extattr_list_fd()
  extattr_list_file()
  extattr_list_link()

The existing API was counter-intuitive, and poorly documented.
The prototypes for these system calls are identical to
extattr_get_*(), but without a specific attribute name to
leave NULL.

Pointed out by:	Dominic Giampaolo <dbg@apple.com>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-04 03:49:31 +00:00
Robert Watson 0b95513444 Assert the vnode lock when returning successfully from vn_open_cred(). 2003-06-04 00:54:27 +00:00
Julian Elischer 2b035cbe5a Remove un-needed code.
Don't copyin() data we are about to overwrite.
Add a flag to tell userland that KSE is officially "DONE" with the
mailbox and has gone away.

Obtained from:	davidxu@
2003-06-04 00:12:57 +00:00
Bosko Milekic 479728fd77 Fix a potential bucket leak where when freeing to an empty bucket
we failed to put the bucket back into the general cache/container.

Also, fix a bad assumption.  There was a KASSERT() that aimed to
guarantee that whenever the pcpu container's mc_starved was > 0,
that whatever the bucket we were freeing to was an empty bucket,
assuming it belonged to the pcpu container cache. However, there
is at least one case where this is not true anymore; consider:
1) All containers empty, next thread to try to alloc will touch
   a pcpu container, notice it's empty, and increment the pcpu
   container's mc_starved.
2) Some other thread frees an mbuf belonging to a bucket in
   the general cache/container.  Then it frees another mbuf
   belonging to the same bucket (still in gen container).
3) Some third thread tries to allocate an mbuf from the pcpu
   container and, since empty, grabs one mbuf now available
   in the general cache and moves the non-empty bucket from
   which it took 1 mbuf and to which the thread in (2) freed
   to, and moves it to the pcpu container.
4) A final thread tries to free an mbuf belonging to the
   NON-EMPTY bucket mentionned in (2) and (3) and, since
   the pcpu container's mc_starved is > 0, but the bucket
   is obviously non-empty, it trips on the KASSERT.
This meant that one could potentially get a panic in some
cases when out of mbufs and clusters.  The problem could
be mitigated by commenting out some cv_signal() calls,
but I'm assuming that was pure coincidence and this is
the correct fix.
2003-06-03 19:19:13 +00:00
Jeff Roberson 980c75b4d8 - Remove the blocked pointer from the umtx structure.
- Use a hash of umtx queues to queue blocked threads.  We hash on pid and the
   virtual address of the umtx structure.  This eliminates cases where we
   previously held a lock across a casuptr call.

Reviwed by:	jhb (quickly)
2003-06-03 05:24:46 +00:00
Tor Egge ad05d58087 Add tracking of process leaders sharing a file descriptor table and
allow a file descriptor table to be shared between multiple process
leaders.

PR:		50923
2003-06-02 16:05:32 +00:00
Marcel Moolenaar bf822712f7 Remove the ia64 hackery in threadinit() that was needed to work around
the lameness of the kstack code. The EPC overhaul de-lame-ified the
kstack code by removing the need for contigmalloc(). We can now
allocate stacks using malloc(). We probably want to make the stacks
swappable as well so that we can make it MI. But that's another story.
2003-06-01 05:57:58 +00:00
Robert Watson ef2e1ca561 Attempt to further comment and clarify System V IPC logic: document
why certain exceptions are made, note an inconsistency between
FreeBSD and some other implementations regarding IPC_M, and let
suser() generate our EPERM rather than forcing it ourselves.
Remove a carriage return that crept in in the last commit.

Reviewed by:	gordon
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-31 23:31:51 +00:00
Robert Watson a0ccd3f6ad Attempt to marginally de-obfuscate sections of the System V IPC access
control logic.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-31 23:17:30 +00:00
Poul-Henning Kamp b82af320cf Add "" around mutex name to make message less confusing. 2003-05-31 21:11:01 +00:00
Poul-Henning Kamp 670966596b Remove unused variable(s).
Found by:       FlexeLint
2003-05-31 20:29:34 +00:00
Poul-Henning Kamp 1e93e04fa9 Remove return after panic.
Found by:       FlexeLint
2003-05-31 20:18:23 +00:00
Poul-Henning Kamp 90471005e1 Remove needless return
Found by:       FlexeLint
2003-05-31 20:16:44 +00:00
Poul-Henning Kamp 4fe77d64a0 Add a couple of XXX comments where the intent is not clear.
Found by:       FlexeLint
2003-05-31 20:13:58 +00:00
Poul-Henning Kamp 74f1af0191 Remove unused variable(s).
Remove break after goto

Found by:       FlexeLint
2003-05-31 20:11:33 +00:00
Poul-Henning Kamp b1921a6f33 Remove return after panic.
Found by:       FlexeLint
2003-05-31 20:09:42 +00:00
Poul-Henning Kamp a62f80f8e0 Remove unused variable and now unbalanced call to splbio();
Found by:       FlexeLint
2003-05-31 20:09:01 +00:00
Marcel Moolenaar a063facbf6 Fix ia32 compat on ia64. Recent ia64 MD changes caused the garbage on
the stack to be changed in a way incompatible with elf32_map_insert()
where we used data_buf without initializing it for when the partial
mapping resulting in a misaligned image (typical when the page size
implied by the image is not the same as the page size in use by the
kernel). Since data_buf is passed by reference to vm_map_find(), the
compiler cannot warn about it.

While here, move all local variables to the top of the function.
2003-05-31 19:55:05 +00:00
Poul-Henning Kamp 850cb24ef8 "break" rather than fall through to a break in the default clause.
Found by:       FlexeLint
2003-05-31 16:53:16 +00:00
Poul-Henning Kamp 8313328657 Introduce {be,le}_uuid_{enc,dec}() functions for explicitly encoding
and decoding UUID's in big endian and little endian binary format.
2003-05-31 16:47:07 +00:00