Commit graph

5798 commits

Author SHA1 Message Date
Alan Cox d746789347 Hold the page queues lock when calling vm_page_flag_clear(). 2002-12-27 06:52:32 +00:00
Jeffrey Hsu 6f782c4636 Ensure that the made-up inode number for a Unix domain socket is persistent. 2002-12-25 07:59:39 +00:00
Robert Watson 79191eca57 Flush vop_refreshlabel() definition, since it is no longer used.
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-12-24 19:47:13 +00:00
Poul-Henning Kamp a7010ee2f4 White-space changes. 2002-12-24 09:44:51 +00:00
Jeffrey Hsu 956b0b653c SMP locking for radix nodes. 2002-12-24 03:03:39 +00:00
Poul-Henning Kamp 08c7670a8b Move the declaration of the socket fileops from socketvar.h to file.h.
This allows us to use the new typedefs and removes the needs for a number
of forward struct declarations in socketvar.h
2002-12-23 22:46:47 +00:00
Poul-Henning Kamp f3a682116c Detediousficate declaration of fileops array members by introducing
typedefs for them.
2002-12-23 21:53:20 +00:00
Poul-Henning Kamp 6ce9c72c30 s/sokqfilter/soo_kqfilter/ for consistency with the naming of all
other socket/file operations.
2002-12-23 21:37:28 +00:00
Alan Cox 0cb6c00463 - Hold the kernel_object's lock around vm_page_alloc(kernel_object,...).
- Hold the page queues lock around vm_page_wakeup().
2002-12-23 20:10:47 +00:00
Jake Burkholder c3c2862df4 - Add a spin lock to single thread cache invalidation and tlb flush ipis,
which allows ipis to be sent outside of Giant.
- Remove the ap boot mutex, which is unused.
2002-12-22 20:50:23 +00:00
Kris Kennaway 4ef3d7a27b Enforce correct ordering of the filedesc structure and pipe mutex, because
WITNESS can get the order wrong if it guesses based on first use.

Reviewed by:	jhb, alfred
2002-12-22 16:32:34 +00:00
Jeffrey Hsu b30a244c34 SMP locking for ifnet list. 2002-12-22 05:35:03 +00:00
Marcel Moolenaar 551d79e177 Fix multiple registration of the elf_legacy_coredump sysctl variable.
The duplication is caused by the fact that imgact_elf.c is included
by both imgact_elf32.c and imgact_elf64.c and both are compiled by
default on ia64. Consequently, we have two seperate copies of the
elf_legacy_coredump variable due to them being declared static, and
two entries for the same sysctl in the linker set, both referencing
the unique copy of the elf_legacy_coredump variable. Since the second
sysctl cannot be registered, one of the elf_legacy_coredump variables
can not be tuned (if ordering still holds, it's the ELF64 related one).

The only solution is to create two different sysctl variables, just
like the elf<32|64>_trace sysctl variables. This unfortunately is an
(user) interface change, but unavoidable. Thus, on ELF32 platforms
the sysctl variable is called elf32_legacy_coredump and on ELF64
platforms it is called elf64_legacy_coredump. Platforms that have
both ELF formats have both sysctl variables.

These variables should probably be retired sooner rather than later.
2002-12-21 01:15:39 +00:00
Sam Leffler 91974ce10b add generic rate limiting support from netbsd; ratelimit is purely time based,
ppsratecheck is for controlling packets/second

Obtained from:	netbsd
2002-12-20 23:54:47 +00:00
Alan Cox 2952e1fb58 Extend the scope of the page queues lock in vm_pgmoveco(). 2002-12-20 21:18:29 +00:00
Maxime Henrion 894db7b01f Don't forget to destroy the mutex if an error occurs
in the jail() system call.

Submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2002-12-20 14:32:20 +00:00
Alan Cox ee113343eb Hold the page queues lock when performing vm_page_busy(). 2002-12-18 20:16:22 +00:00
Poul-Henning Kamp 4d99ef8d55 Indent properly. 2002-12-17 19:31:26 +00:00
Poul-Henning Kamp 126c7e29fe Remove unused variable cn_devfsdev. 2002-12-17 19:30:50 +00:00
Poul-Henning Kamp d321df47c3 Don't cast a pointer to (intptr_t) and then on to (int) when we cannot
be sure that (int) is large enough.  Instead cast only to (intptr_t) and
cast the switch/case values to (intptr_t) as well.
2002-12-17 19:13:03 +00:00
Matthew Dillon fa7dd9c5bc Change the way ELF coredumps are handled. Instead of unconditionally
skipping read-only pages, which can result in valuable non-text-related
data not getting dumped, the ELF loader and the dynamic loader now mark
read-only text pages NOCORE and the coredump code only checks (primarily) for
complete inaccessibility of the page or NOCORE being set.

Certain applications which map large amounts of read-only data will
produce much larger cores.  A new sysctl has been added,
debug.elf_legacy_coredump, which will revert to the old behavior.

This commit represents collaborative work by all parties involved.
The PR contains a program demonstrating the problem.

PR:		kern/45994
Submitted by:	"Peter Edwards" <pmedwards@eircom.net>, Archie Cobbs <archie@dellroad.org>
Reviewed by:	jdp, dillon
MFC after:	7 days
2002-12-16 19:24:43 +00:00
Robert Drehmel 0adb6d7a49 Remove the hto(be|le)[slq] and (be|le)toh[slq] macros defined in
_KERNEL scope from "src/sys/sys/mchain.h".

Replace each occurrence of the above in _KERNEL scope with the
appropriate macro from the set of hto(be|le)(16|32|64) and
(be|le)toh(16|32|64) from "src/sys/sys/endian.h".

Tested by:		tjr
Requested by:		comment marked with XXX
2002-12-16 16:20:06 +00:00
Matthew Dillon 72e7f3ddc2 Regenerate system calls (swapoff added) 2002-12-15 19:19:15 +00:00
Matthew Dillon 92da00bb24 This is David Schultz's swapoff code which I am finally able to commit.
This should be considered highly experimental for the moment.

Submitted by:	David Schultz <dschultz@uclink.Berkeley.EDU>
MFC after:	3 weeks
2002-12-15 19:17:57 +00:00
Matthew Dillon 389d2b6e21 Fix a refcount race with the vmspace structure. In order to prevent
resource starvation we clean-up as much of the vmspace structure as we
can when the last process using it exits.  The rest of the structure
is cleaned up when it is reaped.  But since exit1() decrements the ref
count it is possible for a double-free to occur if someone else, such as
the process swapout code, references and then dereferences the structure.
Additionally, the final cleanup of the structure should not occur until
the last process referencing it is reaped.

This commit solves the problem by introducing a secondary reference count,
calling 'vm_exitingcnt'.  The normal reference count is decremented on exit
and vm_exitingcnt is incremented.  vm_exitingcnt is decremented when the
process is reaped.  When both vm_exitingcnt and vm_refcnt are 0, the
structure is freed for real.

MFC after:	3 weeks
2002-12-15 18:50:04 +00:00
Maxim Konovalov 9f59c468f3 o Clear a high bit of ipc_perm.seq so msgget(3) never returns a
negative message queue id.

PR:		kern/46122
Submitted by:	Vladimir B.Grebenschikov <vova@sw.ru>
MFC after:	2 weeks
2002-12-15 09:41:46 +00:00
Alan Cox 475e8011ab Perform vm_object_lock() and vm_object_unlock() around
vm_object_page_remove().
2002-12-15 05:41:56 +00:00
Alfred Perlstein f97182acf8 unwrap lines made short enough by SCARGS removal 2002-12-14 08:18:06 +00:00
Alfred Perlstein b80521fee5 remove syscallarg().
Suggested by: peter
2002-12-14 02:07:32 +00:00
Alfred Perlstein d1e405c5ce SCARGS removal take II. 2002-12-14 01:56:26 +00:00
Kirk McKusick 0f5f789c0d The buffer daemon cannot skip over buffers owned by locked inodes as
they may be the only viable ones to flush. Thus it will now wait for
an inode lock if the other alternatives will result in rollbacks (and
immediate redirtying of the buffer). If only buffers with rollbacks
are available, one will be flushed, but then the buffer daemon will
wait briefly before proceeding. Failing to wait briefly effectively
deadlocks a uniprocessor since every other process writing to that
filesystem will wait for the buffer daemon to clean up which takes
close enough to forever to feel like a deadlock.

Reported by:	Archie Cobbs <archie@dellroad.org>
Sponsored by:   DARPA & NAI Labs.
Approved by:	re
2002-12-14 01:35:30 +00:00
Alfred Perlstein bc9e75d7ca Backout removal SCARGS, the code freeze is only "selectively" over. 2002-12-13 22:41:47 +00:00
Alfred Perlstein 0bbe7292e1 Remove SCARGS.
Reviewed by: md5
2002-12-13 22:27:25 +00:00
Tim J. Robbins 9d0fffd3ca Drop filedesc lock and acquire Giant around calls to malloc() and free().
These call uma_large_malloc() and uma_large_free() which require Giant.
Fixes panic when descriptor table is larger than KMEM_ZMAX bytes
noticed by kkenn.

Reviewed by:	jhb
2002-12-13 09:59:40 +00:00
Julian Elischer 696058c3c5 Unbreak the KSE code. Keep track of zobie threads using the Per-CPU storage
during the context switch. Rearrange thread cleanups
to avoid problems with Giant. Clean threads when freed or
when recycled.

Approved by:	re (jhb)
2002-12-10 02:33:45 +00:00
Robert Watson 990b4b2dc5 Remove dm_root entry from struct devfs_mount. It's never set, and is
unused.  Replace it with a dm_mount back-pointer to the struct mount
that the devfs_mount is associated with.  Export that pointer to MAC
Framework entry points, where all current policies don't use the
pointer.  This permits the SEBSD port of SELinux's FLASK/TE to compile
out-of-the-box on 5.0-CURRENT with full file system labeling support.

Approved by:	re (murray)
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-12-09 03:44:28 +00:00
Alan Cox 2e29a1f21f To avoid lock order reversals in getnewvnode(), the call to uma_zfree()
must be delayed until the vnode interlock is released.

Reported by:	kris@
Approved by:	re (jhb)
2002-12-08 05:06:50 +00:00
Giorgos Keramidas 0c920c0de8 Fix typo in comment. It's SYSINIT, not SYSINT.
Approved by:	re (murray)
2002-11-30 22:15:30 +00:00
Kirk McKusick c6964d3bc9 Remove a race condition / deadlock from snapshots. When
converting from individual vnode locks to the snapshot
lock, be sure to pass any waiting processes along to the
new lock as well. This transfer is done by a new function
in the lock manager, transferlockers(from_lock, to_lock);
Thanks to Lamont Granquist <lamont@scriptkiddie.org> for
his help in pounding on snapshots beyond all reason and
finding this deadlock.

Sponsored by:   DARPA & NAI Labs.
2002-11-30 19:00:51 +00:00
Warner Losh 304f10ce4a devd kernel improvements:
1) Record all device events when devctl is enabled, rather than just when
   devd has devctl open.  This is necessary to prevent races between when
   a device arrives, and when devd starts.
2) Add hw.bus.devctl_disable to disable devctl, this can also be set as a
   tunable.
3) Fix async support. Reset nonblocking and async_td in open.  remove
   async flags.
4) Free all memory when devctl is disabled.

Approved by: re (blanket)
2002-11-30 00:49:43 +00:00
Alan Cox fdff30d256 Use pmap_remove_all() instead of pmap_remove() before freeing the page
in vm_pgmoveco(); the page may have more than one mapping.  Hold the page
queues lock when calling pmap_remove_all().

Approved by:	re (blanket)
2002-11-28 08:44:26 +00:00
Robert Drehmel f85a961930 Do not set a variable (vp->p_pollinfo) to NULL if we know
it already has that value.

Approved by:	re
2002-11-27 16:45:54 +00:00
Maxim Konovalov 8819f45b51 Small SO_RCVTIMEO and SO_SNDTIMEO values are mistakenly taken to be zero.
PR:		kern/32827
Submitted by:	Hartmut Brandt <brandt@fokus.gmd.de>
Approved by:	re (jhb)
MFC after:	2 weeks
2002-11-27 13:34:04 +00:00
Tim J. Robbins fef82663b8 o Initialise each mbuf's m_len to 0 in m_getm(); mb_put_mem() depends
on this.
o Update the `cur' pointer in the cluster loop in m_getm() to avoid
  incorrect truncation and leaked mbufs.

Reviewed by:	bmilekic
Approved by:	re
2002-11-27 04:26:00 +00:00
Warner Losh 647501a046 Make the rman_{get,set}_* macros into real functions. The macros
create an ABI that encodes offsets and sizes of structures into client
drivers.  The functions isolate the ABI from changes to the resource
structure.  Since these are used very rarely (once at startup), the
speed penalty will be down in the noise.

Also, add r_rid to the structure so that clients can save the 'rid' of
the resource in the struct resource, plus accessor functions.  Future
additions to newbus will make use of this to present a simplified
interface for resource specification.

Approved by: re (jhb)
Reviewed by: jhb, jake
2002-11-27 03:55:22 +00:00
Bill Fenner 8b5f8b061a Don't hold acct_mtx over limcopy(), since it's unnecessary and
limcopy() can sleep.

Approved by:	re
2002-11-26 18:04:12 +00:00
Sam Leffler c8f43965d6 correct function names in KASSERT's for 2 m_tag routines
Submitted by:	rwatson
Approved by:	re
2002-11-26 17:59:16 +00:00
Robert Drehmel d1989db545 To avoid sleeping with all sorts of resources acquired (the reported
problem was a locked directory vnode), do not give the process a chance
to sleep in state "stopevent" (depends on the S_EXEC bit being set in
p_stops) until most resources have been released again.

Approved by:	re
2002-11-26 17:30:55 +00:00
John Baldwin 04f4a16448 If the file descriptors passed into do_dup() are negative, return EBADF
instead of panicing.  Also, perform some of the simpler sanity checks on
the fds before acquiring the filedesc lock.

Approved by:	re
Reported by:	Dan Nelson <dan@emsphone.com> and others
2002-11-26 17:22:15 +00:00
Robert Watson 4d10c0ce5f Un-staticize mac_cred_mmapped_drop_perms() so that it may be used
by policy modules making use of downgrades in the MAC AST event.  This
is required by the mac_lomac port of LOMAC to the MAC Framework.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-26 17:11:57 +00:00
Alan Cox 2d21129db2 Acquire and release the page queues lock around pmap_remove_pages() because
it updates several of vm_page's fields.
2002-11-25 04:37:44 +00:00
Alan Cox 178949e021 Hold the page queues/flags lock when calling vm_page_set_validclean().
Approved by:	re
2002-11-23 19:10:31 +00:00
Maxime Henrion b19d9defef Under certain circumstances, we were calling kmem_free() from
i386 cpu_thread_exit().  This resulted in a panic with WITNESS
since we need to hold Giant to call kmem_free(), and we weren't
helding it anymore in cpu_thread_exit().  We now do this from a
new MD function, cpu_thread_dtor(), called by thread_dtor().

Approved by:	re@
Suggested by:	jhb
2002-11-22 23:57:02 +00:00
Jeff Roberson 79acfc497b - Add the new sched_pctcpu() function to the sched_* api.
- Provide a routine in sched_4bsd to add this functionality.
 - Use sched_pctcpu() in kern_proc, which is the one place outside of
   sched_4bsd where the old pctcpu value was accessed directly.

Approved by:	re
2002-11-21 09:30:55 +00:00
Jeff Roberson 06439a04a1 - Move scheduler specific macros and defines out of proc.h
Approved by:	re
2002-11-21 09:14:13 +00:00
Jeff Roberson 148302c9c9 - Move FSCALE back to kern_sync. This is not scheduler specific.
- Create a new callout for lbolt and move it out of schedcpu().  This is not
   scheduler specific either.

Approved by:	re
2002-11-21 08:57:08 +00:00
Jeff Roberson de028f5a4a - Implement a mechanism for allowing schedulers to place scheduler dependant
data in the scheduler independant structures (proc, ksegrp, kse, thread).
 - Implement unused stubs for this mechanism in sched_4bsd.

Approved by:	re
Reviewed by:	luigi, trb
Tested on:	x86, alpha
2002-11-21 01:22:38 +00:00
Robert Watson 2555374c4f Introduce p_label, extensible security label storage for the MAC framework
in struct proc.  While the process label is actually stored in the
struct ucred pointed to by p_ucred, there is a need for transient
storage that may be used when asynchronous (deferred) updates need to
be performed on the "real" label for locking reasons.  Unlike other
label storage, this label has no locking semantics, relying on policies
to provide their own protection for the label contents, meaning that
a policy leaf mutex may be used, avoiding lock order issues.  This
permits policies that act based on historical process behavior (such
as audit policies, the MAC Framework port of LOMAC, etc) can update
process properties even when many existing locks are held without
violating the lock order.  No currently committed policies implement use
of this label storage.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-20 15:41:25 +00:00
Robert Watson a3df768b04 Merge kld access control checks from the MAC tree: these access control
checks permit policy modules to augment the system policy for permitting
kld operations.  This permits policies to limit access to kld operations
based on credential (and other) properties, as well as to perform checks
on the kld being loaded (integrity, etc).

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-19 22:12:42 +00:00
Robert Watson 293d2d2261 We leaked a process lock reference in the event an RFTHREAD process
leader wasn't exiting during a fork; instead, do remember to release
the lock avoiding lock order reversals and recursion panic.

Reported by:	"Joel M. Baldwin" <qumqats@outel.org>
2002-11-18 14:23:21 +00:00
David Xu bfd8325073 Make sure only update wall clock at upcall time, slightly reformat
code in kse_relase().
2002-11-18 12:28:15 +00:00
Alfred Perlstein ec63e12a03 During shutdown explain what the numbers following the 'syncing
disks' message mean, specifically, 'buffers remaining...'.
2002-11-18 02:41:03 +00:00
David Xu 8798d4f9c8 1. Support versioning and wall clock in kse mailbox,
also add rusage time in thread mailbox.
2. Minor change for thread limit code in thread_user_enter(),
   fix typo in kse_release() last I committed.

Reviewed by: deischen, mini
2002-11-18 01:59:31 +00:00
Julian Elischer 904f1b77cc include smp.h.
it is required by some code that was commented out until david's
last commit.
2002-11-17 23:26:42 +00:00
David Xu fdc5ecd24f 1.Add sysctls to control KSE resource allocation.
kern.threads.max_threads_per_proc
  kern.threads.max_groups_per_proc
2.Temporary disable borrower thread stash itself as
  owner thread's spare thread in thread_exit(). there
  is a race between owner thread and borrow thread:
  an owner thread may allocate a spare thread as this:
	if (td->td_standin == NULL)
		td->standin = thread_alloc();
  but thread_alloc() can block the thread, then a borrower
  thread would possible stash it self as owner's spare
  thread in thread_exit(), after owner is resumed, result
  is a thread leak in kernel, double check in owner can
  avoid the race, but it may be ugly and not worth to do.
2002-11-17 11:47:03 +00:00
David Xu db9b0729fc Rework last exiting thread in kse_release(), wait a signal and then
schedule an upcall and call thread_exit().
2002-11-17 10:12:00 +00:00
Jeff Roberson a9a088823e - Release the imgp vnode prior to freeing exec_map resources to avoid
deadlock.
2002-11-17 09:33:00 +00:00
Alfred Perlstein f51c1e897d Rework the sysconf(3) interaction with aio:
sysconf.c:
  Use 'break' rather than 'goto yesno' in sysconf.c so that we report a '0'
  return value from the kernel sysctl.

vfs_aio.c:
  Make aio reset its configuration parameters to -1 after unloading
  instead of 0.

posix4_mib.c:
  Initialize the aio configuration parameters to -1
  to indicate that it is not loaded.
  Add a facility (p31b_iscfg()) to determine if a posix4 facility has been
  initialized to avoid having to re-order the SYSINITs.
  Use p31b_iscfg() to determine if aio has had a chance to run yet which
  is likely if it is compiled into the kernel and avoid spamming its
  values.
  Introduce a macro P31B_VALID() instead of doing the same comparison over
  and over.

posix4.h:
  Prototype p31b_iscfg().
2002-11-17 04:15:34 +00:00
Alan Cox 4fec79bef8 Now that pmap_remove_all() is exported by our pmap implementations
use it directly.
2002-11-16 07:44:25 +00:00
Alfred Perlstein 86d52125a2 Export the values for _SC_AIO_MAX and _SC_AIO_PRIO_DELTA_MAX via the p1003b
sysctl interface.
2002-11-16 06:38:07 +00:00
Daniel Eischen f3ec9000e9 Regenerate after adding system calls. 2002-11-16 06:36:56 +00:00
Daniel Eischen 2be05b70c9 Add getcontext, setcontext, and swapcontext as system calls.
Previously these were libc functions but were requested to
be made into system calls for atomicity and to coalesce what
might be two entrances into the kernel (signal mask setting
and floating point trap) into one.

A few style nits and comments from bde are also included.

Tested on alpha by: gallatin
2002-11-16 06:35:53 +00:00
Alfred Perlstein c844abc920 Call 'p31b_setcfg(CTL_P1003_1B_AIO_LISTIO_MAX, AIO_LISTIO_MAX)'
when AIO is initialized so that sysconf() gives correct results.

Reported by: Craig Rodrigues <rodrigc@attbi.com>
2002-11-16 04:22:55 +00:00
Alfred Perlstein b565fb9e6f headers should not really include "opt_foo.h" (in this case opt_posix.h).
remove it from the header and add it to the files that require it.
2002-11-15 22:55:06 +00:00
David Xu 1d2c5bd519 Return EWOULDBLOCK for last thread in kse_release().
Requested by: archie
2002-11-15 00:53:59 +00:00
Thomas Moestl 01ee43955c Make the msg_size, msg_bufx and msg_bufr memebers of struct msgbuf
signed, since they describe a ring buffer and signed arithmetic is
performed on them. This avoids some evilish casts.

Since this changes all but two members of this structure, style(9)
those remaining ones, too.

Requested by:	bde
Reviewed by:	bde (earlier version)
2002-11-14 16:11:12 +00:00
David Xu ca161eb6e9 In kse_release(), check if current thread is bound
and current kse mailbox was already initialized, also
prevent last thread from exiting unless we figure out
how to safely support null thread proc.
2002-11-14 06:06:45 +00:00
Robert Watson a96acd1ace Introduce a condition variable to avoid returning EBUSY when
the MAC policy list is busy during a load or unload attempt.
We assert no locks held during the cv wait, meaning we should
be fairly deadlock-safe.  Because of the cv model and busy
count, it's possible for a cv waiter waiting for exclusive
access to the policy list to be starved by active and
long-lived access control/labeling events.  For now, we
accept that as a necessary tradeoff.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-13 15:47:09 +00:00
Maxime Henrion 2bb95458bd Add support for the C99 %t format modifier. 2002-11-13 15:15:59 +00:00
Robert Watson 63b6f478ec Garbage collect mac_create_devfs_vnode() -- it hasn't been used since
we brought in the new cache and locking model for vnode labels.  We
now rely on mac_associate_devfs_vnode().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-12 04:20:36 +00:00
John Baldwin d2b28e078a Correct an assertion in the code to traverse the list of locks to find an
earlier acquired lock with the same witness as the lock currently being
acquired.  If we had released several earlier acquired locks after
acquiring enough locks to require another lock_list_entry bucket in the
lock list, then subsequent lock_list_entry buckets could contain only one
lock instance in which case i would be zero.

Reported by:	Joel M. Baldwin <qumqats@outel.org>
2002-11-11 16:36:20 +00:00
Robert Watson 2d43d24ed4 Garbage collect definition of M_MACOPVEC -- we no longer perform a
dynamic mapping of an operation vector into an operation structure,
rather, we rely on C99 sparse structure initialization.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-11 14:15:58 +00:00
Alan Cox d154fb4fe6 When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than
indirectly through vm_page_protect().  The one remaining page flag that
is updated by vm_page_protect() is already being updated by our various
pmap implementations.

Note: A later commit will similarly change the VM_PROT_READ case and
eliminate vm_page_protect().
2002-11-10 07:12:04 +00:00
Alfred Perlstein 29f194457c Fix instances of macros with improperly parenthasized arguments.
Verified by: md5
2002-11-09 12:55:07 +00:00
Robert Watson 6d7bdc8def Assign value of NULL to imgp->execlabel when imgp is initialized
in the ELF code.  Missed in earlier merge from the MAC tree.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-08 20:49:50 +00:00
Robert Watson 52378b8acd To reduce per-return overhead of userret(), call into
mac_thread_userret() only if PS_MACPEND is set in the process AST mask.
This avoids the cost of the entry point in the common case, but
requires policies interested in the userret event to set the flag
(protected by the scheduler lock) if they do want the event.  Since
all the policies that we're working with which use mac_thread_userret()
use the entry point only selectively to perform operations deferred
for locking reasons, this maintains the desired semantics.

Approved by:	re
Requested by:	bde
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-08 19:00:17 +00:00
Robert Watson 9fa3506ecd Add an explicit execlabel argument to exec-related MAC policy entry
points, rather than relying on policies to grub around in the
image activator instance structure.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-08 18:04:00 +00:00
Thomas Moestl 0fca57b8b8 Move the definitions of the hw.physmem, hw.usermem and hw.availpages
sysctls to MI code; this reduces code duplication and makes all of them
available on sparc64, and the latter two on powerpc.
The semantics by the i386 and pc98 hw.availpages is slightly changed:
previously, holes between ranges of available pages would be included,
while they are excluded now. The new behaviour should be more correct
and brings i386 in line with the other architectures.

Move physmem to vm/vm_init.c, where this variable is used in MI code.
2002-11-07 23:57:17 +00:00
John Baldwin 6274bdda4c - Use %j to print intmax_t values.
- Cast more daddr_t values to intmax_t when printing to quiet warnings.
2002-11-07 22:41:08 +00:00
John Baldwin d0e938f4f1 Use %z to quiet a warning. 2002-11-07 22:38:04 +00:00
Maxime Henrion a7a00d0546 - Fix a bunch of casts to long which were truncating off_t's.
- Remove the comments which were justifying this by the fact
that we don't have %q in the kernel, this was probably right
back in time, but we now have %q, and we even have better to
print those types (%j).
2002-11-07 21:56:05 +00:00
Maxime Henrion b65d1ba9dd - Use a better definition for MNAMELEN which doesn't require
to have one #ifdef per architecture.
- Change a space to a tab after a nearby #define.

Obtained from:	bde
2002-11-07 21:15:02 +00:00
Robert Watson f8f750c53e Do a bit more work in the aio code to simulate the credential environment
of the original AIO request: save and restore the active thread credential
as well as using the file credential, since MAC (and some other bits of
the system) rely on the thread credential instead of/as well as the
file credential.  In brief: cache td->td_ucred when the AIO operation
is queued, temporarily set and restore the kernel thread credential,
and release the credential when done.  Similar to ktrace credential
management.

Reviewed by:	alc
Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-07 20:46:37 +00:00
Kelly Yancey 04ac9b97b5 Spotted a couple of places where the socket buffer's counters were being
manipulated directly (rather than using sballoc()/sbfree()); update them
to tweak the new sb_ctl field too.

Sponsored by:	NTT Multimedia Communications Labs
2002-11-05 18:52:25 +00:00
Kelly Yancey 247a32f22a Fix filt_soread() to properly flag a kevent when a 0-byte datagram is
received.

Verified by:	dougb, Manfred Antar <null@pozo.com>
Sponsored by:	NTT Multimedia Communications Labs
2002-11-05 18:48:46 +00:00
Robert Watson 0c93266b9c Correct merge-o: disable the right execve() variation if !MAC 2002-11-05 18:04:50 +00:00
Robert Watson 670cb89bf4 Bring in two sets of changes:
(1) Permit userland applications to request a change of label atomic
    with an execve() via mac_execve().  This is required for the
    SEBSD port of SELinux/FLASK.  Attempts to invoke this without
    MAC compiled in result in ENOSYS, as with all other MAC system
    calls.  Complexity, if desired, is present in policy modules,
    rather than the framework.

(2) Permit policies to have access to both the label of the vnode
    being executed as well as the interpreter if it's a shell
    script or related UNIX nonsense.  Because we can't hold both
    vnode locks at the same time, cache the interpreter label.
    SEBSD relies on this because it supports secure transitioning
    via shell script executables.  Other policies might want to
    take both labels into account during an integrity or
    confidentiality decision at execve()-time.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 17:51:56 +00:00
Robert Watson 051c41caf1 Regen. 2002-11-05 17:48:04 +00:00
Robert Watson 21bb9ea225 Flesh out the definition of __mac_execve(): per earlier discussion,
it's essentially execve() with an optional MAC label argument.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 17:47:08 +00:00
Robert Watson 4443e9ff4a Assert that appropriate vnodes are locked in mac_execve_will_transition().
Allow transitioning to be twiddled off using the process and fs enforcement
flags, although at some point this should probably be its own flag.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 15:11:33 +00:00