Commit graph

3202 commits

Author SHA1 Message Date
Poul-Henning Kamp db90128160 Avoid the modules madness I inadvertently introduced by making the
cloning infrastructure standard in kern_conf.  Modules are now
the same with or without devfs support.

If you need to detect if devfs is present, in modules or elsewhere,
check the integer variable "devfs_present".

This happily removes an ugly hack from kern/vfs_conf.c.

This forces a rename of the eventhandler and the standard clone
helper function.

Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include
like <sys/queue.h>

Remove all #includes of opt_devfs.h they no longer matter.
2000-09-02 19:17:34 +00:00
Don Lewis 8577117cc8 access() shouldn't diddle with the contents of a potentially shared
credential.  Create a temporary copy of the current credential and
modify the copy.

Submitted by:	tegge
2000-09-02 12:31:55 +00:00
Brian Feldman 468ddc8a44 Casts are needed to subtract u_longs.
Submitted by:	tor
2000-08-31 22:21:33 +00:00
Robert Watson c52396e365 o p_cansee() wasn't setting privused when suser() was required to override
kern.ps_showallprocs.  Apparently got lost in the merge process from
  the capability patches.  Now fixed.

Submitted by:	jdp
Obtained from:	TrustedBSD Project
2000-08-31 15:55:17 +00:00
Brian Feldman b6240737d5 Fix hangs caused by overzealous code removal.
Thanks, Nickolay, for figuring out this is the problem.

Submitted by:	Nickolay Dudorov <nnd@mail.nsk.ru>
2000-08-31 11:31:58 +00:00
Mike Smith 3e755f76d1 Make it possible to pass boot()'s flags to shutdown_nice() so that the
kernel can instigate an orderly shutdown but still determine the form of
that shutdown.  Make it possible eg. to cleanly shutdown and power off the
system under ACPI when the power button is pressed.
2000-08-31 00:08:50 +00:00
Robert Watson 387d2c036b o Centralize inter-process access control, introducing:
int p_can(p1, p2, operation, privused)

  which allows specification of subject process, object process,
  inter-process operation, and an optional call-by-reference privused
  flag, allowing the caller to determine if privilege was required
  for the call to succeed.  This allows jail, kern.ps_showallprocs and
  regular credential-based interaction checks to occur in one block of
  code.  Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL,
  and P_CAN_DEBUG.  p_can currently breaks out as a wrapper to a
  series of static function checks in kern_prot, which should not
  be invoked directly.

o Commented out capabilities entries are included for some checks.

o Update most inter-process authorization to make use of p_can() instead
  of manual checks, PRISON_CHECK(), P_TRESPASS(), and
  kern.ps_showallprocs.

o Modify suser{,_xxx} to use const arguments, as it no longer modifies
  process flags due to the disabling of ASU.

o Modify some checks/errors in procfs so that ENOENT is returned instead
  of ESRCH, further improving concealment of processes that should not
  be visible to other processes.  Also introduce new access checks to
  improve hiding of processes for procfs_lookup(), procfs_getattr(),
  procfs_readdir().  Correct a bug reported by bp concerning not
  handling the CREATE case in procfs_lookup().  Remove volatile flag in
  procfs that caused apparently spurious qualifier warnigns (approved by
  bde).

o Add comment noting that ktrace() has not been updated, as its access
  control checks are different from ptrace(), whereas they should
  probably be the same.  Further discussion should happen on this topic.

Reviewed by:	bde, green, phk, freebsd-security, others
Approved by:	bde
Obtained from:	TrustedBSD Project
2000-08-30 04:49:09 +00:00
Robert Watson c6fac29aff o Disable flagging of ASU in suser_xxx() authorization check. For the
time being, the ASU accounting flag will no longer be available, but
  may be reinstituted in the future once authorization have been redone.
  As it is, the kernel went through contortions in access control to
  avoid calling suser, which always set the flag.  This will also allow
  suser to accept const struct *{cred, proc} arguments.

Reviewed by:	bde, green, phk, freebsd-security, others
Approved by:	bde
Obtained from:	TrustedBSD Project
2000-08-30 04:35:32 +00:00
Brian Feldman 343079d9b2 Remove an extraneous setting of sb_hiwat. 2000-08-30 00:09:57 +00:00
Robert Watson 012c643d3e o Restructure vaccess() so as to check for DAC permission to modify the
object before falling back on privilege.  Make vaccess() accept an
  additional optional argument, privused, to determine whether
  privilege was required for vaccess() to return 0.  Add commented
  out capability checks for reference.  Rename some variables to make
  it more clear which modes/uids/etc are associated with the object,
  and which with the access mode.
o Update file system use of vaccess() to pass NULL as the optional
  privused argument.  Once additional patches are applied, suser()
  will no longer set ASU, so privused will permit passing of
  privilege information up the stack to the caller.

Reviewed by:	bde, green, phk, -security, others
Obtained from:	TrustedBSD Project
2000-08-29 14:45:49 +00:00
Brian Feldman 6aef685fbb Remove any possibility of hiwat-related race conditions by changing
the chgsbsize() call to use a "subject" pointer (&sb.sb_hiwat) and
a u_long target to set it to.  The whole thing is splnet().

This fixes a problem that jdp has been able to provoke.
2000-08-29 11:28:06 +00:00
Doug Rabson f80e454726 Add kobj_class_compile_static() to allow classes to be initialised
statically (i.e. without calling malloc). This allows kobj to be used
very early in the boot sequence.
2000-08-28 21:11:12 +00:00
Doug Rabson a4f9b116e3 * Remove a bogus call to kobj_init() from make_device().
* Add a non-empty implementation of root_print_child().
2000-08-28 21:08:12 +00:00
Marcel Moolenaar 3f4ab6537f Regen: fix prototypes for {o|}{g|s}etrlimit. 2000-08-28 07:56:38 +00:00
Marcel Moolenaar ae51d56ce1 Fix prototypes for {o|}{g|s}etrlimit. A recent change in the
Linuxulator caused this bug to trigger.
2000-08-28 07:50:44 +00:00
Alfred Perlstein c58b821e4c new sysctl 'kern.openfiles' (exports nfiles to userland) 2000-08-26 23:49:44 +00:00
Robert Watson 877dd71fc6 o Correct spelling of ufs_exttatr_find_attr -> ufs_extattr_find_attr
o Add "const" qualifier to attrname argument of various calls to remove
  warnings

Obtained from:	TrustedBSD Project
2000-08-26 22:00:58 +00:00
Marcel Moolenaar 31c8f3f0af Make this file compile again when COMPAT_43 has not been
defined. This boils down to conditionally compile the
old signal syscalls.

We might want to extend the types in syscalls.master to
make these syscalls conditionally on something more
appropriate than COMPAT_43.
2000-08-26 02:27:01 +00:00
Peter Wemm 9579e8c145 m_mballoc_wait() had a spl/tsleep race. mbufs can be freed in interrupt
context, which can cause a wakeup.. which can race with this.
2000-08-25 22:28:08 +00:00
Peter Wemm 12db06a04f If the config program found a hints file and included it as a fallback,
then treat it as such.  This isn't perfect, but should do for things
like GENERIC.  When in fallback mode, they will be used if there are NO
other hints.
2000-08-25 19:48:10 +00:00
Poul-Henning Kamp d8cd1501f2 Dang, a _clone routine escaped #ifdef DEVFS containment. 2000-08-24 15:59:44 +00:00
Poul-Henning Kamp a481b90b82 Fix panic when removing open device (found by bp@)
Implement subdirs.
 Build the full "devicename" for cloning functions.
 Fix panic when deleted device goes away.
 Collaps devfs_dir and devfs_dirent structures.
 Add proper cloning to the /dev/fd* "device-"driver.
 Fix a bug in make_dev_alias() handling which made aliases appear
  multiple times.
 Use devfs_clone to implement getdiskbyname()
 Make specfs maintain the stat(2) timestamps per dev_t
2000-08-24 15:36:55 +00:00
Brian Feldman 0a7d171157 Revert the suser -> suser_xxx change made previously. It was right
before.
2000-08-24 04:54:31 +00:00
Paul Saab 03f808c55a Add a sysctl which hides all process except those that belong to
the user asking for the process list.

Reviewed by:	peter
2000-08-23 21:41:25 +00:00
Poul-Henning Kamp 3f54a085a6 Remove all traces of Julians DEVFS (incl from kern/subr_diskslice.c)
Remove old DEVFS support fields from dev_t.

  Make uid, gid & mode members of dev_t and set them in make_dev().

  Use correct uid, gid & mode in make_dev in disk minilayer.

  Add support for registering alias names for a dev_t using the
  new function make_dev_alias().  These will show up as symlinks
  in DEVFS.

  Use makedev() rather than make_dev() for MFSs magic devices to prevent
  DEVFS from noticing this abuse.

  Add a field for DEVFS inode number in dev_t.

  Add new DEVFS in fs/devfs.

  Add devfs cloning to:
        disk minilayer (ie: ad(4), sd(4), cd(4) etc etc)
        md(4), tun(4), bpf(4), fd(4)

  If DEVFS add -d flag to /sbin/inits args to make it mount devfs.

  Add commented out DEVFS to GENERIC
2000-08-20 21:34:39 +00:00
Poul-Henning Kamp 4fe6d43729 Fix typo in last commit. 2000-08-20 11:46:39 +00:00
Poul-Henning Kamp e39c53eda5 Centralize the canonical vop_access user/group/other check in vaccess().
Discussed with: bde
2000-08-20 08:36:26 +00:00
David Malone a5c4836d39 Replace the mbuf external reference counting code with something
that should be better.

The old code counted references to mbuf clusters by using the offset
of the cluster from the start of memory allocated for mbufs and
clusters as an index into an array of chars, which did the reference
counting. If the external storage was not a cluster then reference
counting had to be done by the code using that external storage.

NetBSD's system of linked lists of mbufs was cosidered, but Alfred
felt it would have locking issues when the kernel was made more
SMP friendly.

The system implimented uses a pool of unions to track external
storage. The union contains an int for counting the references and
a pointer for forming a free list. The reference counts are
incremented and decremented atomically and so should be SMP friendly.
This system can track reference counts for any sort of external
storage.

Access to the reference counting stuff is now through macros defined
in mbuf.h, so it should be easier to make changes to the system in
the future.

The possibility of storing the reference count in one of the
referencing mbufs was considered, but was rejected 'cos it would
often leave extra mbufs allocated. Storing the reference count in
the cluster was also considered, but because the external storage
may not be a cluster this isn't an option.

The size of the pool of reference counters is available in the
stats provided by "netstat -m".

PR:		19866
Submitted by:	Bosko Milekic <bmilekic@dsuper.net>
Reviewed by:	alfred (glanced at by others on -net)
2000-08-19 08:32:59 +00:00
Poul-Henning Kamp 39f70682ae Introduce vop_stdinactive() and make it the default if no vop_inactive
is declared.

Sort and prune a few vop_op[].
2000-08-18 10:01:02 +00:00
Brian Feldman 9b96968623 Fix a couple cases where p_trespass wasn't transitioned into place.
Make RTP_SET (rtprio) only accessible to real root, not root in jails.
2000-08-16 23:28:54 +00:00
Peter Wemm 37b087a645 Clean up some low level bootstrap code:
- stop using the evil 'struct trapframe' argument for mi_startup()
  (formerly main()).  There are much better ways of doing it.
- do not use prepare_usermode() - setregs() in execve() will do it
  all for us as long as the p_md.md_regs pointer is set.  (which is
  now done in machdep.c rather than init_main.c.  The Alpha port did it
  this way all along and is much cleaner).
- collect all the magic %cr0 etc register settings into one place and
  have the AP's call that instead of using magic numbers (!!) that keep
  changing over and over again.
- Make it safe to call kthread_create() earlier, including during the
  device probe sequence.  It doesn't need the callback mechanism that
  NetBSD's version uses.
- kthreads created this way are root-less as they exist before the root
  filesystem is mounted.  init(1) is set up so that it aquires the root
  pointers prior to running.  If other kthreads want filesystem acccess
  we can make this code more generic.
- set all threads start times once we have decided what time it is.
- init uses a trampoline rather than the evil prepare_usermode() hack.
- kern_descrip.c has a couple of tweaks to deal with forking when there
  is no rootdir or cwd etc.
- adjust the early SYSINIT() sequence so that a few prereqisites are in
  place. eg: make sure the run queue is initialized before doing forks.

With this, the USB code can easily create a kthread to do the device
tree discovery.  (I have tested it, it works nicely).

There are still some open issues before this is truely useful.
- tsleep() does not like working before the clock is running.  It
  sort-of tries to spin wait, but it can do more useful things now.
- stopping a kthread in kld code at unload time is "interesting" but
  we have a solution for that.

The Alpha code needs no changes for this.  It already uses pretty much the
same strategies, but a little cleaner.
2000-08-11 09:05:12 +00:00
Tor Egge 4428d39d63 Don't skip IOAPIC id conflict detection when only one pci bus is present.
PR:		20312
Reviewed by:	Steve Roome <steve@sse0691.bri.hp.com>
2000-08-10 17:33:24 +00:00
Tor Egge 3c2498c0d3 Don't set flags on the mount structure before all permission checks have
been done.

Don't allow multiple mount operations with MNT_UPDATE at the same
time on the same mount point.  When the first mount operation
completed, MNT_UPDATE was cleared in the mount structure, causing
the second to complete as if it was a no-update mount operation
with the following bad side effects:

        - mount structure inserted multiple times onto the mountlist
        - vp->v_mountedhere incorrectly set, causing next namei
          operation walking into the mountpoint to crash with
          a locking against myself panic.

Plug a vnode leak in case vinvalbuf fails.
2000-08-09 01:57:11 +00:00
Robert Watson e6a9ab52db o Introduce vn_extattr_{get,set}, wrapper routines for VOP_GETEXTATTR
and VOP_SETEXTATTR to simplify calling from in-kernel consumers,
  such as capability code.  Both accept a vnode (optionally locked,
  with ioflg to indicate that), attribute name, and a buffer + buffer
  length in UIO_SYSSPACE.  Both authorize the call as a kernel request,
  with cred set to NULL for the actual VOP_ calls.

Obtained from:	TrustedBSD Project
2000-08-08 17:15:32 +00:00
Jonathan Lemon a114459191 Make the kqueue socket read filter honor the SO_RCVLOWAT value.
Spotted by:  "Steve M." <stevem@redlinenetworks.com>
2000-08-07 17:52:08 +00:00
Jonathan Lemon ad91b6a280 Fix bug with timeout; previously, when attempting to poll the kqueue by
passing a zero-valued timeout, the code would always sleep for one tick.
Change code to avoid calling tsleep if we have no intention of sleeping.

Bring in bugfix from sys_select.c, r1.60 which also applies here.

Modify error handling slightly; passing in an invalid fd will now result
in EBADF returned in the eventlist, while an attempt to change a knote
which does not exist will result in ENOENT being returned.  Previously
such attempts would fail silently without notification.

Pointed out by: nicolas.leonard@animaths.com
	        Rick Reed (rr@yahoo-inc.com)
2000-08-07 16:45:42 +00:00
Paul Saab c206a8609e Change the behavior of isa_nmi to log an error message instead of
panicing and return a status so that we can decide whether to drop
into DDB or panic.  If the status from isa_nmi is true, panic the
kernel based on machdep.panic_on_nmi, otherwise if DDB is
enabled, drop to DDB based on machdep.ddb_on_nmi.

Reviewed by:	peter, phk
2000-08-06 14:17:21 +00:00
Tor Egge e666f57c3e Be more verbose when changing APIC ID on an IO APIC.
Don't allow cpu entries in the MP table to contain APIC IDs out of range.

Don't write outside array boundaries if an IO APIC entry in the MP table
contains an APIC ID out of range.

Assign APIC IDs for all IO APICs according to section 3.6.6 in the
Intel MP spec:

  - If the current APIC ID on an IO APIC doesn't conflict with other
    IO APICs or CPUs, that APIC ID should be used.  The copy of the MP
    table must be updated if the corresponding APIC ID in the MP table
    is different.

  - If the current APIC ID was in conflict with other units, the
    corresponding APIC ID specified in the MP table is checked for conflict.

  - If a conflict is still found then fall back to using a new unique ID.
    The copy of the MP table must be updated.

  - IDs out of range is considered to be in conflict.

During these operations, the IO_TO_ID array cannot be used, since any
conflict would have caused information loss.  The array is then corrected,
since all APIC ID conflicts should have been resolved.

PR:	20312, 18919
2000-08-06 00:04:03 +00:00
Jeffrey Hsu 51b86781c0 Modify to use fixed STAILQ_LAST().
Reviewed by:	dfr
2000-08-03 16:37:46 +00:00
Peter Wemm 2c7f8b4ebd Fix self referential dependencies. eg: uhub was packaged along with
usb, all in usb.ko.  uhub depends on usb.  The bug was that the preload
processing only adds a module to the list once it's internal dependencies
are resolved... Since it was not "seeing" the internal usb module it
believed that uhub had a missing dependency.
2000-08-02 21:08:53 +00:00
Peter Wemm af4b2d2d1c Fix the SYSINIT() bubble sort. This was fixed in kern_linker.c already. 2000-08-02 21:05:21 +00:00
Jonathan Lemon 1dfd47607b Back out rev 1.12; its not clear that this is the right thing to do,
and in any event, it wasn't done correctly in the first place.
2000-08-01 04:27:50 +00:00
Luoqi Chen 3fb50adb4c Handle write page faults (both write only or read-modify-write) as MI vm
write-only faults.  This would allow write-only mmapped regions to function
correctly.
2000-07-31 14:47:14 +00:00
Alfred Perlstein 9ad48853de mbstat should be a read-only sysctl.
Submitted by: Bosko Milekic <bmilekic@dsuper.net>
2000-07-31 09:24:32 +00:00
Paul Saab 030f7b3faa Remove unnecessary call to splnet when setting an accept filter
since we are already at splnet.
2000-07-31 08:23:43 +00:00
Peter Wemm 3a285cc807 Regen. (Fix SYS_exit) 2000-07-29 10:07:38 +00:00
Peter Wemm 4e0f152bbe Sigh. Fix SYS_exit problems. I misunderstood the significance of these
trailing options.
2000-07-29 10:05:25 +00:00
Paul Saab 0e461cb7e2 Remove this file incase of further confusion. 2000-07-29 04:09:07 +00:00
Peter Wemm 69065e880a Regenerate with makesyscalls.sh 2000-07-29 00:21:50 +00:00
Peter Wemm ac2b067b9a Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping
gcc's internal exit() prototypes and the (futile) hackery that we did to
try and avoid warnings.  main() was renamed for similar reasons.
Remove an exit related hack from makesyscalls.sh.
2000-07-29 00:16:28 +00:00
Peter Wemm 5dec52bada Fix the #ifdef VFS_AIO to not compile a whole bunch of unused stuff in the
!VFS_AIO case.  Lots of things have hooks into here (kqueue, exit(),
 sockets, etc), I elected to keep the external interfaces the same
 rather than spread more #ifdefs around the kernel.
2000-07-28 23:10:10 +00:00
Peter Wemm f7ce4efc8a Fix a const related warning. 2000-07-28 22:41:56 +00:00
Peter Wemm 93e8459a02 Fix some style nits.
Fix(?) some compile warnings regarding const handling.
2000-07-28 22:40:04 +00:00
Peter Wemm c828c7b784 Fix warnings - make kevent args in comment match those in syscalls.master.
Deal with consts.
2000-07-28 22:32:25 +00:00
Peter Wemm b31ae1adc5 Fix a warning that has been annoying me for some time:
"kern/sys_generic.c:358: warning: cast discards qualifiers from pointer
   target type"
The idea for using the uintptr_t intermediate cast for de-constifying
a pointer was hinted at by bde some time ago.
2000-07-28 22:17:42 +00:00
Robert Watson fc3345a4a7 o Modify extattr_{set,get}() syscalls so that partial reads and writes
with an error condition such as EINTR, EWOULDBLOCK, and ERESTART,
  are reported to the application, not silently conceal.  This
  behavior was copied from the {read,write}v() syscalls, and is
  appropriate there but not here.
o Correct a bug in extattr_delete() wherein the LOCKLEAF flag is
  passed to the wrong argument in namei(), resulting in some
  unexpected errors during name resolution, and passing in an unlocked
  vnode.

Obtained from:	TrustedBSD Project
2000-07-28 19:52:38 +00:00
Jonathan Lemon ab2adc20f2 Have kevent() automatically restart if interrupted by a signal. If this
is not desired, then the user can register an EV_SIGNAL filter to
explicitly catch a signal event.

Change requested by: jayanth, ps, peter
		     "Why is kevent non-restartable after a signal?"
2000-07-27 23:06:14 +00:00
Brian Feldman 3c89e357f0 Distinguish between whether ktraceing was enabled before an IO
operation or after it.  If the ktrace operation was enabled while the
process was blocked doing IO, the race would allow it to pass down
invalid (uninitialized) data and panic later down the call stack.
2000-07-27 03:45:18 +00:00
Robert Watson 3ce7b7aa84 o Lock vnode before calling extattr_* VOP's, and modify vnode spec to
allow for that.
o Remember to call NDFREE() if exiting as a result of a failed
  vn_start_write() when snapshotting.

Reviewed by:	mckusick
Obtained from:	TrustedBSD Project
2000-07-26 20:29:20 +00:00
Kirk McKusick 54e53ebda7 Now that buffer locks can be recursive, we need to delete the panics
that complain about them.

Obtained from:	Brian Fundakowski Feldman <green@FreeBSD.org>
2000-07-25 18:28:46 +00:00
Kirk McKusick aec3bbe11c Do not need vrele(nd.ni_vp) as that is done by NDFREE(&nd, 0);
Submitted by:	Peter Holm <pho@freebsd.org>
2000-07-25 05:38:54 +00:00
Robert Watson e2e45aa8a0 o Add missing function return types from capability syscall call stubs,
fix compiler warning.

Submitted by:	jake
2000-07-25 03:37:36 +00:00
Kirk McKusick 9b97113391 This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.
2000-07-24 05:28:33 +00:00
Brian Feldman 55af4c7d94 Using an atomic operation here won't help if nobody else uses them (for
this).  Use the simple_lock() on v_interlock like elsewhere.
2000-07-23 22:19:49 +00:00
Brian Feldman 25ead03462 Solve the problem where it is possible to get the kernel stuck in
a loop down in pmap_init_pt().  A subtraction causes the number of
pages to become negative, that was assigned to an unsigned variable,
and there is a lot of iteration.  The bug is due to the ELF image
activator not properly checking for its files being the correct size
as specified by the ELF header.

The solution is to check that the header doesn't ask for part of a
file when that part of the file doesn't exist.  Make sure to set
VEXEC at the proper times to make the executables immutable (remove
race conditions).  Also, the ELF format specifiies header entries
that allow embedding of other executables (hence how ld-elf.so.1
gets loaded, but not the same as loading shared libraries), so those
executables need to be set VEXEC, too, so they're immutable.

Reviewed by:	peter
2000-07-23 06:49:46 +00:00
Alfred Perlstein f408896444 only allow accept filter modifications on listening sockets
Submitted by: ps
2000-07-20 12:17:17 +00:00
Alfred Perlstein 85f5e7f098 disallow unload until we do proper refcounting 2000-07-20 12:12:41 +00:00
Jonathan Lemon 2ba03123c5 Fix a bug which would cause some knotes to get lost when two kqueues
were being used in a process at the same time.

Test case provided by:  Chris Peiffer <peifferc@CS.Stanford.EDU>
2000-07-18 21:41:47 +00:00
Jonathan Lemon a8e65b915e Simplify kqueue API slightly.
Discussed on:	-arch
2000-07-18 19:31:52 +00:00
Peter Wemm f03c9f90d1 Patch up some bogons in the resource_find() vs resource_find_hard()
interfaces.  The original resource_find() returned a pointer to an internal
resource table entry.  resource_find_hard() dereferences the actual
passed in value (oops!) - effectively trashing random memory due to
the pointer being passed in with a random initial value.

Submitted by:  bde
2000-07-18 06:08:27 +00:00
Andrzej Bialecki bd3cdc3105 These patches implement dynamic sysctls. It's possible now to add
and remove sysctl oids at will during runtime - they don't rely on
linker sets. Also, the node oids can be referenced by more than
one kernel user, which means that it's possible to create partially
overlapping trees.

Add sysctl contexts to help programmers manage multiple dynamic
oids in convenient way.

Please see the manpages for detailed discussion, and example module
for typical use.

This work is based on ideas and code snippets coming from many
people, among them:  Arun Sharma, Jonathan Lemon, Doug Rabson,
Brian Feldman, Kelly Yancey, Poul-Henning Kamp and others. I'd like
to specially thank Brian Feldman for detailed review and style
fixes.

PR:		kern/16928
Reviewed by:	dfr, green, phk
2000-07-15 10:26:04 +00:00
Alfred Perlstein af0e6bcdf0 Make mbstat.m_mtypes seperate and viewable via sysctl, also
expand the size from short to ulong

Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
PR: kern/19809
2000-07-15 06:02:48 +00:00
Paul Saab 88f675ba30 Change the way NMI's are handled. Before, if DDB was enabled and
a NMI occured, you could type continue in DDB and the kernel would
not attempt to detect what type of NMI was recieved.  Now we check
for the type of NMI first and then go to DDB if it is enabled.

This will solve the problem with having DDB enabled and getting an
NMI due to some possibly bad error and being able to continue the
operation of the kernel when you really want to panic and know
what happened.

Submitted by:	jhb
2000-07-14 11:49:44 +00:00
Robert Watson e8483a05a6 o Commit two of two, introducing __cap_{get,set}_{fd,file} syscalls to
modify capability sets on files.

Obtained from:	TrustedBSD Project
2000-07-13 20:38:52 +00:00
Robert Watson 92eebb8a9b o Introduce syscall prototypes, stubs for __cap_{get,set}_{fd,file},
syscalls to manage capability sets on files.  First of two commits.

Obtained from:	TrustedBSD Project
2000-07-13 20:31:24 +00:00
John Baldwin 9c386f6b7d For infinite timeouts, set both the tv_sec and tv_usec fields to zero in
poll() and select().

Noticed by:	Wesley Morgan <morganw@chemicals.tacorp.com>
2000-07-13 02:12:25 +00:00
John Baldwin 4da144c091 Fix a very obscure bug in select() and poll() where the timeout would
never expire if poll() or select() was called before the system had been
in multiuser for 1 second.  This was caused by only checking to see if
tv_sec was zero rather than checking both tv_sec and tv_usec.
2000-07-12 22:46:40 +00:00
Jun-ichiro itojun Hagino f38211642f remove m_pulldown statistics, which is highly experimental and does not
belong to *bsd-merged tree
2000-07-12 16:39:13 +00:00
Kirk McKusick f2a2857bb3 Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
Boris Popov 2ff087318a Correct SYSINIT execution order in the case when KLD contains more
than one SYSINIT with the same 'subsystem' id and different 'order' id.

Reviewed by:	peter
2000-07-09 23:58:56 +00:00
Brian Feldman 7ceba2d755 Remove two micro-pessimizations I made. Bruce is teaching me well :)
KTRPOINT(p, KTR_GENIO) is more uncommon than error == 0, so it should
be first in the && statement.
2000-07-07 22:11:37 +00:00
Brian Feldman 9d1cfdce2a Change that &@!$# UIO_READ to be UIO_WRITE. I tested the ktrace stuff,
but somehow... pass the pointy hat, again!
2000-07-07 21:52:15 +00:00
Boris Popov 3660ebc2c0 Fix support for more than 256 simultaneous mounts. Theoretical limit
is 2^16 mounts per fs type.

Reported by:	Troy Arie Cobb <tcobb@staff.circle.net> via phk
Reviewed by:	bde
2000-07-07 14:01:08 +00:00
John Baldwin 9701cd40b4 Support for unsigned integer and long sysctl variables. Update the
SYSCTL_LONG macro to be consistent with other integer sysctl variables
and require an initial value instead of assuming 0.  Update several
sysctl variables to use the unsigned types.

PR:		15251
Submitted by:	Kelly Yancey <kbyanc@posi.net>
2000-07-05 07:46:41 +00:00
Warner Losh 5d10777c46 End two weeks of on and off debugging. Fix the crash on the Nth
insertion of a CF card, for random values of N > 1.  With these fixes,
I've been able to do 100 insert/remove of the cards w/o a crash with
lots of system activity going on that in the past would help trigger
the crash.

The problem:

FreeBSD creates dev_t's on the fly as they are needed and never
destroys them.  These dev_t's point to a struct disk that is used for
housekeeping on the disk.  When a device goes away, the struct disk
pointer becomes a dangling pointer.  Sometimes when the device comes
back, the pointer will point to the new struct disk (in which case the
insertion will work).  Other times it won't (especially if any length
of time has passed, since it is dependent on memory returned from
malloc).

The Fix:

There is one of these dev_t's that is always correct.  The
device for the WHOLE_DISK_SLICE is always right.  It gets set at
create_disk() time.  So, the fix is to spend a little CPU time and
lookup the WHOLE_DISK_SLICE dev_t and use the si_disk from that in
preference to the one that's in the device asking to do the I/O.  In
addition, we change the test of si_disk == NULL meaning that the dev
needed to inherit properties from the pdev to dev->si_disk !=
pdev->si_disk.  This test is a little stronger than the previous test,
but can sometimes be fooled into not inheriting.  However, the results
of this fooling are that the old values will be used, which will
generally always be the same as before.  si_drv[12] are the only
values that are copied that might pose a problem.  They tend to change
as the si_disk field would change, so it is a hole, but it is a small
hole.

One could correctly argue that one should replace much of this code
with something much much better.  I would be on the pro side of that
argument.

Reviewed by: phk (who also ported the original patch to current)
Sponsored by: Timing Solutions
2000-07-05 06:01:33 +00:00
Jun-ichiro itojun Hagino 686cdd19b1 sync with kame tree as of july00. tons of bug fixes/improvements.
API changes:
- additional IPv6 ioctls
- IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8).
  (also syntax change)
2000-07-04 16:35:15 +00:00
Poul-Henning Kamp 77978ab8bc Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by:	bde
2000-07-04 11:25:35 +00:00
Kirk McKusick c904bbbdd8 Simplify and rationalise the management of the vnode free list
(preparing the code to add snapshots).
2000-07-04 04:32:40 +00:00
Kirk McKusick e6796b67d9 Move the truncation code out of vn_open and into the open system call
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Even if it fails to get the lock it would still truncate the
file even though its open failed. With this change, the truncation
is done only after the lock is successfully acquired.

Obtained from:	 BSD/OS
2000-07-04 03:34:11 +00:00
Kirk McKusick 3764219663 If a buffer flush fails when trying to reclaim a vnode, it is too
late to save the vnode, so just toss any remaining unwritten buffers
rather than leaving them lying around to make trouble in the future.
2000-07-04 03:23:29 +00:00
Kirk McKusick bdbd3ff7cf Update tags directive to reflect the new location of soft updates
and the reorganization of the eisa directory.
2000-07-04 00:18:43 +00:00
Poul-Henning Kamp 3275cf7379 Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES,
that is way cleaner than using the softupdates_stub stunt, which
should be killed when convenient.

Discussed with:	mckusick
2000-07-03 13:26:54 +00:00
Poul-Henning Kamp 9282307a5d Add device_set_softc() which does the obvious.
Not objected to by:	dfr
2000-07-03 13:06:29 +00:00
Poul-Henning Kamp 82d9ae4e32 Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

        -sysctl_vm_zone SYSCTL_HANDLER_ARGS
        +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
2000-07-03 09:35:31 +00:00
Chris Costello d41c16130b Instead of just blindly setting -rw-rw-rw-:
o Set access mode to -r--r--r-- if SS_CANTRCVMORE is set and the receive
  buffer is empty.

o Set access mode to --w--w--w- is SS_CANTSENDMORE is set.

Discussed with:	alfred
2000-07-02 23:56:45 +00:00
Chris Costello 417779230b Report -rw-rw-rw file access modes in soo_stat.
Reviewed by:	alfred
2000-07-02 19:31:00 +00:00
Brian Feldman 42ebfbf227 Modify ktrace's general I/O tracing, ktrgenio(), to use a struct uio *
instead of a struct iovec * array and int len.  Get rid of stupidly trying
to allocate all of the memory and copyin()ing the entire iovec[], and
instead just do the proper VOP_WRITE() in ktrwrite() using a copy of
the struct uio that the syscall originally used.

This solves the DoS which could easily be performed; to work around the
DoS, one could also remove "options KTRACE" from the kernel.  This is
a very strong MFC candidate for 4.1.

Found by:	art@OpenBSD.org
2000-07-02 08:08:09 +00:00
Brian S. Dean c6d3f3bfc1 Fix my own style bugs (use of spaces instead of tabs for indentation).
This is a style-only change.
2000-07-01 02:40:13 +00:00
Archie Cobbs 6c66bbed1a Move the securelevel check before loading KLD's into linker_load_file(),
instead of requiring every caller of linker_load_file() to perform the
check itself. This avoids netgraph loading KLD's when securelevel > 0,
not to mention any future code that may call linker_load_file().

Reviewed by:	dfr
2000-06-29 17:57:04 +00:00
Boris Popov 5badeabaca Move #ifdef to the right place. 2000-06-29 09:26:26 +00:00
Boris Popov 99063cf89e If kernel compiled with INVARIANTS:
On unload, remove references from freelist to memory type defined by module.
Print a warning if module defines and allocate its own memory type, but
didn't free it all on unload.

Reviewed by:	peter
2000-06-29 03:41:30 +00:00
Chris Costello 0e8363eca9 Report a file type (S_IFIFO) in kqueue_stat(). 2000-06-28 19:16:27 +00:00
Alfred Perlstein 1a61fa5e0d don't panic the system when fpathconv is called on an unsupported filetype. 2000-06-27 23:08:36 +00:00
Alfred Perlstein 35b1da8080 remove crufty exec stuff, perl is in the base system
make it work with warnings on (there was some harmless use of uninitialized
variables)
make it work with 'use strict'

Approved by: peter
2000-06-27 19:09:55 +00:00
Poul-Henning Kamp a8b1f9d2c9 Move prtactive to vfs from ufs. It is used all over the place. 2000-06-27 07:46:22 +00:00
Neil Blakey-Milner 47fdd692c6 Add sysctl descriptions to a few sysctls. Simply "documentation".
PR:		kern/8015
Submitted by:	Stefan Eggers <seggers@semyam.dinoco.de>
2000-06-26 13:52:31 +00:00
Peter Wemm ce365ee318 Some changes and fixes from Bruce:
Use strtoul(), not strtol() in the hints decoder so that
    'flags 0xa0ffa0ff' is not truncated to 0x7fffffff.
  Use a stack buffer instead of a static 100 byte bss buffer.
  Use \0 for the NUL character.
  Remove some ``excessive'' parens.
2000-06-26 09:53:37 +00:00
Jonathan Lemon cb5ad9d362 Fix stupid braino in last commit, initialize `vp' before we test vp->v_tag.
Spotted by: dillon
2000-06-25 18:10:45 +00:00
Mark Murray b6e67f5c7d Remove no-longer-relevant comment. 2000-06-25 10:14:06 +00:00
Mark Murray 4eeb4f04c3 Forgot this earlier; delete the old /dev/random driver, bring in the
header for the new.
Reviewed by:	dfr
2000-06-25 09:35:40 +00:00
Dima Ruban 1a432a2f54 Fix typo (inT -> int) 2000-06-23 07:10:34 +00:00
Alfred Perlstein c636255150 fix races in the uidinfo subsystem, several problems existed:
1) while allocating a uidinfo struct malloc is called with M_WAITOK,
   it's possible that while asleep another process by the same user
   could have woken up earlier and inserted an entry into the uid
   hash table.  Having redundant entries causes inconsistancies that
   we can't handle.

   fix: do a non-waiting malloc, and if that fails then do a blocking
   malloc, after waking up check that no one else has inserted an entry
   for us already.

2) Because many checks for sbsize were done as "test then set" in a non
   atomic manner it was possible to exceed the limits put up via races.

   fix: instead of querying the count then setting, we just attempt to
   set the count and leave it up to the function to return success or
   failure.

3) The uidinfo code was inlining and repeating, lookups and insertions
   and deletions needed to be in their own functions for clarity.

Reviewed by: green
2000-06-22 22:27:16 +00:00
Jonathan Lemon c8bea19ee3 Add a hack to fail registration of kq events on a non-ufs filesystem, as
support for those is non-existent at the moment.
2000-06-22 18:41:07 +00:00
Jonathan Lemon d2693dbbc4 Add code so that the udata field is preserved across a TRACK event.
When re-adding an event, do not reset the event state.  If the event was
pending, it will remain pending.  This allows the user to change the udata
field after the event was registered, while not losing any events which
have already occurred.

Reported by:   jmg
2000-06-22 18:39:31 +00:00
Neil Blakey-Milner 445572c1ed Add 'kern.disks', a sysctl which returns the list of disks from
disk_enumerate(), space delimited.  This allows non-root users to get a
list of disks and will simplify libdisk's Disk_Names().

Reviewed by:	phk
2000-06-22 11:44:43 +00:00
Alfred Perlstein a79b71281c return of the accept filter part II
accept filters are now loadable as well as able to be compiled into
the kernel.

two accept filters are provided, one that returns sockets when data
arrives the other when an http request is completed (doesn't work
with 0.9 requests)

Reviewed by: jmg
2000-06-20 01:09:23 +00:00
Alfred Perlstein a72fda7154 backout accept optimizations.
Requested by: jmg, dcs, jdp, nate
2000-06-18 08:49:13 +00:00
Poul-Henning Kamp 7c50d77218 Revert part of my bioops change which implemented panic(8). 2000-06-16 14:32:13 +00:00
Poul-Henning Kamp a2e7a027a7 Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
2000-06-16 08:48:51 +00:00
Robert Watson 625cc84808 Second of two commits adding capability manipulation syscalls for
processes.

Obtained from:	TrustedBSD Project
2000-06-15 23:27:18 +00:00
Robert Watson b09b66abf6 Introduce syscalls for process capability manipulation. Currently backs
onto already committed stubs.  Commit one of two.

Reviewed by:	Damned if I can remember.  Many people.
Obtained from:	TrustedBSD Project
2000-06-15 23:08:17 +00:00
Poul-Henning Kamp 4bd02a5609 Add disk_enumerate() for finding names of disks. Vinum and libh will
need this RSN.

Remove a pointless warning in the root device locating code.

Remove the "wd" compatibility name from the "ad" driver.

WARNING: If you have not updated to use /dev/wd* in your /etc/fstab
and modern bootblocks, it would be a very good idea to do so BEFORE
you upgrade your kernel.
2000-06-15 20:30:53 +00:00
Alfred Perlstein 8f4e4aa5f1 add socketoptions DELAYACCEPT and HTTPACCEPT which will not allow an accept()
until the incoming connection has either data waiting or what looks like a
HTTP request header already in the socketbuffer.  This ought to reduce
the context switch time and overhead for processing requests.

The initial idea and code for HTTPACCEPT came from Yahoo engineers and has
been cleaned up and a more lightweight DELAYACCEPT for non-http servers
has been added

Reviewed by: silence on hackers.
2000-06-15 18:18:43 +00:00
Peter Wemm 7d02379e48 As a bit of a gross hack, allow earlier access to both the static and
dynamic hints.  This allows the resource_XXX_value() calls to work
before malloc() has started.  This gets the serial console working as well
as a few other things.
2000-06-15 09:57:20 +00:00
Peter Wemm 690f8fc4c3 Fix a stray debug output. change if (1 || bootverbose) to if (bootverbose) 2000-06-15 04:12:17 +00:00
Bruce Evans 8e8cac5555 sys/malloc.h:
Order the SYSINIT() for MALLOC_DEFINE() correctly so that malloc()
doesn't have to waste time initializing itself.  The
(SI_SUB_KMEM, SI_ORDER_ANY) order was shared with syscons' SYSINIT()
for scmeminit(), and scmeminit() calls malloc(), so malloc()
initialization was not always complete on the first call to malloc().

kern/kern_malloc.c:
- Removed self-initialization in malloc().
- Removed half-baked sanity check in free().  Trust MALLOC_DEFINE().
2000-06-14 18:31:42 +00:00
Peter Wemm f71c01cc52 Borrow phk's axe and apply the next stage of config(8)'s evolution.
Use Warner Losh's "hint" driver to decode ascii strings to fill the
resource table at boot time.

config(8) no longer generates an ioconf.c table - ie: the configuration
no longer has to be compiled into the kernel.  You can reconfigure your
isa devices with the likes of this at loader(8) time:
  set hint.ed.0.port=0x320

userconfig will be rewritten to use this style interface one day and will
move to /boot/userconfig.4th or something like that.

It is still possible to statically compile in a set of hints into a kernel
if you do not wish to use loader(8).  See the "hints" directive in GENERIC
as an example.

All device wiring has been moved out of config(8).  There is a set of
helper scripts (see i386/conf/gethints.pl, and the same for alpha and pc98)
that extract the 'at isa? port foo irq bar' from the old files and produces
a hints file.  If you install this file as /boot/device.hints (and update
/boot/defaults/loader.conf - You can do a build/install in sys/boot) then
loader will load it automatically for you.  You can also compile in the
hints directly with:  hints "device.hints"  as well.

There are a few things that I'm not too happy with yet.  Under this scheme,
things like LINT would no longer be useful as "documentation" of settings.
I have renamed this file to 'NOTES' and stored the example hints strings
in it.  However... this is not something that config(8) understands, so
there is a script that extracts the build-specific data from the
documentation file (NOTES) to produce a LINT that can be config'ed and
built.  A stack of man4 pages will need updating. :-/

Also, since there is no longer a difference between 'device' and
'pseudo-device' I collapsed the two together, and the resulting 'device'
takes a 'number of units' for devices that still have it statically
allocated.  eg:  'device fe 4' will compile the fe driver with NFE set
to 4.  You can then set hints for 4 units (0 - 3).  Also note that
'device fe0' will be interpreted as "zero units of 'fe'" which would be
bad, so there is a config warning for this.  This is only needed for
old drivers that still have static limits on numbers of units.
All the statically limited drivers that I could find were marked.

Please exercise EXTREME CAUTION when transitioning!

Moral support by: phk, msmith, dfr, asmodai, imp, and others
2000-06-13 22:28:50 +00:00
Jeroen Ruigrok van der Werven 3b43fd626a Fix panic by moving the prp == 0 check up the order of sanity checks.
Submitted by:	Bart Thate <freebsd@1st.dudi.org> on -current
Approved by:	rwatson
2000-06-13 15:44:04 +00:00
Alfred Perlstein 8757e5bbc5 unstatic getfp() so that other subsystems can use it.
make sendfile() use it.

Approved by: dg
2000-06-12 18:06:12 +00:00
Bruce Evans 0477138dad Fixed allocation of unit numbers. Allocate the amount of space actually
required (rounded up a little) instead of twice the previous amount (or
a fixed amount for the first allocation).

The bug caused memory corruption when a new unit number for a devclass
was more than about twice the previous maximum one (or more than 3 for
the first one), so it corrupted memory (which happened to be the atkbdc
port resource list) in the reporter's configuration with sio unit
numbers { 0, 25, 1, 2, ... }.

Reviewed by:	dfr
Reported by:	Leonid Lukiyanets <stalwar78@hotmail.com>
2000-06-11 07:19:20 +00:00
Poul-Henning Kamp c27f4d3c50 fix a typo 2000-06-10 19:21:20 +00:00
Peter Wemm 53cc6add2a Unused include: #include "pty.h" 2000-06-10 07:12:40 +00:00
Jonathan Lemon d36cb22369 malloc(..., M_WAITOK) will not return NULL, so remove the error
handling for this case (which was slightly broken anyway)

Fix up some whitespace problems while I'm here too.

Submitted by:  alfred   (in a slightly different form)
2000-06-10 01:51:18 +00:00
Robert Watson e812e4917d Dammit.
Trimmed an extra sysctl when I moved kern.suser_permitted from kern_mib.c
to kern_prot.c.  This commit should restore it, as well as fix the
resulting build problems.

Submitted by:	asmodai
2000-06-07 18:54:41 +00:00
Robert Watson a996141f6e Introduce additional POSIX.1e-related stubs
o options CAPABILITIES
o kern/kern_cap.c -- syscall stubs returning ENOSYS

syscalls.master changes to follow

Obtained from:	TrustedBSD Project
2000-06-07 04:53:49 +00:00
Robert Watson 579f4eb4cd o bde suggested moving the SYSCTL from kern_mib to the more appropriate
kern_prot, which cleans up some namespace issues
o Don't need a special handler to limit un-setting, as suser is used to
  protect suser_permitted, making it one-way by definition.

Suggested by:	bde
2000-06-05 18:30:55 +00:00
Robert Watson 0309554711 o Introduce kern.suser_permitted, a sysctl that disables the suser_xxx()
returning anything but EPERM.
o suser is enabled by default; once disabled, cannot be reenabled
o To be used in alternative security models where uid0 does not connote
  additional privileges
o Should be noted that uid0 still has some additional powers as it
  owns many important files and executables, so suffers from the same
  fundamental security flaws as securelevels.  This is fixed with
  MAC integrity protection code (in progress)
o Not safe for consumption unless you are *really* sure you don't want
  things like shutdown to work, et al :-)

Obtained from:	TrustedBSD Project
2000-06-05 14:53:55 +00:00
Robert Watson 7cadc2663e o Modify jail to limit creation of sockets to UNIX domain sockets,
TCP/IP (v4) sockets, and routing sockets.  Previously, interaction
  with IPv6 was not well-defined, and might be inappropriate for some
  environments.  Similarly, sysctl MIB entries providing interface
  information also give out only addresses from those protocol domains.

  For the time being, this functionality is enabled by default, and
  toggleable using the sysctl variable jail.socket_unixiproute_only.
  In the future, protocol domains will be able to determine whether or
  not they are ``jail aware''.

o Further limitations on process use of getpriority() and setpriority()
  by jailed processes.  Addresses problem described in kern/17878.

Reviewed by:	phk, jmg
2000-06-04 04:28:31 +00:00
Bruce Evans f47f0edde4 Use "nm | awk ..." instead of genassym(1) to generate symbol value headers.
Symbol values are now represented using array sizes (4 arrays per symbol
so that 16-bit machines can represent 64-bit values) instead of being raw
binary values.

Reviewed by:	marcel
2000-06-02 09:27:48 +00:00
Mike Smith c3c50c4e3a Further fixes for multiple-IO-APIC systems from Tor Egge:
Further experimentation showed that some Dell 2450 machines with the
prevention kludge installed still got T_RESERVED traps.  CPU interrupt
vector 0x7A was observed to be triggered.  This might have been the
bitwise OR of two different vectors sent from each of the IOAPICs at
the same time.

	IOAPIC #0: 0x68 --> irq 8: RTC timer interrupt
	IOAPIC #1: 0x32 --> irq 18: scsi host adapter or network interface
		   ----
		   0x7a --> T_RESERVED

Both IOAPICs had ID 0.

Appendix B.3 in the MP spec indicates that the operating system is
responsible for assigning unique IDs to the IOAPICs.

The enclosed patch programs the IOAPIC IDs according to the IOAPIC
entries in the MP table.

Submitted by:	tegge
2000-05-31 21:37:28 +00:00
Matthew Dillon 8b03c8ed5e This is a cleanup patch to Peter's new OBJT_PHYS VM object type
and sysv shared memory support for it.  It implements a new
    PG_UNMANAGED flag that has slightly different characteristics
    from PG_FICTICIOUS.

    A new sysctl, kern.ipc.shm_use_phys has been added to enable the
    use of physically-backed sysv shared memory rather then swap-backed.
    Physically backed shm segments are not tracked with PV entries,
    allowing programs which use a large shm segment as a rendezvous
    point to operate without eating an insane amount of KVM in the
    PV entry management.  Read: Oracle.

    Peter's OBJT_PHYS object will also allow us to eventually implement
    page-table sharing and/or 4MB physical page support for such segments.
    We're half way there.
2000-05-29 22:40:54 +00:00
Doug Rabson ca2e05343b Add taskqueue system for easy-to-use SWIs among other things.
Reviewed by: arch
2000-05-28 15:45:30 +00:00
Søren Schmidt d5f65fcbd7 If devclass_alloc_unit() is called with a wired unit #, and this is
buzy, only search upwards for a free slot to use..

This broke unit numbering on ATA systems where PCI attached controllers
come before the mainboard ones...

Reviewed by: dfr
2000-05-26 13:59:05 +00:00
Jake Burkholder e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder 740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Mike Smith b38f58db69 Make a trip to Pointy-Hats-R-Us and actually include the header that
defines ROOTDEVNAME.

Submitted by:	"Jeffrey S. Sharp" <jss@subatomix.com>
2000-05-22 17:25:47 +00:00
David E. O'Brien d4af7a50dc Sort the sys includes. 2000-05-22 17:09:13 +00:00
Brian Feldman a274d19ba2 Back out NOTE_EXIT status reporting pending discussion. 2000-05-21 16:27:41 +00:00
Peter Wemm 24488c7498 Provide a temporary undocumented option: SHM_PHYS_BACKED. This will
become sysctl and/or flags controlled later.  It's mainly here for an
easy place to test the physical memory backed objects.
2000-05-21 13:52:13 +00:00
Brian Feldman a24b514d72 Put the wait(2) exit status in "data" for NOTE_EXIT kevents. 2000-05-17 01:16:11 +00:00