Commit graph

91 commits

Author SHA1 Message Date
John Baldwin 33a9ed9d0e Change the pfind() and zpfind() functions to lock the process that they
find before releasing the allproc lock and returning.

Reviewed by:	-smp, dfr, jake
2001-04-24 00:51:53 +00:00
John Baldwin bc4ffcc97f Add missing includes of <sys/sx.h>
Reported by:	peter
2001-03-28 15:04:22 +00:00
John Baldwin 1005a129e5 Convert the allproc and proctree locks from lockmgr locks to sx locks. 2001-03-28 11:52:56 +00:00
John Baldwin 19eb87d22a Grab the process lock while calling psignal and before calling psignal. 2001-03-07 03:37:06 +00:00
John Baldwin f664e29076 - Hold both an exclusive proctree lock and the proc lock when reparenting
a traced process during exit.
- Lock the parent process while sending it SIGCHLD.
2001-03-07 02:17:43 +00:00
David E. O'Brien 21a3ee0ead MFS: bring the consistent `compat_3_brand' support into -CURRENT
(the work was first done in the RELENG_4 branch near a release
	 during a MFC to make the code cleaner and more consistent)
2001-02-24 22:20:11 +00:00
Robert Watson 91421ba234 o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
  pr_free(), invoked by the similarly named credential reference
  management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
  of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
  rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
  flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
  mutex use.

Notes:

o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
  credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
  required to protect the reference count plus some fields in the
  structure.

Reviewed by:	freebsd-arch
Obtained from:	TrustedBSD Project
2001-02-21 06:39:57 +00:00
Bosko Milekic 9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Jeroen Ruigrok van der Werven 1a6e52d0e9 Fix typo: seperate -> separate.
Seperate does not exist in the english language.
2001-02-06 11:21:58 +00:00
Jeroen Ruigrok van der Werven f09deb6962 Fix typo: wierd -> weird.
There is no such thing as wierd in the english language.
2001-02-06 09:25:10 +00:00
John Baldwin ba88dfc733 Back out proc locking to protect p_ucred for obtaining additional
references along with the actual obtaining of additional references.
2001-01-27 00:01:31 +00:00
John Baldwin f0ae4fa2db - Back out over-aggressive locking of p->p_cred.
- Back out locking ucred's and bumping refcounts for vnode operations.
2001-01-26 23:54:40 +00:00
John Baldwin 97eba215ac Argh, atomic_store_rel -> atomic_store_rel_int. 2001-01-23 21:40:07 +00:00
John Baldwin 505eb4d591 Woops, add in missing headers. 2001-01-23 21:39:15 +00:00
John Baldwin 1fed5f0326 Proc locking. 2001-01-23 21:33:55 +00:00
John Baldwin c41e968cbd Use queue macros. 2001-01-23 21:32:02 +00:00
John Baldwin bc4afa1f45 - Add proc locking.
- Fix several bugs in the wait syscall, including freeing the actual
  proc start, freeing the args, freeing the prison, and other minor
  nits.
- Use appropriate queue(3) macros.
- Use zpfind() instead of walking zombproc ourselves.
2001-01-23 21:30:25 +00:00
John Baldwin 48f8ca448a - Use proper atomic operations to make the run time initialization
controlled by svr_str_initialized be MP safe.
2001-01-23 21:07:16 +00:00
John Baldwin 37bf0fc1f8 FreeBSD doesn't have p_emuldata, and our stackgap_init() doesn't take an
argument.
2001-01-23 21:02:44 +00:00
Garrett Wollman 074a4a2c84 Finish deprecating <sys/select.h> in favor of <sys/selinfo.h> in kernel code. 2001-01-20 02:24:07 +00:00
Brian Feldman 3a83f9ac64 Take 10 seconds to actually fix the chgproccnt rather than just make it
explicitly error.  If the module is horribly broken, it should be
temporarily removed from src/sys/modules.
2001-01-09 04:55:37 +00:00
Garrett Wollman 5ae1c3979c With some trepidation, add a `#error' directive to this module. It was
broken and not fixed by whoever changed the interface of chgproccnt();
in the state it is in it could not possibly work (dereferencing an integer).
2001-01-09 04:27:09 +00:00
Jake Burkholder 98f03f9030 Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by:	jhb, -smp@
2000-12-23 19:43:10 +00:00
Jake Burkholder c0c2557090 - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead
of explicit calls to lockmgr.  Also provides macros for the flags
  pased to specify shared, exclusive or release which map to the
  lockmgr flags.  This is so that the use of lockmgr can be easily
  replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
2000-12-13 00:17:05 +00:00
Marcel Moolenaar 4c9671ee0f Include machine/cpu.h for cpu_getstack().
Spotted by: jake
2000-12-03 01:56:15 +00:00
Marcel Moolenaar d034d459da Don't use p->p_sigstk.ss_flags to keep state of whether the
process is on the alternate stack or not. For compatibility
with sigstack(2) state is being updated if such is needed.

We now determine whether the process is on the alternate
stack by looking at its stack pointer. This allows a process
to siglongjmp from a signal handler on the alternate stack
to the place of the sigsetjmp on the normal stack. When
maintaining state, this would have invalidated the state
information and causing a subsequent signal to be delivered
on the normal stack instead of the alternate stack.

PR: 22286
2000-11-30 05:23:49 +00:00
Matthew Dillon e0be78f6dc Forgot to patch this file in file descriptor race fix commit
Submitted-by: "Danny J. Zerkel" <dzerkel@columbus.rr.com>
2000-11-23 11:05:14 +00:00
Marcel Moolenaar 806d7daafe Make MINSIGSTKSZ machine dependent, and have the sigaltstack
syscall compare against a variable sv_minsigstksz in struct
sysentvec as to properly take the size of the machine- and
ABI dependent struct sigframe into account.

The SVR4 and iBCS2 modules continue to have a minsigstksz of
8192 to preserve behavior. The real values (if different) are
not known at this time. Other ABI modules use the real
values.

The native MINSIGSTKSZ is now defined as follows:

Arch		MINSIGSTKSZ
----		-----------
alpha		    4096
i386		    2048
ia64		   12288

Reviewed by: mjacob
Suggested by: bde
2000-11-09 08:25:48 +00:00
David E. O'Brien 83d2913008 Make the target a little bit more generic. 2000-11-01 08:47:34 +00:00
David E. O'Brien 50e59af900 Cleanup after repo copy of sys/svr4 to sys/compat/svr4. 2000-08-31 22:54:09 +00:00
Peter Wemm 3a285cc807 Regen. (Fix SYS_exit) 2000-07-29 10:07:38 +00:00
Peter Wemm 4e0f152bbe Sigh. Fix SYS_exit problems. I misunderstood the significance of these
trailing options.
2000-07-29 10:05:25 +00:00
Peter Wemm 69065e880a Regenerate with makesyscalls.sh 2000-07-29 00:21:50 +00:00
Peter Wemm ac2b067b9a Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping
gcc's internal exit() prototypes and the (futile) hackery that we did to
try and avoid warnings.  main() was renamed for similar reasons.
Remove an exit related hack from makesyscalls.sh.
2000-07-29 00:16:28 +00:00
Kirk McKusick f2a2857bb3 Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
Brian Feldman 42ebfbf227 Modify ktrace's general I/O tracing, ktrgenio(), to use a struct uio *
instead of a struct iovec * array and int len.  Get rid of stupidly trying
to allocate all of the memory and copyin()ing the entire iovec[], and
instead just do the proper VOP_WRITE() in ktrwrite() using a copy of
the struct uio that the syscall originally used.

This solves the DoS which could easily be performed; to work around the
DoS, one could also remove "options KTRACE" from the kernel.  This is
a very strong MFC candidate for 4.1.

Found by:	art@OpenBSD.org
2000-07-02 08:08:09 +00:00
Alfred Perlstein c636255150 fix races in the uidinfo subsystem, several problems existed:
1) while allocating a uidinfo struct malloc is called with M_WAITOK,
   it's possible that while asleep another process by the same user
   could have woken up earlier and inserted an entry into the uid
   hash table.  Having redundant entries causes inconsistancies that
   we can't handle.

   fix: do a non-waiting malloc, and if that fails then do a blocking
   malloc, after waking up check that no one else has inserted an entry
   for us already.

2) Because many checks for sbsize were done as "test then set" in a non
   atomic manner it was possible to exceed the limits put up via races.

   fix: instead of querying the count then setting, we just attempt to
   set the count and leave it up to the function to return success or
   failure.

3) The uidinfo code was inlining and repeating, lookups and insertions
   and deletions needed to be in their own functions for clarity.

Reviewed by: green
2000-06-22 22:27:16 +00:00
Jake Burkholder e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder 740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Bruce Evans fdf7d274e8 Regenerated (to fix "created from" lines, and to fix the previous
regeneration which somehow used the wrong syscalls.master file,
resulting in unbuildable svr4_sysent.c).
2000-05-10 14:38:28 +00:00
Bruce Evans 43914915de Fixed the "created from" lines generated from this file. makesyscalls.sh
expects the active id to be on the first line of the specification file.

Fixed some nearby gratuitous differences with kern/syscalls.master.
2000-05-10 14:32:32 +00:00
Bruce Evans 9114579d7a Regenerated (fixed the calculation of sy_nargs in sysent tables). 2000-05-09 21:52:02 +00:00
Bruce Evans 563ae30721 Don't forget to back up svr4_syscallnames.c. Don't depend on side effects
to generate it.
2000-05-09 21:40:01 +00:00
Bruce Evans 8bc3445472 Fixed the return type and args struct tag for exit(). They were wrong in
all emulators.  These entries were unused, so the bug had no effect, but
the the args struct tag will be used to calculate sy_nargs correctly.
2000-05-09 18:08:51 +00:00
Brian Feldman d6b17eeba3 Give the "streams" modulea version (1) and depend on it from the
"svr4elf" module.  This unbreaks the SVR4 KLD (which had an undefined
function because of thenewly-committed KLD enhancements).
2000-05-06 01:39:45 +00:00
Poul-Henning Kamp 2c9b67a8df Remove unneeded #include <vm/vm_zone.h>
Generated by:	src/tools/tools/kerninclude
2000-04-30 18:52:11 +00:00
Poul-Henning Kamp eb95c536ad Remove unneeded #include <sys/kernel.h> 2000-04-29 15:36:14 +00:00
Poul-Henning Kamp 3389ae9350 Remove ~25 unneeded #include <sys/conf.h>
Remove ~60 unneeded #include <sys/malloc.h>
2000-04-19 14:58:28 +00:00
Poul-Henning Kamp ed6aff7387 Remove unneeded <sys/buf.h> includes.
Due to some interesting cpp tricks in lockmgr, the LINT kernel shrinks
by 924 bytes.
2000-04-18 15:15:39 +00:00
David E. O'Brien c815a20cb2 Change our ELF binary branding to something more acceptable to the Binutils
maintainers.

After we established our branding method of writing upto 8 characters of
the OS name into the ELF header in the padding; the Binutils maintainers
and/or SCO (as USL) decided that instead the ELF header should grow two new
fields -- EI_OSABI and EI_ABIVERSION.  Each of these are an 8-bit unsigned
integer.  SCO has assigned official values for the EI_OSABI field.  In
addition to this, the Binutils maintainers and NetBSD decided that a better
ELF branding method was to include ABI information in a ".note" ELF
section.

With this set of changes, we will now create ELF binaries branded using
both "official" methods.  Due to the complexity of adding a section to a
binary, binaries branded with ``brandelf'' will only brand using the
EI_OSABI method.  Also due to the complexity of pulling a section out of an
ELF file vs. poking around in the ELF header, our image activator only
looks at the EI_OSABI header field.

Note that a new kernel can still properly load old binaries except for
Linux static binaries branded in our old method.

  *
  * For a short period of time, ``ld'' will also brand ELF binaries
  * using our old method.  This is so people can still use kernel.old
  * with a new world.  This support will be removed before 5.0-RELEASE,
  * and may not last anywhere upto the actual release.  My expiration
  * time for this is about 6mo.
  *
2000-04-18 02:39:26 +00:00