Commit graph

4155 commits

Author SHA1 Message Date
John Baldwin f21fc12736 Add a temporary hack that will go away with the ucred API update to bzero
the duplicated mutex before initializing it to avoid triggering the check
for init'ing an already initialized mutex.
2001-10-10 20:45:40 +00:00
John Baldwin 6a40eccec3 Malloc mutexes pre-zero'd as random garbage (including 0xdeadcode) my
trigget the check to make sure we don't initalize a mutex twice.
2001-10-10 20:43:50 +00:00
Doug Rabson e913ca22e2 Move setregs() out from under the PROC_LOCK so that it can use functions
list suword() which may trap.
2001-10-10 20:04:57 +00:00
Robert Watson 8a7d8cc675 - Combine kern.ps_showallprocs and kern.ipc.showallsockets into
a single kern.security.seeotheruids_permitted, describes as:
  "Unprivileged processes may see subjects/objects with different real uid"
  NOTE: kern.ps_showallprocs exists in -STABLE, and therefore there is
  an API change.  kern.ipc.showallsockets does not.
- Check kern.security.seeotheruids_permitted in cr_cansee().
- Replace visibility calls to socheckuid() with cr_cansee() (retain
  the change to socheckuid() in ipfw, where it is used for rule-matching).
- Remove prison_unpcb() and make use of cr_cansee() against the UNIX
  domain socket credential instead of comparing root vnodes for the
  UDS and the process.  This allows multiple jails to share the same
  chroot() and not see each others UNIX domain sockets.
- Remove unused socheckproc().

Now that cr_cansee() is used universally for socket visibility, a variety
of policies are more consistently enforced, including uid-based
restrictions and jail-based restrictions.  This also better-supports
the introduction of additional MAC models.

Reviewed by:	ps, billf
Obtained from:	TrustedBSD Project
2001-10-09 21:40:30 +00:00
John Baldwin 8688bb9383 proces -> process in a comment. 2001-10-09 17:25:30 +00:00
Robert Watson 32d186043b o Recent addition of (p1==p2) exception in p_candebug() permitted
processes to attach debugging to themselves even though the
  global kern_unprivileged_procdebug_permitted policy might disallow
  this.
o Move the kern_unprivileged_procdebug_permitted check above the
  (p1==p2) check.

Reviewed by:	des
2001-10-09 16:56:29 +00:00
John Baldwin 74e4502e62 Replace 'curproc' with 'td->td_proc'. 2001-10-08 21:05:46 +00:00
Matthew Dillon 917efbaaba WS Cleanup 2001-10-08 19:51:13 +00:00
Dag-Erling Smørgrav 3da3249106 Dissociate ptrace from procfs.
Until now, the ptrace syscall was implemented as a wrapper that called
various functions in procfs depending on which ptrace operation was
requested.  Most of these functions were themselves wrappers around
procfs_{read,write}_{,db,fp}regs(), with only some extra error checks,
which weren't necessary in the ptrace case anyway.

This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c
(renaming it to proc_rwmem() in the process), and implements ptrace()
directly in terms of procfs_{read,write}_{,db,fp}regs() instead of
having it fake up a struct uio and then call procfs_do{,db,fp}regs().

It also moves the prototypes for procfs_{read,write}_{,db,fp}regs()
and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files
except procfs_machdep.c as "optional procfs" instead of "standard".
2001-10-07 20:08:42 +00:00
Dag-Erling Smørgrav 23fad5b6c9 Always succeed if the target process is the same as the requesting process. 2001-10-07 20:06:03 +00:00
Ian Dowse 80f42b555d Fix a typo in do_sigaction() where sa_sigaction and sa_handler were
confused. Since sa_sigaction and sa_handler alias each other in a
union, the bug was completely harmless. This had been fixed as part
of the SIGCHLD changes in revision 1.125, but it was reverted when
they were backed out in revision 1.126.
2001-10-07 16:11:37 +00:00
Robert Watson c175d2226f o Introduce an 'options REGRESSION'-dependant sysctl namespaces,
'regression.*'.
o Add 'regression.securelevel_nonmonotonic', conditional on 'options
  REGRESSION', which allows the securelevel to be lowered for the purposes
  of efficient regression testing of securelevel policy decisions.
  Regression tests for securelevels will be committed shortly.

NOTE: 'options REGRESSION' should never be used on production machines, as
it permits violation of system invariants so as to improve the ability to
effectively test edge cases, and improve testing efficiency.
2001-10-07 03:51:22 +00:00
Marcel Moolenaar 49ead724c6 Fix breakage caused by previous commit. The lkmnosys and lkmressys
syscalls are of type NODEF but not in a way that fits the given
definition of that type. The exact difference of lkmressys and
lkmnosys is unclear, which makes it all the more confusing. A
reevaluation of what we have and what we really need is in order.

Spotted by: Maxime Henrion <mux@qualys.com>
Pointy hat: marcel
2001-10-07 00:16:31 +00:00
Matthew Dillon 845bd795c9 vinvalbuf() was only waiting for write-I/O to complete. It really has to
wait for both read AND write I/O to complete.  Only NFS calls vinvalbuf()
on an active vnode (when the server indicates that the file is stale), so
this bug fix only effects NFS clients.

MFC after:	3 days
2001-10-05 20:10:32 +00:00
John Baldwin 43150722c9 The aio kthreads start off with a root credential just like all other
kthreads, so don't malloc a ucred just so we can create a duplicate of the
one we already have.
2001-10-05 17:55:11 +00:00
Paul Saab 4787fd37af Only allow users to see their own socket connections if
kern.ipc.showallsockets is set to 0.

Submitted by:	billf (with modifications by me)
Inspired by:	Dave McKay (aka pm aka Packet Magnet)
Reviewed by:	peter
MFC after:	2 weeks
2001-10-05 07:06:32 +00:00
Dag-Erling Smørgrav 50f74e92b8 Final style(9) commit: placement of opening brace; a continuation indent I
missed in the previous commit; a line that exceeded 80 characters.  No
functional changes, but the object file's md5 checksum changes because some
lines have been displaced.
2001-10-04 16:35:44 +00:00
Dag-Erling Smørgrav 8a8d4e459c More style(9) fixes: no spaces between function name and parameter list;
some indentation fixes (particularly continuation lines).

Reviewed by:	md5(1)
2001-10-04 16:29:45 +00:00
Dag-Erling Smørgrav c5799337ea This file had a mixture of "return foo;" and "return (foo);"; standardize
on "return (foo);" as mandated by style(9).

Reviewed by:	md5(1)
2001-10-04 16:09:22 +00:00
David Malone 2bc21ed985 Hopefully improve control message passing over Unix domain sockets.
1) Allow the sending of more than one control message at a time
over a unix domain socket. This should cover the PR 29499.

2) This requires that unp_{ex,in}ternalize and unp_scan understand
mbufs with more than one control message at a time.

3) Internalize and externalize used to work on the mbuf in-place.
This made life quite complicated and the code for sizeof(int) <
sizeof(file *) could end up doing the wrong thing. The patch always
create a new mbuf/cluster now. This resulted in the change of the
prototype for the domain externalise function.

4) You can now send SCM_TIMESTAMP messages.

5) Always use CMSG_DATA(cm) to determine the start where the data
in unp_{ex,in}ternalize. It was using ((struct cmsghdr *)cm + 1)
in some places, which gives the wrong alignment on the alpha.
(NetBSD made this fix some time ago).

This results in an ABI change for discriptor passing and creds
passing on the alpha. (Probably on the IA64 and Spare ports too).

6) Fix userland programs to use CMSG_* macros too.

7) Be more careful about freeing mbufs containing (file *)s.
This is made possible by the prototype change of externalise.

PR:		29499
MFC after:	6 weeks
2001-10-04 13:11:48 +00:00
David Malone 59bdd40568 Allow sbcreatecontrol to make cluster sized control messages. 2001-10-04 12:59:53 +00:00
John Baldwin 0479e3d339 Move the ap boot spin lock earlier in the lock order before the sio(4)
lock since we occasionally call printf() while holding the ap boot lock
which can call down into the sio(4) driver if using a serial console.
2001-10-01 22:50:30 +00:00
Robert Watson c6ab2f6b4e o Complete the migration from suser error checking in the following form
in vfs_syscalls.c:

    if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid &&
        (error = suser_td(td)) != 0) {
            unwrap_lots_of_stuff();
            return (error);
    }

  to:

    if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid) {
            error = suser_td(td);
            if (error) {
                unwrap_lots_of_stuff();
                return (error);
            }
    }

  This makes the code more readable when complex clauses are in use,
  and minimizes conflicts for large outstanding patchsets modifying the
  kernel authorization code (of which I have several), especially where
  existing authorization and context code are combined in the same if()
  conditional.

Obtained from:	TrustedBSD Project
2001-10-01 20:01:07 +00:00
Matthew Dillon b5810bab2d After extensive testing it has been determined that adding complexity
to avoid removing higher level directory vnodes from the namecache has
no perceivable effect and will be removed.  This is especially true
when vmiodirenable is turned on, which it is by default now.  ( vmiodirenable
makes a huge difference in directory caching ).  The vfs.vmiodirenable and
vfs.nameileafonly sysctls have been left in to allow further testing, but
I expect to rip out vfs.nameileafonly soon too.

I have also determined through testing that the real problem with numvnodes
getting too large is due to the VM Page cache preventing the vnode from
being reclaimed.  The directory stuff made only a tiny dent relative
to Poul's original code, enough so that some tests succeeded.  But tests
with several million small files show that the bigger problem is the VM Page
cache.  This will have to be addressed by a future commit.

MFC after:	3 days
2001-10-01 04:33:35 +00:00
Jonathan Lemon 1a6fc8ef63 When FREE()ing kqueue related structures, charge them to the correct bucket.
Submitted by: iedowse
Forgotten by: jlemon
2001-09-30 17:00:56 +00:00
Bosko Milekic 70a61707f6 Re-enable mbtypes statistics in the mbuf allocator. I disabled these
when I changed the allocator bits. This implements per-CPU mbtypes
stats by keeping net number of decrements/increments of a given mbtype
per-CPU and then summing all of the per-CPU mbtypes to produce the total
net number of allocated mbufs of the given mbtype.
Counters are carefully balanced to avoid/prevent underflows/overflows.

mbtypes stats are re-enabled with the idea that we may occasionally
(although very rarely) observe slight inconsistencies in the stat
reporting. Most of the time, we should be fine, though.

Also make appropriate modifications to netstat(1) and systat(1) to do
the necessary reporting.

Submitted by: Jiangyi Liu <jyliu@163.net>
2001-09-30 01:58:39 +00:00
Jonathan Lemon 0217f5c71e Have EVFILT_TIMERS allocate their callouts via malloc() instead of using
the static callout list allocated by the system.

Change malloc type from M_TEMP to M_KQUEUE to better track memory.

Add a kern.kq_calloutmax to globally limit the amount of kernel memory
that can be allocated by callouts.

Submitted by: iedowse  (items 1, 2)
2001-09-29 17:48:39 +00:00
Dag-Erling Smørgrav 5b6db47748 Add a couple of API functions I need for my pseudofs WIP. Documentation
will follow when I've decided whether to keep this API or ditch it in
favor of something slightly more subtle.
2001-09-29 00:32:46 +00:00
Marcel Moolenaar 4166877345 Make the NODEF type usable. A syscall of type NODEF will only
have its entry in the syscall table added. Nothing else is
done. This differs from type NOPROTO in that NOPROTO adds a
definition to syscall.h besides adding a sysent. A syscall can
now have multiple entries without conflict. Note that the
argssize is fixed and depends on the syscall name.
2001-09-28 01:21:57 +00:00
Robert Watson 87fce2bb96 o When performing a securelevel check as part of securelevel_ge() or
securelevel_gt(), determine first if a local securelevel exists --
  if so, perform the check based on imax(local, global).  Otherwise,
  simply use the global value.
o Note: even though local securelevels might lag below the global one,
  if the global value is updated to higher than local values, maximum
  will still be used, making the global dominant even if there is local
  lag.

Obtained from:	TrustedBSD Project
2001-09-26 20:41:48 +00:00
Robert Watson 8a528812a0 o Modify kern.securelevel MIB entry to return a local securelevel, if
one is present in the current jail, otherwise, to return the global
  securelevel.
o If the securelevel is being updated, require that it be greater than
  the maximum of local and global, if a local securelevel exists,
  otherwise, just maximum of the global.  If there is a local
  securelevel, update the local one instead of the global one.
o Note: this does allow local securelevels to lag behind the global one
  as long as the local one is not updated following a global increase.

Obtained from:	TrustedBSD Project
2001-09-26 20:39:48 +00:00
Robert Watson 567931c8f6 o Initialize per-jail securelevel from global securelevel as part of
jail creation.

Obtained from:	TrustedBSD Project
2001-09-26 20:37:15 +00:00
Robert Watson d501d04b9e o Modify static settime() to accept the proc * for the process requesting
a time change, and callers so that they provide td->td_proc.
o Modify settime() to use securevel_gt() for securelevel checking.

Obtained from:	TrustedBSD Project
2001-09-26 19:53:57 +00:00
Robert Watson c2f413af19 o Modify sysctl access control check to use securelevel_gt(), and
clarify sysctl access control logic.

Obtained from:	TrustedBSD Project
2001-09-26 19:51:25 +00:00
Matthew Dillon 46cad5761c Enable vmiodirenable by default. Remove incorrect comment from sysctl.conf.
MFC after:	1 week
2001-09-26 19:35:04 +00:00
Matthew Dillon 3418ebebfe Make uio_yield() a global. Call uio_yield() between chunks
in vn_rdwr_inchunks(), allowing other processes to gain an exclusive
lock on the vnode.  Specifically: directory scanning, to avoid a race to the
root directory, and multiple child processes coring simultaniously so they
can figure out that some other core'ing child has an exclusive adv lock and
just exit instead.

This completely fixes performance problems when large programs core.  You
can have hundreds of copies (forked children) of the same binary core all
at once and not notice.

MFC after:	3 days
2001-09-26 06:54:32 +00:00
Paul Saab 88b1d98f31 Lock the vnode while truncating the corefile. This fixes a panic
with softupdates dangling deps.

Submitted by:	peter
MFC:		ASAP :)
2001-09-26 01:24:07 +00:00
John Baldwin 21377ce065 Remove superflous parens after de-macroizing. 2001-09-26 00:05:18 +00:00
Robert Watson 75bc5b3f22 o So, when <dd> e-mailed me and said that the comment was inverted
for securelevel_ge() and securelevel_gt(), I was a little surprised,
  but fixed it.  Turns out that it was the code that was inverted, during
  a whitespace cleanup in my commit tree.  This commit inverts the
  checks, and restores the comment.
2001-09-25 21:08:33 +00:00
John Baldwin dde96c9933 Since we no longer inline any debugging code in the mutex operations, move
all the debugging code into the function versions of the mutex operations
in kern_mutex.c.  This reduced the __mtx_* macros to simply wrappers of
the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in
favor of just calling the _{get,rel}_lock_* macros.  The tangled hairy mass
of macros calling macros is at least a bit more sane now.
2001-09-22 21:19:55 +00:00
Robert Watson b4799065ef o vpaccess() -> vn_access() -- Peter reminds me that there is already
a convention for vnop helper routines of this sort.

Submitted by:	Mr Wemm <peter>
2001-09-22 03:07:41 +00:00
John Baldwin ed01445d8f Use the passed in thread to selrecord() instead of curthread. 2001-09-21 22:46:54 +00:00
John Baldwin 456ca585db Use the passed in thread pointer instead of curthread in calls to
selrecord() in ptcpoll().  The pre-KSE code used the passed in proc pointer
rather than curproc, and an earlier seltrue() call uses the passed in
thread and not curthread.
2001-09-21 22:22:25 +00:00
John Baldwin fea2ab833e The P_SELECT flag was moved from p->p_flag to td->td_flags, but p_flag
was locked by the proc lock and td_flags is locked by the sched_lock.
The places that read, set, and cleared TDF_SELECT weren't updated, so they
read and modified td_flags w/o holding the sched_lock, meaning that they
could corrupt the per-thread flags field.  As an immediate band-aid,
grab sched_lock while reading and manipulating td_flags in relation to
TDF_SELECT.  This will probably be cleaned up some later on.
2001-09-21 22:06:22 +00:00
John Baldwin e649bcb506 Remove unneeded proc variables and fix comments. 2001-09-21 21:54:45 +00:00
Robert Watson a90a3f2882 o Part two of eaccess(2) commit, rebuilt system call code.
Obtained from:	TrustedBSD Project
2001-09-21 21:34:06 +00:00
Robert Watson 9c94f7731e o Introduce eaccess(2), a version of access(2) that uses the effective
credentials rather than the real credentials.  This is useful for
  implementing GUI's which need to modify icons based on access rights,
  but where use of open(2) is too expensive, use of stat(2) doesn't
  reflect the file system's real protection model, and use of
  access() suffers from real/effective credential confusion.  This
  implementation provides the same semantics as the call of the same
  name on SCO OpenServer.  Note: using this call improperly can
  leave you subject to some of the same races present in the
  access(2) call.
o To implement this, break out the basic logic of access(2) into
  vpaccess(), which accepts a passed credential to perform the
  invocation of VOP_ACCESS().  Add eaccess(2) to invoke vpaccess(),
  and modify access(2) to use vpaccess().

Obtained from:	TrustedBSD Project
2001-09-21 21:33:22 +00:00
John Baldwin 278da5113f Remove a bogus comment. "atomic" doesn't mean that the operation is done
as a physical atomic operation.  That would require the code to use the
atomic API, which it does not.  Instead, the operation is made psuedo
atomic (hence the quotes) by use of the lock to protect clearing all of the
flags in question.
2001-09-21 19:26:57 +00:00
John Baldwin 21832b1ec0 GC some #if 0'd code. 2001-09-21 19:21:18 +00:00
John Baldwin 3226cbf43b Whitespace and spelling fixes. 2001-09-21 19:16:12 +00:00