system/freebsd-src

mirror of https://github.com/freebsd/freebsd-src synced 2024-10-20 15:24:25 +00:00

Author	SHA1	Message	Date
John Baldwin	2178ff8b9f	Sort includes from previous commit.	2001-05-21 23:19:50 +00:00
Alfred Perlstein	2395531439	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
Boris Popov	10fa1684ed	Currently there is no way to tell if write operation invoked via vn_start_write() on the given vnode will be successful. VOP_LEASE() may help to solve this problem, but its return value ignored nearly everywhere. For now just assume that the missing upper layer on write means insufficient access rights (which is correct for most cases).	2001-05-18 07:43:13 +00:00
Boris Popov	f3d1ec67b2	VOP getwritemount() can be invoked on vnodes with VFREE flag set (used in snapshots code). At this point upper vp may not exist.	2001-05-17 04:58:25 +00:00
Boris Popov	3413421bda	Use vop_*vobject() VOPs to get reference to VM object from upper or lower fs.	2001-05-17 04:52:57 +00:00
Boris Popov	9dbd7336ee	Do not leave an extra reference on vnode. PR: kern/27250 Submitted by: "Vladimir B. Grebenschikov" <vova@express.ru> MFC after: 2 weeks	2001-05-17 04:40:01 +00:00
Ian Dowse	0864ef1e8a	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00
John Baldwin	b012b205a7	GC prototype for procfs_bmap() missed during a previous commit.	2001-05-11 23:37:37 +00:00
Poul-Henning Kamp	a62615e59b	Implement vop_std{get\|put}pages() and add them to the default vop[]. Un-copy&paste all the VOP_{GET\|PUT}PAGES() functions which do nothing but the default.	2001-05-01 08:34:45 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Bruce Evans	438abdb9c6	Backed out previous commit. It cause massive filesystem corruption, not to mention a compile-time warning about the critical function becoming unused, by replacing spec_bmap() with vop_stdbmap(). ntfs seems to have the same bug. The factor for converting specfs block numbers to physical block numbers is 1, but vop_stdbmap() uses the bogus factor btodb(ap->a_vp->v_mount->mnt_stat.f_iosize), which is 16 for ffs with the default block size of 8K. This factor is bogus even for vop_stdbmap() -- the correct factor is related to the filesystem blocksize which is not necessarily the same to the optimal i/o size. vop_stdbmap() was apparently cloned from nfs where these sizes happen to be the same. There may also be a problem with a_vp->v_mount being null. spec_bmap() still checks for this, but I think the checks in specfs are dead code which used to support block devices.	2001-04-30 14:35:35 +00:00
Poul-Henning Kamp	b7ebffbc08	Add a vop_stdbmap(), and make it part of the default vop vector. Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.	2001-04-29 11:48:41 +00:00
Greg Lehey	60fb0ce365	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
John Baldwin	33a9ed9d0e	Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning. Reviewed by: -smp, dfr, jake	2001-04-24 00:51:53 +00:00
Matt Jacob	2b4169610b	fix it so it compiles again	2001-04-23 18:51:54 +00:00
Greg Lehey	d98dc34f52	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
John Baldwin	0316f71d56	- Various style fixes. - Fix a silly bug so that we return the actual error code if a procfs attach fails rather than always returning 0. Reported by: bde	2001-03-29 18:10:46 +00:00
John Baldwin	1005a129e5	Convert the allproc and proctree locks from lockmgr locks to sx locks.	2001-03-28 11:52:56 +00:00
John Baldwin	f34fa851e0	Catch up to header include changes: - <sys/mutex.h> now requires <sys/systm.h> - <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>	2001-03-28 09:17:56 +00:00
Robert Watson	70f3685105	o Change the API and ABI of the Extended Attribute kernel interfaces to introduce a new argument, "namespace", rather than relying on a first- character namespace indicator. This is in line with more recent thinking on EA interfaces on various mailing lists, including the posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces are defined by default, EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER, where the primary distinction lies in the access control model: user EAs are accessible based on the normal MAC and DAC file/directory protections, and system attributes are limited to kernel-originated or appropriately privileged userland requests. o These API changes occur at several levels: the namespace argument is introduced in the extattr_{get,set}_file() system call interfaces, at the vnode operation level in the vop_{get,set}extattr() interfaces, and in the UFS extended attribute implementation. Changes are also introduced in the VFS extattrctl() interface (system call, VFS, and UFS implementation), where the arguments are modified to include a namespace field, as well as modified to advoid direct access to userspace variables from below the VFS layer (in the style of recent changes to mount by adrian@FreeBSD.org). This required some cleanup and bug fixing regarding VFS locks and the VFS interface, as a vnode pointer may now be optionally submitted to the VFS_EXTATTRCTL() call. Updated documentation for the VFS interface will be committed shortly. o In the near future, the auto-starting feature will be updated to search two sub-directories to the ".attribute" directory in appropriate file systems: "user" and "system" to locate attributes intended for those namespaces, as the single filename is no longer sufficient to indicate what namespace the attribute is intended for. Until this is committed, all attributes auto-started by UFS will be placed in the EXTATTR_NAMESPACE_SYSTEM namespace. o The default POSIX.1e attribute names for ACLs and Capabilities have been updated to no longer include the '$' in their filename. As such, if you're using these features, you'll need to rename the attribute backing files to the same names without '$' symbols in front. o Note that these changes will require changes in userland, which will be committed shortly. These include modifications to the extended attribute utilities, as well as to libutil for new namespace string conversion routines. Once the matching userland changes are committed, a buildworld is recommended to update all the necessary include files and verify that the kernel and userland environments are in sync. Note: If you do not use extended attributes (most people won't), upgrading is not imperative although since the system call API has changed, the new userland extended attribute code will no longer compile with old include files. o Couple of minor cleanups while I'm there: make more code compilation conditional on FFS_EXTATTR, which should recover a bit of space on kernels running without EA's, as well as update copyright dates. Obtained from: TrustedBSD Project	2001-03-15 02:54:29 +00:00
Kirk McKusick	589c7af992	Fixes to track snapshot copy-on-write checking in the specinfo structure rather than assuming that the device vnode would reside in the FFS filesystem (which is obviously a broken assumption with the device filesystem).	2001-03-07 07:09:55 +00:00
John Baldwin	931cccf603	Proc locking identical to that of linprocfs' vnops except that we hold the proc lock while calling psignal.	2001-03-07 03:15:05 +00:00
John Baldwin	30ac5d0f9e	Protect read to p_pptr with proc lock rather than proctree lock.	2001-03-07 03:10:20 +00:00
John Baldwin	c65c565b44	Proc locking. Lock around psignal() and also ensure both an exclusive proctree lock and the process lock are held when updating p_pptr and p_oppid. When we are just reaading p_pptr we only need the proc lock and not a proctree lock as well.	2001-03-07 03:09:40 +00:00
John Baldwin	0087374731	Protect p_flag with the proc lock.	2001-03-07 02:07:56 +00:00
Doug Rabson	a76decc6f7	Remove the copyinstr call which was trying to copy the pathname in from user space. It has already been copied in and mp->mnt_stat.f_mntonname has already been initialised by the caller. This fixes a panic on the alpha caused by the fact that the variable 'size' wasn't initialised because the call to copyinstr() bailed out with an EFAULT error.	2001-03-03 15:15:33 +00:00
Adrian Chadd	f3a90da995	Reviewed by: jlemon An initial tidyup of the mount() syscall and VFS mount code. This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work. * the guts of the mount work has been moved into vfs_mount(). * move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace. * Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long. * rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.) * remove the copyin() stuff for `path'. `data' still requires copyin() since its a pointer into userland. * set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to. * NOTE: f_mntonname is intiailised with "/" in the case of a root mount.	2001-03-01 21:00:17 +00:00
Robert Watson	91421ba234	o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison). o Allow jail inheritence to be a function of credential inheritence. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use. Notes: o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure. Reviewed by: freebsd-arch Obtained from: TrustedBSD Project	2001-02-21 06:39:57 +00:00
Jonathan Lemon	608a3ce62a	Extend kqueue down to the device layer. Backwards compatible approach suggested by: peter	2001-02-15 16:34:11 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
Poul-Henning Kamp	37d4006626	Another round of the <sys/queue.h> FOREACH transmogriffer. Created with: sed(1) Reviewed by: md5(1)	2001-02-04 16:08:18 +00:00
Poul-Henning Kamp	fc2ffbe604	Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details. Created with: sed(1) Reviewed by: md5(1)	2001-02-04 13:13:25 +00:00
Poul-Henning Kamp	ef9e85abba	Use <sys/queue.h> macro API.	2001-02-04 12:37:48 +00:00
Poul-Henning Kamp	4997ad7c1f	Add a BUF_KERNPROC() in the BIO_DELETE path. This seems to fix the problem which md(4) backed filesystems exposed.	2001-01-30 10:06:08 +00:00
Matthew Dillon	2a9737202a	This patch reestablishes the spec_fsync() guarentee that synchronous fsyncs, which typically occur during unmounting, will drain all dirty buffers even if it takes multiple passes to do so. The guarentee was mangled by the last patch which solved a problem due to -current disabling interrupts while holding giant (which caused an infinite spin loop waiting for I/O to complete). -stable does not have either patch, but has a similar bug in the original spec_fsync() code which is triggered by a bug in the softupdates umount code, a fix for which will be committed to -current as soon as Kirk stamps it. Then both solutions will be MFC'd to -stable. -stable currently suffers from a combination of the softupdates bug and a small window of opportunity in the original spec_fsync() code, and -stable also suffers from the spin-loop bug but since interrupts are enabled the spin resolves itself in a few milliseconds.	2001-01-29 08:19:28 +00:00
John Baldwin	b939335607	- Catch up to proc flag changes.	2001-01-24 11:20:05 +00:00
Peter Wemm	10cf882b4f	Fix breakage unconvered by LINT - dont refer to undefined variables in KASSERT()	2001-01-17 01:10:23 +00:00
Garrett Wollman	b7ef0b1281	Don't compile a dead variable declaration.	2001-01-09 04:24:43 +00:00
Poul-Henning Kamp	49851cc706	Use macro API to <sys/queue.h>	2000-12-31 10:24:19 +00:00
Matthew Dillon	08c0a67b2e	Fix a lockup problem that occurs with 'cvs update'. specfs's fsync can get into the same sort of infinite loop that ffs's fsync used to get into, probably due to background bitmap writes. The solution is the same.	2000-12-30 23:32:24 +00:00
Dag-Erling Smørgrav	dd488b6dd8	Retire kernfs (kernel part).	2000-12-28 12:17:35 +00:00
Matthew Dillon	2b6b0df712	This implements a better launder limiting solution. There was a solution in 4.2-REL which I ripped out in -stable and -current when implementing the low-memory handling solution. However, maxlaunder turns out to be the saving grace in certain very heavily loaded systems (e.g. newsreader box). The new algorithm limits the number of pages laundered in the first pageout daemon pass. If that is not sufficient then suceessive will be run without any limit. Write I/O is now pipelined using two sysctls, vfs.lorunningspace and vfs.hirunningspace. This prevents excessive buffered writes in the disk queues which cause long (multi-second) delays for reads. It leads to more stable (less jerky) and generally faster I/O streaming to disk by allowing required read ops (e.g. for indirect blocks and such) to occur without interrupting the write stream, amoung other things. NOTE: eventually, filesystem write I/O pipelining needs to be done on a per-device basis. At the moment it is globalized.	2000-12-26 19:41:38 +00:00
Jake Burkholder	98f03f9030	Protect proc.p_pptr and proc.p_children/p_sibling with the proctree_lock. linprocfs not locked pending response from informal maintainer. Reviewed by: jhb, -smp@	2000-12-23 19:43:10 +00:00
Robert Watson	f6a99e61c5	o Tighten restrictions on use of /proc/pid/ctl and move access checks in ctl to using centralized p_can() inter-process access control interface. Reviewed by: sef	2000-12-13 04:28:24 +00:00
Jake Burkholder	c0c2557090	- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.	2000-12-13 00:17:05 +00:00
Dag-Erling Smørgrav	668891c57b	Add a module version (so that linprocfs can properly depend on procfs)	2000-12-09 13:17:51 +00:00
David Malone	7cc0979fd6	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
John Baldwin	c3f52eedeb	Protect p_stat with the sched_lock. Reviewed by: jake	2000-12-02 01:58:15 +00:00
Jonathan Lemon	747fa57549	Update to reflect the disappearance of getsock(). Found by: LINT	2000-11-25 07:16:06 +00:00
Eivind Eklund	b8c8516a7f	More paranoia against overflows	2000-11-08 21:53:05 +00:00

1 2 3 4 5 ...

629 commits