system/freebsd-src

mirror of https://github.com/freebsd/freebsd-src synced 2024-07-22 02:37:15 +00:00

Author	SHA1	Message	Date
Doug Rabson	b49a2b39fd	Remove the old kernel RPC implementation and the NFS_LEGACYRPC option. Approved by: re	2009-06-30 19:03:27 +00:00
Robert Watson	14961ba789	Replace AUDIT_ARG() with variable argument macros with a set of more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week	2009-06-27 13:58:44 +00:00
Marcel Moolenaar	dbb95048da	Add cpu_flush_dcache() for use after non-DMA based I/O so that a possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet. Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which D-cache are already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chips DMA engines, by-passing the D-cache altogether.	2009-05-18 18:37:18 +00:00
Rick Macklem	2a536430e9	Adding sys/nfs/nfssvc.h and sys/nfs/nfs_nfssvc.c in preparation for sharing of the nfssvc() system call between nfsserver and the nfsv4 server. Building of nfs_nfssvc.c will be committed later, at the time the .c files in sys/nfsserver are updated. To do so now would result in nfssvc() multiply defined. Submitted by: rmacklem Reviewed by: dfr Approved by: kib (mentor)	2009-04-07 19:06:51 +00:00
Ruslan Ermilov	ea26d58729	Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT. Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.	2008-03-25 09:39:02 +00:00
Jim Rees	c1ecb4d7c3	NFSv4 client: Add support for va_birthtime Fix va_ctime to use TIME_METADATA, not TIME_CREATE	2006-11-28 19:33:28 +00:00
Paul Saab	0e38f5365b	Fixes for NFS crashes on architectures that require strict alignment. - Fix nfsm_disct() so that after pulling up data, the remaining data is aligned if necessary. - Fix nfs_clnt_tcp_soupcall() to bcopy() the rpc length out of the mbuf (instead of casting m_data to a uint32). Submitted by: Pyun YongHyeon Reviewed by: Mohan Srinivasan	2005-07-14 20:08:27 +00:00
Warner Losh	c398230b64	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
Paul Saab	b8d0fc9581	Add non-blocking versions of nfsm_dissect() and friends, for use from socket callbacks or similar callers, from both the NFS client and the server. Instituted nfsm_dissect_nonblock(), nfsm_dissect_xx_nonblock(). And nfsm_disct() now takes an extra M_TRYWAIT/M_DONTWAIT argument. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 17:33:52 +00:00
Warner Losh	2fcbca0d85	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 05:00:01 +00:00
Alfred Perlstein	1bf8720450	University of Michigan's Citi NFSv4 kernel client code. Submitted by: Jim Rees <rees@umich.edu>	2003-11-14 20:54:10 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Mike Barcroft	2b7f24d210	Change iov_base's type from `char *' to the standard `void *'. All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.	2002-10-11 14:58:34 +00:00
Peter Wemm	4f0da50ce2	nfsnode.h was moved to ../nfsclient ages ago. I forgot to remove it here.	2002-09-06 05:47:33 +00:00
Jeff Roberson	070adb7022	Remove unused include.	2002-03-20 10:12:07 +00:00
Alfred Perlstein	8febc6ba17	Remove __P.	2002-03-20 02:15:46 +00:00
Ian Dowse	6a4b84ea0d	When the old nfsm_adv() macro was moved to nfsm_adv_xx(), a '>=' must have been inadvertently changed to '>'. This broke nfsm_adv() in the case where the advancement count is equal to the amount of data remaining in the current mbuf. Instead of moving the current position N bytes forward, nfs_adv() could end up moving it back to N bytes from the start of the mbuf data. This should fix the client-side readdirplus problems that have been reported since September.	2001-12-31 06:56:31 +00:00
Ian Dowse	9669bb479a	Avoid passing the variable `tl' to functions that just use it for temporary storage. In the old NFS code it wasn't at all clear if the value of `tl' was used across or after macro calls, but I'm fairly confident that the convention was to keep its use local. Each ex-macro function now uses a local version of this variable, so all of the double-indirection goes away. The only exception to the `local use' rule for `tl' is nfsm_clget(), which is left unchanged by this commit. Reviewed by: peter	2001-12-18 01:22:09 +00:00
Peter Wemm	b9b0e19206	Unwind some more macros. NFSMADV() was kinda silly since it was right next to equivalent m_len adjustments. Move the nfsm_subs.h macros into groups depending on which phase they are used in, since that affects the error recovery requirements. Collect some of the common error checking into a single macro as preparation for unwinding some more. Have nfs_rephead return a value instead of secretly modifying args. Remove some unused function arguments that were being passed around. Clarify nfsm_reply()'s error handling (I hope).	2001-09-28 04:37:08 +00:00
Peter Wemm	1290984b33	Make nfsm_dissect() have an obvious return value.	2001-09-27 22:40:38 +00:00
Peter Wemm	ea7fe289fe	Tidy up nfsm_build usage. This is only partially finished.	2001-09-27 02:33:36 +00:00
Peter Wemm	c7f9a7e8ae	Oops, forgot to rm this last time.	2001-09-26 23:57:25 +00:00
Peter Wemm	eb25edbda3	Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.	2001-09-18 23:32:09 +00:00
Warner Losh	976a26437e	nfs_strategy calls nfs_asyncio with td as NULL. So add a bandaid that will pass NULL as the struct proc when td is NULL. This has stopped crashing on my machine. Note: The passing of NULL may be bogus, but I'll let others fix that problem. Reviewed by: jhb	2001-09-18 18:37:52 +00:00
Peter Wemm	38f48395d6	Sync some differences that were different between the copies of the files that were in nfs/nfs.h and nfsserver/nfs.h in the p4 tree.	2001-09-15 04:41:56 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Kris Kennaway	bf61e26696	Fix some signed/unsigned integer confusion, and add bounds checking of arguments to some functions. Obtained from: NetBSD Reviewed by: peter MFC after: 2 weeks	2001-09-10 11:28:07 +00:00
Matthew Dillon	4e174404a3	Pushdown Giant for nfs syscalls (nfssvc())	2001-08-31 22:39:36 +00:00
Andrey A. Chernov	f6bf1abc1b	Stupid error from my side in prev. commit: || -> &&	2001-08-23 18:02:29 +00:00
Andrey A. Chernov	e02faad5ca	Implement l_len<0 per POSIX check. Check for valid l_whence too.	2001-08-23 16:13:59 +00:00
Andrey A. Chernov	6c3f4fef64	Even better move: suppose that server is able to handle SEEK_END, so check arguments for all but not SEEK_END case, leaving SEEK_END handling for server	2001-08-23 14:21:26 +00:00
Andrey A. Chernov	e018907ed4	Apparently SEEK_END locking not supported by NFS. Previous variant returns EINVAL in that case, change it to EOPNOTSUPP.	2001-08-23 14:09:16 +00:00
Andrey A. Chernov	fb2f187058	Move <machine/> after <sys/> Pointed by: bde	2001-08-23 13:27:58 +00:00
Andrey A. Chernov	e9d095afdc	adv. lock: detect off_t overflow _before_ it occurs and return EOVERFLOW instead of EINVAL	2001-08-23 08:20:21 +00:00
Ian Dowse	02b31a0ee9	Fix a client-side memory leak in nfs_flush(). The code allocates a temporary array to store struct buf pointers if the list doesn't fit in a local array. Usually it frees the array when finished, but if it jumps to the 'again' label and the new list does fit in the local array then it can forget to free a previously malloc'd M_TEMP memory. Move the free() up a line so that it frees any previously allocated memory whether or not it needs to malloc a new array. Reviewed by: dillon	2001-08-01 10:25:13 +00:00
Peter Wemm	7b141d5db3	Check the filehandle size when mounting. Obtained from: Constantine Sapuntzakis <csapuntz@openbsd.org>	2001-07-30 20:01:59 +00:00
John Baldwin	617e358cdf	- Sort includes. - Update vmmeter statistics for vnode pagein/pageouts in getpages/putpages.	2001-07-04 20:14:59 +00:00
Matthew Dillon	0cddd8f023	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
John Baldwin	bc2327c310	- Protect the mnt_vnode list with the mntvnode lock. - Use queue(9) macros.	2001-06-28 04:10:07 +00:00
Jake Burkholder	d389ead74f	Unlock the process returned from pfind() if it does not return NULL. This fixes a witness lock violation for nfssvc returning with locks held. Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr> PR: kern/27776	2001-06-01 01:30:51 +00:00
Robert Watson	b1fc0ec1a7	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. 
o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
John Baldwin	ce70e0a964	Assert Giant is held by the caller rather than getting it and releasing it in getpages/putpages.	2001-05-23 22:26:05 +00:00
Ruslan Ermilov	99d300a1ec	- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs. - Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs. - Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS. - Install header files for the above file systems. - Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.	2001-05-23 09:42:29 +00:00
Alfred Perlstein	2395531439	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
Ian Dowse	0864ef1e8a	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Poul-Henning Kamp	b7ebffbc08	Add a vop_stdbmap(), and make it part of the default vop vector. Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.	2001-04-29 11:48:41 +00:00
Alfred Perlstein	f411fba5d3	Remove incorrect comment. Submitted by: quinot@inf.enst.fr <quinot@inf.enst.fr> PR: kern/26893	2001-04-29 03:10:24 +00:00
Greg Lehey	60fb0ce365	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
Greg Lehey	d98dc34f52	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
Alfred Perlstein	d8d5fa8805	vnode_pager_freepage() is really vm_page_free() in disguise, nuke vnode_pager_freepage() and replace all calls to it with vm_page_free()	2001-04-19 06:18:23 +00:00
Alfred Perlstein	603c86672c	Implement client side NFS locks. Obtained from: BSD/os Import Ok'd by: mckusick, jkh, motd on builder.freebsd.org	2001-04-17 20:45:23 +00:00
Poul-Henning Kamp	f84e29a06c	This patch removes the VOP_BWRITE() vector. VOP_BWRITE() was a hack which made it possible for NFS client side to use struct buf with non-bio backing. This patch takes a more general approach and adds a bp->b_op vector where more methods can be added. The success of this patch depends on bp->b_op being initialized all relevant places for some value of "relevant" which is not easy to determine. For now the buffers have grown a b_magic element which will make such issues a tiny bit easier to debug.	2001-04-17 08:56:39 +00:00
Peter Wemm	9d10eb0c0c	Create debug.hashstat.[raw]nchash and debug.hashstat.[raw]nfsnode to enable easy access to the hash chain stats. The raw prefixed versions dump an integer array to userland with the chain lengths. This cheats and calls it an array of 'struct int' rather than 'int' or sysctl -a faithfully dumps out the 128K array on an average machine. The non-raw versions return 4 integers: count, number of chains used, maximum chain length, and percentage utilization (fixed point, multiplied by 100). The raw forms are more useful for analyzing the hash distribution, while the other form can be read easily by humans and stats loggers.	2001-04-11 00:39:20 +00:00
Robert Watson	2955f0b360	o Rather than arbitrarily construct a credential in the nfs_statfs() VFS operation, make use of the calling process's credential. This solution may not be ideal (there are a number of other possible proposals, including making use of the proc0 credential, adding a credential argument to the VFSOP, and switching from a hard-coded ucred to a hard-coded nfscred), it is simple and appears to work. The arguments against using simply crget() are fairly strong: it is the only place in the code (other than a nearly identical invocation in ncp) where crget() is invoked, other than in the process credential creation code; as ucred becomes extensible, this use of crget() without appropriate context results in less and less meaningful credential data. The implementation here will probably be tweaked as a result of experimentation and further exploration of the requirements. In the mean-time, it allows progress to be made in ucred expansion for new security models without causing a crash every time df is used on an NFS mounted file system. This code has been interop tested against FreeBSD and Solaris NFS servers. While using the process credentials should not introduce interop problems, please let me know if any turn out to exist. Reviewed by: freebsd-arch	2001-04-05 06:12:38 +00:00
Peter Wemm	439fea92c2	Use the same API as the example code. Allow the initial hash value to be passed in, as the examples do. Incrementally hash in the dvp->v_id (using the official api) rather than add it. This seems to help power-of-two predictable filename trees where the filenames repeat on a power-of-two cycle and the directory trees have power-of-two components in it. The simple add then mask was causing things like 12000+ entry collision chains while most other entries have between 0 and 3 entries each. This way seems to improve things.	2001-03-20 02:10:18 +00:00
Peter Wemm	6eb39ac8fc	Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash). Make the name cache hash as well as the nfsnode hash use it. As a special tweak, create an unsigned version of register_t. This allows us to use a special tweak for the 64 bit versions that significantly speeds up the i386 version (ie: int64 XOR int64 is slower than int64 XOR int32). The code layout is a little strange for the string function, but I was able to get between 5 to 10% improvement over the original version I started with. The layout affects gcc code generation choices and this way was fastest on x86 and alpha. Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is around 45% faster with -march=pentiumpro on a p6 cpu.	2001-03-17 09:31:06 +00:00
Peter Wemm	be1d4058eb	Dramatically improve the lame nfs_hash(). This is based on the Fowler / Noll / Vo Hash (http://www.isthe.com/chongo/tech/comp/fnv/). This improves hash coverage a massive amount. We were seeing one set of machines that were using 0.84% of their 131072 entry nfsnode hash buckets with maximum chain lengths of up to ~500 entries. The machine was spending nearly 100% of its time in 'system'. A test with this has pushed the coverage from a few percent up to 91% utilization with a max chain length of 11. Submitted by: David Filo	2001-03-17 05:43:01 +00:00
John Baldwin	19eb87d22a	Grab the process lock while calling psignal and before calling psignal.	2001-03-07 03:37:06 +00:00
Adrian Chadd	f3a90da995	Reviewed by: jlemon An initial tidyup of the mount() syscall and VFS mount code. This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work. * the guts of the mount work has been moved into vfs_mount(). * move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace. * Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long. * rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.) * remove the copyin() stuff for `path'. `data' still requires copyin() since it's a pointer into userland. * set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to. * NOTE: f_mntonname is initialised with "/" in the case of a root mount.	2001-03-01 21:00:17 +00:00
Matthew Dillon	63692125a9	Fix lockup for loopback NFS mounts. The pipelined I/O limitations could be hit on the client side and prevent the server side from retiring writes. Pipeline operations turned off for all READs (no big loss since reads are usually synchronous) and for NFS writes, and left on for the default bwrite(). (MFC expected prior to 4.3 freeze) Testing by: mjacob, dillon	2001-02-28 04:13:11 +00:00
Brian Feldman	c0511d3b58	Switch to using a struct xucred instead of a struct ucred when not actually in the kernel. This structure is a different size than what is currently in -CURRENT, but should hopefully be the last time any application breakage is caused there. As soon as any major inconveniences are removed, the definition of the in-kernel struct ucred should be conditionalized upon defined(_KERNEL). This also changes struct export_args to remove dependency on the constantly-changing struct ucred, as well as limiting the bounds of the size fields to the correct size. This means: a) mountd and friends won't break all the time, b) mountd and friends won't crash the kernel all the time if they don't know what they're doing wrt actual struct export_args layout. Reviewed by: bde	2001-02-18 13:30:20 +00:00
Jeroen Ruigrok van der Werven	d7d97eb0aa	Preceed/preceeding are not English words. Use precede and preceding.	2001-02-18 10:43:53 +00:00
Ian Dowse	27d9bb4e44	Fix some problems that were introduced in revision 1.97. Instead of returning an error code to the caller, NFS server op routines must themselves build an error reply and return 0 to the caller. This is achieved by replacing the erroneous return statements with code that jumps forward to the op function's reply code. We need to be careful to ensure that the 'struct mount' pointer is NULL though, so that the final vn_finished_write() call becomes a no-op. Reviewed by: mckusick, dillon	2001-02-09 13:24:06 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. 
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
Tor Egge	7d1af7b215	Enable use of DHCP extensions. Reviewed by: Per Kristian Hove <Per.Hove@math.ntnu.no>	2001-02-02 02:35:40 +00:00
Matthew Dillon	d2d00d11be	NFS O_EXCL file create semantics temporarily uses file attributes to store the file verifier. The NFS client is supposed to do a SETATTR after a successful O_EXCL open/create to clean up the attributes. FreeBSD's client code was generating a SETATTR rpc but was not generating an access or modification time update within that rpc, leaving the file with a broken access time that solaris chokes on (and it doesn't look very nice when you ls -lua under FreeBSD either!). Fixed.	2001-01-04 22:45:19 +00:00
Bosko Milekic	2a0c503e7a	* Rename M_WAIT mbuf subsystem flag to M_TRYWAIT. This is because calls with M_WAIT (now M_TRYWAIT) may not wait forever when nothing is available for allocation, and may end up returning NULL. Hopefully we now communicate more of the right thing to developers and make it very clear that it's necessary to check whether calls with M_(TRY)WAIT also resulted in a failed allocation. M_TRYWAIT basically means "try harder, block if necessary, but don't necessarily wait forever." The time spent blocking is tunable with the kern.ipc.mbuf_wait sysctl. M_WAIT is now deprecated but still defined for the next little while. * Fix a typo in a comment in mbuf.h * Fix some code that was actually passing the mbuf subsystem's M_WAIT to malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the value of the M_WAIT flag, this could have become a big problem.	2000-12-21 21:44:31 +00:00
David Malone	7cc0979fd6	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
Poul-Henning Kamp	a52585d77e	Simplify the tprintf() API. Lose the special <sys/tprintf.h> #include file.	2000-11-26 20:35:21 +00:00
Matthew Dillon	279d722604	This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>	2000-11-18 21:01:04 +00:00
Kirk McKusick	d6514f21d7	In preparation for deprecating CIRCLEQ macros in favor of TAILQ macros which provide the same functionality and are a bit more efficient, convert use of CIRCLEQ's in NFS to TAILQ's.	2000-11-14 08:00:39 +00:00
Eivind Eklund	e3c4036b18	Give vop_mmap an untimely death. The opportunity to give it a timely death timed out in 1996.	2000-11-01 17:57:24 +00:00
Poul-Henning Kamp	53ce36d17a	Remove unneeded #include <sys/proc.h> lines.	2000-10-29 13:57:19 +00:00
Tor Egge	e4e7a9a4e9	Reduce kernel stack usage by not having large packets on the stack. Supply correct size parameter to dhcpd. Replace some magic numbers with macro names. Handle more than one interface.	2000-10-29 01:19:32 +00:00
Tor Egge	5b93d1da3f	Eliminate some bitrot (nonexisting member variable names). Don't use curproc when a proc pointer is available.	2000-10-24 23:33:01 +00:00
Tor Egge	6d7518c134	Style fixes.	2000-10-24 22:40:18 +00:00
Tor Egge	f6ee793a3c	Make RPC timeout message more readable. Supply proc pointer to sosend.	2000-10-24 22:37:55 +00:00
David Malone	dc6dd1259f	Problem to avoid processes getting stuck in "vmopar". From Ian's mail: The problem seems to originate with NFS's postop_attr information that is returned with a read or write RPC. Within a vm_fault context, the code cannot deal with vnode_pager_setsize() shrinking a vnode. The workaround in the patch below stops the nfsm_postop_attr() macro from ever shrinking a vnode. If the new size in the postop_attr information is smaller, then it just sets the nfsnode n_attrstamp to 0 to stop the wrong size getting used in the future. This change only affects postop_attr attributes; the nfsm_loadattr() macro works as normal. The change is implemented by adding a new argument to nfs_loadattrcache() called 'dontshrink'. When this is non-zero, nfs_loadattrcache() will never reduce the vnode/nfsnode size; instead it zeros n_attrstamp. There remain other ways processes can get stuck in vmopar. Submitted by: Ian Dowse <iedowse@maths.tcd.ie> Reviewed by: dillon Tested by: Vadim Belman <voland@lflat.org>	2000-10-24 10:13:36 +00:00
Boris Popov	c523a62949	Make nfs PDIRUNLOCK aware. Now it is possible to use nullfs mounts on top of nfs mounts, but there can be side effects because nfs uses shared locks for vnodes.	2000-10-15 08:06:32 +00:00
Boris Popov	823548e131	Add missed vop_stdunlock() for fifo's vnops (this affects only v2 mounts). Give nfs's node lock its own name.	2000-10-15 08:01:28 +00:00
Jason Evans	a18b1f1d4d	Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.	2000-10-04 01:29:17 +00:00
Boris Popov	67e871664b	Add a lock structure to vnode structure. Previously it was either allocated separately (nfs, cd9660 etc) or kept as a first element of structure referenced by v_data pointer (ffs). Such organization leads to known problems with stacked filesystems. From this point vop_nolock() functions maintain only interlock lock. vop_stdlock() functions maintain built-in v_lock structure using lockmgr(). vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared lock on vnode. If filesystem wishes to export lockmgr compatible lock, it can put an address of this lock to v_vnlock field. This indicates that the upper filesystem can take advantage of it and use single lock structure for entire (or part) of stack of vnodes. This field shouldn't be examined or modified by VFS code except for initialization purposes. Reviewed in general by: mckusick	2000-09-25 15:24:04 +00:00
Jason Evans	0384fff8c5	Major update to the way synchronization is done in the kernel. Highlights include: * Mutual exclusion is used instead of spl(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh	2000-09-07 01:33:02 +00:00
Mike Smith	a77773909d	Don't scan for the "right" network interface by shooting in the dark. Assume that the nfs_diskless structure is correctly set up; the provider ought to be getting it right.	2000-09-05 22:29:36 +00:00
Kirk McKusick	9b97113391	This patch corrects the first round of panics and hangs reported with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occasionally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic. Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.	2000-07-24 05:28:33 +00:00
Kirk McKusick	f2a2857bb3	Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).	2000-07-11 22:07:57 +00:00
Paul Saab	fb27899f3b	Correctly set the Maximum DHCP Message Size. bootpd now works again as well as ISC dhcpd.	2000-06-13 09:32:09 +00:00
Jake Burkholder	e39756439c	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
Jake Burkholder	740a1973a6	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
Poul-Henning Kamp	831e32f863	Include a RFC 1533 "Maximum DHCP Message Size" option in our request. ISC DHCP will limit the reply length to 64 bytes for bootp replies unless we explicitly tell it we can do more. We tell it that we can do 1200 bytes.	2000-05-07 14:29:19 +00:00
Poul-Henning Kamp	9626b608de	Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter	2000-05-05 09:59:14 +00:00
Poul-Henning Kamp	2c9b67a8df	Remove unneeded #include <vm/vm_zone.h> Generated by: src/tools/tools/kerninclude	2000-04-30 18:52:11 +00:00
Poul-Henning Kamp	87150cb06d	s/biowait/bufwait/g Prodded by: several.	2000-04-29 16:25:22 +00:00
Poul-Henning Kamp	3389ae9350	Remove ~25 unneeded #include <sys/conf.h> Remove ~60 unneeded #include <sys/malloc.h>	2000-04-19 14:58:28 +00:00
Poul-Henning Kamp	8177437d85	Complete the bio/buf divorce for all code below devfs::strategy Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case. CCD not converted yet, casts to struct buf (still safe) atapi-cd casts to struct buf to examine B_PHYS	2000-04-15 05:54:02 +00:00
Poul-Henning Kamp	c244d2de43	Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.	2000-04-02 15:24:56 +00:00
Matthew Dillon	8d1b3828fa	Add a sysctl to specify the amount of UDP receive space NFS should reserve, in maximal NFS packets. Originally only 2 packets worth of space was reserved. The default is now 4, which appears to greatly improve performance for slow to mid-speed machines on gigabit networks. Add documentation and correct some prior documentation. Problem Researched by: Andrew Gallatin <gallatin@cs.duke.edu> Approved by: jkh	2000-03-27 21:38:35 +00:00
Poul-Henning Kamp	b99c307a21	Rename the existing BUF_STRATEGY() to DEV_STRATEGY() substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.	2000-03-20 11:29:10 +00:00
Poul-Henning Kamp	21144e3bf1	Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch was machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.	2000-03-20 10:44:49 +00:00
Peter Wemm	242c5536ea	Clean up some loose ends in the network code, including the X.25 and ISO #ifdefs. Clean out unused netisr's and leftover netisr linker set gunk. Tested on x86 and alpha, including world. Approved by: jkh	2000-02-13 03:32:07 +00:00
Matthew Dillon	c9ef26814c	Fix catastrophic bug in NQNFS related to UDP mounts. The 'nqhost' struct contains a major union for which lph_slp was being initialized only for TCP connections, but accessed for all types of connections leading to a crash. Also, a conditional controlling an nfs_slplock() call contained an improper paren grouping, causing a second crash in the UDP case. The nqhost structure has been reorganized and lph_slp has been made a normal structural field rather than a union field, and properly initialized for all connection types. Approved by: jkh	2000-01-26 20:51:29 +00:00
Matthew Dillon	34ddf54812	The alpha build causes the 'nfsuid bloated' warning to occur. Well, there is nothing we can do about it. In fact, after further review there simply are not very many instances of the two structures NFS checks for 'bloat' so I've decided to simply rip the checks out entirely. Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>	2000-01-13 20:18:25 +00:00
Yoshinobu Inoue	fb59c426ff	tcp updates to support IPv6. also a small patch to sys/nfs/nfs_socket.c, as max_hdr size change. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project	2000-01-09 19:17:30 +00:00
Matthew Dillon	c37c9620cd	Enhance reassignbuf(). When a buffer cannot be time-optimally inserted into vnode dirtyblkhd we append it to the list instead of prepend it to the list in order to maintain a 'forward' locality of reference, which is arguably better than 'reverse'. The original algorithm did things this way too but at a huge time cost. Enhance the append interlock for NFS writes to handle intr/soft mounts better. Fix the hysteresis for NFS async daemon I/O requests to reduce the number of unnecessary context switches. Modify handling of NFS mount options. Any given user option that is too high now defaults to the kernel maximum for that option rather than the kernel default for that option. Reviewed by: Alfred Perlstein <bright@wintelcom.net>	2000-01-05 05:11:37 +00:00
Matthew Dillon	54986abd15	Fix at least one source of the continued 'NFS append race'. close() was calling nfs_flush() and then clearing the NMODIFIED bit. This is not legal since there might still be dirty buffers after the nfs_flush (for example, pending commits). The clearing of this bit in turn prevented a necessary vinvalbuf() from occurring, leaving leftover dirty buffers even after truncating the file in a new operation. The fix is to simply not clear NMODIFIED. Also added a sysctl vfs.nfs.nfsv3_commit_on_close which, if set to 1, will cause close() to do a stage 1 write AND a stage 2 commit synchronously. By default only the stage 1 write is done synchronously. Reviewed by: Alfred Perlstein <bright@wintelcom.net>	2000-01-05 00:32:18 +00:00
Peter Wemm	c447342094	Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSDs who made this change quite some time ago. More commits to come.	1999-12-29 05:07:58 +00:00
Alfred Perlstein	20883b0f10	make getfh a standard syscall instead of dependent on having NFSSERVER defined, useful for userland fileservers that want to use a filehandle type interface to the filesystem. Submitted by: Assar Westerlund assar@stacken.kth.se PR: kern/15452	1999-12-21 20:21:12 +00:00
Robert Watson	91f37dcba1	Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry. Reviewed by: eivind	1999-12-19 06:08:07 +00:00
Brian Feldman	d25f3712b7	M_PREPEND-related cleanups (unregisterifying struct mbuf *s).	1999-12-19 01:55:37 +00:00
Matthew Dillon	60c959f40b	Fix compilation warning on alpha when converting pointer to integer to generate hash index. Reviewed by: Andrew Gallatin <gallatin@cs.duke.edu>	1999-12-18 19:20:05 +00:00
Matthew Dillon	2cac06495e	Have NFS use a snapshot of boottime instead of boottime itself to generate the NFSv3 Version id. boottime itself may change, sometimes once every tick if you are running xntpd, which really throws off clients. Clients will tend to throw away what they believe to be stale data too often, and can get into long loops rewriting the same data over and over again because they believe the server has rebooted over and over again due to the changing version id. Approved by: jkh	1999-12-16 17:01:32 +00:00
Eivind Eklund	762e6b856c	Introduce NDFREE (and remove VOP_ABORTOP)	1999-12-15 23:02:35 +00:00
Matthew Dillon	b7303db36e	Fix two problems: First, fix the append seek position race that can occur due to np->n_size potentially changing if nfs_getcacheblk() blocks in nfs_write(). Second, under -current we must supply the proper bufsize when obtaining buffers that straddle the EOF, but due to the fact that np->n_size can change out from under us it is possible that we may specify the wrong buffer size and wind up truncating dirty data written by another process. Both problems are solved by implementing nfs_rslock(), which allows us to lock around sensitive buffer cache operations such as those that occur when appending to a file. It is believed that this race is responsible for causing dirtyoff/dirtyend and (in stable) validoff/validend to exceed the buffer size. Therefore we have now added a warning printf for the dirtyoff/end case in current. However, we have introduced a new problem which we need to fix at some point, and that is that soft or intr NFS mounts may become uninterruptable from the point of view of process A which is stuck waiting on rslock while process B is stuck doing the rpc. To unstick process A, process B would have to be interrupted first. Reviewed by: Alfred Perlstein <bright@wintelcom.net>	1999-12-14 19:07:54 +00:00
Matthew Dillon	1e64c256dc	Add a readahead heuristic to the NFS server side code. While the server cannot unilaterally pass data to a client it can reduce the physical disk transaction overhead by reading larger blocks. This results in better pipelining of requests/responses over the network and an almost 100% increase in cpu efficiency on the server. On a 100BaseTX network NFS read performance increases from 8.5 MBytes/sec to 10 MB/sec (maxed out), and cpu efficiency increases from 72% idle to 80% idle on the server. Reviewed by: Alfred Perlstein <bright@wintelcom.net>	1999-12-13 17:34:45 +00:00
Matthew Dillon	c9940d3b84	PR: kern/15222 Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-12-13 17:07:03 +00:00
Matthew Dillon	4682c8eac9	Fix a timeout deadlock that can occur when the process holding the receive lock hasn't yet managed to send its own request. PR: kern/15055 Submitted by: Ian Dowse iedowse@maths.tcd.ie	1999-12-13 04:24:55 +00:00
Matthew Dillon	5f3bfd608d	Fix a number of server-side issues related to aborting badly formed NFS packets, mainly initializing structure pointers to NULL which are conditionally freed prior to return. PR: kern/15249 Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-12-12 07:06:39 +00:00
Matthew Dillon	ea94c7b968	Synopsis of problem being fixed: Dan Nelson originally reported that blocks of zeros could wind up in a file written to over NFS by a client. The problem only occurs a few times per several gigabytes of data. This problem turned out to be bug #3 below. bug #1: B_CLUSTEROK must be cleared when an NFS buffer is reverted from stage 2 (ready for commit rpc) to stage 1 (ready for write). Reversions can occur when a dirty NFS buffer is redirtied with new data. Otherwise the VFS/BIO system may end up thinking that a stage 1 NFS buffer is clusterable. Stage 1 NFS buffers are not clusterable. bug #2: B_CLUSTEROK was inappropriately set for a 'short' NFS buffer (short buffers only occur near the EOF of the file). Change to only set when the buffer is a full biosize (usually 8K). This bug has no effect but should be fixed in -current anyway. It need not be backported. bug #3: B_NEEDCOMMIT was inappropriately set in nfs_flush() (which is typically only called by the update daemon). nfs_flush() does a multi-pass loop but due to the lack of vnode locking it is possible for new buffers to be added to the dirtyblkhd list while a flush operation is going on. This may result in nfs_flush() setting B_NEEDCOMMIT on a buffer which has NOT yet gone through its stage 1 write, causing only the commit rpc to be made and thus causing the contents of the buffer to be thrown away (never sent to the server). The patch also contains some cleanup, which only applies to the commit into -current. Reviewed by: dg, julian Originally Reported by: Dan Nelson <dnelson@emsphone.com>	1999-12-12 06:09:57 +00:00
Eivind Eklund	6bdfe06ad9	Lock reporting and assertion changes. * lockstatus() and VOP_ISLOCKED() gets a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process. * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them * Extend the vnode_if.src format to allow more exact specification than locked/unlocked. This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS. Discussed with: grog, mch, peter, phk Reviewed by: peter	1999-12-11 16:13:02 +00:00
Matthew Dillon	98733bd871	The symlink implementation could improperly return a NULL vp along with a 0 error code. The problem occurred with NFSv2 mounts and also with any NFSv3 mount returning an EEXIST error (which is translated to 0 prior to return). The reply to the rpc only contains the file handle for the no-error case under NFSv3. The error case under NFSv3 and all cases under NFSv2 do not return the file handle. The fix is to do a secondary lookup to obtain the file handle and thus be able to generate a return vnode for the situations where the rpc reply does not contain the required information. The bug was originally introduced when VOP_SYMLINK semantics were changed for -CURRENT. The NFS symlink implementation was not properly modified to go along with the change despite the fact that three people reviewed the code. It took four attempts to get the current fix correct with five people. Is NFS obfuscated? Ha! Reviewed by: Alfred Perlstein <bright@wintelcom.net> Testing and Discussion: "Viren R.Shah" <viren@rstcorp.com>, Eivind Eklund <eivind@FreeBSD.ORG>, Ian Dowse <iedowse@maths.tcd.ie>	1999-11-30 06:56:15 +00:00
Eivind Eklund	679106b15a	Remap the error EEXIST => 0 before using error to determine if we should return a vp.	1999-11-27 18:14:41 +00:00
Matthew Dillon	b314ed9662	nm_srtt and nm_sdrtt are arrays[4]. Remove explicit initialization of element [4] in both, which goes beyond the end of the array, leaving [0], [1], [2], and [3]. This bug did not cause any problems since the overrun fields are initialized after the bogus array init but needs to be fixed anyway. Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-11-22 04:50:09 +00:00
Eivind Eklund	b6335212d6	Fix VOP_MKNOD for loss of WILLRELE. I don't know how I could have missed this in the first place :-( Noticed by: bde	1999-11-20 16:09:10 +00:00
Poul-Henning Kamp	0429e37ade	struct mountlist and struct mount.mnt_list have no business being a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively. This removes ugly mp != (void*)&mountlist comparisons. Requested by: phk Submitted by: Jake Burkholder jake@checker.org PR: 14967	1999-11-20 10:00:46 +00:00
Eivind Eklund	dd8c04f4c7	Remove WILLRELE from VOP_SYMLINK Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.	1999-11-13 20:58:17 +00:00
Eivind Eklund	edfe736df9	Remove WILLRELE from VOP_RENAME	1999-11-12 03:34:28 +00:00
Matthew Dillon	a6aa6d9137	Remove special case socket sharing code in order to allow nfsd to bind IP addresses to udp/cltp sockets separately. PR: kern/13049 Reviewed by: David Malone <dwmalone@maths.tcd.ie>, freebsd-current	1999-11-11 17:24:02 +00:00
Matthew Dillon	6b21e94604	Fix nfssvc_addsock() to not attempt to free a NULL socket structure when returning an error. Bug fix was extracted from the PR. The PR is not yet entirely resolved by this commit. PR: kern/13049 Reviewed by: Matt Dillon <dillon@freebsd.org> Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-11-08 19:10:16 +00:00
Mike Smith	b7017a8210	Call bootpc_init before we try to mount an NFS root, if we're configured to use BOOTP for NFS root discovery. The entire interface setup inside nfs_mountroot is evil, and should die.	1999-11-01 23:55:38 +00:00
Poul-Henning Kamp	923502ff91	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ\|WRITE} rather than B_{READ\|WRITE} as argument.	1999-10-29 18:09:36 +00:00
Matthew Dillon	a5d3fe3f85	Move NFS access cache hits/misses into nfsstats structure so /usr/bin/nfsstat can get to it easily.	1999-10-25 19:22:33 +00:00
Poul-Henning Kamp	3b6fb88590	Before we start to mess with the VFS name-cache clean things up a little bit: Isolate the namecache in its own file, and give it a dedicated malloc type.	1999-10-03 12:18:29 +00:00
Marcel Moolenaar	16df98ecc6	Careless use of struct proc *p caused major problems. 'p' is allowed to be NULL in this function (nfs_sigintr). Reorder the statements and guard them all with a single if (p != NULL). reported, reviewed and tested by: jdp	1999-09-29 20:12:39 +00:00
Matthew Dillon	13e14363fe	Make FreeBSD less conservative in determining when to return a cookie error for a directory. I have made this change after a great deal of review although I cannot be absolutely sure that this meets the spec. The issue devolves into whether changes in an underlying (UFS) directory can cause NFS directory blocks to be renumbered. My read of the code indicates that NFS directory blocks will not be renumbered, which means that the cookies should still remain valid after a change is made to the underlying directory. This being the case, a cookie error should not be returned when a change is made to the underlying directory and, instead, the NFS client should rely on mtime detection to invalidate and reload the directory. The use of mtime is problematic in and of itself, due to insufficient resolution, which is why I believe the original conservative error handling was done. Still, there have been dozens of bug reports by people needing solaris<->FreeBSD interoperability and these have to be accommodated.	1999-09-29 17:14:58 +00:00
Marcel Moolenaar	2c42a14602	sigset_t change (part 2 of 5) ----------------------------- The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements. The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t. struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done through function sigprop(). The latter because the table doesn't hold properties for all signals, but only for the first NSIG signals. signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace pollution. Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types. NOTE: kdump (and port linux_kdump) must be recompiled. Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.	1999-09-29 15:03:48 +00:00
Matthew Dillon	8fdd2461b3	Add comment to clarify a commit rpc optimization already being performed.	1999-09-20 19:10:28 +00:00
Matthew Dillon	b5acbc8b9c	Asynchronized client-side nfs_commit. NFS commit operations were previously issued synchronously even if async daemons (nfsiod's) were available. The commit has been moved from the strategy code to the doio code in order to asynchronize it. Removed use of lastr in preparation for removal of vnode->v_lastr. It has been replaced with seqcount, which is already supported by the system and, in fact, gives us a better heuristic for sequential detection than lastr ever did. Made major performance improvements to the server side commit. The server previously fsync'd the entire file for each commit rpc. The server now bawrite()s only those buffers related to the offset/size specified in the commit rpc. Note that we do not commit the meta-data yet. This work still needs to be done. Note that a further optimization can be done (and has not yet been done) on the client: we can merge multiple potential commit rpc's into a single rpc with a greater file offset/size range and greatly reduce rpc traffic. Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>	1999-09-17 05:57:57 +00:00
Alfred Perlstein	c24fda81c9	Separate the export check in VFS_FHTOVP; exports are now checked via VFS_CHECKEXP. Add fh(open\|stat\|statfs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD	1999-09-11 00:46:08 +00:00
Alfred Perlstein	5a5fccc8e7	All unimplemented VFS ops now have entries in kern/vfs_default.c that return reasonable defaults. This avoids confusing and ugly casting to eopnotsupp or making dummy functions. Bogus casting of filesystem sysctls to eopnotsupp() have been removed. This should make *_vfsops.c more readable and reduce bloat. Reviewed by: msmith, eivind Approved by: phk Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>	1999-09-07 22:42:38 +00:00
Poul-Henning Kamp	9626728875	remove unused variables.	1999-08-28 19:21:03 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Poul-Henning Kamp	dbafb3660f	Simplify the handling of VCHR and VBLK vnodes using the new dev_t: Make the alias list a SLIST. Drop the "fast recycling" optimization of vnodes (including the returning of a preexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore. Rename checkalias2() and checkalias() to addalias() and addaliasu() - which takes dev_t and udev_t arg respectively. Make the revoke syscalls use vcount() instead of VALIASED. Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag. vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one. Print the devicename in specfs/vprint(). Remove a couple of stale LFS vnode flags. Remove unimplemented/unused LK_DRAINED;	1999-08-26 14:53:31 +00:00
Peter Wemm	ac7cc2e469	Convert all the nfs macros to do { blah } while (0) to ensure it works correctly in if/else etc. egcs had probably picked up most of the problems here before with "ambiguous braces" etc, but this should increase the robustness a bit. Based on an idea from Eivind Eklund.	1999-08-19 14:50:12 +00:00
Alan Cox	2c28a10540	Add the (inline) function vm_page_undirty for clearing the dirty bitmask of a vm_page. Use it. Submitted by: dillon	1999-08-17 04:02:34 +00:00
Dmitrij Tejblum	e868365294	nfs_getcacheblk() can return 0 if the mount is interruptible. It needs to be checked by the caller. Broken in: rev. 1.70 (1999/05/02)	1999-08-12 18:04:39 +00:00
Poul-Henning Kamp	0ef1c82630	Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>. Add a few fields to struct specinfo, paving the way for the fun part.	1999-08-08 18:43:05 +00:00
Peter Wemm	56ba093ddb	Don't over-allocate and over-copy shorter NFSv2 filehandles and then correct the pointers afterwards. It's kinda bogus that we generate a 24 (?) byte filehandle (2 x int32 fsid and 16 byte VFS fhandle) and pad it out to 64 bytes for NFSv3 with garbage. The whole point of NFSv3's variable filehandle length was to allow for shorter handles, both in memory and over the wire. I plan on taking a shot at fixing this shortly.	1999-08-04 14:41:39 +00:00
Mike Smith	98f8aa275b	As described by the submitter: I did some tcpdumping the other day and noticed that GETATTR calls were frequently followed by an ACCESS call to the same file. The attached patch changes nfs_getattr to fill the access cache as a side effect. This is accomplished by calling ACCESS rather than GETATTR. This implies a modest overhead of 4 bytes in the request and 8 bytes in the response compared to doing a vanilla GETATTR. ... [The patch comprises two parts] The first is the "real" patch, the second counts misses and hits rather than fills and hits. The difference is subtle but important because both nfs_getattr and nfs_access now fill the cache. It also changes the default value of nfsaccess_cache_timeout to better match the attribute cache. IMHO, file timestamps change much more frequently than protection bits. Submitted by: Bjoern Groenvall <bg@sics.se> Reviewed by: dillon (partially)	1999-07-31 01:51:58 +00:00

628 commits