system/freebsd-src

mirror of https://github.com/freebsd/freebsd-src synced 2024-10-19 06:44:31 +00:00

Author	SHA1	Message	Date
David Xu	d03c79eea1	Cosmetic change, make it QUEUE_MACRO_DEBUG friendly	2003-03-09 04:27:46 +00:00
Tim J. Robbins	ef3dab76bf	Hold the proc lock while accessing p_procsig in trapsignal().	2003-03-09 01:40:55 +00:00
Poul-Henning Kamp	f37de12275	Retire devstat_add_entry() as a public function and bump __FreeBSD_version to mark this act.	2003-03-08 21:46:43 +00:00
Poul-Henning Kamp	c7e73d59c4	Introduce a device driver for /dev/devstat, this will allow us to mmap the device statistics structures into userland instead of using sysctl. Introduce new devstat_new_entry() function which allocates the devstat structure and calls devstat_add_entry() on it.	2003-03-08 19:58:57 +00:00
Kenneth D. Merry	9b80d344ec	Zero copy send and receive fixes: - On receive, vm_map_lookup() needs to trigger the creation of a shadow object. To make that happen, call vm_map_lookup() with PROT_WRITE instead of PROT_READ in vm_pgmoveco(). - On send, a shadow object will be created by the vm_map_lookup() in vm_fault(), but vm_page_cowfault() will delete the original page from the backing object rather than simply letting the legacy COW mechanism take over. In other words, the new page should be added to the shadow object rather than replacing the old page in the backing object. (i.e. vm_page_cowfault() should not be called in this case.) We accomplish this by making sure fs.object == fs.first_object before calling vm_page_cowfault() in vm_fault(). Submitted by: gallatin, alc Tested by: ken	2003-03-08 06:58:18 +00:00
David Xu	b4508d7d3f	Lock sched_lock before modifying td_flags.	2003-03-08 04:09:04 +00:00
Rob Braun	d132c84f07	Fix a spelling error. Submitted by: jkh Reviewed by: zarzycki	2003-03-07 22:47:32 +00:00
John Baldwin	9722121a3c	Respect any passed in external lockmgr flags such as LK_NOWAIT in the default implementations of VOP_LOCK() and VOP_UNLOCK(). Tested by: jlemon, phk Glanced at by: jeffr	2003-03-07 20:45:07 +00:00
John Baldwin	9da590b49b	Oops, fix the double faults people were seeing with the recent changes to witness. Sleepable locks such as sx locks always come before all mutexes including Giant. However, the static lock order list placed Giant before the proctree and allproc sx locks. This resulted in witness creating a cycle in its lock order "tree" (real trees don't have cycles) leading to infinite recursion and eventually a double fault. To fix, put Giant after sx locks in the lock order list.	2003-03-06 17:25:06 +00:00
Alan Cox	7c4351aabd	Remove GIANT_REQUIRED from sf_buf_free().	2003-03-06 04:48:19 +00:00
Robert Watson	9283578946	Instrument sysarch() MD privileged I/O access interfaces with a MAC check, mac_check_sysarch_ioperm(), permitting MAC security policy modules to control access to these interfaces. Currently, they protect access to IOPL on i386, and setting HAE on Alpha. Additional checks might be required on other platforms to prevent bypass of kernel security protections by unauthorized processes. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-06 04:47:47 +00:00
Alan Cox	09c80124a3	Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress. Discussed on: arch@	2003-03-06 03:41:02 +00:00
Robert Watson	1b2c2ab29a	Provide a mac_check_system_swapoff() entry point, which permits MAC modules to authorize disabling of swap against a particular vnode. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-05 23:50:15 +00:00
Robert Watson	a184d471e2	Move the initialization of the vattr flags field in setfflags() to before the MAC check so that we pass the flags field into the MAC check properly initialized. This didn't affect any current MAC modules since they didn't care what the flags argument was (as they were primarily interested in the fact that it was a meta-data write, not the contents of the write), but would be relevant to future modules relying on that field. Submitted by: Mike Halderman <mrh@spawar.navy.mil> Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-05 23:15:23 +00:00
Peter Wemm	3c6b084e96	Finish driving a stake through the heart of netns and the associated ifdefs scattered around the place - it's dead, Jim! The SMB stuff had stolen AF_NS, make it official.	2003-03-05 19:24:24 +00:00
David Schultz	9c62b3ee7c	Make TTYHOG tunable. Reviewed by: mike (mentor)	2003-03-05 08:16:29 +00:00
Jonathan Lemon	1cafed3941	Update netisr handling; Each SWI now registers its queue, and all queue drain routines are done by swi_net, which allows for better queue control at some future point. Packets may also be directly dispatched to a netisr instead of queued, this may be of interest at some installations, but currently defaults to off. Reviewed by: hsu, silby, jayanth, sam Sponsored by: DARPA, NAI Labs	2003-03-04 23:19:55 +00:00
John Baldwin	c141c242ac	Bah, fix a bogon in the last commit: get the sense of a compare test right so that we allow a sleepable lock to be acquired with Giant held rather than allowing a sleepable lock to be acquired with anything but Giant held.	2003-03-04 22:34:07 +00:00
Jeff Roberson	24deed1aaa	- Hold the buf lock while manipulating and inspecting its fields. - Use gbincore() and not incore() so that we can drop the vnode interlock as we acquire the buflock. - Use GB_LOCK_NOWAIT when getting bufs for read ahead clusters so that we don't block on locked bufs. - Convert a while loop to a howmany() that will most likely be faster on modern processors. There is another while loop divide that was left nearby because it is operating on a 64bit int and is most likely faster. - Cleanup the cluster_read() code a little to get rid of a goto and make the logic clearer. Tested on: x86, alpha Tested by: Steve Kargl <sgk@troutmask.apl.washington.edu> Reviewed by: arch	2003-03-04 21:35:28 +00:00
John Baldwin	1106937d99	Remove safety belt: it is now ok to do a mtx_trylock() on a mutex you already own. The mtx_trylock() will fail however. Enhance the comment at the top of the try lock function to explain this. Requested by: jlemon and his evil netisr locking	2003-03-04 21:32:25 +00:00
John Baldwin	263067951a	Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls to WITNESS_WARN().	2003-03-04 21:03:05 +00:00
John Baldwin	9b4982bfed	Add a WITNESS_WARN() call to verify that we hold no locks after running a handler from an interrupt thread.	2003-03-04 21:01:42 +00:00
John Baldwin	35580ede37	A small overhaul of witness: - Add a comment about special lock order rules and Giant near the top of subr_witness.c. Specifically, this documents and explains the real lock order relationship between Giant and sleepable locks (i.e. lockmgr locks and sx locks). Basically, Giant can be safely acquired either before or after sleepable locks and the case of Giant before a sleepable lock is exempted as a special case. - Add a new static function 'witness_list_lock()' that displays a single line of information about a struct lock_instance. This is used to make the output of witness messages more consistent and reduce some code duplication. - Fixup a few comments in witness_lock(). - Properly handle the Giant-before-sleepable-lock lock order exception in a more general fashion and remove the no longer needed LI_SLEPT flag. - Break up the last condition before assuming a reversal a bit to try and make the logic less confusing in witness_lock(). - Axe WITNESS_SLEEP() now that LI_SLEPT is no longer needed and replace it with a more general WITNESS_WARN() macro/function combination. WITNESS_WARN() allows you to output a customized message out to the console along with a list of held locks. It will optionally drop into the debugger as well. You can exempt a single lock from the check by passing it in as the second argument. You can also use flags to specify if Giant should be exempt from the check, if all sleepable locks should be exempt from the check, and if witness should panic if any non-exempt locks are found. - Make the witness_list() function static. Other areas of the kernel should use the new WITNESS_WARN() instead.	2003-03-04 20:56:39 +00:00
John Baldwin	5fa8dd90f9	Miscellaneous cleanups to _mtx_lock_sleep(): - Declare some local variables at the top of the function instead of in a nested block. - Use mtx_owned() instead of masking off bits from mtx_lock manually. - Read the value of mtx_lock into 'v' as a separate line rather than inside an if statement for clarity. This code is hairy enough as it is.	2003-03-04 20:32:41 +00:00
John Baldwin	6b869595c5	Properly assert that mtx_trylock() is not called on a mutex we already owned. Previously the KASSERT would only trigger if we successfully acquired a lock that we already held. However, _obtain_lock() fails to acquire locks that we already hold, so the KASSERT was never checked in the case it was supposed to fail.	2003-03-04 20:30:30 +00:00
Jeff Roberson	e1f89c222b	- Create a function sched_interact_score() which decides on the interactivity of a kseg and assigns it a value of 0 through 100. - Use sched_interact_score() to determine the dynamic priority. - Define SCHED_CURR() in terms of sched_interact_score(). - Adjust the maximum slice back down to 100ms. - Remove redundant clearing of ke_runq in sched_wakeup() - Clean up #defines and comment them.	2003-03-04 02:45:59 +00:00
Jeff Roberson	7261f5f68e	- Add a new 'flags' parameter to getblk(). - Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT flag to the initial BUF_LOCK(). This will eventually be used in cases where we want to use a buffer only if it is not currently in use. - Convert all consumers of the getblk() api to use this extra parameter. Reviewed by: arch Not objected to by: mckusick	2003-03-04 00:04:44 +00:00
Jeff Roberson	f727171140	- Correct the wchan in vop_stdfsync() This is almost what bde asked for. There is some desire to have per fs wchans still but that is difficult given the current arrangement of the code.	2003-03-03 23:37:50 +00:00
Ruslan Ermilov	6de61153e8	FreeBSD 5.0 stopped shipping /modules 2.5 years ago. Catch up with this further by excluding /modules from the (default) kern.module_path.	2003-03-03 22:53:35 +00:00
Nate Lawson	7dc9111650	Pick up one file missed in the previous vprint() cleanup	2003-03-03 19:50:36 +00:00
Nate Lawson	99648386d3	Finish cleanup of vprint() which was begun with changing v_tag to a string. Remove extraneous uses of vop_null, instead deferring to the default op. Rename vnode type "vfs" to the more descriptive "syncer". Fix formatting for various filesystems that use vop_print.	2003-03-03 19:15:40 +00:00
Poul-Henning Kamp	182a9f7455	Make nokqfilter() return the correct return value. Ditch the D_KQFILTER flag which was used to prevent calling NULL pointers.	2003-03-03 16:24:47 +00:00
Poul-Henning Kamp	7ac40f5f59	Gigacommit to improve device-driver source compatibility between branches: Initialize struct cdevsw using C99 sparse initialization and remove all initializations to default values. This patch is automatically generated and has been tested by compiling LINT with all the fields in struct cdevsw in reverse order on alpha, sparc64 and i386. Approved by: re(scottl)	2003-03-03 12:15:54 +00:00
Poul-Henning Kamp	a9463ba804	Don't pick up a name from the dev_t if it is not there.	2003-03-03 11:14:36 +00:00
Jeff Roberson	65c8760dbf	- Shift the tick count by 10 and back around sched_pctcpu_update() calculations. Keep this change local to the function so the tick count is in its natural form otherwise. Previously 1000 was added each time a tick fired and we divided by 1000 when it was reported. This is done to reduce rounding errors.	2003-03-03 05:29:09 +00:00
Jeff Roberson	a6ed41865b	- In sched_add() special case PRI_TIMESHARE and PRI_ITHD|PRI_REALTIME. We always place ITHD & REALTIME threads on the current queue of the current cpu. Prior to this change an interrupt thread would only ever run on one cpu.	2003-03-03 04:28:07 +00:00
Jeff Roberson	f1e8dc4a3b	- Refrain from setting the td_priority in sched_wakeup(). It will be reset before we return to user space.	2003-03-03 04:11:40 +00:00
Poul-Henning Kamp	f16304aaf0	Explicitly initialize all cdevsw methods with the relevant nofoo() function if they are NULL.	2003-03-02 19:46:45 +00:00
Dag-Erling Smørgrav	521f364b80	More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9).	2003-03-02 16:54:40 +00:00
Dag-Erling Smørgrav	8994a245e0	Clean up whitespace, s/register //, refrain from strong urge to ANSIfy.	2003-03-02 15:56:49 +00:00
Dag-Erling Smørgrav	c952458814	uiomove-related caddr_t -> void * (just the low-hanging fruit)	2003-03-02 15:50:23 +00:00
Dag-Erling Smørgrav	d5279f20c5	Convert one of our main caddr_t consumers, uiomove(9), to void *.	2003-03-02 15:29:13 +00:00
Dag-Erling Smørgrav	34ca14c687	Clean up whitespace, unregisterize, ANSIfy, remove prototypes made superfluous by ANSIfication.	2003-03-02 15:08:33 +00:00
Poul-Henning Kamp	9c486c30e2	NO_GEOM cleanup: Remove cdevsw->d_size() implementation. No longer needed.	2003-03-02 14:43:46 +00:00
Poul-Henning Kamp	9285a87efd	NODEVFS cleanup: Replace devfs_{create,destroy} hooks with direct function calls.	2003-03-02 13:35:30 +00:00
Jeff Roberson	491081fabf	- Hold the vnode interlock across calls to bgetvp instead of acquiring it internally. This is required to stop multiple bufs from being associated with a single lblkno.	2003-03-02 06:05:23 +00:00
Tor Egge	c6faf3bf1d	Remove unneeded code added in revision 1.188.	2003-03-01 17:18:28 +00:00
Jeff Roberson	bff5362bf2	- gc USE_BUFHASH. The smp locking of the buf cache renders this useless.	2003-03-01 05:55:03 +00:00
David Xu	9948c47f0e	Check kse group limit before linking new ksegrp.	2003-02-28 15:57:33 +00:00
Poul-Henning Kamp	85f19dccdb	Add the flip-side check: If a driver wants a particular major#, make sure it is marked as allocated in reserved_majors[]. Whine if it wasn't.	2003-02-27 15:17:37 +00:00
Maxime Henrion	bca0668a92	We can now properly return ENODEV in nommap(), so do it. Remove the now wrong comment which says we can't.	2003-02-27 14:48:53 +00:00
Poul-Henning Kamp	beea48b254	Add support for allocating a device driver major number on demand. To do this, initialize the d_maj member of the cdevsw to MAJOR_AUTO. When the cdevsw is first passed to make_dev() a free major number will be assigned. Until we have a bit more experience with this a printf will announce this fact. Major numbers are not reclaimed, so loading/unloading the same device driver which uses MAJOR_AUTO will eventually deplete the pool of free major numbers and the system will panic when it can not allocate one. Still undecided who to inconvenience with the solution to this.	2003-02-27 14:46:51 +00:00
Hartmut Brandt	b89bc9e62b	When a process has been waiting on a condition variable or mutex the td_wmesg field in the thread structure points to the description string of the condition variable or mutex. If the condvar or the mutex had been initialized from a loadable module that was unloaded in the meantime, td_wmesg may now point to invalid memory. Retrieving the process table now may panic the kernel (or access junk). Setting the td_wmesg field to NULL after unblocking on the condvar/mutex prevents this panic. PR: kern/47408 Approved by: jake (mentor)	2003-02-27 08:43:27 +00:00
Poul-Henning Kamp	f477b4fd53	NODEVFS cleanup: Remove cdevsw_add() and cdevsw_remove(), they served us well for a long time. Bump __FreeBSD_version to 500104 to mark this.	2003-02-27 07:40:44 +00:00
David Xu	3b3df40fc4	Release sched_lock before calling upcall_free.	2003-02-27 05:42:01 +00:00
Julian Elischer	ac2e415327	Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.	2003-02-27 02:05:19 +00:00
Sam Leffler	893bec8059	o fix ppsratecheck to interpret a maxpps of zero as "ignore everything" o add a comment explaining the significance of using 0 or -1 (actually any negative value) for maxpps	2003-02-26 17:16:38 +00:00
David Xu	426269b2c2	Fix a bug when handling SIGCONT. Reported By: Mike Makonnen <mtm@identd.net>	2003-02-26 12:47:46 +00:00
Scott Long	7874f606d5	Introduce a new taskqueue that runs completely free of Giant, and in turns runs its tasks free of Giant too. It is intended that as drivers become locked down, they will move out of the old, Giant-bound taskqueue and into this new one. The old taskqueue has been renamed to taskqueue_swi_giant, and the new one keeps the name taskqueue_swi.	2003-02-26 03:15:42 +00:00
David Xu	5614648e5e	Add a missing '!'.	2003-02-26 01:56:14 +00:00
David Xu	4b4866ed42	Add a simple facility to allow round robin in userland. Reviewed by: julian	2003-02-26 00:58:23 +00:00
Kirk McKusick	7e734c4149	When doing cleanup of excessive buffers in bdwrite (see kern/vfs_bio.c delta 1.371) we must ensure that we do not get ourselves into a recursive trap endlessly trying to clean up after ourselves. Reported by: Attila Nagy <bra@fsn.hu> Sponsored by: DARPA & NAI Labs.	2003-02-25 23:59:09 +00:00
Mike Makonnen	0bd5f7979d	Unbreak mutex profiling (at least for me). o Always check for null when dereferencing the filename component. o Implement a try-and-backoff method for allocating memory to dump stats to avoid a spin-lock -> sleep-lock mutex lock order panic with WITNESS. Approved by: des, markm (mentor) Not objected: jhb	2003-02-25 22:28:46 +00:00
Jeff Roberson	2e3981a70c	- Add the missing NULL interlock argument to a recently added BUF_LOCK.	2003-02-25 08:23:11 +00:00
Kirk McKusick	3a7053cb60	Prevent large files from monopolizing the system buffers. Keep track of the number of dirty buffers held by a vnode. When a bdwrite is done on a buffer, check the existing number of dirty buffers associated with its vnode. If the number rises above vfs.dirtybufthresh (currently 90% of vfs.hidirtybuffers), one of the other (hopefully older) dirty buffers associated with the vnode is written (using bawrite). In the event that this approach fails to curb the growth in the vnode's number of dirty buffers (due to soft updates rollback dependencies), the more drastic approach of doing a VOP_FSYNC on the vnode is used. This code primarily affects very large and actively written files such as snapshots. This change should eliminate hanging when taking snapshots or doing background fsck on very large filesystems. Hopefully, one day it will be possible to cache filesystem metadata in the VM cache as is done with file data. As it stands, only the buffer cache can be used which limits total metadata storage to about 20Mb no matter how much memory is available on the system. This rather small memory gets badly thrashed causing a lot of extra I/O. For example, taking a snapshot of a 1Tb filesystem minimally requires about 35,000 write operations, but because of the cache thrashing (we only have about 350 buffers at our disposal) ends up doing about 237,540 I/O's thus taking twenty-five minutes instead of four if it could run entirely in the cache. Reported by: Attila Nagy <bra@fsn.hu> Sponsored by: DARPA & NAI Labs.	2003-02-25 06:44:42 +00:00
David Xu	d4b570f053	Remove a bogus comment.	2003-02-25 05:17:18 +00:00
David Xu	768298d8c4	Remove a never true condition.	2003-02-25 05:14:18 +00:00
Jeff Roberson	17661e5ac4	- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK. - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead. - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock. Reviewed by: arch, mckusick	2003-02-25 03:37:48 +00:00
Maxime Henrion	07159f9c56	Cleanup of the d_mmap_t interface. - Get rid of the useless atop() / pmap_phys_address() detour. The device mmap handlers must now give back the physical address without atop()'ing it. - Don't borrow the physical address of the mapping in the returned int. Now we properly pass a vm_offset_t * and expect it to be filled by the mmap handler when the mapping was successful. The mmap handler must now return 0 when successful, any other value is considered as an error. Previously, returning -1 was the only way to fail. This change thus accidentally fixes some devices which were bogusly returning errno constants which would have been considered as addresses by the device pager. - Garbage collect the poorly named pmap_phys_address() now that it's no longer used. - Convert all the d_mmap_t consumers to the new API. I'm still not sure whether we need a __FreeBSD_version bump for this, since we didn't guarantee API/ABI stability until 5.1-RELEASE. Discussed with: alc, phk, jake Reviewed by: peter Compile-tested on: LINT (i386), GENERIC (alpha and sparc64) Runtime-tested on: i386	2003-02-25 03:21:22 +00:00
Scott Long	3303c14b57	Don't NULL out p_fd until after closefd() has been called. This isn't totally correct, but it has caused breakage for too long. I welcome someone with more fd fu to fix it correctly.	2003-02-24 05:46:55 +00:00
David Xu	0fccb684d1	Remove a XXXKSE. kg_completed now needs proc lock.	2003-02-24 01:28:10 +00:00
David Xu	f5878f69df	Backout last surplus commit. That day just wasn't my day.	2003-02-24 00:49:55 +00:00
Tor Egge	6a07a13944	Sync new socket nonblocking/async state with file flags in accept(). PR: 1775 Reviewed by: mbr	2003-02-23 23:00:28 +00:00
Poul-Henning Kamp	acb18acfec	Bracket the kern.vnode sysctl in #ifdef notyet because it results in massive locking issues on diskless systems. It is also not clear that this sysctl is non-dangerous in its requirements for locked down memory on large RAM systems.	2003-02-23 18:09:05 +00:00
Poul-Henning Kamp	5cb3dc8fa3	OK, I was too sleepy there... Pointy hat over here!	2003-02-23 13:45:55 +00:00
Poul-Henning Kamp	8f5ef1a9fa	Implement CLOCK_MONOTONIC.	2003-02-23 10:18:31 +00:00
Jake Burkholder	fc718df7d0	Add a /a modifier to the show ktr ddb command, which prints the whole trace buffer without stopping. Useful if you just want to capture the output but can't run ktrdump.	2003-02-22 23:30:37 +00:00
Robert Watson	90623e1a9e	Don't panic when enumerating SYSCTL_NODE() nodes without any children nodes. Submitted by: green, Hiten Pandya <hiten@unixdaemons.com>	2003-02-22 17:58:06 +00:00
Mike Makonnen	750a91d8b1	Remove a comment which hasn't been true since rev. 1.158 Approved by: jhb, markm (mentor)(implicit)	2003-02-22 05:59:48 +00:00
Robert Watson	838a6d03e8	Export the name of the device used to mount the root file system as kern.rootdev. If rootdev is undefined (NFS mount, etc), export an empty string. Desired by: peter	2003-02-22 05:01:12 +00:00
Peter Wemm	86bb731626	Missing M_TRYWAIT from so_upcall third argument.	2003-02-21 22:23:40 +00:00
Poul-Henning Kamp	2c6b49f6af	NO_GEOM cleanup: Retire the "d_dump_t" and use the "dumper_t" type instead. Dumper_t takes a void * as first arg which is more general than the dev_t taken by d_dump_t. (Remember: we could have net-dumpers if somebody wrote us one!) Define the convention for GEOM controlled disk devices to be that the first argument to the dumper function is the struct disk pointer. Change device drivers accordingly.	2003-02-21 19:00:48 +00:00
David Xu	34ada4b3bb	If UTS kernel is calling kse_wakeup for itself, do nothing.	2003-02-21 07:11:38 +00:00
Poul-Henning Kamp	263444cfbf	Change the console interface to pass a "struct consdev *" instead of a dev_t to the method functions. The dev_t can still be found at struct consdev *->cn_dev. Add a void *cn_arg element to struct consdev which the drivers can use for retrieving their softc.	2003-02-20 20:54:45 +00:00
Poul-Henning Kamp	02574b19e1	Add a dead_cdevsw which does its best to return ENXIO if at all possible. In devsw() return dead_cdevsw instead of NULL in case the dev_t does not have a si_devsw. This may improve our survival chances with devices which go away unexpectedly.	2003-02-20 15:35:54 +00:00
David Xu	ab7d94f7eb	Forgot to set KU_DOUPCALL in kse_wakeup.	2003-02-20 08:22:04 +00:00
David Xu	eb117d5cb0	Add a timeout parameter to kse_release.	2003-02-20 08:18:15 +00:00
Bosko Milekic	025b4be197	o Allow "buckets" in mb_alloc to be differently sized (according to compile-time constants). That is, a "bucket" now is not necessarily a page-worth of mbufs or clusters, but it is MBUF_BUCK_SZ, CLUS_BUCK_SZ worth of mbufs, clusters. o Rename {mbuf,clust}_limit to {mbuf,clust}_hiwm and introduce {mbuf,clust}_lowm, which currently has no effect but will be used to set the low watermarks. o Fix netstat so that it can deal with the differently-sized buckets and teach it about the low watermarks too. o Make sure the per-cpu stats for an absent CPU has mb_active set to 0, explicitly. o Get rid of the allocate refcounts from mbuf map mess. Instead, just malloc() the refcounts in one shot from mbuf_init() o Clean up / update comments in subr_mbuf.c	2003-02-20 04:26:58 +00:00
Tim J. Robbins	27e39ae4d8	Remove the PL_SHAREMOD flag from struct plimit, which could have been used to share resource limits between rfork threads, but never was. Removing it makes resource limit locking much simpler -- only the current process can change the contents of the structure that p_limit points to.	2003-02-20 04:18:42 +00:00
Olivier Houchard	d6bf23783f	Remove duplicate includes. Submitted by: Cyril Nguyen-Huu <cyril@ci0.org>	2003-02-20 03:26:11 +00:00
Bosko Milekic	ec73437395	Fix a serious bug when computing the index for the reference counter array for mbuf clusters. I don't know how this got past early testing nor how it survived so long without getting caught. If anyone was seeing really really bizarre memory corruption in a few mbufs this would be why.	2003-02-20 03:01:04 +00:00
David Xu	a87891ee9e	Move thread limits testing code up a bit. This lets the UPCALLING thread take possible accumulated contexts away.	2003-02-20 01:11:17 +00:00
Poul-Henning Kamp	0c977c9c53	Add M_WAITOK	2003-02-19 22:51:33 +00:00
David Xu	fc8cdd87d2	Count non-threaded group.	2003-02-19 13:40:24 +00:00
David Xu	4f6cfa4520	Update comments to reflect new KSE code.	2003-02-19 13:36:51 +00:00
Tim J. Robbins	a44a414e11	The "m = m->m_next" that was removed in the revision 1.12 was necessary for the m->m_next != NULL case to avoid looping infinitely when the first mbuf in the chain becomes full.	2003-02-19 10:12:42 +00:00
David Xu	30621e142d	M_WAITOK and remove a useless comment.	2003-02-19 09:59:12 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
David Xu	0252d20369	Optimize the case when max threads number was hit.	2003-02-19 04:01:55 +00:00
Peter Wemm	af3d516f55	Initiate de-orbit burn for USE_PCI_BIOS_FOR_READ_WRITE. This has been #if'ed out for a while. Complete the deed and tidy up some other bits. We need to be able to call this stuff from outer edges of interrupt handlers for devices that have the ISR bits in pci config space. Making the bios code mpsafe was just too hairy. We had also stubbed it out some time ago due to there simply being too much brokenness in too many systems. This adds a leaf lock so that it is safe to use pci_read_config() and pci_write_config() from interrupt handlers. We still will use pcibios to do interrupt routing if there is no acpi.. [yes, I tested this] Briefly glanced at by: imp	2003-02-18 03:36:49 +00:00
David Xu	88aba94cdc	Further fix PS_NEEDSIGCHK	2003-02-17 14:54:57 +00:00
David Xu	02bbffaf3c	Move code for detecting PS_NEEDSIGCHK into thread_schedule_upcall, I think it is a better place to handle it.	2003-02-17 14:41:22 +00:00
Tim J. Robbins	96d7f8ef46	Use the proc lock to protect p_realtimer instead of Giant, and obtain sched_lock around accesses to p_stats->p_timer[] to avoid a potential race with hardclock. getitimer(), setitimer() and the realitexpire() callout are now Giant-free.	2003-02-17 10:03:02 +00:00
Jeff Roberson	58a3c27384	- Add a new function, thread_signal_add(), that is called from postsig to add a signal to a mailbox's pending set. - Add a new function, thread_signal_upcall(), this causes the current thread to upcall so that we can deliver pending signals. Reviewed by: mini	2003-02-17 09:58:11 +00:00
Julian Elischer	4a338afd7a	Move a bunch of flags from the KSE to the thread. I was in two minds as to where to put them in the first case.. I should have listened to the other mind. Submitted by: parts by davidxu@ Reviewed by: jeff@ mini@	2003-02-17 09:55:10 +00:00
Jeff Roberson	5215b1872f	- Split the struct kse into struct upcall and struct kse. struct kse will soon be visible only to schedulers. This greatly simplifies much of the KSE code. Submitted by: davidxu	2003-02-17 05:14:26 +00:00
Jeff Roberson	e4625663c9	- Move ke_sticks, ke_iticks, ke_uticks, ke_uu, ke_su, and ke_iu back into the proc. These counters are only examined through calcru. Submitted by: davidxu Tested on: x86, alpha, UP/SMP	2003-02-17 02:19:58 +00:00
Alfred Perlstein	9d4156aed3	Fix logic in loop so it actually executes. Pointed out by: fjoe	2003-02-16 16:12:10 +00:00
Poul-Henning Kamp	f341ca9891	Remove #include <sys/dkstat.h>	2003-02-16 14:13:23 +00:00
Poul-Henning Kamp	3abd4ccf87	Move the tty related statistics counters to live with the tty code.	2003-02-16 13:22:15 +00:00
Jeff Roberson	71146186a1	- Introduce a new function bremfreel() that does a bremfree with the buf queue lock already held. - In getblk() and flushbufqueues() use bremfreel() while we still have the buf queue lock held to keep the lists consistent. - Add LK_NOWAIT to two cases where we're essentially asserting that the bufs are not locked while acquiring the locks. This will make sure that we get the appropriate panic() and not another one for sleeping with a lock held.	2003-02-16 10:43:06 +00:00
Jeff Roberson	5e8feb5bed	- Add a WITNESS_SLEEP() for the appropriate cases in lockmgr().	2003-02-16 10:39:49 +00:00
Alfred Perlstein	5015c68a3c	prevent overflow in shminfo.shmmax	2003-02-16 06:08:55 +00:00
Jeffrey Hsu	a44009e07d	Remove extraneous FILEDESC_LOCK around atomic read.	2003-02-16 02:15:15 +00:00
Andrew R. Reiter	1f5a94d5f6	- Update a couple of comments to make sense with what today's code is doing (stale comments make arr something something ;)).	2003-02-15 23:25:12 +00:00
Tor Egge	218a01e062	Avoid file lock leakage when linuxthreads port or rfork is used: - Mark the process leader as having an advisory lock - Check if process leader is marked as having advisory lock when closing file - Check that file is still open after lock has been obtained - Don't allow file descriptor table sharing between processes with different leaders PR: 10265 Reviewed by: alfred	2003-02-15 22:43:05 +00:00
Andrew R. Reiter	da8f0c8429	- Remove old comment for PURGE() as it no longer exists and implied it was a comment to cache_zap(). - Add a comment to quickly state what cache_zap() does. Reviewed by: phk, mux	2003-02-15 18:58:06 +00:00
Tim J. Robbins	4444375710	Acquire Giant around calls to kern_sigaction() in sigaction(), freebsd4_sigaction() and osigaction() instead of around the whole body of those functions. They now no longer hold Giant around calls to copyin() and copyout(), and it is slightly more obvious what Giant is protecting.	2003-02-15 09:56:09 +00:00
Tim J. Robbins	c41c566c4a	osigpending() no longer needs Giant, for the same reason sigpending() does not.	2003-02-15 09:15:30 +00:00
Tim J. Robbins	48e8f774cb	All uses of p_siglist are protected by the proc lock now, so there's no need to acquire Giant in sigpending() anymore.	2003-02-15 08:42:02 +00:00
Alfred Perlstein	e7d6662f1b	Do not allow kqueues to be passed via unix domain sockets.	2003-02-15 06:04:55 +00:00
Alfred Perlstein	edf6699ae6	Fix LOR with PROC/filedesc. Introduce fdesc_mtx that will be used as a barrier between free'ing filedesc structures. Basically if you want to access another process's filedesc, you want to hold this mutex over the entire operation.	2003-02-15 05:52:56 +00:00
Bosko Milekic	9e7225808e	Make m_getm() always return the top of the newly allocated chain, as opposed to returning the top of the old chain when there was one and the top of the newly allocated chain if there was no old chain. Actually, it should be noted that prior to this fix, although the comment above m_getm() advertised that m_getm() would return the top of the old chain (if an old chain was being passed in) it actually [wrongly] was returning the tail mbuf in the old chain instead. This is a bug but since the one use of m_getm() in the tree luckily did not depend on the behavior, it happened to work out without notice. Harti Brandt pointed out that the advertised behavior was actually not the real behavior and so this change makes m_getm() ALWAYS return the newly allocated chain (and fixes the comment). This is less confusing and is the best course of action as then the caller is always able to have both a reference to the top of the original chain (because it's passing it in in the call) and a reference to the newly attached chain. Although the API is slightly modified, I don't think that any third-party code uses m_getm() and if it does, it surely can't be working properly because the old behavior was bogus. API bug pointed out by: Harti Brandt <brandt@fokus.fraunhofer.de>	2003-02-14 16:50:13 +00:00
Dag-Erling Smørgrav	af2eed6648	Style nit.	2003-02-14 13:30:25 +00:00
Alfred Perlstein	3dc593c895	KASSERT format string does not need newline termination	2003-02-14 13:28:44 +00:00
Alfred Perlstein	0c5f7aaab5	Add kasserts to catch bad API usage. Submitted by: Hiten Pandya <hiten@unixdaemons.com>	2003-02-14 13:18:51 +00:00
Alfred Perlstein	c11110eabe	Fix crash dumps on ata and scsi. To fix scsi, don't wait for ithreads if we're dumping, it makes the debugger sad. To fix ata, use what appears to be a polling method if we're dumping, I stole this from tmm but added code to ensure that this change is only in effect while dumping. Tested by: des	2003-02-14 13:10:40 +00:00
Alfred Perlstein	e95499bd4c	style.	2003-02-14 12:44:48 +00:00
Alfred Perlstein	aae87a3681	Print a backtrace in case we tsleep from inside of DDB.	2003-02-14 12:44:07 +00:00
Alan Cox	2bd63062b5	Use atomic ops to update amountpipekva. Amountpipekva represents the total kernel virtual address space used by all pipes. It is, thus, outside the scope of any individual pipe lock.	2003-02-13 19:39:54 +00:00
Dag-Erling Smørgrav	f6cebd7310	It seems the extra precautions are no longer needed.	2003-02-13 10:05:20 +00:00
Tim J. Robbins	5ce623b8e0	Add an XXX comment noting that getrusage() accesses p_stats->p_ru and p_stats->p_cru without holding the appropriate locks.	2003-02-13 09:53:59 +00:00
Peter Wemm	1c425b874c	Add a 'debug.witness_trace' sysctl (and tunable) when DDB is present. This causes LOR and could-sleep messages to come with a stack trace.	2003-02-13 01:35:56 +00:00
Peter Wemm	891e066864	Print "Stack backtrace:" right before dumping the backtrace. We cannot expect end users to automatically recognize a stack trace for what it is.	2003-02-13 01:33:59 +00:00
Warner Losh	b235704d7c	Implement rman_get_device # I thought this was already implemented Pointy hat on my head shown by: peter	2003-02-12 07:00:59 +00:00
Alfred Perlstein	42e1b74af2	Don't lock FILEDESC under PROC. The locking here needs to be revisited, but this ought to get rid of the LOR messages that people are complaining about for now. I imagine either I or someone else interested with smp will eventually clear this up.	2003-02-11 07:20:52 +00:00
Jeff Roberson	25c4325446	- Add a comment about a race that will happen without Giant.	2003-02-10 22:47:34 +00:00
Jeff Roberson	c7b716cc2a	- Unlock the nblock after the loop in bwillwrite().	2003-02-10 22:33:59 +00:00
Jeff Roberson	783caefbbf	- Enable STRICT_RESCHED until code that dynamically decides on resched strictness based on the current workload is finished.	2003-02-10 14:11:23 +00:00
Jeff Roberson	407b015791	- Add a new variable 'kg_runtime' that tracks the amount of time we've run. - Use the ratio of kg_runtime / kg_slptime to determine our dynamic priority. - Scale kg_runtime and kg_slptime back when the sum of the two exceeds SCHED_SLP_RUN_MAX. This allows us to slowly forget old behavior. - Scale back the runtime and slptime in fork so that the new process has the same ratio but much less accumulated time. This causes new behavior to be noticed more quickly.	2003-02-10 14:03:45 +00:00
Tim J. Robbins	fbf70de6b0	Lock the proc around accessing p_siglist in ttycheckoutq() in the unused wait != 0 case.	2003-02-10 06:06:46 +00:00
Jeff Roberson	7137d635ac	- In getnewbuf() unlock the bq lock prior to sleeping when we're out of buffers. Submitted by: tegge	2003-02-10 06:02:51 +00:00
Jake Burkholder	3749dff3f9	Remove mtx_lock_giant from functions which are mp-safe.	2003-02-10 04:42:20 +00:00
Jeff Roberson	3306adcfcf	- Correct another atomic op. Spotted by: alc	2003-02-09 22:39:51 +00:00
Jeff Roberson	08883c8a85	- Claim we're 'fsync' and not 'spec_fsync' in vop_stdfsync.	2003-02-09 12:29:38 +00:00
Jeff Roberson	69953c8435	- Move some code out from #ifdef INVARIANTS.	2003-02-09 12:11:37 +00:00
Jeff Roberson	05e393f0cd	- Update a printf format for b_flags.	2003-02-09 11:56:13 +00:00
Jeff Roberson	767b9a529d	- Cleanup unlocked accesses to buf flags by introducing a new b_vflag member that is protected by the vnode lock. - Move B_SCANNED into b_vflags and call it BV_SCANNED. - Create a vop_stdfsync() modeled after spec's sync. - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some fs specific processing. This gives all of these filesystems proper behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag. - Annotate the locking in buf.h	2003-02-09 11:28:35 +00:00
Jeff Roberson	15553af710	- spell add 'add' and not 'subtract' in an atomic op. Spotted by: alc Pointy hat to: jeff	2003-02-09 11:21:40 +00:00
Jeff Roberson	d85be48243	- Lock down the buffer cache's infrastructure code. This includes locks on buf lists, synchronization variables, and atomic ops for the counters. This change does not remove giant from any code although some pushdown may be possible. - In vfs_bio_awrite() don't access buf fields without the buf lock.	2003-02-09 09:47:31 +00:00
Julian Elischer	a282253a29	A little infrastructure, preceding some upcoming changes to the profiling and statistics code. Submitted by: DavidXu@ Reviewed by: peter@	2003-02-08 02:58:16 +00:00
Jeffrey Hsu	67c0ddef59	Remove vestiges of no longer needed unp_rvnode field. Approved by: phk (who originally added it in rev 1.8 of unpcb.h)	2003-02-06 01:34:43 +00:00
Julian Elischer	822ded67fe	The lockmanager has to keep track of locks per thread, not per process. Submitted by: david Xu (davidxu@) Reviewed by: jhb@	2003-02-05 19:36:58 +00:00
Dag-Erling Smørgrav	c524b1a8cf	Correct grammatical error in previous commit.	2003-02-04 18:47:17 +00:00
Dag-Erling Smørgrav	91dd013b1e	Extra precautions before trying to start init(8).	2003-02-04 18:16:50 +00:00
Poul-Henning Kamp	6334a66358	Implement proper bounds-checking and truncation of device names, this has become an issue now that end-user controllable attributes can become device names with the geom_vol_ffs class.	2003-02-04 11:04:26 +00:00
Poul-Henning Kamp	237d2765f9	Pave the road to removing the fixed size limit on device nodes: Change the si_name of dev_t's to be a char * and put a private buffer for holding the name at the end of the struct. Initialize si_name to point to the private buffer. Put a KASSERT in geom_disk to prevent overrun on the fake dev_t we still have to generate for the disk_drivers.	2003-02-04 10:32:40 +00:00
Poul-Henning Kamp	8751a8c73b	Add vsnrprintf() which is just like vsnprintf() but takes a "radix" argument for the kernel-special %r format.	2003-02-04 10:00:34 +00:00
Poul-Henning Kamp	91f1c2b3cc	Split the global timezone structure into two integer fields to prevent the compiler from optimizing assignments into byte-copy operations which might make access to the individual fields non-atomic. Use the individual fields throughout, and don't bother locking them with Giant: it is no longer needed. Inspired by: tjr	2003-02-03 19:49:35 +00:00
Jake Burkholder	238dd3209a	Split statclock into statclock and profclock, and made the method for driving statclock based on profhz when profiling is enabled MD, since most platforms don't use this anyway. This removes the need for statclock_process, whose only purpose was to subdivide profhz, and gets the profiling clock running outside of sched_lock on platforms that implement suswintr. Also changed the interface for starting and stopping the profiling clock to do just that, instead of changing the rate of statclock, since they can now be separate. Reviewed by: jhb, tmm Tested on: i386, sparc64	2003-02-03 17:53:15 +00:00
Hajimu UMEMOTO	12e4397ea3	Break out the bind and connect syscalls to make calling these syscalls internally easy. This is preparation for forthcoming IPv6 support for Linuxlator. Submitted by: dwmalone MFC after: 10 days	2003-02-03 17:36:52 +00:00
Tim J. Robbins	b338d59fef	No need to lock Giant around call to nanosleep1() in nanosleep().	2003-02-03 15:31:57 +00:00
Tim J. Robbins	411c25edae	Avoid holding Giant across copyout() in gettimeofday() and getitimer().	2003-02-03 14:47:22 +00:00
Hartmut Brandt	1b978d453b	Make the variable types, the sysctl macros and the sysctl handler for kern.ipc.{maxsockbuf,sockbuf_waste_factor} to agree that those variables are of type unsigned long. PR: sparc64/47389 Approved by: jake (mentor)	2003-02-03 06:50:59 +00:00
Jeff Roberson	5d7ef00cfe	- Make some context switches conditional on SCHED_STRICT_RESCHED. This may have some negative effect on interactivity but it yields great perf. gains. This also brings the conditions under which ULE context switches inline with SCHED_4BSD. - Define some new kseq_* functions for manipulating the run queue. - Add a new kseq member ksq_rslices and ksq_bload. rslices is the sum of the slices of runnable kses. This will be used for push load balance decisions. bload is the number of threads blocked waiting on IO.	2003-02-03 05:30:07 +00:00
Jeff Roberson	cd6e33df1c	- Stop abusing oncpu for our cpu binding. Define a scheduler local element in the kse datastructure called ke_cpu. This is the cpu which we are currently bound to. Some flags may be added later to support hard binding.	2003-02-03 02:26:28 +00:00
Alfred Perlstein	04738e99b5	Catch more uses of MIN().	2003-02-02 13:30:00 +00:00
Alfred Perlstein	8deebb0160	Consolidate MIN/MAX macros into one place (param.h). Submitted by: Hiten Pandya <hiten@unixdaemons.com>	2003-02-02 13:17:30 +00:00
Scott Long	7121cce58a	Use hz if stathz is zero. Adopted from sched_4bsd.	2003-02-02 08:24:32 +00:00
Julian Elischer	6f8132a867	Reversion of commit by Davidxu plus fixes since applied. I'm not convinced there is anything major wrong with the patch but them's the rules.. I am using my "David's mentor" hat to revert this as he's offline for a while.	2003-02-01 12:17:09 +00:00
Poul-Henning Kamp	4db4f5c87f	Under #ifdef DIAGNOSTIC, fill malloc(9) allocations which do not have M_ZERO specified with 0x70. (malloc_flags=J for the kernel :-)	2003-02-01 10:07:49 +00:00
Poul-Henning Kamp	33bef83cc6	Under DIAGNOSTIC, only report expensive timeouts if they are more expensive than the last one we reported.	2003-02-01 10:06:40 +00:00
Julian Elischer	ff92b12dce	Only add one tick per tick to the thread stats, instead of some random number.	2003-01-31 22:14:46 +00:00
Robert Watson	565211b27f	Correct handling of locking for chroot() and chdir() cases: rather than having change_dir() release the vnode lock on success, hold the lock so that we can use it later when invoking MAC checks and VOP_ACCESS() in the chroot() code. Update the comment to reflect this calling convention. Update callers to unlock the vnode lock. Correct a typo regarding vnode naming in the MAC case that crept in via the previous patch applied.	2003-01-31 21:13:25 +00:00
Robert Watson	7278944df1	Clean up vnode handling on return from chroot() in certain error cases: we might multiply vrele() a vnode when certain classes of failures occur. This appears to stem from earlier Giant/file descriptor lock pushdown and restructuring. Submitted by: maxim	2003-01-31 18:57:04 +00:00
Tim J. Robbins	48ed1432c5	Use a local variable to store the number of ticks that elapsed in kernel mode instead of (unintentionally) using the global `ticks'. This error completely broke profiling.	2003-01-31 11:22:31 +00:00
Poul-Henning Kamp	6e1203e558	NO_GEOM cleanup: unifdef;	2003-01-30 19:22:27 +00:00
Poul-Henning Kamp	c5cab5b2fa	NO_GEOM cleanup: retire to attic.	2003-01-30 12:58:55 +00:00
Poul-Henning Kamp	c9834aa961	NODEVFS cleanup: Unifdef.	2003-01-30 12:51:32 +00:00
Poul-Henning Kamp	8e67075792	NO_GEOM cleanup: remove #ifdef	2003-01-30 12:36:30 +00:00
Poul-Henning Kamp	4af0d0c21f	NODEVFS cleanup: remove #ifdefs	2003-01-30 12:35:40 +00:00
Poul-Henning Kamp	f6a1852dcc	NODEVFS cleanup: remove #ifdefs.	2003-01-30 12:35:17 +00:00
Poul-Henning Kamp	34189c035b	NODEVFS cleanup: Remove cdevsw[]. This implicitly removes the need for major numbers, but a number of drivers still know things they shouldn't need to, and we need to consider if there are applications which cache major(+minor) gleaned from stat(2) and rely on it being constant over reboots before we start assigning random majors.	2003-01-29 21:54:03 +00:00
Tim J. Robbins	af7cbce89c	Fix two fatal signedness errors introduced when i and j in semop() were changed from int to size_t in the previous revision. PR: 47625	2003-01-29 12:30:59 +00:00
Poul-Henning Kamp	60ca399653	Move timecounters notion of frequency to 64 bits. [WARNING: CPUs in the distant future may be closer than they appear!]	2003-01-29 11:29:22 +00:00
Jeff Roberson	0a016a05a4	- Use ksq_load as the authoritative count of kses on the pair of kseqs for sched_runnable() et al. - Remove some dead code in sched_clock(). - Define two macros KSEQ_SELF() and KSEQ_CPU() for getting the kseq of the current cpu or some alternate cpu. - Start introducing kseq_() functions, such as kseq_choose() and kseq_setup().	2003-01-29 07:00:51 +00:00
Jeff Roberson	bf857e69a2	- Remove debugging code that didn't work on UP.	2003-01-29 00:26:47 +00:00
Jeff Roberson	d465fb9589	- Allow idle's pctcpu time to be calculated.	2003-01-28 09:30:17 +00:00
Jeff Roberson	c9f25d8f92	- Fix the ksq_load calculation. It now reflects the number of entries on the run queue for each cpu. - Introduce kse stealing into the sched_choose() code. This helps balance cpus better in cases where process turnover is high. This implementation is fairly trivial and will likely be only a temporary measure until something more sophisticated has been written.	2003-01-28 09:28:20 +00:00
Peter Wemm	bf2053cad6	No longer force COMPAT_FREEBSD4 to be on.	2003-01-27 23:01:03 +00:00
Poul-Henning Kamp	109751d28c	Don't dereference null vnode pointer if controlling terminal was revoked. Submitted by: "Peter Edwards" <pmedwards@eircom.net>	2003-01-27 16:54:17 +00:00
David Xu	ba07d97e62	Use kg_numupcalls to see if we are closing a thread group, not kg_kses which is not changed when a group is still working.	2003-01-26 23:39:33 +00:00
Alfred Perlstein	ca315837c7	fix warnings	2003-01-26 23:25:00 +00:00
Alfred Perlstein	b17c9cfa5e	Add const qualifier to data argument for msgsnd. PR: standards/45274 Submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-01-26 20:09:34 +00:00
David Xu	0dbb100b9b	Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code is gone. A thread that owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and take those contexts back to userland. Any thread without an upcall structure has to export its contexts and exit at the user boundary. Any thread running in user mode owns an upcall structure. When it enters the kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in the kernel, a new UPCALL thread is created and the upcall structure is transferred to the new UPCALL thread. If the kse mailbox's current thread pointer is NULL, then when a thread is blocked in the kernel, no UPCALL thread will be created. Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit; when all upcalls in a ksegrp are removed, the group is automatically shut down. An upcall owner thread also exits when the process is in exiting state. When an owner thread exits, the upcall it owns is also removed. KSE is a pure scheduler entity. It represents a virtual cpu. When a thread is running, it always has a KSE associated with it. The scheduler is free to assign a KSE to a thread according to thread priority; if thread priority is changed, the KSE can be moved from one thread to another. When a ksegrp is created, there are always N KSEs created in the group, where N is the number of physical cpus in the current system. This makes it possible that even if a userland UTS is only single CPU safe, threads in the kernel can still execute on different cpus in parallel. Userland calls kse_create to add more upcall structures into a ksegrp to increase concurrency in userland itself; the kernel is not restricted by the number of upcalls userland provides. The code hasn't been tested under SMP by the author due to lack of hardware. Reviewed by: julian	2003-01-26 11:41:35 +00:00
Jeff Roberson	35e6168fcd	- Add the ule scheduler. This is intended to be a general purpose process scheduler with many SMP benefits. It is still very experimental and should be used only in test environments.	2003-01-26 05:23:15 +00:00
Jeff Roberson	4e997f4b87	- Call sched_sleep() instead of rolling our own in cv_waitq_add().	2003-01-26 04:00:39 +00:00
Alfred Perlstein	e1d7d0bb60	Bring shm functions closer to the opengroup standards. PR: 47469 Submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-01-25 21:33:05 +00:00
Alfred Perlstein	3beb32709d	Bring semop() closer to the opengroup standards. PR: 47471 Submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-01-25 21:27:37 +00:00
Poul-Henning Kamp	4394f4767d	Add sysctl kern.timecounter.nsetclock which indicates the number of potential discontinuities in our UTC timescale. Applications can monitor this variable if they want to be informed about steps in the timescale. Slews (ntp and adjtime(2)) and frequency adjustments (ntp) will not increment this counter, only operations which set the clock. No attempt is made to classify size or direction of the step.	2003-01-25 07:51:09 +00:00

6210 commits