Commit graph

1702 commits

Author SHA1 Message Date
Poul-Henning Kamp 6a4b58230c Replace a homegrown bdone()/bwait() implementation by the real thing 2003-08-18 19:47:16 +00:00
Alan Cox ef13663bb6 Three unrelated changes to vm_proc_new(): (1) add vm object locking on the
U pages object; (2) reorganize such that the U pages object is created and
filled in one block; and (3) remove an unnecessary clearing of PG_ZERO.
2003-08-18 01:31:43 +00:00
Poul-Henning Kamp ec7948490b Use NULL for 3rd argument of VOP_BMAP() rather than custom cast.
Eliminate unused variable.
2003-08-17 18:54:23 +00:00
Marcel Moolenaar 710338e94f In vm_thread_swap{in|out}(), remove the alpha specific conditional
compilation and replace it with a call to cpu_thread_swap{in|out}().
This allows us to add similar code on ia64 without cluttering the
code even more.
2003-08-16 23:15:15 +00:00
Poul-Henning Kamp 395714feb7 Eliminate unnecessary udev_t variable: we can derive it from the dev_t
when we need it.
2003-08-15 13:14:25 +00:00
Poul-Henning Kamp 89dc784fa3 Make swaponvp() static to the swap_pager. 2003-08-15 12:04:29 +00:00
Alan Cox 3e1b578a28 Extend the scope of the page queues lock in vm_pageout_scan() to cover
the traversal of the PQ_INACTIVE queue.
2003-08-15 05:13:36 +00:00
Alan Cox 5402d8ec23 Remove GIANT_REQUIRED from vmspace_alloc(). 2003-08-13 19:23:51 +00:00
Alan Cox 46add12552 Reduce the size of the vm map (and by inclusion the vm space) on 64-bit
architectures by moving a field within the structure.
2003-08-13 03:13:22 +00:00
Warner Losh 06b4bf3e55 Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's
copyrighted files.

Approved by: Matt Dillon
2003-08-12 23:24:05 +00:00
Alan Cox c759a3ca06 Reduce the size of the vm object on 64-bit architectures by moving
a field within the structure.
2003-08-12 20:10:32 +00:00
Bosko Milekic 20e8e865bd - When deciding whether to init the zone with small_init or large_init,
compare the zone element size (+1 for the byte of linkage) against
  UMA_SLAB_SIZE - sizeof(struct uma_slab), and not just UMA_SLAB_SIZE.
  Add a KASSERT in zone_small_init to make sure that the computed
  ipers (items per slab) for the zone is not zero, despite the addition
  of the check, just to be sure (this part submitted by: silby)

- UMA_ZONE_VM used to imply BUCKETCACHE.  Now it implies
  CACHEONLY instead.  CACHEONLY is like BUCKETCACHE in the
  case of bucket allocations, but in addition to that also ensures that
  we don't setup the zone with OFFPAGE slab headers allocated from the
  slabzone.  This means that we're not allowed to have a UMA_ZONE_VM
  zone initialized for large items (zone_large_init) because it would
  require the slab headers to be allocated from slabzone, and hence
  kmem_map.  Some of the zones init'd with UMA_ZONE_VM are so init'd
  before kmem_map is suballoc'd from kernel_map, which is why this
  change is necessary.
2003-08-11 19:39:45 +00:00
Bruce M Simpson abd498aa71 Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are*
   necessary. This needed clarification; the stub code generation for
   mlockall() was disabled, which would prevent applications from
   linking to this API (suggested by mux)
 - Giant has been quoshed. It is no longer held by the code, as
   the required locking has been pushed down within vm_map.c.
 - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES
   to express their intention explicitly.
 - Inspected at the vmstat, top and vm pager sysctl stats level.
   Paging-in activity is occurring correctly, using a test harness.
 - The RES size for a process may appear to be greater than its SIZE.
   This is believed to be due to mappings of the same shared library
   page being wired twice. Further exploration is needed.
 - Believed to back out of allocations and locks correctly
   (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).

PR:             kern/43426, standards/54223
Reviewed by:    jake, alc
Approved by:    jake (mentor)
MFC after:	2 weeks
2003-08-11 07:14:08 +00:00
Mike Silbersack cebde06978 More pipe changes:
From alc:
Move pageable pipe memory to a seperate kernel submap to avoid awkward
vm map interlocking issues.  (Bad explanation provided by me.)

From me:
Rework pipespace accounting code to handle this new layout, and adjust
our default values to account for the fact that we now have a solid
limit on allocations.

Also, remove the "maxpipes" limit, as it no longer has a purpose.
(The limit on kva usage solves the problem of having two many pipes.)
2003-08-11 05:51:51 +00:00
Poul-Henning Kamp ef3c5abdba Make the first two pages magic to protect the BSD labels rather than
only one.
2003-08-06 14:13:38 +00:00
Poul-Henning Kamp 07f81f9159 Remove an unused variable. 2003-08-06 12:09:34 +00:00
Poul-Henning Kamp 751221fd32 Staticize swap_pager_putpages()
Eliminate a lot of checkes to make sure requests are not cross-device
which is unnecessary with the new layout.  We know a sequential request
cannot possibly be cross-device because there is a reserved page between
the devices.

Remove a couple of comments which no longer are relevant.
2003-08-06 12:08:27 +00:00
Poul-Henning Kamp 030b34923d Access the swap_pagers' ->putpages() through swappagerops instead
of directly, this is a cleaner way to do it.
2003-08-06 12:05:48 +00:00
Poul-Henning Kamp f976cfd99a Add XXX: comment to vm_pager_unswapped(). 2003-08-06 10:51:40 +00:00
Poul-Henning Kamp 5e04322a6e Explicitly set B_PAGING 2003-08-06 09:22:47 +00:00
Poul-Henning Kamp c37a77ee86 Rip out the totally bogos vnode swapdev_vp with extreeme prejudice.
Don't mark buffers with B_KEEPGIANT, we don't drop giant in strategy
at this point in time.
2003-08-06 06:53:31 +00:00
Poul-Henning Kamp e04e4bacf6 Use sparse struct initialization for struct pagerops.
Mark our buffers B_KEEPGIANT before sending them downstream.

Remove swap_pager_strategy implementation.
2003-08-05 06:54:56 +00:00
Poul-Henning Kamp 4e6586002d Use sparse struct initializations for struct pagerops.
This makes grepping for which pagers implement which methods easier.
2003-08-05 06:51:26 +00:00
Poul-Henning Kamp 665c0caf03 Put an uncovered page between the swap devices, that way we can be sure
to not get any cross-device I/O requests.  (The unallocated first page
protecting BSD labels already gave us this, but that hack may go away
at some point in time).

Remove the check for cross-device I/O requests in swap_pager_strategy.

Move the repeated statistics updating into flushchainbuf().
2003-08-04 08:22:49 +00:00
Alan Cox 981371629a Use kmem_alloc_nofault() instead of kmem_alloc_pageable() to allocate
swapbkva.  Swapbkva mappings are explicitly managed using pmap_qenter(),
not on-demand by vm_fault(), making kmem_alloc_nofault() more appropriate.

Submitted by:	tegge
2003-08-04 04:35:04 +00:00
Poul-Henning Kamp 12692209a6 Name swap_pager_find_dev() more correctly swp_pager_finde_dev().
Use ->bio_children to count child buffers, rather than abuse the
bio_caller1 pointer.

Expand the relevant bits of waitchainbuf() inline, this clarifies
the code a little bit.
2003-08-03 21:22:42 +00:00
Poul-Henning Kamp 5ff0108d21 I accidentally hit undo before committing, fix the resulting off-by-one. 2003-08-03 14:53:52 +00:00
Poul-Henning Kamp 8f60c087e6 Change the layout policy of the swap_pager from a hardcoded width
striping to a per device round-robin algorithm.

Because of the policy of not attempting to retain previous swap
allocation on page-out, this means that a newly added swap device
almost instantly takes its 1/N share of the I/O load but it takes
somewhat longer for it to assume it's 1/N share of the pages if there
is plenty of space on the other devices.

Change the 8G total swapspace limitation to 8G per device instead
by using a per device blist rather than one global blist.  This
reduces the memory footprint by 75% (typically a couple hundred
kilobytes) for the common case with one swapdevice but NSWAPDEV=4.

Remove the compile time constant limit of number of swap devices,
there is no limit now.  Instead of a fixed size array, store the
per swapdev structure in a TAILQ.

Total swap space is still addressed by a 32 bit page number and
therefore the upper limit is now 2^42 bytes = 16TB (for i386).

We still do not allocate the first page of each device in order to
give some amount of protection to any bsdlabel at the start of the
device.

A new device is appended after the existing devices in the swap space,
no attempt is made to fill in holes left behind by swapoff (this can
trivially be changed should it ever become a problem).

The sysctl vm.nswapdev now reflects the number of currently configured
swap devices.

Rename vm_swap_size to swap_pager_avail for consistency with other
exported names.

Change argument type for vm_proc_swapin_all() and swap_pager_isswapped()
to be a struct swdevt pointer rather than an index.

Not changed: we are still using blists to manage the free space,
but since the swapspace is no longer fragmented by the striping
different resource managers might fare better.
2003-08-03 13:35:31 +00:00
Poul-Henning Kamp 745f330503 Move extern declaration of the various pagerops from vm_pager.c
to vm_pager.h where the various pagers will also see them.
2003-08-03 09:27:39 +00:00
Alan Cox b245ac95cf Revise obj_alloc(). Most notably, use the object's lock to prevent two
concurrent invocations from acquiring the same address(es).  Also, in case
of an incomplete allocation, free any allocated pages.

In collaboration with:	tegge
2003-08-03 06:08:48 +00:00
Bosko Milekic 48bf87258f When INVARIANTS is on and we're in uma_zalloc_free(), we need to make
sure that uma_dbg_free() is called if we're about to call
uma_zfree_internal() but we're asking it to skip the dtor and
uma_dbg_free() call itself.  So, if we're about to call
uma_zfree_internal() from uma_zfree_arg() and skip == 1, call
uma_dbg_free() ourselves.
2003-08-02 22:40:27 +00:00
Alan Cox b77c2bcd98 Update the comment at the head of kmem_alloc_nofault() to describe its
purpose and use.
2003-08-01 19:51:43 +00:00
Bosko Milekic 174ab4501e Only free the pcpu cache buckets if they are non-NULL.
Crashed this person's machine: harti
Pointy-hat to: me
2003-08-01 17:42:27 +00:00
Poul-Henning Kamp 8d677ef93f Remove unused stuff.
Move used stuff to swap_pager.c where it belongs.

This file no longer exports anything to userland.
2003-07-31 22:19:28 +00:00
Peter Wemm 15a7ad60fb Add #include "opt_kstack_pages.h" and "opt_kstack_max_pages.h" to remain
in sync with the backend machdep code.  When cpu_thread_init() does not
have the same idea of KSTACK_PAGES as the thing that created the kstack,
all hell breaks loose.

Bad alc! no cookie! :-)
2003-07-31 01:25:05 +00:00
Bosko Milekic d56368d779 Plug a race and a leak in UMA.
1) The race has to do with zone destruction.  From the zone destructor we
   would lock the zone, set the working set size to 0, then unlock the zone,
   drain it, and then free the structure.  Within the window following the
   working-set-size set to 0 and unlocking of the zone and the point where
   in zone_drain we re-acquire the zone lock, the uma timer routine could
   have fired off and changed the working set size to something non-zero,
   thereby potentially preventing us from completely freeing slabs before
   destroying the zone (and thus leaking them).

2) The leak has to do with zone destruction as well.  When destroying a
   zone we would take care to free all the buckets cached in the zone, but
   although we would drain the pcpu cache buckets, we would not free them.
   This resulted in leaking a couple of bucket structures (512 bytes each)
   per cpu on SMP during zone destruction.

While I'm here, also silence GCC warnings by turning uma_slab_alloc()
from inline to real function.  It's too big to be an inline.

Reviewed by: JeffR
2003-07-30 18:55:15 +00:00
Bosko Milekic a40fdcb439 When generating the zone stats make sure to handle the master zone
("UMA Zone") carefully, because it does not have pcpu caches allocated
at all.  In the UP case, we did not catch this because one pcpu cache
is always allocated with the zone, but for the MP case, we were getting
bogus stats for this zone.

Tested by: Lukas Ertl <le@univie.ac.at>
2003-07-30 15:22:37 +00:00
Poul-Henning Kamp 7b4bd98ad5 Remove the disabling of buckets workaround.
Thanks to:	jeffr
2003-07-30 07:50:19 +00:00
Jeff Roberson f828e5bedb - Get rid of the ill-conceived uz_cachefree member of uma_zone.
- In sysctl_vm_zone use the per cpu locks to read the current cache
   statistics this makes them more accurate while under heavy load.

Submitted by:	tegge
2003-07-30 05:59:17 +00:00
Jeff Roberson d11e0ba565 - Check to see if we need a slab prior to allocating one. Failure to do
so not only wastes memory but it can also cause a leak in zones that
   will be destroyed later.  The problem is that the slab allocation code
   places newly created slabs on the partially allocated list because it
   assumes that the caller will actually allocate some memory from it.
   Failure to do so places an otherwise free slab on the partial slab list
   where we wont find it later in zone_drain().

Continuously prodded to fix by:	phk (Thanks)
2003-07-30 05:42:55 +00:00
Poul-Henning Kamp 0c32d97ab5 Temporary workaround: Always disable buckets, there is a bug there
somewhere.

JeffR will look at this as soon as he has time.

OK'ed by:	jeffr
2003-07-29 22:07:10 +00:00
Alan Cox 234c7726c8 None of the "alloc" functions used by UMA assume that Giant is held any
longer.  (If they still need it, e.g., contigmalloc(), they acquire it
themselves.)  Therefore, we need not acquire Giant in slab_zalloc().
2003-07-28 02:29:07 +00:00
Alan Cox f50ab15dff Remove GIANT_REQUIRED from kmem_alloc(). 2003-07-27 18:31:32 +00:00
Maxime Henrion 085f5d6043 Use pmap_zero_page() to zero pages instead of bzero() because
they haven't been vm_map_wire()'d yet.
2003-07-27 10:41:33 +00:00
Alan Cox 9c65e7a336 Allow vm_object_reference() on kernel_object without Giant. 2003-07-27 05:43:58 +00:00
Alan Cox 17d89a1f67 Acquire Giant rather than asserting it is held in contigmalloc(). This is
a prerequisite to removing further uses of Giant from UMA.
2003-07-26 21:48:46 +00:00
Poul-Henning Kamp a8d43c90af Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.
2003-07-26 07:32:23 +00:00
Alan Cox 0c1a133f56 Gulp ... call kmem_malloc() without Giant. 2003-07-26 03:55:32 +00:00
Maxime Henrion b9ff8db1be Add support for the M_ZERO flag to contigmalloc().
Reviewed by:	jeff
2003-07-25 21:02:25 +00:00
Poul-Henning Kamp a5edd34afe Remove all but one of the inlines here, this reduces the code size by
2032 bytes and has no measurable impact on performance.
2003-07-22 20:54:26 +00:00