freebsd-src/sys/vm/vm_radix.h

/*-
* SPDX-License-Identifier: BSD-2-Clause
*
* Copyright (c) 2013 EMC Corp.
* Copyright (c) 2011 Jeffrey Roberson <jeff@freebsd.org>
* Copyright (c) 2008 Mayur Shardul <mayur.shardul@gmail.com>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef _VM_RADIX_H_
#define _VM_RADIX_H_
#include <vm/_vm_radix.h>
#ifdef _KERNEL
#include <sys/pctrie.h>
#include <vm/vm_page.h>
#include <vm/vm.h>
void vm_radix_wait(void);
void vm_radix_zinit(void);
void *vm_radix_node_alloc(struct pctrie *ptree);
void vm_radix_node_free(struct pctrie *ptree, void *node);
extern smr_t vm_radix_smr;
static __inline void
vm_radix_init(struct vm_radix *rtree)
{
pctrie_init(&rtree->rt_trie);
}
static __inline bool
vm_radix_is_empty(struct vm_radix *rtree)
{
return (pctrie_is_empty(&rtree->rt_trie));
}
PCTRIE_DEFINE_SMR(VM_RADIX, vm_page, pindex, vm_radix_node_alloc, vm_radix_node_free,
vm_radix_smr);
/*
* Inserts the key-value pair into the trie.
* Panics if the key already exists.
*/
static __inline int
vm_radix_insert(struct vm_radix *rtree, vm_page_t page)
{
return (VM_RADIX_PCTRIE_INSERT(&rtree->rt_trie, page));
}
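/*
 * Minimal usage sketch (illustrative, not part of this header): insertion
 * runs with the trie's external lock held; on an ENOMEM return callers
 * typically drop their locks, call vm_radix_wait(), and retry the whole
 * operation.  "rtree" and "m" are hypothetical locals.
 *
 *	if (vm_radix_insert(rtree, m) != 0) {
 *		... unlock, vm_radix_wait(), then retry from the top ...
 *	}
 */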
/*
* Insert the page into the vm_radix tree with its pindex as the key. Panic if
* the pindex already exists. Return zero on success or a non-zero error on
* memory allocation failure. Set the out parameter mpred to the previous page
* in the tree as if found by a previous call to vm_radix_lookup_le with the
* new page pindex.
*/
static __inline int
vm_radix_insert_lookup_lt(struct vm_radix *rtree, vm_page_t page,
vm_page_t *mpred)
{
int error;
error = VM_RADIX_PCTRIE_INSERT_LOOKUP_LE(&rtree->rt_trie, page, mpred);
if (__predict_false(error == EEXIST))
panic("vm_radix_insert_lookup_lt: page already present, %p",
*mpred);
return (error);
}
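/*
 * Sketch of how the predecessor output might be used (assumed names only):
 * after a successful insert, "mpred" is the page immediately preceding "m"
 * in pindex order, or NULL if "m" is now the lowest-indexed page.
 *
 *	vm_page_t mpred;
 *
 *	if (vm_radix_insert_lookup_lt(rtree, m, &mpred) == 0) {
 *		... e.g. link "m" after "mpred" in a pindex-ordered list ...
 *	}
 */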
/*
* Returns the value stored at the index assuming there is an external lock.
*
* If the index is not present, NULL is returned.
*/
static __inline vm_page_t
vm_radix_lookup(struct vm_radix *rtree, vm_pindex_t index)
{
return (VM_RADIX_PCTRIE_LOOKUP(&rtree->rt_trie, index));
}
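/*
 * Hedged example: a locked lookup of the page at "pindex"; the caller is
 * assumed to hold the lock that serializes updates to this trie.
 *
 *	m = vm_radix_lookup(rtree, pindex);
 *	if (m == NULL)
 *		... no page is resident at pindex ...
 */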
/*
* Returns the value stored at the index without requiring an external lock.
*
* If the index is not present, NULL is returned.
*/
static __inline vm_page_t
vm_radix_lookup_unlocked(struct vm_radix *rtree, vm_pindex_t index)
{
return (VM_RADIX_PCTRIE_LOOKUP_UNLOCKED(&rtree->rt_trie, index));
}
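/*
 * Sketch of unlocked use (assumed names): the lookup itself relies on SMR,
 * but the returned page is not stabilized, so callers are expected to
 * revalidate it (for instance by busying it and re-checking its identity)
 * before depending on it.
 *
 *	m = vm_radix_lookup_unlocked(rtree, pindex);
 *	if (m != NULL) {
 *		... busy or wire "m", then confirm it still belongs at
 *		    pindex before using it ...
 *	}
 */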
/*
* Returns the page with the least pindex that is greater than or equal to the
* specified pindex, or NULL if there are no such pages.
*
* Requires that access be externally synchronized by a lock.
*/
static __inline vm_page_t
vm_radix_lookup_ge(struct vm_radix *rtree, vm_pindex_t index)
{
return (VM_RADIX_PCTRIE_LOOKUP_GE(&rtree->rt_trie, index));
}
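/*
 * One possible forward scan built on vm_radix_lookup_ge (illustrative):
 * visits every resident page with pindex in [start, end) while the lock
 * protecting the trie is held.
 *
 *	for (m = vm_radix_lookup_ge(rtree, start);
 *	    m != NULL && m->pindex < end;
 *	    m = vm_radix_lookup_ge(rtree, m->pindex + 1))
 *		... process "m" ...
 */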
/*
* Returns the page with the greatest pindex that is less than or equal to the
* specified pindex, or NULL if there are no such pages.
*
* Requires that access be externally synchronized by a lock.
*/
static __inline vm_page_t
vm_radix_lookup_le(struct vm_radix *rtree, vm_pindex_t index)
{
return (VM_RADIX_PCTRIE_LOOKUP_LE(&rtree->rt_trie, index));
}
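/*
 * Illustrative counterpart to the scan above: find the closest resident
 * page at or below a given pindex, e.g. as the predecessor for an insert.
 *
 *	mpred = vm_radix_lookup_le(rtree, pindex);
 *	if (mpred != NULL && mpred->pindex == pindex)
 *		... a page is already resident at pindex ...
 */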
/*
* Remove the specified index from the trie, and return the value stored at
* that index. If the index is not present, return NULL.
*/
static __inline vm_page_t
vm_radix_remove(struct vm_radix *rtree, vm_pindex_t index)
{
return (VM_RADIX_PCTRIE_REMOVE_LOOKUP(&rtree->rt_trie, index));
}
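/*
 * Sketch (assumed names): removal returns the page that was stored at
 * "pindex"; the trie no longer references it, so the caller must dispose
 * of it separately.
 *
 *	m = vm_radix_remove(rtree, pindex);
 *	if (m != NULL)
 *		... "m" is no longer reachable through the trie ...
 */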
/*
* Remove and free all the nodes from the radix tree.
*/
static __inline void
vm_radix_reclaim_allnodes(struct vm_radix *rtree)
{
VM_RADIX_PCTRIE_RECLAIM(&rtree->rt_trie);
}
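/*
 * Informal usage note: this releases only the trie's internal nodes; the
 * pages themselves are untouched, so callers typically free or migrate
 * every resident page first, e.g.:
 *
 *	... free all pages formerly indexed by the trie ...
 *	vm_radix_reclaim_allnodes(rtree);
 */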
/*
* Replace an existing page in the trie with another one.
* Panics if there is not an old page in the trie at the new page's index.
*/
static __inline vm_page_t
vm_radix_replace(struct vm_radix *rtree, vm_page_t newpage)
{
return (VM_RADIX_PCTRIE_REPLACE(&rtree->rt_trie, newpage));
}
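/*
 * Sketch (names are assumptions): swap a new page in for the one currently
 * stored at mnew->pindex; the displaced page is handed back to the caller.
 *
 *	mold = vm_radix_replace(rtree, mnew);
 *	... "mold" has been unlinked from the trie; dispose of it as needed ...
 */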
#endif /* _KERNEL */
#endif /* !_VM_RADIX_H_ */