freebsd-src/lib/libc/stdlib/tsearch.c


Let tsearch()/tdelete() use an AVL tree.

The existing implementations of POSIX tsearch() and tdelete() don't attempt to perform any balancing at all. Testing reveals that inserting 100k nodes into a tree sequentially takes approximately one minute on my system.

Though most other BSDs also don't use any balanced tree internally, C libraries like glibc and musl do provide better implementations. glibc uses a red-black tree and musl uses an AVL tree.

Red-black trees have the advantage over AVL trees that they only require O(1) rotations after insertion and deletion, but have the disadvantage that the tree has a maximum depth of 2*log2(n) instead of 1.44*log2(n). My take is that it's better to focus on having a lower maximum depth, because in the case of tsearch() the invocation of the comparator likely dominates the running time.

This change replaces the tsearch() and tdelete() functions with versions that create an AVL tree. Compared to musl's implementation, this version differs in two ways:

- We don't keep track of heights; just balances. This is sufficient, and it reduces the number of nodes that are accessed. Storing heights requires us to also access all of the siblings along the path.

- Don't use any recursion at all. We know that the tree cannot exceed 2^64 elements in size, so the height of the tree can never be larger than 96. Use a 128-bit bitmask to keep track of the path that is computed. This allows us to iterate over the same path twice, meaning we can apply rotations from top to bottom.

Inserting 100k nodes into a tree now only takes 0.015 seconds. Insertion seems to be twice as fast as glibc, whereas deletion has about the same performance. Unlike glibc, it uses a fixed amount of memory.

I also experimented with both recursive and iterative bottom-up implementations of the same algorithm. This iterative top-down version performs similarly to the recursive bottom-up version in terms of speed and code size.

For some reason, the iterative bottom-up algorithm was actually 30% faster for deletion, but has a quadratic memory complexity to keep track of all the parent pointers.

Reviewed by: jilles
Obtained from: https://github.com/NuxiNL/cloudlibc
Differential Revision: https://reviews.freebsd.org/D4412

2015-12-22 18:12:11 +00:00
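The commit message describes recording the descent in a 128-bit bitmask so that the same path can be walked a second time without recursion or parent pointers. The sketch below is one possible way to do that, not the contents of tsearch_path.h: the names `demo_path`, `demo_path_init`, `demo_path_record` and `demo_path_replay` are hypothetical, and the layout (two 64-bit words, a set bit meaning "went left") is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration; the real helpers live in tsearch_path.h. */
struct demo_path {
	uint64_t bits[2];	/* 128 decisions; a set bit means "left" */
	unsigned int nrecorded;	/* steps written during the descent */
	unsigned int nreplayed;	/* steps read back on the second walk */
};

/* Forget any previously recorded steps. */
static void
demo_path_init(struct demo_path *p)
{
	p->bits[0] = 0;
	p->bits[1] = 0;
	p->nrecorded = 0;
	p->nreplayed = 0;
}

/* Append one left/right decision to the path. */
static void
demo_path_record(struct demo_path *p, bool left)
{
	/* The commit message bounds the tree height at 96, below 128. */
	assert(p->nrecorded < 128);
	if (left)
		p->bits[p->nrecorded / 64] |=
		    (uint64_t)1 << (p->nrecorded % 64);
	p->nrecorded++;
}

/* Return the next decision in the order in which it was recorded. */
static bool
demo_path_replay(struct demo_path *p)
{
	bool left;

	assert(p->nreplayed < p->nrecorded);
	left = (p->bits[p->nreplayed / 64] >> (p->nreplayed % 64)) & 1;
	p->nreplayed++;
	return (left);
}
```

A caller would record one step per comparison while descending, then call `demo_path_replay()` once per node on the second walk to decide whether to follow the left or the right link. Because the storage is a fixed 16 bytes plus two counters, memory use does not depend on the size of the tree.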
/*-
* Copyright (c) 2015 Nuxi, https://nuxi.nl/
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");
#define _SEARCH_PRIVATE
#include <search.h>
#include <stdlib.h>
#include "tsearch_path.h"
void *
tsearch(const void *key, void **rootp,
    int (*compar)(const void *, const void *))
{
	struct path path;
	node_t *root, **base, **leaf, *result, *n, *x, *y, *z;
	int cmp;

	/* POSIX requires that tsearch() returns NULL if rootp is NULL. */
	if (rootp == NULL)
		return (NULL);
	root = *rootp;

	/*
	 * Find the leaf where the new key needs to be inserted. Return
	 * if we've found an existing entry. Keep track of the path that
	 * is taken to get to the node, as we will need it to adjust the
	 * balances.
	 */
	path_init(&path);
	base = &root;
	leaf = &root;
	while (*leaf != NULL) {
		if ((*leaf)->balance != 0) {
			/*
			 * If we reach a node that has a non-zero
			 * balance on the way, we know that we won't
			 * need to perform any rotations above this
			 * point. In this case rotations are always
			 * capable of keeping the subtree in balance.
			 * Make this the base node and reset the path.
			 */
			base = leaf;
			path_init(&path);
		}
		cmp = compar(key, (*leaf)->key);
		if (cmp < 0) {
			path_taking_left(&path);
			leaf = &(*leaf)->llink;
		} else if (cmp > 0) {
			path_taking_right(&path);
			leaf = &(*leaf)->rlink;
		} else {
			return (&(*leaf)->key);
		}
	}

	/* Did not find a matching key in the tree. Insert a new node. */
	result = *leaf = malloc(sizeof(**leaf));
	if (result == NULL)
		return (NULL);
	result->key = (void *)key;
	result->llink = NULL;
	result->rlink = NULL;
	result->balance = 0;

	/*
	 * Walk along the same path a second time and adjust the
	 * balances. Except for the first node, all of these nodes must
	 * have a balance of zero, meaning that these nodes will not get
	 * out of balance.
	 */
	for (n = *base; n != *leaf;) {
		if (path_took_left(&path)) {
			n->balance += 1;
			n = n->llink;
		} else {
			n->balance -= 1;
			n = n->rlink;
		}
	}

	/*
	 * Adjusting the balances may have pushed the balance of the
	 * base node out of range. Perform a rotation to bring the
	 * balance back in range.
	 */
	x = *base;
	if (x->balance > 1) {
		y = x->llink;
		if (y->balance < 0) {
			/*
			 * Left-right case.
			 *
			 *         x
			 *        / \            z
			 *       y   D          / \
			 *      / \     -->    y   x
			 *     A   z          /|   |\
			 *        / \        A B   C D
			 *       B   C
			 */
			z = y->rlink;
			y->rlink = z->llink;
			z->llink = y;
			x->llink = z->rlink;
			z->rlink = x;
			*base = z;
			x->balance = z->balance > 0 ? -1 : 0;
			y->balance = z->balance < 0 ? 1 : 0;
			z->balance = 0;
		} else {
			/*
			 * Left-left case.
			 *
			 *        x              y
			 *       / \            / \
			 *      y   C   -->    A   x
			 *     / \                / \
			 *    A   B              B   C
			 */
			x->llink = y->rlink;
			y->rlink = x;
			*base = y;
			x->balance = 0;
			y->balance = 0;
		}
	} else if (x->balance < -1) {
		y = x->rlink;
		if (y->balance > 0) {
			/*
			 * Right-left case.
			 *
			 *      x
			 *     / \               z
			 *    A   y             / \
			 *       / \    -->    x   y
			 *      z   D         /|   |\
			 *     / \           A B   C D
			 *    B   C
			 */
			z = y->llink;
			x->rlink = z->llink;
			z->llink = x;
			y->llink = z->rlink;
			z->rlink = y;
			*base = z;
			x->balance = z->balance < 0 ? 1 : 0;
			y->balance = z->balance > 0 ? -1 : 0;
			z->balance = 0;
		} else {
			/*
			 * Right-right case.
			 *
			 *      x                y
			 *     / \              / \
			 *    A   y     -->    x   C
			 *       / \          / \
			 *      B   C        A   B
			 */
			x->rlink = y->llink;
			y->llink = x;
			*base = y;
			x->balance = 0;
			y->balance = 0;
		}
	}
	/* Return the new entry. */
	*rootp = root;
	return (&result->key);
}