AVL Array
- Usage
  - Is avl_array a substitute for list, vector and deque?
  - May I simply use avl_array everywhere instead of spending my time deciding between list, vector and deque?
  - How should I decide what to use in every case?
  - Why?
  - Complexity? O(log N)? What's that?
  - Is avl_array like a map with an integer key?
  - How should I iterate through an avl_array: with [], with ++, or with +=?
  - Does it mean I should never use [] or +=?
  - What is the NPSV stuff?
  - What would I want it for?
  - Why do I get a compiler error when I try to use stable_sort()?
- Design
  - What's the performance price of NPSV and stable_sort() when they are not wanted?
  - Why aren't reverse iterators defined with the std::reverse_iterator template?
  - Why are tree nodes also in a circular doubly linked list?
  - Why are iterators tagged as bidirectional instead of random access?
  - Why are some algorithms implemented inside the avl_array class?
  - What is that O(min{N, n log N}) complexity?
  - What's the cost, and how precise is it?
  - Why don't iterators and/or tree nodes contain a pointer to their container object?
  - Why does avl_array inherit from avl_array_node_tree_fields instead of having a member of that class, or a simple pointer?
  - Why AVL trees?
Usage

Is avl_array a substitute for list, vector and deque?

Not at all.
May I simply use avl_array everywhere instead of spending my time deciding between list, vector and deque?

No! Please, go on reading.
How should I decide what to use in every case?

First, follow the wise, widespread, official recommendations. Choose list when you plan to insert/erase anywhere in the sequence and you don't need random access. Choose vector when you plan to use random access and you only need fast insert/erase at the end of the sequence. Choose deque when you plan to use random access and you need fast insert/erase at both ends of the sequence (but not in the middle).

When none of these fits, that is, when you plan to make intense use of both random access and insert/erase anywhere in the sequence (not only at the ends), then choose avl_array.
Why?

Because avl_array has logarithmic time complexity (O(log N)) for both random access and insert/erase.
Complexity? O(log N)? What's that?

See Big O notation in the Wikipedia, or the introduction to this documentation.
Is avl_array like a map with an integer key?

Not at all. Unlike map, avl_array is a sequence container. In a map, insert/erase of one element doesn't affect the keys of the other elements. In an avl_array, insert/erase of one element alters the indexes (call them keys if you prefer, but they are not ;) of all following elements in the sequence.
How should I iterate through an avl_array: with [], with ++, or with +=?

It's faster with ++. Do it like this:

    for (it = container.begin(); it != container.end(); it++)  // BTW, use !=, not <
    {
      /* do something with *it */
    }

instead of:

    for (i = 0; i < container.size(); i++)
    {
      /* do something with container[i] */
    }

Comparing iterators with == or != is also faster than with <, >, <= or >=.
Does it mean I should never use [] or +=?

No. While every iterator++ only needs a single step, the number of steps needed for iterator+=n or iterator[n] is at least 2*log2(|n|) (larger on average, and up to 2*log2(size) in the worst case). For container[n], the average number of steps is about log2(size). Make your numbers.
What is the NPSV stuff?

NPSV stands for "Non-Proportional Sequence View". It is a mechanism that provides one or more additional random access indices. These indices are affected by insert/erase operations in the same way as the ordinary random access index (numbers from 0 to size-1). The difference is that they are not necessarily integers, and they don't necessarily grow on a constant basis (by one, as the ordinary random access index does). See avl_array.hpp and the NPSV-related examples for more information.
What would I want it for?

For storing the data of text editors, text processors or internet browsers. You are using one right now (or you used it for printing what you have in your hands). In fact, this idea is already in use in GTK:

    struct _GtkTextBTreeNode
    {
      //...
      int num_lines;  /* Total number of lines (leaves) in
                       * the subtree rooted here. */
      int num_chars;  /* Number of chars below here */
      //...
    };

Think of e-books, mobile phones...
Why do I get a compiler error when I try to use stable_sort()?

Stable sort requires one extra unsigned integer per element, so it is disabled by default. Just specify std::size_t in the fourth parameter of the avl_array template instantiation. You will need to fill in the previous parameters too (allocator and NPSV width type). Use the default values if you don't need them. For example:

    avl_array<employee,                   // Your class
              std::allocator<employee>,   // Default allocator for your class
              mkr::detail::empty_number,  // No NPSV
              std::size_t>                // Enable stable sort
        staff;
Design

What's the performance price of NPSV and stable_sort() when they are not wanted?

None. All compilers should optimize them away when the corresponding template parameters are set to their default values. See rationale.
Why aren't reverse iterators defined with the std::reverse_iterator template?

Because it would spoil, for reverse iterators, a valuable property: they remain valid, referring to the same element, until that element is erased (or the iterator is modified, of course). The std::reverse_iterator template is based on embedding a normal iterator that refers to the next element, thus making the validity of the reverse iterator dependent on what happens to that next element. See rationale.
Why are tree nodes also in a circular doubly linked list?

It has an obvious memory cost, but it slightly simplifies the maintenance of the tree balance and, more important, it lowers the average and worst case time complexity of many operations, like the iterators' ++ and --, or massive insert/erase/move operations. See rationale.
Why are iterators tagged as bidirectional instead of random access?

Because they provide random access, but not with constant time complexity (O(1)). The specializations of generic algorithms for random access assume O(1) random access. This assumption (wrong in the case of avl_array) would make them far slower than their bidirectional counterparts: not only proportionally slower, but of a higher complexity order. A perfect example of this is the pair of reverse() implementations shown in the SGI STL iterator tags overview. See rationale.
Why are some algorithms implemented inside the avl_array class?

Like those in list (see splice, unique, reverse, merge and sort in the SGI STL implementation), they support operations otherwise prohibitive or even impossible. They work in place, without moving or copying the contained objects, and they benefit from the internal structure of the implementation to achieve better performance (not only proportionally, but with a lower complexity order). See rationale.
What is that O(min{N, n log N}) complexity?

There are two ways to perform multiple insert/erase/move operations: 1) repeating single insert/erase/move operations, maintaining the tree(s) integrity and restoring balance every time; and 2) ignoring the tree structure, working with the list, and rebuilding the tree(s) only at the end.

The first way has a cost of O(n log N), where n is the number of elements to insert/erase/move, and N is the total number of elements in the container. The second way has a cost of O(N). A fast estimation is made to decide which way is better in each case. See rationale.
What's the cost, and how precise is it?

The estimation has no loops and uses integer arithmetic only. Therefore, its cost is almost negligible compared with both the total time consumed by the operation and the gain achieved in extreme cases. The estimation may be wrong in a small interval near the critical point (where both ways take the same time), but it works well with big numbers, thus saving a lot of execution time. See rationale.
Why don't iterators and/or tree nodes contain a pointer to their container object?

Because they don't need to. Maintaining it would be a waste of space and time (in the case of tree nodes), or it would spoil valuable properties (in the case of iterators). See rationale.
Why does avl_array inherit from avl_array_node_tree_fields instead of having a member of that class, or a simple pointer?

Because that way the container object can be reached from the tree nodes by following the parent pointers until a NULL pointer is found. See rationale.
Why AVL trees?

Just because. Well, I find them easier to understand (and implement, and maintain...) than red-black trees, which have more corner cases. AVL trees have better lookup time, because they guarantee better balance. On the other hand, for the same reason, one can expect red-black trees to make more rotations on insertion and erasure. This should be considered a matter of implementation: it affects neither the interface nor the algorithmic complexity of any operation.
Revised 19 May, 2010
Copyright © Universidad de Alcalá 2006. See accompanying license.