|
|
AVL ArrayRationale |
Download |
This rationale explains the main design decisions. The overview should be read first. The design part of the FAQ can be used as a light version of this rationale.
The end position (or rather past-the-end) must be taken into account while designing containers and iterators. It is a special iterator value that a) doesn't refer any element and therefore can't be dereferenced, and b) is still somehow connected to the container.
Two options can be considered for representing this end position:
--
--)
value_type)
The second option has been chosen for its obvious advantages. The no-payload-dummy problem is discussed below.
Handling lists and trees, it's usual to have a special node called sentinel that is never deallocated, so that the head or root pointers always point to a non-NULL address. This simplifies algorithms managing them. For example, in a circular doubly linked list without a sentinel, the code for appending one element would be:
if (first==NULL)
{
first = newnode;
newnode->next = newnode->prev = newnode;
}
else
{
newnode->next = first;
newnode->prev = first->prev;
newnode->next->prev = newnode->prev->next = newnode;
}
With a sentinel before the beginning of the list, it would be:
Note that, in circular doubly linked lists, having a sentinel before the beginning and having it after the end are exactly the same thing. In annewnode->next = sentinel; newnode->prev = sentinel->prev; newnode->next->prev = newnode->prev->next = newnode;
avl_array
the sentinel node is the dummy node at the same time. It represents the
position for both end() and rend(). It is the root
of the tree and its parent and right child pointers are always NULL.
In linked data structures like lists or trees there's no reason for invalidating iterators on insertion/removal operations (unless for those iterators referring the elements to be removed, of course). This is a very valuable property for the user.
Reverse iterators could have been defined with the std::reverse_iterator
template. This template is usually based on embedding a normal iterator
referring the next element of the intended one. For example: the reverse iterator
returned by rbegin() would contain a normal (straight) iterator
referring past-the-end.
With this technique, the validity of every reverse iterator does not depend on the intended element, but on its neighbor. In other words, removing an element invalidates the reverse iterators that refer to the previous element. For this reason, reverse iterators have been defined from scratch, just like normal (straight) iterators.
Neither iterators nor reverse iterators contain anything but a pointer to the
node of the refered element. Thus, they remain valid until the refered element
is erased, even when it is moved within the container, or into a different
container (move(), splice(), swap() and
other operations).
As it has been commented in the overview,
a special kind of rank trees has been chosen for avl_array. Normal
rank trees store, in every node, the rank of the node in its subtree. That is,
the count of nodes in it's left subtree:
dummy,7
/
D,3
/ \
B,1 F,1
/ \ / \
A,0 C,0 E,0 G,0
The total number of nodes can be calculated by traversing the right-most branch.
When a node is added/removed, some ranks must be incremented/decremented in the
path to the root (only in nodes that have the modified branch in their left
subtree).
In the trees used by avl_array, every node stores the total count
of nodes in its subtree (including the node itself):
dummy,8
/
D,7
/ \
B,3 F,3
/ \ / \
A,1 C,1 E,1 G,1
The main advantage of this approach is the simplicity of the update algorithm:
when a node (or a subtree) is added/removed, all counts must be updated
from this point up in the way to the root. The count of a node is updated by
adding the counts of its two children plus one.
AVL trees have been chosen for simplicity. They are probably easier to understand (and implement, and maintain...) than Red-Black trees, which have more corner cases. AVL trees have better lookup time, because they guarantee better balance. On the other hand, for the same reason, one can expect Red-Black trees to make fewer rotations. This should be considered as a matter of implementation. It affects neither the interface nor the algorithmic complexity of any operation.
The dummy node is not aggregated as a member variable to the container class. Instead, it is inherited:
avl_array_node_tree_fields
(without value_type)
|
________|________
| |
| |
avl_array avl_array_node
(without value_type) (with value_type)
All pointers in the base class (m_parent, m_next,
m_prev, m_children[L] and m_children[R]) are
pointers to the base class, though they will point to objects of the derived
classes. Pointer conversions are always done with static casts.
Ordinary nodes have non-NULL values in their m_parent pointer, while
the dummy node has always the NULL value in its m_parent pointer.
This fact is exploited for ensuring the correct use of the nodes via assertions
and/or exceptions.
The container of any element can be obtained in logarithmic time (with a travel to the root), so neither nodes nor iterators need to have a pointer to the container. Note that having such a pointer in iterators would invalidate them when the referenced element was moved from a container to another, and having it in nodes would require a lot of extra memory and time for maintenance.
The pointer to the container is only needed in some operations that already have logarithmic complexity or above. Some operations with lower complexity can optionally use it for assertions regarding the correct use of the library (see asserts and exceptions below).
The iterators of avl_array provide random access, but they are
not fully qualified random access iterators. The reason is that random
access operations are provided with O(log N) complexity, instead of O(1).
Algorithms specialized for random access iterators assume O(1) random access
in their optimizations.
Therefore, avl_array iterators have been tagged as bidirectional instead of random access.
A perfect example for this point are the implementations of reverse()
shown in the
SGI STL iterator tags
overview. The specialization for bidirectional iterators uses a loop
like this one:
The specialization for random access iterators uses a loop like:while (first != last && first != --last) // Two comparisons iter_swap(first++, last);
In both caseswhile (first < last) // Just one comparison iter_swap(first++, --last);
iter_swap() makes a by-value swap of the
referenced elements.
The random access version benefits from the < operator,
saving one comparison per step. This is a gain for truly random access
iterators: complexity is still O(n), but a slight proportional time
reduction is achieved. Though, with avl_array iterators,
this would raise the complexity to O(n log N). For example, with an
avl_array containing 1000 elements the < operation
would be, on average, 10 times slower (or worse) than the !=
operation.
On the other hand, operators ++ and -- are
in avl_array iterators as fast as in list:
O(1) worst case (not only amortized). For this reason, it's much better
to stick to the algorithm specializations for bidirectional iterators,
especially if they work by value (copying instead of extracting and
inserting).
Two additional parameters of the iterator classes templates (Ptr
and Ref) support the 2-in-1 trick used for
instantiating two iterator versions (mutable and const) using a single
template. The drawback is that it hardens making them friend of each
other. The workaround for this little inconvenience is using the
avl_array class as a friendship gateway between different kinds
of iterators, including reverse iterators by the way.
Individual insert/erase/move operations have O(log N) time complexity. Group operations would have O(n log N) time complexity if they were implemented as n individual operations. Indeed, this is how they are performed in cases where n is small compared with N. But when n is not so small, an algorithm with O(N) complexity is used.
The perfect example is the copy constructor: it might be implemented
with a loop calling push_back() N times. This would have
O(N log N) complexity.
A better approach would be copying the tree structure of the source
container. An even better approach (and the chosen one) is building
the tree in perfect balance. The method build_known_size_tree()
builds a tree in perfect balance, without function recursion
and with O(N) time complexity. The same method is used in all
constructors, excepting the default constructor, which only needs to
initialize an empty avl_array.
But constructors are not the only place where this approach is useful.
Any group operation can benefit from it when the number of elements
to insert/erase/move is large enough. The method worth_rebuild()
detects this by comparing N with n·log2(N), or more exactly:
final_N with n·log2(average_N). When n is great enough, the operation
is performed ignoring the tree structure (working with the linked list
only) and rebuilding the tree only in the end.
Loops and floating point calculations are avoided in
worth_rebuild(). This implies some loss of precision, but a
broad estimation suffices here. Note that the comparison assumes that
both complexities are translated into time multiplying by the same
constant, which is obviously false. Therefore, the estimation will
probably favor one side or the other near the critical point, where
both algorithms take the same time. Though, it will work for the big
numbers, saving a lot of time.
The algorithms unique() and merge() always
use build_known_size_tree().
Global algorithms (like reverse() or merge())
that work by value, leave iterators referring the same positions of the
sequence. As in list, local versions of some algorithms
are provided in avl_array. These algorithms don't work
copying payload values (value_type) but changing nodes'
links. This is a great advantage when the value_type is
big, or it's assignment heavy. Additionally, these algorithms benefit
from the internal structure of the implementation in two ways:
merge() or massive move/insert/erase
operations treat the container as a list, and rebuild the tree
only in the end (perfectly balanced and in record O(N) time), thus
achieving O(N) complexity instead of O(N log N).
The search tree sort algorithm is not stable. A stable version can be
provided, but it requires O(N) extra memory: one extra size_t
field per element. The extra field of every element stores its
old position, so that the previous order between equal elements can be
reproduced in the new tree. For this reason, stable_sort()
is by default disabled. The type of the extra field is the
fourth parameter of the avl_array template instantiation.
Its default value is empty_number, a class intended to be
optimized away by the compiler. Specifying std::size_t
instead as fourth parameter enables stable_sort().
See Frequently Asked Questions for a brief description of this feature.
This feature is expected to be required only in very particular
situations. The class empty_number is used again as
default value for the NPSV width type parameter (the third one
in the avl_array template).
The NPSV is based on the same core idea of avl_array:
in every node a field stores the total amount of space occupied by
its subtree in the sequence. The main difference is that in the
normal sequence view (or rank) every node occupies one unit of
space while, in the NPSV, every node occupies its own width (stored
in another field).
The NPSV widths are added bottom-up, like the nodes counts. The
operations that modify the tree must update them. Therefore, this
feature must be implemented inside the avl_array class.
As it was explained in the overview, every node stores the number of nodes in its subtree (including the node itself), instead of its rank in the subtree. The same scheme is used in NPSV, where the rank approach would be prone to cumulative errors if a floating point type was used as width.
Changing the NPSV width of a node requires updating the width sums
of every node in the path to the root. The time complexity of this
operation is O(log N). Changing every width in an avl_array
is, therefore, O(N log N). The lazy sums update mode provides
a better option: NPSV widths can be changed postponing the sums
update, achieving O(N) time complexity instead of O(N log N).
A composite width can be used for the NPSV, providing more than
one view at the same time. An overloaded version of
npsv_at_pos() uses a comparison functor for finding a
node using the corresponding view.
Revised 22 November, 2006
Copyright © Universidad de Alcalá 2006. See accompanying license.