Subject: Re: [boost] [container] New container: why is everybody silent?
From: Aleksandr Kupriianov (alexkupri_at_[hidden])
Date: 2014-08-25 02:48:29
Jeremy Murphy <jeremy.william.murphy <at> gmail.com> writes:
> OK. I might have missed it, but I can't see in the docs an explanation of
> how to choose L and M for a given data structure and cache line size. Is
> there a formula?
> I guess I'm concerned that these L and M values limit the ease of
> portability, by which I mean the additional effort required to consider the
> performance effects of the parameters on one machine or another is very
> different to the ease of use of std::vector.
Jeremy, I proposed good default values (L=30, M=60), which could be however
changed. The default values are good for most PCs. I've performed experiments
on my laptop and found that these values form plateau. I mean, if you change
L or M twice, the result changes not more than 7 to 10%. However, if you
decrease them strongly (L=4, M=4), you can lose 50% performance, if you
increase them strongly (M=500), insertion/deletion will be performed slowly.
___ To be brief, you should use default values ___. (Compare this with vector
behavior: you use default growth strategy, but you can change it if you want.)
If want to get best performance, you should take into account not only
machine, but also operations your application performs, also size of your
structure, also memory manager, also typical size of container. If the
application likes to read/access, large M is preferable. If it likes to
insert/delete, small M is preferable. Default L is good enough unless you
want to split/merge containers or perform bulk insertions/deletions (in that
case L should be slightly increased). Large L is slightly better for big
containers. Not to mention memory allocation: if ptree_seq contains 1
element, it allocates the whole leaf or M+ elements. So mobile apps can
prefer smaller L and M for memory saving.
___ To be brief, if you want 100% performance, you should tune L and M during
the profiling stage, when the whole application is complete ___.
> Well, the point of the bibliography is primarily for anyone looking at your
> data structure to understand why it is designed that way. A specific
> reference to a textbook or paper allows us to see that and also potentially
> see what you missed. If your implementation is based on a Wikipedia
> article then cite that. Maybe I'm banging the science drum too hard, but I
> personally find that kind of explicit trail of theory helpful.
I mentioned article in wikipedia about rope just because it helps the common
understanding. It shows how to keep sequence in tree-like structure. The idea
is quite simple: to keep number of elements in each subtree near the pointer
to that subtree. If you insert (delete) element, you modify log(N) subtrees,
so you update log(N) numbers. If you want to access an element with given
index, you use the search algorithm, which is very similar to tree-search
algorithm, but you use these numbers instead of index.
Also, I mention rope because it is very respectful and is included into gcc
under the name of __gnu_cxx::rope.
However, the rest is my optimizations. Firstly, I use btree instead of binary
tree, which makes data structure cache-efficient. Secondly, I use some
algoritmic tricks, which allow to improve performance (by constant value).
The best example is using recursion-free, one-pass update algorithm, which I
described on my title page.
To be brief, I used an already known concept of keeping sequence in tree-
like structure, and then effectively applied some known optimizations to it.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk