|
Boost : |
From: Jooseong Lee (frozenca91_at_[hidden])
Date: 2022-07-25 15:03:31
>I see that you have O(log N) n-th element access, that it is an
>interesting feature. Does this mean that every non-leaf node has a
>count of its descendants?
Yes, every non-leaf (and leaf) node has a count of keys in the subtree
where that node is the root.
> I don't see how erase(a,b) can be O(log N), is that a typo?
No, it's not.
In order to explain this, I have to explain split() first:
split(Tree&& tree, key a, key b) splits a tree to two trees, say tree_l and
tree_r, so that
tree_l contains all keys less than a, tree_r contains all keys greater than
b.
By B-Tree's nature, this is O(log N).
erase(a, b) works like this:
1. compute lb = lower_bound(a), ub = upper_bound(b) pair. These are
iterators. O(log N)
2. compute order(lb), order(ub). Therefore, order(ub) - order(lb) is the
number of elements that will be erased. O(log N)
3-1. When order(ub) - order(lb) < 30, the tree erases element one by one
3-2. When it exceeds 30, the algorithm first splits the tree to two trees
by calling spit(std::move(*this), lb, ub)
and joins the two split trees. split() and join() is O(log N), and there is
one split() call and join() call,
so the whole complexity is O(log N)
I plan to benchmark mine with abseil's B-Tree soon.
Thank you for pointing out Beman's implementation and sqlite, I'll list
them too.
I'll add Boost license too, thanks for that
Best,
Jooseong
2022ë
7ì 25ì¼ (ì) ì¤í 11:29, Phil Endecott via Boost <boost_at_[hidden]>ëì´
ìì±:
> Jooseong Lee wrote:
> > I recently wrote a general-purpose STL-like ordered associative B-Tree
> > based container in C++.
> > .
> > Its API and usage is very similar to std::set, std::map, std::multiset,
> > std::multimap,
> > and when the number of elements is relatively large, my benchmark shows
> > that it is 2.5~3 times faster than std::set and its friends.
> >
> > Repo link: https://github.com/frozenca/BTree
>
> I would be interested to see a comparison with Beman Dawes'
> B-Tree library from 2010:
>
> https://lists.boost.org/Archives/boost//2010/09/170995.php
> https://github.com/Beman/btree
>
> Personally, I often use flat sets/maps. Of course these are best
> suited when lookups dominate over modifications. It would be interesting
> to know how your library compares for lookups.
>
> I see that you have O(log N) n-th element access, that it is an
> interesting feature. Does this mean that every non-leaf node has a
> count of its descendants?
>
> I don't see how erase(a,b) can be O(log N), is that a typo?
>
> In the past I have used a lot of memory-mapped, mostly read-only,
> binary data structures but these days I'm more likely to use
> sqlite. Have you tried to benchmark against that?
>
> I note it is not Boost licensed.
>
>
> Regards, Phil.
>
>
>
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk