Boost logo

Boost :

From: Jooseong Lee (frozenca91_at_[hidden])
Date: 2022-07-26 15:19:13


> What is not stored contiguously is child, this is stored in
std::vector<std::unique_ptr>.
> But when N keys are erased, the number of children erased is not O(N),
because a node can have #(fanout) keys.

I have to correct this sentence, the number of children erased is O(N).
But the coefficient is small, because practically B-Trees use large fanouts.

Most technically, a call for erase_range(a, b) would be O(log N) + O(K)
where N is the number of whole keys and K is the number of keys being
erased.
I've updated the repo description, thanks for the catch.

Best,
Jooseong

2022년 7월 27일 (수) 오전 12:06, Jooseong Lee <frozenca91_at_[hidden]>님이 작성:

> > If the element type has a trivial destructor and they are stored
> contiguously,
> > then you can try to argue that it is better than O(N). This will depend
> on
> > the complexity guarantees of your memory allocator. But your elements
> are not
> > stored contiguously; destroying N elements will require O(N) "pages" to
> be
> > released, if your pages are fixed-size. So it is still O(N).
>
> Actually, the keys are stored contiguously:
>
>
> https://github.com/frozenca/BTree/blob/bc62015a3789e5004ec98be0b82587de4edfd488/fc_btree.h#L145-#L167
>
> For trivially copyable types, the type of key array is std::array;
> for other types, the type of key array is std::vector. (in this case, your
> argument would be right)
>
> What is not stored contiguously is child, this is stored in
> std::vector<std::unique_ptr>.
> But when N keys are erased, the number of children erased is not O(N),
> because a node can have #(fanout) keys.
>
> Best,
> Jooseong
>
> 2022년 7월 26일 (화) 오후 11:40, Phil Endecott via Boost <boost_at_[hidden]>님이
> 작성:
>
>> Jooseong Lee wrote:
>> >> > I don't see how erase(a,b) can be O(log N), is that a typo?
>> >>
>> >> No, it's not.
>> >> In order to explain this, I have to explain split() first:
>> >> split(Tree&& tree, key a, key b) splits a tree to two trees, say tree_l
>> >> and tree_r, so that
>> >> tree_l contains all keys less than a, tree_r contains all keys greater
>> >> than b.
>> >> By B-Tree's nature, this is O(log N).
>> >>
>> >> erase(a, b) works like this:
>> >> 1. compute lb = lower_bound(a), ub = upper_bound(b) pair. These are
>> >> iterators. O(log N)
>> >> 2. compute order(lb), order(ub). Therefore, order(ub) - order(lb) is
>> the
>> >> number of elements that will be erased. O(log N)
>> >> 3-1. When order(ub) - order(lb) < 30, the tree erases element one by
>> one
>> >> 3-2. When it exceeds 30, the algorithm first splits the tree to two
>> trees
>> >> by calling spit(std::move(*this), lb, ub)
>> >> and joins the two split trees. split() and join() is O(log N), and
>> there
>> >> is one split() call and join() call,
>> >> so the whole complexity is O(log N)
>>
>> You have to call the destructors on the elements that you have erased,
>> so it must be O(N) in general. Your description skips the part where you
>> drop the portion of the tree between the two splits.
>>
>> If the element type has a trivial destructor and they are stored
>> contiguously,
>> then you can try to argue that it is better than O(N). This will depend on
>> the complexity guarantees of your memory allocator. But your elements
>> are not
>> stored contiguously; destroying N elements will require O(N) "pages" to be
>> released, if your pages are fixed-size. So it is still O(N).
>>
>>
>> Regards, Phil.
>>
>>
>>
>>
>>
>> _______________________________________________
>> Unsubscribe & other changes:
>> http://lists.boost.org/mailman/listinfo.cgi/boost
>>
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk