Boost logo

Boost :

From: degski (degski_at_[hidden])
Date: 2020-02-09 13:27:02


On Sun, 9 Feb 2020 at 07:18, degski <degski_at_[hidden]> wrote:

> On Sat, 8 Feb 2020 at 11:43, Joana Modesto Hrotko via Boost <
> boost_at_[hidden]> wrote:
>
>> Dear Boost community,
>>
>> I am a Master's student and currently working on my Master thesis.
>> My Master Thesis consists in implementing the k2-tree data structure in
>> C++ and one of the additional proposals for my thesis would be to
>> integrate my code with the boost graph library.I believe this
>> integration
>> would bring great value to the boost library and also it would provide a
>> better validation for my work.
>> So my question is if this would be something of interest for the library
>> to integrate with boost graph library.
>>
>> INFO ABOUT K2-TREE:
>> The k2-tree have been proved successful to represent in a very compact
>> way different kinds of binary relations, such as web graphs, RDFs or
>> raster data.
>> Binary relations can be an abstraction to represent the relation between
>> the objects of two collections of different nature: graphs, trees,
>> strings, among others. They can be used in several low-level structures
>> within a more complex information retrieval system, or even replace one
>> of the most used one: an inverted index can be regarded as a binary
>> relation
>> between the vocabulary of terms and the documents where they appear.
>> Moreover,
>> it can represent the relation between terms and web pages, or even
>> social
>> networks or the connection between the web pages in the Web, which is
>> normally
>> represented as a web graph. k2-trees have been also used in other
>> scenarios,
>> such as geographical and RDF data, or images.
>>
>> Brisaboa, Nieves R., et al. "Compressed representation of dynamic binary
>> relations with applications." Information Systems 69 (2017): 106-123.
>> Coimbra, Miguel E., et al. "On dynamic succinct graph representations."
>> arXiv preprint arXiv:1911.03195 (2019).
>>
>
> Great subject for a thesis. Munro, the man of the beap
> <https://github.com/pfalcon/beap> (a dynamic implicit data structure
> <https://github.com/pfalcon/awesome-implicit-data-structures>). I don't
> really understand why you would like a dynamic k2-tree, you can just as
> well have a static one
> <https://github.com/degski/KD-Tree/blob/master/KD-Tree/ikdtree.hpp>
> [Eytzinger-tree, the levels are implicit and the sorting is recursively
> done, partially ( with std::nth_element, (in-place) sorting around the
> median (the nth-element) ) along the way, getting more sorted on lower
> levels, on my laptop I can easily run the recursion, without exhausting
> stack-space, till my heap space starts to run out (with default stack size
> settings)] and copy the data on re-balancing (or re-balance the tree on
> copying), as, in general, one insertion will completely change the tree,
> due to the alternating dimensions, and the mechanics of updating is
> out-performed very quickly by the recursive (in place) construction. Having
> said that, I think there is still some (maybe a lot of) mileage into
> exploring how these things can/should be well implemented in C++17.
>
> Best of luck with your thesis.
>
> degski
> --
> @realdegski
> https://brave.com/google-gdpr-workaround/
> "We value your privacy, click here!" Sod off! - degski
> "Anyone who believes that exponential growth can go on forever in a finite
> world is either a madman or an economist" - Kenneth E. Boulding
> "Growth for the sake of growth is the ideology of the cancer cell" -
> Edward P. Abbey
>

I just realized, one can completely (and simply) hide the
copying/re-construction behind a dynamic (vector-like) interface, and the
dynamic implicit kd-tree is a fact. This is now on my TODO-list. KD-trees
of dimensions 1, 2 and 3 seem to be the norm (at 4 dimensions one runs
already in the curse of dimensionality, and linear search does better).

degski

-- 
@realdegski
https://brave.com/google-gdpr-workaround/
"We value your privacy, click here!" Sod off! - degski
"Anyone who believes that exponential growth can go on forever in a finite
world is either a madman or an economist" - Kenneth E. Boulding
"Growth for the sake of growth is the ideology of the cancer cell" - Edward
P. Abbey

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk