Boost logo

Boost :

Subject: Re: [boost] [review] Last day for formal review for Sort library
From: Steven Ross (spreadsort_at_[hidden])
Date: 2014-11-21 00:01:24


Thanks for your input Dupuis.

On Thu Nov 20 2014 at 10:32:46 AM DUPUIS Etienne <e.dupuis_at_[hidden]> wrote:

> Greetings,
>
> I do not intend to send a formal review but I nevertheless compiled and
> executed the provided test code, with clang 3.5. I added timings and
> confirmed that in the provided test case, this sort is significantly
> faster than std::sort() provided by the latest Linux Mint distribution.
>
> To the author, I would suggest two very minor issues :
>
> 1. There are compiler warnings that can easily be avoided (seen with
> '-Wunused-parameter' and '-Wold-style-cast').
>

I've fixed the ones in my examples and library in the develop branch
(thanks for explaining how to detect them). Some examples use other boost
libraries which include plenty of code with old-style casts.

> 2. In string_sort_test.cpp, tests are performed using an array of 32
> elements. 10 000 or more should make a more convincing test.
>
> Thanks for pointing that out! I've increased it to 100000, adding a
trivial amount to the total test+build time.

> I have also seen in the documentation that samples are available.
> Indeed, they are in git. However, I would appreciate samples in the
> documentation, without any useless I/O code (useless to understand how
> to use the sort functions).

I do have some basic usage snippets in the documentation, but the idea
behind the links to the full example code is that I'll maintain the code,
and nothing necessary to run the code is missing. Would it help if I
created a library to pull most of the I/O code out of the examples, so that
most of what is visible is sorting?

> For example, taken from 'keyplusdatasample.cpp' :
>
> // --------------- Start of code snippet ------------------------
>
> struct DATA_TYPE {
> int key;
> std::string data;
> };
> struct lessthan {
> inline bool operator()(const DATA_TYPE &x, const DATA_TYPE &y) const
> {
> return x.key < y.key;
> }
> };
>
> struct rightshift {
> inline int operator()(const DATA_TYPE &x, const unsigned offset) {
> return x.key >> offset;
> }
> };
>
> std::vector<DATA_TYPE> array;
> integer_sort(array.begin(), array.end(), rightshift(), lessthan());

I'm confused. This code snippet is almost identical to the one at the
bottom of the integer_sort.html page. What am I missing?

> // --------------- End of code snippet ------------------------
> Having concise code snippets like the above in the documentation is in
> my opinion the best way to have users quickly understand how the library
> must be used.
>
> I would also like to see one or two more example using "complex" data
> types. Consider for example :
>
> // --------------- Start of code snippet ------------------------
> struct Md5Sum {
> uint8_t bytes[16];
>
> bool operator<(const Md5Sum& md5) const {
> for (int k = 0; k < 16; k++)
> if (bytes[k] < md5.bytes[k])
> return true;
> return false;
> }
> };
>

This seems very similar to the structure example at the bottom of
string_sort.html. How is this example superior to that one?

>
> // --------------- End of code snippet ------------------------
> After reading your documentation and the provided samples, I understand
> that I should use string_sort() after defining some get_char() and
> get_length() functions. Am I right ?
>

Yes

>
> Another example, sorting by birth data and then by name :
>
> // --------------- Start of code snippet ------------------------
> struct Id {
> std::string name;
> time_t birth;
>
> bool operator<(const Id& id) const {
> return (birth < id.birth) || ((birth == id.birth) && (name <
> id.name));
> }
> };

// --------------- End of code snippet ------------------------
> Again, I understand that I must use string_sort and split the birth date
> into 8-bit components. Am I right ?
>

That's a good idea! I'll add such an example if the library is deemed
worthy of including in boost.

>
> These samples are important as there is probably much more people
> sorting structures than simple integers.
>
> You're probably right. Key-plus-data is how most of my sorts at work
operate.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk