From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2006-03-04 12:21:40


Hi to all,

  While reviewing how boost::filesystem uses errors and exceptions,
looking for a model for the interprocess library, and after reading N1934
"Filesystem Library Proposal for TR2 (Revision 2)", I have become a bit
concerned about the heavy use of std::string/string_type in the library.

The following functions return a string or a path by value:

// observers
const string_type string() const;
const string_type file_string() const;
const string_type directory_string() const;

const external_string_type external_file_string() const;
const external_string_type external_directory_string() const;

string_type root_name() const;
string_type root_directory() const;
basic_path root_path() const;
basic_path relative_path() const;
string_type leaf() const;
basic_path branch_path() const;

I know that syntactically this is nicer and we can have RVO:

string_type str = path.string();

But when iterating through directories I find that returning a
temporary object that must allocate/copy/deallocate hurts my performance
paranoia. Even with move semantics we have an overhead:

std::vector<std::path> paths;
//fill with paths

std::vector<std::path>::iterator it = paths.begin(), end = paths.end();

for(; it != end; ++it){
    std::path::string_type str = it->root_name();//temporary created
    str += "append some data";
    std::cout << str;
}

Wouldn't it be better (although uglier) to take a reference to a string
that will be filled?

void fill_root_name(string_type &root_name) const;
...

////////////////////////////////////////////////////////////////////

std::vector<std::path> paths;
//fill with paths

std::path::string_type root_name;

//reserve once, before the loop
root_name.reserve(PATH_LENGTH);

std::vector<std::path>::iterator it = paths.begin(), end = paths.end();

for(; it != end; ++it){
    it->fill_root_name(root_name);//overwrites the previous contents
    root_name += "append some data";
    std::cout << root_name;
}

This way we only allocate memory if the string does not already have
enough capacity. We can also reserve it beforehand to speed up the code.
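
To make the comparison concrete, here is a minimal, self-contained sketch
of both styles. Everything in it is hypothetical: toy_path is a stand-in
class and not the boost::filesystem or N1934 interface, the "root name is
everything up to the first ':'" rule is a simplification, and
count_capacity_changes is only a helper for observing the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a path class; NOT the boost::filesystem API.
class toy_path {
public:
    explicit toy_path(const std::string& s) : str_(s) {}

    // By-value observer: every call builds a fresh string.
    std::string root_name() const { return str_.substr(0, root_len()); }

    // Fill-style observer: overwrites the caller's string, reusing its buffer.
    void fill_root_name(std::string& out) const { out.assign(str_, 0, root_len()); }

private:
    // Simplification: the root name is everything up to and including
    // the first ':' (empty if there is none).
    std::size_t root_len() const {
        const std::size_t pos = str_.find(':');
        return pos == std::string::npos ? 0 : pos + 1;
    }

    std::string str_;
};

// Runs the fill-style loop and counts how often the buffer's capacity
// changed, as a rough proxy for reallocations.
std::size_t count_capacity_changes(const std::vector<toy_path>& paths,
                                   std::size_t reserve_hint) {
    std::string buf;
    buf.reserve(reserve_hint);
    assert(buf.capacity() >= reserve_hint);  // guaranteed by reserve()

    std::size_t changes = 0;
    for (std::vector<toy_path>::const_iterator it = paths.begin();
         it != paths.end(); ++it) {
        const std::size_t before = buf.capacity();
        it->fill_root_name(buf);
        buf += " (root)";
        if (buf.capacity() != before) ++changes;
    }
    return changes;
}
```

With a reserve hint larger than any result, I would expect
count_capacity_changes to report no changes on the usual standard library
implementations, although the standard does not strictly promise that
assign() keeps the capacity. The by-value root_name() may still need one
allocation per call whenever the result does not fit in a small string
buffer.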

Apart from this, I see that path::iterator has a string member.
Dereferencing returns a reference to that member, but an iterator is
supposed to be a "lightweight" pointer-like abstraction, which is
passed by value between functions. A string member, in my opinion, turns
an iterator into a heavy class (how heavy depends on the string length,
and a small string optimization of 16 bytes is not going to help much).
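
As a rough illustration of the size difference, compare two hypothetical
iterator layouts. Neither is the actual path::iterator; light_iter,
heavy_iter and extra_bytes_per_copy are names made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical layouts, for illustration only.

// Pointer-like iterator: a pointer to the path text plus an offset.
struct light_iter {
    const std::string* path;  // text being iterated
    std::size_t pos;          // offset of the current element
};

// Iterator that also caches the current element as a string.
struct heavy_iter {
    const std::string* path;
    std::size_t pos;
    std::string current;      // copied every time the iterator is copied
};

// Extra bytes in the iterator object itself. This does not count any
// heap buffer that 'current' owns once the element is too long for a
// small string buffer.
inline std::size_t extra_bytes_per_copy() {
    assert(sizeof(heavy_iter) >= sizeof(light_iter));
    return sizeof(heavy_iter) - sizeof(light_iter);
}
```

Copying a light_iter copies two words. Copying a heavy_iter also copies
the string, which may mean an allocation for long elements.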

Now that filesystem is proposed for the standard I would like to ask
boosters (and Beman, of course) if they find these performance concerns
serious enough.

Regards,

Ion

