From: Beman Dawes (bdawes_at_[hidden])
Date: 2006-03-04 17:09:36


"Ion Gaztañaga" <igaztanaga_at_[hidden]> wrote in message
news:4409CCA4.6050107_at_gmail.com...
> Hi to all,
>
> While revising a bit the boost::filesystem error/exception use for
> looking a model for interprocess library and after reading the N1934
> "Filesystem Library Proposal for TR2 (Revision 2)" I'm a bit concerned
> about the heavy use of std::string/string_type in the library.
>
> The following functions return some kind of path as value type:
>
> // observers
> const string_type string() const;
> const string_type file_string() const;
> const string_type directory_string() const;
>
> const external_string_type external_file_string() const;
> const external_string_type external_directory_string() const;

They are specified so that an implementation may return either by const
value or by const reference. The Boost implementation is more concerned with
proving the interface than with raw efficiency, so it returns by const value.
A POSIX implementation using narrow-character paths, for example, needs no
conversion at all, so it could return by const reference.
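
A minimal sketch of the two return styles, for illustration only. The
classes ref_path and value_path are hypothetical stand-ins, not the
Boost.Filesystem classes, and the backslash-to-slash conversion is just an
assumed example of an implementation whose stored form differs from the
form it hands out.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; not the Boost.Filesystem classes.

// Stores the path in the same form it hands out, as a POSIX
// narrow-character implementation could, so no conversion is needed.
class ref_path {
public:
    explicit ref_path(const std::string& s) : path_(s) {}
    // Returns a reference to the stored string: no allocation, no copy.
    const std::string& string() const { return path_; }
private:
    std::string path_;
};

// Stores the path in a different internal form (assumed here: backslash
// separators) and converts on demand, so it has to return by value.
class value_path {
public:
    explicit value_path(const std::string& s) : native_(s) {}
    const std::string string() const {
        std::string out(native_);
        for (std::string::size_type i = 0; i < out.size(); ++i)
            if (out[i] == '\\') out[i] = '/';
        return out;  // a fresh string each call
    }
private:
    std::string native_;
};

inline void demo_return_styles() {
    ref_path r("/usr/lib");
    assert(&r.string() == &r.string());     // same stored object every call
    value_path v("C:\\boost\\libs");
    assert(v.string() == "C:/boost/libs");  // converted copy
}
```

Caller code such as "string_type str = path.string();" compiles unchanged
against either class, which is why the specification can leave the choice
to the implementation.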

> string_type root_name() const;
> string_type root_directory() const;
> basic_path root_path() const;
> basic_path relative_path() const;
> string_type leaf() const;
> basic_path branch_path() const;
>
> I know that syntactically this is nicer and we can have RVO:
>
> string_type str = path.string();
>
> But when iterating through directories I find that returning a
> temporary object that must allocate/copy/deallocate hurts my performance
> paranoia.

As Caleb points out, it is premature optimization to talk about "hurting
performance" in the absence of timings in realistic use scenarios.

That said, if you can come up with a realistic use case that really does
show significant slow-down compared to some alternate interface, it would be
worth talking about.
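
One way such a comparison might be set up is sketched below. Everything in
it is an assumption made for illustration: sample_path, time_loop and the
two accessor functions are hypothetical names, and a loop this small is a
micro-benchmark, not the realistic use case asked for above. An optimizer
may also remove part of the work, so any numbers from it should be treated
with suspicion.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in offering both return styles for comparison.
struct sample_path {
    std::string s;
    std::string by_value() const { return s; }       // copies the string
    const std::string& by_ref() const { return s; }  // no copy
};

struct timing {
    std::size_t total;  // sum of lengths, returned so the work has a visible result
    double seconds;     // elapsed wall-clock time
};

// Calls get_len on every path, `rounds` times over, and times the whole loop.
template <class F>
timing time_loop(const std::vector<sample_path>& paths, int rounds, F get_len) {
    assert(rounds > 0);
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    std::size_t total = 0;
    for (int r = 0; r < rounds; ++r)
        for (std::size_t i = 0; i < paths.size(); ++i)
            total += get_len(paths[i]);
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return timing{total, elapsed.count()};
}

inline std::size_t len_by_value(const sample_path& p) { return p.by_value().size(); }
inline std::size_t len_by_ref(const sample_path& p) { return p.by_ref().size(); }
```

Calling time_loop(paths, rounds, len_by_value) and time_loop(paths, rounds,
len_by_ref) on the same vector gives two timings to compare. To be
convincing, the paths would need to come from a real directory walk, with
the file system calls included in the measurement.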

>...
>
> Apart from this I see that path::iterator has a string member.
> dereference will return a reference to that member but an iterator is
> supposed to be a "lightweight" pointer-like abstraction, which is
> value-copied between functions. A string member, in my opinion, converts
> an iterator in a heavy class (that depends on the string length, but an
> small string optimization of 16 bytes is not going to help much).

That's an implementation detail. It isn't required by the spec, although
that may be the most obvious way to implement the spec. An alternate
implementation would be to keep a pool of directory entry objects and
recycle them if performance were a concern. It would be great if Boost had a
cache library to make such a strategy trivial to implement.

If there is a way to modify the interface to make it easier to create
high-performance implementations, that would be of interest as long as it
didn't do too much violence to the usual STL idioms.
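
To make the trade-off concrete, here is a sketch of one lightweight
alternative. It is a hypothetical element_cursor, not the Boost.Filesystem
path::iterator and not a proposal from the spec. It holds only a pointer
and two offsets, so copying it costs the same regardless of path length,
but its dereference has to return the element by value instead of by
reference, which is exactly the kind of departure from the usual STL
iterator idiom mentioned above. It assumes '/' as the only separator and
ignores root-name and root-directory handling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration only; not the Boost.Filesystem path::iterator.
// Walks the '/'-separated elements of a string without owning a string.
class element_cursor {
public:
    // The source string must outlive the cursor (do not pass a temporary).
    explicit element_cursor(const std::string& src)
        : src_(&src), begin_(0), end_(0) { advance(); }

    // True once every element has been visited.
    bool done() const { return begin_ >= src_->size(); }

    // Builds the current element on demand; returned by value because
    // the cursor has no string member to refer to.
    std::string current() const {
        assert(!done());
        return std::string(*src_, begin_, end_ - begin_);
    }

    void next() { advance(); }

private:
    // Moves [begin_, end_) to the next run of non-separator characters.
    void advance() {
        begin_ = end_;
        while (begin_ < src_->size() && (*src_)[begin_] == '/') ++begin_;
        end_ = begin_;
        while (end_ < src_->size() && (*src_)[end_] != '/') ++end_;
    }

    const std::string* src_;  // not owned
    std::size_t begin_;       // start of current element
    std::size_t end_;         // one past the end of current element
};
```

The cost has moved, not vanished: each call to current() still constructs
a string, so whether this helps depends on how often elements are actually
read compared with how often the iterator is copied.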

> Now that filesystem is proposed for the standard I would like to ask
> boosters (and Beman, of course) if they find these performance concerns
> serious enough.

The more scrutiny the better!

Thanks for the comments,

--Beman

