From: Beman Dawes (bdawes_at_[hidden])
Date: 2006-03-05 21:35:58

"Ion Gaztañaga" <igaztanaga_at_[hidden]> wrote in message
> Hi Beman,
>> As Caleb points out, it is premature optimization to talk about "hurting
>> performance" in the absence of timings in realistic use scenarios.
>> That said, if you can come up with a realistic use case that really does
>> show significant slow-down compared to some alternate interface, it would
>> be worth talking about.
> Ok. I see it as "premature pessimization", but you are right about
> needing a realistic use case. Recursively scanning a directory for files
> that have a given extension (for example, looking for mp3 files) is in my
> opinion a realistic use case. Obviously, looking for files will be
> slower than returning "path.leaf()" (although maybe the OS caches
> directory entries in memory), but apart from the speed, I think the
> important point is the memory stress you cause by creating a temporary
> every time you want to obtain the name of the file. The filesystem
> operations may be slower, but surely the OS is carefully avoiding heap
> fragmentation using internal pools, while the user is creating a lot of
> temporaries. I will try to implement this use case if you agree.

That's a use case that I would be interested in.
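
For illustration, here is a minimal sketch of the scan described in the
quoted use case. It is an assumption-laden stand-in: it uses C++17
std::filesystem (the standardized descendant of Boost.Filesystem) rather
than the 2006 Boost interface under discussion, and the helper names
has_mp3_extension and count_mp3_files are hypothetical. Note that the
extension check builds a temporary string for every directory entry, which
is the kind of per-file allocation the quoted message is worried about.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: case-insensitive test for a ".mp3" extension.
// Each call materializes a temporary std::string holding the extension.
inline bool has_mp3_extension(const fs::path& p) {
    std::string ext = p.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return ext == ".mp3";
}

// Hypothetical helper: count regular files with an mp3 extension
// anywhere below root. Precondition: root names an existing directory.
inline std::size_t count_mp3_files(const fs::path& root) {
    assert(fs::is_directory(root));
    std::size_t count = 0;
    for (const fs::directory_entry& entry :
         fs::recursive_directory_iterator(root)) {
        if (entry.is_regular_file() && has_mp3_extension(entry.path())) {
            ++count;
        }
    }
    return count;
}
```

Timing count_mp3_files over a large music directory, against a variant
that avoids the per-entry temporary, would be one way to obtain the
realistic numbers asked for above.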

But also remember that the objectives of the library include ease-of-use for
script-like programs, and in general valuing clean design over the last iota
of efficiency. I'd also like it to feel familiar to standard library users.

I have a long personal history of adding kinky little bits and pieces to
designs (mostly for efficiency) and then regretting it later.

> The "path" class is also a class not tied to disk operations (by
> the way, we can mount a filesystem in memory so operations can be fast,
> or it can represent shared memory, following the "Everything is a file"
> UNIX philosophy). Is it realistic to store a lot of path objects in a
> container and request operations like leaf(), root(), etc.? I don't
> know. But I see the path class as a pseudo-container of strings
> representing a hierarchy. A path could represent a file or any other
> hierarchy in the operating system, because it is quite generic.
>>> Apart from this, I see that path::iterator has a string member.
>>> Dereferencing returns a reference to that member, but an iterator is
>>> supposed to be a "lightweight" pointer-like abstraction that is
>>> copied by value between functions. A string member, in my opinion, turns
>>> an iterator into a heavy class (how heavy depends on the string length,
>>> but a small string optimization of 16 bytes is not going to help much).
>> That's an implementation detail. It isn't required by the spec, although
>> that may be the most obvious way to implement the spec. An alternate
>> implementation would be to keep a pool of directory entry objects and
>> recycle them if performance was a concern. It would be great if Boost had
>> a cache library to make such a strategy trivial to implement.
> You are right. I need to concentrate on the interface. As a comment from
> looking at the code: since the iterator returns a const string reference,
> you could also add a vector of strings to the path class, so that the
> iterator could be the const_iterator of the vector. You could avoid the
> string member and have trivial increment/copy operations. You would be
> requiring more memory for the path class, though.

Yes, if you are willing to expend more memory in the path class itself, you
can gain a lot of theoretical speed, particularly on non-POSIX
implementations where the portable syntax and the native syntax differ.
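
The memory-for-speed trade-off can be sketched as follows. This is a
hypothetical class, split_path, and not the Boost path implementation: it
assumes a portable-syntax string with '/' separators, ignores root names
and root directories, and stores each element once so that the iterator is
simply the vector's const_iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration only: a path-like class that splits its
// elements once, at construction, and keeps them in a vector.
class split_path {
public:
    using const_iterator = std::vector<std::string>::const_iterator;

    // Splits on '/' and skips empty elements (so "a//b" yields "a", "b").
    explicit split_path(const std::string& portable) {
        std::string::size_type start = 0;
        while (start <= portable.size()) {
            std::string::size_type end = portable.find('/', start);
            if (end == std::string::npos) end = portable.size();
            if (end > start) {
                elements_.push_back(portable.substr(start, end - start));
            }
            start = end + 1;
        }
    }

    // Iteration is plain vector iteration: copying or incrementing an
    // iterator never copies a string.
    const_iterator begin() const { return elements_.begin(); }
    const_iterator end() const { return elements_.end(); }
    std::size_t size() const { return elements_.size(); }

    // Returns the last element by reference, so no temporary is created.
    // Precondition: the path has at least one element.
    const std::string& leaf() const {
        assert(!elements_.empty());
        return elements_.back();
    }

private:
    std::vector<std::string> elements_;  // extra memory held per path
};
```

The cost is visible in the data member: every split_path holds one string
per element, where a single-string representation would hold just one
string for the whole path.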

