From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2006-03-05 05:18:11


Hi Beman,

> As Caleb points out, it is premature optimization to talk about "hurting
> performance" in the absence of timings in realistic use scenarios.
>
> That said, if you can come up with a realistic use case that really does
> show significant slow-down compared to some alternate interface, it would be
> worth talking about.

OK. I see it more as "premature pessimization", but you are right that
a realistic use case is needed. Recursively scanning a directory for
files with a given extension (for example, looking for mp3 files) is, in
my opinion, a realistic use case. Obviously, the filesystem lookup will
be slower than returning "path.leaf()" (although the OS may cache
directory entries in memory), but apart from speed, I think the
important point is the memory stress caused by creating a temporary
every time you want to obtain the name of a file. The filesystem
operations may be slower, but the OS is surely careful to avoid heap
fragmentation by using internal pools, while the user is creating a lot
of temporaries. I will try to implement this use case if you agree.

The "path" class is also not tied to disk operations (by the way, a
filesystem can be mounted in memory, so operations can be fast, or a
path can refer to shared memory, following the UNIX "everything is a
file" philosophy). Is it realistic to store a lot of path objects in a
container and request operations like leaf(), root(), etc.? I don't
know. But I see the path class as a pseudo-container of strings
representing a hierarchy. A path could represent a file or any other
hierarchy in the operating system, because it is quite generic.

>> Apart from this I see that path::iterator has a string member.
>> dereference will return a reference to that member but an iterator is
>> supposed to be a "lightweight" pointer-like abstraction, which is
>> value-copied between functions. A string member, in my opinion, turns
>> an iterator into a heavy class (that depends on the string length, but a
>> small string optimization of 16 bytes is not going to help much).
>
> That's an implementation detail. It isn't required by the spec, although
> that may be the most obvious way to implement the spec. An alternate
> implementation would be to keep a pool of directory entry objects and
> recycle them if performance was a concern. It would be great if Boost had a
> cache library to make such a strategy trivial to implement.

You are right. I need to concentrate on the interface. As a comment
from looking at the code: since the iterator returns a const string
reference, you could also add a vector of strings to the path class, so
that the iterator could be the const_iterator of the vector. You could
avoid the string member and have trivial increment/copy operations. The
path class would need more memory, though.
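
A minimal sketch of that idea, to make the trade-off concrete. The class
name segmented_path and its members are hypothetical and are not the
Boost.Filesystem interface. The sketch also simplifies heavily: it
splits on a single separator character and ignores root names and empty
elements.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class: stores the path elements in a vector so that the
// iterator is simply the vector's const_iterator.
class segmented_path {
public:
    typedef std::vector<std::string>::const_iterator iterator;

    explicit segmented_path(const std::string& text, char sep = '/')
    {
        std::string::size_type start = 0;
        while (start <= text.size()) {
            std::string::size_type end = text.find(sep, start);
            if (end == std::string::npos) end = text.size();
            // Skip empty elements produced by leading, trailing, or
            // repeated separators.
            if (end > start) elements_.push_back(text.substr(start, end - start));
            start = end + 1;
        }
    }

    // Cheap to copy and increment: no string member in the iterator.
    iterator begin() const { return elements_.begin(); }
    iterator end() const { return elements_.end(); }

    // Returns the last element by const reference, without a temporary.
    const std::string& leaf() const
    {
        assert(!elements_.empty());
        return elements_.back();
    }

private:
    std::vector<std::string> elements_;  // the extra memory cost
};
```

The cost moves from the iterator to the path object: each path now owns
one string per element, which is the extra memory mentioned above.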

> Thanks for the comments,

Thanks for your quick reply,

Ion

