From: Beman Dawes (bdawes_at_[hidden])
Date: 2005-06-02 07:16:59


"Chris Frey" <cdfrey_at_[hidden]> wrote in message
news:20050602055255.GA32207_at_netdirect.ca...
> Hi,
>
> I recently had a need to do directory lookups in C++ and thought I'd
> take a look at boost::filesystem. I ran the sample_ls.cpp from the
> documentation, and it worked great.
>
> The problem is that directory_iterator appears to return a pointer to a
> path object. This path object is then passed to functions such as
> is_directory() to find the type.
>
> On systems that support type information in the directory entry itself,
> this structure limits what data can be returned in an iteration.
>
> I would suggest something like (roughly):
>
> class path;
> class dirent;
> class directory_iterator {
> // ...
> const dirent * operator-> () const;
> // ...
> };
>
> class dirent {
> // some easy method to convert to path
> operator path ();
> // or perhaps a safer method would just be
> // to duplicate the path functions
> const std::string & leaf() const; // etc.
>
> // and then possible optimized versions
> bool is_directory() const;
>
> // ...
> };
>
> The members of dirent would mirror the available functions that use
> the path object. If no optimization is possible, it just calls what
> the user would have had to call anyway. But if it is possible to
> optimize certain items (such as a struct dirent containing d_type),
> this would be used, possibly saving a call to stat().
>
> Normally optimization should be left as implementation details... but
> in this case, I believe the class design limits what optimization
> is allowed. I would be pleased if I'm wrong on this.
>
> Thanks for reading this far.

The question has come up in the past, although I don't recall anyone
proposing exactly the solution you suggest. So the comments that follow are
based on considerable thought, although that doesn't always mean much.

Here is the analysis:

* That is a lot of additional interface complexity to support an
optimization that applies to Windows but not POSIX. Some of the other
schemes (which involved additional overloads to specific operations
functions) had less visible impact on the interface.

* There have been no timings indicating that the inefficiency of the current
interface actually impacts production applications.

* AFAICS, there is nothing in the current interface which prevents caching
of directory entries. Caching would probably aid more use cases than
proposed changes to interfaces. But both caching and user dirent storage
introduce serious additional race conditions. Not a good thing. In fact a
showstopper unless cache management is introduced, further complicating the
interface.

* There is another outstanding issue (lack of directory iterator filtering
and/or globbing) that does in fact impact both timing and ease of use of
real-world applications and so is the highest priority for future work.

Those are enough concerns to make the current interface look pretty good,
IMO.
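
For readers who want to see concretely what kind of saving is under discussion, here is a rough sketch of the d_type shortcut mentioned in the quoted message. It assumes a Linux/BSD-style readdir() whose struct dirent carries d_type, which POSIX does not guarantee, and it falls back to stat() whenever the type is unknown or the entry is a symbolic link. The helper names entry_is_directory and list_subdirectories are hypothetical and are not part of boost::filesystem.

```cpp
#include <dirent.h>
#include <sys/stat.h>

#include <string>
#include <vector>

// Hypothetical helper: is `entry`, read from directory `dir`, a directory?
// Uses d_type when the platform provides it and the file system filled it in;
// otherwise (or for symlinks, which stat() follows) it pays for a stat() call.
inline bool entry_is_directory(const std::string& dir, const dirent& entry) {
#if defined(DT_UNKNOWN) && defined(DT_DIR) && defined(DT_LNK)
    if (entry.d_type != DT_UNKNOWN && entry.d_type != DT_LNK)
        return entry.d_type == DT_DIR;   // answered without touching the inode
#endif
    struct stat st;
    const std::string full = dir + "/" + entry.d_name;
    return ::stat(full.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}

// Hypothetical helper: names of the subdirectories of `dir`,
// skipping "." and "..". Returns an empty vector if `dir` cannot be opened.
inline std::vector<std::string> list_subdirectories(const std::string& dir) {
    std::vector<std::string> result;
    DIR* handle = ::opendir(dir.c_str());
    if (!handle) return result;
    while (const dirent* entry = ::readdir(handle)) {
        const std::string name = entry->d_name;
        if (name == "." || name == "..") continue;
        if (entry_is_directory(dir, *entry)) result.push_back(name);
    }
    ::closedir(handle);
    return result;
}
```

Note that the type reported by readdir() is a snapshot taken when the entry was read, so holding on to it has the same staleness problem as the caching discussed above: the entry may have been replaced by the time the caller acts on the answer.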

Thanks for your interest,

--Beman

