Boost logo

Boost Users :

From: Rush Manbert (rush_at_[hidden])
Date: 2005-11-14 12:53:46


Hi all,

I have written a class around the glob_iterator class that W. Richard
Johnson put into the file vault last year. My class finds all the paths
and/or files matching a multi-level wildcard pattern. (i.e. the matching
expressions may appear in more than one place, such as
"/Users/*/MySource/??foo/*.cpp")

I need to detect loops caused by links between directories, and I also
want to detect multiple paths to the same file, and only keep one of
them. The only method that I have come up with so far is to keep the
paths that I have found in a collection and test each new path against
all of them by calling equivalent() once for each collection member.
Needless to say, this is a little slow when the number of files or
directories becomes large. It could be faster if I only tested symlinks,
but I think that the existence of hard links forces me to test everything.

My first question: Can anyone suggest a better method of achieving the
duplicates detection?

My second question (This is probably directed at Beman Dawes): I read
the code for equivalent(), and it seems like I could derive a class from
path that would implement operator< in such a way that it sorted by the
equivalence information, rather than by the path string. If I could sort
the paths by their equivalence information, then I could binary search
the collection when I look for duplicates.

I would have to copy the code in equivalent(), which is an ugly thing to
do, and I guess I would need to define all of the operators that are
defined in path, since they are not virtual. Assuming that I did these
things, though, does this sound like it's even valid? From the comments
that I read in the equivalent() source, it seems that the file
equivalence information can change, at least on a windows system. I
guess I could create a handle_wrapper for each path that I add to the
collection in order to keep the file from being moved while it's in the
collection.

Of course, I can't be the first person who has gone down this path, or
the library would probably already support this sort of sorting. Is this
just a stupid idea?

Thanks,
Rush


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net