From: Mountfort, Geoff (gmountfort_at_[hidden])
Date: 2002-02-21 22:37:55


>I'm not convinced this is the way to do it,
>but the conversions definitely need to happen automatically in a
>library somewhere

Some may consider that they should not happen automatically - consider
std::string.

Geoff

-----Original Message-----
From: mfdylan [mailto:dylan_at_[hidden]]
Sent: Friday, 22 February 2002 2:30 PM
To: boost_at_[hidden]
Subject: [boost] Re: Any interest in a filename (parser) class?

I'm interested in getting feedback on whether something like the
following is a workable design:

template <class E, class T = platform_dependent_default>
class basic_pathname
{
public:
        typedef std::basic_string<E> string;
// shorthand for this class, used in the declarations below
        typedef basic_pathname pathname;

// constructors etc.
        basic_pathname(const string& name);
        basic_pathname(const E* name);
        template <class U>
        basic_pathname(const basic_pathname<E, U>& other);
        ~basic_pathname();
        pathname& operator=(const pathname& other);

// full name as a string
        const string& str() const;

// everything up to and including the last '/', "." if none
        pathname dirname() const;

// everything from the first '.' after the last '/' onward
        pathname suffix() const;

// everything after the last '/', stopping at the first '.'
        pathname basename() const;

// everything after the last '/'
        pathname filename() const;

// everything after the last '.' after the last '/'
        pathname extension() const;

// everything up to and including the first '/', "" if none
        pathname rootname() const;

// true if any part of the filename ends with this pattern
        bool suffix_is(const string& pattern) const;

// sets the dirname as defined above, adds '/' if needed
        void dirname(const string& name);

// sets the suffix as defined above (should begin with '.')
        void suffix(const string& name);

// sets the basename as defined above
        void basename(const string& name);

// sets the filename as defined above
        void filename(const string& name);

// sets the extension as defined above, adds '.' if needed
        void extension(const string& name);

// sets the rootname as defined above
        void rootname(const string& name);

// platform-dependent check whether dir is fully specified
        bool is_absolute() const;

// inverse of above
        bool is_relative() const;

// generate an absolute path relative to wd
        pathname absolute(const pathname& wd) const;

// generate an absolute path relative to OS's cwd
        pathname absolute() const;

// determine relative path given wd
        pathname relative(const pathname& wd) const;

// determine relative path using OS's cwd
        pathname relative() const;

// true if path is empty
        bool empty() const;

// platform-dependent comparison
        int compare(const basic_pathname& other) const;
};
typedef basic_pathname<char> pathname;
typedef basic_pathname<wchar_t> wpathname;

// also allow operator==, operator!=, maybe >, >= and <, <=

The template parameters are E = char type (either char or wchar_t)
and T = platform-specific traits. An example:

template <class E>
struct pathname_style_win32
{
// path delimiters { \, / }
        static const E* delims();

// check if name starts with x:\ or \\ (x = any alpha)
        static inline bool is_absolute(const E* name);

// case insensitive comparison (maybe based on locale?)
        static inline int compare(const E* left, const E* right);
};
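
For comparison, a POSIX style could look roughly like the sketch below.
The name pathname_style_posix and the function bodies are assumptions
made for illustration; only the three member names come from the Win32
example above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical POSIX counterpart to pathname_style_win32.
template <class E>
struct pathname_style_posix
{
    // path delimiters { / }
    static const E* delims()
    {
        static const E d[] = { E('/'), E(0) };
        return d;
    }

    // a name is absolute if it starts with '/'
    static bool is_absolute(const E* name)
    {
        return name[0] == E('/');
    }

    // case-sensitive comparison: <0, 0 or >0 like strcmp
    static int compare(const E* left, const E* right)
    {
        typedef std::char_traits<E> traits;
        std::size_t ll = traits::length(left);
        std::size_t rl = traits::length(right);
        int r = traits::compare(left, right, ll < rl ? ll : rl);
        if (r != 0)
            return r;
        return ll < rl ? -1 : (ll > rl ? 1 : 0);
    }
};
```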

Currently the suffix separator '.' isn't parameterized, but doing so
is obviously trivial. Each path delimiter is restricted to a single
character, although you can specify several legal delimiters; the
first is the default used when, for instance, setting the dirname of
a plain filename or making a pathname absolute. I'm not pretending to
be able to support every possible filesystem (and certainly not VAX!) -
my main concerns are Win32, POSIX and Mac, the last of which I know
nothing about.

Some sample usage:

        pathname pname("./test.cpp");
        pathname s = pname.suffix(); // == ".cpp"
        s = pname.dirname(); // == "./"
        s = pname.basename(); // == "test"
        s = pname.filename(); // == "test.cpp"
        s = pname.extension(); // == "cpp"
        s = pname.rootname(); // == "./"
        bool b = pname.suffix_is(".cpp"); // true
        pname.suffix(".cxx"); // pname = "./test.cxx"
        pname.basename("file"); // pname = "./file.cxx"
        pname.dirname("c:\\temp"); // pname = "c:\temp\file.cxx"
        pname.rootname("d:\\"); // pname = "d:\temp\file.cxx"
        pname.extension("cpp"); // pname = "d:\temp\file.cpp"
        pname = "test"; // pname = "test"
        pname.extension("cpp"); // pname = "test.cpp"
        pname.dirname("."); // pname = ".\test.cpp"
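
To make the directory-related definitions concrete, here is a minimal
sketch of three of the accessors as free functions on std::string. The
names dirname_of, filename_of, rootname_of and path_delims are
hypothetical, and this is not the actual class implementation; it only
follows the comments in the class declaration.

```cpp
#include <string>

// Hypothetical helpers, not part of the proposed class.
// Both '/' and '\' are accepted as delimiters, as in the Win32 style.
static const char* const path_delims = "/\\";

// everything up to and including the last delimiter, "." if none
inline std::string dirname_of(const std::string& path)
{
    std::string::size_type last = path.find_last_of(path_delims);
    return last == std::string::npos ? std::string(".")
                                     : path.substr(0, last + 1);
}

// everything after the last delimiter
inline std::string filename_of(const std::string& path)
{
    std::string::size_type last = path.find_last_of(path_delims);
    return last == std::string::npos ? path : path.substr(last + 1);
}

// everything up to and including the first delimiter, "" if none
inline std::string rootname_of(const std::string& path)
{
    std::string::size_type first = path.find_first_of(path_delims);
    return first == std::string::npos ? std::string()
                                      : path.substr(0, first + 1);
}
```

With "./test.cpp" these give "./", "test.cpp" and "./" respectively,
matching the first few lines of the sample usage.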

Some rationales:

Why suffix() & extension()? Mainly because in the POSIX world
suffixes are generally considered to include the '.' and are
potentially multi-part (file.cpp.1), whereas Windows generally only
cares about what is after the last '.' (Explorer calls file.cpp.1
a '1' file). However they are both useful concepts regardless of
platform. Also suffix allows you to distinguish between 'file.'
and 'file', whereas extension doesn't.
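
The difference can be sketched with a few free functions that operate
on a filename with the directory part already removed. The names
suffix_of, extension_of and basename_of are hypothetical and only
mirror the definitions given in the class comments.

```cpp
#include <string>

// Hypothetical helpers; "name" is a filename with no directory part.

// everything from the first '.' onward, "" if there is no '.'
inline std::string suffix_of(const std::string& name)
{
    std::string::size_type dot = name.find('.');
    return dot == std::string::npos ? std::string() : name.substr(dot);
}

// everything after the last '.', "" if there is no '.'
inline std::string extension_of(const std::string& name)
{
    std::string::size_type dot = name.rfind('.');
    return dot == std::string::npos ? std::string()
                                    : name.substr(dot + 1);
}

// everything before the first '.', the whole name if there is no '.'
inline std::string basename_of(const std::string& name)
{
    return name.substr(0, name.find('.'));
}
```

For "file.cpp.1" the suffix is ".cpp.1" and the extension is "1". For
"file." the suffix is "." while for "file" it is "", whereas the
extension is "" in both cases.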

Why suffix_is? Mainly so you can call
pathname("file.cpp.1").suffix_is(".cpp.1") and
pathname("file.cpp.1").suffix_is(".1") and
pathname("file.1.cpp").suffix_is(".cpp")

and get what you (probably) expect (true in all cases).
But this is certainly a close call, and removing it would be no great
loss.
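
A minimal sketch of that behaviour, assuming suffix_is simply means
"the filename part ends with the pattern" (that is my reading of the
three examples above, not a confirmed definition). The free function
name_ends_with is hypothetical.

```cpp
#include <string>

// Hypothetical stand-in for suffix_is: true if the part of "path"
// after the last '/' or '\' ends with "pattern".
inline bool name_ends_with(const std::string& path,
                           const std::string& pattern)
{
    std::string::size_type last = path.find_last_of("/\\");
    std::string name =
        last == std::string::npos ? path : path.substr(last + 1);
    if (name.size() < pattern.size())
        return false;
    return name.compare(name.size() - pattern.size(),
                        pattern.size(), pattern) == 0;
}
```

Note that this version compares case-sensitively; the real class would
presumably go through the style's compare function instead.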

Why doesn't basename work like POSIX basename()? POSIX basename
returns the whole filename portion unless you explicitly give it a
suffix to remove, so

basename("/home/dylan/file.txt", "") == "file.txt"
and
basename("/home/dylan/file.txt", ".txt") == "file"
 
I was originally going to have pathname::basename do the same thing
but I honestly couldn't think of a single case where I wanted this
functionality. At any rate it's easy to do:

pname.suffix() == ".txt" ? pname.basename() : pname.filename();
 
It's true, however, that this won't let you strip off ".txt" if it is
part of a multi-part suffix:

"/home/dylan/file.1.txt"

Currently the only way to do this is to use extension(""), which will
strip off the last suffix and its '.'. I agree this may not be
entirely obvious.
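
As an illustration of what the extension setter does in that case, here
is a sketch written as a free function. The name with_extension is
hypothetical, and the handling of a leading '.' in the argument is an
assumption based on the "adds '.' if needed" comment.

```cpp
#include <string>

// Hypothetical stand-in for the extension(name) setter: returns "path"
// with everything after the last '.' of the filename replaced by "ext".
// An empty "ext" removes the last extension together with its '.'.
inline std::string with_extension(std::string path,
                                  const std::string& ext)
{
    std::string::size_type last = path.find_last_of("/\\");
    std::string::size_type start =
        last == std::string::npos ? 0 : last + 1;
    std::string::size_type dot = path.rfind('.');
    // only a '.' inside the filename part counts as an extension dot
    if (dot != std::string::npos && dot >= start)
        path.erase(dot);
    if (ext.empty())
        return path;
    if (ext[0] != '.')
        path += '.';
    return path + ext;
}
```

So with_extension("/home/dylan/file.1.txt", "") gives
"/home/dylan/file.1", and with_extension("test", "cpp") gives
"test.cpp".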

Why do dirname() etc return a pathname instead of a string?
Mainly to allow

pname.dirname().dirname().dirname() etc.
or
pname.suffix().suffix() etc.

or

pathname p("/home/dylan/file.txt");
p = p.basename().dirname("/tmp");

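and get "/tmp/file" (this relies on the setters returning a reference
to the pathname rather than void as declared above).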

Of course it would be simple to do even if these functions returned
strings instead:

pathname p("/home/dylan/file.txt");
p = p.basename();
p.dirname("/tmp");

But perhaps the clincher is being able to do

p.suffix() == ".cpp"

and be sure that it will use the supplied comparison function, which
might be case insensitive.
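
A case-insensitive comparison of the kind a Win32 style might supply
could be sketched as follows. The function compare_nocase is
hypothetical; it uses the C locale's tolower on single bytes, which is
only one possible answer to the "based on locale?" question raised
earlier.

```cpp
#include <cctype>
#include <string>

// Hypothetical case-insensitive comparison: <0, 0 or >0 like strcmp.
inline int compare_nocase(const std::string& left,
                          const std::string& right)
{
    std::string::size_type n =
        left.size() < right.size() ? left.size() : right.size();
    for (std::string::size_type i = 0; i < n; ++i) {
        // cast to unsigned char first; tolower is undefined for
        // negative values other than EOF
        int l = std::tolower(static_cast<unsigned char>(left[i]));
        int r = std::tolower(static_cast<unsigned char>(right[i]));
        if (l != r)
            return l < r ? -1 : 1;
    }
    if (left.size() == right.size())
        return 0;
    return left.size() < right.size() ? -1 : 1;
}
```

With a comparison like this, a suffix of ".CPP" compares equal to
".cpp", which a plain std::string comparison would not do.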

I've basically designed the class around pathname manipulations that
I have needed to do, usually repeatedly, in my own work.
Unfortunately it's hard to judge whether others have similar needs
and whether this particular design makes it awkward to do common
operations.

One thing I have (experimentally!) also added is automatic conversion
to const char* AND const wchar_t* regardless of the templated char
type. This is because one would expect to be able to pass a pathname
into filesystem manipulation functions, which may only be available
in one particular flavour (usually ASCII only, but wchar_t* only if
an NT/2000 unicode app). I'm not convinced this is the way to do it,
but the conversions definitely need to happen automatically in a
library somewhere.
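
A rough sketch of what such a dual conversion might look like, using a
hypothetical class dual_name that is not part of the proposal. It
assumes plain ASCII names, so that each char can be widened directly;
real code would need a proper locale-aware conversion.

```cpp
#include <string>

// Hypothetical illustration: holds both a narrow and a wide copy of
// the name and converts implicitly to either pointer type.
class dual_name
{
public:
    // ASSUMPTION: "s" is plain ASCII, so element-wise widening is safe
    explicit dual_name(const std::string& s)
        : narrow_(s), wide_(s.begin(), s.end()) {}

    operator const char*() const { return narrow_.c_str(); }
    operator const wchar_t*() const { return wide_.c_str(); }

private:
    std::string narrow_;
    std::wstring wide_;
};
```

The returned pointers are only valid while the dual_name object is
alive, which is one of the usual arguments against making such
conversions implicit.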

Any thoughts, comments, suggestions etc. are most welcome.

Dylan
