From: Beman Dawes (bdawes_at_[hidden])
Date: 2008-05-29 11:10:02

David Abrahams wrote:
> on Wed May 21 2008, Beman Dawes wrote:
>> David Abrahams wrote:
>>> on Tue May 20 2008, Beman Dawes wrote:
>>>> David Abrahams wrote:
>>>>> I was just reviewing the filesystem docs and came across "leaf()". I'm
>>>>> sure this isn't the first time I've seen it, but this time I picked up a
>>>>> little semantic dissonance. Normally we think of "leaf" in the context
>>>>> of a tree as being a thing with no children. An interior node like a
>>>>> directory that has files or other directories in it is usually not
>>>>> called a "leaf."
>>>> Right. And "leaf" never returns an interior node of a path.
>>> What is an "interior node" of a path? Would you talk about "interior
>>> nodes" of a std::vector<string>?
>>>>> I wonder if this is the best possible name?
>>>> The names used by the filesystem library were carefully chosen as a
>>>> matched set. So you can't change a single name without making a
>>>> corresponding change to the other names (like "branch") it is related
>>>> to.
>>> I understand that a change may upset the apple cart, but the fact that
>>> the names are interdependent doesn't mean we shouldn't consider
>>> different ones.
>> Sure. But given that the current names were widely discussed at the time
>> of adoption, have been in use for quite a few years, and "basename" is
>> already used by the library for a function with different semantics,
>> change would be difficult.
> Well, I did bring this up almost five years ago:
>>>>> Is there a precedent we can draw on in some other language/library? In
>>>>> python, it's os.path.basename(p). Perl, php, and the posix basename
>>>>> command seem to do something similar.
>>>> The filesystem names were chosen to be an improvement over the naming
>>>> used by other libraries and/or languages, which always seemed to me to
>>>> be misleading. For example, my intuition is that basename("")
>>>> should yield "foo", not "".
>>> I don't know why -- basename seems like one of those names that would be
>>> semantically void except for the precedent provided by other languages
>>> trying to do the same thing. On the face of it, it doesn't suggest
>>> anything about the extension part of the name one way or the other. In
>>> any case, the 2-argument version of basename does something like what
>>> you want in many of those other languages.
>>> I'm certainly open to persuasion, but so far, a pathname doesn't seem to
>>> resemble a tree in any conceptually useful way, and there seems to be no
>>> compelling advantage to inventing our own terminology here.
>> If you have a better set of names, why don't you suggest them?
> You asked for it. I'm going in the order given by
> because that's
> reasonably comprehensive and readable even though it looks like it may
> have some serious errors. Don't have time to pore through the full
> reference right now.
> path members:
> const std::string& string( ): OK
> std::string root_directory( ): OK, but maybe should return
> boost::optional<path>. I wonder why we decay to std::string
> so eagerly.
> std::string root_name( ): OK, but maybe should be called "root" and
> return boost::optional<path>

Logically, the root is made up of the root_name() and the
root_directory(). If you change the name of root_directory() to root(),
what do you call the combination of root_name() and root_directory()?
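
To make the decomposition concrete, here is a minimal sketch. It assumes C++17 std::filesystem rather than the Boost interface under discussion, and combined_root() is a hypothetical helper name invented for the illustration. std::filesystem exposes the combination as root_path().

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: glue the two root pieces back together.
// root_name() is the "c:"-style part (empty on POSIX paths);
// root_directory() is the leading separator, if any.
std::string combined_root(const fs::path& p) {
    return p.root_name().string() + p.root_directory().string();
}
```

For an absolute path the result should match p.root_path().string(); for a relative path such as "usr/lib" both pieces are empty, so the result is empty too.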

As far as other aspects of the interface, like the return type, I want
to revisit the whole design once C++0x stabilizes and there is a
compiler available with more C++0x features to experiment with.

> std::string leaf( ): should be basename(). back(), tail() and
> p.split()[1].string() are viable alternatives

One problem with basename is that it is already used by one of the
convenience functions. Another is that I find it misleading. An
alternative set I'd be comfortable with would be tail() for the current
leaf() and head_path() for the current branch_path().
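
As a rough illustration of that alternative pair, here is a sketch on plain strings. It is not the library's implementation: it assumes '/' is the only separator and ignores root names, root directories, and trailing slashes. The function names are simply the ones floated above.

```cpp
#include <string>

// Sketch of tail(): the last element of the path (what leaf() returns today).
std::string tail(const std::string& p) {
    const std::string::size_type pos = p.rfind('/');
    return pos == std::string::npos ? p : p.substr(pos + 1);
}

// Sketch of head_path(): everything before the last element
// (what branch_path() returns today). Empty for a single name.
std::string head_path(const std::string& p) {
    const std::string::size_type pos = p.rfind('/');
    return pos == std::string::npos ? std::string() : p.substr(0, pos);
}
```

With these, tail("a/b/c") is "c" and head_path("a/b/c") is "a/b". The second result has two elements, which is the kind of return the _path suffix is meant to signal.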

> std::string branch_path( ): should be dirname()

That breaks the naming scheme; _path is uniformly used to signal that
the return potentially contains a path rather than just a single
name. It is also misleading in that the return is often a path rather
than just a single directory name, and for "c:" on Windows isn't a
directory name at all.

One of my frustrations with similar libraries has always been their
misleading function names. And that's really your point about leaf() and
branch_path(); you find them misleading. That's a concern, and why I'm
willing to consider renaming them. But not to a set of names that is
misleading to a different set of people, me included.

> bool empty( ): OK, although this seems like a really uninteresting
> question to ask.
> iterator: OK
> operator/: I've liked that one ever since I came up with it ;-)

Yeah, I like it too, although some people find it too cute. I had a
complaint recently from a new user that the append functionality was not
supported; turned out he never read the docs for operator/ because he
assumed operator/ couldn't possibly be what he was looking for.
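
For anyone in the same position as that user, a minimal sketch of append via operator/. It assumes C++17 std::filesystem, which has the same operator, rather than the Boost headers; join() is a hypothetical wrapper added only for the example.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical wrapper: append `name` to `dir` with operator/,
// which inserts a separator between the two when one is needed.
std::string join(const std::string& dir, const std::string& name) {
    return (fs::path(dir) / name).generic_string();
}
```

So join("usr", "lib") gives "usr/lib" without the caller ever handling the separator.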

> The rest of the names look OK to me except for is_regular, which should
> be is_file. That name seems overthought and "regular" has all kinds of
> other connotations.

I'm not particularly fond of is_regular either. The problem with is_file
is that some people argue it should be true for directories. I could get
talked into is_file, if others support that and will help deal with
those who think of directories as files. Or maybe some other name could
be found. is_regular_file()? Although longer, that seems clearer.
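
A small sketch of the distinction at stake. It assumes C++17 std::filesystem, where the predicate is spelled is_regular_file(); classify() is a hypothetical helper, not part of any library.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: name the kind of thing a path refers to.
// A directory is reported as a directory, never as a regular file.
std::string classify(const fs::path& p) {
    std::error_code ec;  // non-throwing overload; errors end up as "not found"
    const fs::file_status st = fs::status(p, ec);
    if (fs::is_regular_file(st)) return "regular file";
    if (fs::is_directory(st)) return "directory";
    if (!fs::exists(st)) return "not found";
    return "other";  // e.g. sockets, devices, FIFOs
}
```

Here classify(".") reports a directory. With the longer name there is less room to argue that the first test should also be true for directories.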

