From: jbandela_at_[hidden]
Date: 2001-06-05 14:20:32
--- In boost_at_y..., Douglas Gregor <gregod_at_c...> wrote:
> I still have a few comments on Tokenizer to throw in:
>
> - The include guards contain "JRB015801." Is this intended to be updated as
> the tokenizer version is updated? I'm curious because it's not a practice
> I've seen before.
In iterator_adapters.hpp there is the #define
BOOST_ITERATOR_ADAPTOR_DWA053000_HPP_
which I guess is the filename followed by the author's initials and
the creation date.
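As an illustration of that naming pattern, here is a minimal sketch of a guard built the same way. The file name, the initials "XYZ", the date (read here as MMDDYY, which is itself a guess), and the placeholder function are all hypothetical and are not taken from Tokenizer or iterator_adapters.hpp.

```cpp
// example_header.hpp (hypothetical file name)
// Guard = file name + author's initials + creation date (assumed MMDDYY).
#ifndef EXAMPLE_HEADER_XYZ060501_HPP_
#define EXAMPLE_HEADER_XYZ060501_HPP_

namespace example {

// Placeholder declaration so the header has some content.
constexpr int answer() { return 42; }

} // namespace example

#endif // EXAMPLE_HEADER_XYZ060501_HPP_
```

Under this reading the suffix only makes the macro unlikely to collide with another header's guard, so nothing in the scheme requires it to change when the library version changes.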
> The name "csv_separator" should probably be changed to something more obvious
> ("csv" is not a common acronym. Perhaps "comma_separator"?)
I have had trouble thinking of a good name as well. Since the "comma"
does not have to be a comma, comma_separator might be misleading. A
name that reflects what it really is would be
escaped_quoted_delimited_field_separator, which is too unwieldy. Any
ideas for a better name would be appreciated.
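To make "escaped, quoted, delimited fields" concrete, here is a minimal sketch of that kind of splitting. It is not the Tokenizer implementation: the function name split_fields and its parameters are hypothetical, and the rules (the escape character makes the next character literal, quote characters are dropped from the output) are assumptions chosen for illustration.

```cpp
#include <string>
#include <vector>

// Split `line` into fields. `delim` separates fields, `quote` toggles a
// quoted region in which `delim` is literal, and `escape` makes the next
// character literal. All three are configurable, so the "comma" need not
// be a comma.
std::vector<std::string> split_fields(const std::string& line,
                                      char delim = ',',
                                      char quote = '"',
                                      char escape = '\\')
{
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::string::size_type i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (c == escape && i + 1 < line.size()) {
            current += line[++i];          // take the escaped character as-is
        } else if (c == quote) {
            in_quotes = !in_quotes;        // quote characters are not kept
        } else if (c == delim && !in_quotes) {
            fields.push_back(current);     // unquoted delimiter ends the field
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);             // last field (possibly empty)
    return fields;
}
```

With the defaults, the input a,"b,c",d\,e yields the three fields a, then b,c, then d,e. Calling split_fields("x;y", ';') splits on a semicolon instead, which is the case where a name like comma_separator would mislead.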
> punct_space_separator::operator() has a comment with some incorrect grammar:
> "skip past the punctuation only if the told to do so"
>
> The whitespace_and_punct class shouldn't be in the "detail" namespace if the
> user is expected to specialize it for other character types.
> Also, the names "punctuation1" and "punctuation2" don't convey much meaning,
> though I'm at a loss to come up with better names. If whitespace_and_punct is
> to be an interface element, I believe that it should only have "punctuation"
> and "whitespace" functions. Perhaps the contents of "punctuation1" and
> "punctuation2" should be separated into constants instead?
>
I would really like to get rid of that template, while still allowing
nice default behaviors. If you have any ideas, let me know.
> Perhaps the name "whitespace" in punct_space_separator is too restrictive.
> strtok() calls these characters "delimiters," which seems to be more general.
>
The idea was that there are some delimiters that you want returned
(punctuation) and some you don't want returned (whitespace). For
example, if you were breaking an expression such as
1 + 2 + 3-4 into a sequence of tokens, both whitespace and the + and -
characters would be delimiters, but you would want only + and -
returned, not the whitespace. Since the delimiters you want returned
are usually punctuation, the ones that can be returned are called
"punctuation". Since whitespace is almost always ignored, the ones
that are never returned are called "whitespace".
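The distinction between returned and dropped delimiters can be sketched as follows. This is a simplified stand-in, not punct_space_separator itself: the function name split_keep_punct and its two character-set parameters are hypothetical.

```cpp
#include <string>
#include <vector>

// Split `text` into tokens. Characters in `dropped` end a token and are
// discarded; characters in `kept` end a token and are returned as
// one-character tokens of their own.
std::vector<std::string> split_keep_punct(const std::string& text,
                                          const std::string& kept = "+-",
                                          const std::string& dropped = " \t\n")
{
    std::vector<std::string> tokens;
    std::string current;

    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const char c = text[i];
        const bool is_kept = kept.find(c) != std::string::npos;
        const bool is_dropped = dropped.find(c) != std::string::npos;

        if (is_kept || is_dropped) {
            if (!current.empty()) {        // flush the token built so far
                tokens.push_back(current);
                current.clear();
            }
            if (is_kept)
                tokens.push_back(std::string(1, c));
        } else {
            current += c;
        }
    }
    if (!current.empty())
        tokens.push_back(current);
    return tokens;
}
```

For the expression 1 + 2 + 3-4 this yields the seven tokens 1, +, 2, +, 3, -, 4: both kinds of delimiter end a token, but only the + and - come back in the output.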
Thanks for all your time and assistance,
John R. Bandela