Boost logo

Boost :

From: Samuel Krempp (krempp_at_[hidden])
Date: 2002-01-20 07:49:07


(Sorry I did not respond sooner, I was busy in the weeks after christmas
holidays)

Thanks a lot for your very constructive comments,
I fixed hte points you raised, except some that are debated below.

I had planned to fix a few issues for weeks, so your message helped me
take the time to finally sort those things out..
The results are in the vault. (version '1.32')

On Tue, 2002-01-15 at 00:48, Ed Brey wrote:
> One problem is portability. It doesn't compile on either Cygwin gcc
> 2.95.3 or VC6.
> gcc doesn't like char_traits, and I'm not sure if much can be done
> about it.

Right, the errors reported by gcc 2.95 talk about char_traits, and my
only reaction was to leave this compiler, and use newer gcc-3.0.
But well, maybe there is a way to make it work with gcc2.95, if someone
has a clue about what fix could be tried..

> VC could probably be made to work fairly easily, just by dropping the
> separate .ihh file and merging the member function definitions
> together with the declarations.

well, if that's as easy as this, I should do it.
Unfortunately I don't have access to any VC++ compiler.
Do you mean the member functions should be defined inline, at the moment
they are declared, or just that they need to be in the same file ?

> The other problem is some issues with the interface (see "Interface
> comments" below). Details of the interface are a very important part
> of a library like this, since its application is so broad. If at all
> possible a consensus should be reached on the best syntax to use. In
> any case, the library should contain rationale describing why any
> reasonable-appearing interfaces are not used. Based on the current
> rationale document, it looks like some of the possibilities haven't
> been thought all the way through yet.

I completely agree. the interface issues are a key part, once we reach
an agreement on an interface, the implementation is not problematic.

> - While operator% does a nice job of binding the parameters to the
> format object, I believe it is possible to do even better. Consider
> operator[], which has these advantages:
> - - Tighter binding. No chance of accidentally expecting a different
> binding.
> - - Has a correlation to the built-in operator[]: Consider the
> 'format("%1") [num]'. Think of "%1" as representing a virtual
> array {"0", "1", "2", ... }, which num indexes into.
> - - Opens the door for a natural syntax which can eliminate the
> "glue" function. This is useful since the name "glue" does little to
> help the intuition. The syntax would be to use operator() for
> manipulators, which works nicely since it looks similar to what
> surrounds the specification string. The specification string and
> manipulators share the common trait of describing the output layout,
> which a syntactical similarity reinforces.

the choice of the operator is quite a problem.. simply because the pros
and cons are esthetical, or philosophical, rather than practical, so I
don't see how I could possibly have everybody agree with any decision I
take.

Personnally I'd be happy with this system, [] for arguments, and () for
manipulators, but nonetheless I doubt your your arguments will convince
enough people to reach a consensus..

In fact, using two operators to distinguish manipulators from arguments,
was my personnal favorite since the beginning.
at the time, I was thinking of % and *, but that had so little chance of
being accepted that I erased it from the documentation pretty fast..

Your proposal brings the same practical advantage, and makes more sense
'philosophically'.
(there's an extra bonus with operator() for manipulators : it complies
to the human language meaning of parenthesis :
format(" %1 %2") ["azf"] (hex) [37]
can mean hex is just extra details here, not a key part. and it's just
that)

The other solution for getting rid of the 'glue function', implies to
redesign the <iomaip> header, as Karl Nelson proposed in his message
Subject: [boost] iomanip mixin (was Review request : format)
Date: 05 Jan 2002
I have not looked at his file
http://www.ece.ucdavis.edu/~kenelson/iomanip.hh
but it could indeed be useful, for format but also for many possible
other pieces of code that deal with formatting, and need to distinguish
manipulators from the rest.
(I was very surprised when I found out that the C++ standard does not
provide an easy way to recognize manipulators from output. I think it is
a lack, and that Karl's proposition is justified.)

But it's yet again another problem, so for now format has to cope with
the manipulators provided by the compiler.
(and anyway, I'd like format's syntax to explicitly reflect what are
manipulators, and what are arguments. using the same for both is somehow
a misfeature)

So in conclusion, I replaced operator% and 'glue';
with operator[] and operator().
let's hope people will say 'okay, afterall why not..' rather than 'oh
no, I don't like that', else we'll never reach a consensus..

> However, operator() has been proposed, which has similar advantages,
> and I don't recall seeing any counterarguments to the proposal on the
> mailing list.

In fact, when I tried typing sample lines of code using operator() on
format objects, I found the result too cluttered for my taste. A lot of
parenthesis is hard to read, when you're so much used to see () in very
precise contexts.
I prefered [] a lot, on the same sample lines.

> Likewise, the idea isn't addressed in choices.html.

because I had not considered it :-)

> The choice between operator()/[]/% is important and should be well
> thought through and documented, just as was done for operator%/<<.

right. the operaotr% vs << issue showed up early, right after
explaining why using operators at all. While the choice of % rather than
operator() or [] is a recent addition to the debate.
Now that I think there is a better candidate than '%', I'll add some
arguments for 'operator[]' in the doc. but it's rather a matter of
taste, so I'm not sure I will be able to convince people with only a few
lines..

> - str() is one of those names that I've often heard on this list
> should not be abbreviated. No wonder - Does it stand for string
> or stream? :-) Clearly, it was abbreviated just to avoid a name
> conflict, which always calls the design into question. I realize
> that the standard library does this, and I question it, too. Is
> there any harm in just providing implicit string conversion instead?

as for all implicit conversions, I was taught it's always better to do
without. It only requires one function, we could name it 'to_string( )'
if you really want the name to be absolutely non-ambiguous. (but then
it's a long name..)
I don't have an example situation where the implicit conversion would
cause problems. But since the format objects are true objects, that can
be passed around and stored, etc, I guess such situations are possible
if a user starts writing functions that expect format objects, with
overloads on top of that.

Also, when format was using operator%, my fear was that the implicit
conversion could be called when a user mistakenly uses operator+ within
a ouptut with format :
cout << format("%1 %2") % "abcd"+"efgh";
means :
cout << (format("%1 %2") % "abcd" ) + "efgh";
while the user expected
cout << format("%1 %2") % ("abcd" + "efgh");

without implicit string conversion, the mistake is rejected at
compilation.
With conversion, the compiler finds a (bad) meaning to the mistake :
cout << string(format("%1 %2") % "abcd" ) + "efgh";
and the user will have an exception at run-time.

Now, with operator[], this kind of user confusions on operator
precedences is avoided. (Still, without implicit conversion, some typos
would be caught at compile-time instead of producing run-time errors..)
So now, we may provide an implicit conversion.. but I'm really
precautious on this.

> - Instead of having formatter automatically raise an exception,
> consider having it store its error state and set the failbit on
> the stream if there is an exception. I'm not sure this is a
> good idea - just something to think about. The consideration
> is that there may be some applications that would prefer to
> silently but gracefully degrade in the face of this situation
> than to have to handle an exception.

right.
I already said I'd provide a way to choose what errors throw exceptions,
so I've finally implemented it, with inspiration from the iostream
'exceptions' functions.

using namespace boost::format;
format fmter(" %1 %2 ");
fmter.exceptions( all_bits ^ (not_enough_args_bit | too_much_args_bit));
cout << fmter [100] [200] [300] [400] ;
// Okay, extra arguments quietly ignored.

for repeated usage, the best for the user is to make a small wrapper
function :
inline format my_format(const char* fstring) {
        error_types my_except = bad_format_string_bit;
        return format(fstring).exceptions(my_except);
}
 
then he can call my_format instead of format..

I could add a parameter in the basic_format constructors, to pass
the exception mask at construction time, but in practical use,
it would not be as clean as a user wrapper function anyway.

> - I didn't look deeply into the non-legacy formatting codes, since
> the documentation wasn't clear about what is legacy printf compatibility
> and what is the "latest and greatest" that new code would use. I'd like
> to have another look following some documentation refactoring (see below).

the non-legacy formatting code is very minimalistic..
It relies on the printf directives for left/right/centered alignment, padding,
and all other fancy formatting options.
I felt that designing a new syntax for precise formatting would complicate the
whole thing even more, so for now the new syntax is just "%1", "%2", etc..

There's more on the subject : this short syntax lacked a mechanism for printing
numbers right after an argument :
"%50000" meant the 50000-th argument.
Previously the only way to print the 5th argument followed immediately by four
zeros was to use a printf directive.
So I implemented the solution that was already proposed on the list :
"%{5}0000". straightforward to understand and to remember..

> Documentation comments :
[...]
> It should be sufficient so that most users can stop and make use
> of the class with just two or three minutes of learning curve.

I agree that it is a goal the documentation has to fullfill.
But it takes a lot of time to write a good documentation,
I made a first attempt, to at least explain things for reviewers,
and will spend more time on it before claiming it's ready for users..

> what if I'm not interested in printf compatibility, but want to center text?

As I said, the 'short-style' notation can't be used to specify formatting
options,
My decision was to provide 2 possible notations :
- one basic, very concise notation to adress simple needs
- printf notation for printf compatibility OR for any precise
formatting need.
(the new formatting features are provided in this style of notation,
not the short one)

Then you have to choose where 'simple needs' end, and where
'precise formatting' begins..
If you put centering, you also put other kinds of padding, etc..,
and in the end you might be better off with the full printf syntax..

So I chose to limit 'simple needs' to straight conversion of
an argument, without any formatting option.

> This library doesn't have a true implementation document, which is fine
> It is important, however, not to throw arbitrary pieces of
> implementation-specific information into the interface document.
> Specifically, the references to the ".ihh" file should not exist
> in the html doc or the sample documentation.

True, I corrected this.
(In the html doc, the link to the file was intended for those that
want to go directly and see what the code looks like, but it was
definitely not at the right place)

> It would be much nicer to have samples snippets incorporated directly
> into the HTML instead of having a separate sample program.
> Currently there is a hybrid sample/mini-test-harness that serves
> neither purpose very well.

Yes, I need to show more examples in the html.
About the example/ files, I used assert in them so that people
can see the results of the formatting without needing to compile
and launch.
I guess I'll eventually replace the assertions with simple comments,
it will be less cluttered, but right now the code changes frequently
and the assertions insure the samples stay correct..

> The docs could use some grammar clean-up. Here's are items
> that I found, organized by section:

Thanks for your grammar corrections !
I fixed all the items you pinpointed, unless detailed below.
I felt quite astonished when I went through all the terrible
mistakes I had made..
I'd really need to take lessons to brush up my English.

> - The last paragraph: Hardly one word. An implementation detail
> like using stringstream should not be documented here, since
> you won't want your users to rely on that fact (not that
> I see how they could).

[Fixed]
true. I've changed 'in one word', to : 'all in all'.
Also, it is true that I should not say that the internal stream is a
stringstream. I've replaced all occurences of 'stringstream' by
'stream'.

> Credits:
> - "Karl Nelson formatting" ->
> "Karl Nelson's formatting" ->

[Fixed]
on the same note, I'm not sure what to do with the first name, but I
think the whole sentence should be :
"ideas from both Rüdiger Loos' and Karl Nelson's formatting classes"
Is this right ?

> The current mixture of spaces and tabs in the code makes it quite
> hard to read. Please pick one or the other.

[Fixed]
oups, my emacs settings changed and tabs were inserted here and there..
It's now fixed. (spaces everywhere)

> str2int(): Why use the long if/else block rather than n += s[i] - '0'?

[Unsure]
because I'm not sure it's guaranteed to work in all cases.
(all platforms, and wchar_t as well as char)
Is it ?

I'll continue working on the documentation today, right now it's still
very much the same as before. I'll certainly upload a new file (version
1.33) with updated documentation shortly..

Thanks again for your detailed, and helpful comments,

-- 
Sam

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk