Boost logo

Boost :

Subject: Re: [boost] [contract] syntax redesign
From: Lorenzo Caminiti (lorcaminiti_at_[hidden])
Date: 2011-06-04 15:37:31


On Thu, Jun 2, 2011 at 1:56 PM, Matt Calabrese <rivorus_at_[hidden]> wrote:
> On Thu, Jun 2, 2011 at 9:28 AM, lcaminiti <lorcaminiti_at_[hidden]> wrote:
>
>> Hey Matt, I am still looking over your slides so sorry in advance if the
>> following suggestions don't make sense...
>
>
> First, thank you so much for taking the time to do this. I don't think
> anyone other than myself has really picked apart the latest syntax and tried
> to improve it.

:)

>> Is it possible remove some of the
>> extra parenthesis from your examples on page 79 and 96 as I suggest below?
>> If not, why?
>> ...
>> // Is the following syntax possible?
>> BOOST_GENERIC(
>> // Use leading `concept` "keyword" to determine macro functionality.
>
>
> I considered this during implementation, and same with "auto" concepts, but
> decided against it for a few reasons. For one, it complicates the
> implementation for, imo, little or no gain, and it can impact compile-time,
> which is something that I consider a problem with the library at this point
> in time.

OK.

>> // `typename` doesn't need extra parenthesis.
>>
>
> It does if I want to be consistent with all parameter kinds, but I could
> special-case typename and class if I disallow fully qualified names as you
> currently do. Right now I support typename/class, value parameters, and soon
> varidiac parameters which will work via the syntax ( (ParameterKindHere,,,)
> SomeNameHere ). I'll give some thought to special-casing, but for now, it's
> more consistent for users and easier to preprocess if I don't do that, and
> fully qualified names are handled without problem. The more I look at your
> library though and the more you mention it, the more I question my choice,
> so it is possible I'll start special-casing things as you are doing, but I
> have a lot on my plate before I devote time to changing features that
> already work.

My new syntax also supports these cases:

    template< class T = int, typename U, T x, int y = 0 > // usual C++

    template( class T, default int, typename U, (T) x, int y, default
0 ) // contract syntax

If for consistency you want to use extra parenthesis around all types,
the following also works:

    template( class T, default int, typename U, (T) x, (int) y, default 0 )

> // `extends` is not necessary but I find it more readable.
>>
>
> This was just because I wanted to use a keyword that an IDE would highlight.
> In the predecessor to this library I was using words that weren't keywords
> and Dave Abrahams suggested that I switch over to keywords, with a rationale
> that I ultimately agreed with (easy to remember, it's not highlighted if you
> make a typo). Extends is not and likely will not be a keyword in C++.
>
> Also, with concepts, there's no such thing as "private" or "protected"
> refinement -- the public there acts as "extends" acts in your example in
> that it's only used to signify that what follows is a list of less refined
> concepts. If I change it, the public part would just be replaced rather than
> both words being there, but I'd  probably use "refines" instead of extends
> since that's the proper term in Generic Programming, though that too is not
> a keyword.

This still keeps public that is highlighted (BTW, in the library docs
I will provide the DSEL grammar and a list of the new "keyword" like
extends, requires, precondition, etc so you can configure your IDE to
color them up) and yes, refines makes more sense:

    refines( public Semiregular<T> )

>>    // BOOST_GENERIC_ASSIGN just expand to `,` but I find it more readable.
>
>
> I'm not a big fan of that. I use , because it's short and simple, though I
> have considered (,) to make it jump out as being different and unambiguous.
> If you can provide a good rationale for BOOST_GENERIC_ASSIGN over those
> options, I'm open to it, but at this point I favor simplicitly over a big
> macro name (and my gut is that most users will too).
>
>
>>    , (MoveConstructible) reference BOOST_GENERIC_ASSIGN typename
>
>
> The encapsulating set of parentheses around each individual parameter are
> for important reasons. For one, your current example is actually ambiguous.
>
> //////////
> (MoveConstructible) reference, typename something_here
> //////////
>
> could either be a contrained typename requirement with a default, or it
> could be a constrained typename without a default followed by an
> unconstrained typename. The encapsulating parentheses is imo the simplest
> solution for users to remember and for implementation, although if I use
> something like (,) or BOOST_GENERIC_ASSIGN for default assignment I could
> disambiguate this particular issue that way.

I used ", default ..." for default parameters:

    (MoveConstructible) reference, default typename x::reference

> However, another issue is that the default may, itself, have commas in it.
> Without the encapsulating parentheses, I can't know if the next comma in a
> default is a parameter separator or a part of the default (for instance, a
> default of boost::array< int, 2 >). If I special-case it so that you have
> the option of wrapping the default in parentheses if it has commas, then
> it's ambiguous with the situation where there is no default and where the
> next parameter to the concept macro starts with parentheses, etc. I could
> keep trying to special-case further, or I could just make a single,
> consistent rule that's easy to remember as I've done. IMO, this option is
> simpler for both myself and users and also aids in keeping compile-times
> down.

To handle unparenthesized commas, I'd use the IDENTITY_TYPE/VALUE
macros. Commas not wrapped within parenthesis are rare so IMO it's not
worth to ask users to always use extra parenthesis to handle them.
It's simpler to ask user to use a special macro when they are present.

> Also, since I am consistent with wrapping each parameter with parentheses,

I also went down this path and called the previous syntax
"parenthesized syntax" because all tokens were consistently wrapped
within extra parenthesis. Then I was think it's easier for users to
learn the syntax "just wrap all tokens of the usual C++ syntax within
parenthesis (more or less)". Everyone that looked at the syntax
complained about the extra parenthesis :(( I also now think that the
newer syntax with only a few extra parenthesis is better.

> it is much easier for me to give descriptive error reporting if the user
> screws something up that I can catch at preprocessing time. For instance, if
> a user is creating a many-line concept, such as an allocator-like concept,
> and he makes a typo somewhere in the middle, I want to be able to give the
> best error possible, including a parameter number, that particular parameter
> text in full, and even be able to continue parsing the remaining parameters
> to the concept, catching other potential errors during preprocessing. If
> each parameter is wrapped in parentheses, it is much easier to do this --
> any noticeable error that occurs in the middle of a given parameter, even if
> it involves something tricky like commas, does not affect the parsing of
> further parameters, and I'm able to output the full parameter text of the
> parameter with the error in a static assert without accidentally cutting off
> part of what the user thought was a single parameter but ends up not being
> the case because of a misplaced comma.

Error reporting is a good point especially because all your code is
within the macros (for Boost.Local the function body code is outside
the macro and that's were most of the actual programming errors will
be so I likely don't have to worry about error reporting too much -- I
still check the syntax using the pp as much as possible).

> Finally, the concept macro, as I'm sure you can imagine, is fairly nasty
> underneath the hood. One nice thing about the implementation, however, is
> that I handle most of the top-level implementation via a simple preprocessor
> for-each operation over all of the parameters after the name and
> refinements. Having the parameters consistently separated is what makes this
> directly possible without any difficulty -- since the parameters are just
> individual macro arguments, I use a construct that invokes a macro for each
> parameter. It doesn't require any complicated logic to figure out where
> individual parameters are separated, is very efficient, and it keeps the
> loop logic separate from the parsing of individual parameters.
>
>
>>    // `operator(symbol, name)` allow user to name the operators
>>
>
> I can't use a symbol, and a user-provided name is unfortunately meaningless
> both to me and the user here. Associated operator requirements of concepts
> check if the operator is callable with respect to the pseudo-signature. The
> reason I use a fixed name is because, internally, I have to generate
> metafunctions during preprocessing that check to see if the operation is
> callable for the given parameter types, and what is generated can vary
> depending on which operator is used. In order to do this, I have to know,
> entirely during preprocessing, exactly which operator is specified. You
> can't do this if the user provides the symbol representation of the
> operator, and an arbitrary, user-specified name wouldn't help. The names I
> use come from N2914 standard concept names, so there is also some
> consistency here.

I see.

>>    // Types `X&`, `X&&`, `int`, etc do not need extra parenthesis.
>
>
> I could special case int and other built-ins, but I don't see how it would
> be possible to special case X, or X&, etc. The "X" is a user-defined concept
> parameter for that specific macro invocation, not a universal identifier
> that I'm using.

    , (postincrement_result) operator(++, postinc)( X&, int )

That's a tuple and X& is the only token... why can't you just get it with:

    PP_VARIADIC_TUPLE_ELEM(0, (X&, int)) // expand to X&

I am missing something here... maybe sometimes you have some other
tokens like in ( (X&) x, int y ) ??

>> BOOST_GENERIC(
>> // Use `concept_map` "keyword" to determine macro functionality.
>>
>
> Yes, I can do that, but again, it makes preprocessing more complicated,
> affects compile-time, and is one more place for a user to make a mistake
> that I would want to explicitly look for if I were to give an easy-to-read
> error message. Overall, I don't think I'm gaining enough, if anything, by
> pushing more stuff into the macro itself. In an ideal world, I agree, that
> looks cool, but I just don't think it's practical. Simply calling the macro
> BOOST_GENERIC_CONCEPT_MAP gives me all of the information that I need
> without additional preprocessing.
>
>
>> // Do you need ObjectType? Is was missing from above syntax...
>>
>
> Sorry, that's covered in the talk itself and is not directly in the slides
> (the video is not online yet, but I'll link it whenever whoever is doing
> post uploads it) -- I can't handle concept-constrained concept maps just yet
> and if it's not clear, ObjectType is a concept as opposed to a type. That's
> why the example is not exactly the same between the two. In the end, if you
> are curious, the syntax will be
>
> //////////
> template( ((ObjectType)) T )
> // or if I disallow fully-qualified names
> template( (concept ObjectType) T )
> //////////
>
>
>
>
>   , typedef T value_type
>>    , typedef ptrdiff_t difference_type
>>    , typedef T& reference
>>    , typedef T* pointer
>
>
> No, that's not really possible in practice. The problem is that the typedefs
> themselves may contain commas in them and what comes after the comma could
> potentially be something not parenthesized that is not valid to concatenate
> with at preprocessing time. I.E. a bool argument with a !, etc. Trying to

Again, can't you use IDENTITY_TYPE to handle unwrapped commas as
special cases? That way you simply the syntax removing the extra
parenthesis for the common cases (unwrapped commas are rare in my
experience).

> handle this syntax and do special casing would be complicated and
> incomplete. Worse, when someone does something that looks as though it
> should be valid but really is not, solely for preprocessor reasons that I
> can't explicitly check for such as the one mentioned above, they would get a
> strange preprocessor concatenation error that would likely be completely
> meaningless to them. I try hard to make it so that the user should not be
> seeing errors like that, and without them having to remember what may seem
> like arbitrary rules that only make sense if you have written something
> along the lines of this library or your library, or Boost.Parameter, etc.
> which most programmers have not. Even then, the issues are extremely subtle.
>
> Again, thanks for all of this feedback. I've been wanting another head to
> kind of tear things apart. If anything, I think I may go ahead and
> special-case parameters like void, class, typename, etc. though that won't
> be for a while. Not using a comma as a replacement for = when specifying
> defaults may happen too. All of the other special-casing and attempted
> removal of parentheses I'll have to give further thought to, but I think the
> current design has the least subtle gotchas.

Thanks to you to take a look at Boost.Contract newer syntax. BTW, this
is major syntax re-design #4 for Boost.Contract... it requires a lot
of patience and perseverance :)

--Lorenzo


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk