Re: [Boost-docs] [quickbook] Processing names with more than one underscore ( _ )

Subject: Re: [Boost-docs] [quickbook] Processing names with more than one underscore ( _ )
From: Daniel James (dnljms_at_[hidden])
Date: 2011-02-15 19:51:06

On 15 February 2011 05:30, Eric Niebler <eric_at_[hidden]> wrote:
> IIRC, the original version of quickbook that I wrote (actually ported
> from Joel's quickdoc) didn't have the shortcut notation for bold,
> italics, underline, etc. Joel added it later as a convenience. I never
> particularly cared for it, for just the reasons Edward describes: it's
> not quickbook-ish, it doesn't nest cleanly, and it sometimes catches you
> unawares, forcing you to examine the output carefully to catch mistakes.
> I wouldn't be opposed to a switch that disabled it, or at least warned.
> (I'd like to rip it out, but we can't now.) Daniel, it wouldn't be
> creating a new dialect since one would be a proper subset of the other.

If the same quickbook generates different boostbook, then I see that
as a different dialect. I've been more cautious about this since I
broke the maths documentation by 'fixing' html encoding in the
document info. I think it's okay when it's unlikely that anyone
actually wants the current behaviour - an example of this is the
copyright, a lot of people get the syntax wrong (i.e. they write
'2005-2007', should be '2005 2006 2007'), so I'm planning on changing
it to support the way it's used.

An error or warning would be fine, if that's desired. And if the
general opinion is against it, I'll remove support in 1.6.

> Besides, I think there must be a bug in the handling of shortcut
> notation. This:
>  some_ident_ifier
> should not put "ident" in italics, but IIRC it does. The shortcut
> notation should only kick in when it is surrounded by whitespace, right?

Or punctuation, but it only checks after the closing character. This is
how it's always been (unlike regular expressions, spirit has no 'look
behind' parser, which makes checking before the opening character
tricky). In your example it shouldn't happen because there's a letter
after the second underscore, but it does for something like
some_identifier_, or an_identifier followed by this_, or if you've got
consecutive underscores (because the underscore counts as punctuation),
so 'this_will__match' will generate this[_will]_match.
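
To make the rule concrete, here is a minimal sketch in Python (not the
actual spirit grammar). The names closes_here and naive_spans are made
up for illustration, and the exact definition of "punctuation" is an
assumption. It only checks the character after a candidate closing
underscore, which is the behaviour described above.

```python
import string

MARK = "_"

def closes_here(text, i):
    """True if the mark at index i is followed by end of input,
    whitespace or punctuation (assumed definition of the check)."""
    if i + 1 >= len(text):
        return True
    nxt = text[i + 1]
    return nxt.isspace() or nxt in string.punctuation

def naive_spans(text):
    """Return the substrings that would get the shortcut markup when
    only the character AFTER the closing mark is checked."""
    spans = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] != MARK:
            i += 1
            continue
        close = None
        # Require at least one character of content between the marks.
        for j in range(i + 2, n):
            if text[j] == MARK and closes_here(text, j):
                close = j
                break
        if close is None:
            i += 1  # no usable closing mark: treat as a plain underscore
        else:
            spans.append(text[i + 1:close])
            i = close + 1
    return spans
```

With this sketch, some_ident_ifier yields nothing, while
some_identifier_ and this_will__match both yield a span, since nothing
looks at the letter sitting before the opening underscore.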

Could it check the first character? There are a few possibilities. I
could add some extra state to the parser, which would be a pain. A
better solution might be for the iterator to keep a copy of its initial
position, so that it could scan backwards.
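
As a rough illustration of the second idea, again in Python and not in
terms of the real iterator: if the scanner can see the character before
the current position, the opening underscore can be checked too. The
names is_boundary and spans_with_lookbehind are hypothetical, and
treating start of input as a boundary is an assumption.

```python
import string

MARK = "_"

def is_boundary(ch):
    """None stands for start or end of input (an assumption)."""
    return ch is None or ch.isspace() or ch in string.punctuation

def spans_with_lookbehind(text):
    """Like the naive rule, but also require a boundary BEFORE the
    opening mark by looking one character backwards."""
    spans = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] != MARK:
            i += 1
            continue
        before = text[i - 1] if i > 0 else None  # the "scan backwards" step
        if not is_boundary(before):
            i += 1
            continue
        close = None
        for j in range(i + 2, n):
            after = text[j + 1] if j + 1 < n else None
            if text[j] == MARK and is_boundary(after):
                close = j
                break
        if close is None:
            i += 1
        else:
            spans.append(text[i + 1:close])
            i = close + 1
    return spans
```

Here some_identifier_ and this_will__match no longer produce a span,
while "a _word_ here" still does.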

In the future, I'll probably have a better solution. I've been writing
an intermediate data structure for storing parsed quickbook (not
intended to be a full ADT, just for storing, say, a table's id, title
and cells before writing out the table). That could store possible
start and end markers and then try to match them up once a phrase has
been parsed, and if they don't match up just write the symbols. This
would allow the contents to be parsed properly. I'm not exactly sure
how far I'd go with that though, I suspect it's probably for the best
that these markups can't contain square brackets.
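
A very rough sketch of that idea, in Python and with made-up names
(Token, tokenize, resolve), might look like the following. It is not
the data structure I've been writing, the boundary rule is an
assumption, and the <markup> tags are just placeholders for whatever
would really be written out. Possible start and end markers are
recorded while scanning and only paired up once the whole phrase has
been read; anything left unpaired is written as the plain symbol.

```python
import string
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    can_open: bool = False   # marker that could start a span
    can_close: bool = False  # marker that could end a span

def _boundary(ch):
    # None stands for start or end of the phrase (an assumption).
    return ch is None or ch.isspace() or ch in string.punctuation

def tokenize(phrase, mark="_"):
    """Split a phrase into text tokens and candidate marker tokens."""
    tokens, buf = [], []
    for i, ch in enumerate(phrase):
        if ch != mark:
            buf.append(ch)
            continue
        if buf:
            tokens.append(Token("".join(buf)))
            buf = []
        prev = phrase[i - 1] if i > 0 else None
        nxt = phrase[i + 1] if i + 1 < len(phrase) else None
        tokens.append(Token(
            mark,
            can_open=_boundary(prev) and not _boundary(nxt),
            can_close=_boundary(nxt) and not _boundary(prev),
        ))
    if buf:
        tokens.append(Token("".join(buf)))
    return tokens

def resolve(tokens):
    """Pair markers after the phrase is parsed; unpaired ones stay literal."""
    out = []
    open_at = None  # index in out of the pending opening marker
    for tok in tokens:
        if tok.can_open and open_at is None:
            open_at = len(out)
            out.append(tok.text)      # provisional: still a plain symbol
        elif tok.can_close and open_at is not None:
            out[open_at] = "<markup>"  # placeholder output, not boostbook
            out.append("</markup>")
            open_at = None
        else:
            out.append(tok.text)
    return "".join(out)
```

For example, resolve(tokenize("an _underlined phrase_ here")) wraps the
middle words in the placeholder tags, while some_identifier_ and
this_will__match come back unchanged.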

