Subject: Re: [Boost-docs] [quickbook] Direct code import...
From: Daniel James (dnljms_at_[hidden])
Date: 2013-01-10 12:55:59
On 10 January 2013 03:57, Rene Rivera <grafikrobot_at_[hidden]> wrote:
> On Wed, Jan 9, 2013 at 6:29 PM, Daniel James <dnljms_at_[hidden]> wrote:
>>
>> On 5 January 2013 21:27, Rene Rivera <grafikrobot_at_[hidden]> wrote:
>> > After having my Mac's drive get trashed I lost a set of notes I had on what
>> > needed to get finished for this. Does anyone have an idea on what needs to
>> > get done to complete this feature on boostbook-dev branch?
>>
>> I'm not sure. The quickbook-dev branch was completely merged to trunk,
>> but I reverted the glob feature because it didn't completely work. One
>> problem was that it uses a bitset for all the possible character
>> values, which is fine when using 8-bit characters, but not when using
>> 16-bit characters (on windows). There might have been others, I can't
>> remember.
>
>
> I don't see how the bitset would be a problem, as it's templated to adjust
> to the bit size of the character type.

That comes to 8K for a 16-bit character type (65,536 bits), which is
pretty excessive for matching a single character, and pretty big to
put on the stack. It's also recreated every time, so the bitset might
need to be set up multiple times when matching a single filename. It
should be easy to adapt globchars to match the character as it parses
the glob, so there's no need to build up a data structure.
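
A minimal sketch of matching a character while parsing, rather than filling a bitset first. It is not quickbook's code: the name `match_bracket` and the bracket syntax it accepts (`[abc]`, `[a-z]`, and a leading `!` or `^` for negation) are assumptions made for illustration. It should work for any character type because it only compares the candidate against each range bound as it reads the pattern, so no table sized by the character width is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not quickbook's actual code).
// Tests `ch` against the bracket expression starting at pattern[pos],
// which must be '['. If the expression is well formed, sets `matched`,
// moves `pos` just past the closing ']' and returns true. Returns false
// (leaving `pos` and `matched` untouched) if there is no closing ']'.
template <typename Char>
bool match_bracket(const std::basic_string<Char>& pattern,
                   std::size_t& pos, Char ch, bool& matched)
{
    assert(pos < pattern.size() && pattern[pos] == Char('['));
    std::size_t i = pos + 1;

    // Assumed syntax: a leading '!' or '^' negates the set.
    bool negate = false;
    if (i < pattern.size() &&
        (pattern[i] == Char('!') || pattern[i] == Char('^'))) {
        negate = true;
        ++i;
    }

    bool found = false;
    bool first = true;  // a ']' in first position is a literal
    while (i < pattern.size()) {
        Char lo = pattern[i];
        if (lo == Char(']') && !first) {
            pos = i + 1;
            matched = (found != negate);
            return true;
        }
        first = false;
        ++i;

        // "lo-hi" is a range unless the '-' is the last item before ']'.
        Char hi = lo;
        if (i + 1 < pattern.size() && pattern[i] == Char('-') &&
            pattern[i + 1] != Char(']')) {
            hi = pattern[i + 1];
            i += 2;
        }
        if (lo <= ch && ch <= hi) found = true;
    }
    return false;  // unterminated expression
}
```

The only state kept is a handful of locals, whatever the width of `Char`, so nothing has to be rebuilt for each filename character.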

But I really don't understand why you didn't use
'quickbook::path_to_generic' to convert filesystem paths to an 8-bit
representation.

> Hence it works for both wide and
> narrow character types. But obviously it has no idea about variable-sized
> encodings (UTF-*, etc.) and will misbehave in such cases. But there's
> nothing that can be done about that other than making the path strings
> returned from the filesystem lib be decoded correctly to the compiler's wide
> characters. In other words, it relies on `fs::path::generic_wstring` doing
> something sensible. Which I'm guessing from your further exposition below
> that it doesn't do something sensible?

The problem for Filesystem is that it's hard to say what "something
sensible" is. With HFS+ every file name is Unicode-normalised, so it
would make sense to always normalise. But on other file systems it's
possible to have two files with the same name in different
normalisation forms (this has happened to me when transferring files
between a Linux server and a Mac), so if Filesystem always normalises,
you won't be able to distinguish the two files. So neither choice
always works.
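
To make the two forms concrete, here is a small illustration (the variable and function names are made up for the example). The same visible name has two different UTF-8 byte sequences depending on whether the accented letter is precomposed or decomposed, and a byte-wise comparison treats them as different names.

```cpp
#include <string>

// "café.txt" with a precomposed U+00E9 (UTF-8: C3 A9).
const std::string cafe_precomposed = "caf\xC3\xA9.txt";

// "café.txt" as 'e' followed by combining acute U+0301 (UTF-8: CC 81).
const std::string cafe_decomposed = "cafe\xCC\x81.txt";

// Byte-wise comparison, which is what a plain string compare does.
inline bool same_name(const std::string& a, const std::string& b)
{
    return a == b;
}
```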

I also don't think you get proper conversion unless you use some sort
of codecvt, which I don't understand so I just did the conversions
myself.

The most important point for me is that quickbook is consistent on
different platforms. It's easy enough to do something like: a) require
that all characters in the pattern are <= 127, b) change ? to only match
characters that are <= 127 and c) change * to only match characters
that are <= 127. (b) and (c) could both trigger a warning when a match
is rejected. I thought (c) was optional, but having thought about it,
it is required because otherwise 'cafe*.txt' will match 'café.txt' on
HFS+ (where the name is stored decomposed, as 'cafe' followed by a
combining accent), but not elsewhere.
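
Rules (a) to (c) can be sketched as follows. This is only an illustration under stated assumptions: names are treated as UTF-8 byte strings, so any byte above 127 is non-ASCII; the function names are hypothetical; only `?` and `*` are handled (no bracket expressions); and a pattern that breaks rule (a) is simply rejected rather than reported with a warning.

```cpp
#include <cstddef>
#include <string>

// True for the 7-bit ASCII range (values 0..127).
inline bool is_ascii(char c)
{
    return static_cast<unsigned char>(c) <= 127;
}

// Hypothetical recursive matcher; '*' and '?' only consume ASCII.
inline bool match_from(const std::string& pattern, std::size_t p,
                       const std::string& name, std::size_t n)
{
    for (; p < pattern.size(); ++p, ++n) {
        char c = pattern[p];
        if (c == '*') {
            // (c) try every run of ASCII characters, shortest first
            for (std::size_t k = n;; ++k) {
                if (match_from(pattern, p + 1, name, k)) return true;
                if (k == name.size() || !is_ascii(name[k])) return false;
            }
        }
        if (n == name.size()) return false;
        if (c == '?') {
            if (!is_ascii(name[n])) return false;  // (b)
        } else if (c != name[n]) {
            return false;
        }
    }
    return n == name.size();
}

inline bool ascii_glob_match(const std::string& pattern,
                             const std::string& name)
{
    // (a) the pattern itself must be pure ASCII
    for (std::size_t i = 0; i < pattern.size(); ++i)
        if (!is_ascii(pattern[i])) return false;
    return match_from(pattern, 0, name, 0);
}
```

Under these rules 'cafe*.txt' rejects both the precomposed and the decomposed spelling of 'café.txt', so the result no longer depends on how the file system stores the name.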

> I would highly recommend getting the user mapping working. It makes for far
> more readable histories. Dave should have a usable mapping file at this
> point (I hope). So we could use that.

Well, I really want something I can start using now, and then transfer
new changes over to a better repo once svn2git (or whatever) has been
fully worked out. Here's what I did last night.

https://github.com/danieljames/quickbook-tmp/

It includes the branches from branches/quickbook (only the quickbook
parts) and branches/quickbook-dev. There are other branches elsewhere
that might be worth including in the final repo.