
From: Rob Boehne (robb_at_[hidden])
Date: 2024-04-25 15:25:04


From: Peter Dimov <pdimov_at_[hidden]>
Date: Thursday, April 25, 2024 at 9:53 AM
To: Rob Boehne <robb_at_[hidden]>, boost_at_[hidden] <boost_at_[hidden]>
Subject: RE: [boost] UUID design discussion
Rob Boehne wrote:
> * At the moment wide strings are processed by the name generators
> by converting every wchar_t to 32 bit, then hashing the bytes, zeroes
> and all. This doesn't strike me as correct. I think that the string should
> be converted to UTF-8 on the fly (with 16 bit wchar_t assumed UTF-16
> and 32 bit wchar_t assumed UTF-32.)
>
>
>
> To my thinking – a string should just be treated as binary data and it should
> not have its encoding changed – this should also make less work.

This behavior makes name UUIDs produced by e.g. "www.example.org" and
L"www.example.org" different, which is unlikely to be what one wants
in practice, and is against the recommendation of RFC 4122, which says

   o Convert the name to a canonical sequence of octets (as defined by
      the standards or conventions of its name space); put the name
      space ID in network byte order.

I don't think anyone can justify the choice of e.g. 0x41 0x00 0x00 0x00 as
the "canonical sequence of octets" for U"A".
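
To make the difference concrete, here is a minimal sketch of the two candidate
octet sequences for a UTF-32 name. It is not the Boost.Uuid implementation; the
helper names (append_utf8, name_octets_utf8, name_octets_raw32) are hypothetical,
the raw variant assumes a little-endian layout as in the 0x41 0x00 0x00 0x00
example above, and input validation (e.g. rejecting surrogate code points) is
omitted.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: append the UTF-8 encoding of one code point.
// Assumes cp is a valid Unicode scalar value (no validation is done).
inline void append_utf8(std::vector<unsigned char>& out, char32_t cp)
{
    if (cp < 0x80) {
        out.push_back(static_cast<unsigned char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<unsigned char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<unsigned char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<unsigned char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<unsigned char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<unsigned char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<unsigned char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<unsigned char>(0x80 | (cp & 0x3F)));
    }
}

// Octets fed to the hash if the name is converted to UTF-8 first.
inline std::vector<unsigned char> name_octets_utf8(const std::u32string& name)
{
    std::vector<unsigned char> out;
    for (char32_t cp : name) append_utf8(out, cp);
    return out;
}

// Octets fed to the hash if each 32 bit unit is hashed as-is
// (little-endian byte order assumed for illustration).
inline std::vector<unsigned char> name_octets_raw32(const std::u32string& name)
{
    std::vector<unsigned char> out;
    for (char32_t cp : name) {
        for (int shift = 0; shift < 32; shift += 8)
            out.push_back(static_cast<unsigned char>((cp >> shift) & 0xFF));
    }
    return out;
}
```

With the UTF-8 variant, U"www.example.org" yields the same octets as the narrow
string "www.example.org", so both would hash to the same name UUID; with the
raw variant every character contributes three extra zero bytes.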

Ok – I withdraw my comment.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk