Subject: Re: [boost] [locale] Review part 1: headers
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-09 07:38:19
> > I don't think neither ICU nor Boost supports
> > non-32 bit platform.
> > I understand that strictly speaking it is non-standard
> > but de-facto there are no problems with this and this
> > assumption is quite safe.
> Since this assumption arbitrarily limits the library to particular
> platforms and it is trivial to avoid relying it, I see no point in doing so.
As I told you I can fix it, but practically it does
not give any advantage.
Give me a name of modern compiler that supports C++03
and has sizeof(unsigned) < 4?
> > I can change this to uint32_t
> >> line 131: Please be consistent about using unsigned vs. uint32_t.
> > Good point
> >> line 140: shouldn't offset be std::size_t?
> >> Why should the size of the string be limited
> >> to 4GB? std::size_t is guaranteed to be
> >> large enough to handle anything that can
> >> be stored in memory.
> > You are right, I'll fix this.
> > However, two points:
> > 1. I already know compilers were size_t != sizeof(void *)
> > so actually it must be uintptr_t... but it is not in C++03
> > but rather in C99. (it is in C++0x)
> 1) You work with contiguous memory, so size_t is the correct one, not
size_t is used in boost as it is closest way to use something
in C++03 standard that is similar to uintptr_t.
That is way C99 and C++03 created uintptr_t.
In any case as I told I'll change this for better consistency/
> 2) uintptr_t should ideally be available from boost/cstdint.hpp (it
> doesn't seem to be ATM, that's a bug)
> 3) I don't know of compilers where those are different, mind saying
> which ones exhibit this?
As I said before I prefer to relay on native types rather
then boost's types especially if it is feasible.
As I told above I'll change this were necessary but
it is just for "code-beauty".
> > 2. There is no reason on planet earth you need to
> > create boundary mapping for text>= 4G!
> > Even entire War& Peace book takes about 3MB.
> > So if you do this you are probably doing
> > something very-very wrong.
> > But this is different story.
> Another unnecessary arbitrary limit.
> Why would you want to prevent people from treating very large data with
> your library?
> On the Internet, there is text much bigger than War & Peace, even if not
> necessarily in the form of books.
As I told above I'll change it to size_t but...
You don't want to run boundary analysis on such chunks of text
in any case - and if you do it, you are doing something wrong.
> >> Why do you have explicit specializations of
> >> boundary_indexing? They're all the same.
> >> The compiler is perfectly capable of instantiating
> >> them all from a single definition.
> > Ok this is something that you'll see all over the Boost.Locale
> > code when it comes to deal with facets generations.
> > It is in order to support (brain damaged) DLL platform.
> Both MSVC and GCC support extern template, which is also a C++0x feature.
Believe me, it was only feasible solution I've reached after
spending many man hours fighting DLL's, std::locale facets and
(AFAIR MSVC uses specialization as well for its own facets)
So after fighting both compilers and spending days on solving and tuning
this problem I can say this is what should be done.
It requires some good experience with std::locale facets, DLLs
exporting, and having specific tests.
So what I did is unfortunately required and most reliable solution.
> > Yes... However all supported platforms are at least 32bits.
> > There is one small points. As far as I remember I
> > always used native types over strict types in the interfaces
> > (function calls) because if it is changed between
> > small releases (like stdint.h found) it would not
> > break binary compatibility so native types (as long
> > as they good enough) are better alternative.
> > (We are talking about C++ compilers that does not
> > handle C99 stdint well)
> Boost has a portable stdint equivalent.
I know and use it, see the description above.
> In any case, you shouldn't even use uint32_t, but uint_least32_t, or
> better yet, char32_t if it is available.
In any case I use uint32_t or uint16_t when I do relate to handling
of UTF-16 or UTF-32
> boost/cuchar.cpp in my Unicode library shows the right way to define the
> types Unicode needs to work with.
Same link as above, typedef of uint16_t or uint32_t is unsuitable for
See your Unicode library is orthogonal to Boost.Locale, it
handles things very different and has different purpose.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk