Subject: Re: [boost] [locale] Review part 1: headers
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-10 08:04:10

Next message: Artyom: "Re: [boost] [locale] review part 2.1: source"
Previous message: Mathias Gaunard: "Re: [boost] [locale] review part 2.1: source"
In reply to: Mathias Gaunard: "Re: [boost] [locale] Review part 1: headers"
Next in thread: Steven Watanabe: "Re: [boost] [locale] Review part 1: headers"

> >
> > - Is ICU compilable for a DSP?
> > - Does their standard C++ library support anything
> > beyond the C and POSIX locales?
> >
>
> You wouldn't want to do text processing on DSPs anyway.
>

Indeed.

>
> > See... Where I need 32 bits I use them; however, I need
> > this enum to have 20 bits, and this is a
> > practical, not a theoretical, issue.
> >
> > So 16-bit platforms will not be supported for now.
> > I'm really trying to solve practical issues.
>
> Your library extends the standard locale system, it would be great
> if it was of good enough quality that it could work with any
> standard-conforming compiler, especially if you wanted someday to
> integrate this within the C++ standard itself.
>

Let me explain briefly.

I definitely have no problem using uint32_t where it belongs, and I'll
fix the places I missed.

However, I have an enum that acts as a bitmask and is designed
so that it can be split.

For example, if ICU's boundary analysis someday supports marking
hiragana and katakana separately, I would like to be able to split the
mask into two portions without affecting ABI compatibility.

This is very important for the other purpose I developed
Boost.Locale for (CppCMS); without it Boost.Locale would not
exist at all.
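
A minimal sketch of what such a splittable mask could look like. The
names and bit ranges below are assumptions made for illustration, not
necessarily the values Boost.Locale uses. Each category reserves a range
of bits, so a range can later be divided into sub-masks while a test
against the old, wider mask keeps matching.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: names and bit ranges are assumptions for this sketch.
typedef std::uint32_t rule_type;

// Each category owns a 4-bit range, so five categories need 20 bits,
// which does not fit into a 16-bit integer.
const rule_type word_none   = 0x0000F;
const rule_type word_number = 0x000F0;
const rule_type word_letter = 0x00F00;
const rule_type word_kana   = 0x0F000; // hiragana and katakana together
const rule_type word_ideo   = 0xF0000;

// A possible future split of the kana range into two sub-masks.
const rule_type word_hiragana = 0x03000;
const rule_type word_katakana = 0x0C000;

// True if a boundary mark has any bit inside the given mask.
inline bool matches(rule_type mark, rule_type mask)
{
    return (mark & mask) != 0;
}

// Sanity checks: the two halves cover exactly the old range
// and do not overlap each other.
inline void check_split()
{
    assert((word_hiragana | word_katakana) == word_kana);
    assert((word_hiragana & word_katakana) == 0);
}
```

With this layout, a caller that only knows word_kana still matches a
mark tagged with either of the narrower values, which is the property
the split is meant to preserve.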

Of course, if it becomes standardized it would not need exactly
the same values or exactly the same mask; it can always
be made much smaller and simpler.

So I don't see a real-life reason why I should limit
myself to 16 bits when 90% of Boost libraries would likely
fail on a 16-bit platform anyway.

I totally understand what you are saying, and I agree that
you should not insert artificial limitations into the
code.

However, in this particular case IMHO such an assumption is fine and justified.

>
>
> > OpenVMS's C++ compiler from HP: sizeof(size_t)=4 but sizeof(void *)=8
> > in 64-bit mode...
> >
>
> Good to know; quite surprising.
>
> I would consider OpenVMS somewhat of an obsolete platform though, but
> that's just me.
>

I rather consider it a bug that is unlikely to be fixed.

>
>
> > No, because when you deal with text there
> > are real-world assumptions you can make: there is no
> > single chunk of text written by man of size >= 4GB...
>
> I've got IRC and IM logs that easily go beyond 4GB for a single
> channel or discussion. Are you suggesting those aren't text and
> I shouldn't use your library when I want to perform analysis of that text?

But if it is an IRC log, you most likely have natural (IM-specific)
markers to split the log on in the first place, and only then
apply boundary analysis to each message, just as you
would not apply boundary analysis to a whole XML or HTML file
but rather to the tag-less content of block-level items.
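
A rough sketch of that "split first, analyse second" approach. It
assumes one message per line, and count_tokens is a hypothetical
stand-in for the real per-message boundary analysis; neither name comes
from Boost.Locale or ICU.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-message analysis: counts whitespace-separated tokens.
// A real program would run boundary analysis on the message here.
std::size_t count_tokens(const std::string& message)
{
    std::istringstream in(message);
    std::string token;
    std::size_t n = 0;
    while (in >> token)
        ++n;
    return n;
}

// Reads the log one message at a time (assumed: one message per line),
// so no single analysed chunk ever approaches the size of the whole log.
std::vector<std::size_t> analyse_log(std::istream& log)
{
    std::vector<std::size_t> counts;
    std::string line;
    while (std::getline(log, line)) {
        if (line.empty())
            continue; // skip blank separator lines
        // getline strips the delimiter, so a message never spans lines.
        assert(line.find('\n') == std::string::npos);
        counts.push_back(count_tokens(line));
    }
    return counts;
}
```

Only one message is held in memory at a time, so the total size of the
log does not matter to the analysis step.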

Each tool should fit its use. You can use a Cray supercomputer
to play Solitaire or an Android cellphone to analyse the
weather...

The question is: should you really do this?

Also, I'm not so sure that ICU would actually be able to handle
anything beyond this.

>
> > In any case I suggest to drop it because I'll change it to size_t
> >
> > Read the second bullet in the link above. You can't create std::locale
> > facets for arbitrary types.
> >
> > In fact, because GCC's libstdc++ does not specialize them for
> > char16_t/char32_t (library bug) I can't create such facets as I get
> > undefined references.
>
> Yes, locale facets only work with char or wchar_t, but I don't
> see how using "unsigned int" (or whatever you meant by native types)
> is going to help you there...

I probably don't understand you. All I am saying is that Boost.Locale
does not use special character types: it uses only char and wchar_t for
characters, and only when compilers are ready for char16_t and char32_t
will it be possible to use them as well.

I was talking about characters, not integer values.

In any case... I think UTF-16 and UTF-32 should fade away, but that
is another story.
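
To illustrate the point about character types, here is a small sketch
using only the standard library. The helper name collate_compare is made
up for this example. It relies on std::collate being provided for char
and wchar_t, which is what the standard requires.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Compares two strings with the collate facet of the given locale.
// The standard guarantees this facet for CharT = char and wchar_t.
template <typename CharT>
int collate_compare(const std::basic_string<CharT>& a,
                    const std::basic_string<CharT>& b,
                    const std::locale& loc = std::locale::classic())
{
    assert(std::has_facet<std::collate<CharT> >(loc));
    const std::collate<CharT>& coll =
        std::use_facet<std::collate<CharT> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}
```

Instantiating the same template with char16_t or char32_t is exactly
where a standard library that lacks those facet specializations would
fail.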

Best,
Artyom

