Boost :

Date view	Thread view	Subject view	Author view

Subject: Re: [boost] [locale] Review results for Boost.Locale library
From: Ryou Ezoe (boostcpp_at_[hidden])
Date: 2011-04-25 00:13:23

Next message: Steve M. Robbins: "[boost] How to use ASIO without SSLv2?"
Previous message: Phil Bouchard: "Re: [boost] [Memory Managed Pointer] Review Request"
In reply to: Artyom: "Re: [boost] [locale] Review results for Boost.Locale library"
Next in thread: Ryo IGARASHI: "Re: [boost] [locale] Review results for Boost.Locale library"

On Mon, Apr 25, 2011 at 6:04 AM, Artyom <artyomtnk_at_[hidden]> wrote:
>> From: Ryou Ezoe <boostcpp_at_[hidden]>
>
>> Number and Date formatting:
>> There are so many possible ways to express numbers.
>> Some people want comma separation by 3 digits, others want 4 digits.
>> Some want it to be 100万 (万 means 10000), some want 百万 (百 means 100).
>> Formatting based on locale doesn't work because there is no uniform format.
>>
>
> Have you actually read the manuals?
>
> This is the output of :
>
>   std::cout << bl::format("{1}\n{1,num}\n{1,spell}\n") % 1000000;
>
> in ja_JP.UTF-8 locale
>
>   1000000
>   1,000,000
>   百万
>
> Not so bad, is it?
Not bad.
Still, I doubt anybody wants to use Boost.Locale just for that.

>
>
>> Collation and Conversions:
>> Japanese doesn't have concepts of case and accent.
>> Since we don't have these concepts, we never need it.
>>
>
> Irrelevant. Even when this feature is not required
> for CJK, it is required elsewhere, like many other things (spaces,
> plural forms for other languages).
>
>> Boundary analysis:
>> What is the definition of boundary and how does it analyse?
>> It sounds too smart for such a small thing it actually does.
>> I'd rather call it strtok with hard-coded delimiters.
>> Japanese doesn't separate words by spaces.
>> So unless we perform really complicated natural language
>> processing (which is impossible to be perfect since we never have
>> a complete Japanese dictionary),
>> we can't split Japanese text by words.
>
> Ok this is word splitting
>
>   |私|は|日本|の|東京都|に|住|んでいます|。|私|は|大|きな|家|に|住|んでいます|。
>
> of the text:
>
>   私は日本の東京都に住んでいます。私は大きな家に住んでいます。

To me, it looks like splitting by contiguous runs of kana and kanji.
I don't think I ever need that kind of splitting.

>
> I assume it is not perfect, and I don't know Japanese well enough to
> say, but I can see at least words like:
>
>   私 - I
>   日本 - Japan
>   東京都 - City of Tokyo
>
> But this is not only defined by "space-based" separation.
> Also for some languages like Thai ICU uses dictionaries.
>
> So it is not a naive algorithm that separates text by
> spaces.
>
>> Also, Japanese doesn't have a concept of word wrap.
>> So "find appropriate places for line breaks" is unnecessary.
>> Actually, there are some rules for line break in Japanese.
>> These rules are too complicated and it requires more than text processing.
>> Same for Chinese and Korean.
>
> This is possible line-break separation of the same sentences above.
>
>
> |私|は|日|本|の|東|京|都|に|住|ん|で|い|ま|す。|私|は|大|き|な|家|に|住|ん|で|い|ま|す。|
>
> At least I can see that it does not allow a line to start with "。".
We have a lot of characters that should not be the initial character of a line.
But there is no uniform rule.
And it must work together with font rendering.
Simple text processing doesn't suffice.

>
>
>>
>> Of course, strtok is still a handy tool and I appreciate yet another design.
>> But I think it's better handled by a more generic library, like Boost
>> String Algorithms.
>>
>
> It is far more complicated than strtok.
>
> Bottom line: I see that you haven't really tried
> to use this library or understand how it
> works.
>
> I'm sorry, but it makes me doubt the review
> you sent.
>
> Artyom
> _______________________________________________
> Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

-- 
Ryou Ezoe


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk