Subject: Re: [boost] GSoC Unicode library: second preview
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2009-06-22 10:00:01


Mathias Gaunard wrote:
> On another note, while I do think IF_LIKELY for UTF-16 is a good idea,
> doesn't that heavily penalize certain scripts, such as asian ones, in
> the case of UTF-8?

Not really:

- In many cases, documents that use an exotic script actually contain
large numbers of ASCII characters; consider an HTML page, for example,
which will be full of HTML punctuation and tags. (I believe that I
became aware of this after reading something written by a Mozilla
person who had been investigating Unicode issues.)

- The penalty of a wrong branch hint is not "heavy". We probably have
lots of places in our code where the compiler heuristic is wrong, but
we don't notice until we study it very carefully (as I did with this
UTF-8 code). This is why processors still need to implement dynamic
branch prediction.

My normal policy for using compiler branch hints like IF_LIKELY is to
compile once with profile-driven optimisation, and then to find the
places where it made a significant difference and add branch hints. I
then get close to the profile-driven-optimised performance without
needing to actually re-do the profiling.
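
As an illustration of the kind of hint discussed above, here is a minimal
sketch. It assumes GCC's __builtin_expect; the IF_LIKELY definition shown
and the function name count_code_points are hypothetical and are not taken
from the UTF-8 code under review. The ASCII case is marked as the likely
branch, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical definition of IF_LIKELY: a hinted `if` where
// __builtin_expect is available, a plain `if` elsewhere.
#if defined(__GNUC__)
#  define IF_LIKELY(cond) if (__builtin_expect(!!(cond), 1))
#else
#  define IF_LIKELY(cond) if (cond)
#endif

// Counts the code points in a UTF-8 string by stepping over each
// sequence according to its lead byte. Assumes well-formed input;
// no validation is performed.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        IF_LIKELY (c < 0x80) {
            i += 1;   // ASCII: the branch hinted as likely
        } else if (c < 0xE0) {
            i += 2;   // two-byte sequence
        } else if (c < 0xF0) {
            i += 3;   // three-byte sequence
        } else {
            i += 4;   // four-byte sequence
        }
        ++n;
    }
    return n;
}
```

The hint only influences how the compiler lays out the code. Text that is
mostly multi-byte still takes the other branches correctly, and the
processor's dynamic branch prediction adapts at run time.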

Regards, Phil.

