Subject: Re: [boost] RFC: interest in Unicode codecs?
From: James Mansion (james_at_[hidden])
Date: 2009-07-18 07:21:42
Cory Nelson wrote:
>> I finally found some time to do some optimizations of my own and have
>> had some good progress using a small lookup table, a switch, and
>> slightly deducing branches. See line 318:
>>
>> http://svn.int64.org/viewvc/int64/snips/unicode.hpp?view=markup
>>
>> Despite these efforts, Windows 7 still decodes UTF-8 three times
>> faster (~750MiB/s vs ~240MiB/s on my Core 2). I assume they are either
>> using some gigantic lookup tables or SSE.
How much cost are you incurring in the tests for whether the traits
indicate that the error returns are valid?
I'm wondering if there is a case for requiring that these be compile-time
constants in the Traits class rather than flags in a Traits value.
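To illustrate what I mean by compile-time constants, here is a minimal
sketch. The names strict_traits, lenient_traits, check_errors and
decode_two_byte are made up for this example and are not taken from
unicode.hpp; it handles only two-byte sequences and skips the overlong
check, so it is not a complete decoder.

```cpp
// Hypothetical traits classes: the error policy is a static constant,
// not a flag stored in a traits object.
struct strict_traits  { static constexpr bool check_errors = true;  };
struct lenient_traits { static constexpr bool check_errors = false; };

// Decodes one two-byte UTF-8 sequence starting at p into out.
// Returns false only when validation is enabled and the bytes are malformed.
template <class Traits>
bool decode_two_byte(const unsigned char* p, char32_t& out)
{
    // The condition is a constant expression, so for lenient_traits the
    // optimizer can drop the whole validation block.
    if (Traits::check_errors) {
        if ((p[0] & 0xE0) != 0xC0 || (p[1] & 0xC0) != 0x80)
            return false;
    }
    out = static_cast<char32_t>(((p[0] & 0x1F) << 6) | (p[1] & 0x3F));
    return true;
}
```

With a runtime flag the test stays in the inner loop on every code point;
with a static constant each instantiation carries only the checks it asked
for.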
And why is 'last' passed in to decode_unsafe?
Is there any indication that Duff's device will prevent aggressive
inlining? I'm assuming you need this method to be fully inlined into the
outer loop, and maybe it's not happening. Ideally you'd want some loop
unrolling too.
I suspect that, as noted, the lack of a special case for largely 7-bit
ASCII input will tend to make it slow on most Western texts, though
speedups for the multi-character case will need care on
alignment-sensitive hardware: you'll need to fix that in the outermost
loop.
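A rough sketch of the kind of ASCII special case I have in mind follows.
The function name ascii_prefix_length is hypothetical. Note that it
sidesteps the alignment problem by loading each 8-byte word through
std::memcpy rather than by aligning the pointer in the outer loop, which is
a different approach from the one described above; whether it beats an
aligned loop on a given target would need measuring.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns the number of leading bytes in [p, p + n) that are 7-bit ASCII.
// A decoder could copy or widen that prefix directly and fall back to the
// general multi-byte path for the rest.
std::size_t ascii_prefix_length(const unsigned char* p, std::size_t n)
{
    std::size_t i = 0;

    // Test eight bytes at a time: any byte >= 0x80 sets a high bit.
    while (n - i >= 8) {
        std::uint64_t word;
        std::memcpy(&word, p + i, sizeof word);  // safe for unaligned p
        if (word & 0x8080808080808080ULL)
            break;
        i += 8;
    }

    // Finish (or locate the first non-ASCII byte) one byte at a time.
    while (i < n && p[i] < 0x80)
        ++i;

    return i;
}
```

The outer decode loop would call this first, handle the ASCII run in bulk,
then decode a single multi-byte sequence and try the fast path again.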
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk