Subject: Re: [boost] RFC: interest in Unicode codecs?
From: Cory Nelson (phrosty_at_[hidden])
Date: 2009-07-18 11:03:08

On Sat, Jul 18, 2009 at 6:21 AM, James Mansion <james_at_[hidden]> wrote:
> Cory Nelson wrote:
>>> I finally found some time to do some optimizations of my own and have
>>> had some good progress using a small lookup table, a switch, and
>>> slightly reducing branches.  See line 318:
>>> Despite these efforts, Windows 7 still decodes UTF-8 three times
>>> faster (~750MiB/s vs ~240MiB/s on my Core 2).  I assume they are either
>>> using some gigantic lookup tables or SSE.
> How much cost are you incurring in the tests for whether the traits
> indicate that the error returns are valid?

I've played around with it and have not noticed any significant
difference for this.

> I'm wondering if there is a case for requiring that these be compile-time
> constants in the Traits class rather than flags in a Traits value.
> And why is 'last' passed in to decode_unsafe?

Leftover from copy-paste, good catch.
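
On the compile-time constants suggestion quoted above, here is a minimal
sketch of what that could look like. It is not the code under discussion:
strict_traits, lenient_traits, report_errors and decode_two_byte are all
hypothetical names, and only the two-byte UTF-8 case is handled. The point is
that a flag read from the Traits type is a constant the optimizer can fold,
so the validation branches disappear entirely for the lenient instantiation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical traits types: the flag lives in the type, not in a runtime value.
struct strict_traits  { static constexpr bool report_errors = true;  };
struct lenient_traits { static constexpr bool report_errors = false; };

// Hypothetical decoder for a two-byte UTF-8 sequence starting at p.
// Returns false only when Traits asks for error reporting and the input is bad.
template <typename Traits>
bool decode_two_byte(const unsigned char* p, std::uint32_t& out)
{
    assert(p != nullptr);

    // Compile-time constant condition: dead code for lenient_traits.
    if (Traits::report_errors) {
        const bool lead_ok  = (p[0] & 0xE0) == 0xC0; // 110xxxxx
        const bool trail_ok = (p[1] & 0xC0) == 0x80; // 10xxxxxx
        if (!lead_ok || !trail_ok)
            return false;
    }

    out = ((p[0] & 0x1Fu) << 6) | (p[1] & 0x3Fu);

    // Reject overlong encodings (values that fit in one byte) when strict.
    if (Traits::report_errors && out < 0x80)
        return false;

    return true;
}
```

With a runtime Traits value the same checks would need a load and a branch per
code point unless the compiler can prove the flag never changes.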

> Is there any indication that duff's device will prevent aggressive inlining?

I have looked at the output of both GCC 4.4 and VC++ 2008 with
optimization flags cranked up. Both generate inlined code exactly
how I want.

> I'm assuming you need this method to be fully inlined into the outer loop,
> and maybe it's not happening - ideally you'd want some loop unrolling too.
> I suspect that, as noted, the lack of a special case for largely 7-bit ASCII
> input will tend to make it slow on most Western texts, though speedups for
> the multi-character case will need care on alignment-sensitive hardware:
> you'll need to fix that in the outermost loop.

Indeed. I haven't done this because the code uses iterators, but I
think some small specializations could be made to enable this in
transcode() when the input is a raw pointer.
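
A rough sketch of the kind of raw-pointer fast path meant here, assuming
UTF-8 in and UTF-32 out. copy_ascii_prefix is a hypothetical helper, not
something in the current code. It tests eight bytes at a time for a set high
bit and loads the word with memcpy, so it makes no alignment assumptions on
the input pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: widens the leading run of 7-bit ASCII bytes in
// [first, last) into out and returns how many bytes were consumed.
// The caller falls back to the general decoder for whatever remains.
inline std::size_t copy_ascii_prefix(const unsigned char* first,
                                     const unsigned char* last,
                                     std::uint32_t* out)
{
    assert(first <= last);
    const unsigned char* p = first;

    // Word-at-a-time: any byte >= 0x80 sets a bit in the mask.
    while (last - p >= 8) {
        std::uint64_t word;
        std::memcpy(&word, p, sizeof word); // safe for unaligned input
        if (word & 0x8080808080808080ULL)
            break;
        for (int i = 0; i < 8; ++i)
            out[i] = p[i];
        p += 8;
        out += 8;
    }

    // Byte-at-a-time for the tail, or up to the first non-ASCII byte.
    while (p != last && *p < 0x80)
        *out++ = *p++;

    return static_cast<std::size_t>(p - first);
}
```

transcode() could call something like this whenever the decoder is sitting on
a code point boundary and resume the normal loop at first + the returned count.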

One thing I have been trying is in that decode_unsafe. It has fewer
branches overall and compiles down to the optimal assembly I'd expect.
For some reason, it runs slower. No clue why!

Cory Nelson
