It turns out that char_separator shamelessly constructs std::strings under the covers, so I gained something, but not as much as I had hoped. The split algorithm you mention requires a container to store the results, so it still costs at least one allocation, correct?

Frustrating! In theory one should be able to parse a sequence of tokens without constructing or copying any strings.
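
A minimal sketch of that idea, assuming a single-byte ASCII separator and a raw UTF-8 buffer. TokenView, next_token and count_tokens are hypothetical names made up for illustration; this is not Boost code. The token is only a pair of pointers into the input, so nothing is constructed or copied per token. Splitting on a byte is safe here because an ASCII byte never occurs inside a multi-byte UTF-8 sequence.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical token type: a non-owning view into the input buffer.
struct TokenView {
    const unsigned char* first = nullptr;
    const unsigned char* last = nullptr;

    // Same shape as the assign(first, last) call a tokenizer makes on its
    // token type, but it only stores the two pointers.
    void assign(const unsigned char* b, const unsigned char* e) {
        first = b;
        last = e;
    }

    std::size_t size() const { return static_cast<std::size_t>(last - first); }
};

// Finds the next non-empty token in [pos, end) and stores it in tok.
// On success pos is left just past the token. Returns false when the
// rest of the buffer holds no token.
inline bool next_token(const unsigned char*& pos, const unsigned char* end,
                       unsigned char sep, TokenView& tok) {
    assert(pos <= end);
    while (pos != end && *pos == sep) ++pos;  // skip separators (drops empty tokens)
    if (pos == end) return false;
    const unsigned char* start = pos;
    while (pos != end && *pos != sep) ++pos;  // scan to the end of this token
    tok.assign(start, pos);
    return true;
}

// Example use: walk the whole buffer and count tokens without allocating.
inline std::size_t count_tokens(const unsigned char* buf, std::size_t len,
                                unsigned char sep) {
    const unsigned char* pos = buf;
    const unsigned char* end = buf + len;
    TokenView tok;
    std::size_t n = 0;
    while (next_token(pos, end, sep, tok)) ++n;
    return n;
}
```

Each TokenView could then be handed directly to the UTF-8 to UTF-16 conversion, so the only allocation left is the one made by the output string.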

Florin.

On Wed, Mar 26, 2008 at 12:54 AM, Pavol Droba <droba@topmail.sk> wrote:
Hi,

Why don't you just use the split algorithm in the StringAlgo library?

http://www.boost.org/doc/html/string_algo/usage.html#id1638440


Regards,
Pavol.

Florin Trofin wrote:
> Hi,
>
>
> I've been using the boost tokenizer successfully in the past and I've
> been quite happy with it. I was using it with std::string as my token
> type, but now I need to use it differently because of performance
> reasons (the input string is a raw UTF8 buffer (const unsigned char*)
> and output is a specific UTF16 string class). So I thought: maybe I can
> just tokenize the unsigned char buffer in place using
> boost::iterator_range<const unsigned char*> as my token type.
>
> And it almost worked! With a hack:
>
> the tokenizer attempts to call assign on my TokenType but
> boost::iterator_range doesn't have such member function. I created a
> wrapper class that simply delegates to the iterator_range's assignment
> operator and it now works!
>
> This is great because I have no more useless string constructions: I can
> go directly from a raw UTF8 buffer to my output string type (UTF16
> based) with only one conversion and no extra allocations! I still have
> the nice syntax of boost tokenizer and the maximum efficiency!
>
> I think this solution should be mentioned in the tutorial docs because
> it might not be obvious for everybody. Also, maybe we can eliminate the
> hack I did by adding an assign() to the boost range interface (this
> seems simpler to me than modifying the tokenizer to not call assign).
>
> Thanks for the great work you guys put into this library!
>
>
> Best regards,
>
>
> Florin.
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Boost-users mailing list
> Boost-users@lists.boost.org
> http://lists.boost.org/mailman/listinfo.cgi/boost-users