Subject: Re: [boost] [beast] Formal review
From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2017-07-10 15:15:42


On Mon, Jul 10, 2017 at 7:47 AM, Artyom Beilis via Boost
<boost_at_[hidden]> wrote:
> You can use utf_traits
> ...
> http://www.boost.org/doc/libs/1_64_0/libs/locale/doc/html/structboost_1_1locale_1_1utf_1_1utf__traits.html
>
> decode does the job.

UTF-8 validation of text websocket message payloads is a critical
bottleneck. The best websocket implementations apply significant
optimizations to this operation, recognizing that the vast majority of
inputs, JSON data for example, are low-ASCII (every code point encodes
to a single byte). The code in Beast was developed, in collaboration
with Ripple employees, as a high-performance UTF-8 validation function.
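
To illustrate the kind of optimization meant here, the following is a
minimal sketch of a validator with an ASCII fast path. It is not the
code in Beast; the name is_valid_utf8 is hypothetical, and the 8-byte
word test is just one common way to skip runs of low-ASCII input. The
slow path follows the well-formedness rules of RFC 3629 (no overlong
forms, no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical sketch, not Beast's implementation.
// Returns true if [p, p + n) is well-formed UTF-8 per RFC 3629.
bool is_valid_utf8(const unsigned char* p, std::size_t n)
{
    const unsigned char* const end = p + n;
    while (p < end)
    {
        // Fast path: load 8 bytes and test all high bits at once.
        // If none is set, the whole word is ASCII and can be skipped.
        while (end - p >= 8)
        {
            std::uint64_t w;
            std::memcpy(&w, p, 8);
            if (w & 0x8080808080808080ULL)
                break;
            p += 8;
        }
        if (p == end)
            break;

        // Slow path: validate exactly one code point.
        const unsigned char c = *p;
        if (c < 0x80) { ++p; continue; }

        std::size_t len;
        unsigned char lo = 0x80, hi = 0xBF; // allowed range of 2nd byte
        if (c >= 0xC2 && c <= 0xDF)
            len = 2;
        else if (c >= 0xE0 && c <= 0xEF)
        {
            len = 3;
            if (c == 0xE0) lo = 0xA0; // reject overlong 3-byte forms
            if (c == 0xED) hi = 0x9F; // reject UTF-16 surrogates
        }
        else if (c >= 0xF0 && c <= 0xF4)
        {
            len = 4;
            if (c == 0xF0) lo = 0x90; // reject overlong 4-byte forms
            if (c == 0xF4) hi = 0x8F; // reject values above U+10FFFF
        }
        else
            return false; // stray continuation byte, 0xC0, 0xC1, 0xF5..0xFF

        if (static_cast<std::size_t>(end - p) < len)
            return false; // truncated sequence
        if (p[1] < lo || p[1] > hi)
            return false;
        for (std::size_t i = 2; i < len; ++i)
            if (p[i] < 0x80 || p[i] > 0xBF)
                return false;
        p += len;
    }
    return true;
}

// Convenience overload for std::string input.
inline bool is_valid_utf8(const std::string& s)
{
    return is_valid_utf8(
        reinterpret_cast<const unsigned char*>(s.data()), s.size());
}
```

For a typical JSON payload almost every iteration stays in the inner
loop, so the cost per byte is a fraction of a load and a mask test.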

Preliminary tests indicate that switching from Beast's well-optimized
and well-tested UTF-8 validation function to Boost.Locale would incur
a 67,400% performance penalty. I hope you understand that I might be
reluctant to switch.

Benchmark results:

beast.benchmarks.utf8_checker
beast: 2,016,637,738 char/s
beast: 1,921,062,599 char/s
beast: 1,939,159,018 char/s
locale: 3,053,539 char/s
locale: 2,989,265 char/s
locale: 3,060,962 char/s
Longest suite times:
   17.8s beast.benchmarks.utf8_checker
17.8s, 1 suite, 1 case, 1 test total, 0 failures
The program '[75300] benchmarks.exe' has exited with code 0 (0x0).
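
The post does not say how the 67,400% figure was derived from these
numbers. As a rough cross-check, and assuming it compares the fastest
beast run with the slowest locale run, the ratio works out to about
674x; comparing the means of the three runs instead gives about 645x.
The helper names below are hypothetical.

```cpp
// Hypothetical helpers for cross-checking the benchmark figures.

// How many times faster `fast` is than `slow` (both in char/s).
double speedup(double fast, double slow)
{
    return fast / slow;
}

// Arithmetic mean of three runs.
double mean3(double a, double b, double c)
{
    return (a + b + c) / 3.0;
}

// Fastest beast run against slowest locale run, from the log above.
double best_vs_worst()
{
    return speedup(2016637738.0, 2989265.0);
}

// Mean beast throughput against mean locale throughput.
double mean_vs_mean()
{
    return speedup(mean3(2016637738.0, 1921062599.0, 1939159018.0),
                   mean3(3053539.0, 2989265.0, 3060962.0));
}
```

Either way the gap is between two and three orders of magnitude.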

Code:
<https://github.com/vinniefalco/Beast/commit/3df7de8ce2e8f797722118b9d751266241a8266e>

