Subject: [Boost-users] A library for "string chains"?
From: Richard (legalize+jeeves_at_[hidden])
Date: 2014-08-08 23:44:33


[Please do not mail me a copy of your followup]

For a motivating example, see this gist:
<https://gist.github.com/LegalizeAdulthood/7b67968bd93fbd4f9dbb>

It uses boost::iostreams::mapped_file to map a mail message into memory
and then proceeds to parse it. This is simply an example, but I think this
hacked up parser comes fairly close to handling the full RFC2822
message, ignoring MIME extensions. <https://tools.ietf.org/html/rfc2822>
I hacked it up from memory of mail message format rules and a single
example message, so don't consider it production quality :-).

The intention here is to parse a file without copying its input text
into any buffers. Notice that this parser builds structure from a
buffer by identifying interesting substrings as (b,e) pointer pairs.
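
To make the (b,e) idea concrete, here is a minimal sketch. It is not the
code from the gist: the names span, header_field and parse_headers are
made up for illustration, and the parser is deliberately simplified (no
unfolding of continuation lines). It assumes the message is already in
memory as a contiguous character range, for example the data() and
size() of a mapped file.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// A read-only substring identified by a (b,e) pointer pair into a
// larger buffer. The buffer must outlive every span that points into it.
struct span {
    const char* b;
    const char* e;
    std::size_t size() const { return static_cast<std::size_t>(e - b); }
    // Copies the characters; meant for display or testing only.
    std::string str() const { return std::string(b, e); }
};

// One "Name: value" header line, both parts pointing into the buffer.
struct header_field {
    span name;
    span value;
};

// Scan [b,e) for header lines up to the first empty line.
// Accepts CRLF or bare LF line endings. No characters are copied.
std::vector<header_field> parse_headers(const char* b, const char* e)
{
    std::vector<header_field> fields;
    while (b != e) {
        const char* eol = std::find(b, e, '\n');
        const char* line_end = eol;
        if (line_end != b && line_end[-1] == '\r')
            --line_end;                      // drop the CR of a CRLF pair
        if (line_end == b)
            break;                           // empty line ends the headers
        const char* colon = std::find(b, line_end, ':');
        if (colon != line_end) {
            const char* v = colon + 1;
            while (v != line_end && (*v == ' ' || *v == '\t'))
                ++v;                         // skip whitespace after the colon
            fields.push_back({{b, colon}, {v, line_end}});
        }
        b = (eol == e) ? e : eol + 1;        // advance to the next line
    }
    return fields;
}
```

Every span in the result points straight into the original range, so
the only allocation is the vector of pointer pairs.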

A more traditional approach would have involved at least 2 more
copies: one that gets the data from the file system into the stream
buffer and another one that copies the data from the stream buffer
into a std::string. The extra copying can take significant amounts
of time when processing thousands of mail messages.

Mail messages tend to be short, so mapping them into memory is not
such a big deal. Obviously there is a lifetime relationship between
the mapped file (the source character buffer) and the associated
strings. For my use case, I'm not interested in being able to write
to the strings, just read from them. (If you look closely you'll see
that I "cheat" and add a few segments into my string that are
associated with const char* C-style strings, but they too are
read-only.)

I'm interested to know if anyone is aware of a "string chain" library
that provides a (read-only) API similar to std::string, but is
fundamentally just managing (b,e) pointer pairs into some larger
buffer.
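
As a rough illustration of the kind of interface meant here, the sketch
below shows a hypothetical string_chain class. The class name and its
members are assumptions for the sake of the example, not an existing
Boost or standard library API. It stores only (b,e) segments and offers
a few read-only operations in the spirit of std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical read-only "string chain": an ordered list of (b,e)
// segments that together behave like one logical string. The chain
// never owns or copies the characters it refers to.
class string_chain {
public:
    // Append the range [b,e); empty ranges are ignored.
    void append(const char* b, const char* e) {
        if (b != e) {
            segments_.push_back(segment{b, e});
            size_ += static_cast<std::size_t>(e - b);
        }
    }
    // Append a C-style string, e.g. a literal with static lifetime.
    void append(const char* cstr) { append(cstr, cstr + std::strlen(cstr)); }

    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }

    // Indexing walks the segments, so it is linear in their number.
    char operator[](std::size_t i) const {
        for (const segment& s : segments_) {
            const std::size_t n = static_cast<std::size_t>(s.e - s.b);
            if (i < n)
                return s.b[i];
            i -= n;
        }
        assert(false && "index out of range");
        return '\0';
    }

    // Flatten into an owning std::string; the only operation that copies.
    std::string str() const {
        std::string out;
        out.reserve(size_);
        for (const segment& s : segments_)
            out.append(s.b, s.e);
        return out;
    }

private:
    struct segment { const char* b; const char* e; };
    std::vector<segment> segments_;
    std::size_t size_ = 0;
};
```

A chain like this could, for instance, represent a folded header value
as several segments of the mapped file joined by a segment that points
at a constant " " string.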

Boost.Test has a funky string class hiding in it called basic_cstring,
but it wasn't created for this purpose.

-- 
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
     The Computer Graphics Museum <http://computergraphicsmuseum.org>
         The Terminals Wiki <http://terminals.classiccmp.org>
  Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
