Subject: Re: [Boost-users] A library for "string chains"?
From: Jan Herrmann (jherrmann79_at_[hidden])
Date: 2014-08-11 03:21:17
On 09.08.2014 05:44, Richard wrote:
> [Please do not mail me a copy of your followup]
>
> For a motivating example, see this gist:
> <https://gist.github.com/LegalizeAdulthood/7b67968bd93fbd4f9dbb>
>
> It uses boost::mapped_file to map a mail message into memory and then
> proceeds to parse it. This is simply an example, but I think this
> hacked up parser comes fairly close to handling the full RFC2822
> message, ignoring MIME extensions. <https://tools.ietf.org/html/rfc2822>
> I hacked it up from memory of mail message format rules and a single
> example message, so don't consider it production quality :-).
>
> The intention here is to parse a file without copying its input text
> into any buffers. Notice that this parser builds structure from a
> buffer by identifying interesting substrings as (b,e) pointer pairs.
>
> A more traditional approach would have involved at least 2 more
> copies: one that gets the data from the file system into the stream
> buffer and another one that copies the data from the stream buffer
> into a std::string. The extra copying can take significant amounts
> of time when processing thousands of mail messages.
>
> Mail messages tend to be short, so mapping them into memory is not
> such a big deal. Obviously there is a lifetime relationship between
> the mapped file (the source character buffer) and the associated
> strings. For my use case, I'm not interested in being able to write
> to the strings, just read from them. (If you look closely you'll see
> that I "cheat" and add a few segments into my string that are
> associated with const char* C-style strings, but they too are
> read-only.)
>
> I'm interested to know if anyone is aware of a "string chain" library
> that provides a (read-only) API similar to std::string, but is
> fundamentally just managing (b,e) pointer pairs into some larger
> buffer.
>
> Boost.Test has a funky string class hiding in it called basic_cstring,
> but it wasn't created for this purpose.
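I think boost::string_ref
http://www.boost.org/doc/libs/1_55_0/libs/utility/doc/html/string_ref.html
and std::experimental::basic_string_view
http://en.cppreference.com/w/cpp/experimental/basic_string_view
could help. Both are non-owning, read-only views that hold a pointer and
a length into an existing character buffer.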
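
To illustrate the idea, here is a minimal sketch, not taken from the gist. It assumes a C++17 compiler and uses std::string_view, the standardized form of the view type linked above; boost::string_ref has a very similar interface. The helper name split_header is hypothetical. It splits a "Name: value" header line into two views that both point into the original buffer, so no characters are copied.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper (not from the gist): split a "Name: value" line into
// two views over the same underlying buffer. Nothing is copied; each view
// is just a pointer plus a length into the storage `line` refers to.
std::pair<std::string_view, std::string_view> split_header(std::string_view line)
{
    const std::size_t colon = line.find(':');
    if (colon == std::string_view::npos)
        return {line, std::string_view{}};  // no colon: whole line is the name

    std::string_view name = line.substr(0, colon);
    std::string_view value = line.substr(colon + 1);

    // Skip leading blanks in the value by moving the start of the view.
    const std::size_t first = value.find_first_not_of(" \t");
    if (first == std::string_view::npos)
        value = std::string_view{};         // value is empty or all blanks
    else
        value.remove_prefix(first);

    return {name, value};
}
```

As with the (b,e) pairs in the original parser, the returned views are only valid while the underlying buffer (for example the mapped file) stays alive.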
Jan Herrmann