Subject: Re: [boost] [Potentially OT] String Concatenation Operator
From: Jeroen Habraken (vexocide_at_[hidden])
Date: 2010-08-25 06:18:12


Hi,

On 24 August 2010 18:11, Dean Michael Berris <mikhailberis_at_[hidden]> wrote:
> Good day everyone,
>
> I am currently taking some time to implement some functionality into
> cpp-netlib [0] (shameless plug) and somehow I've stumbled into a sort
> of conundrum.
>
> First, some background: I'm trying to abstract away the string
> building routines of the network library to come up with the most
> efficient way of doing the following:
>
> 1. Efficiently allocate space to contain a string being built from
> literals and variable length strings.
> 2. Be able to build/traverse the string lazily (i.e., it doesn't
> matter that the string is contiguous in memory as in the case of
> C-strings, or whether they are built/backed by a stream as in Haskell
> ByteString).
> 3. As much as possible be "automagically" network-safe (i.e. can be
> dealt with by Boost.Asio without having to do much acrobatics with
> it).
>
> At the heart of the issue is the semantics of the '+' operator to
> signify string concatenation. Trying not to sound pedantic about it,
> the addition operator in traditional mathematical notions is both
> commutative and associative, while string concatenation is associative
> but not commutative. Trying to remember my C++ operator
> precedence and associativity rules, it looks like operator% and/or
> operator^ might be good candidates for this, but only in expression
> templates where you fold from the right.
>
> Now I don't want to start beating on the STL's standard string
> implementation, but I'd like to know if anyone is already working on a
> string implementation that meets the above requirements? I'd be happy
> to wait on compile times with Proto, if it means I can save big at
> runtime.
>
> What I wanted to be able to do (and am reproducing at the moment) is a
> means of doing the following:
>
>  string_handle f = /* some means of building a string */;
>  string_handle s = str("Literal:") ^ f ^ str("\r\n\r\n");
>  std::string some_string = s; // convert to string and build lazily
>
> If, for instance, f were also a literal, then s can efficiently already
> hold the string in some fixed-size byte array whose size is
> determined at compile time. Somehow the function str() would only be
> able to take a literal and look something like this:
>
>  template <size_t N>
>  inline
>  bounded_fragment<N> str(char const (&s)[N]) {
>    return bounded_fragment<N>(s);
>  }
>
> The evaluation of the assignment (or copy constructor) of the
> string_handle will then evaluate the expression template and already
> know at compile time:
>
> A. Whether the string is just a long literal and allocate enough space
> to effectively hold the whole string at compile time, or at least
> reserve enough space statically (a boost::array perhaps) so that a
> simple range copy can be done (and optimized by the compiler as well)
>
> B. Whether the string is a list of variable-length strings, in which
> case a list of handles is built
>
> C. Whether it is a mix, in which case all adjacent literals are joined
> at compile time and the variable-length strings are retrieved when
> required
>
> Pointers to ongoing work would be most appreciated -- I'm currently
> too preoccupied to chase this particular rabbit down the hole (I'm
> chasing a different rabbit in a different hole) but maybe this is an
> interesting enough problem for the template metaprogramming gurus to
> look into?
>
> Thanks in advance and I look forward to any thoughts/pointers.
>
> --
> Dean Michael Berris
> deanberris.com

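To make the str() idea quoted above concrete, here is a rough sketch of how
literal fragments could be joined with the size known at compile time. It is
only an illustration under assumptions: bounded_fragment's layout, its
to_string() member and this operator^ overload are hypothetical, not code from
cpp-netlib. Note that str() takes the array by reference; with a by-value
array parameter the argument decays to a pointer and N cannot be deduced.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical fixed-size fragment; N counts characters without the '\0'.
template <std::size_t N>
struct bounded_fragment {
    std::array<char, N> data;

    // Copy the characters out into a std::string.
    std::string to_string() const {
        return std::string(data.begin(), data.end());
    }
};

// Taking the array by reference lets N be deduced from the literal.
// N includes the terminating '\0', so the fragment holds N - 1 characters.
template <std::size_t N>
bounded_fragment<N - 1> str(char const (&s)[N]) {
    bounded_fragment<N - 1> f{};
    for (std::size_t i = 0; i + 1 < N; ++i) f.data[i] = s[i];
    return f;
}

// Joining two fragments gives a fragment whose size is A + B, so the
// storage for the whole expression is fixed at compile time.
template <std::size_t A, std::size_t B>
bounded_fragment<A + B> operator^(bounded_fragment<A> const& a,
                                  bounded_fragment<B> const& b) {
    bounded_fragment<A + B> r{};
    for (std::size_t i = 0; i < A; ++i) r.data[i] = a.data[i];
    for (std::size_t i = 0; i < B; ++i) r.data[A + i] = b.data[i];
    return r;
}
```

With this, str("Literal:") ^ str("x") ^ str("\r\n\r\n") has type
bounded_fragment<13>. operator^ is left-associative in C++, so the expression
groups as (a ^ b) ^ c; since concatenation is associative the result is the
same either way. Mixing in variable-length strings would need a separate
fragment type, which this sketch leaves out.
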
There are two things that come to mind:
- ropes <http://www.sgi.com/tech/stl/Rope.html>
- libevent buffers
<http://monkey.org/~provos/libevent/doxygen-2.0.1/structbufferevent.html>

The latter has been designed for a purpose similar to yours and I
believe a lot can be learned from its implementation, even though it
is C code.
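
To illustrate the general idea both of these share (keep the pieces separate
and only make them contiguous when someone asks), here is a minimal sketch.
It is neither the SGI rope nor libevent's actual data structure; the
fragment_chain class and all of its member names are made up for this
example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chain of fragments: append() stores each piece as-is, and
// nothing is copied into one contiguous buffer until flatten() is called.
class fragment_chain {
public:
    // Take ownership of a piece and remember the running total length.
    fragment_chain& append(std::string piece) {
        total_ += piece.size();
        pieces_.push_back(std::move(piece));
        return *this;
    }

    std::size_t size() const { return total_; }
    std::size_t fragments() const { return pieces_.size(); }

    // Visit each piece in order without joining them, e.g. to pass the
    // (pointer, length) pairs on to a gather-style write.
    template <typename Visitor>
    void for_each_fragment(Visitor visit) const {
        for (std::string const& p : pieces_) visit(p.data(), p.size());
    }

    // Build the contiguous string on demand with a single allocation.
    std::string flatten() const {
        std::string out;
        out.reserve(total_);
        for (std::string const& p : pieces_) out += p;
        return out;
    }

private:
    std::vector<std::string> pieces_;
    std::size_t total_ = 0;
};
```

Appending "Literal:", a value and "\r\n\r\n" leaves three separate fragments;
flatten() is the only place where they are copied into one buffer.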

Regards,
Jeroen
