|
Boost : |
Subject: Re: [boost] Interest in a simple buffer abstraction?
From: Cory Nelson (phrosty_at_[hidden])
Date: 2011-01-30 18:34:31
On Sat, Jan 29, 2011 at 2:59 PM, Vicente Botet <vicente.botet_at_[hidden]> wrote:
>
>
> Cory Nelson wrote:
>>
>> On Thu, Jan 27, 2011 at 6:49 AM, Vicente Botet <vicente.botet_at_[hidden]>
>> wrote:
>>> I'm working on a frame concept, which is something like a bidirectional
>>> buffer. It allows to prepend and append binary data, and obtain/remove
>>> the
>>> header and the trailer.
>>>
>>> In addition we can split a frame in fragments and join fragments to make
>>> a
>>> frame.
>>>
>>> All these features are needed when working with protocol stacks.
>>
>> You'll have a winner if you expand that to allow bidirectional
>> iteration of both bytes and fragments (and then a fragment is just a
>> raw array giving random-access iteration within them). Â VERY useful
>> for high-performance incremental parsing, I would definitely give my
>> +1 for boost inclusion.
>>
>
>
> I wanted to hide whether the memory is continuous or not. Only when the user
> needs continuous memory the library will ensure it on the resulting data. Of
> course this continuous memory will have random access.
>
> Could you give an example of how you want to use the features you seem to
> need? This will help me to see if the current interface is enough or if it
> must be extended.
I've designed a similar buffer class before, with I/O and incremental
parsing in mind. In addition to what you describe, it also had
reserve(size_t), commit(size_t), release(size_t), and access to the
reserved area. I'm still not 100% sure if this type of access is
appropriate for a general buffer, but allowing it would not cost
anything to those who aren't using it and it would reduce a ton of
code duplication for a specific io::buffer, since they're basically
the same but with the internals a bit more exposed.
1) Incremental parsing.
enum result { ok, need_more };
template<typename Iterator>
result parse(Iterator first, Iterator last);
I can call parse() on the first fragment (just using pointers for
iterators) for high performance. Only if the first fragment doesn't
contain enough to parse (returning need_more) will I call parse() on
the entire buffer, which will be slower (like dequeue iterators) but
rare. If that returns need_more still, then I'll perform more I/O to
fill the buffer with more data. Think of, eg. HTTP or XML, where the
individual bits (like a header or <element>) are quite small but
you'll want to read in large page-sized fragments to reduce I/O calls.
A single fragment will hold a lot of things and there's no reason to
use slow iterators to get them.
2) Scatter/gather I/O.
I will typically have a pool of page-sized fragments. A slow stream
might only work on one or two fragments at a time, but for a fast one
I might want to reduce I/O calls and work with a lot of fragments at
once. If a stream is coming in fast, I can reserve() 64KB worth of
fragments, request a scatter into the reserved area, then commit()
whatever amount of data it actually gave me. Whatever didn't get
committed will still be usable for the next I/O. I then release()
however much data I was able to parse.
This can be implemented pretty cleanly with hierarchical iterators, like so:
class fragment
{
T* begin();
T* end();
};
class iterator
{
fragment& fragment();
};
class buffer
{
iterator begin(); // go over committed range only.
iterator end();
iterator reserved_begin(); // go over reserved
(uninitialized/unconstructed) range only.
iterator reserved_end();
iterator to_iterator(fragment &f, T* iter); // creates a slower
iterator from a fragment iterator.
size_type size(); // committed size.
size_type reserved_size();
size_type capacity(); // size() + reserved_size().
size_type reserve(size_type minsize); // ensures there are at least
'minspace' elements reserved, and returns the new reserved_size().
size_type commit(size_type size); // commits 'size' elements from
the reserved space, returns the new size().
size_type release(size_type size); // releases 'size' elements from
committed space, returns the new size().
};
-- Cory Nelson http://int64.org
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk