Boost logo

Boost :

From: Maarten Hilferink (mhilferink_at_[hidden])
Date: 2003-06-05 06:21:10


sequence_array<T>, acts as a vector of vector<T>, but faster.

I have made the mentioned attatchments available in the boost files section
in
a directory with the name sequence_array. A separate documentation file will
follow, but the header file contains a desription of the main design
considerations
and usage.

Maarten.

----- Original Message -----
From: "Maarten Hilferink" <mhilferink_at_[hidden]>
To: "Toon Knapen" <toon.knapen_at_[hidden]>; "Boost mailing list"
<boost_at_[hidden]>
Sent: Thursday, June 05, 2003 11:14 AM
Subject: Re: [boost] sequence_array contribution

> ----- Original Message -----
> From: "Toon Knapen" <toon.knapen_at_[hidden]>
> To: "Boost mailing list" <boost_at_[hidden]>; "Maarten Hilferink"
> <mhilferink_at_[hidden]>
> Sent: Wednesday, May 28, 2003 12:59 PM
> Subject: Re: [boost] sequence_array contribution
>
>
> On Thursday 22 May 2003 12:51, Maarten Hilferink wrote:
> > template<typename T> struct sequence_array<T>;
> > which has an interface that is almost compatible with
> > std::vector<std::vector<T> >;
> > but faster.
>
> Toon wrote>and more compact I presume. AFAICT the drawback will
> be that the sequece is not as dynamic anymore as a
> vector-of-vectors (vov)
>
> Well, sequences of a sequence_array can be resized, for example:
>
> sequence_array<std::pair<double, double> polygons(10); // 10 polygons.
> polygons[1].resize(50); // 50 points in second polygon.
> polygons[0].resize(20); // 20 points in first polygon.
>
> I have made the sequence_arry.hpp a bit more self contained (see
attachment)
> and will continue to integrate the stuff into the boost-style.
> I have done some time testing (see attached: sequenceTest.cpp) and it
looks
> promising
> Inserting 1000 points in each of 5000 polygons and then destructing it
takes
> 13 secs on my computer with a vector of vectors (vov) and 3 second with a
> sequene_array.
> (this is in debug mode in which MVC adds stuff to heap allocations and
does
> an expensive check for de-allocations; nevertheless, a saving of more than
> 75%).
>
> > template<typename T> struct sequence_array<T>;
> > that uses an internal
> > std::vector<T> m_Data;
> > for sequencial storage of sequences of T
> > and an additional
> > std::vector<std::pair<T*, T*> > m_Seqs;
> > for identifying allocated sequences.
>
> Toon wrote>But isn't always gauranteed : m_seqs[i].second ==
> m_seqs[i+1].first ?
> In this case you could shrink the memory footprrint with another pointer
> per innter vector.
>
> No, Sequences can be reallocated in m_Data (see example above),
> and therefore this equation is not guaranteed.
>
> > sequence_array offers typedef iterator, const_iterator, reference, and
> > const_reference
> > with interfaces nearly compatibe with
> > std::vector<T>*, const std::vector<T>*, vector<T>&, resp. const
> vector<T>&.
> > (the sequences however don't have a reserve() member function,
>
> Toon wrote>Why not ? I have something similar as you and definitly need
> reserve.
> But my reserve takes two arguments : the number of internal vectors
> (will reserve m_seqs) and the storage consumed by all internal vectors
> (will reserve m_data).
>
> Yes, you can reserve data for the sequence_array, but not for each
sequence
> separately,
> I made the design decision to have only 2 pointers (not 3) per sequence,
so
> I can
> resize, but not reserve for, a sequence, for example:
>
> sequence_array<std::pair<double, double> polygons;
> polygons.resize(2) // make room for 2 polygons
> polygons.data_reserve(70); // to occupy a total of 70 points.
> polygons[1].resize(50); // 50 points in second polygon.
> polygons[0].resize(20); // 20 points in first polygon.
> // polygons[0].reserve(x); // illegal: with only 2 pointers per sequence I
> cannot identify uninitialized data.
>
> > I specifically would like to hear from you on the following design
issues:
> > - I am considering to change the definition of the above mentioned
m_Seqs
> > to:
> > std::vector<std::pair<size_t, size_t> > m_Seqs;
> > where m_Seqs[i].first and second are offsets of the start and end of
> > sequence i relative to m_Data.begin().
> > This avoids adjustment when m_Data grows, for a small const access-time
> > penalty.
> Toon wrote>In my implementation, I allways recalculate m_seqs when m_data
> grows.
> But you would need to perform profiling to know what's best actually.
>
> Actually, the recalculation works quite well, so I think I keep the direct
> pointers
> to favour access speed.
>
> > 1. Arye you interested in such contribution?
>
> Toon wrote>I am but need to make sure it provides enough performance for
my
> specific case.
>
> See the above testing results, and more details in the enclosed files.
>
> I am still very interested in your feedback, especially about how to
> integrate sequence_arrays into
> boost, but will not respond to email during my holiday (10-June - 8-
July).
> When cc'd to my email account, I will respond on the 9-th of June or after
> 8-July.
>
> Greetings,
>
> Maarten.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk