Boost logo

Boost :

From: Maarten Hilferink (mhilferink_at_[hidden])
Date: 2003-06-05 04:14:57


----- Original Message -----
From: "Toon Knapen" <toon.knapen_at_[hidden]>
To: "Boost mailing list" <boost_at_[hidden]>; "Maarten Hilferink"
<mhilferink_at_[hidden]>
Sent: Wednesday, May 28, 2003 12:59 PM
Subject: Re: [boost] sequence_array contribution

On Thursday 22 May 2003 12:51, Maarten Hilferink wrote:
> template<typename T> struct sequence_array<T>;
> which has an interface that is almost compatible with
> std::vector<std::vector<T> >;
> but faster.

Toon wrote>and more compact I presume. AFAICT the drawback will
be that the sequece is not as dynamic anymore as a
vector-of-vectors (vov)

Well, sequences of a sequence_array can be resized, for example:

sequence_array<std::pair<double, double> polygons(10); // 10 polygons.
polygons[1].resize(50); // 50 points in second polygon.
polygons[0].resize(20); // 20 points in first polygon.

I have made the sequence_arry.hpp a bit more self contained (see attachment)
and will continue to integrate the stuff into the boost-style.
I have done some time testing (see attached: sequenceTest.cpp) and it looks
promising
Inserting 1000 points in each of 5000 polygons and then destructing it takes
13 secs on my computer with a vector of vectors (vov) and 3 second with a
sequene_array.
(this is in debug mode in which MVC adds stuff to heap allocations and does
an expensive check for de-allocations; nevertheless, a saving of more than
75%).

> template<typename T> struct sequence_array<T>;
> that uses an internal
> std::vector<T> m_Data;
> for sequencial storage of sequences of T
> and an additional
> std::vector<std::pair<T*, T*> > m_Seqs;
> for identifying allocated sequences.

Toon wrote>But isn't always gauranteed : m_seqs[i].second ==
m_seqs[i+1].first ?
In this case you could shrink the memory footprrint with another pointer
per innter vector.

No, Sequences can be reallocated in m_Data (see example above),
and therefore this equation is not guaranteed.

> sequence_array offers typedef iterator, const_iterator, reference, and
> const_reference
> with interfaces nearly compatibe with
> std::vector<T>*, const std::vector<T>*, vector<T>&, resp. const
vector<T>&.
> (the sequences however don't have a reserve() member function,

Toon wrote>Why not ? I have something similar as you and definitly need
reserve.
But my reserve takes two arguments : the number of internal vectors
(will reserve m_seqs) and the storage consumed by all internal vectors
(will reserve m_data).

Yes, you can reserve data for the sequence_array, but not for each sequence
separately,
I made the design decision to have only 2 pointers (not 3) per sequence, so
I can
resize, but not reserve for, a sequence, for example:

sequence_array<std::pair<double, double> polygons;
polygons.resize(2) // make room for 2 polygons
polygons.data_reserve(70); // to occupy a total of 70 points.
polygons[1].resize(50); // 50 points in second polygon.
polygons[0].resize(20); // 20 points in first polygon.
// polygons[0].reserve(x); // illegal: with only 2 pointers per sequence I
cannot identify uninitialized data.

> I specifically would like to hear from you on the following design issues:
> - I am considering to change the definition of the above mentioned m_Seqs
> to:
> std::vector<std::pair<size_t, size_t> > m_Seqs;
> where m_Seqs[i].first and second are offsets of the start and end of
> sequence i relative to m_Data.begin().
> This avoids adjustment when m_Data grows, for a small const access-time
> penalty.
Toon wrote>In my implementation, I allways recalculate m_seqs when m_data
grows.
But you would need to perform profiling to know what's best actually.

Actually, the recalculation works quite well, so I think I keep the direct
pointers
to favour access speed.

> 1. Arye you interested in such contribution?

Toon wrote>I am but need to make sure it provides enough performance for my
specific case.

See the above testing results, and more details in the enclosed files.

I am still very interested in your feedback, especially about how to
integrate sequence_arrays into
boost, but will not respond to email during my holiday (10-June - 8- July).
When cc'd to my email account, I will respond on the 9-th of June or after
8-July.

Greetings,

Maarten.





Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk