Boost logo

Glas :

Re: [glas] Vector Spaces vs. "Objects called vectors"

From: Ian McCulloch (ianmcc_at_[hidden])
Date: 2005-02-04 10:24:10


On Fri, 4 Feb 2005, Kevin Wheatley wrote:

> Ian McCulloch wrote:
> > Clearly, in a lot of cases it doesn't make sense to resize the vector.
> > eg, "x += y", x and y should be the same size. The only case where it
> > would be sensible to consider resizing is for ordinary assignment. Here,
> > I don't think it is even sensible to fix a policy, if only for
> > interoperability reasons (how would you prevent a resize for std::vector?
> > How would you allow resize for a fixed-size array?).
>
> x += y, what if x has a sparse representation? (OK so I'm playing
> advocate here)

The guiding rule needs to be the computational complexity. The assignment
here should be no worse than O(N) (probably O(N log N) would be acceptable
too). Beyond that, it would be faster to create a temporary vector x+y in
some easily resizable structure that is O(N) in speed, and then copy that
structure into X at the end (again O(N)).

For example, if the left hand side uses a sparse structure that requires a
O(N) reallocation on every insert, then the operation as a whole would be
O(N^2), which could be catastrophic.

Another example is fixed structure sparse vectors. For an operation on
two such vectors the index structure is assumed to be the same, so many
operations can be very fast - often you don't even need to touch the
index array at all.

Of course, in the above example if ONLY x had a sparse representation and
y was dense, then the operation should be forbidden - it would effectively
destroy the sparse structure of x.

>
> I think what your meaning is that the size (math?) is like the
> capacity (CS) and that x and y have the same notional
> capacity/dimensionality. This would be a different thing to the
> underliying storage capacity/size which may need to be modified.

yes.

>
> Is this a desirable property of a vector (and thus by orthoganally a
> matrix), to be able to allow the user of the objects to stop
> themselves from incurring this 'hidden' memory allocation cost. I
> think it is.

Well, it depends how relevant the allocation cost is. For fixed
structure vectors, then there is no memory allocation penalty beyond the
inevitable construction cost. If you are concerned about memory
allocation, then by all means use such a structure. But these are not
useful in cases where the sparse structure is not known in advance.

>
> Conversly is it desirable to allow flexibility and the user to care
> less about the underlying representation, also a yes I think.
>
> Conclusions:
>
> a point of variation in policy is probably needed for storage
> reallocation
> maybe dimensionality miss-match is an error - provide an explicit
> conversion mechanism like a dimensionality_cast<>() ?

Well, certainly different storage formats have different
assignment/reallocation semantics. The question is, whether these need to
be explicitly spelled out, or whether it works to just 'do the natural
thing'. For example, if you had a sparse matrix in CSR format implemented
by something like a vector of sparse vectors, then the matrix assignment
operator is severly complicated by having to do a dimensionality_cast<>()
on all of the row vectors [unless it is a fixed-structure type of course].

By the way, do the people here pushing for strict dimensionality checks
have the same opinion about std::string automagically doing a reallocation
on assignment? Why/why not? This part of the discussion seems very
surreal to me.

> a glossary/dictionary of terms and concepts is important (most
> important?) to progress anywhere between the two meeting groups
> here...

Good idea. I think the notation is pretty uniform, try
http://www.psc.edu/~burkardt/papers/linear_glossary.html

David (or anyone else ;), do you have a link to a good intro on generic
programming?

Cheers,
Ian