Boost :

Date view	Thread view	Subject view	Author view

From: Jason Hise (0xchaos_at_[hidden])
Date: 2006-10-31 19:28:30

Next message: Jeff Garland: "Re: [boost] Boost Timer Update"
Previous message: Eric Niebler: "Re: [boost] [proto] interface changes"
In reply to: Martin Knoblauch Revuelta: "Re: [boost] [proposal] Sequence container with O(log n) random access _and_ O(log n) insertion/removal (wherever)"
Next in thread: Martin Knoblauch Revuelta: "Re: [boost] [proposal] Sequence container with O(log n) random access _and_ O(log n) insertion/removal (wherever)"
Reply: Martin Knoblauch Revuelta: "Re: [boost] [proposal] Sequence container with O(log n) random access _and_ O(log n) insertion/removal (wherever)"

On 10/31/06, Martin Knoblauch Revuelta <mkrevuelta_at_[hidden]> wrote:
> On 10/31/06, Jason Hise <0xchaos_at_[hidden]> wrote:
> >
> > I don't entirely understand the benifit of your 'non-proportional
> > sequence view'. What use cases can you think of where this would be
> > beneficial to client code?
>
> The best example I've thought of is a text editor [...]
> For the next version I have designed an interface that allows having
> more than one NPSV width, and indexing separately by each one of them.

Wouldn't it make more sense to have a container of containers if you
want to do this? Or am I misunderstanding something?

>
> > Right now it seems like it adds
> > unjustified bloat to the library.
>
> The empty_number class, used as default value for the template
> parameter W, contains no data, so it should be optimized away by any
> compiler.

I was referring more to lines of code and elegance of implementation.

> By the way, somebody asked me if I could implement splice methods. I
> am doing it, but I'm calling them move, since splice is a specific
> name referring how this is done with lists, and remarking that it
> takes O(1) time, which will be false here. What do you think?

I think that the member function should still be called splice for the
sake of conforming to a common interface.

> > Although I'm not sure why
> > either the nodes or the iterators should be aware of the container
> > they are a part of to begin with.
>
> Good point.
>
> I think that, in some operations, specifying the container is both
> redundant and disgusting. Redundancy is error prone. If you are
> specifying a location with an iterator, why should you specify the
> container too. Specially when it is not necessary at all, or it can be
> deduced from the iterator reasonably fast.

Algorithms should not need to know to which container the iterators
belong. Iterators are meant to be redirectable pointers to locations
in a container, and nothing more. It seems like you are trying to
make iterators capable of so much magic that one never needs to refer
to the container. But the container is the place where the interface
is supposed to be.

>
> In non-static methods where an iterator is required, an assert checks
> the consistency of the call. This helps in debug processes.
>
> assert (owner(it.ptr)==this); // it must point into this arr.

That's very kind of you, but I don't think this check is your
responsibility. Especially if it results in more overhead being
required.

>
> See this other assert in the operator for iterators subtraction:
> [..]

Same comment.

> Some methods have been made static just to provide enhaced
> flexibility. For example:
>
> static iterator insert (const iterator & it, const T & t);
>
> (Now I'm writing all this, i note that i forgot to make static the
> method erase(iterator). Some other members might be made static
> too...)

Making this function static is the equivalent of saying that an
iterator is a micro-container capable of the equivalent of a
push_back. Although this may be true, it should not be exposed like
this. Conceptually the functionality is at the wrong level. Values
are inserted into containers, or erased from them. An iterator only
specifies the location.

>
> Other methods, like static swap(iterator,iterator) have their
> functionality extended (swapping nodes between different containers)
> without requiring any additional parameter. It would be absurd to
> require specification of only one of the involved containers. The
> same thing happens with all versions of move() (coming soon; see
> comments about splice above).

One would expect swapping iterators to swap what they point to, not to
which containers their values belong. Containers normally expect to
contain a sequence of value types which are capable of being swapped
themselves easily, therefore negating the need for this function.

A swap should be provided at the container level, but providing a
specialized one at the iterator level doesn't make sense.

>
> I think that this provides flexibility and ease of use.
>
> Whether the "static-where-possible" approach is kept or not, reaching
> the container from iterators is definitely useful. If kept, for not
> requiring references to it. If not, for checking consistency.

Static where possible is in my opinion a bad idea. The container
should provide member functions which modify or access the container,
and everything else should be moved out into free floating utility
functions. This is what a separation between containers and
algorithms means, and it is what the stl is built on.

> > One of the most important features of the stl is that it separates
> > algorithms from the container types they operate on. Most of these
> > algorithms already exist in the stl. What is your justification that
> > they need to be overridden for your container in the first place?
>
> Because I bet that these are optimized for either O(1) random access
> or O(1) insertion/removal. The new container doesn't fit any of those,
> and it would perform suboptimal if treated by algorithms based on
> those assumptions.
>
> Is there a sort specialization for sorted trees? I mean, creating a
> tree from scratch and populating it with an insert_sorted method. Even
> if it exists, note the optimization I am using: instead of extracting
> the nodes one by one from the source tree (with the corresponding
> rebalancing), I detach the list from it, and traverse it. The
> complexity will remain the same, but this trick will save a lot of
> time. This can only be done from inside the class or from a friend
> function. The same is for stable_sort().

You are trying to make one container that is supposed to look like a
sequence do absolutely everything a tree can do, and I think this is a
terrible mistake. If people need a data structure that can do
everything a tree can do, they should use something that provides an
interface to a tree. Not something that provides an interface to a
sequence.

>
> > > > * I would make the value sorted aspect a separate container, [..]
> > >
> > > Why? I wouldn't like to restrict the hybrid use of a container object.
> > > [..]
> >
> > Because it desimplifies the interface, causing bloat and making client
> > code potentially pay for features it doesn't want/need.
>
> This container is a template. Unused methods don't even get compiled.

Again, I meant it bloats the interface. Not necessarilly the compiled
code size or algorithmic complexity.

In general, a container either needs to stay sorted or it doesn't.
This is why there is a separate priority queue class, instead of an
insert_sorted method in each sequence container.

> > [..]
> > Supporting both random location inserts by the user and sorted inserts
> > doesn't make sense.
>
> I think it might make sense. Think of the tables of a word processor,
> or the rows of a spreadsheet: the final user can freely
> insert/extract, and there's a sort command in the menu. This command
> is often used consecutively for different ordering criterions.

Unless the document is a million pages long, I doubt calling the
standard sort algorithm would really result in that much of a
performance hit.

You keep using the text editor as justification for the extra
features. Wouldn't it be better for the text editor to use a tree
interface for its storage in the first place, rather than a sequence
interface?

Summary:
I think that it is absolutely vital that you consider this to be a
sequence library, rather than a tree that happens to use stl
constructs. Trees and the algorithms that operate on them are
separate animals, and should be dealt with in their own library.
Right now I am seeing a dangerous degree of feature bloat resulting
directly from your desire to build on the optimizations possible
because of the tree nature of the implementation.

-Jason

Date view	Thread view	Subject view	Author view

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk