Boost logo

Boost :

From: Fernando Cacciola (fcacciola_at_[hidden])
Date: 2003-07-12 12:32:54


Hi Joaquín,

JOAQUIN LOPEZ MU?Z wrote:

> Hi Fernando,
>
> ----- Mensaje Original -----
> De: Fernando Cacciola <fernando_cacciola_at_[hidden]>
> Fecha: Sábado, Julio 12, 2003 1:22 am
> Asunto: [boost] Re: Interest in multiindex_set?(again)
>
>> [snip]
>
> Now, index_n.begin() and index_n.end() let you enumerate *all* the
> elements in mxs, regardless of n; the difference is in the order
> they are enumerated, which depends on the comparison predicate
> associated with index #n. By convention, my_mx_set inherits the
> functionailty of index #0, that is
>
> my_mx_set::some_memfun(...);
>
> is the same as
>
> my_mx_set::index_type<0>::type::some_memfun(...);
>
Ah! I see. Sorry about the confusion.

> In particular, my_mx_set::begin() and my_mx_set::end() let
> you traverse through then entire set of elements contained,
> with the order induced by index #0. Does this ask your question?
>
Absolutely!

> If not, please let me know and I'll be happy to explain it further.
> I'm most determined to eliminate this semantic barrier between
> both of us :)
>
Me too... I got it wrong from the start though :-)

> Also, may I suggest you download some distribution of g++ and
> play with the library a little. Maybe then things become clearer.
>
>> Now, suppose that you can iterate over the all of the elements:
>> what's the
>> order in which elements will appear w.r.t the insertion sequence
>> and the
>> ordering implied by the indices? This is what I've asked you about
>> the "ordering and clustering invariants" of the data structure.
>>
>> If, during a linear traversal (that is, iterating over all of the
>> elementsas if it were a sequence), the elements will appear in an
>> unspecified order,
>> then the data structure is not an associative container (much less
>> a set),
>> so it should _definitely_ be called 'table' and not 'set'.
>>
>> I think I understand now why the term 'index'. It reseambles the
>> filtering-key associated with the 'field' which designates a
>> 'column' on a
>> database table. However, if I'm not mistaken, the filtered-column
>> itself,that is, the sequence of elements with such value in the
>> correspondingfield, is not an 'index'. So the subsquence which is
>> mapped to a given index
>> shouln't be called index, I think. I would call it a 'cluster' as I
>> suggested before.
>>
>> Since I can't play with the code I couldn't see how much of a set
>> is each
>> 'cluster'. Make sure to check with the requirements and document this
>> clearly.
>>
>
> Hopefully, this has been asked above. Again, please let me know
> otherwise
> so that I try to make myself clearer.
>
It's OK. I guess I got it right this time.

>> Another thing I couldn't figure out is how to compose indices
>> hierachically.That is, how to reproduce a typical SQL SELECT *
>> ORDER BY X,Y,Z.
>> Since you're modeling an indexed table, this functionality should be
>> supported.
>
> This is an interesting subject to study, but as some others are
> already
> writing a relational framework I guess we should limit here to the
> basics of the container.
> SQL-like predicates can be built on top of it
> in the context of more ambitious libraries.
>
I see.
If this functionality can be built on top of it,
then I think it is OK not to support it explicitely.

>>
>> I vote for a separate KeyExtraction because:
>>
>> (a) Key extraction and key comparison are conceptually orthogonal.
>>
>> (b) The orthogonality will be important in practice when users
>> want to
>> compare whith other predicates besides less_than and/or based on
>> more than
>> just data members.
>> With your mixin approach, it would be more difficult to have an
>> index based
>> on a runtime expression
>> (such as a hash value) combined with a comparison other than <.
>
> The most usual situation for specification of an index (standard
> less_than comparison on some member of the element) is already
> covered by less_by. Furthermore, less_by accepts an additional
> template parameter for specifying a comparison predicate other
> than std::less. For instance
>
> less_by<employee,std::string,&employee::name>
>
> specifies an index based on less_than comparison of employee::name
> (a string). If you want to use some other comparison criterium on
> empoyee::name (for instance, case insensitive comparison), you can
> write
>
> less_by<employee,std::string,&employee::name,string_case_insensitive_compare>
>
> Is this what you're referring to?
>
Half of it.
I see that I can use the precanned 'less-by' and specify a different comparison semantic (though in this case why is it named
'less_by' instead of 'order_by'), but I don't see how can I use it with something different than a data member as the key
itself. On some applications, specially outside databases, the keys are given by runtime expressions and not just stored data
members.
If the users would have to end up writting their own predicates, the separaton will make that task a lot easier.

>>
>> Finally, I'm not entirely happy with the coallision response of
>> 'modify' (or
>> maybe I don't understand it).
>> Is it ever possible to afford removing the colliding modified
>> element? Imagine I change the Social Security Number of Jhon, and by
>> mistake, I
>> assign him
>> the same number as Carla... the modify() will return 'false', but
>> therewon't be a
>> Jhon anymore in the table....
>
> You're correct, John will be erased. What else could the library do?
>
Nothing else that I can see, unfortunately.

>
> In order to restore the original SS number of John, a local copy
> of the element would have to be kept: in this case, we're back to
> the undesired situation in which an additional element copy is
> required.
>
Indeed.

> If you want restoring-on-collision, use update(). To sum it up:
>
> * update() does restore-on-collision, with the overhead of an element
> copy.
> * modify() does not incur an element copy, but on collision the
> modified element is erased.
>
> I don't think we have other options here. I've tried to model
> modify() following your request and those of others, and the approach
> is IMHO forced
> by design. Anyway, I'm open to new ideas.
>
OK, this is the best we could think of, at least right now.
Many times the user knows exactly what she's doing so that no collisions
will ever really ocurr. For example, if the modify does not change keys
but other data.
For this cases modify() is a big plus. The acutal user code could assert
the result of modify() so that a coallision is treated as postcondition
violation.

The documentation could express emphatically that if the modification
changed keys, update() must be used, else, modify() can.

A possibility is to protect modify() so that it can used but not accidentally.
That is, we can prevent a user for just seeing modify() on the public interface
and blindly and incorrectly use it.
Simply putting it into a protected section is problematic because
it requires the user to derive from the class, so something better
should be used to protect such a method.

...but I don't know right know how to code this :-)
but I'm sure there are patterns for this already.
Maybe with an addtional 'Modifier' template parameter which
the user supplies with a 'modify' wrapper, or something like that.
I'll let you know if I think of something.

Fernando Cacciola


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk