Boost logo

Boost :

Subject: Re: [boost] [Review] ITL review starts today, February 18th
From: Joachim Faulhaber (afojgo_at_[hidden])
Date: 2010-02-28 18:23:23


Hi Barend,

thank you again for all the effort and diligence that you invested to
investigate my library :)

I am going to address the issues you raised piecemeal in a sequence of
importance that I consider for them. Generalized addition, subtraction and
intersection operators are at the heart of the library. So I'll begin here:

5) OPERATORS and FUNCTIONS

> During the review period I had a discussion on the list with Joachim about
> the operators and the naming of functions. However, I thought that the +=
> and |= were both doing a pure union, but it appears that they are doing a
> union PLUS a (sometimes mathematic) addition.

this is correct.

As I have outlined in the documentation, all Sets and Maps of the ITL are
Addable and Subtractable. Addability is more fundamental than
Subtractability. Addability in the sense I have in mind could also be called
Combinability: Some kind of a primary possibility to combine values of a
type T:

string, sequence : concatenation
numbers : addition
sets : union
maps : ? ... Since a map is a set of pairs we could use set union. But this
is NOT done in the ITL. The ITL defines a generalized addition, based on a
union operation, that propagates operator + to combine associated values, if
the inserted intervals overlap (or inserted elements collide).

By default we propagate operator + to combine associated values. This yields
to
concatenation, if lists, strings, etc. are associated
addition, if numbers are associated
union, if sets are associated
generalized union, if maps are associated.
This is [reason 1], why union on sets and generalized union on maps must be
expressed to operator +, +=

See also:
http://www.herold-faulhaber.de/boost_itl/doc/libs/itl/doc/html/boost_itl/concepts/addability__subtractability_and_aggregate_on_overlap.html

So again, this design is based on the assumption, that, for a combinable
type T, we can identify a primary operation for the combination of it's
values and for this primary operations we expect operator + . This is
crucial for the combination of maps in the ITL. Maps propagate this primary
combine function. And as default we always use operator + . Since associated
type can be sets, interval_sets and, yes, maps and interval_maps : We must
assign operator + to the primary way of combining sets and maps in the ITL.
Because, by introducing a generalized addition on maps, we start to
propagate the combination operator. The best choice for this? Clearly the
archetypal operator that is there for the combination of *something*:
operator + .

This way we can, out of the box, work with things like (Phil, please close
your eyes for a monent ;)
interval_map<int, interval_map<int, string> > m1, m2;
// fill ...
m1 += m2; // performing aggreations on interval_map<int, string>
          // via += on interval_map<int,string> that in turn calls
          // += on strings.

[...]

> For interval_map, it is currently much less clear to me now. The operators
> do two things at the same time, which is confusing me.
>
> Let's see the following code pieces:
> 1| itl::interval_map<int, int> map1, map2;
> 2| map1 += std::make_pair(itl::interval<int>(3,5), 4);
> 3| map2 += std::make_pair(itl::interval<int>(4,6), 5);
> 4| map1 |= map2;
> This gives: 3..3: 4 4..5: 9 6..6: 5. So it is a set-theoretic union,
> PLUS mathematical addition. Aggregation on overlap. It IS very useful, but
> it does two things at the same time, and should be implemented and described
> carefully.
>
> Changing the last line to intersection:
> 4| map1 &= map2
> gives 4..5: 9, so it is a set-theoretic intersection on intervals, PLUS
> mathematical addition. I still agree.
>

As for generalized addition on maps, where operator + is propagated, all
itl::Maps propagate the & operator to combine associated values, if the
associated values are intersectable. For numeric codomain_types this is not
so. Here the generalizes intersection degenerated to an addition on the
intersection of the domain values of the intersected values. This is what
happens here.

>
> But now:
> 4| map1 -= map2
> gives 3..3: 4 4..5: -1 This is actually unexpected. It is not
> set-difference, because that would leave only 3..3 in the set and remove
> 4..5 completely. So what it actually does is subtracting 5 from all there
> was before in map1, ONLY mathematical operation (minus), no set-theory
> here...
> The documentation describes it like this: "Subtraction on Maps implements a
> map difference function similar to set difference",

this citation is incomplete it continues:
"If, on subtraction of an element value pair (k,v) it's key k is in the map
already, the subtraction function is propagated to the associated value. On
the associated value an aggregation is performed, that reverses the effect
of the corresponding addition function."

Now for the interpretation of this subtraction operation:
  {[3 .. 5]->4}
- { [4 .. 6]->5} it leaves [3..3]->4
  on the overlap it subtracts : [4..5]->(4-5), which is consistent with the
definition
  the part of map2 that has no match in map1 is not subtracted (kind of set
theoretic ;-)

> To be complete:
> 4| map1 ^= map2
> gives 3..3: 4 6..6: 5. So yes, this IS the set-symmetric difference
> and (by nature of that) does or does not mathematical addtion, that is not
> important here.
>
> Therefore I find the interval_map operators still confusing, even more
> confusing than before. Are they doing mathematics or set-theory?

Well set theory is a field of math. The distinction is little helpful. The
generalized map operations are defined on the bases of the set theoretic
operations and by propagation of the respective operation, for the case of
overlaps. Their properties are formally defined by sets of laws. More about
laws later.

> So I would suggest something like this:
> 4| map1 |= something(map2, std::plus);
> to make it more explicit. This is the set-theoretic union where on all
> overlaps the two sets are added, and
> 4| map1 |= something(map2, std::minus); where the two sets are unioned,
> and the values of map2 are subtracted from map1. This is slightly different
> from the current -= implementation.
>
      yes, and it does not account for the fact that the result of
set_difference does not contain the part of map2, which is not contained in
map1. W.r.t. to your example: [6,6]->-5 would be in the result.

> 4| map1 |= something(map2, std::multiplies); item, but values are
> multiplied
>
> And then the same for &= (intersection), -= (set theoretic difference).
> Then ^= probably need no policy because it always handled the intervals
> which are not common, an operator is not applicable here.
> There are combine functors described (I saw later), but they are used
> different (I think) than this suggestion.
>
> Defining the operations like that brings a loss of simplicity, introduces
complexity and opens a Pandorra's box of possible errors. The aggregation
operations in itl::Maps are properties of the type and are not to be changed
at runtime. As long as I work with this I found not a single use case where
this seems to make sense. But it opens up the possibility of spoiling
aggregation result by accidently using inconsistent operations for
aggregations.

>
> This would make it less confusing AND probably would give it more
> functionality, because the *= is currently not implemented (and might be
> reserved for scaling).
>
> This was map + map. The same applies for the operator-= on map, interval,
> so:
> 1| itl::interval_map<int, int> map;
> 2| map += std::make_pair(itl::interval<int>(3,5), 4);
> 3| map -= std::make_pair(itl::interval<int>(4,6), 3);
> result in 3..3: 4 4..5: 1
>
> This could actually be stated clearer by using the std::map approach, on
> intervals, I tried it intuitively but it does not compile:
> map[itl::interval<int>(3,5)]+= 4;
> map[itl::interval<int>(4,6)]-= 3;
> I would not have any problems with this, and I would expect still doing
> aggregate on overlap behind the scenes, and not set-theory, resulting in:
> 3..3: 4 4..5: 1 6..6: -3
>
> try this:
    itl::interval_map<int, int, total_absorber> map;
    map += std::make_pair(itl::interval<int>(3,5), 4);
    map -= std::make_pair(itl::interval<int>(4,6), 3);
    cout << map << endl;

So much for today ;)
More tomorrow
Good night from stormy Germany,
Joachim


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk