Boost logo

Boost :

Subject: Re: [boost] [dynamic_bitset] Comparison operators without assertions
From: Pavel Kretov (firegurafiku_at_[hidden])
Date: 2017-05-02 17:47:35

Hi Edward and Matt!
Thank you for your replies. Let me comment them both in one (maybe,
overly long) message.

Edward Diener wrote:
> Are you suggesting that the <,>,<=, and >= operators always return
> 'false' if the sizes do not match, rather than assert ?

Definitely, no. I'm suggesting to discuss the rules of meaningful and
useful comparison for dynamic bitsets of different lengths. Returning
'false' each time sizes are different would make 'operator <' not only
inappropriate for std::set or std::map, but will force many STL
algorithms to treat these objects as equivalent, since equivalence is
defined is the standard library as:

   equiv(a, b) = !(a<b) && !(b<a)

We need a more elaborate solution.

> > Shouldn't BOOST_ASSERT be used instead of the one from assert.h?
> Possibly. I think the intention was always to assert, whereas
> BOOST_ASSERT lets the assertion be overridden. I do agree with you
> that the current way dynamic_bitset works with the assertion should
> have been documented.

I think, these assertions should be eliminated, not documented. We can
get rid of them without breaking even a bit of backward-compatibility.
Would you feel happy if an innocent-looking code like

    std::string("c++") < "boost"

call std::terminate just because the class authors don't want to
handle all possible inputs for some reason? Less-comparison must
either be completely and safely defined, or not defined at all. That's
how the math works.

If we cannot reliably define total ordering, we can, probably, return
std::optional<bool> or even boost::tribool from 'operator <', or, at
the very bad, put an exception to the throw()-list. Terminating the
whole application immediately is *the worst possible* option, since it
is not the expected behavior for the comparison operator.

Matt Calabrese wrote:
> I agree that such comparisons should be valid. We should represent a
> total order, just like standard containers do (assuming their
> elements do, of course). I'm actually rather surprised that this
> isn't already supported.

At the time, I can see two completely different, mutually exclusive
total-order generalizations:

1. Mimic lexicographic string comparison: compare a[0] with b[0], if
   they're equal, compare a[1] with b[1], etc. This is also the
   way how std::vector<T> works if T is less-comparable (including
   the special case for std::vector<bool>).

2. Mimic little-endian (or big-endian) numeric comparison.

   -bit-: (9) (0)
       a: 0000001101
       b: 0001101
       c: 00010000
     a<b: false
     b<a: false
     equiv(a,b): true
     a<c: true
     b<c: true

I strongly prefer the former approach, since the latter makes
equivalence (!(a<b) && !(b<a)) so badly different from equality
(a==b). Equality already works in the lexicographic way, there is
little reason to introduce incompatible ordering scheme
to 'operator <', unless we're going to generalize bitwise operators
to mimic numbers, too.

It must also be noted that std::bitset avoids defining 'operator <'
for some reason. I think, they did it in order not to give users false
expectations about the ordering. The very name 'bitset' is the source
of confusion. I think, if we could rename boost::dynamic_bitset to
simply boost::bitstring, anyone would have no second idea about how
the class does its comparisons.

> I'm not sure, though I agree that it's definitely worth
> investigating with further discussion on the list and some example
> use-cases. I intuitively suspect it's a reasonable generalization to
> act as though implicit 0s were present for the shorter operand,
> based on an interpretation as a binary value. The side on which to
> pad should be consistent with the lexicographical ordering. Even
> this generalization is somewhat debatable though, since extraneous
> 0s contribute to value in a dynamic bitset.

If we continue thinking about dynamic_bitset as a bitstring (or, more
generally, bitlist), it would be consistent to define bitwise
operators in terms of 'zip' function, commonly found in a number of
functional programming languages (some pseudo-Haskell code follows):

   (&) :: [Bit] -> [Bit] -> [Bit]
   a & b = zipWith (&) a b

Usually, 'zip' combines lists until one of them is over (see [1]
and [2], though). Hence, applying 'operator &' to bitsets of
different size should create a new bitset with the length of the
shorter. The great feature of this definition is that it naturally
obeys the De Morghan's laws:

  (~a & ~b) == ~(a | b)
  (~b | ~b) == ~(a & b)

I also tried different padding schemes and had no luck keeping them
consistent to De Morghan laws. It anyone interested, I can post my
generalization attempts on the list.

With best wishes and hope that my suboptimal language
skills won't totally ruin my arguments,
――― Pavel K.

     (Why does 'zip' ignore the dangling tail of the collection?)
     (TLDR: Boost's zip_iterator doesn't behave like Haskell's zip).

Boost list run by bdawes at, gregod at, cpdaniel at, john at