Boost logo

Boost :

Subject: Re: [boost] [dynamic_bitset] Comparison operators without assertions
From: Pavel Kretov (firegurafiku_at_[hidden])
Date: 2017-05-02 17:47:35


Hi Edward and Matt!
Thank you for your replies. Let me comment them both in one (maybe,
overly long) message.

Edward Diener wrote:
> Are you suggesting that the <,>,<=, and >= operators always return
> 'false' if the sizes do not match, rather than assert ?

Definitely, no. I'm suggesting to discuss the rules of meaningful and
useful comparison for dynamic bitsets of different lengths. Returning
'false' each time sizes are different would make 'operator <' not only
inappropriate for std::set or std::map, but will force many STL
algorithms to treat these objects as equivalent, since equivalence is
defined is the standard library as:

   equiv(a, b) = !(a<b) && !(b<a)

We need a more elaborate solution.

> > Shouldn't BOOST_ASSERT be used instead of the one from assert.h?
>
> Possibly. I think the intention was always to assert, whereas
> BOOST_ASSERT lets the assertion be overridden. I do agree with you
> that the current way dynamic_bitset works with the assertion should
> have been documented.

I think, these assertions should be eliminated, not documented. We can
get rid of them without breaking even a bit of backward-compatibility.
Would you feel happy if an innocent-looking code like

    std::string("c++") < "boost"

call std::terminate just because the class authors don't want to
handle all possible inputs for some reason? Less-comparison must
either be completely and safely defined, or not defined at all. That's
how the math works.

If we cannot reliably define total ordering, we can, probably, return
std::optional<bool> or even boost::tribool from 'operator <', or, at
the very bad, put an exception to the throw()-list. Terminating the
whole application immediately is *the worst possible* option, since it
is not the expected behavior for the comparison operator.

Matt Calabrese wrote:
> I agree that such comparisons should be valid. We should represent a
> total order, just like standard containers do (assuming their
> elements do, of course). I'm actually rather surprised that this
> isn't already supported.

At the time, I can see two completely different, mutually exclusive
total-order generalizations:

1. Mimic lexicographic string comparison: compare a[0] with b[0], if
   they're equal, compare a[1] with b[1], etc. This is also the
   way how std::vector<T> works if T is less-comparable (including
   the special case for std::vector<bool>).

2. Mimic little-endian (or big-endian) numeric comparison.

   -bit-: (9) (0)
       a: 0000001101
       b: 0001101
       c: 00010000
     a<b: false
     b<a: false
     equiv(a,b): true
     a<c: true
     b<c: true

I strongly prefer the former approach, since the latter makes
equivalence (!(a<b) && !(b<a)) so badly different from equality
(a==b). Equality already works in the lexicographic way, there is
little reason to introduce incompatible ordering scheme
to 'operator <', unless we're going to generalize bitwise operators
to mimic numbers, too.

It must also be noted that std::bitset avoids defining 'operator <'
for some reason. I think, they did it in order not to give users false
expectations about the ordering. The very name 'bitset' is the source
of confusion. I think, if we could rename boost::dynamic_bitset to
simply boost::bitstring, anyone would have no second idea about how
the class does its comparisons.

> I'm not sure, though I agree that it's definitely worth
> investigating with further discussion on the list and some example
> use-cases. I intuitively suspect it's a reasonable generalization to
> act as though implicit 0s were present for the shorter operand,
> based on an interpretation as a binary value. The side on which to
> pad should be consistent with the lexicographical ordering. Even
> this generalization is somewhat debatable though, since extraneous
> 0s contribute to value in a dynamic bitset.

If we continue thinking about dynamic_bitset as a bitstring (or, more
generally, bitlist), it would be consistent to define bitwise
operators in terms of 'zip' function, commonly found in a number of
functional programming languages (some pseudo-Haskell code follows):

   (&) :: [Bit] -> [Bit] -> [Bit]
   a & b = zipWith (&) a b

Usually, 'zip' combines lists until one of them is over (see [1]
and [2], though). Hence, applying 'operator &' to bitsets of
different size should create a new bitset with the length of the
shorter. The great feature of this definition is that it naturally
obeys the De Morghan's laws:

  (~a & ~b) == ~(a | b)
  (~b | ~b) == ~(a & b)

I also tried different padding schemes and had no luck keeping them
consistent to De Morghan laws. It anyone interested, I can post my
generalization attempts on the list.

With best wishes and hope that my suboptimal language
skills won't totally ruin my arguments,
――― Pavel K.

[1]: https://softwareengineering.stackexchange.com/q/274983
     (Why does 'zip' ignore the dangling tail of the collection?)
[2]: http://stackoverflow.com/a/8513803
     (TLDR: Boost's zip_iterator doesn't behave like Haskell's zip).


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk