Boost logo

Boost :

Subject: Re: [boost] [Hana] Formal review for Hana
From: Louis Dionne (ldionne.2_at_[hidden])
Date: 2015-06-16 09:48:26


Paul Fultz II <pfultz2 <at> yahoo.com> writes:

> [...]
>
> >
> >
> > > - IntregralConstant is very strange. In Hana, its not a concept(even
> > > though its capitalized), but rather a so called "data type".
> > > Furthermore, because of this strangeness it doesn't interoperate
> > > with other IntregralConstants(such as from Tick) even though all
> > > the operators are defined.
> >
> > IntegralConstant is not a concept, that's true. The fact that it does not
> > interoperate with Tick's IntegralConstants out-of-the-box has nothing to
> > do with that, however. You must make your Tick integral constants a model
> > of Hana's Constant concept for it to work. See the gist at [2] for how to
> > do that.
>
> Awesome, thanks. So all the concepts are explicit.

Actually, not __all__ of them. There are a few concepts that refine other
concepts, and whose laws are strict enough to guarantee that only a single
model of these refined concepts can exist. In these cases, it is safe to
provide an automatic model of some concept. For example, this is the case
of Sequence. It refines several concepts, like Functor for example. However,
the laws are so strict that if you are a Sequence, there is a unique valid
model of the Functor concept that you could (and must) define. In this case,
the model of Functor is provided automatically. You can of course specialize
the algorithm for performance purposes.

If the model of Functor was not forced to be unique because of Sequence's
laws, Hana would not make an arbitrary choice and you would have to define
that model explicitly.

> > > - The `.times` seems strange and should be a free function so it works
> > with any IntregralConstant.
> >
> > The idea of having a `.times` member function comes from Ruby [3].
> > I personally think it is expressive and a simple tool for a simple job.
> > The syntax is also much cleaner than using a non-member function. Consider
> >
> > int_<10>.times([]{ std::cout << "foo" << std::endl; });
> > times(int_<10>, []{ std::cout << "foo" << std::endl; });
> >
> > However, as we've been discussing in this issue [4], I might add an
> > equivalent non-member function.
>
> If you find the chaining much cleaner, then perhaps you could make it a
> pipable function instead:
>
> int_<10> | times([]{ std::cout << "foo" << std::endl; });
>
> This way it could still apply to any IntegralConstant.

That's an interesting idea. However, Hana does not have a concept of
pipable function, so that would need to be added somehow.

> > > - In the section 'Taking control of SFINAE', seems like it could be
> > solved a
> > > lot simpler and easier using `overload_linear`.
> >
> > First, `overload_linearly` is an implementation detail, since I've moved
> > the Functional module out of the way to give me the possibility of using
> > Fit in the future. However, your point is valid; I could also write the
> > following instead:
> >
> > auto optionalToString = hana::overload_linearly(
> > [](auto&& x) -> decltype(x.toString()) { return x.toString(); },
> > [](auto&&) -> std::string { return "toString not defined"; }
> > );
> >
> > This approach is fine for the optionalToString function, which is rather
> > simple. I wanted to show how to use Optional to control compile-time
> > empty-ness in complex cases, so I'll just expand this section or change
> > the example to something that is really better solved using Optional.
> >
> > Thanks for the heads up; you can refer to this issue [5] in the future.
>
> Ok got it. Although, SFINAE is a pretty powerful compile-time Optional built
> into the language.

Sure, but a proper library-based Optional allows one to represent failures
caused by more than invalid expressions, which I think is more generally
useful. For example, you can define a safe division that returns `nothing`
when you divide by zero. Using SFINAE for this seems like using a hammer to
screw something. Also, Optional allows more sophisticated operations like

    transform(opt, f) -> applies f to the optional value if it is there, and
                         return just(the result) or nothing.
    filter(opt, pred) -> keep the optional value if it is there and it
                         satisfies the predicate

and many more. Composing optional values like this using SFINAE might be
harder, IDK.

> > > - Concepts refer to 'superclasses' these should be listed either as
> > > refinements or listed under the requirements section(which seem to be
> > > missing). It would be nicer if the concepts were documented like how
> > they are at cppreference: http://en.cppreference.com/w/cpp/concept
> >
> > This was fixed on develop. I now use the term "Refined concept" instead of
> > "Superclass". Regarding concept requirements, they are listed in the
> > "minimal complete definition" section of each concept. Then, semantic
> > properties that must be satisfied are explained in the "laws" section.
> >
>
> Great.
>
> >
> > > - Concepts make no mention of minimum type requirement such as
> > > MoveConstructible.
> >
> > I believe the right place to put this would be in the documentation of
> > concrete models like `Tuple`, but not in the concepts (like `Sequence`).
> > Hana's concepts operate at a slightly higher level and they do not really
> > have a notion of storage. But I agree that it is necessary to document
> > these requirements. Please refer to this issue [6] for status.
>
> I was thinking more when using algorithms, since tuple will take on the
> constructibility of its members.

So you mean like since `filter` does a copy of the sequence, the elements in
the sequence should be copy/move-constructible, right? That makes sense.
See this issue [1].

> > > - Organization of documentation could be better. Its nice showing what
> > > algorithms when the user views a concept, but it would be better if
> > > all the algorithms could be viewed together.
> >
> > I assume you are talking about the reference documentation and not the
> > tutorial. I agree that it could be easier to look for algorithms. There
> > are also other quirks I'd like to see gone. The problem is that Doxygen
> > is pretty inflexible, and I'm already tweaking it quite heavily. I'm
> > considering either using a different tool completely or making some
> > changes in the organization of the reference.
> >
> > Your more precise comment about viewing algorithms on their own page is
> > already tracked by this issue [7].
>
> Awesome. Have you thought about using a different documentation tool
> instead, like mkdocs or sphinx?

Yes, I'm currently considering switching away from Doxygen. Though I don't
know what I'd switch to. I have almost only one requirement; that the
documentation be written in the source code.

> > > - For compile-time `Iterable` sequence(which is all you support right
> > > now), `is_empty` can be inferred, and should be optional.
> >
> > How can it be inferred?
>
> Well, there is several ways it could be formailised, but either `head` and
> `tail` do not exist for an empty sequence,

I don't want to use SFINAE to determine this. What if we want to support
runtime Iterables like std::vector, that only know they're empty at runtime?

> or if `tail` always returns an empty sequence even when empty, you just
> detect that `seq == tail(seq)`.

Well, `tail` will fail when called on an empty sequence. But let's assume
this was not the case. You would then be required to implement `==` for
your Iterable, which is much more complicated than implementing `is_empty`.

> > > - Overall, I think the Concepts could be simplified. They seem to be too
> > > complicated, and it leads to many surprises which seem to not make
> > > sense(such as using `Range` or `String` with `concat` or using
> > > `tick::integral_constant`).
> >
> > 1. Concatenating ranges does not make sense. A Hana Range is a contiguous
> > sequence of compile-time integers. What happens when you concatenate
> > `make_range(0_c, 3_c)` with `make_range(6_c, 10_c)`? It's not
> > contiguous anymore, so it's not a Range anymore.
>
> Even though concat takes a Range, why can't it just return a tuple instead?

`concat`'s signature is

    M(T) x M(T) -> M(T)

where M is any MonadPlus. Mathematically, this is similar to the operation
of a Monoid, except it is universally quantified on T. It would really break
the conceptual integrity (and your expectations abou `concat`) for it to
return a container of a different kind. It is much cleaner to explicitly
perform the conversion by using

    concat(
        to<Tuple>(make_range(...)),
        to<Tuple>(make_range(...))
    )

and then there's no surprise that you get back a Tuple.

> > [...]
> >
> > Also, there's the problem that being SFINAE-friendly could hurt
> > compile-time performance, because everytime you call an algorithm we'd
> > have to check whether it's going to compile. However, because we're
> > working with dependent types, checking whether the algorithm compiles
> > requires doing the algorithm itself, which is in general costly.
>
> Templates are memoized by the compiler, so the algorithm isn't done twice.

That's right. Screw my last argument, but I still think it would be more
harmful than helpful. Also, if you need your algorithms to be constrained,
you can simply use `is_a<...>` or `models<...>` from Hana to constrain it
on your side.

> > We could however emulate this by using
> > the Tag system. For example, `fold_left` could be defined as:
> >
> > template <typename Xs, typename State, typename F,
> > typename = std::enable_if_t<models&lt;Foldable, Xs>()>
> > >
> > constexpr decltype(auto) fold_left(Xs&& xs, State&& state, F&& f) {
> > // ...
> > }
> >
> > This would give us an approximative SFINAE-friendliness, but like I said
> > above I think the best approach is to fail loud and fast.
>
> It would fail loud not fast. Using substitution failure, the compiler will
> stop substitution as soon as their is a failure, whereas with a
> static_assert,
> it substitutes everything and then checks for failure. So using enable_if is
> always faster than static_assert.

What I meant by _fast_ is that Hana asserts in the interface methods, so
that you get an error as soon as you call that method with wrong arguments.
Also, the actual implementation is not called when the static_assertion is
triggered, to reduce the compiler spew.

> Also, as a side note, you should never use `enable_if_t`, as it is harder
> for the compiler to give a good diagnostic(a macro still works really well
> though if you don't mind macros).

Good to know, thanks. Inside the implementation, I tend to use
`std::enable_if<>` to avoid instantiating the `enable_if_t`
template alias anyway.

> > > - It would be nice if the use of variable templates would be
> > optional(and
> > > not used internally), since without inline variables, it can lead to
> > ODR
> > > violations and executable bloat.
> >
> > Without variable templates, we would have to write `type<T>{}`,
> > `int_<1>{}`,
> > etc.. all of the time instead of `type<T>` and `int_<1>`. Sure, that's
> > just
> > two characters, but considering you couldn't even rely on what is
> > `type<T>`
> > (if it were a type, since `decltype(type<T>)` is currently unspecified),
> > we're really not gaining much. In short; no variable templates means a
> > less
> > usable library, without much benefits (see next paragraph).
>
> How is it less usable? It seems like it would be more usable, since the
> library can now support compilers with no or flaky variable templates.

I simply mean that `int_<1>{}` is less pretty than `int_<1>`. It is longer
by about 28% and most importantly it defeats the philosophy that we're
manipulating objects, not types. At any rate, compilers other than Clang
and GCC are probably missing far too many features (other than variable
templates) for them to compile Hana. I don't think variable templates
would change much to that.

> > Regarding variable templates and ODR, I thought variable templates
> > couldn't
> > lead to ODR violations? I know global function objects (even constexpr)
> > can
> > lead to ODR violations, but I wasn't aware about the problem for variable
> > templates. I would appreciate if you could show me where the problem lies
> > more specifically. Also, for reference, there's a defect report [9]
> > related
> > to global constexpr objects, and an issue tracking this problem here [10].
> >
> > Finally, regarding executable bloat, we're talking about stateless
> > constexpr
> > objects here. At worst, we're talking 1 byte per object. At best (and most
> > likely), we're talking about 0 bytes because of link time optimizations.
> > Otherwise, I could also give internal linkage to the global objects and
> > they
> > would probably be optimized away by the compiler itself, without even
> > requiring LTO. Am I dreaming?
>
> The size of the symbol table is usually larger than 1 byte for binary
> formats.

But is it reasonable to think that they'll be optimized away?

> > > Overall, I would like to see Hana compile on more compilers before it
> > gets
> > > accepted into boost(currently it doesn't even compile on my macbook).
> >
> > What compiler did you try to compile it with? Also, an important GCC bug
> > that was preventing Hana from working properly on GCC was fixed a couple
> > of days ago, so I'm going to try to finish the port ASAP and I'm fairly
> > confident that it should work on GCC 5.2.
>
> Using Apple's clang 6, which corresponds to clang 3.5 off of the trunk.

So you're with XCode < 3.6, right? This is not supported, unfortunately.
I don't know when Apple branched of clang 3.5, or what they did to it,
but it happily explodes on my computer too.

> What is the bug preventing compilation on gcc 5.2?

There are several of them. An important one was [2], which was recently
fixed. I'll continue working on the port as time permits.

Regards,
Louis

[1]: https://github.com/ldionne/hana/issues/125
[2]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65719


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk