Subject: Re: [boost] [Hana] Formal review
From: Louis Dionne (ldionne.2_at_[hidden])
Date: 2015-06-19 10:07:06


Joel de Guzman <djowel <at> gmail.com> writes:

>
> On 6/19/15 9:12 AM, Louis Dionne wrote:
> >
> > [...]
> >
> > However, even though you may be content with the ease of writing TMP code in
> > C++14, I think you might be surprised to see how shorter it could be if you
> > used Hana. This is, for example, the case of Zach's Units-BLAS library. It
> > was really quite short with C++14 only, but it was even shorter with Hana.
> > And the code was written with a higher level of abstraction. And in that
> > case, there was even a compile-time speedup over std::tuple + handwritten
> > stuff.
>
> "really quite short" is good enough if the cost is having to depend on
> a "32k header mega library" as David notes. Not to mention having to
> learn another library for me and for all future maintainers.

Regarding the "32k header mega library" thing, I'd like to clarify that it's
not as bad as it seems. First, part of it is just documentation. Second, that's
a _real_ 32 kLOC, not 100 kLOC of dependencies hidden behind a 5 kLOC library.
Hana has no dependencies except the <type_traits>, <utility> and <cstddef>
headers, which you probably already use anyway. In comparison, including almost
any other Boost library will pull in a lot more than 32 kLOC of dependencies.

> So where is the speedup? is it because of std::tuple? If so, why don't you
> decouple your nice tuple implementation and offer it separately?
> Or is it the handwritten stuff? If so why?

The speedup comes partly from the tuple implementation, but also from the tight
coupling between some algorithms and that tuple implementation. There are also
a lot of small decisions you can make in a library to reduce compile times,
even if each one seems to have only a minor impact on its own. For example, I
decided to use static_cast<T&&>(t) instead of std::forward<T>(t) everywhere
in the code, because Hana uses perfect forwarding a lot and I measured a 13.8%
speedup of my test suite just from using static_cast. The difference was the
cost of instantiating the std::forward function template.
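
To make the substitution concrete, here is a minimal sketch (not code taken
from Hana). The names Category, category, via_std_forward and via_static_cast
are hypothetical and exist only for this illustration. It shows that the cast
preserves the value category exactly as std::forward does; it does not try to
reproduce the timing measurement.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical names, for illustration only (not part of Hana).
enum class Category { lvalue, rvalue };

// Overloads that report the value category of the argument they receive.
inline Category category(std::string&)  { return Category::lvalue; }
inline Category category(std::string&&) { return Category::rvalue; }

// Forwarding via std::forward: each distinct T instantiates std::forward<T>.
template <typename T>
Category via_std_forward(T&& t) {
    return category(std::forward<T>(t));
}

// Forwarding via a plain cast: reference collapsing on T&& yields the same
// value category, without instantiating any extra function template.
template <typename T>
Category via_static_cast(T&& t) {
    return category(static_cast<T&&>(t));
}

// Small self-check: both spellings forward lvalues and rvalues identically.
inline void demo() {
    std::string s = "hana";
    assert(via_std_forward(s) == via_static_cast(s));
    assert(via_std_forward(std::string("tmp")) ==
           via_static_cast(std::string("tmp")));
}
```

The two spellings behave the same at run time; the only difference is that
the cast is a language construct, so the compiler has no function template to
instantiate for it.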

> Why can't Zacc use the same tricks that you used in Hana?

He can, but then he'll be rewriting quite a bit of code he wishes he didn't
have to rewrite. He is also unlikely to end up with something as fast,
for the simple reason that he does not want to spend three days optimizing
the compile time of an algorithm, like I sometimes do. That's nothing new;
you spend a lot of time making a library really good at what it does, and
then other people use it. And another day, you'll use one of their libraries.

> > Anyway, I'd like to have a look at your Phoenix-lite project, to see if it
> > could be written using Hana, and how so. It is also definitely possible that
> > no gains can be obtained from using Hana in that project, in which case that
> > would give me a good example of what _not_ to use Hana for.
> >
> >
> >> There are other issues, such as debuggability of TMP code using the
> >> lambda trick, if you are still using that for CT efficiency, but I
> >> guess I need to dive deeper to give a real review. Not being able
> >> to debug TMP code is a showstopper for me.
> >
> > I'm not using the lambda trick anymore because of shabby support for
> > generic lambdas and the lack of constexpr lambdas.
>
> Ok, so what new tricks are you using to speed up compile time then?
> In my experience, the main reason for excessive compile time are
> the long type names. How are you able to overcome that?

Nope, the type names are still long AFAICT. Hana is not magic; it won't
suddenly give you good compile times and good error messages. We're still in
C++. However, it tries very hard to be clever whenever it can, and if you write
your code in a half-decent way, you should end up with something OK at
compile time. Of course, I expect writing a complex metaprogram with Hana
will also result in long compile times, but my goal is to make it faster
than, or on par with, something you would write yourself.

To do much better, we would probably need a compiler-provided closure type.
Basically, std::tuple as a compiler intrinsic. I think this could be lightning
fast, but we're not there yet.

Regards,
Louis

