Boost logo

Boost :

Subject: Re: [boost] Hana Views and References
From: Joel de Guzman (djowel_at_[hidden])
Date: 2015-06-27 18:46:47


On 6/27/15 10:48 PM, Louis Dionne wrote:
>> Joel de Guzman <djowel <at> gmail.com> writes:
>> [...]
>>
>> Your tests focused on a mere subset of Fusion which is the least
>> optimized: fusion::vector. Your tests do not cover other areas
>> such as views, and other containers.
>
> Frankly, I thought fusion::vector would be the most efficient container.
> Isn't vector documented as such? Quoting from the documentation:
>
> vector is the simplest of the Fusion sequence container (a vector with N
> elements is just a struct with N members), and in many cases the most
> efficient.

That was true a decade ago in a purely c++03 world. The C++ world
has changed since then and fusion has evolved since then. I'm sorry
for the confusion. Anyway, Fusion is undergoing transformation again
following advances in C++. The documentation will have to be updated.

Anyway, it's not just containers I am referring to. And you know now
that I am referring to views, and on runtime performance.

>> I can imagine a case, when say
>> you push_back or insert where views should shine. It's the difference
>> between copy by value and pass by reference. Try doing that with large
>> tuple elements (those without resources and are not efficiently
>> moved). Actually, have you tried reverse with large tuples like
>> bitmaps and stuff?
>>
>> Hmmmm, how did you test reverse in Fusion BTW? Fusion reverse should
>> return a view with ZERO(!) copy. Did you create another vector from the
>> reverse_view? I guess so, since your test subject is Fusion vector.
>> In that case, you are not doing it right! You don't copy to another
>> vector. You should use the results as-is. Lazily.
>>
>> For that matter, the same is true with transform! Fusion returns
>> transform_view. You should get zero overhead with those because
>> they are lazy. If you are copying back to a fusion vector, then
>> you are not doing it right. Have you tried accessing, say only
>> the Nth element instead of (presumably) copying back to a vector
>> before accessing the Nth element? or how about transforming only
>> a range within the container?
>>
>> It seems you are doing your tests unfairly using eager evaluation
>> (Hana) instead of lazy evaluation (Fusion).
>
> There seems to be a larger issue here. That issue is: How to benchmark
> lazy computations and eager computations? To illustrate the problem,
> consider the following "benchmark":
>
> transforming a vector in C++:
>
> #include <algorithm>
> #include <vector>
>
> int main() {
> std::vector<int> xs(1000);
> std::vector<int> ys; ys.reserve(xs.size());
> std::transform(xs.begin(), xs.end(), std::back_inserter(ys), [](int
> x) {
> return x + 1;
> });
> }
>
> transforming a list in Haskell:
>
> import Data.List
>
> main = do
> let xs = take 1000 (repeat 0)
> let ys = fmap (+1) xs
> return ()
>
> Written this way, the Haskell code will always be faster because it does not
> do anything, since it is lazy. However, that does not tell us much about
> what we're interested in. What we'd like to know is how the above code
> behaves when I _actually_ access the elements of the lazy list. So, of
> course I copy the result into vectors when benchmarking Fusion algorithms,
> since otherwise there's nothing to benchmark!

And so you are doing it wrong! What you are getting is the extra construction.
The result *IS* already a sequence that can be traversed. That extra
copy is not necessary. If you've traversed over the elements of the
sequence instead, you should see a gain with Fusion because Hana
will have to create a new tuple while Fusion will not.

> I think this ought to be explained in the documentation, but I don't think
> it
> invalidates Hana's claims. For equivalent functionality (i.e. when accessing
> all the members of the resulting container, which is assumed to usually be
> the case), Hana will perform as good as Fusion.

Try getting a single element, or a range, for example and, how about trying
multiple concatenated algorithms, or how about push_back and insert multiple
times? I can think of a LOT of examples where views will blow eager evaluation
out the water. I'm sure you've seen Eric's range V3 presentation? Imagine
doing all that computation eagerly and you'll know what I mean!

Bottom line, what you did in your tests is force an apple to be an orange
and test it against an orange.

That said, I'm still surprised that Fusion withstood the tests and
performed admirably despite the added construction step. The test on
reverse deserves investigation and optimization though, so I'll take
the liberty of using your words, changing "Hana" to "Fusion":

   Performance is important to us: if you ever encounter a scenario
   where Fusion causes bad code to be generated (and the fault is
   not on the compiler), please open an issue so the problem can be
   addressed.

Regards,

-- 
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk