Subject: Re: [boost] [compute] Review comments
From: Kyle Lutz (kyle.r.lutz_at_[hidden])
Date: 2014-12-30 14:33:12


On Tue, Dec 30, 2014 at 4:12 AM, Asbjørn <lordcrc_at_[hidden]> wrote:
> On 29.12.2014 01:42, Kyle Lutz wrote:
>>
>> On Sun, Dec 28, 2014 at 1:40 PM, Asbjørn <lordcrc_at_[hidden]> wrote:
>>>
>>> 2) I did miss async versions of the algorithms, which would make it
>>> possible to chain together multiple calls. Even though all the data sits
>>> on the compute device, the overhead of waiting for each operation to
>>> finish before queuing the next can make the compute gains completely
>>> irrelevant.
>>
>>
>> Can you let me know what chain of functions you're calling? Many
>> algorithms should already execute asynchronously and provide the
>> behavior you expect.
>
>
> Seems I had missed this crucial part. Given that copy() is synchronous and
> has a separate async version, it's easy to forget that the other operations
> are enqueued without blocking. For example, the reference page of
> transform()[1] makes no mention of it being asynchronous. The behavior
> makes sense assuming the default in-order command queue execution, but I
> think it should be stated more explicitly to make it harder to forget late
> at night :)

I agree, I'll work on documenting this behavior better.
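
To illustrate the in-order behavior discussed above, here is a minimal sketch in plain standard C++. It is a single-threaded toy model, not Boost.Compute's implementation: the names InOrderQueue, transform_enqueued and copy_blocking are hypothetical, and "asynchronous" is modelled simply as "recorded now, executed later in FIFO order". It shows why chained enqueue-only calls need no wait between them, and why only the final blocking read has to drain the queue.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical toy model of an in-order command queue (an assumption for
// illustration only): enqueue() records work, finish() runs it in order.
class InOrderQueue {
public:
    // Returns immediately; the command has not run yet.
    void enqueue(std::function<void()> command) {
        pending_.push(std::move(command));
    }

    // Runs every recorded command, oldest first, before returning.
    void finish() {
        while (!pending_.empty()) {
            std::function<void()> command = std::move(pending_.front());
            pending_.pop();
            command();
        }
    }

    // Number of commands recorded but not yet executed.
    std::size_t pending() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};

// Enqueue-only analogue of a transform(): op is applied to each element in
// place once the queue reaches this command. data must outlive the queue's
// pending work because it is captured by reference.
template <typename T, typename Op>
void transform_enqueued(std::vector<T>& data, Op op, InOrderQueue& queue) {
    queue.enqueue([&data, op]() {
        for (T& value : data) value = op(value);
    });
}

// Blocking analogue of a synchronous copy(): drains the queue first, so the
// caller sees the result of everything enqueued before it.
template <typename T>
std::vector<T> copy_blocking(const std::vector<T>& data, InOrderQueue& queue) {
    queue.finish();
    return data;
}
```

In this model, two transform_enqueued() calls in a row both return before any element is touched, and the in-order guarantee means the second sees the output of the first once copy_blocking() drains the queue.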

>>> 3) I think relevant calls should have a non-throwing form returning an
>>> error code, à la Boost.Asio.
>>
>>
>> This could be implemented, but would be a large amount of work
>> (essentially doubling the size of the public API). Can you let me know
>> more about your use-case and why the current exception-based API is
>> not suitable?
>
>
> Fair enough. From experience, the main errors which one can handle sensibly
> would be insufficient memory for a buffer (one can try using smaller
> buffers/reduce dataset/alternate algorithm) and kernels failing to
> compile/run due to lack of registers or similar resources (one can try an
> alternate algorithm/kernel).
>
> For example, using a single large buffer may be significantly faster, but if
> the max buffer size is small, one can switch to a ping-pong algorithm.

Yeah, I would be very interested in exploring error handling
strategies like these.
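
As a rough sketch of what such a strategy could look like, the following plain standard C++ pairs a throwing call with a non-throwing overload that reports through std::error_code, in the style Boost.Asio uses, and then falls back from one large buffer to a ping-pong pair. Everything here is hypothetical and for illustration only: FakeDevice, allocate, Plan and choose_plan are not part of Boost.Compute, and the "device" is just a size limit held in a struct.

```cpp
#include <algorithm>
#include <cstddef>
#include <system_error>
#include <vector>

// Hypothetical stand-in for a device with a maximum single-allocation size.
struct FakeDevice {
    std::size_t max_alloc_bytes;
};

// Non-throwing form: an over-limit request is reported through ec.
std::vector<char> allocate(const FakeDevice& device, std::size_t bytes,
                           std::error_code& ec) {
    ec.clear();
    if (bytes > device.max_alloc_bytes) {
        ec = std::make_error_code(std::errc::not_enough_memory);
        return {};
    }
    return std::vector<char>(bytes);
}

// Throwing form, written in terms of the non-throwing one.
std::vector<char> allocate(const FakeDevice& device, std::size_t bytes) {
    std::error_code ec;
    std::vector<char> buffer = allocate(device, bytes, ec);
    if (ec) throw std::system_error(ec, "allocate");
    return buffer;
}

enum class Strategy { SingleBuffer, PingPong };

struct Plan {
    Strategy strategy;
    std::size_t buffer_bytes;  // size of each buffer the plan uses
};

// Prefer one large buffer; if the device rejects it, fall back to two
// smaller alternating buffers that each fit within the device limit.
Plan choose_plan(const FakeDevice& device, std::size_t total_bytes) {
    std::error_code ec;
    allocate(device, total_bytes, ec);  // probe only; the result is discarded
    if (!ec) return {Strategy::SingleBuffer, total_bytes};

    const std::size_t half = (total_bytes + 1) / 2;
    return {Strategy::PingPong, std::min(half, device.max_alloc_bytes)};
}
```

The error-code overload keeps the recoverable case (a buffer that is too large) out of the exception path, while callers who prefer exceptions can still use the two-argument form.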

-kyle

