Subject: Re: [boost] Synchronization (RE: [compute] review)
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2014-12-30 09:08:21


On 30/12/14 14:48, Gruenke,Matt wrote:
> -----Original Message-----
> From: Boost [mailto:boost-bounces_at_[hidden]] On Behalf Of Thomas M
> Sent: Tuesday, December 30, 2014 7:37
> To: boost_at_[hidden]
> Subject: Re: [boost] Synchronization (RE: [compute] review)
>
>> If you are going to implement such RAII guards here's a short wish-list of features / guard classes:
>>
>> a) make guards "transferable" across functions
> I agree they should be movable, but it makes no sense for them to be copyable.
>
>
>> b) a container of guards and/or a guard for a wait_list as a whole
> Hmmm... I can see the benefits (convenience). I'd make it a different type, though.
>
> I assume it should hold a reference to the list? Since the guarantee is designed to block when the wait_list goes out of scope, I think it's reasonable to assume its scope is a superset of the guarantee's.
>
>
>> c) a guard for a command-queue as a whole
>> [possibly guards for other classes as well]
> Why? Convenience?
>
> Unless you're using it as a shorthand for waiting on individual events or wait_lists, there's no need. The command_queue is internally refcounted. When the refcount goes to zero, the destructor will block on all outstanding commands.
>
>
>> a) + b) because something like this is really useful:
> Um... how about this:
>
> void foo()
> {
> // setup all memory objects etc.
>
> wait_list wl;
> wait_list::guarantee wlg(wl);
>
> // send data to device
> wl.insert(cq.enqueue_write_buffer_async(devmem, 0, size, host_ptr));
> wl.insert(cq.enqueue_write_buffer_async(devmem2, 0, size, host_ptr2));
>
> // a kernel that reads devmem and devmem2 and writes to devmem
> wl.insert(cq.enqueue_task(kern, wl)); // Note: wl is copied by enqueue funcs
>
> // copy result back to host
> wl.insert(cq.enqueue_read_buffer_async(devmem, 0, size, host_ptr, wl));
>
> // wl.wait() would only be necessary if you wanted to access the results, here.
>
>
> // Enqueue an independent set of operations with another wait_list
> wait_list wl_b;
> wait_list::guarantee wlg_b(wl_b);
>
> // send data to device
> wl_b.insert(cq.enqueue_write_buffer_async(devmem_b, 0, size_b, host_ptr_b));
>
> // ...
> }
>
>
Maybe you can follow the task_region design (See
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4088.pdf).
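
To make the suggestion concrete, here is a rough sketch of what a task_region-style scope could look like for this use case. The names wait_region and wait_region_handle are hypothetical, and std::async merely stands in for enqueueing a command that yields an event; none of this is Boost.Compute API or the exact interface proposed in N4088.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <utility>
#include <vector>

// Hypothetical handle passed to the body of a region. Each run() call
// starts an asynchronous operation and records its completion handle.
// std::future<void> is only a stand-in for a device event.
class wait_region_handle {
public:
    // F must be callable with no arguments and return void.
    template <class F>
    void run(F&& f) {
        pending_.push_back(std::async(std::launch::async, std::forward<F>(f)));
    }

    // Block until everything started so far has completed.
    // get() rethrows an exception thrown by the operation, if any.
    void wait() {
        for (auto& p : pending_) {
            assert(p.valid());
            p.get();
        }
        pending_.clear();
    }

private:
    std::vector<std::future<void>> pending_;
};

// Hypothetical region function: runs the body, then joins every
// operation started through the handle before returning. If the body
// throws, the futures still join, because the destructor of a future
// obtained from std::async blocks until the operation finishes.
template <class Body>
void wait_region(Body&& body) {
    wait_region_handle h;
    std::forward<Body>(body)(h);
    h.wait();
}

// Usage: three independent operations, all complete at the end of the region.
int run_region_example() {
    std::atomic<int> completed{0};
    wait_region([&](wait_region_handle& h) {
        h.run([&] { ++completed; });
        h.run([&] { ++completed; });
        h.run([&] { ++completed; });
    });
    return completed.load();
}
```

The point of this shape is that the synchronization is tied to a function scope rather than to a separate guard object, so there is nothing to forget to declare.
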
>> With c) I have something like this in mind:
> What about this?
>
> {
> command_queue cq(cntx, dev);
> command_queue::guarantee cqg(cq);
> cq.enqueue_write_buffer_async(devmem, 0, size, host_ptr);
> transform(..., cq); // implicitly async
> cq.enqueue_read_buffer_async(...);
>
> // here automatic synchronization occurs
> }
>
>
> It does presume that command_queues are local and tied to related batches of computations. Those assumptions won't always hold.
The same suggestion applies here: the task_region design could serve as a model.
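
On the earlier point that guards should be movable but not copyable, a minimal sketch of such a guard might look like the following. wait_guard and fake_wait_list are hypothetical names used only for illustration; the guard holds a pointer to the waitable object, so that object is assumed to outlive the guard, as discussed above.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a wait_list: it only counts calls to wait().
struct fake_wait_list {
    int waits = 0;
    void wait() { ++waits; }
};

// Movable, non-copyable guard that calls wait() on its target when it is
// destroyed. A moved-from guard is empty and does nothing.
template <class Waitable>
class wait_guard {
public:
    explicit wait_guard(Waitable& w) noexcept : target_(&w) {}

    wait_guard(const wait_guard&) = delete;
    wait_guard& operator=(const wait_guard&) = delete;

    wait_guard(wait_guard&& other) noexcept
        : target_(std::exchange(other.target_, nullptr)) {}

    wait_guard& operator=(wait_guard&& other) {
        if (this != &other) {
            finish();  // wait on the current target before taking a new one
            target_ = std::exchange(other.target_, nullptr);
        }
        return *this;
    }

    // Note: destructors are implicitly noexcept, so a throwing wait()
    // would terminate the program here.
    ~wait_guard() { finish(); }

private:
    void finish() {
        if (target_) {
            target_->wait();
            target_ = nullptr;
        }
    }

    Waitable* target_;
};

static_assert(!std::is_copy_constructible<wait_guard<fake_wait_list>>::value,
              "guards must not be copyable");

// A guard can be handed out of a function ("transferable").
wait_guard<fake_wait_list> make_guard(fake_wait_list& wl) {
    return wait_guard<fake_wait_list>(wl);
}

int run_guard_example() {
    fake_wait_list wl;
    {
        wait_guard<fake_wait_list> outer = make_guard(wl);
        wait_guard<fake_wait_list> inner = std::move(outer);  // outer is now empty
        assert(wl.waits == 0);  // nothing has waited yet
    }  // inner waits exactly once; outer does nothing
    return wl.waits;
}
```
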

Best,
Vicente

