Boost :

Date view	Thread view	Subject view	Author view

Subject: Re: [boost] interest in an asynchronous linear algebra library?
From: Oswin Krause (Oswin.Krause_at_[hidden])
Date: 2015-12-07 03:47:36

Next message: Adam Wulkiewicz: "[boost] [boost-users][boost-announce][QVM] Formal review begins today"
Previous message: Damien Buhl: "Re: [boost] [asio] Any interest in a boost::asio::mockup_serial_port ?"
In reply to: David Bellot: "Re: [boost] interest in an asynchronous linear algebra library?"

Hi David,

> And your work would be recognized as being part of ublas directly and
> be
> used right away by thousands of users.
> Let me know if you're interested contributing back to ublas instead.

Yes, I am interested, but currently I do not see a roadmap as ublas is
SO big, that I fear that I can not fix as much code as my proposed
changes would break (even without the asynchronous part, gpus...). One
of the starting points of my rewrite was that I deleted around 40k lines
of ublas code before i started and then designed the interface around a
defined subset of features (namely one dense format and one sparse
format, later adding the triangular packed format. I still haven't
gotten sparse right, though.). From then on i decided to break the
functionality apart in a few places (assignment, kernels, and iterators)
and use the expression templates as a simple connection between the
parts.

Another point: I would not propose a major change to an existing boost
library without triggering a formal boost review process - this change
would break any code that relies on any ublas internals.

Best,
Oswin

> Cheers,
> David
>
> On Thu, Nov 19, 2015 at 2:55 PM, Oswin Krause <
> Oswin.Krause_at_[hidden]> wrote:
>
>> Hi list,
>>
>> Disclaimer: What I am proposing here does not exist (yet). But it will
>> in
>> the near future and if there is interest i would like to give
>> something
>> back to the boost community.
>>
>> I would like to propose a new library to boost: asynchronous blas.
>>
>> What is it and what can I do with it?
>> It is a linear algebra library where all computations are carried out
>> in
>> an asynchronous manner on GPU and CPU. Computing a matrix-matrix
>> product
>> does not block the current thread but instead launches a task which at
>> some
>> point returns with a result, allowing the main thread to continue
>> until the
>> result is needed. GPU and CPU tasks can be started at the same time
>> depending on each other (e.g. a CPU computation waits until the GPU
>> returned some needed result and vice-versa)
>>
>> Why is this important? Which problems does it solve?
>> Many linear algebra libraries are target towards either
>> a) solving a few large linear algebra problems
>> b) solving many small similar problems
>>
>> This library is targeted towards the middle ground: a medium amount of
>> decently sized problems. Here the problem is that it is often hard to
>> use
>> all the computing resources available as a single problem will not
>> exhaust
>> the device resources and there is no easy way to launch several
>> similar
>> tasks in parallel.
>> This problem occurs often in GPU computing where out-of-order
>> execution of
>> kernels is used to fill up the device. Similarly, it is often hard to
>> make
>> CPU and GPU work together as GPU tasks are asynchronous while CPU
>> tasks are
>> often synchronous and it is not straight forward to couple both
>> without
>> blocking either CPU or GPU.
>>
>> How does it work?
>> Expression templates similar to boost::ublas are used to create a
>> dependency graph: which object is used in which computations and which
>> tasks have to wait to ensure a correct outcome?
>>
>> a simple case would be two independent computations like
>>
>> matrix A=prod(B,trans(B));
>> matrix C=prod(trans(B),B);
>>
>> a normal library like ublas would wait until A is computed before C is
>> evaluated. Instead, it can be more efficient to just add the two tasks
>> to a
>> list and let worker threads execute them as soon as resources are
>> available. First the computation of A is put into the graph and A is
>> marked
>> as "being written to" while B is marked as "being read". When C is
>> added to
>> the list, it does not depend on the first computation as several tasks
>> can
>> read from B at the same time. So another worker thread can grab the
>> task
>> and execute it.
>>
>> If we now add the task
>>
>> B += prod(D,prod(C,trans(D));
>>
>> the library computes it as
>>
>> matrix temp = prod(C,trans(D);
>> B += prod(D,temp).
>>
>> temp can be computed as soon as C is computed. But B can only be
>> computed
>> after A and temp are computed as we can only write to B after all
>> previous
>> tasks using B are finished.
>>
>> Are you qualified?
>> I am main contributor to https://github.com/Shark-ML/Shark/
>> and as part of it I forked
>> ublas
>> https://github.com/Shark-ML/Shark/tree/master/include/shark/LinAlg/BLAS
>> which now uses several linear algebra libraries as possible backends
>> (ATLAS, openBLAS...).
>>
>>
>> What is the timeline?
>> I would start developing a minimal example ASAP based on the rewrite
>> of
>> ublas above. I would use a minimum set of features (only dense square
>> matrices) and basic GPU support.
>>
>> Opinions?
>>
>> Best,
>> Oswin
>>
>> _______________________________________________
>> Unsubscribe & other changes:
>> http://lists.boost.org/mailman/listinfo.cgi/boost
>>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost

Next message: Adam Wulkiewicz: "[boost] [boost-users][boost-announce][QVM] Formal review begins today"
Previous message: Damien Buhl: "Re: [boost] [asio] Any interest in a boost::asio::mockup_serial_port ?"
In reply to: David Bellot: "Re: [boost] interest in an asynchronous linear algebra library?"

Date view	Thread view	Subject view	Author view

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk