Hi,
(I originally posted this in the Boost Archive, but it seems I should send it to the ublas mailing list directly.)
I'm Wei Wang, a CS master's student focusing on high-performance computing.
`boost::ublas` project 3, adding GPU computation, interests me a lot, and I'd
like to help add this feature to ublas. I see this project was also on last
year's list, and I'm curious whether anyone has worked on it before and, if
so, how far they got.
I've already read the initial source code of `ublas` in `boost 1.29` (I also
read the 1.66 API and found it adds a `container` concept in place of what
used to be `bounded_array` and `unbounded_array`). I wrote an article
describing its template parameter deduction relationships. I have also
written a series of blog posts on how to use OpenCL efficiently with proper
data partitioning and memory usage.
This is my first time participating in GSoC, and I'm a bit confused about the
following questions:
1. Integrating OpenCL requires preparing a context, command_queue, event,
and other "environment objects" (see the first sketch after this list). Should they also be included in this library?
2. Take matrix-matrix multiplication A*B as an example (see the second sketch
below). The last stage before matrix copy assignment is in the
`matrix_matrix_prod` class, and its evaluation loops through all items of
both matrices. If I want to add GPU compute features, I need to launch a
kernel for each computation expression at this step, but that seems to
contradict `ublas`'s lazy-evaluation rationale. Is it possible to bypass that
rule?
3. What should I implement for the competency-test matrix class (see the last
sketch below)? Just an integer matrix, or a templated matrix class? Should I
support the current `ublas` interface (those typedefs and traits)?
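
To make question 1 concrete, here is a minimal sketch of the boilerplate I
mean, written against the plain OpenCL C API (this is just the usual setup
sequence, not existing ublas code):

```cpp
#define CL_TARGET_OPENCL_VERSION 120  // silence header warnings about the target version
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_platform_id platform;
    cl_device_id device;
    cl_int err;

    // Pick the first platform and the first GPU device on it.
    err = clGetPlatformIDs(1, &platform, nullptr);
    if (err != CL_SUCCESS) { std::printf("no OpenCL platform found\n"); return 1; }
    err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);
    if (err != CL_SUCCESS) { std::printf("no GPU device found\n"); return 1; }

    // Context and command queue: the "environment objects" every buffer
    // transfer and kernel launch needs. Events come back from the enqueue calls.
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

    // ... create buffers, build programs, and enqueue kernels here ...

    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}
```

My question is essentially whether these objects should be created and owned
inside the library or passed in by the user.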
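
For question 2, this is the behaviour I am referring to: `prod(A, B)` only
builds an expression object, and the per-element loops run when the result is
assigned. A minimal example with plain ublas (no GPU involved):

```cpp
#include <boost/numeric/ublas/matrix.hpp>

namespace ublas = boost::numeric::ublas;

int main() {
    ublas::matrix<float> A = ublas::scalar_matrix<float>(64, 64, 1.0f);
    ublas::matrix<float> B = ublas::scalar_matrix<float>(64, 64, 2.0f);

    // prod(A, B) returns an expression template; no arithmetic runs here.
    // The element-wise evaluation happens when the assignment below walks
    // the expression -- the step where I would want to launch a kernel
    // for the whole product instead.
    ublas::matrix<float> C = ublas::prod(A, B);
    return 0;
}
```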
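
For question 3, by "templated matrix class with ublas-style typedefs" I mean
something along these lines (purely my own hypothetical sketch for the
competency test, not an existing ublas class):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical competency-test matrix: templated on the value type and
// exposing the typedefs that ublas containers also provide.
template <class T>
class simple_matrix {
public:
    typedef T value_type;
    typedef T &reference;
    typedef const T &const_reference;
    typedef std::size_t size_type;

    simple_matrix(size_type rows, size_type cols, const T &init = T())
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    size_type size1() const { return rows_; }  // row count, ublas naming
    size_type size2() const { return cols_; }  // column count, ublas naming

    reference operator()(size_type i, size_type j) { return data_[i * cols_ + j]; }
    const_reference operator()(size_type i, size_type j) const { return data_[i * cols_ + j]; }

private:
    size_type rows_, cols_;
    std::vector<T> data_;
};

int main() {
    simple_matrix<int> m(2, 3, 0);  // 2x3 matrix of ints, zero-initialized
    m(0, 1) = 42;
    return 0;
}
```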
Best regards,
Wei Wang
Sent from Mail for Windows 10