Boost Users
Subject: Re: [Boost-users] Hybrid parallelism, no more + mpi+serialization, many questions
From: Brian Budge (brian.budge_at_[hidden])
Date: 2010-11-18 10:14:08
This partly depends on how many processors/machines you have
available. You need to find a way of partitioning your state space
into tasks, and then doling those tasks out to processes/threads. How
expensive is f()? How much memory is used?
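
One way to do the partitioning, as a rough sketch only: treat the whole parameter space as a single flat index range 0 .. n0*n1*...*n9 - 1, give each of the N processes one contiguous block of that range, and decode each flat index back into one value per argument (mixed-radix, last argument varying fastest, as in your serial loop). All names below (f, block_range, decode, local_max) are made up for illustration, f is a dummy stand-in, and every argument is assumed to have already been expanded into a non-empty list of values.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real f(): takes one value per argument.
double f(const std::vector<double>& args) {
    double s = 0.0;
    for (std::size_t i = 0; i < args.size(); ++i)
        s += (i % 2 ? -1.0 : 1.0) * args[i];
    return s;
}

// Half-open block [first, last) of flat indices owned by `rank`.
// The first (total % nprocs) ranks get one extra evaluation each.
std::pair<std::size_t, std::size_t>
block_range(std::size_t total, std::size_t nprocs, std::size_t rank) {
    std::size_t base = total / nprocs;
    std::size_t extra = total % nprocs;
    std::size_t first = rank * base + std::min(rank, extra);
    return std::make_pair(first, first + base + (rank < extra ? 1 : 0));
}

// Decode a flat index into one value per argument
// (mixed radix, last argument varies fastest).
std::vector<double>
decode(std::size_t flat, const std::vector<std::vector<double> >& axes) {
    std::vector<double> args(axes.size());
    for (std::size_t k = axes.size(); k-- > 0;) {
        args[k] = axes[k][flat % axes[k].size()];
        flat /= axes[k].size();
    }
    return args;
}

// Work done by one process: the maximum of f over its own block.
// Returns -infinity if the block is empty (more processes than evaluations).
double local_max(const std::vector<std::vector<double> >& axes,
                 std::size_t nprocs, std::size_t rank) {
    std::size_t total = 1;
    for (std::size_t k = 0; k < axes.size(); ++k) total *= axes[k].size();
    std::pair<std::size_t, std::size_t> r = block_range(total, nprocs, rank);
    double best = -std::numeric_limits<double>::infinity();
    for (std::size_t i = r.first; i < r.second; ++i)
        best = std::max(best, f(decode(i, axes)));
    return best;
}
```

Each process would call local_max with its own rank, and the per-process maxima would then be combined with a max reduction (with Boost.MPI, all_reduce with boost::mpi::maximum<double> should do it). Whether equal-sized blocks are good enough depends on the answer to the cost question above: if f() varies a lot in cost, smaller tasks handed out on demand will probably balance better.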
Brian
On Thu, Nov 18, 2010 at 3:00 AM, Hicham Mouline <hicham_at_[hidden]> wrote:
>> -----Original Message-----
>> From: boost-users-bounces_at_[hidden] [mailto:boost-users-
>> bounces_at_[hidden]] On Behalf Of Matthias Troyer
>> I would go with reading once and broadcasting, especially if, as was
>> mentioned before, one aims at going to thousands of processes. No I/O
>> system can scale, and implementing the broadcast is trivial: a single
>> function call.
>>
>> Matthias
>
> The large calculation that I currently do serially and that I intend to
> parallelize is the maximum of the return values of a large number of
> evaluations of a given "function" in the mathematical sense.
> The number of arguments of the function is only known at runtime.
> Let's say it is determined at runtime that the number of arguments is 10,
> i.e. we have 10 arguments x0, ..., x9.
> Each argument can take a different number of values, e.g. x0 can be
> x0_0, x0_1, ..., x0_n0,
> x1 can be x1_0, x1_1, ..., x1_n1, and so on. n0 and n1 are typically known
> only at runtime and differ from each other.
>
> so serially, I run
> f(x0_0, x1_0, ..., x9_0)
> f(x0_0, x1_0, ..., x9_1)
> ...
> f(x0_0, x1_0, ..., x9_n9)
> then the same over all values of x8, then x7, ..., then x0.
> There are n0*n1*...*n9 runs in total.
> Then I get the maximum of the return values.
>
>
> Suppose I have N MPI processes; ideally each process would run
> n0*n1*...*n9/N function evaluations.
> How do I split the work?
>
> In the current implementation, each x is a boost::variant over 4
> types:
> a double, a <min,max> pair, a <min, max, increment> triplet or a
> vector<double>
> A visitor is applied recursively to the variants in order to traverse the
> whole parameter space.
> apply_visitor on x0 => if, say, x0 is a triplet, then
>   for (x0 from min to max with increment)
>     apply_visitor on x1
> and so on until x9, at which point we call f with all the
> arguments collected so far.
>
> How can one parallelize such a beast?
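
A possible first step, sketched under assumptions: before any evaluation, turn each variant into an explicit list of values, so that every n_k is known up front and the nested recursion becomes a plain index range that can be split. The helper names below (expand_range, total_evaluations) are hypothetical, only the <min, max, increment> triplet case is shown, and the 1e-9 tolerance is an arbitrary choice to keep the end point when the division is slightly off due to rounding. With boost::variant, a boost::static_visitor<std::vector<double> > could return such a list for each of the four types.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: list the values lo, lo+step, lo+2*step, ...
// that do not exceed hi.
std::vector<double> expand_range(double lo, double hi, double step) {
    if (!(step > 0.0) || hi < lo)
        throw std::invalid_argument("expand_range: need step > 0 and hi >= lo");
    // Tolerance (an assumption) so that e.g. (0, 1, 0.1) keeps the end point.
    std::size_t steps =
        static_cast<std::size_t>(std::floor((hi - lo) / step + 1e-9));
    std::vector<double> values;
    values.reserve(steps + 1);
    for (std::size_t i = 0; i <= steps; ++i)
        values.push_back(lo + i * step);  // computed from i, so no accumulated drift
    return values;
}

// Number of evaluations in the full grid: the product of the list sizes.
std::size_t total_evaluations(const std::vector<std::vector<double> >& axes) {
    std::size_t total = 1;
    for (std::size_t k = 0; k < axes.size(); ++k) total *= axes[k].size();
    return total;
}
```

Once every argument is a list like this, the total number of evaluations is just the product of the list sizes, and dividing that count among N processes no longer depends on how each argument was originally specified.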
>
> rds,