Boost Users :
Subject: Re: [Boost-users] Extending boost::function for mathematics?
From: Jesse Perla (jesseperla_at_[hidden])
Date: 2009-01-25 13:41:52
Thanks as always, Stephen (who I still refuse to accept is either a single
individual or human). Sorry to all if this is getting so long and specific;
the project is justified later in the message.
Your comments generated a lot of thought about what my main problems are. I
think I distracted myself with the generation of derivatives before I had
worked out some other issues, but your implementation looks very elegant.
For now, let's assume that people supply the gradient/jacobian directly or
from their own finite differencing/automatic differentiation. So here are
some more basic questions.
Question 1: Based on your template, how do I forward a call through to the
boost::function's operator() in this wrapper? Why might we want to do this?
Because we are trying to implement a concept with operator() and
.gradient(...) using a function wrapper here. Library writers may want to
write generic code that operates on anything modeling that concept, not just
objects wrapped in math_function... Then we may get true inlining of
operator() and .gradient(...) when a full differentiable function object is
supplied, instead of always going through the indirection in boost::function.
template<class Signature>
class math_function {
public:
    typedef typename boost::function<Signature>::result_type result_type;

    template<class F, class G>
    math_function(const F& f, const G& g) : f_(f), g_(g) {}

    // How do I extract the input parameter types from Signature here?
    result_type operator()(/* myvalues... */)
    {
        return f_(/* myvalues... */);
    }

    // Same question for the gradient.
    result_type gradient(/* myvalues... */)
    {
        return g_(/* myvalues... */);
    }

private:
    boost::function<Signature> f_;
    boost::function<Signature> g_;
};
void test_mathfunctions()
{
    using lambda::_1;
    using lambda::_2;

    math_function<double (double)> myfunc(_1 * _1, 2 * _1); // simple quadratic
    cout << myfunc(.1);
    cout << myfunc.gradient(.1);
}
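One direction I could imagine for the "extract the inputs from the signature"
question (only a sketch, and probably not what you had in mind): replace the
single primary template with one partial specialization per arity, roughly the
way boost::function itself is organized, so the argument types are simply
spelled out. For more than a couple of arguments this would presumably be
generated with Boost.Preprocessor.

#include <boost/function.hpp>

template<class Signature> class math_function; // primary template left undefined

// one-argument signatures: R (A1)
template<class R, class A1>
class math_function<R (A1)> {
public:
    typedef R result_type;

    template<class F, class G>
    math_function(const F& f, const G& g) : f_(f), g_(g) {}

    result_type operator()(A1 a1) const { return f_(a1); }
    result_type gradient(A1 a1) const { return g_(a1); }

private:
    boost::function<R (A1)> f_;
    boost::function<R (A1)> g_;
};

// two-argument signatures: R (A1, A2)
template<class R, class A1, class A2>
class math_function<R (A1, A2)> {
public:
    typedef R result_type;

    template<class F, class G>
    math_function(const F& f, const G& g) : f_(f), g_(g) {}

    result_type operator()(A1 a1, A2 a2) const { return f_(a1, a2); }
    result_type gradient(A1 a1, A2 a2) const { return g_(a1, a2); }

private:
    boost::function<R (A1, A2)> f_;
    boost::function<R (A1, A2)> g_;
};

That keeps the test above working unchanged, at the cost of writing (or
generating) one specialization per arity.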
Question 2: How can this wrapping interact with binding?
void test_mathfunctions_binding()
{
    using lambda::_1;
    using lambda::_2;

    // But you might also want a constant thrown in there... Why a separate
    // parameter? Because you don't take the derivative over the other stuff,
    // and you want to keep your functions pure so you can bind, not force
    // them to be stateful, etc.
    math_function<double (double, double)> myfunc(_1 * _1 + _2, 2 * _1);
    // simple quadratic translated by a constant

    // But how would this work with bind? Would the following work immediately?
    boost::function<double (double)> myfunc_bound = bind(myfunc, _1, 2.0);
    boost::function<double (double)> myfuncderiv_bound =
        bind(&math_function<double (double, double)>::gradient, &myfunc, _1, 2.0);
        // or something like this... I always forget member-function notation.

    // Any way to bind the entire structure back to a math_function? Very
    // useful, since we would bind parameters before calling optimizers, etc.
    // With dynamic programming we usually have a grid of values and call the
    // optimizer binding over the entire grid.
    math_function<double (double)> myfunc_math_bound = math_bind(myfunc, _1, 2.0);
    cout << myfunc_math_bound.gradient(1.0); // the bound object still exposes the gradient
}
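math_bind doesn't exist, of course. To make the idea concrete, here is a
minimal sketch for the one specific case above: binding away the trailing
double of a two-argument math_function, assuming the per-arity specializations
sketched under Question 1. The name math_bind_second is made up, and a real
math_bind would have to handle arbitrary arities and placeholders.

#include <boost/bind.hpp>

template<class R, class A1, class A2>
math_function<R (A1)>
math_bind_second(const math_function<R (A1, A2)>& f, A2 a2)
{
    return math_function<R (A1)>(
        boost::bind(f, _1, a2),                                        // bound evaluation
        boost::bind(&math_function<R (A1, A2)>::gradient, f, _1, a2)); // bound gradient
}

The point is simply that the gradient gets bound along with the function, so
the result still models the same concept.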
Question 3: How would we adapt a structure implementing operator() and
gradient() so it can be stored here, just as boost::function accepts any
function object?
struct square
{
    double operator()(double x) const { return x * x; }
    double gradient(double x) const { return 2 * x; }
};
square sq;
math_function<double (double)> myfunc(sq);
Given the previous code, would we write a constructor like this?

template<class F>
math_function(const F& f)
    : f_(f), g_(bind(&F::gradient, f, _1 /* ...? */))
{}
// Here I don't see how to bind to the arbitrary signature, extracting the
// inputs from Signature, which may take more than one argument.
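For what it's worth, with the per-arity specializations sketched under
Question 1 this constructor seems straightforward, since the argument types
are in scope inside each specialization. A sketch for the one-argument case
(and &F::gradient obviously assumes the wrapped object really has such a
member):

// added inside the math_function<R (A1)> specialization from the Question 1 sketch
template<class F>
explicit math_function(const F& f)
    : f_(f),
      g_(boost::bind(&F::gradient, f, _1)) // assumes F provides gradient(A1)
{}

With that, "square sq; math_function<double (double)> myfunc(sq);" would work,
at least for this arity.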
Question 4: What are the thoughts on whether/how this is extensible to
higher dimensions? Let's start on the gradient with functions mapping
f : R^N -> R.
For these types of problems it is perfectly reasonable for the first
parameter of the function to be the only one the derivative is taken over.
The dimensionality of the derivative/gradient/jacobian would come out of a
combination of the result_type from the signature and the dimensionality of
the domain. For example, something like:
template<int DomainDim, class Signature>
class math_function {
public:
    typedef typename boost::function<Signature>::result_type result_type;
    typedef boost::array<result_type, DomainDim> gradient_result_type;
    // Note that this adds the dimensionality; the data structure could be a
    // template parameter.

    template<class F, class G>
    math_function(const F& f, const G& g) : f_(f), g_(g) {}

    // Same open question: how to extract the input parameters from Signature?
    result_type operator()(/* myvalues... */)
    {
        return f_(/* myvalues... */);
    }

    gradient_result_type gradient(/* myvalues... */)
    {
        return g_(/* myvalues... */);
    }

private:
    boost::function<Signature> f_;
    // g_ needs gradient_result_type with the OLD input Signature: how to
    // rebuild that signature from Signature and gradient_result_type?
    boost::function<Signature /* should be gradient_result_type(inputs...) */> g_;
};
// written as a separate function since I don't know how to construct arrays in lambda
boost::array<double, 2> mygrad(boost::array<double, 2> x, double a)
{
    boost::array<double, 2> ret;
    ret[0] = 2 * x[0];
    ret[1] = 2 * x[1];
    return ret;
}

void test_mathfunctions_multidim()
{
    using lambda::_1;
    using lambda::_2;

    math_function<2, double (boost::array<double, 2>, double)>
        myfunc(pow(_1[0], 2) + pow(_1[1], 2) + _2, mygrad);

    boost::array<double, 2> x = {{ 0.0, 0.0 }};
    cout << myfunc.gradient(x, 1.0)[0]; // etc.
}
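Again, with per-arity specializations the "gradient_result_type with the OLD
input Signature" problem seems to go away, because the argument types are
available to rebuild the gradient's signature. A sketch for the two-argument
case; math_function2 is just a made-up name so it does not collide with the
math_function<DomainDim, Signature> interface above, which a real version
would re-expose through partial specialization.

#include <boost/array.hpp>
#include <boost/function.hpp>

template<int DomainDim, class R, class A1, class A2>
class math_function2 {
public:
    typedef R result_type;
    typedef boost::array<R, DomainDim> gradient_result_type;

    template<class F, class G>
    math_function2(const F& f, const G& g) : f_(f), g_(g) {}

    result_type operator()(A1 a1, A2 a2) const { return f_(a1, a2); }
    gradient_result_type gradient(A1 a1, A2 a2) const { return g_(a1, a2); }

private:
    boost::function<R (A1, A2)> f_;
    boost::function<gradient_result_type (A1, A2)> g_; // same inputs, gradient result type
};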
Question 5: Now let's say that what we really want is to manage f : R^N -> R^M
functions. Here we could potentially use ublas::matrix as a default data
structure.
template<class Signature,
         class SecondDimStructure = ublas::matrix<double>,
         template<class T, int N> class NDimStructure = boost::multi_array>
// We really don't need the dims here since we are using heap-based data structures.
class math_function {
public:
    typedef typename boost::function<Signature>::result_type result_type;
    typedef SecondDimStructure gradient_result_type;
    // This really is a jacobian, size N * M.
    typedef NDimStructure<typename result_type::value_type, 3> hessian_result_type;
    // This really is size N * N * M if I remember my calculus....

    template<class F, class G>
    math_function(const F& f, const G& g) : f_(f), g_(g) {}

    // Same open question: how to extract the input parameters from Signature?
    result_type operator()(/* myvalues... */)
    {
        return f_(/* myvalues... */);
    }

    gradient_result_type gradient(/* myvalues... */)
    {
        return g_(/* myvalues... */);
    }

private:
    boost::function<Signature> f_;
    // again, needs gradient_result_type with the OLD input Signature
    boost::function<Signature /* should be gradient_result_type(inputs...) */> g_;
    // should also hold the higher-dimensional hessian of the function
};
Question 6: I also want to associate information about the nature of the
function: whether it is convex/concave, increasing, strictly increasing,
etc. These tend to have an inheritance structure, but that isn't absolutely
necessary. I was thinking about using traits classes, but I assume we would
lose all of those traits when we use type erasure? Are we better off storing
a bunch of enums in this data structure? Increasing the size of the class
shouldn't be an issue.
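To make the enum option concrete, something like the following is what I have
in mind (only a sketch, all names made up):

// Run-time "shape" flags carried alongside the type-erased callbacks, since
// any compile-time traits are lost once everything is a boost::function.
struct function_properties
{
    enum curvature_t { unknown_curvature, convex, concave };
    enum monotonicity_t { unknown_monotonicity, increasing, strictly_increasing,
                          decreasing, strictly_decreasing };

    curvature_t curvature;
    monotonicity_t monotonicity;

    function_properties()
        : curvature(unknown_curvature), monotonicity(unknown_monotonicity) {}
};

math_function would then hold a function_properties member that adapters fill
in and that optimizers query at run time, e.g. to pick an algorithm that
exploits convexity.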
Question 7:
Let me justify why this is all very useful and a general problem: right now,
everyone writing non-linear algorithms in C++ has their own structures for
managing mathematical functions together with their derivatives, jacobians,
and hessians. I am running into this right now because I am trying to wrap a
bunch of other people's algorithms for my own work. A few comments:
1) Everyone is using dynamic polymorphism for these, overriding a virtual
operator() and some type of gradient/jacobian... I assume this is because
there is no agreed-upon concept for function objects with gradients... In
reality, none of these seem to require dynamic polymorphism. Why shouldn't a
Newton's method optimizer be able to inline operator() and the derivative
through static polymorphism where possible? (See the sketch following the
links below.)
2) Most of these libraries have conflated functionality/settings specific to
their optimizers/solvers with the evaluation of the function/gradient. This
makes generic code difficult without wrappers/adapters.
3) Many use their own data structures for arrays, matrices, and
multidimensional results. This may be inevitable, but it means these types
need to be parameterized in any mathematical callback routine.
4) While a lot could be gained from having people agree on a static concept
for mathematical applications, akin to standardizing on operator() for
functors, it is also very useful to have a generic callback wrapper akin to
boost::function. The main reason is that you will tend to write
parameterized, pure functions where you want to bind parameters. Worst case,
people could write their own functions against this mathematical callback
structure, and adapters could be written for the different libraries.
5) Not to disparage these superb libraries, but a few examples to see what I
mean: http://quantlib.org/reference/class_quant_lib_1_1_cost_function.html and
http://trilinos.sandia.gov/packages/docs/r6.0/packages/didasko/doc/html/nox_ex1.html
and http://www.coin-or.org/CppAD/Doc/ipopt_cppad_simple.cpp.xml
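To illustrate comment 1, this is the kind of thing I mean by static
polymorphism: the algorithm is templated on the function type, so a
hand-written object with inline operator() and gradient() never pays for
virtual dispatch or the boost::function indirection. A one-dimensional Newton
root-finder just to keep the sketch short; a real optimizer would also want a
second derivative/hessian.

template<class DifferentiableFunction>
double newton_root(const DifferentiableFunction& f, double x0,
                   double tol = 1e-10, int max_iter = 100)
{
    double x = x0;
    for (int i = 0; i < max_iter; ++i) {
        double fx = f(x);              // concept requirement: operator()(double)
        if (fx < tol && fx > -tol)
            break;
        x -= fx / f.gradient(x);       // concept requirement: gradient(double)
    }
    return x;
}

// works with a hand-written object (fully inlined):
//   square sq; newton_root(sq, 1.0);
// or with the type-erased wrapper:
//   math_function<double (double)> mf(sq); newton_root(mf, 1.0);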
If anyone is interested in working with me on this very interesting and
useful problem, please send me an email. I think it has a lot of potential
for general use.
-Jesse