Boost logo

Boost :

From: Sascha Krissler (boost-dev_at_[hidden])
Date: 2006-08-07 22:37:48


-2. God, what's this?

I have had fun recently doing some generic programming and as
boost does not have a multiple dispatcher, i decided to send
a reconnaissance posting to your mailing list.
I want to TIA for your time and possible advice.

-1. How to use this document

This is a post to a mailing list and as such a request for comments and
suggestions. There is no price to win for finding bad jokes.

0. Introduction

This is a collection of algorithms for multiple dispatch and the
planning phase of a C++ library implementation for multimethods
that is very flexible because the algorithm to use is left as
a choice to the library user. So a buzzword^W policy based
multiple dispatcher framework is set as a goal of this effort.
This seems not completely useless as MD is probably not going
to be a language feature in this century and all prior work
known to me does not do it right.

0.1 Disclamer

all judgements made are misleading and are really a form of guessing
and a priori estimations in disguise

0.2 Prior work

Paradoxically C++ is looked down upon by folks that have multiple dispatch
built into the language as in my-power-comes-from-the-outlet-in-the-wall
although there is a big pile of libraries that implement multiple dispatch
in C++. It seems to be a famous topic.
Here is an incomplete list of such libraries:
The date indicates the last change AFAIK

* Modern C++ Design, Chapter 11 (2001)
* Stroustrup and Meyers have some primitive implementations which i don't
  have access to
* The cmm preprocessor (2003)
* http://homepage.ntlworld.com/w.weston/multiple_dispatch.html (?)
* http://www.eptacom.net/pubblicazioni/pub_eng/mdisp.html (1998)
* http://archives.free.net.ph/message/20040917.110529.dcfe7ed0.en.html (2004)
* http://archives.free.net.ph/message/20060301.200054.572b35f3.en.html (2006)
  (i didnt find files in the vault)

0.3 Source code

about 4000 Lines of code, only compiled with gcc so far
If you read source code for breakfast you can download it at
http://k.datenfreihafen.de/~sascha/multiple-dispatch-0.0.0.tgz
This is not a release, it requires some trivial preprocessing
to be compilable. Some of the files have real comments.
Chances are you can trade it for gold nuggets if you
find a tribe of primitives (e.g. basic programmers)
The file multiple-dispatch.cpp in the root of the archive
is worth reading as it documents the interface used so far.
It is included in this posting at the end.

0.4 Dependencies

STL
boost: optional, mpl, tuple (fusion in the future), type_traits, preprocessor,
       enable_if, call_traits, variant

1. Static dispatch algorithms
   (misleading naming!
    meaning: the set of candidate functions is known at compile time)
   )

1.0 General

Static dispatch algorithms dispatch a method invocation to a fixed set
of candidate methods. This set is known at compile time.

1.0.1 Advantages

There are some kind of dispatch algorithms that only work on a static
function set. This involves all algorithms that do not primarely rely on data
structures but on code generation through chaining calls to template
instances that test arguments.
Many use cases of multiple dispatch can be handled by static dispatchers,
roughly all which need not be extensible by dynamic loading and do not
express application logic through the runtime modification of the
candidate function set.

1.0.2 Disadvantages

A compile-time fixed set does not cover all use cases, especially
it does not allow extension of a multi-method
by dynamic loading of shared libraries or the use of modification
of the candidate function set to express application logic.

1.0.3 Common properties

Reentry: it is simple to call the next specific method of a candidate set.
         Because we know the types of the function that is executing
         we can find the next less specific function from our candidate set.
         The next candidate is found at compile-time, when compiling
         the call_next<>() expression.

Intrusive: checking for the dynamic type of an argument either involves
           a dynamic_cast<>() or a lookup in a hand-crafted map data
           structure. Doing a dynamic cast is most efficient for
           inheritance hierarchies without multiple inheritance.
           For more complex hierarchies it is more efficient to
           make a hash of the typeids of all possible subclasses
           which map to an adjustment offset.
           Even more efficient is a perfect hash map that automatically
           resizes itself to prevent collisions.
           The hashing of the typeid would not be necessary if there
           is a virtual function overridden in each subclass that
           returns a small integer that can be used as an index
           into a vector instead of a hash.

1.1 Dumb dispatcher

Type: code generation

Complexity: O(f * a) where f is the number of functions and a is
            the number of arguments

Description: Every candidate function is checked in turn for an argument match
             The candidate functions are sorted so that the first match is also
             the best match.

Use when: the number of candidate functions is very small

Implementation status: works, not extensively tested

Prior work: Modern C++ design Chapter 11 "Static Dispatcher"

1.2 Logarithmic smart dispatcher

Type: code generation

Complexity: O(log(f) + a)

Description: A compile time algorithm tries to divide the candidate function
             set into 2 equally sized halves depending on the test of
             an arbitrary type at an arbitrary position.
             When the 2 parts are not halves they should be as equally sized
             as possible.
             This algorithm recursively invokes itself until only one
             function is left to check. This last check is then a check
             for the exact types of the candidate function.

Use when: medium sized candidate set, small memory requirements

Implementation status: not finished

1.3 N-Dimensional VTable

Type: data structure driven

Complexity: O(a)

Description: An n dimensional vtable is built at compile time
             with one dimension for each argument.

Cool features: Table compression. The table is not really n-dimensional
               physically, because each argument test reduces the
               possible candidate functions left and thus the possible
               types to check for in the tests of the remaining
               arguments

Use when: you are in need for speed

Prior work: Modern C++ design Chapter 11 "Constant time dispatcher"

Implementation status: not finished

2. Dynamic dispatch algorithms

2.0 General

Dynamic dispatch algorithms allow for the set of candidate functions
to be modified at runtime. This can either be used to extend a multi-method
by dynamic loading of a shared library or by altering the candidate function
set at runtime to express application logic.

2.0.1 Advantages and Disadvantages

Compare static dispatcher disadvantages and advantages in reverse

2.1 Logarithmic binary tree based dispatcher

...to be done...

2.2 N-Dimensional VTable based dispatcher

...to be done...

2.3 Decision tree based algorithmic dispatcher

...to be done...

3. Source code snippet

#include <Pattern/Meta/MultipleDispatch.hpp>
#include <boost/progress.hpp>

using namespace Pattern::Meta::MultipleDispatch;

class Base;
class Sub;
class MoreSub;
class EvenMoreSub;

struct my_functor {
  typedef int return_t;
  typedef boost::mpl::vector<Sub *, Base *> args_t;
  static int call(Sub * s, Base * b) {
    std::cerr << "functor\n";
  }
};

// need to repeat the signatures here until C++ gets an auto keyword
typedef Overload<0, Algo::Typeid>
::add<int (*)(Sub *, Sub *)>::type
::add<int (*)(MoreSub *, MoreSub *)>::type
::add<int (*)(Base *, Base *), struct predicate5 {
  template <typename Tuple>
  static bool triggers(Tuple & args) {
    return args.template get<0>()->dummy() == 42
      && args.template get<1>()->dummy() == 23;
  }
}
>::type
::add<int (*)(Sub *, MoreSub *)>::type a_fun;

int a1(int a, int b) {
  std::cerr << "a1\n";
}

class Base {
public :
  virtual int dummy() { return -1; }
};

class Base2 {
public :
  virtual int dummy2() { }
};

class Sub : public Base2, public Base {
public :
  int dummy() {
    return 42;
  }
};

class Sub2 {
public :
  virtual int dummy3() { }
};

class MoreSub : public Sub2, public Sub {
public :
  int dummy() {
    return 23;
  }
};

class EvenMoreSub : public MoreSub {
};

int a2(Sub * s1, Sub * s2) {
  std::cerr << "a2\n";
  // call_next<a_fun>(s1, s2);
}

int a3(MoreSub * s1, MoreSub * s2) {
  std::cerr << "a3\n";
  // call_next<a_fun>(s1, s2);
}

int a4(Sub * s1, MoreSub * s2) {
  std::cerr << "a4\n";
  std::cerr << reinterpret_cast<int>(s1) << " " <<
    reinterpret_cast<int>(s2) << "\n";
  // call_next<a_fun>(s1, s2);
}

int a5(Base * b1, Base * b2) {
  std::cerr << "a5\n";
}

main() {
  Sub b1, b2;
  MoreSub b3, b4;
  EvenMoreSub e1, e2;
  int ret;

  // a_fun::set(&a1);
  a_fun::set(&a2);
  a_fun::set(&a3);
  a_fun::set(&a4);
  a_fun::set(&a5);
  a_fun::finish();

#if 0
  {
    boost::progress_timer t;
    for (int i = 0; i < 10000000; ++i) {
      ret = call<a_fun>((Base *)&b1, (Base *)&b3); // calls a4
    }
  }
#endif

#if 0
  {
    boost::progress_timer t2;
    for (int i = 0; i < 10000000; ++i) {
      ret = ((Base *)&b1)->dummy();
      ret = ((Base *)&b1)->dummy();
    }
  }
#endif

  std::cerr << reinterpret_cast<int>((Sub *) &b1)
            << " "
            << reinterpret_cast<int>((MoreSub *) &e1)
            << "\n";
  ret = call<a_fun>((Base *)&b1, (Base *)&e1); // calls a4
  ret = call<a_fun>((Base *)&b1, (Base *)&e1); // calls a4
  // ret = call<a_fun>((Base *)&e1, (Base *)&e2)
  // ret = call<a_fun>((Base *)&b3, (Base *)&b4); // calls a3
  // ret = call<a_fun>((Base *)&b1, (Base *)&b2); // calls a2
  // ret = call<a_fun>(1, 2); // calls a1
}


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk