Boost logo

Boost :

From: Brian McNamara (lorgon_at_[hidden])
Date: 2003-10-20 13:48:47


On Mon, Oct 20, 2003 at 12:25:26PM -0500, David B. Held wrote:
> Is it fair to say that a monad is essentially a generic algorithm?

I would say that each monad embodies some generic algorithm, yes.
However it does so in a way that conforms to a general interface, so
that structuring your programs monadically allows you to leverage the
infrastructure of the whole monadic framework, rather than just the
specifics of one particular generic algorithm. I'll try to illustrate
this with an example below.

> What I don't understand is that if my understanding is more
> or less correct, why monads are supposed to be so profound?
> I guess they do provide more flexibility than C++ generic
> algorithms, but do they do so in a way that is truly useful?

I think the answer is yes. But this is a truly excellent question, and
I am "stuck" thinking about monads only from the FP/Haskell point of
view, and so perhaps some of you can shed light on if/how you can
accomplish the same things using C++/generics in a different way.

> In your example, we have roughly the C++ equivalent:
...
> Is my understanding close?

Not really. There are problems with your mapping (from my example into
your C++ code), as well as bigger problems stemming from my poor choice
of illustrative example. Rather than try to correct this example, I'll
use another one which will hopefully make "what monads buy you" more
apparent and concrete. I would be interested to see what someone like
you (who, unlike me, doesn't already suffer from "monad bias") thinks of
the solution to the problem...

I'm stealing this example from the monads tutorial I posted a link to
earlier, because it's a fun example that illustrates the issues, even if
the problem domain is a little contrived.

----------------------------------------------------------------------
Suppose that we are writing a program to keep track of sheep cloning
experiments. We would certainly want to know the genetic history of all
of our sheep, so we would need mother and father functions. But since
these are cloned sheep, they may not always have both a mother and a
father!

We would represent the possibility of not having a mother or father
using the Maybe type constructor in our Haskell code:

    type Sheep = ...
   
    father :: Sheep -> Maybe Sheep
    father = ...
   
    mother :: Sheep -> Maybe Sheep
    mother = ...

Then, defining functions to find grandparents is a little more
complicated, because we have to handle the possibility of not having a
parent:

    maternalGrandfather :: Sheep -> Maybe Sheep
    maternalGrandfather s = case (mother s) of
                              Nothing -> Nothing
                              Just m -> father m

and so on for the other grandparent combinations.

It gets even worse if we want to find great grandparents:

    mothersPaternalGrandfather :: Sheep -> Maybe Sheep
    mothersPaternalGrandfather s = case (mother s) of
                                     Nothing -> Nothing
                                     Just m -> case (father m) of
                                                  Nothing -> Nothing
                                                  Just gf -> father gf
----------------------------------------------------------------------

Ok, so that's enough to see the "problem". The solution is to use the
Maybe monad:

    mothersPaternalGrandfather :: Sheep -> Maybe Sheep
    mothersPaternalGrandfather s =
       (Just s) `bind` mother `bind` father `bind` father

In the Maybe monad, bind() does the work of doing the exception-handling
at each level. We could equivalently write it using do-notation:

    mothersPaternalGrandfather s =
       do orig <- (Just s)
          mom <- mother orig
          gramps <- father mom
          father gramps

"do" just sugar-coats over the calls to bind(), and forces you to name
the intermediate results. What is nice is that the "do" code looks
almost just like what we might write in C++ if we used exceptions. That
is, imagine

----------------------------------------------------------------------
   // Now in C++.
   // (throw specs for purposes of exposition only)
   Sheep mother( Sheep s ) throw( NoSuchAncestor );
   Sheep father( Sheep s ) throw( NoSuchAncestor );
   Sheep mothersPaternalGrandfather( Sheep s ) throw( NoSuchAncestor ) {
      Sheep mom = mother(s);
      Sheep gramps = father(mom);
      return father(gramps);
   }
----------------------------------------------------------------------

In C and in non-monadic Haskell, you end up with exception-handling code
sprinkled throughout your code. (See the original version of
mothersPaternalGrandfather, or any C code which tests the result of each
call to malloc() for NULL and handles it, for examples.) In C++ we can
avoid this "sprinkling" using a language construct (exceptions). In a
monadic program (e.g. the second mothersPaternalGrandfather
implementation, or the equivalent C++ program using FC++ monads), we
can similarly avoid the "sprinkling" problem by letting the monads do
the work of propogating the exceptions. ("Exception handling without
exceptions" is a neat idea.)

Now let's generalize the monadic solution. Imagine that, via some crazy
new feat of genetics, a sheep might not just have 0 fathers or 1 father,
but possibly any number of fathers! (The problem is now getting really
contrived, but I'll show similar, more realistic problem shortly.) In
order to compute all possible mothersPaternalGrandfathers, we need only
change out code like this:

    father :: Sheep -> List Sheep -- List Sheep = [Sheep]
    mother :: Sheep -> List Sheep -- "List" is clearer for exposition
    mothersPaternalGrandfather :: Sheep -> List Sheep
    mothersPaternalGrandfather s =
       [s] `bind` mother `bind` father `bind` father

Now we get a list of all the Sheep who fit this criteria as our result.
The only changes we had to make to our code were
   "Maybe" -> "List" in three places
   "(Just s)" -> "[s]" in one place
That was all. I imagine it would take many more (fundamental) changes
to modify the C++-code-using-exceptions to do the "list" version.

Furthermore, we can generalize more:

    father :: (Monad m)=> Sheep -> m Sheep
    mother :: (Monad m)=> Sheep -> m Sheep
    mothersPaternalGrandfather :: (Monad m)=> Sheep -> m Sheep
    mothersPaternalGrandfather s =
       (unit s) `bind` mother `bind` father `bind` father

Now the same implementation works for both the Maybe case (Sheep have 0
or 1 of each parent), the List case (Sheep have any number of parents),
and now also the more common case (where each Sheep has exactly one of
each kind of parent), the last being done by using the "Identity" monad.

That example was contrived. Here is a more realistic example.

Parsers form a monad. A "Parser m a" is a function which takes in a
string and parses it, returning a value of type "m a", where "m" is
again some monad.

A typical parser might be built atop the Maybe monad, so that a parse
either succeeds (with result "Just a") or fails (Nothing). We can
imagine a C++-source-code parser (called "parse_cpp()") which returns a
Maybe AST:
   parse_cpp("int main() {}") --> Just <some AST representing program>
   parse_cpp("$bunch of_at_crap") --> Nothing
Not very exciting yet.

Now, as we all know, C++ forces us to write the annoying "typename" and
"template" keywords to make various dependent names easier for C++
compilers to parse. In the vast majority of cases, these keywords are
just redundant noise which makes the C++ grammar closer to being LL(1).
It is extremely rare to find a C++ program which will successfully parse
both ways (e.g. mean one thing when "typename" appears, and mean
something else, but still be legal, when "typename" is omitted).

Now, if we've written the front-end parser for our C++ compiler using a
monadic style, we could easily make "typename"/"template" optional. By
building the parser atop the List monad, we could return a list of all
possible parses. In the vast majority of cases, we'd get back a list of
0 or 1 results (corresponding to programs that fail to parse (like the
"bunch of crap" example), and legal programs, respectively). In the
extremely rare case where we get back 2 or more parses, the compiler
could report this to use, and point us to that oh-so-rare source code
location that actually truly _needs_ the "typename" keyword in order to
prevent ambiguity.

Ok, this message is getting way too long. I'd be very interested to see
how someone who doesn't know monads goes about writing a generic C++
implementation of "mothersPaternalGrandfather", which works for all the
cases (Sheep have {exactly 1, 0 or 1, any number} of each type of
parent). That example, while contrived, captures much of the essence of
monads.

-- 
-Brian McNamara (lorgon_at_[hidden])

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk