Boost logo

Boost :

From: Jonathan Turkanis (technews_at_[hidden])
Date: 2004-03-01 01:51:28


Here is my formal review of FC++ -- sorry to wait to the last minute.

I find FC++ to be very exciting and potentially very powerful;
however, I vote that it not be accepted yet, because the changes
required are extensive enough that a second review is essential. (See
below for full details.) There are three reasons for the second
review:

- Some of the proposed resolutions (particularly the extension to
handle reference parameters) may turn out on closer examination not to
work adequately, or may seem to work okay for FP but raise unforseen
issues when FC++ is used in other C++ frameworks (e.g. STL).
- This review period has tended to focus, naturally enough, on major
issues; with a library the size of FC++ there are probably many
(possibly a dozen or more) smaller issues which deserve discussion.
For one minor example, I'll discuss the reuser syntax below.
- Once extensive changes have been made, there may be many minor
issues with the new code which deserve discussion but which could not
possibly have been addressed by this review since the could in
question doesn't exist yet.

Outline
    I. Basic formal review questions
    II. detailed discussion
        1. Purpose of library
        2. Naming
        3. Configurable max arity
        4. Support for reference parameters
        5. Simplifications:
             simpler return type deduction;
             eliminate full; make full-functors simple to define;
        6. Lambda expressions
        7. Implementation
        8. Testing

I. Basic formal review questions

Some of the formal review questions don't fit naturally into the rest
of my discussion.

* What is your evaluation of the documentation?

The documentation is terrible. This has been thouroghly discussed by
others, so I won't say any more.

* Did you try to use the library? With what compiler? Did you have
any problems?

I attempted to compile all the examples with VC7.1. Five test programs
failed to compile (I'll send the errors to Brian separately); one
compiled but caused the IDE to crash. I compiled some on Codewarrior
8.0 and 9.2, which have more problems with FC++ than VC7.1. I'll send
complete reports when I get a chance.

Having studied most of the implementation, I am fairly confident that
it can be made to work on any highly conformant compiler, so these
errors don't worry me.

* How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?

Over the course of several weeks I spend many hours studying the
library. I read the main documentation, plus two of the referenced
papers, and parts of the monad tutorial. I also read most of the
source.

* Are you knowledgeable about the problem domain?

Although I am familiar with explicit lambda syntax, currying, etc., I
am not knowledgeable about the FP paradigm. Until recently I knew
nothing about FP monads, for instance.

II. Detailed Discussion

1. Purpose of the library.

I think the stated aim of the library, even as a part of boost, should
be to provide a platform for functional programming research in C++,
parts of which can also be used to solve problems in non-FP code. Some
of these uses may be known, but many more will probably be discovered.

This leads me to the following ranking of criteria for evaluating
FC++, in order of importance:

    1. fidelity to the FP paradigm
    2. usefulness of parts of FC++ in other C++ frameworks (e.g. STL)
    3. consistency with the design of other Boost libraries
    4. similarity to Haskell or ease of use by FP-programmers with
little C++ knowledge.

>From this perspective, unlike some other reviewers, I am not concerned
at all if parts of FC++ ultimately end up duplicating some
functionality found in other boost libraries. Since none of them have
the aim of providing a platform for functional programming in C++, it
would not be surpirsing to find out that existing components with a
functional flavor are unsuitable for use by FC++. I said 'ultimately',
however, because I believe that if after making some of the
recommended changes, such as support for reference parameters, it is
found that some of the current FC++ components can be replaced with
existing Boost components, they defintely should be. Similarly, if it
is found that with these changes parts of FC++ contain the full
functionality of other boost libraries, with similar or identical
interfaces, the authors of those boost libraries should be contacted
to see if they are willing to make their components FC++-compatible to
avoid duplication. With these qualifications, I do not see duplication
as a problem.

For the same reasons I am also not worried if it cannot be
demonstrated, at this moment, that FC++ is the best way to solve any
particular C++ problem. I am confident that it will be useful in many
ways, but even if it should ultimately turn out that FC++ is never the
best tool, I think accepting FC++ into boost -- with the exploration
of new programming techniques and the interaction between programmers
with different backgrounds which it will bring about -- would still be
worthwhile.

2. Naming.

A) Many of the public types and functions have obscure names. (There
are also problems with the internal names, which I'll discuss under
'implementation'.) I'm willing to accept using strange names if they
are standard names from FP; however many of the FC++-specific names
are needlessly obscure.

For example, the templates c_fun_type, fun_type and the
smart_functoidn family are all helper templates designed to add
functionality to a derived class, along the lines of
std::unary_function. But the names give no indication what they are
supposed to do. I still have no idea what the 'c' stands for in
c_fun_type. (I'll suggest in (5) below how to eliminate fun_type and
smart_functoidn entirely, and replace c_fun_type with 'adaptable'; I'm
just using these as examples.)

'fun' should defintely be 'functoid'. 'full' is almost meaningless;
I'd suggest other names, but I think full should be eliminated (see
(5)).

B) There are many families of functions and classes which have
numerals built in to their names, e.g., funn, fulln, monomorphizen,
make_fulln. In almost all cases, it is possible to leave out the
numeral, and let the compiler deduce the relevant information. In some
cases, unifying these families is essential to making the maximum
functoid arity configurable (See (3) below). In most of the other
cases, the numerals should be eliminated because they make identifiers
harder to read, make programmers do more work than necessary, and may
leave the impression with some programmers that there could be subtle
differences between the numbered versions other than arity. (In fact,
'if0', 'if1', and 'if2' use numerals in a way completely unrelated to
arity, seeming to lend credibility to the suspicion that fun0, fun1
and fun2 might be fundamentally different types of entities.)

Consistency with the rest of Boost virtually requires that these
changes be made. We would never say make_tuple1, make_tuple2, for
instance.

In the rare case that the numeral cannot be eliminated (thunkn may be
one of these cases, because of prefix currying), the integer should be
made a template parameter, to signal that thunk<1>, thunk<2>, .. do
in fact form a parameterized sequence of functions.

C) Metafunctions should follow boost conventions. The return type
should be called 'type'. Type traits returning integral values should
be made models of MPL.Integral Constant. Type traits should be
separated so that each template provides one piece of information,
unless there is some special reason to group them.

D) Object generators should follow a consistent naming convention.
'funify' and 'monomorphize' are cute, but with so many names to
remember I'd prefer if the name of the object generator bore a
consistent relation to the name of the type being generated.

E) I don't expect much agreement on this last point, but I would like
to express my opinion that since Boost is enirely devoted to C++, it
makes no sense to have 'C++' in the name of a Boost library. Since
FC++ is already well-known under its current name, I suggest that
keeping FC++ indefintely as a sort of nickname, and that the
Boostified version be known as Boost.Functoid. The namespace
boost::functoid would also be easier to pronounce than boost::fcpp.

The name FC++ could still be used in research papers, as long as it is
mentioned somewhere that it is now officially Boost.Functoid.

3. Configurability.

It is essential that FC++ be reimplemented so that the maximum arity
is configurable. This is particularly important if FC++ is to be
integrable into other C++ frameworks. For instance, I'd like to
implement some logical formalisms using FC++ functoids as atomic
predicates; for this I'd need arity well above 3.

Making arity configurable can be done using the Boost Preprocessor
library. In some cases it will be straightforward; in other cases --
for example with currying -- a new implementation will be required,
more along the lines of the current lambda implementation (though it
should use MPL).

4. Support for reference parameters.

It is essential that FC++ funtoids be able to handle (non-const)
reference parameters. Without this, a large part of the usefulness of
FC++ with the STL and other frameworks is lost. This is unacceptable
to me, since usefulness of FC++ with other C++ frameworks is my 2nd
most important criterion.

I am very encouraged that Brian seemed to discover a way to do this
(message Thursday, February 19, 2004 3:03 PM). I'm not sure the
solution generalizes well to higher arities.

5. Simplifications

Several major simplifications should be made to the functoid part of
the library.

A) fun0, fun1, ... should be replaced by a single template 'functoid'
able to handle multiple arities. I'd suggest using a function type as
the template parameter, as in Boost.Function. I mentioned this already
under naming, but it's more than a simple name change since it would
require some reimplementation.

B) Rather than having users define direct functoids, then wrapping
them in full functoids when they want to take advantage of 'sugar', it
should be made easy to define full funtoids, so that all function
objects designed for use with FC++ can be expected to be full. (One
still needs adapters for foreign function objects, of course.) I think
this could be done with macros, e.g.:

    struct plus {
       // declare signature.
       BOOST_FCPP_LAMBDA_SUPPORT(2, plus)
       int operator() (int, int);
    };

I thought I remembered several reviewers suggesting this, but looking
back, I don't see it. Brian's suggestion for allowing non-const
reference parameters seems to require different overloads of
operator() depending on which parameter types were declared to take
non-const references. If this is true, more information might have to
be passed to the macro to avoid an explosion of (mostly disabled)
overloads.

One could still keep 'full' as an implementation detail, of course.

C) The technique for handling signatures of momomorphic and
polymorphic funtoids can be unified. Currently polymorphic functoids
declare their signatures using a nested sig template. Monomorphic
functoids declare their signatures with a nested sig template taking
dummy template arguments, together with typedefs first_argument_type,
second_argument_type and third_argument_type.

I suggest a single, unified system. Here is one way to do it:

Let a functoid's signature be represented by a nested typedef
'signature' of function type. In the monomorphic case, the argument
types and return type would be concrete types (int, string,
functoid<int(long)>, etc.). For polymorphic functoids, the argument
types would be formal placeholders; the return type would be an MPL
metafunction class or lambda expression evaluating to the return type
of the functoid when invoked with arguments having a given sequence of
types. I may have made it sound complicated, but it's really simple.

For instance, one might define a monomorphic functoid like so:

    struct plus {
        int signature(int, int);
        int operator() (int i, int j) const { return i + j; }
    };

A polymorphic functoid, which expects a unary funtoid argument plus a
value of some other type, and returns the result of applying the given
functoid to the given value, could be defined like so (see below for
explanation):

    struct apply {
        result_type<_1, _2> signature(any, any);

        template<typename F, typename T>
        typename result_type<apply, F, T>::type
            operator() (F f, T t) const { return f(t); }
    };

Here:

* result_type is a namespace-level metafunction taking a function
pointer, member function pointer or function object type, together
with a sequence of argument types, and returning the appropriate
result type. It should be smart enough to handle the system I am
describing, plus the various other return type deduction systems one
wishes to support.
* the symbols _1, _2 are MPL lambda placeholder symbols
* the tag struct 'any' is a formal placeholder; I could have used
_1, _2, ... here instead, but this would just make the metacomputation
harder. (I realise 'any' is already used by the library. One could
also
use 'any_type'.)

This technique also provides a way for polymorphic functoids to
specify that they take certain arguments by reference or const
reference, if that turns out to be necessary. E.g.:

    result_type<_1, _2> signature(const any&, any&);

It also allows for the possibility of mixed monomorphic-polymorphic
functoids (although I'm not recommending it):

    result_type<_2> signature(int, any);

If one needs genuine variable length argument lists (other than just
prefix currying), an ellipsis could be used.

With this technique, or something similar, one can get rid of fun_type
and smart_functoid, and replace c_fun_type with 'adaptable':

    template<typename Signature>
    struct adaptable {
        typedef Signature signature;
    };

    struct plus : adaptable<int(int, int)>;

I'll give two more examples.

>From compose2.cpp:

  template <class I>
  struct sig : public fun_type<typename RT<F,I>::result_type> {};

and

  template <class F, class G>
  struct sig: public fun_type<AddHelper<F,G> > {};

would be rewritten

  typedef return_type<F, _1> signature(any);
  typedef AddHelper<_1, _2> signature(any, any);

(This last example requires SFINAE, I believe, but Brian has said
that's okay.)

>From the main docs:

    template <class F, class L>
    struct sig {
        typedef list< typename F::template sig<
                      typename L::value_type >::result_type >
        result_type;
    };

would be rewritten

    typedef list< result_type<_1, _2> > signature(any, any);

(I should mention that this proposal could run into (solvable)
problems if some
of the 'concrete' types being manipulated are simultaneously
metafunctions.)

6. Lambda Expressions and other sugar

When I first saw the explicit lambda notation, I disliked it. However,
after reading the material on monads and looking at the sample
programs, I came to see why the various explicit variable binding
constructs are useful.

I doubt I will find much support on this point, but I would like to
urge FC++ to support both an implicit lambda and an explicit lambda
mechanism. It don't think it's quite as crazy as it sounds. The
explicit and implicit versions could share the underlying machinery
for managing lambda variables. Implicit lambda varibales would simply
be instances of lambda_var<N> for certain reserved values of N. The
library would provide instances of the reserved variable types. The
machinery is already in place to determine whether an expression
contains an implicit variable, if so, operator overloads could be
enabled liberally as in Boost.Lambda. If not, operator overloads would
be restricted to allow the let, letrec syntax, etc. You could tell
immediately whether an expression was an implicit lambda expression or
(part or) an explicit lambda expression: you would just look for the
formal placeholder symbols _1, _2, ... .

The reason I urge this is that FC++ with the implicit lambda syntax
would then be the most natural framework for defining anonymous
function objects for use with the STL (and probably with other
frameworks), assuming that an easy way to make all functoids 'full' is
provided. FC++ implicit lmabda expressions would be like MPL lambda
expressions, which are remarkable in that they can be used with
exactly the same syntax as the ordinary use of the expressions from
which they are composed; you only need to substitute placeholders
where concrete types occur. FC++ is so close to being able to provide
this facility that it would be a shame if it didn't take the extra
steps to do so. One of the most difficult parts, I think, would just
be explaining that FC++ supports two types of lambda syntax. I think
if the documentation is well done, this feature could be successful.

I am worried by Joel de Guzman's observation that the infix notation
could cause problems with operator precedence. Because I think that
C++ programmers quickly learn to read prefix notation using
parentheses, and because similarity with Haskell is low on my list of
priorities, I would recommend dropping the infix notation.

(I deliberately avoided discussing the question of parenthesis
vs.square brackets here, and also the question of whether more
operator overloading should be allowed with explicit lambda
expressions.)

7. Implementation

Some parts of the implementation are quite clever, for instance the
treatment of lazy lists. A few parts are antique; at the moment I can
think only of the currying, which would lead to a combinatorial
explosion if generalized to handle higher arities. The implementation
of lambda expressions (using techniques like the current Boost Tuple
implementation) shows that the authors know more modern tecniques.
Matt Marcus mentioned in his introduction that the lambda machinery
should be rewritten to use MPL; I believe more of Boost could be used
profitably in many places, esp. Type Traits, MPL, Preprocessor and
static assert. (I can give more details after the review if there is
interest.)

Most of the implementation is acceptable, but often the code seems
disorganized and inconsistent.

A) There are no discernable naming conventions. Some identitifers are
all uppercase, some are mixed case and some are lowercase with
underscores. Personally I don't care if the private names follow
different conventions from the Boost conventions for public names, but
there must be some convention.

B) A technique used throughout the library is for non-static member
functions to delegate to static member functions of helper class
templates to compensate for the lack of partial function template
specialization. This is a reasonable tecnique (although I would
usually prefer tag-dispatch), but it is applied inconsistently.
Sometimes the helper classes are defined as nested classes, sometimes
they are defined at namespace scope in the vicinity of the template
being implemented, sometimes they are defined in a completely
different header, with no explanation. The name of the helper template
often gives some clue as to how it is going to be used, but sometimes
not. For example, Curryable3Helper is defined in curry.hpp, but only
used in full.hpp, to implement the template full3. Why not call it
Full3Helper and define it in full.hpp?

C) The code contains few comments, and many of the existing comments
are completely useless. For instance, in pre_lambda.hpp, we have this
comment:

   // TE is a type environment; a list of TEPair<i,T>
    <snip>
   // BE is a value environment; a list of BEPair<i,LE>

Nothing in pre_lambda.hpp is named TE or BE. Later, in lambda.hpp, we
have the following:

  // I seem to recall that
  // ET - Environment Thunk
  // BE - Binding Environment
  // TE - type Environment
  // Yes. See pre_lambda.hpp for a little more explanation.

D) Some things are public which should be hidden. For instance, funny
tag structures often occur in the interfaces of public member
functions. list and odd_list both have public constructors which take
a single argument of type 'a_unique_type_for_nil'. odd_list has a
second public constuctor which takes this tag at the second argument
position. I'm aware that tags are occassionally necessary for
disambiguation, but do they really need to be public? If it's used
only by the implementation it should be private. If users ever need
these constructors, they should be replaced with clearly named object
generators.

All functoids which derive from the smart_functoidn family inherit
members with names 'crazy_max_args' and 'crazy_accepts'. The comments
explain that these identifiers are used so that 'user don't talk to
functoids directly'. The proper C++ way to acheive this is to make
these names private, and grant friendship to the traits templates
which use them.

I'm sure the reason for many of these problems is that the library was
modified many times, the most pressing concern always being to make
the library work rather than to make the implementation as a whole
coherent. This is a familiar situation, and quite understandable.
However, since software naturally tends to become more convoluted over
time, the best way to keep Boost.FC++ from becoming completely
inscrutable over the course of several years is to start with
completely clean, well-organized code.

I said at the beginning I would discuss the reuser implementation.
This is not an example of poor design, by any means; it is just an
example of one of mainy details which could benefit from discussion.
reuser2 is declared as follows:

    template <class V1, class V2, class V3, class F, class X, class Y>
    struct reuser2;

Each of the first three arguments is expected to be one of the tag
structs INV or VAR, indicating whether the the type at the
corresponding position in the second half of the argument list is
variant or invariant. This decouples the type being classified -- Y,
for instance -- from the type providing the classification -- V3, in
this case. It would make more sense to me to do something like this:

    template<typename T>
    struct inv { typedef T type; };

    template<typename T>
    struct is_invariant { /* omitted */ };

    template <class F, class X, class Y>
    struct reuser2;

Now, instead of writing reuser2<VAR, INV, VAR, T, U, V>, you write
reuser2<T, inv<U>, V>. This looks easier to use, understand, and
generalize. You might even be able to drop the '2'.

9. Testing. There needs to be a comprehensive set of tests, preferably
using Boost.Test, which makes it easy to tell which FC++ features work
on a given platform. Many of the fcpp_client programs were
instructive, but some seemed not to use FC++ at all, and for some it
was unclear what the expected output was supposed to be. The
fcpp_clients need to be separated into examples (in an 'example'
directory -- preferably with discussions of the important examples in
the documentation), and reqression tests (in a 'test' directory).

I hope my negative comments do not obscure my overall positive view of
this library.

Jonathan


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk