My apologies in advance for the light novel.
# Do you think the library should be accepted as a Boost library?
Not in its current form, although I encourage cleaning it up and 
resubmitting it.
# Your knowledge of the problem domain
I've used COM smart pointers (usually CComPtr) a few times, and more 
rarely wrapped some other C or C-like APIs.  (Although mostly pre-C++11, 
requiring something more hand-rolled than unique_ptr.)
# What is your evaluation of the library's:
## Design
The API seems reasonable for both out_ptr and inout_ptr.  (Although also 
see the end of this post.)
Actual uses for inout_ptr outside of a "realloc" type function seems a 
bit rare to me (and such APIs also seem rather rare), but perhaps there 
are other justifying cases I am not aware of, or it's more common in a 
particular domain.
I do like that unspecified function argument evaluation order was 
considered, and I agree that it is better to try to maintain safety 
regardless of evaluation order rather than simply telling users to not 
access the smart pointer in other arguments -- it is a very easy trap 
for users to fall into, especially where the same code that has been 
working forever may suddenly fail with a different compiler version or 
different compiler settings.
(Sequencing of destructor execution with additional && conditionals is 
admittedly a similar problem with no good solution within this design, 
as is explained in the caveats.)
I don't like that it is left unspecified whether release() is called on 
construction or destruction (as that decision is left to the custom 
specialisations), as this potentially reintroduces that problem.  I 
think it should be required that all specialisations behave in the 
following manner:
For out_ptr:
   - construction initialises a raw ptr to nullptr and does not alter 
the smart ptr.
   - destruction resets the smart pointer to the raw ptr (in all cases).
     - this provides a useful guarantee that the smart ptr is always 
reset to nullptr if the called method returns nullptr, even if it had 
some other value previously.
For inout_ptr:
   - construction get()s the current value of the smart ptr in two 
separate raw ptrs (one passed as the function argument, one cached for 
the destructor to use for comparison), but does not release() it.
   - destruction will (only if the raw ptr now has some different value) 
release()s the smart ptr and then reset()s it to the new raw ptr.
     - this makes it safe to use the smart ptr's value in other 
arguments of the method call; unspecified release() timing would have 
made that unsafe.
(There is a slightly dodgy step in there where technically the smart 
pointer could be pointing at something which it thinks it owns, but that 
the called function has already deleted.  However given that any normal 
or exceptional exit path would then call the inout_ptr destructor 
[guaranteed before the smart pointer destructor], which would safely 
release() the pointer without double-deleting it, this seems 
sufficiently safe, and worth it to permit safe argument usage.  The only 
caveat is that usage of the smart pointer sequenced after the method 
call but before the destructor execution [as in additional && 
conditions] is not safe, as it may access deleted memory; however that 
code would never have the correct behaviour anyway and the documentation 
already warns to avoid doing that.)
Perhaps there are some types of smart pointer for which this contract 
wouldn't work (eg. if it is possible for release() to fail, or to 
release a different pointer from what get() returns)?  But then if so 
I'm not convinced it's safe to give them a uniform interface with 
pointers where it is safe, because otherwise inevitably someone is going 
to hit the unspecified-parameter-evaluation-order problem (or assuming 
that it will be nullptr on failure).
Especially given these suggested constraints, I would prefer that a 
simple ADL or traits-type customisation point were added, so that you 
can adapt to smart pointers which merely have different spellings of the 
reset and release methods and don't require deeper customisation -- 
though the existing customisation point could be retained for especially 
weird smart pointer types if needed.  (ADL customisation points are 
superior because, if designed correctly, they would not introduce a 
dependency on Boost.OutPtr, and if named well could theoretically also 
be used by other libraries, as with non-member begin and swap.)
Similarly, while the static_assert check in base_out_ptr_impl 
specifically for std::shared_ptr and boost::shared_ptr seems valuable to 
prevent a significant class of bugs, I question whether it might be 
better handled by checking a type trait or other customisation point 
rather than checking directly for those types, such that it would be 
possible for a user-defined smart pointer type to also participate if it 
has a similar concern.  Or perhaps it should just discourage use of 
shared_ptr at all; there should be little performance difference between 
using shared_ptr directly and using a unique_ptr which is later moved to 
a shared_ptr.  (Since neither can take advantage of the make_shared 
optmisation, and by definition ownership must be unique at that point.)
## Implementation
I haven't verified whether the current implementation exactly follows my 
suggested contract above, as there are a few too many layers of 
indirection and inheritance for me to easily parse.  Simplifying the 
implementation would be nice if possible, especially for out_ptr, to 
improve readability.  (However I think that my description is close at 
least.)
I am strongly opposed to both of the "clever" implementations in the 
library.  They should be completely removed.
I can think of absolutely no justification (performance or otherwise) 
for a "clever" out_ptr.  Due to both the requirement to pass in nullptr 
and to not delete the original smart pointer on construction (as it 
might be used in other parameters), the raw pointer can never have the 
same storage as the smart pointer anyway.  (At least not unless you 
absolutely promise to never use a non-nullptr smart pointer, which 
reduces usefulness, albeit admittedly for a slightly peculiar use case 
[which would have been unsafe when used with raw pointers directly].)
(In theory you could pass in an uninitialised raw pointer and then 
conditionally reset or do nothing depending on whether it still has the 
same uninitialised value after the call, but this both requires 
capturing the actual value at construction and providing a weaker 
contract to the C API, so it seems like a bad idea and a pessimisation 
to me.  Another option is to unconditionally reset() on construction, 
but this sacrifices use in other parameters.)
There is a possible justification for the clever inout_ptr, in that you 
can save two raw pointer copies if you reuse the smart pointer's 
storage, and this does not have the same caveats as with out_ptr.  I am 
not convinced that anyone should care about that (especially in code 
paths also including memory allocation), although it is possible that 
the savings may be more significant with a smart pointer more complex 
than unique_ptr.
Having said that, this "clever" implementation should absolutely never 
be used for any existing smart pointer, especially including 
std::unique_ptr or boost::movelib::unique_ptr.
If there is sufficient justification for its existence at all, then you 
should instead create a new smart pointer (bikeshed: 
boost::out_ptr::fast_ptr) which behaves equivalent to std::unique_ptr 
except that you can steal its raw pointer storage when using inout_ptr. 
This could be made move-compatible with std::unique_ptr for the people 
who want to keep that type in their interfaces -- although that would 
presumably also remove any dubious performance benefit.
As previously brought up in a separate thread, the current review 
implementation also leaks public inheritance of std::tuple, which is 
presumably intended as an empty-base optimisation, but must not be 
public or it will allow inappropriate slicing (and Very Bad Things™).
## Documentation
The documentation seems decent, although being split into many short 
pages as at present can at times make it harder to locate something 
specific.  I would prefer longer single pages divided into sections, but 
that's a matter of taste and not a review condition.
I dislike the assertion that a "well-designed" C API would always set 
the parameter to nullptr on failure.  That is certainly one option (and 
typically a preferred one), but leaving the parameter unchanged is 
another entirely reasonable option, and does not signify "poor design". 
It is only reasonable to claim poor design if the API allocates and sets 
the value, internally deletes it, and then reports failure (while still 
returning the altered and deleted pointer), as this condition cannot be 
handled by out_ptr (or indeed by anything that doesn't understand the 
failure return semantics of the API).
The time axis scale in the benchmarks seems not entirely helpful as a 
lot of the time the difference between two implementations is within one 
axis division of each other.
And again, the magnitude of performance difference sitting in the <1ns 
range in code which inherently implies memory allocation suggests a 
severe case of premature optimisation to me.  (Though perhaps that might 
be more justifiable when used in non-memory handle contexts -- but then 
that might be an abuse of calling it a "pointer".)
## Usefulness
I think that this library or something like it could be quite useful to 
help unify and improve safety when interacting with C-style T** APIs (or 
T* output arguments for some opaque handle type that requires cleanup), 
without requiring exploitation of operator& tricks like CComPtr does.
(Sure, it's trivial to write an RAII wrapper around such APIs, but this 
can become cumbersome when there are hundreds or thousands of such APIs, 
unless you're using a code generator.  And it's easy to accidentally 
forget subtle but important things such as deleting copy operations.)
# Did you try to use the library? With what compiler? Did you have any 
problems?
I did not try it.  However the "clever" implementation is highly 
compiler-and-library-specific and quite aside from its design issues I 
think it would also produce significant portability issues.
# How much effort did you put into your evaluation?
I spent several hours on this review over the course of several days, 
and some related prior discussion in the mailing list.  I read through 
all the documentation and most of the code (though not the tests).
============================
Having said all of the above, I wonder whether the library author has 
considered a different design for this functionality?
Rather than the given example code:
     using bop = boost::out_ptr;
     boost::shared_ptr<mll_driver> axis_driver(nullptr);
     mll_errno err = mll_driver_init(
         bop::out_ptr(axis_driver, mll_driver_free),
         /* additional args ...*/);
     if (err != MLL_SUCESS) {
         // ...
     }
A wrapping lambda syntax could be used instead:
     boost::shared_ptr<mll_driver> axis_driver(nullptr);
     mll_errno err = bop::invoke(
         [&](mll_driver** p) {
             return mll_driver_init(p,
                 /* additional args ...*/); },
         bop::out_ptr(axis_driver, mll_driver_free));
     if (err != MLL_SUCESS) {
         // ...
     }
Here, rather than its current meaning, bop::out_ptr just defines a 
parameter grouping which is then turned into a raw pointer-pointer 
argument for the lambda; it has no destructor-based shenanigans.  (This 
could also be used for other non-pointer argument types -- there just 
needs to be a way for invoke to query the expected API argument type 
from the out_ptr or from the lambda itself.)
This is admittedly more verbose than the current design (although 
bop::invoke could be written as just "invoke" due to ADL, for a slight 
simplification); but it would allow you to more safely control the 
sequencing between the API call and the reassignment of the smart 
pointer -- in particular all of the existing caveats with order of 
evaluations and with if-statement and composite conditional ordering no 
longer apply.  (You can safely "&& axis_driver" above as long as you do 
it outside of the invoke call.)
The API would be called (via the lambda) from a try-catch block inside 
of bop::invoke (or instead of the try-catch an RAII helper can be used 
to perform a try-finally action), and either the existing guarantee of 
not calling reset() or release() until after the call can be maintained 
(thereby allowing the smart pointer to be used by reference within the 
lambda), or it could be forbidden to use the smart pointer inside the 
lambda and the caller could lambda-capture the necessary arguments by 
value instead.  Either way, the caller should be less surprised about 
lifetime issues across a lambda "border", and would quickly discover if 
they did it "wrong" (because the reset/release would either always occur 
before or always occur after argument evaluation, never in mixed order).
invoke would normally be called with only one out_ptr or inout_ptr 
argument (and consequently one argument to the lambda), but could in 
principle be extended to any number of arguments either by nested invoke 
calls or by supporting a variadic form which does the equivalent within 
the library.
It could possibly also be extended to support user-defined argument 
types to do something more complex, by defining a concept interface that 
it uses to interact with the argument, rather than explicit overloads or 
specialisations on those specific types.  This could leave the decision 
of when to reset/release up to the smart-pointer-specific specialisation 
(if that's useful for performance reasons), and without leaving any time 
bombs for the caller.