From: williamkempf_at_[hidden]
Date: 2001-03-15 12:36:21
--- In boost_at_y..., Dan Nuffer <dnuffer_at_c...> wrote:
> williamkempf_at_h... wrote:
> >
> > Thanks very much for your interest and comments.
>
> You're welcome. I also didn't mention it initially, because I was too
> eager to write down my suggestions before I forgot it all, but I think
> you've done a great job getting everything to its current state. Thanks
> for all your work.
Thanks.
> > Phase 1. I expect to add them in Phase 3. The introduction in the
> > documentation will help you to understand why, though this type was
> > not specifically mentioned.
> >
>
> Okay, I just wanted to make sure that they weren't being overlooked or
> ignored :)
Rest assured, they are not. Other concepts I've kept in the back of
my mind that may be included in Phase 3 include gates, once functions,
and events, though there's some desire to instead focus on other
higher-level concepts such as Communicating Sequential Processes.
> > > The example you provided for the semaphore is more appropriate for
> > > a mutex. I think that a bounded buffer would show the semaphore
> > > concept better.
> >
> > I very much agree that the example is contrived and does not show the
> > benefits of the semaphore at all. I don't think a bounded buffer is
> > a better example, however, as that example is better suited for (and
> > given in the documentation for) a condition. I'll think about a
> > better example for this one.
> >
>
> True. I suggested the bounded buffer because it is the example used in
> my college textbook as an example use of semaphores.
The problem I'm running into is that all examples I can come up with
would be better implemented with other synchronization concepts. It
truly seems that the semaphore is only suited for building higher
level concepts. If anyone has any ideas here I'd love to hear them.
Otherwise I may leave the example as is.
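For reference, the textbook shape of the bounded buffer Dan mentions is
roughly the sketch below. It assumes generic counting semaphores with
parameterless up()/down(), plus a mutex/lock pair; semaphore, mutex,
lock, and N are all placeholders here, not the library's actual
interface.

    #include <deque>

    const unsigned N = 16;

    std::deque<int> buffer;      // the shared buffer itself
    semaphore slots(N, N);       // counts empty slots, initially all free
    semaphore items(0, N);       // counts filled slots, initially none
    mutex guard;                 // protects buffer

    void produce(int x)
    {
        slots.down();            // block until a slot is free
        {
            lock lk(guard);      // short critical section around the deque
            buffer.push_back(x);
        }
        items.up();              // one more item for consumers
    }

    int consume()
    {
        items.down();            // block until an item is available
        int x;
        {
            lock lk(guard);
            x = buffer.front();
            buffer.pop_front();
        }
        slots.up();              // one more free slot for producers
        return x;
    }

As noted above, the same structure falls out more naturally from a
mutex plus a condition, which is why it appears in the condition
documentation instead.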
> > > I think you should move a lot of the simple 1-3 line functions from
> > > the .cpp files into the .hpp files such as:
> > >
> > > semaphore::semaphore(unsigned count, unsigned max)
> > >     : pimpl(new impl(count, max))
> > > {
> > > }
> > >
> > > semaphore::~semaphore()
> > > {
> > >     delete pimpl;
> > > }
> > >
> > > bool semaphore::up(unsigned count, unsigned* prev)
> > > {
> > >     return pimpl->up(count, prev);
> > > }
> > >
> > > bool semaphore::down(unsigned milliseconds)
> > > {
> > >     return pimpl->down(milliseconds);
> > > }
> >
> > This would require the "impl" class to be visible to client code,
> > which is exactly the opposite of the purpose. The implementation is
> > supposed to be hidden completely from client code. Specifically, I
> > don't want client code to include files such as Windows.h, even
> > indirectly, since this will pollute the namespace and slow
> > compilation. Any refactoring that places these functions in the
> > header and yet ensures my goal of hiding the implementation will
> > result in no speed increase from inline optimizations.
> >
> > > Also, I wonder if the advantages of the pimpl idiom are greater
> > > than the disadvantages? I definitely think it will cause a
> > > performance hit. Every time you lock or unlock a mutex, you've got
> > > to dereference the pimpl pointer. And for every primitive you
> > > create you've got to call new and then delete when it goes away.
> > > Is there any advantage other than a slightly faster compile?
> >
> > It's more than "slightly" on many platforms, Windows being one of
> > them. Further, I don't want to pollute the namespace. The only
> > overhead occurs within construction/destruction, which are not likely
> > to occur in time critical locations and so shouldn't be noticed in
> > most applications.
>
> AFAIK, there is no compiler that can inline functions that haven't been
> included in a header file. That means, when I write some code that uses
> the library, when I create a mutex or other primitive, it will have to
> call new to allocate the pimpl; that much dynamic allocation is
> unacceptable for some applications. Then when I create a lock, it will
> then have to first call the lock constructor (can be inlined),
> dereference a pointer to the mutex, and then make a function call (not
> inlined) to the mutex's do_lock function, which will then dereference
> the pimpl pointer and call the real do_lock function (can be inlined
> into functions that call it in the .cpp), which then has to call the
> os/library to do the lock. This seems like a lot of unnecessary
> overhead.
If the implementation is fully hidden you can at best get one inline:
either the outer call which calls a non-inlined impl function, or the
impl function itself.
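To make that concrete, here's a minimal sketch of the layout in
question. The names mutex, impl and do_lock follow this thread; the
exact members and the pthread-based impl are only illustrative, not the
real headers.

    // mutex.hpp -- all client code sees; no Windows.h or pthread.h here
    class mutex
    {
    public:
        mutex();
        ~mutex();
        void do_lock();   // declaration only; the body lives in the .cpp,
                          // so calls from client code can't be inlined
    private:
        class impl;       // forward declaration keeps the layout hidden
        impl* pimpl;
    };

    // mutex.cpp -- the only file that includes the platform headers
    #include <pthread.h>  // or <windows.h> on Win32

    class mutex::impl
    {
    public:
        impl()  { pthread_mutex_init(&m, 0); }
        ~impl() { pthread_mutex_destroy(&m); }
        void lock()   { pthread_mutex_lock(&m); }
        void unlock() { pthread_mutex_unlock(&m); }
    private:
        pthread_mutex_t m;
    };

    mutex::mutex() : pimpl(new impl) { }
    mutex::~mutex() { delete pimpl; }

    void mutex::do_lock()
    {
        pimpl->lock();    // impl::lock can be inlined here, but the call
                          // to mutex::do_lock itself cannot be inlined
                          // into client code
    }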
> I think performance is very important.
I agree, but in this case the only way to make this faster is to
expose the implementation. The overhead caused is small enough and
the desire to hide the implementation great enough that I don't think
I want to change this.
> I recently
> benchmarked an application I had worked on that used the Xerces XML
> parsing library. The biggest time consuming functions were in the
> library: pthread_mutex_lock and pthread_mutex_unlock.
This illustrates why you're trying to over-optimize here. The
overhead of a non-inline function call is a mere fraction of the
overhead involved in the lock and is lost in the noise. The overhead
of allocation and deallocation of the pimpl is more significant, but
it's also an operation that's not likely to be found in time critical
portions of the code (for example, you're not going to create and
destroy mutexes in a loop). So in practice these operations will
also be "lost in the noise".
> After that, there
> were two other layers that were almost equal in time consumption: the
> platform abstraction function XMLPlatformUtils::lockMutex(void* const
> mtxHandle), which was defined in the .cpp and thus not inlined. Next
> was void XMLMutex::lock() which was the same thing. Now if the Xerces
> library had written those two functions in the header files, they
> would've been inlined and the app would've gained at least 15% speed
> improvement. The extra layers effectively doubled the time spent
> locking/unlocking mutexes.
This I find hard to believe, and is counter to benchmarks of my own.
> I can understand your not wanting to include windows.h in the header
> file, but I think you're overstating the negative impact. If you're
> writing a program to run on windows, odds are you've already included
> windows.h and then there's pre-compiled headers with MSVC.
*IF* you're writing it for Windows. With Boost it's more likely that
you're writing portable code that's not going to make use of a single
Windows API and the overhead at compile time becomes quite
significant. Worse yet, Windows.h defines MANY macros that will
conflict with C++ code. I strongly feel that such issues need to be
isolated and hidden when possible. If benchmarks showed me to be
wrong on the performance of real world applications I'd have to
change my mind... but remember, I ran benchmarks on the Win32
implementation during development and saw no measurable overhead at
all.
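To give one concrete instance of the macro problem: Windows.h defines
min and max as function-like macros by default, so perfectly portable
code breaks simply because the header leaked through.

    #include <windows.h>     // defines min/max macros unless NOMINMAX is set
    #include <algorithm>

    int smaller(int a, int b)
    {
        return std::min(a, b);   // the preprocessor expands the min macro
                                 // here, so this line fails to compile
    }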
> For pthreads
> platforms, including pthreads.h is trivial.
No, for Unix platforms it's trivial. The pthreads library ported to
Windows, for example, makes the mistake you want me to make here and
includes Windows.h. This makes it far from trivial. This may well
be true for other pthreads implementations.
> Maybe you could take the approach of using a .ipp file, where if the
> user wants a minimal header they just include the .hpp, but if they want
> speed, to have things inlined they include the .hpp and the .ipp. Thus
> the user has the option to choose.
That may be worth considering, but it makes me worry about ODR
violations.
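For what it's worth, the .hpp/.ipp split would look roughly like this;
the macro names and file layout are only illustrative, not an agreed
convention.

    // mutex.hpp
    class mutex
    {
    public:
        void do_lock();
        // ...
    };

    #if defined(BOOST_THREADS_INLINE)      // user opts in to inlining
    #  define BOOST_THREADS_DECL inline
    #  include "mutex.ipp"
    #else
    #  define BOOST_THREADS_DECL
    #endif

    // mutex.ipp -- also #included by mutex.cpp when not inlining
    BOOST_THREADS_DECL void mutex::do_lock()
    {
        /* platform call */
    }

The ODR worry is exactly that: if one translation unit defines
BOOST_THREADS_INLINE and another does not, do_lock ends up with two
different definitions in the same program.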
> > > > 3) I need serious help in creating some sort of build system. I
> > > > realize this is a touchy subject since Boost doesn't yet have a
> > > > standard way of handling this, but I'll settle for multiple
> > > > platform specific makefiles much as Regex does. I'm just not
> > > > qualified to handle this part.
> > > >
> > >
> > > I can probably help you out on that. Let me see what I can come up
> > > with.
> >
> > I'd appreciate the help. We need to coordinate things like this
so
> > that multiple people aren't working on the same thing. Are you
> > volunteering for specific platforms?
> >
>
> Yes I am. I can write autoconf/automake scripts. While I can only test
> them on linux and cygwin, in theory, it should be portable to any Unix
> that has pthreads installed.
So we should have *nix platforms handled. We need someone (or
multiple people) to handle the various Win32 compilers then, at the
very least.
Bill Kempf