
From: williamkempf_at_[hidden]
Date: 2001-03-26 10:50:53


--- In boost_at_y..., Jens Maurer <Jens.Maurer_at_g...> wrote:
>
> I've had a look at the boost thread library.
> All in all, I believe this is a solidish foundation to build on.
>
> Here are a few detailed remarks:

Thanks. Most of these comments are going to be useful.

> - Some parts of the documentation are hard to understand and/or seem
> to require additional commas. Some rephrasing may help.
> In particular, the semaphore docs talk about "decrement in a single
> atomic operation". On first sight, decrement is only a single
> operation, so atomicity is for free (of course, it isn't necessarily,
> because it involves read-modify-write bus cycles).

Just because something is "a single operation" does not mean that it
will be performed atomically, but then you knew that ;). I'm
interested in hearing about specific areas where people find the
documentation hard to read or understand, though.
 
> - The documentation is missing some statements about assumptions
> of instruction reordering by the compiler. This could mean
> to put the thread sync library under the "observable behaviour"
> clause from the C++ standard. We don't want to have instructions
> reordered across lock/unlock calls, possibly moving access to shared
> data out of the critical section :-) This needs some formalism
> similar to "volatile" handling, also taking into account actual
> memory bus transactions in SMP systems. Basically, we must be
> sure that all effects are "done" when the unlock() returns.

Agreed.
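To make the concern concrete, here's a rough, untested sketch (not
library code, just using the lock type from the benchmark below) of
the kind of access that must not move past the unlock:

    #include <boost/thread/mutex.hpp>

    boost::mutex mtx;
    int shared_count = 0;

    void increment()
    {
        boost::mutex::lock lk(mtx);   // acquire the mutex
        ++shared_count;               // this store must be complete...
    }                                 // ...before the unlock in lk's destructor returns

If the compiler (or the memory system on an SMP box) were free to move
the store to shared_count past the unlock, another thread could see a
stale value inside its own critical section.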
 
> - fast_mutex strikes me as a bad name. We just killed fast_crc.
> What about "unchecked"?

There's a lot more to the design of fast_mutex than just its
unchecked locking policy (see my comments on speed below). I'm not
against renaming it, however. Maybe simple_mutex? (I don't like
that either, but...)

> - The "mutex" introduction should state at the very beginning that
> a mutex is always associated with some shared data. Otherwise the
> "shared data" falls out of nowhere.

Mutexes are not restricted to "shared data". When the documentation
is read in order, this is already stated up front in
mutex_concept.html: "The purpose of a mutex (short for mutual-
exclusion) is to serialize access to a resource shared between
multiple threads." I could restate this in each mutex type's
documentation, but that seems like needless repetition.

> - I've had some difficulties understanding the difference between
> "undefined" and "unspecified". This could be clarified.

I'll look into this.
 
> - I believe we need the possibility to specify the desired features
> of a mutex in a much more fine-grained manner. For example, on my
> platform (Linux 2.4.2, pthreads from glibc 2.2.2) I have the following
> set of features available natively:
> - recursive (without timedlock only)
> - checked with and without timedlock
> - unchecked with and without timedlock
> (trylock is available in all cases)
> I'd like to make Boost.Threads use the native implementation whenever
> possible (see also next item)

See my reply to another response on this. I'm not at all sure that I
agree with a generative approach and would rather see individual
types defined. BTW, a recursive mutex created with the pthreads
library should not be allowed to be used with boost::condition, since
it cannot be used safely there (pthreads leaves the behavior
undefined if you wait on a condition variable while the mutex has
been locked recursively).
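For anyone who hasn't run into this, here's a rough, untested sketch
(plain pthreads, not Boost.Threads code) of the pattern that goes
wrong:

    #include <pthread.h>

    pthread_mutex_t mtx;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    int ready = 0;

    void broken_wait()
    {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        pthread_mutex_init(&mtx, &attr);

        pthread_mutex_lock(&mtx);
        pthread_mutex_lock(&mtx);            // second, recursive lock
        while (!ready)
            pthread_cond_wait(&cond, &mtx);  // releases at most one lock level,
                                             // so the signalling thread can never
                                             // acquire the mutex; pthreads leaves
                                             // this undefined
        pthread_mutex_unlock(&mtx);
        pthread_mutex_unlock(&mtx);
    }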
 
> - While we're focusing on interfaces, I'd like to point out that
> performance seems to be important for threading. After all, one
> class of problems is turned into a threads-based implementation
> for the sole purpose of getting speed improvements on SMP
> platforms. I've run a few measurements with the current code
> (this is microseconds per lock/unlock pair (no contention) in a
> tight loop, see the attached code)

*chuckles* Performance being discussed while "focusing on
interfaces"? ;) Unless you think the interface prevents efficient
implementations, the two are unrelated.

Others have brought up issues with the performance of the current
implementations. My response is twofold. First, we're focusing on
the interfaces at this point, ensuring they are usable, safe and can
be implemented on the target platforms. So performance is something
that isn't being focused on and can be addressed later. Second, if
it's important to you and you think you can improve the performance,
then I'd highly suggest submitting an alternative implementation.

I should also comment that making a process multi-threaded solely to
speed it up is often a mistake. On single-processor machines you'll
very rarely manage to speed things up, and in fact will often slow
things down instead. Even on an SMP machine, if the threads are not
carefully designed you'll wind up with a slower process. If things
are carefully designed, you'll often find that the speed of the locks
makes little difference to the overall speed, since very few locks
will actually be taken (high lock contention makes a process nearly
synchronous in nature and thus can actually make an MT process run
slower, even on SMP machines). I still agree that speed is important
here and would love to see faster implementations, but I feel that
any reasoning has to take these facts into account, and that we have
fast_mutex, which will work most of the time when speed matters.

> boost::fast_mutex: 0.255257
> boost::mutex: 0.669022
> pthreads.fast: 0.21111
> pthreads.normal: 0.198289
> pthreads.recursive: 0.228514
> pthreads.errorcheck: 0.21327
>
> Note how boost::mutex is 3x slower than the native implementation,
> because it's emulating recursive mutexes with a condition variable.

boost::mutex does not emulate a recursive mutex. The condition
variable exists only because of the timedlock.

> Also note that the pimpl-wrapping in boost::fast_mutex costs about
> 10% performance.

I don't see this evidenced in your timings above. How did you figure
this out?

Several people have complained about the use of pimpl. My own
timings showed the speed difference to be nearly unmeasurable, but
I'll run some more timings soon. In any event, this decision is the
least significant one right now, trivially changed (even made
optional) at any point. Again, the focus needs to be on the
interfaces, not the performance.

> Summary: I want to use the native implementation whenever possible.
> That means the user must be able to help me and specify what kinds
> of locks he really needs on the mutex. Currently, he cannot do
> so.

Again, see my response to generative programming.
 
> - Minor sub-item of the above: timedlock + unchecked is not available
> currently.
>
> - The above sounds like we need a feature model diagram.

One already exists. Such a diagram isn't going to give you a
solution, however.
 
> - The automatic conversion to bool got enough disreputation here
> that I don't need to add anything.

You must not be looking at the latest upload, which makes me question
your timings above. The implementation changed drastically for
pthreads and so they may (likely will) run much faster if you truly
were using an older implementation. As for bool conversions, they've
long since been replaced by void* conversions.
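Roughly the idea (my own toy sketch, not the actual library code):

    class lock_sketch {
    public:
        explicit lock_sketch(bool locked) : locked_(locked) {}
        operator const void*() const { return locked_ ? this : 0; }
    private:
        bool locked_;
    };

    void example(lock_sketch& lk)
    {
        if (lk) { /* the void* conversion tests true when locked */ }
        // int n = lk + 1;  // compiles via int with operator bool,
        //                  // ill-formed with a void* conversion
    }

The object still tests true or false in an if(), but it no longer
silently converts to int and participates in arithmetic or
comparisons.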
 
> - One other feature the user may want to state is whether he
> needs an explicit unlock(). If he doesn't need it, the boolean
> variable in the basic_*lock classes can be removed in these cases.
> (At least the interface should allow that to be implemented later.)

There's no point in using a guard if the destructor isn't going to
unlock the mutex. Removing the bool is going to buy you very little,
while it's going to cost the library a lot in terms of safety.
 
> - The timedlock interface (milliseconds) should be changed to
> the equivalent of POSIX timespec, i.e. seconds and nanoseconds to
> be future-safe.

A) I very much dislike a "time out at a specified time" interface,
preferring to "time out after a specified duration". I've recently
seen arguments supporting both types of timeout that make some sense,
so I'd consider including both, but I won't consider removing the
duration timeout.

B) timespec is a POSIX thing. There is nothing available portably
(i.e. in the language standard) that would do the equivalent. I'm
not going to design an entire time interface just for Boost.Threads,
though I'm interested in Boost including such an interface.

C) It might be possible to define the waits in terms of seconds and
nanoseconds, the basic resolution available with timespec, but not
all platforms support so fine a granularity of time. Milliseconds
are more likely to be implementable on multiple platforms. That was
the reasoning applied when these interfaces were designed, but I'm
not locked into it.
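For what it's worth, here's a rough, untested sketch (my own
illustration, not the library's implementation) of how a millisecond
duration maps onto the absolute timespec wanted by
pthread_mutex_timedlock(), on platforms that provide it:

    #include <pthread.h>
    #include <time.h>
    #include <sys/time.h>

    bool timed_lock_ms(pthread_mutex_t& m, long milliseconds)
    {
        timeval now;
        gettimeofday(&now, 0);

        timespec deadline;
        deadline.tv_sec  = now.tv_sec + milliseconds / 1000;
        deadline.tv_nsec = now.tv_usec * 1000 + (milliseconds % 1000) * 1000000;
        if (deadline.tv_nsec >= 1000000000) {   // carry nanoseconds into seconds
            deadline.tv_sec  += 1;
            deadline.tv_nsec -= 1000000000;
        }
        return pthread_mutex_timedlock(&m, &deadline) == 0;
    }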

> - The overall introduction in the docs misses a link to condition
> variables.

I'll look into this.
 
> - The example for the condition variable references a "finished"
> variable which appears not to be defined.

Hmmm... I'll look into this as well.
 
> - atomic_t should say that value_type must be at least "int".
> Otherwise, it's of not much use in the real world.

I'm not sure I follow you here. Care to explain?
 
> - lock_error should get a string why it was thrown.

I'll think about this.
 
> - How do I put a variable number of mutexes in a dynamic data
> structure? Say, I'd like to implement the "dining philosophers"
> problem with a runtime configurable number of philosophers?

I don't see the problem here either.
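Since a mutex isn't copyable, you simply hold them through pointers
(or some other non-copying scheme). A quick, untested sketch of what
I have in mind, using the lock type from the benchmark below:

    #include <boost/thread/mutex.hpp>
    #include <vector>

    int main()
    {
        const int philosophers = 5;   // could just as well be read at runtime

        // one "fork" mutex per philosopher; store pointers because
        // mutexes can't be copied into the container
        std::vector<boost::mutex*> forks;
        for (int i = 0; i < philosophers; ++i)
            forks.push_back(new boost::mutex);

        {
            // philosopher 0 picks up forks 0 and 1
            boost::mutex::lock left(*forks[0]);
            boost::mutex::lock right(*forks[1]);
            // ... eat ...
        }

        for (int i = 0; i < philosophers; ++i)
            delete forks[i];
    }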
 
> - Regarding config.hpp: We've got a new dimension of configuration
> information here, namely operating system and hardware information.
> Up to now, we only had information regarding the deviation of some
> compilers and libraries from the true and only ISO C++. There
> is no such thing as "true and only" for the threading configuration.
> Also, putting BOOST_HAS_WINTHREADS under the MSVC section is,
> technically speaking, incorrect, because there may be an
> implementation of MSVC on non-WinThreads platforms.

The only define really used outside of compiling Boost.Threads for a
given platform is BOOST_HAS_THREADS. I imagine the other defines
could be specified by the build process instead of being placed in a
header at all.
 
> I think we should leave out the threads configuration from config.hpp
> and put it somewhere else. Are we going to have more interfaces that
> need OS/hardware specific information? I believe, yes. Besides,
> the OS/hardware configuration should be (nearly) entirely orthogonal
> to the compiler/library configuration currently in config.hpp
>
> So let's have some os.hpp and hardware.hpp. Better names, in particular
> for "os.hpp", appreciated. "platform.hpp" sounds too general, though.

I don't think we should use additional headers. Either specify these
at build time, or leave them in config.hpp. Defining a new header
for this is only going to complicate things for users.

> Jens Maurer
>
> #include <boost/thread/mutex.hpp>
> #include <boost/thread/fast_mutex.hpp>
> #include <boost/thread/thread.hpp>
> #include <boost/timer.hpp>
>
> #include <pthread.h>
>
> #include <sys/time.h>
> #include <unistd.h>
>
> #include <iostream>
>
> // time 1,000,000 uncontended lock/unlock pairs on a Boost mutex type M
> template<class M>
> void run(const char * name)
> {
>     M mtx;
>
>     timeval tv;
>     gettimeofday(&tv, 0);
>     for(int i = 0; i < 1000000; ++i) {
>         typename M::lock lk(mtx);
>     }
>     timeval tv2;
>     gettimeofday(&tv2, 0);
>     std::cout << name << ": "
>               << ((tv2.tv_sec - tv.tv_sec) + (tv2.tv_usec - tv.tv_usec)/1e6)
>               << std::endl;
> }
>
> // same measurement on a raw pthread mutex of the given kind
> void run2(int kind, const char *name)
> {
>     pthread_mutex_t pmtx;
>     pthread_mutexattr_t mtxattr;
>     pthread_mutexattr_init(&mtxattr);
>     pthread_mutexattr_settype(&mtxattr, kind);
>     pthread_mutex_init(&pmtx, &mtxattr);
>
>     timeval tv;
>     gettimeofday(&tv, 0);
>     for(int i = 0; i < 1000000; ++i) {
>         pthread_mutex_lock(&pmtx);
>         pthread_mutex_unlock(&pmtx);
>     }
>     timeval tv2;
>     gettimeofday(&tv2, 0);
>     std::cout << name << ": "
>               << ((tv2.tv_sec - tv.tv_sec) + (tv2.tv_usec - tv.tv_usec)/1e6)
>               << std::endl;
>     pthread_mutex_destroy(&pmtx);
> }
>
>
> int main()
> {
>     run<boost::fast_mutex>("boost::fast_mutex");
>     run<boost::mutex>("boost::mutex");
>     run2(PTHREAD_MUTEX_FAST_NP, "pthreads.fast");
>     run2(PTHREAD_MUTEX_NORMAL, "pthreads.normal");
>     run2(PTHREAD_MUTEX_RECURSIVE, "pthreads.recursive");
>     run2(PTHREAD_MUTEX_ERRORCHECK, "pthreads.errorcheck");
> }

You missed boost::recursive_mutex in your timings above.

Bill Kempf

