Anyway, these two tests perform almost equally on Linux/g++ (though I tested only very roughly).

Actually, one more addition: I tested the Linux/g++ behavior using codepad, where I posted the example. It is possible that codepad compiled the executable without thread support, in which case the sentry takes no locks. That would explain why it performed the same as the stream_buffer approach.
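For anyone who wants to repeat the comparison locally, here is a rough sketch of the two write paths. It is not the code I posted on codepad; the function names (write_formatted, write_via_streambuf, seconds) are made up for illustration, and whether the sentry really takes a lock depends on the library and on how it was built, so treat the comments as my assumption rather than a statement about libstdc++.

```cpp
#include <chrono>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Formatted path: every operator<< call constructs its own
// std::ostream::sentry. Assumption from the discussion: in a
// thread-enabled build this is where a lock may be taken.
inline void write_formatted(std::ostream& os, const std::string& s, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        os << s;
}

// Stream buffer path: construct one sentry for the whole batch,
// then write straight to the underlying std::streambuf.
inline void write_via_streambuf(std::ostream& os, const std::string& s, std::size_t n) {
    std::ostream::sentry guard(os);
    if (!guard)
        return;  // stream is not in a good state, write nothing
    std::streambuf* buf = os.rdbuf();
    for (std::size_t i = 0; i < n; ++i)
        buf->sputn(s.data(), static_cast<std::streamsize>(s.size()));
}

// Hypothetical helper: wall-clock seconds taken by a callable.
template <class F>
double seconds(F f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    return std::chrono::duration<double>(std::chrono::steady_clock::now() - start).count();
}
```

To compare, call each function on a fresh std::ostringstream with the same string and a large n inside seconds(...), once built with thread support (for example g++ -pthread) and once without. A single run is only a rough indication.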

Regards,
Ovanes