
Subject: Re: [boost] [Review] Lockfree review starts today, July 18th
From: Grund, Holger (Holger.Grund_at_[hidden])
Date: 2011-07-21 06:30:55


> The review of Tim Blechmann's Boost.Lockfree library starts today, July
> 18th
> 2011, and will end on July 28th.
> I really hope to see your vote and your participation in the
> discussions on
> the Boost mailing lists!
>
I had a quick look at the code and read the first few pages of the docs, but haven't built or run anything yet. Still, a few questions:

The documentation talks a bit about false sharing and to some extent about cacheline alignment to avoid it, but I don't see that reflected in the code to the extent I would expect. Specifically, how do you ensure that a given object (I only looked at the ringbuffer) _starts_ on a cacheline boundary?

I only see this weird padding "idiom" that everyone seems to use, but nothing to prevent a ringbuffer from being placed in the middle of other objects that reside on cachelines happily write-allocated by other threads. For instance, what happens for:

ringbuffer<foo> x;
ringbuffer<foo> y;

Consider a standard toolchain without fancy optimizations. Wouldn't this normally result in x.read_pos and y.write_pos being allocated on the same cacheline?

I have been bitten quite a few times by compilers not implementing explicit alignment specifications in the way you would expect (specifically for objects with automatic storage duration).

There also doesn't seem to be a way to override the allocation of memory. For the kind of low latency we (as in Morgan Stanley) are interested in, we sometimes care about delays from the lazy PTE mechanisms that many operating systems have. If you simply allocate via new[], you may get a few lazily allocated pages from the OS. A 1ms delay for a page fault is something we do care about.

Is there any good way to override the allocation?

Are there any performance targets/tests? E.g. for a ringbuffer, I found a test with a variable number of producers and consumers useful, where producers feed well-known data and consumers do almost nothing (e.g. just add up the dequeued numbers), and you measure what feed rates can be sustained without the consumer(s) falling behind.

Lastly, what's going on with all the atomic code in there? Can I assume it's just an implementation detail that overrides things in the current Boost.Atomic lib, and hence ignore it for the review?

Thanks!
-hg


