Hi all,

I'm wondering if there is some way to use the Boost interprocess library without the use of mmap for file I/O?

I have developed an R package that uses a third-party package (bigmemory) to access objects in shared memory when parallelising R across multiple cores. bigmemory uses the Boost interprocess library to make these file-backed shared memory objects available to R.

This worked perfectly until I tested my package on a multi-node cluster. After some digging I found that the code accessing these shared memory objects was extremely slow, with hugely variable runtimes. Access became exponentially slower as I parallelised over more and more cores.

After talking to the authors of the bigmemory library and the administrators of the cluster, we have determined the root cause to be the use of mmap() on that particular filesystem (GPFS). Because the filesystem is shared across multiple physical nodes, mmap() performs concurrency checking on every read and write. Marking the shared memory segments as read_only did not solve the issue.

The advice of the cluster administrators is to not use mmap() for file I/O.

I'm hoping there is a simple fix, such as a compilation flag I can provide to Boost to make it use some alternative file I/O function or library. If not, I will need to rewrite my code without Boost so that it is multi-threaded instead of multi-process.
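
To make the question concrete, here is a minimal sketch of what a non-mmap read could look like. It uses only the C++ standard library, not any Boost API, and the helper name read_file_to_buffer is hypothetical. It copies the backing file into private heap memory with ordinary stream reads, so every process would hold its own copy. That makes it an illustration of the alternative, not a drop-in replacement for shared memory.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost or bigmemory): copy a whole file
// into private heap memory using ordinary stream reads, so that no mmap()
// call is made on the backing file.
std::vector<char> read_file_to_buffer(const std::string& path) {
    // Open in binary mode, positioned at the end so tellg() gives the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }

    const std::streamoff size = in.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }

    // Rewind and read the whole file in a single call.
    in.seekg(0, std::ios::beg);
    std::vector<char> buffer(static_cast<std::size_t>(size));
    if (size > 0 && !in.read(buffer.data(), size)) {
        throw std::runtime_error("short read on " + path);
    }
    return buffer;
}
```

With this approach the memory cost grows with the number of processes, which is why I would prefer a switch inside Boost if one exists.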

Kind regards,

--- 
Scott Ritchie,
Ph.D. Student | Integrative Systems Biology | Pathology |  http://www.inouyelab.org
The University of Melbourne
---