Boost Users :

Subject: [Boost-users] File-mapped shared memory access through Boost interprocess is slow on some HPC file-systems due to mmap
From: Scott Ritchie (s.ritchie73_at_[hidden])
Date: 2016-05-22 21:39:46


Hi all,

I'm wondering if there is some way to use the Boost interprocess library
without the use of mmap for file I/O?

I have developed an R-language package that uses another third party
package (bigmemory) to access objects in shared memory when parallelising R
across multiple cores. This third party package uses the Boost interprocess
library to make these file-backed shared memory objects available to R.

This worked perfectly until I tested my package on a multi-node cluster.
After some digging I found that the code accessing these shared memory
objects was extremely slow and had hugely variable runtimes. Accessing
these shared memory objects became exponentially slower as I parallelised
over more and more cores.

After talking to the authors of the bigmemory library and the
administrators of the cluster, we have determined the root cause to be the
use of mmap() on that particular filesystem (GPFS). On this filesystem,
mmap() performs concurrency checking on every read and write since the
filesystem is shared across multiple physical nodes. Marking the shared
memory segments as read_only did not solve the issue.

The advice of the cluster administrators is to not use mmap() for file I/O.
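
To make that advice concrete, explicit reads in place of a mapping would
look roughly like the sketch below. It assumes a flat binary file of
doubles and POSIX pread(); read_doubles_at is a hypothetical helper of my
own naming, not part of Boost or bigmemory. Note that it copies the data
into private memory, so each process would hold its own copy rather than
sharing pages.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <vector>

#include <fcntl.h>      // open
#include <sys/types.h>  // off_t, ssize_t
#include <unistd.h>     // pread, close

// Hypothetical helper: copy `count` doubles, starting at element index
// `first`, from a flat binary file into a private buffer using explicit
// pread() calls instead of a memory mapping.
std::vector<double> read_doubles_at(const char* path,
                                    std::size_t first,
                                    std::size_t count) {
    assert(path != nullptr);

    int fd = ::open(path, O_RDONLY);
    if (fd < 0) {
        throw std::runtime_error("open failed");
    }

    std::vector<double> out(count);
    char* dst = reinterpret_cast<char*>(out.data());
    std::size_t remaining = count * sizeof(double);
    off_t offset = static_cast<off_t>(first * sizeof(double));

    // pread() may return fewer bytes than requested, so loop until done.
    while (remaining > 0) {
        ssize_t n = ::pread(fd, dst, remaining, offset);
        if (n < 0 && errno == EINTR) {
            continue;  // interrupted by a signal: retry the same read
        }
        if (n <= 0) {  // I/O error, or end of file before `count` elements
            ::close(fd);
            throw std::runtime_error("short read or I/O error");
        }
        dst += n;
        offset += n;
        remaining -= static_cast<std::size_t>(n);
    }

    ::close(fd);
    return out;
}
```

Each worker would call something like read_doubles_at(path, first, count)
for the slice it needs, which gives up the shared-memory semantics I
currently rely on.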

I'm hoping there is a simple fix, such as a compilation flag I can provide
to Boost so that it uses some alternative file I/O function or library. If
not, I will need to rewrite my code so that it is multi-threaded instead of
multi-process, without the use of Boost.

Kind regards,

---
Scott Ritchie,
Ph.D. Student | Integrative Systems Biology | Pathology |
http://www.inouyelab.org
The University of Melbourne
---

