
Boost Users:

Subject: [Boost-users] Platform independent memory mapped [file] IO
From: Steve (sjh_boost_at_[hidden])
Date: 2011-11-22 06:40:20


I have some very large (TB scale) files, and I want to map fixed-size
(say 1 MB) segments from them into memory, for both reading and writing
(in different contexts), making maximum use of OS-level caching. The
software I'm writing needs to work under Unix/Linux and Windows...
performance is critical.

I've discovered boost::iostreams::mapped_file_source and
boost::iostreams::mapped_file_sink, which provide most of the facilities
I'm looking for. The facilities I'd like, but haven't found are:

  * Forcing a synchronisation of written data to disk ( msync(2) on
    Unix; FlushViewOfFile on Windows)
  * Locking files to prevent two processes from attempting to write
    the same file at the same time (or reading a newly created file
    before written data is flushed)
  * Controlling attributes of the file at creation time (Unix)

Can I do these things using boost::iostreams::mapped_file_sink and
boost::iostreams::mapped_file_source? I'm aware of
boost::interprocess::basic_managed_mapped_file et al, which supports a
flush() method - but I don't want to manage the files as if they were
file-backed heaps... with all the space allocation that entails. I am
only interested in efficiently reading/writing fixed-size blocks of
very large files (of a size known from the outset.)

Am I overlooking a straightforward strategy, using the existing
boost::iostreams library, to establish file-locks and to ensure only
consistently written data is read by other processes? Is it appropriate
to want to force synchronisation from code as soon as all the data in
a block has been written (avoiding a large latency when, say, 1 GB of
pages has been updated and the process needs to exit)?


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net