
Subject: Re: [boost] Designing a multi-threaded file parser
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2016-04-22 14:18:05


On 22 Apr 2016 at 10:31, Aaron Boxer wrote:

> My impression is that memory mapping is best when reading a file more than
> once, because
> the first read gets cached in virtual memory system, so subsequent reads
> don't have to go to disk.
> Also, it eliminates system calls, using simple buffer access instead
>
> Since memory mapping acts as a cache, it can create memory pressure on the
> virtual memory system,
> as pages need to be recycled for the next usage. And this can slow things
> down, particularly when reading
> files whose total size meets or exceeds current physical memory.
>
> In my case, I am reading the file only once, so I think the normal file IO
> methods will be better.
> Don't know until I benchmark.

You appear to have a flawed understanding of unified page cache
kernels (pretty much all OSs nowadays apart from QNX and OpenBSD).

Unless O_DIRECT is on, *all* reads and writes are memcpy()'d from/to
the page cache. *Always*.

mmap() simply wires parts of the page cache into your process
unmodified. Memory mapped i/o therefore saves on a memcpy(), and is
therefore the most efficient cached i/o you can do.

If you are not on Linux, a read() or write() of >= 4Kb on a 4Kb
aligned boundary may be optimised into a page steal by the kernel of
that memory page into the page cache such that DMA can be directed
immediately into userspace. But, technically speaking, this is still
DMA into the kernel page cache as normal, it's just the page is wired
into userspace already.

So, basically, you only slow down your code by using read() or
write(). Use mapped files unless the cost of the memcpy() done by
read() is lower than the cost of setting up the mmap(). The crossover
is typically around 16Kb or so, but it depends on memory bandwidth
pressure and processor architecture. That part you should benchmark.

Obviously all the above is with O_DIRECT off. Turning it on is a
whole other kettle of fish, and I wouldn't recommend you do that
unless you have many months of time to hand to write and optimise
your own caching algorithm, and even then 99% of the time you won't
beat the kernel's implementation which has had decades of tuning and
optimisation.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/



Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk