

Subject: Re: [Boost-users] [Regex | Xpressive] Efficiently "grepping" large files
From: Thomas Luzat (thomas_at_[hidden])
Date: 2011-08-17 08:53:43


On 2011-08-17 14:43, Chris Cleeland wrote:
> Have you considered mmap'ing the file and allowing all your activity
> to occur on the mmap'd file? That way the VM subsystem would worry
> about paging things in or out as necessary, and there wouldn't be any
> issues with contention across multiple threads. Of course, if you
> don't have mmap on your system...

I have considered mmap'ing or reading through the whole file, but
benchmarking so far has shown that I am mostly I/O-limited. By
synchronously working on blocks in parallel I avoid disk seeks as much
as possible. I might offer an mmap-based implementation for cases where
seeks are not that expensive (such as for SSDs or slower CPUs).
Another problem is that mmap alone is not a complete solution on
32-bit systems, given that files may very well be larger than a few GB,
but that can be solved as well.
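
To illustrate the block-wise approach, here is a minimal C++ sketch under
stated assumptions, not the actual implementation. A single reader pulls
fixed-size blocks off the disk in order, so access stays sequential, and
hands each block (trimmed to whole lines) to a worker task that runs the
regex. The names grep_count and count_matching_lines, the line-based
splitting, the cap on tasks in flight, and the use of std::regex instead
of Boost.Regex or Xpressive are all hypothetical choices made for
illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <fstream>
#include <functional>
#include <future>
#include <regex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: count the lines in `chunk` that contain a match.
static std::size_t count_matching_lines(const std::string& chunk,
                                        const std::regex& re) {
    std::size_t hits = 0;
    std::size_t pos = 0;
    while (pos < chunk.size()) {
        std::size_t end = chunk.find('\n', pos);
        if (end == std::string::npos) end = chunk.size();
        const auto first = chunk.cbegin() + static_cast<std::ptrdiff_t>(pos);
        const auto last = chunk.cbegin() + static_cast<std::ptrdiff_t>(end);
        if (std::regex_search(first, last, re)) ++hits;
        pos = end + 1;
    }
    return hits;
}

// Hypothetical entry point: read `path` sequentially in blocks of
// `block_size` bytes and count matching lines on worker tasks.
// Returns 0 if the file cannot be opened (an assumption of this sketch).
std::size_t grep_count(const std::string& path, const std::string& pattern,
                       std::size_t block_size) {
    assert(block_size > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) return 0;

    const std::regex re(pattern);
    const std::size_t max_in_flight =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());

    std::deque<std::future<std::size_t>> in_flight;
    std::string carry;                    // partial line left over from the previous block
    std::string buffer(block_size, '\0');
    std::size_t total = 0;

    while (in) {
        in.read(&buffer[0], static_cast<std::streamsize>(block_size));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;
        carry.append(buffer.data(), static_cast<std::size_t>(got));

        // Only hand off complete lines; keep the tail for the next block.
        const std::size_t cut = carry.rfind('\n');
        if (cut == std::string::npos) continue;
        std::string chunk = carry.substr(0, cut + 1);
        carry.erase(0, cut + 1);

        // Bound memory use: wait for the oldest task before adding another.
        if (in_flight.size() >= max_in_flight) {
            total += in_flight.front().get();
            in_flight.pop_front();
        }
        in_flight.push_back(std::async(std::launch::async, count_matching_lines,
                                       std::move(chunk), std::cref(re)));
    }

    if (!carry.empty()) total += count_matching_lines(carry, re);  // last line without '\n'
    for (auto& task : in_flight) total += task.get();
    return total;
}
```

A call such as grep_count("big.log", "error", 1 << 20) would scan the file
in 1 MiB blocks. Because only the reader touches the file, and it always
reads forward, offsets never need to fit into a 32-bit address space the
way a single whole-file mapping would.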

Cheers

Thomas Luzat

-- 
Thomas Luzat     Softwareentwicklung       USt-IdNr.: DE255529066
Kaiserring 2a    Mobile: +49 173 2575816   Web:       http://luzat.com
46483 Wesel      Fax:    +49 173 502575816 E-Mail:    thomas_at_[hidden]
