Subject: Re: [Boost-build] Regex.split on a string of 13,000 newline-delimited filepaths (700kB) is prohibitively memory-intensive
From: Vladimir Prus (ghost_at_[hidden])
Date: 2009-09-14 12:59:43


On Thursday 10 September 2009 Matthew Chambers wrote:

> Attached is a simple test case based on the list of ".?[pp]" files in
> the boost source tarball, plus tools/build and tools/jam.
>
> It seems like a token-centric split method is appropriate, or a much
> more efficient regex implementation.

Hi Matt,

We already talked about this on IRC, and a clean solution is now
available. With SVN HEAD of Boost.Jam, you can use this:

        local l = [ SPLIT_BY_CHARACTERS $(string) : \n ] ;

Note that I have changed the naming to make it obvious that we're not
splitting by regex or string. I have also changed the order of parameters
to be similar to regex.split.
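
For illustration, here is a minimal sketch using a made-up comma-separated
string. The variable names and file names are hypothetical and are not taken
from the attached test case:

```jam
# Hypothetical input: one string holding several comma-separated file names.
local string = "a.cpp,b.hpp,c.ipp" ;

# Split on any character listed in the second argument (here just a comma).
local l = [ SPLIT_BY_CHARACTERS $(string) : "," ] ;

# Print one token per line.
for local f in $(l)
{
    ECHO $(f) ;
}

# The regex-based call takes its arguments in the same order (string first):
#   import regex ;
#   local l = [ regex.split $(string) "," ] ;
```

As the name suggests, the second argument is a set of delimiter characters
rather than a pattern, so passing several characters should split on any of
them.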

HTH,
Volodya


Boost-Build list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk