

Subject: Re: [Boost-build] Regex.split on a string of 13,000 newline-delimited filepaths (700kB) is prohibitively memory-intensive
From: Matthew Chambers (matthew.chambers_at_[hidden])
Date: 2009-09-14 13:01:01


Cool, thanks. I noticed that your first-pass patch with SPLIT worked on
its arguments by reference instead of by value. Indeed, even something like:
local foo = "bar,baz" ;
local tokens = [ SPLIT , : $(foo) ] ;
echo $(foo) ; # this outputs "bar baz" as a list, and it's equal to tokens

Is that behavior changed/fixed?

-Matt

Vladimir Prus wrote:
> On Thursday 10 September 2009 Matthew Chambers wrote:
>
>
>> Attached is a simple test case based on the list of ".?[pp]" files in
>> the boost source tarball, plus tools/build and tools/jam.
>>
>> It seems like a token-centric split method is appropriate, or a much
>> more efficient regex implementation.
>>
>
> Hi Matt,
>
> we already talked about this on IRC, and now the solution is available
> in a clean way. Using SVN HEAD of Boost.Jam, you can use this:
>
> local l = [ SPLIT_BY_CHARACTERS $(string) : \n ] ;
>
> Note that I have changed the naming to make it obvious that we're not
> splitting by regex or string. I have also changed the order of parameters
> to be similar to regex.split.
>
> HTH,
> Volodya
>
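
For anyone wanting to check the by-reference question against the renamed rule, here is a minimal sketch. It assumes a Boost.Jam build that provides SPLIT_BY_CHARACTERS with the (string : delimiters) parameter order described in the quoted message; the variable names are taken from the example earlier in this post, and the comments describe what by-value semantics would produce rather than confirmed output.

```jam
# Sketch only: assumes SPLIT_BY_CHARACTERS ( string : delimiters ) is available.
local foo = "bar,baz" ;
local tokens = [ SPLIT_BY_CHARACTERS $(foo) : "," ] ;

# With by-value semantics, foo should still be the single element "bar,baz".
ECHO "foo:" $(foo) ;

# tokens should hold the two elements "bar" and "baz".
ECHO "tokens:" $(tokens) ;
```

If the first line printed shows foo as two separate elements, the rule is still modifying its argument in place.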

