Boost Users :

From: John Maddock (john_at_[hidden])
Date: 2006-08-02 04:47:14


Winson Yung wrote:
>> Hello all, I have the following table:
>>
>> OPERATING REVENUES:
>> publishing
>> $ 42,419 $ 44,754 $ 46,203
>> collegiate marketing and production services
>> 97
>> ASSOCIATION MANAGEMENT SERVICES
>>
>> 16
>> wireless
>> 8,883 8,129 7,507
>>
>> 51,302 52,883 53,823
>>
>> All I know is that OPERATING REVENUES: will always be there. The
>> question is how to write a regular expression to capture the total
>> (which is 51,302 here). There might be more or fewer than four rows
>> in the table. I would really appreciate it if anyone has a good
>> suggestion on this.

I'm assuming that the difference between the sub-totals and the totals is
that the sub-totals always have a header? If so, then off the top of my head
(caution: untried!) something like:

"OPERATING\\s+REVENUES:[[:blank:]]*[\r\n]+" // tag line
"(?:" // group sub-totals
 "\\s*[^\\d$][^\r\n]*[\r\n]+[^\r\n]+[\r\n]+" // sub-total = header line + figures line
")*" // close group and repeat
"\\s+\\$?([\\d,.]+)" // capture total
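
To try the expression out, a minimal sketch along these lines may help. It makes several assumptions that are not part of the original suggestion: it uses std::regex (ECMAScript grammar) in place of Boost.Regex so that it builds without extra libraries, the function name extract_total is made up for illustration, and the final character class is written closed as [\\d,.]+ so the pattern compiles. It also assumes the text ends right after the totals row, because the sub-total group can otherwise keep consuming later lines.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first figure on the totals row,
// or an empty string if the text does not have the assumed shape.
// Assumption: std::regex stands in for Boost.Regex here.
std::string extract_total(const std::string& text) {
    static const std::regex re(
        "OPERATING\\s+REVENUES:[[:blank:]]*[\r\n]+"      // tag line
        "(?:"                                              // group sub-totals
          "\\s*[^\\d$][^\r\n]*[\r\n]+[^\r\n]+[\r\n]+"    // header line + figures line
        ")*"                                               // close group and repeat
        "\\s+\\$?([\\d,.]+)");                             // capture total

    std::smatch m;
    if (std::regex_search(text, m, re)) {
        return m[1].str();  // first capture group: the leading total
    }
    return std::string();   // no match
}
```

Called on the table from the question (without the quote markers, and ending at the totals row), this should return 51,302. Note that it relies on backtracking: the group first swallows the blank line before the totals and then has to give one newline back so that the trailing \\s+ can match.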

HTH, John.


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net