Boost Users :

From: jordi (jordil2_at_[hidden])
Date: 2004-12-27 10:46:01


Hi,

I'm new to regex and this is my first post, so maybe the solution is obvious
but I couldn't find it on Google...

I need to parse the multiline output of a command. Every line ends with a \n
except the last one, which ends at the end of the buffer (the "\0"
character). The output I need to parse is something like:

   "text1 this is a multiple-word text\n
   text2 another text"
(the second line does not have a newline)

As a result I want only two sub-expressions per line, using a regex like:

(\w+)\s+([^\n]+)\n

The first submatch should be the first word ("text1" and "text2"), while the
second submatch would be the rest of the line ("this is a multiple-word
text" and "another text")

In my program I use regex_search with the boost::match_continuous option;
all the other regex objects are created with the default options.

The first line matches the regex without any problem, but the second line
does not, because it does not end with a "\n". I'm unable to find a regex
that matches both possible ends of line (the \n or the \0 character).
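
A minimal sketch of the setup described above. It uses std::regex from the C++ standard library as a stand-in for Boost.Regex (an assumption made only to keep the example self-contained; the two libraries may differ in details), and parse_lines is a hypothetical helper name, not something from my program.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: applies `pattern` repeatedly to `buffer`. Each match
// must start exactly at the current position (match_continuous), and the
// next attempt starts where the previous match ended.
std::vector<std::pair<std::string, std::string>>
parse_lines(const std::string& buffer, const std::string& pattern)
{
    const std::regex re(pattern);
    std::vector<std::pair<std::string, std::string>> fields;
    std::smatch m;
    auto pos = buffer.cbegin();
    while (pos != buffer.cend() &&
           std::regex_search(pos, buffer.cend(), m, re,
                             std::regex_constants::match_continuous)) {
        // m[1] is the first word, m[2] is the rest of the line.
        fields.emplace_back(m[1].str(), m[2].str());
        pos = m[0].second;  // continue right after this match
    }
    return fields;
}
```

Calling parse_lines with the two-line sample buffer and the pattern `(\w+)\s+([^\n]+)\n` yields only the first line: the second line has no trailing "\n", so the mandatory "\n" at the end of the pattern cannot match.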

I've tried some expressions without success:

1.- First I tried to match \n || \0 using:

(\w+)\s+([^\n\x00]+)([\n\x00])

but it seems the \x00 is not part of the buffer, so the second line does not
match.

2.- Then I tried to use "$", without success. (By the way, I assumed "$"
would work like "\n", but it does not match the "end of line" character.
When should I use it?)

3.- On Google I found that I should use "\z" or "\Z". I tried both, but
they didn't work: the last line of the text never matches! (I suppose I
need to set an additional option on the regex object for "\z" or "\Z" to
work.)

Finally I've found a workaround using the regex:

(\w+)\s+([^\n]+)\n*

and now it works, but I would like to find a way to match the end of
buffer/end of string. Any ideas?
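
For comparison, here is a small sketch of the workaround next to one possible alternative that accepts either a newline or the end of the input, `(?:\n|$)`. It is written against std::regex (ECMAScript grammar) rather than Boost.Regex, which is an assumption for the sake of a self-contained example: std::regex has no "\z" or "\Z", and whether "$" behaves the same way in Boost with my options is not something this sketch verifies. The names Fields, split_lines, parse_with_optional_newline and parse_with_end_anchor are hypothetical.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

using Fields = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: repeatedly matches `pattern` at the current position
// (match_continuous) and collects (first word, rest of line) pairs.
Fields split_lines(const std::string& buffer, const std::string& pattern)
{
    const std::regex re(pattern);
    Fields fields;
    std::smatch m;
    auto pos = buffer.cbegin();
    while (pos != buffer.cend() &&
           std::regex_search(pos, buffer.cend(), m, re,
                             std::regex_constants::match_continuous)) {
        fields.emplace_back(m[1].str(), m[2].str());
        pos = m[0].second;  // resume right after the consumed text
    }
    return fields;
}

// The workaround from this post: the trailing newline is optional.
Fields parse_with_optional_newline(const std::string& buffer)
{
    return split_lines(buffer, R"((\w+)\s+([^\n]+)\n*)");
}

// Possible alternative: require either a newline or the end of the input.
Fields parse_with_end_anchor(const std::string& buffer)
{
    return split_lines(buffer, R"((\w+)\s+([^\n]+)(?:\n|$))");
}
```

On the two-line sample buffer both functions return two entries. One difference in intent: `\n*` also swallows any run of blank lines after a line, while `(?:\n|$)` consumes at most one newline per match.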

Thanks in advance,

Jordi
