Boost Users:
From: John Maddock (john_at_[hidden])
Date: 2004-09-30 05:33:26
> Thank you Caleb and Hartmut for your replies. You both seem to think regex
> is a bad way to go, so I will explain better what I want to write, just to
> be clear.
>
> I want to write a tool (probably a CLI) where I can say: here is a large
> folder full of code, go and parse it. I will store the results in XML
> format somewhere; then I can say "<myapp> class someclass" and the program
> will find where that class is declared/defined using its database, saving
> me a headache.
>
> So I thought I could use one of the C++ expat wrappers, and boost regex
> looked powerful enough to do the parsing, if only I were handy enough with
> regular expression syntax.
>
> Anyway, I don't know if that better explanation will make any difference
> to your recommendations. I look forward to reading your opinions.
>
> Oh, and the example I looked at is here, Hartmut:
> http://boost.org/libs/regex/example/snippets/regex_search_example.cpp
> That is what got me thinking I might actually be able to take on this
> challenge.

It depends on what you want to do. If you want to use a "real" C++ parser then
you will also have to preprocess the code (including the includes) before
parsing it. In theory this gives you a "perfect" result, but only if
you know what include paths to use and what predefined macros should be set
(think about conditional code blocks).

Regexes, on the other hand, don't require you to preprocess the code, but can
get confused by macros and the like.

So you have to choose the approach that best meets your expectations, and live
with its defects either way ;-)

To solve your problem BTW, why not scan through the file for line starts
(keeping count, obviously!), and at each line start check whether it is also
the start of a match for the regex you are interested in (one that matches a
class definition, for example)? If you do this, don't forget to either:

  prefix your expression with \A,
or
  pass the match_continuous flag to regex_search.

Either will anchor the search at the start of the line you are checking, and
prevent the whole text from being searched.
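
A rough sketch of that line-by-line scan follows. To keep it self-contained
it uses std::regex from the standard library rather than Boost.Regex; with
Boost the corresponding calls would be boost::regex_search and
boost::match_continuous. Only the flag variant is shown, not the \A prefix.
The helper name find_class_lines and the pattern itself are illustrative
assumptions, and the pattern is deliberately naive (macros, comments and
templates will fool it).

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (line number, name) for every line that
// begins with something that looks like a class or struct declaration.
std::vector<std::pair<std::size_t, std::string>>
find_class_lines(const std::string& text)
{
    // Naive pattern, for illustration only: optional indentation, the
    // keyword, then an identifier. [ \t] is used instead of \s so that a
    // match cannot spill over onto the following line.
    static const std::regex class_re(
        R"([ \t]*(?:class|struct)[ \t]+([A-Za-z_]\w*))");

    std::vector<std::pair<std::size_t, std::string>> hits;
    std::size_t line_no = 1;
    auto line_start = text.cbegin();
    const auto end = text.cend();

    while (line_start != end) {
        std::smatch m;
        // match_continuous: the match must begin exactly at line_start, so
        // a failed attempt does not go on to scan the rest of the text.
        if (std::regex_search(line_start, end, m, class_re,
                              std::regex_constants::match_continuous)) {
            hits.emplace_back(line_no, m[1].str());
        }
        // Move to the character after the next newline, keeping count.
        auto nl = std::find(line_start, end, '\n');
        if (nl == end) break;
        line_start = nl + 1;
        ++line_no;
    }
    return hits;
}
```

Each hit carries the line number on which the candidate declaration starts,
which is the piece of information the lookup tool would need to store.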
John.