Subject: Re: [Boost-users] regular expression to extract numbers
From: Ted Byers (r.ted.byers_at_[hidden])
Date: 2013-01-29 14:57:38


On Tue, Jan 29, 2013 at 2:24 PM, Neil Sutton <neilmsutton_at_[hidden]> wrote:

> I am writing a very simple program that extracts numbers from a string.
> The numbers are actually lottery numbers.
> So far, my program connects to a certain url and downloads a file that
> contains the latest lottery results. I have managed to reach the point
> where the barest amount of relevant data is contained in a std::string.
>
> The data is in the following format - though of course the date and
> numbers vary:
>
> 26-Jan-2013,2,6,21,29,34,47,11,X,X
>
> Note I am not interested - at this stage - in the last two numbers
> represented by X,X. I am only interested in the first seven numbers
> following the date.
>
> So I figured that it should be easy to write a regular expression to match
> this pattern:
>
> boost::regex pattern("\d\d\d\d,\\>(\\d{1,2})\\<,");
>
I do not know regex well enough to say whether it can be the basis for the
'fastest' implementation. In my own experiments I have seen an order of
magnitude difference in performance between the fastest and slowest
algorithms for the same task, even when all of them meet the functional
requirements correctly. But if the only consideration right now is to get
it working, why not look at what Boost already offers? It has a tokenizer
(http://www.boost.org/doc/libs/1_52_0/libs/tokenizer/) that, once you know
how to use it, may eliminate the need to roll your own. It also has a
split function in the string algorithms library
(http://www.boost.org/doc/libs/1_52_0/doc/html/string_algo.html). In both
cases, you would just split your example string on the comma. The first
element extracted would be your date, the next seven your numbers, and the
last two the X,X fields you can ignore.

HTH

Ted


