From: Jeroen (jeroen_at_[hidden])
Date: 2001-08-27 12:11:43


I'm fairly new to Boost, but the following seems to work:

#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>

int main() {
   using namespace std;    // string and cout live in std, not in boost
   using namespace boost;
   string s = "$GPGSA,A,3,25,04,16,06,18,24,,,,,,,1.86,0.92,1.62*0E";
   // true: return the delimiters '$', ',' and '*' as tokens of their own
   char_delimiters_separator<char> d(true, "$,*", "");
   typedef tokenizer<char_delimiters_separator<char> > tokenizer_t;
   tokenizer_t tok(s, d);
   for (tokenizer_t::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
       cout << *beg << "\n";
   }
}

The output includes the separators, which makes it simple to detect an
<empty> token: there is one whenever the current and the previous token are
both separators.
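
A minimal sketch of that check, assuming the tokens (separators included)
have already been copied into a std::vector<std::string>. The helper names
is_separator and fields_from_tokens are hypothetical and not part of Boost:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `token` is exactly one character from `delims`.
bool is_separator(const std::string& token, const std::string& delims) {
    return token.size() == 1 && delims.find(token[0]) != std::string::npos;
}

// Hypothetical helper: turn a token stream that still contains the
// separators into a list of fields. An empty field is inserted whenever
// the current and the previous token are both separators.
std::vector<std::string> fields_from_tokens(const std::vector<std::string>& tokens,
                                            const std::string& delims) {
    std::vector<std::string> fields;
    bool prev_was_separator = false;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        const bool cur_is_separator = is_separator(tokens[i], delims);
        if (cur_is_separator) {
            if (prev_was_separator) {
                fields.push_back(std::string());  // <empty> field
            }
        } else {
            fields.push_back(tokens[i]);
        }
        prev_was_separator = cur_is_separator;
    }
    return fields;
}
```

Called with the tokens from the loop above and "$,*" as delims, this drops
the leading '$' and yields one empty string per pair of adjacent commas.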

Jeroen

-----Original Message-----
From: johan.nilsson_at_[hidden] [mailto:johan.nilsson_at_[hidden]]
Sent: Sunday, August 26, 2001 11:50 PM
To: boost_at_[hidden]
Subject: RE: [boost] Re: Tokenizer and empty tokens

Thanks,

that works for my previous example, but wasn't very intuitive (at least not
for me :).

Unfortunately, it didn't work for what I'm trying to do: parsing NMEA-0183
strings, in which sentences are allowed to have empty fields. A
'real-world' example: from the following NMEA sentence

---
$GPGSA,A,3,25,04,16,06,18,24,,,,,,,1.86,0.92,1.62*0E\r\n
---
... I'd like to get the following tokens extracted:
---
GPGSA
A
3
25
04
16
06
18
24
<empty>
<empty>
<empty>
<empty>
<empty>
<empty>
1.86
0.92
1.62
0E
---
I.e., I'd like to be able to use '$', ',' and '*' as separators (I can simply
remove the "\r\n" before tokenizing). How about adding some kind of policy
class to the library to allow extraction of empty tokens?
// Johan
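
For illustration only, the behaviour asked for here can be sketched without
Boost. The function split_keep_empty is a hypothetical helper, not a
tokenizer policy and not part of the library. It splits at every delimiter
character and keeps the empty fields between adjacent delimiters:

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost: split `s` at every character in
// `delims`, keeping the empty fields between adjacent delimiters.
std::vector<std::string> split_keep_empty(const std::string& s,
                                          const std::string& delims) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = s.find_first_of(delims, start);
        if (pos == std::string::npos) {
            fields.push_back(s.substr(start));  // last field, may be empty
            break;
        }
        fields.push_back(s.substr(start, pos - start));
        start = pos + 1;  // continue after the delimiter
    }
    return fields;
}
```

Note that for an NMEA sentence the leading '$' produces an empty first
field, so a caller would skip element 0 to get the token list shown above.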
> -----Original Message-----
> From: jbandela_at_[hidden] [mailto:jbandela_at_[hidden]]
> Sent: den 24 augusti 2001 22:38
> To: boost_at_[hidden]
> Subject: [boost] Re: Tokenizer and empty tokens
>
>
> The following should work for your example
>
> string str("one,two,,four");
> tokenizer<escaped_list_separator<char> > tok(str);
> copy(tok.begin(), tok.end(), ostream_iterator<string>(cout, "\n"));
>
> This should do what you want provided you do not have quotes or
> backslashes embedded inside those commas.
>
>
> Regards,
>
> John R. Bandela
>
>
> --- In boost_at_y..., johan.nilsson_at_e... wrote:
> > [first of all, sorry if this results in a repost from yesterday]
> >
> > Hi,
> >
> > is it possible to use the tokenizer to get (embedded) empty tokens
> > extracted? I tried to search the archives on this; couldn't find
> > anything.
> >
> > E.g.:
> >
> > ---
> >
> > string str("one,two,,four");
> > tokenizer<> tok(str);
> > copy(tok.begin(), tok.end(), ostream_iterator<string>(cout, "\n"));
> >
> > ---
> >
> > Would render something like the following:
> >
> > ---
> > one
> > two
> >
> > four
> > ---
> >
> > // Johan
>
>
> Info: http://www.boost.org  Unsubscribe:
> <mailto:boost-unsubscribe_at_[hidden]>
>
> Your use of Yahoo! Groups is subject to
> http://docs.yahoo.com/info/terms/
>
>

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk