From: Jeff Garland (jeff_at_[hidden])
Date: 2006-07-01 21:03:36
I've been working on a little project where I've had to do lots of string
processing, so I decided to put together a string type that wraps up
boost.regex and boost.string_algo. I also remember a discussion in the LWG
about whether the various string algorithms should be built in or not --
well, consider this a test -- personally I find them easier to use when
built into the string than as standalone functions.
You can download it from the string/text processing part of the vault:
Below is the summary and motivating code example.
Enjoy,
Jeff
--------------------------------------------------------------------------
Souped up string class that includes fancy query, replacement, and conversion
functions.
This type has the following main goals:
* Is a drop-in replacement convertible to std::string and std::wstring
* Provide case conversions and case-insensitive comparison
* Provide whitespace trimming functions
* Provide split functions to parse a string into pieces based on a string
or regex
* Provide sophisticated text replacement functions based on strings or
regex
* Provide append and insert functions for any streamable type
Overall, this class is mostly a convenience wrapper around functions available
in boost.string_algo and boost.regex. This is best illustrated with some code:
super_string s(" (456789) [123] 2006-10-01 abcdef ");
s.to_upper();
cout << s << endl;
s.trim(); //lop off the whitespace on both sides
cout << s << endl;
double dbl = 1.23456;
s.append(dbl); //append any streamable type
s+= " ";
cout << s << endl;
date d(2006, Jul, 1);
s.insert_at(28, d); //insert any streamable type
cout << s << endl;
//find the yyyy-mm-dd date format
if (s.contains_regex("\\d{4}-\\d{2}-\\d{2}")) {
  //replace parens around digits with square brackets [the digits]
  s.replace_all_regex("\\(([0-9]+)\\)", "__[$1]__");
  cout << s << endl;
  //split the string on white space to process parts
  super_string::string_vector out_vec;
  unsigned int count = s.split_regex("\\s+", out_vec);
  if (count) {
    //unsigned index avoids a signed/unsigned comparison with size()
    for(unsigned int i=0; i < out_vec.size(); ++i) {
      out_vec[i].replace_first("__",""); //get rid of first __ in string
      cout << i << " " << out_vec[i] << endl;
    }
  }
}
//wide strings too...
wsuper_string ws(L" hello world ");
ws.trim_left();
wcout << ws << endl;
Expected output is:
(456789) [123] 2006-10-01 ABCDEF
(456789) [123] 2006-10-01 ABCDEF
(456789) [123] 2006-10-01 ABCDEF1.23456
(456789) [123] 2006-10-01 2006-Jul-01 ABCDEF1.23456
__[456789]__ [123] 2006-10-01 2006-Jul-01 ABCDEF1.23456
0 [456789]__
1 [123]
2 2006-10-01
3 2006-Jul-01
4 ABCDEF1.23456
hello world
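
For readers who want to see the wrapper idea in isolation, here is a rough
sketch that assumes only the C++11 standard library (std::regex and
std::string) rather than boost.string_algo and boost.regex. The class name
mini_string and its str() accessor are hypothetical and are not taken from
super_string; only the member names used in the example above are mirrored.
It is meant to show how each member can be a thin forwarder to an existing
algorithm, not how super_string is actually implemented.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical illustration only: mini_string is NOT super_string.
// Each member forwards to a standard-library algorithm.
class mini_string {
public:
    typedef std::vector<mini_string> string_vector;

    mini_string() {}
    mini_string(const std::string& s) : data_(s) {}
    mini_string(const char* s) : data_(s) {}

    // Read-only access to the wrapped std::string (assumed accessor name).
    const std::string& str() const { return data_; }

    // In-place upper-casing via std::toupper in the current C locale.
    void to_upper() {
        std::transform(data_.begin(), data_.end(), data_.begin(),
            [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    }

    // Remove whitespace from both ends.
    void trim() {
        const char* ws = " \t\n\r\f\v";
        const std::size_t first = data_.find_first_not_of(ws);
        if (first == std::string::npos) { data_.clear(); return; }
        const std::size_t last = data_.find_last_not_of(ws);
        data_ = data_.substr(first, last - first + 1);
    }

    // Append any type that can be written to a std::ostream.
    template <typename T>
    void append(const T& value) {
        std::ostringstream os;
        os << value;
        data_ += os.str();
    }

    // True if the pattern matches anywhere in the string.
    bool contains_regex(const std::string& pattern) const {
        return std::regex_search(data_, std::regex(pattern));
    }

    // Replace every match; fmt may use $1, $2, ... for capture groups.
    void replace_all_regex(const std::string& pattern, const std::string& fmt) {
        data_ = std::regex_replace(data_, std::regex(pattern), fmt);
    }

    // Split on the pattern, append the pieces to out, return out.size().
    std::size_t split_regex(const std::string& pattern, string_vector& out) const {
        const std::regex re(pattern);
        // Submatch index -1 selects the text between matches.
        std::sregex_token_iterator it(data_.begin(), data_.end(), re, -1), end;
        for (; it != end; ++it) out.push_back(mini_string(it->str()));
        return out.size();
    }

private:
    std::string data_;
};
```

With this sketch, the calls s.to_upper(); s.trim(); s.append(1.23456); read
the same way as in the example above, while each body stays a line or two
of forwarding code. Note that a string beginning with the delimiter would
give an empty first piece from split_regex, so trim first if that matters.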