From: Jeff Garland (jeff_at_[hidden])
Date: 2006-07-01 21:03:36


I've been working on a little project where I've had to do lots of string
processing, so I decided to put together a string type that wraps up
boost.regex and boost.string_algo. I also remember a discussion in the LWG
about whether the various string algorithms should be built in or not --
well, consider this a test. Personally, I find them easier to use when
built into the string than as standalone functions.

You can download it from the string/text processing part of the vault:

http://tinyurl.com/dbcye

Below is the summary and motivating code example.

Enjoy,

Jeff
--------------------------------------------------------------------------
Souped up string class that includes fancy query, replacement, and conversion
functions.

This type has the following main goals:

     * Be a drop-in replacement convertible to std::string and std::wstring
     * Provide case conversions and case-insensitive comparison
     * Provide whitespace trimming functions
     * Provide split functions to parse a string into pieces based on a string
       or regex
     * Provide sophisticated text replacement functions based on strings or
       regex
     * Provide append and insert functions for any streamable type
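
To make the last goal concrete, here is a minimal sketch of how appending or
inserting "any streamable type" could be done with only the standard library.
This is not the super_string implementation (that is in the vault download);
the names append_streamable and insert_streamable_at are hypothetical and
chosen just for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of super_string): format any value that
// supports operator<< and append the resulting text to s.
template <typename T>
std::string& append_streamable(std::string& s, const T& value) {
    std::ostringstream oss;
    oss << value;          // uses the stream's default formatting
    s += oss.str();
    return s;
}

// Hypothetical helper (not part of super_string): format the value and
// insert the resulting text at character position pos.
template <typename T>
std::string& insert_streamable_at(std::string& s, std::size_t pos,
                                  const T& value) {
    assert(pos <= s.size());   // pos must lie inside the string or at its end
    std::ostringstream oss;
    oss << value;
    s.insert(pos, oss.str());
    return s;
}
```

With these, append_streamable(s, 1.23456) on a string holding "ABCDEF" leaves
"ABCDEF1.23456" in s, since the default stream precision is six significant
digits.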

Overall, this class is mostly a convenience wrapper around functions available
in boost.string_algo and boost.regex. This is best illustrated with some code:

   super_string s(" (456789) [123] 2006-10-01 abcdef ");
   s.to_upper();
   cout << s << endl;

   s.trim(); //lop off the whitespace on both sides
   cout << s << endl;

   double dbl = 1.23456;
   s.append(dbl); //append any streamable type
   s+= " ";
   cout << s << endl;

   date d(2006, Jul, 1);
   s.insert_at(28, d); //insert any streamable type
   cout << s << endl;

   //find the yyyy-mm-dd date format
   if (s.contains_regex("\\d{4}-\\d{2}-\\d{2}")) {
     //replace parens around digits with square brackets [the digits]
     s.replace_all_regex("\\(([0-9]+)\\)", "__[$1]__");
     cout << s << endl;

     //split the string on white space to process parts
     super_string::string_vector out_vec;
     unsigned int count = s.split_regex("\\s+", out_vec);
     if (count) {
       for(size_t i=0; i < out_vec.size(); ++i) { //size_t avoids a signed/unsigned comparison
         out_vec[i].replace_first("__",""); //get rid of first __ in string
         cout << i << " " << out_vec[i] << endl;
       }
     }
   }

   //wide strings too...
   wsuper_string ws(L" hello world ");
   ws.trim_left();
   wcout << ws << endl;

Expected output is:

     (456789) [123] 2006-10-01 ABCDEF
(456789) [123] 2006-10-01 ABCDEF
(456789) [123] 2006-10-01 ABCDEF1.23456
(456789) [123] 2006-10-01 2006-Jul-01 ABCDEF1.23456
__[456789]__ [123] 2006-10-01 2006-Jul-01 ABCDEF1.23456
0 [456789]__
1 [123]
2 2006-10-01
3 2006-Jul-01
4 ABCDEF1.23456
hello world
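
For comparison with the built-in style, here is a rough sketch of the regex
steps from the example written as standalone functions. It is an assumption
for illustration only: it uses std::regex (C++11) in place of boost.regex, the
free functions merely borrow the member-function names used above, and edge
cases such as leading separators may behave differently from super_string.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// True if the pattern matches anywhere in s.
bool contains_regex(const std::string& s, const std::string& pattern) {
    return std::regex_search(s, std::regex(pattern));
}

// Return a copy of s with every match replaced; fmt may use $1, $2, ...
// to refer to capture groups (default ECMAScript format rules).
std::string replace_all_regex(const std::string& s,
                              const std::string& pattern,
                              const std::string& fmt) {
    return std::regex_replace(s, std::regex(pattern), fmt);
}

// Split s on every match of the pattern, store the pieces in out,
// and return how many pieces were produced.
std::size_t split_regex(const std::string& s, const std::string& pattern,
                        std::vector<std::string>& out) {
    const std::regex re(pattern);  // must outlive the iterators below
    out.clear();
    // Submatch index -1 selects the text between matches.
    std::sregex_token_iterator it(s.begin(), s.end(), re, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        out.push_back(it->str());
    }
    return out.size();
}
```

Every call has to pass the string in and take the result back out, which is
the extra ceremony the built-in member functions are meant to avoid.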

