From: Oleg Abrosimov (beholder_at_[hidden])
Date: 2006-05-06 14:35:29


Jeff Garland wrote:
> Well, I can buy into the idea that the target is always a 'string'. And
> frankly I've never used any other sort of target/source for these kinds
> of conversion. But you're still not responding to the core of my
> question. The problem is "what's a string"? Some possibilities:
>
> std::wstring
> std::string
> std::basic_string<uchar> //some sort of unicode string
> std::vector<char> //and variants thereof
> sti::rope
> jeffs_really_cool_string_class // ;-)
>
> I think even if the plan is to limit to 'a string' you will need to
> support these sort of options. Which means you need a second template
> argument.
>
> Which probably means you need to clearly define the concept of a string
> w.r.t these conversions. And perhaps that's how it differentiates from
> lexical_cast. That is lexical_cast
> requires Target == InputStreamable, DefaultConstructable and Source ==
> OutputStreamable. What are your concept requirements?
>

Jeff, I didn't respond to your previous mail, where you requested a
detailed comparison between string_cvt and lexical_cast<>, because it is
a serious time investment that I cannot manage right now (I'm really busy
with my PhD work until the end of May). If you are really interested in
this question and cannot wait until June, you can jump to the very
first message in this thread. I believe that all the information is hidden
;-) in that text. Additionally, at the end of this
message, I've attached a copy of the project-related part of my SoC
proposal text.

As for the different types of strings that should all be supported in
some way: the design of the string_cvt library already covers them.
std::basic_string<> and any other sequence that fulfills the Container
requirements would be supported out of the box (except by the top-level
wrappers like (w)string_from). Any other string/container of chars could
be supported by specializing the cvt_traits template and providing simple
functions like qtstring_from() with a trivial implementation forwarding to
basic_string_from<string_type>().

Details are below:

Until now, support for string types other than std::basic_string<>
was provided through the cvt_traits template:

// it can be specialized for user-defined strings to support them
template <typename TCont>
struct cvt_traits {
      typedef typename TCont::value_type char_type;
      typedef typename TCont::traits_type traits_type;
      typedef typename TCont::allocator_type allocator_type;
      static const char_type* const c_str_from(TCont const& s) {
          return s.c_str();
      }
      static TCont from_c_str(const char_type* const cs) {
          return TCont(cs);
      }
};
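
To make the specialization route concrete, here is a minimal sketch. The primary template follows the one above; my_string and its from_ascii()/raw() members are hypothetical names invented only for this illustration (a string class that has neither the container typedefs nor c_str()), not part of string_cvt or of any real library.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Primary template, following the one in the message: works for any type
// that has the std::basic_string<> typedefs and a c_str() member.
template <typename TCont>
struct cvt_traits {
    typedef typename TCont::value_type char_type;
    typedef typename TCont::traits_type traits_type;
    typedef typename TCont::allocator_type allocator_type;
    static const char_type* c_str_from(TCont const& s) { return s.c_str(); }
    static TCont from_c_str(const char_type* cs) { return TCont(cs); }
};

// Hypothetical user-defined string: no value_type, no c_str().
class my_string {
public:
    static my_string from_ascii(const char* cs) {
        my_string result;
        result.data_ = cs;
        return result;
    }
    const char* raw() const { return data_.c_str(); }
private:
    std::string data_;
};

// Full specialization that adapts my_string to the traits interface.
template <>
struct cvt_traits<my_string> {
    typedef char char_type;
    typedef std::char_traits<char> traits_type;
    typedef std::allocator<char> allocator_type;
    static const char_type* c_str_from(my_string const& s) { return s.raw(); }
    static my_string from_c_str(const char_type* cs) {
        return my_string::from_ascii(cs);
    }
};

// Usage: round-trip a C string through the specialized traits.
inline void cvt_traits_example() {
    my_string s = cvt_traits<my_string>::from_c_str("abc");
    assert(std::string(cvt_traits<my_string>::c_str_from(s)) == "abc");
}
```

With such a specialization in place, generic code only ever talks to cvt_traits<TCont> and never to the string class directly.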

and the basic_string_from function template:

     template <typename TCont, typename T>
     inline TCont basic_string_from(
         T const& t
         /* optional parameters, like std::ios_base::fmtflags and std::locale */
     );

It was used to implement the (w)string_from functions:

     template <typename T>
     inline std::(w)string (w)string_from(
         T const& t,
         std::ios_base::fmtflags flags = std::ios_base::dec,
         std::locale const& loc = std::locale::classic()
     )
     {
         return basic_string_from<std::(w)string>(t, flags, loc);
     }

and it can be used to implement conversion to
jeffs_really_cool_string_class:

     template <typename T>
     inline jeffs_really_cool_string_class
     jeffs_really_cool_string_class_from(
         T const& t,
         std::ios_base::fmtflags flags = std::ios_base::dec,
         std::locale const& loc = std::locale::classic()
     )
     {
         return basic_string_from<jeffs_really_cool_string_class>(
             t, flags, loc);
     }

given that cvt_traits is specialized for it.
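
For illustration, the pieces above could fit together roughly as follows. This is only a sketch under assumptions: the body of basic_string_from is my guess at a straightforward std::basic_ostringstream implementation (the message shows only its declaration), and the traits template is reduced to the members the sketch needs.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Reduced version of the traits template from the message.
template <typename TCont>
struct cvt_traits {
    typedef typename TCont::value_type char_type;
    static TCont from_c_str(const char_type* cs) { return TCont(cs); }
};

// Assumed implementation: format through a string stream, then hand the
// characters to the target type via cvt_traits.
template <typename TCont, typename T>
inline TCont basic_string_from(
    T const& t,
    std::ios_base::fmtflags flags = std::ios_base::dec,
    std::locale const& loc = std::locale::classic())
{
    typedef typename cvt_traits<TCont>::char_type char_type;
    std::basic_ostringstream<char_type> stream;
    stream.imbue(loc);    // locale controls numeric punctuation etc.
    stream.flags(flags);  // replaces the stream's formatting flags
    stream << t;
    // The temporary returned by str() lives until the end of this statement.
    return cvt_traits<TCont>::from_c_str(stream.str().c_str());
}

// Top-level wrapper for std::string, as in the message.
template <typename T>
inline std::string string_from(
    T const& t,
    std::ios_base::fmtflags flags = std::ios_base::dec,
    std::locale const& loc = std::locale::classic())
{
    return basic_string_from<std::string>(t, flags, loc);
}

// Usage: default decimal formatting and explicit hexadecimal formatting.
inline void string_from_example() {
    assert(string_from(42) == "42");
    assert(string_from(255, std::ios_base::hex) == "ff");
}
```

Note that this sketch still pays for the temporary string returned by str().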

But now, after a discussion with Alexander Nasonov about implementing
string_from() for the boost::array<> struct, I've realized that it would be
more efficient to use the boost::iostreams library by Jonathan Turkanis.

Until now, the innermost implementation has looked like this:
template <typename T, typename TCont>
void operator() (T const& t, TCont& s) {
     stream << t;
     // a temporary std::basic_string<> is created here
     string_type str = stream.str();
     s = cvt_traits<TCont>::from_c_str(str.c_str());
}

This is suboptimal because a temporary string object is created.
The string_convert library by Martin Adrian uses a special stream type with
access to its internal character buffer, but that is not optimal either,
because additional copying from the internal buffer is required (if I
understand it correctly).

If I'm not missing something, this can be improved with boost::iostreams.
For std::basic_string<>, boost::array<>, std::vector<> or any other
container (in the STL sense), the cvt_traits template can be defined as:

// it can be specialized for user-defined strings to support them
template <typename TCont>
struct cvt_traits {
      // may be redundant because of the char_type typedef in
      // container_source/sink
      typedef typename TCont::value_type char_type;
      typedef typename TCont::traits_type traits_type;
      typedef typename TCont::allocator_type allocator_type;

      typedef container_source<TCont> source_type;
      typedef container_sink<TCont> sink_type;
};

where container_source/container_sink are defined as in boost::iostreams
tutorial.

With these new traits, the internal conversion code would become:

template <typename T, typename TCont>
void operator() (T const& t, TCont& s) {
     boost::iostreams::stream<typename cvt_traits<TCont>::sink_type>
         stream(s);
     stream << t;
}
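
The same "write straight into the target container" idea can be sketched with the standard library alone. To be clear, this is not boost::iostreams and not the tutorial's container_sink: container_appendbuf and convert_into are hypothetical names, and a hand-written std::basic_streambuf is swapped in for the Boost sink so that the example is self-contained. It only assumes that the container has value_type, push_back() and range insert(), so it leaves out boost::array<> and does not need traits_type or allocator_type.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical stand-in for a container sink: a stream buffer that appends
// every written character directly to the target container.
template <typename TCont>
class container_appendbuf
    : public std::basic_streambuf<typename TCont::value_type> {
    typedef std::basic_streambuf<typename TCont::value_type> base_type;
public:
    typedef typename base_type::char_type char_type;
    typedef typename base_type::int_type int_type;
    typedef typename base_type::traits_type traits_type;

    explicit container_appendbuf(TCont& target) : target_(target) {}

protected:
    // Single-character path: no put area is set, so sputc() lands here.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            target_.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }
    // Bulk path used by sputn().
    std::streamsize xsputn(const char_type* s, std::streamsize n) override {
        target_.insert(target_.end(), s, s + n);
        return n;
    }

private:
    TCont& target_;
};

// Formats t directly into s; no intermediate std::basic_string<> is built.
template <typename T, typename TCont>
void convert_into(T const& t, TCont& s) {
    container_appendbuf<TCont> buf(s);
    std::basic_ostream<typename TCont::value_type> stream(&buf);
    stream << t;
}

// Usage: the same code fills a std::vector<char> and a std::string.
inline void convert_into_example() {
    std::vector<char> v;
    convert_into(42, v);
    assert(std::string(v.begin(), v.end()) == "42");

    std::string s;
    convert_into(3.5, s);
    assert(s == "3.5");
}
```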

1) I believe this is optimal for performance, because the stream writes
directly into the target container without an intermediate string;
2) by providing an appropriate specialization of cvt_traits, one can enable
the string_cvt library for any string type.

The text of my SoC proposal is below:

In the current C++ standard there is no simple way to do string-to-type
and type-to-string conversions. With std::iostreams, the conversion
of some object to a string looks like this:

std::stringstream ss;
ss << object;
std::string s = ss.str();

That is too heavy compared to the simple:

String s = object.toString();

in Java, for example.

At the same time, most C++ programmers perform these conversions on a
daily basis. The reason is simple: most user input is a string that should
be interpreted as an object of some type, and most program output is text
in some form, produced from objects.

Two proposals were made to the standards committee to solve this usability
problem for “occasional” users (in Bjarne Stroustrup's terms):
n1803 (Simple Numeric Access) and
n1973 (Lexical Conversion Library Proposal for TR2).
Both of these proposals failed to provide a consistent and extensible
string conversion solution for C++ users.
“Consistent” here means that there is no longer any need to fall back to
low-level C-library routines to achieve better performance,
or to std::iostreams to achieve better control over formatting or locale
handling.
Furthermore, having two different tools to achieve the same goal would
be confusing for most “occasional” C++ users.

***

This project aims to solve the whole string conversion problem for C++
with minimal runtime and syntactic overhead, in the form of a library
wrapper around existing C++ library facilities such as std::iostreams
and the C-library subset recommended in the n1803 proposal.
The library would be implemented as a Boost library.
After approval by the Boost community, a proposal for TR2 will be made
to replace the n1803 and n1973 proposals and to push the resulting library
into a future version of the C++ standard.

Below is a list of requirements for the string conversion components of
the proposed library
(these requirements were collected from Boost developers’ critiques
of the n1973 proposal and from the n1973 proposal itself):
0) a symmetrical approach to type-to-string and string-to-type conversions;
1) low syntactic overhead (short, but self-descriptive);
2) conversions can be controlled via facets (locales);
3) the full power of iostreams in a simple interface. All functionality
accessible with iostreams (through manipulators) should be accessible;
4) functor adapters for use with std algorithms;
5) error handling and reporting (what kind of error occurred?);
    * optionally, reporting failure without raising exceptions;
6) performance comparable to low-level “C” constructs (for built-in types);
7) the ability to tune performance when used in loops.
***

As a result of this project, the C++ community would have:
1) a fully documented and tested library for string conversions under
the Boost umbrella;
2) a full comparative performance analysis, made
to ensure that there is no longer any need to fall back to
low-level C-library functions to achieve better performance;
3) a new proposal to the C++ standards committee for TR2
(deadline in October 2006).

***

The project timeline would be:
June) Developing the library.
July) Writing docs and a full test suite. Testing and tuning performance.
August) Pushing it through the Boost review process. Addressing issues if raised.
Preparing the proposal for TR2.

