Boost logo

Boost :

From: Daryle Walker (darylew_at_[hidden])
Date: 2005-09-29 20:50:55


On 9/28/05 11:28 AM, "Robert Ramey" <ramey_at_[hidden]> wrote:

>> Is there any reason why it was done this way? For threading issues?
>
> LOL, it is done that way because
>
> a) that was the first way it occurred to me to do it
> b) it worked
> c) it didn't occur to me when I wrote it that there might be a performance
> issue

There a lot more issues than that.

> Of course, now that you mention it ...
>
> My first inclination would be to use
>
> const unsigned char lookup_table[] = { ...
>
> Though I'm not sure that every compiler will exploit the const-ness to avoid
> re-initalizing lookup_table[] every time.

As another poster said, you could make the "lookup_table" static too.

> Thanks for a very astute observation.

But you got other problems, here's the entire class in
<boost/archive/impl/basic_xml_oarchive.ipp>

//========================================================================
//...
namespace boost {
namespace archive {

namespace detail {
template<class CharType>
struct XML_name {
    typedef bool result_type;
    void operator()(CharType t) const{
        unsigned char lookup_table[] = {
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0, // -.
            1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0, // 0-9
            0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // A-
            1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1, // -Z _
            0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // a-
            1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0, // -z
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
        };
        if((unsigned)t > 127)
            return;
        if(0 == lookup_table[(unsigned)t])
            boost::throw_exception(
                xml_archive_exception(
                    xml_archive_exception::xml_archive_tag_name_error
                )
            );
    }
};

//...
//========================================================================

A static-const table would have to be created for each version of XML_name,
but (in the code later in the file) you only use a "const char" version of
the template. So why bother making this a template at all? Believe it or
not, Boost code doesn't _have_ to be made of templates.

* Why do you use "const char" instead of just "char" in your instantiations?

* result_type doesn't match what operator() returns.

* If the table is just 0's and 1's, why not just use "bool"? Is it because
"bool" can be bigger than a byte?

* The code assumes that CharType is compatible with "unsigned". What if it
isn't? This suggests to be that you should make a non-template function
that optimizes the code for "char" and make the class template call that
function (with appropriate conversions).

* The code assumes that CharType's code points are ASCII compatible.

* XML is supposed to be Unicode-friendly. What about non-ASCII characters
that could be used in XML names?

* As another poster said, what's wrong with using functions like "isalnum"?
If you're afraid it may include non-ASCII letters and numbers, then just use
a giant switch statement. Hopefully compiler makers haven't forgotten how
to optimize those (usually into a table).

* Why do bad values outside the range simply return while ones inside the
range throw?

> Sean Huang wrote:
>> I ran across the implementation of XML_name. It is a functor used to
>> validate each single character in an XML name. The initialization of a
>> stack-based look-up table everytime the functor is called puzzles me
>> (see excerpt below).
>>
>> void operator()(CharType t) const{
>> unsigned char lookup_table[] = {
>> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>> 0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0, // -.
>> 1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0, // 0-9
>> 0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // A-
>> 1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1, // -Z _
>> 0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // a-
>> 1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0, // -z
>> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
>> };
>> if((unsigned)t > 127)
>> return;
>> if(0 == lookup_table[(unsigned)t])
>> boost::throw_exception(
>> xml_archive_exception(
>> xml_archive_exception::xml_archive_tag_name_error
>> )
>> );
>> }
>>
>> Is there any reason why it was done this way? For threading issues?

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk