From: Matthew Vogt (mattvogt_at_[hidden])
Date: 2005-03-09 02:27:54


On Wed, 2 Mar 2005 21:03:43 -0700, "Jonathan Turkanis"
<technews_at_[hidden]> said:
> Hi All,

Hi. Sorry I'm replying to this thread so late!

[ snip ]
> My preferred solution is to have get() return an instance of a
> specialization of a class template basic_character which can hold a
> character, an EOF indication or a temporary failure indication:
>
> template<typename Ch>
> class basic_character {
> public:
>     basic_character(Ch c);
>     operator Ch () const;
>     bool good() const;
>     bool eof() const;
>     bool fail() const;
> };
>
> typedef basic_character<char> character;
> typedef basic_character<wchar_t> wcharacter;
>
> character eof(); // returns an EOF indication
> character fail(); // returns a temporary failure indication.
>
> wcharacter weof();
> wcharacter wfail();
>
> [Omitted: templated versions of eof and fail]
>

How about giving the character class a safe conversion to bool which
returns (!fail() && !eof())?
All the filter code I've seen (mostly yours, admittedly :) ) calls 'get'
in a while loop; how about instead of checking for 'good' status all the
time, as in this code:

> class uncommenting_input_filter : public input_filter {
> public:
>     explicit uncommenting_input_filter(char comment_char = '#')
>         : comment_char_(comment_char) { }
>
>     template<typename Source>
>     character get(Source& src)
>     {
>         character c = boost::io::get(src);
>         if (c.good() && c == comment_char_)
>             while (c.good() && c != '\n')
>                 c = boost::io::get(src);
>         return c;
>     }
> private:
>     char comment_char_;
> };
>

make it part of the loop continuation test, where the occurrence of EOF
or EAGAIN terminates the loop.
(BTW, this example is meant to take into account the
call-again-after-EAGAIN issue.)

[Warning - don't try to compile this...]

struct uncommenting_input_filter : public input_filter
{
  explicit uncommenting_input_filter(char comment_char) :
    comment_char_(comment_char), in_comment_(false) {}

  template<typename Source>
  character get(Source& src)
  {
    character c;
    if (in_comment_)
    {
      // Parenthesised: '&&' binds tighter than '='
      while (in_comment_ && (c = boost::io::get(src)))
      {
        // c is not EOF or EAGAIN
        if (c == '\n')
        {
          in_comment_ = false;
        }
      }
      // c is the '\n' that ended the comment, or EOF/EAGAIN
      return c;
    }

    if (c = boost::io::get(src))
    {
      // c is not EOF or EAGAIN
      if (c == comment_char_)
      {
        in_comment_ = true;
        return this->get(src);
      }
    }
    return c;
  }
private:
  char comment_char_;
  bool in_comment_;   // true while skipping to the end of a comment
};
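
A minimal sketch of the boolean conversion the filter above relies on.
This is an assumption rather than the interface declared in the quoted
proposal: the default constructor, the state enum and the
make_eof/make_fail helpers are hypothetical, string_source is only a
stand-in for a real Source, and explicit operator bool requires C++11
(the safe-bool idiom would fill the same role on older compilers).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the proposed basic_character. The state enum,
// default constructor and make_* helpers are illustrative assumptions.
template<typename Ch>
class basic_character {
public:
    enum state { good_state, eof_state, fail_state };

    basic_character() : ch_(), state_(fail_state) {}
    basic_character(Ch c) : ch_(c), state_(good_state) {}

    static basic_character make_eof()  { return basic_character(eof_state); }
    static basic_character make_fail() { return basic_character(fail_state); }

    operator Ch() const { return ch_; }
    bool good() const { return state_ == good_state; }
    bool eof()  const { return state_ == eof_state; }
    bool fail() const { return state_ == fail_state; }

    // True only for a real character, i.e. !fail() && !eof().
    explicit operator bool() const { return good(); }

private:
    explicit basic_character(state s) : ch_(), state_(s) {}

    Ch ch_;
    state state_;
};

typedef basic_character<char> character;

// Hypothetical source: yields the characters of a string, then EOF.
struct string_source {
    std::string data;
    std::size_t pos;

    explicit string_source(const std::string& s) : data(s), pos(0) {}

    character get()
    {
        return pos < data.size() ? character(data[pos++])
                                 : character::make_eof();
    }
};

// Copies characters until the first non-good result and returns that result,
// so the caller can still tell EOF from a temporary failure.
inline character copy_all(string_source& src, std::string& out)
{
    character c;
    while ((c = src.get()))   // loop ends on EOF or temporary failure
        out.push_back(c);
    return c;
}
```

With this in place the loop test alone decides whether there is a
character to process; eof() and fail() only matter to code that needs to
know why the loop stopped.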

> Similarly, usenet_filter::get (http://tinyurl.com/6xqvk) could be
> rewritten:

[ snip ]

  template<typename Source>
  character get(Source& src)
  {
    if (current_word_complete_ || eof_)
    {
      // Return any characters we buffered
      if (current_word_.size())
      {
        character next = *current_word_.begin();
        current_word_.erase(current_word_.begin());
        return next;
      }
      else
      {
        if (eof_)
          return eof();
        else
          current_word_complete_ = false;
      }
    }

    character c;
    while (c = boost::io::get(src))
    {
      // c is not EOF or EAGAIN
      if (is_alpha(c))
      {
        current_word_.push_back(c);
      }
      else if (current_word_.size())
      {
        map_type::iterator it = dictionary_.find(current_word_);
        if (it != dictionary_.end())
        {
          current_word_ = (*it).second;
        }
        current_word_complete_ = true;
        return this->get(src);
      }
      else return c;
    }

    // c is EOF or EAGAIN
    if (c.eof() && current_word_.size())
    {
      eof_ = true;
      return this->get(src);
    }
    else return c;
  }

In this mode, EOF and EAGAIN handling both disappear unless you're doing
something clever like buffering, since in both cases the filter doesn't
want to do anything with the character received except return it to the
caller.
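
To illustrate the point for a filter with no buffering, here is a hedged
sketch of a stateless filter. The compact character class, string_source
and toupper_filter are all hypothetical names invented for this example;
none of them is part of the library.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, compact stand-in for the proposed character type.
class character {
public:
    character() : ch_(0), eof_(false), fail_(true) {}
    character(char c) : ch_(c), eof_(false), fail_(false) {}

    static character make_eof()
    {
        character c;
        c.fail_ = false;
        c.eof_ = true;
        return c;
    }
    static character make_fail() { return character(); }

    operator char() const { return ch_; }
    bool eof() const { return eof_; }
    bool fail() const { return fail_; }
    explicit operator bool() const { return !fail_ && !eof_; }

private:
    char ch_;
    bool eof_;
    bool fail_;
};

// Hypothetical source over a string; reports EOF when exhausted.
struct string_source {
    std::string data;
    std::size_t pos;

    explicit string_source(const std::string& s) : data(s), pos(0) {}

    character get()
    {
        return pos < data.size() ? character(data[pos++])
                                 : character::make_eof();
    }
};

// Stateless filter: only a good character is inspected. EOF and temporary
// failure are handed back to the caller untouched.
struct toupper_filter {
    template<typename Source>
    character get(Source& src)
    {
        character c = src.get();
        if (c) {
            char ch = c;
            c = character(static_cast<char>(
                    std::toupper(static_cast<unsigned char>(ch))));
        }
        return c;
    }
};
```

The filter body contains a single test on the character and no explicit
mention of EOF or EAGAIN.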

> V. Problems ----------------------------
>
> 1. Harder to learn. Currently the function get and the concept
> InputFilter are very easy to explain. I'm afraid having to understand
> the basic_character template before learning these functions will
> discourage people from using the library.
>

If you rely on the boolean conversion, you often won't need to care
whether the character is good(), fail() or otherwise.

> 2. Harder to use. Having to check for eof and fail makes writing simple
> filters, like the above, slightly harder. I'm worried that the effect on
> more complex filters may be even worse. This applies not just to get, but
> to the other functions as well, since their return values will require
> more careful examination.
>

Actually, moving the algorithm state out of the single 'get' call is the
real complication...
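
A self-contained sketch of what that state looks like in practice,
assuming a source that reports one temporary failure in the middle of a
comment. Everything here is hypothetical scaffolding for illustration:
the compact character class, flaky_source, uncommenting_filter and the
drain helper are not library names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, compact stand-in for the proposed character type.
class character {
public:
    character() : ch_(0), eof_(false), fail_(true) {}
    character(char c) : ch_(c), eof_(false), fail_(false) {}

    static character make_eof()
    {
        character c;
        c.fail_ = false;
        c.eof_ = true;
        return c;
    }
    static character make_fail() { return character(); }

    operator char() const { return ch_; }
    bool eof() const { return eof_; }
    bool fail() const { return fail_; }
    explicit operator bool() const { return !fail_ && !eof_; }

private:
    char ch_;
    bool eof_;
    bool fail_;
};

// Hypothetical source that reports a single temporary failure when it
// reaches offset stall_at, mimicking EAGAIN from a non-blocking device.
struct flaky_source {
    std::string data;
    std::size_t pos;
    std::size_t stall_at;
    bool stalled;

    flaky_source(const std::string& s, std::size_t stall)
        : data(s), pos(0), stall_at(stall), stalled(false) {}

    character get()
    {
        if (pos == stall_at && !stalled) {
            stalled = true;
            return character::make_fail();
        }
        if (pos >= data.size())
            return character::make_eof();
        return character(data[pos++]);
    }
};

// Comment stripper whose "inside a comment" flag lives in the filter
// object, so a call that ends in a temporary failure can be resumed.
class uncommenting_filter {
public:
    explicit uncommenting_filter(char comment_char = '#')
        : comment_char_(comment_char), in_comment_(false) {}

    template<typename Source>
    character get(Source& src)
    {
        character c;
        while ((c = src.get())) {
            if (in_comment_) {
                if (c == '\n') {          // newline ends the comment
                    in_comment_ = false;
                    return c;
                }
            } else if (c == comment_char_) {
                in_comment_ = true;       // start skipping
            } else {
                return c;
            }
        }
        return c;  // EOF or temporary failure; in_comment_ is preserved
    }

private:
    char comment_char_;
    bool in_comment_;
};

// Pulls from the filter until EOF, calling again after each temporary
// failure and counting how many retries were needed.
template<typename Source>
std::string drain(uncommenting_filter& f, Source& src, int& retries)
{
    std::string out;
    for (;;) {
        character c = f.get(src);
        if (c)
            out.push_back(c);
        else if (c.eof())
            return out;
        else
            ++retries;
    }
}
```

Had in_comment_ been a local variable of get(), the retry after the
temporary failure would resume outside the comment and emit the rest of
the commented text.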

Now, there may be any number of reasons why this is unworkable - I'm
sorry that I haven't had a chance to try the idea on your library before
posting...

> Please let me know your opinion.
>
> Jonathan

Hope this helps in some way,
Matt

-- 
  Matthew Vogt
  mattvogt_at_[hidden]
