Subject: Re: [boost] RFC: edit_distance / edit_alignment library
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2013-07-24 06:36:40


Erik Erlandson wrote:
> I am grappling with how best to represent the returned "edit
> script".

"Best" obviously depends on what the caller is going to do with it.
A good design for the interface should emerge once some experience
has been gained with using the algorithm in actual applications.
Attempting to design a general-purpose interface too soon can be
a mistake.

Having said that, the most generic way to return the edit script
would be to template the algorithm on an output-handling class:

template <typename ITER1, typename ITER2, typename OUTPUT>
void diff(ITER1 begin1, ITER1 end1, ITER2 begin2, ITER2 end2, OUTPUT& output)
{
  .......
  // eventually calls object's methods, something like this:
  output.from_1(i,j); // "deletion", present in 1 but not in 2
  output.from_2(p,q); // "insertion", present in 2 but not in 1
  output.from_both(w,x, y,z); // common subsequence present in both
}

Used as follows:

#include <iostream>
#include <string>

// Write edit script to cout in the style of the diff program
class diff_output
{
public:
  template <typename ITER>
  void from_1(ITER a, ITER b)
  {
    std::cout << "< " << std::string(a,b) << "\n";
  }

  template <typename ITER>
  void from_2(ITER a, ITER b)
  {
    std::cout << "> " << std::string(a,b) << "\n";
  }

  template <typename ITER1, typename ITER2>
  void from_both(ITER1, ITER1, ITER2, ITER2)
  {
    // common text is not printed
  }
};
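
To make the calling convention concrete, here is a rough sketch of how an
algorithm might drive such an output object. This is not the proposed
algorithm: naive_diff is a hypothetical stand-in built on a textbook O(N*M)
longest-common-subsequence table, and it assumes random-access iterators.
stream_output and diff_text are likewise names invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the elided body of diff(). Not the proposed
// algorithm: a textbook O(N*M) LCS table. Assumes random-access iterators.
template <typename ITER1, typename ITER2, typename OUTPUT>
void naive_diff(ITER1 begin1, ITER1 end1, ITER2 begin2, ITER2 end2, OUTPUT& output)
{
  const std::size_t n = static_cast<std::size_t>(std::distance(begin1, end1));
  const std::size_t m = static_cast<std::size_t>(std::distance(begin2, end2));

  // lcs[i][j] is the LCS length of the suffixes starting at i and j.
  std::vector<std::vector<std::size_t> > lcs(n + 1, std::vector<std::size_t>(m + 1, 0));
  for (std::size_t i = n; i-- > 0; )
    for (std::size_t j = m; j-- > 0; )
      lcs[i][j] = (begin1[i] == begin2[j]) ? lcs[i + 1][j + 1] + 1
                                           : std::max(lcs[i + 1][j], lcs[i][j + 1]);

  // Walk forwards, reporting each maximal run through the output object.
  std::size_t i = 0, j = 0;
  while (i < n || j < m) {
    const std::size_t i0 = i, j0 = j;
    if (i < n && j < m && begin1[i] == begin2[j]) {
      while (i < n && j < m && begin1[i] == begin2[j]) { ++i; ++j; }
      output.from_both(begin1 + i0, begin1 + i, begin2 + j0, begin2 + j);
    } else if (i < n && (j == m || lcs[i + 1][j] >= lcs[i][j + 1])) {
      // Skipping elements of sequence 1 loses nothing from the LCS.
      while (i < n && (j == m || (!(begin1[i] == begin2[j])
                                  && lcs[i + 1][j] >= lcs[i][j + 1]))) ++i;
      output.from_1(begin1 + i0, begin1 + i);
    } else {
      // Otherwise skip elements of sequence 2.
      while (j < m && (i == n || (!(begin1[i] == begin2[j])
                                  && lcs[i + 1][j] < lcs[i][j + 1]))) ++j;
      output.from_2(begin2 + j0, begin2 + j);
    }
  }
}

// Same shape as the diff_output idea, but writing to a caller-supplied stream.
class stream_output
{
public:
  explicit stream_output(std::ostream& os) : os_(os) {}

  template <typename ITER>
  void from_1(ITER a, ITER b) { os_ << "< " << std::string(a, b) << "\n"; }

  template <typename ITER>
  void from_2(ITER a, ITER b) { os_ << "> " << std::string(a, b) << "\n"; }

  template <typename ITER1, typename ITER2>
  void from_both(ITER1, ITER1, ITER2, ITER2) {}

private:
  std::ostream& os_;
};

// Hypothetical convenience wrapper returning the diff-style text.
inline std::string diff_text(const std::string& s1, const std::string& s2)
{
  std::ostringstream os;
  stream_output out(os);
  naive_diff(s1.begin(), s1.end(), s2.begin(), s2.end(), out);
  return os.str();
}
```

With these definitions, comparing "abc" against "abd" reports "< c" followed
by "> d". The point is only that the algorithm needs to know nothing about
the output object beyond the three method names.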

An important feature of this interface is that the algorithm itself doesn't
store the output. If the caller wants to store the edit script, they can
supply an object that does that.
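
As an illustration of the storing case, an output object along these lines
could accumulate the script in a vector. All the names here (edit_kind,
edit_op, script_recorder) are made up for the example, and copying the
elements is just one possible choice; a real caller might prefer to keep
iterators or indices instead.

```cpp
#include <vector>

// Hypothetical storing handler; none of these names come from the proposal.
enum edit_kind { op_delete, op_insert, op_common };

template <typename T>
struct edit_op
{
  edit_kind kind;
  std::vector<T> items;  // copy of the elements covered by this operation
};

template <typename T>
class script_recorder
{
public:
  std::vector<edit_op<T> > ops;  // the recorded edit script, in call order

  template <typename ITER>
  void from_1(ITER a, ITER b) { push(op_delete, a, b); }

  template <typename ITER>
  void from_2(ITER a, ITER b) { push(op_insert, a, b); }

  // For a common run, keep the copy taken from the first sequence.
  template <typename ITER1, typename ITER2>
  void from_both(ITER1 a, ITER1 b, ITER2, ITER2) { push(op_common, a, b); }

private:
  template <typename ITER>
  void push(edit_kind kind, ITER a, ITER b)
  {
    edit_op<T> op;
    op.kind = kind;
    op.items.assign(a, b);
    ops.push_back(op);
  }
};
```

A caller would pass a script_recorder<char> (or whatever the element type is)
as the output argument and inspect ops afterwards; the algorithm is unchanged.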

Regards, Phil.

