From: grider1_at_[hidden]
Date: 2001-09-12 08:07:28
std::string stem (const std::string& word)
// Tries to guess the stem form of an English word by
// applying the following rules (# and % stand for any
// consonant, and @ for any vowel [aeiou]).
//
//   If word ends in:   replace it by:   example:
//   ies                y                flies    => fly
//   ied                y                carried  => carry
//   ier                y                merrier  => merry
//   iest               y                merriest => merry
//   ##ed               #                hitted   => hit
//   ##ing              #                hitting  => hit
//   ##er               #                hitter   => hit
//   ##est              #                hottest  => hot
//   %@#ed              %@#e             glided   => glide
//   %@#ing             %@#e             gliding  => glide
//   %@#er              %@#e             glider   => glide
//   %@#est             %@#e             palest   => pale
//   %#ed               %#               barked   => bark
//   %#ing              %#               barking  => bark
//   %#er               %#               colder   => cold
//   %#est              %#               coldest  => cold
//   #s                 #                things   => thing
//
// If none of the above rules apply, then the word is
// assumed to be already in its stem form. This function
// returns the stem that it guesses. [Note that these really
// are guesses - it isn't hard to find words on which
// these rules fail. E.g., "pies"]
{
return word;
}
I have worked on this some on paper. I understand that it takes a
word like "coldest" and should convert it to "cold", and I know there
will need to be a function to check for vowels and consonants. Beyond
that I don't have much to go on, and I am not sure how to get started
in this direction. I'm reading a book that builds up a spell checker
program step by step, and the algorithm books just give you one- or
two-sentence descriptions and leave you to figure out the rest. I was
hoping someone has done something similar to this and could point me
in the right direction. Thanks and have a nice day.
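
For reference, here is a rough sketch of one way the rule table could
be turned into code. It is only a starting point under some
assumptions, not a definitive answer: it assumes the word is already
lowercase, it reads "##" as the same consonant written twice, and it
tries the rules in the order the table lists them. The helper names
is_vowel, is_consonant and ends_with are made up for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Vowels are exactly [aeiou], as in the rule table (lowercase assumed).
bool is_vowel(char c)
{
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}

// Any other letter counts as a consonant.
bool is_consonant(char c)
{
    return std::isalpha(static_cast<unsigned char>(c)) && !is_vowel(c);
}

// True if word ends with the given suffix.
bool ends_with(const std::string& word, const std::string& suffix)
{
    return word.size() >= suffix.size()
        && word.compare(word.size() - suffix.size(),
                        suffix.size(), suffix) == 0;
}

std::string stem(const std::string& word)
{
    // ies, ied, ier, iest => y
    static const std::string y_suffixes[] = {"ies", "ied", "ier", "iest"};
    for (const std::string& s : y_suffixes)
        if (ends_with(word, s))
            return word.substr(0, word.size() - s.size()) + "y";

    // ed, ing, er, est: what to do depends on the letters before the suffix.
    static const std::string suffixes[] = {"ed", "ing", "er", "est"};
    for (const std::string& s : suffixes) {
        if (!ends_with(word, s))
            continue;
        const std::string base = word.substr(0, word.size() - s.size());
        const std::size_t n = base.size();
        // ## (assumed: doubled consonant) => drop one copy: hitting => hit
        if (n >= 2 && is_consonant(base[n - 1]) && base[n - 1] == base[n - 2])
            return base.substr(0, n - 1);
        // %@# => append 'e': gliding => glide
        if (n >= 3 && is_consonant(base[n - 3]) && is_vowel(base[n - 2])
                   && is_consonant(base[n - 1]))
            return base + "e";
        // %# => just strip the suffix: barking => bark
        if (n >= 2 && is_consonant(base[n - 2]) && is_consonant(base[n - 1]))
            return base;
    }

    // #s => drop the trailing s: things => thing
    const std::size_t n = word.size();
    if (n >= 2 && word[n - 1] == 's' && is_consonant(word[n - 2]))
        return word.substr(0, n - 1);

    // No rule applied: assume the word is already a stem.
    return word;
}
```

Finding the suffix first and then looking at the two or three letters
in front of it keeps each rule to a single test. As the comment in the
stub warns, the rules are only guesses: with this sketch stem("pies")
comes back as "py".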