Hi everyone,

I'm trying to parse an HTML page using the Boost.Regex library and am running into errors.

In the following snippets, "pageSource" is a pointer to a string holding the contents of an HTML file.

This code causes my app to crash:

void Page::removeScriptTags() {
    boost::regex tagRegex("<[sS][cC][rR][iI][pP][tT][\\w\\W]*?>[.]*?</\\s*?[sS][cC][rR][iI][pP][tT]\\s*?>");
    string replaced = boost::regex_replace(*pageSource, pageSource, tagRegex, " ", boost::match_default);
    delete pageSource;
    pageSource = new string(replaced);
}
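
For comparison, here is a minimal sketch of the same replacement written against the string overload, which takes (input, regex, format). It uses std::regex as a stand-in for boost::regex so that it builds without Boost. The free function removeScriptTags, the icase flag, and the rewritten pattern are assumptions made for illustration, not the original Page code. Note that "[.]" in the pattern above matches only a literal dot, so the sketch uses "[\s\S]*?" for the script body instead.

```cpp
#include <regex>
#include <string>

// Hypothetical free function standing in for Page::removeScriptTags().
// Returns a copy of `html` with each <script>...</script> element
// replaced by a single space.
std::string removeScriptTags(const std::string& html) {
    // icase replaces the hand-written [sS][cC]... character classes.
    // [\s\S]*? lazily matches any characters, including newlines.
    static const std::regex tagRegex(
        R"(<script\b[^>]*>[\s\S]*?</\s*script\s*>)",
        std::regex::icase);
    // String overload: (input, regex, replacement text).
    return std::regex_replace(html, tagRegex, " ");
}
```

Returning the result by value would also remove the need for the delete/new pair if pageSource were held as a plain string member.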

This second snippet crashes when "matches" is destroyed:

void Page::findTitleSummary() {
    boost::cmatch matches;
    boost::regex bodyRegex("<[tT][iI][tT][lL][eE][\\w\\W]*?>([^<]*)</\\s*?[tT][iI][tT][lL][eE]\\s*?>");
    if (boost::regex_search(pageSource->c_str(), matches, bodyRegex)) {
        pageSummary = new string(matches[1]);
        hasFoundSummary = true;
    }
}
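
As a reduced test case, the title search can be sketched on its own, searching the string directly with a string-iterator match object instead of going through c_str(). Again, std::regex stands in for boost::regex so the sketch is self-contained. The free function findTitle and its out-parameter are hypothetical names for illustration, and this is not a confirmed fix for the crash.

```cpp
#include <regex>
#include <string>

// Hypothetical free function standing in for Page::findTitleSummary().
// On a match, copies the text between <title> and </title> into `title`
// and returns true; otherwise leaves `title` untouched and returns false.
bool findTitle(const std::string& html, std::string& title) {
    static const std::regex titleRegex(
        R"(<title\b[^>]*>([^<]*)</\s*title\s*>)",
        std::regex::icase);
    // smatch holds iterators into `html`, which must outlive `matches`.
    std::smatch matches;
    if (!std::regex_search(html, matches, titleRegex)) {
        return false;
    }
    // Copy the first capture group out as an owning std::string.
    title = matches[1].str();
    return true;
}
```

Storing the capture in a std::string by value avoids owning a raw pointer such as pageSummary.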

What am I missing?

Thanks,

Dave