Subject: [boost] Boost::Regex get URLs
From: Tim Hsu (tim_at_[hidden])
Date: 2010-12-29 04:44:47


Hello guys,

How do I use Boost::Regex or standard C++ to get all the URLs from a
given string? For example, I want to write a function like the one
below.

But this code doesn't work, and I am not sure how to use Boost::Regex to
solve it.

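    #include <cstddef>
    #include <string>
    #include <vector>
    #include <boost/regex.hpp>

    // Appends the captured href value of each match to urls
    // and returns urls.size().
    size_t GetUrlsFromString(std::string const& str, std::vector<std::string>& urls)
    {
        boost::regex re("<a\\s+href=\"([\\-:\\w\\d\\.\\/]+)\">");
        // Submatch 1 selects only the captured URL, not the whole tag.
        boost::sregex_token_iterator p(str.begin(), str.end(), re, 1);
        boost::sregex_token_iterator end;

        while (p != end)
        {
            urls.push_back(p->str());
            ++p;
        }
        return urls.size();
    }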
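
For comparison, here is a minimal sketch of the same idea in standard C++ (std::regex, C++11). The post does not say how the function above fails, so this is only a guess at the cause: the character class above rejects URLs containing characters such as ?, =, & or %, and the pattern only matches when href is the sole attribute of the tag. The name GetUrlsFromStringStd and the looser pattern are assumptions made for illustration, not part of the original code, and a regex is not a full HTML parser.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical std::regex variant of GetUrlsFromString (name is illustrative).
// Assumptions: the href value is enclosed in double quotes, and other
// attributes may appear before or after href inside the <a ...> tag.
std::size_t GetUrlsFromStringStd(std::string const& str,
                                 std::vector<std::string>& urls)
{
    // [^>]*? skips any attributes before href without leaving the tag;
    // ([^\"]*) captures everything up to the closing double quote.
    static const std::regex re("<a\\s+[^>]*?href\\s*=\\s*\"([^\"]*)\"",
                               std::regex::icase);

    // Submatch index 1 yields only the captured URL for each match.
    std::sregex_token_iterator p(str.begin(), str.end(), re, 1);
    std::sregex_token_iterator end;

    for (; p != end; ++p)
        urls.push_back(p->str());

    return urls.size();
}
```

Called on a string containing two anchor tags, this appends both href values to urls and returns the new size of the vector. The same pattern should also work with boost::regex and boost::sregex_token_iterator, since the loop is otherwise unchanged from the function above.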

Thank you!!!!
Tim

