I've used html_tree: http://homepage.mac.com/pauljlucas/software/html_tree/
Also, HTML Tidy ( http://tidy.sourceforge.net/ ) has a C++ binding.
 
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Daniel Lord
Sent: Tuesday, May 20, 2008 18:38
To: Boost-users@lists.boost.org
Subject: [Boost-users] Somewhat off-topic: Any equivalent to Python's "Beautiful Soup" for C++

Beautiful Soup, for anyone who doesn't know, is a highly tolerant HTML parser that is great for screen-scraping non-compliant (as well as compliant) HTML ( http://www.crummy.com/software/BeautifulSoup/ ).

Although I've often resorted to using PyObjC on OS X to leverage Python's rich high-level modules from ObjC and C++, there are times when I don't want to use Python past the prototyping stage because of performance issues. Powerful modules like Beautiful Soup are then sorely missed. Does anyone know of a similar library for C++?

Daniel Lord

--
Scanned for viruses & dangerous content at One Unified and is believed to be clean.
