From: David Abrahams (david.abrahams_at_[hidden])
Date: 2002-02-25 23:19:30


OK, got any Perl savvy?

We already have a way to download all the messages thanks to Carl Daniel,
but unfortunately the whitespace following <BR> is always stripped by the
time it gets to his site, which kills the readability of source code. At my
site I get the whitespace, but I can't get his download software. I tried
the script, but it kills whitespace, too. A little poking at it (I have ZERO
Perl expertise) tells me that fixing it is nontrivial. My guess is that the best
way is to grab the whole page as a string, use a regexp to preprocess it by
replacing space characters with &nbsp; in any line that starts with a space,
and then pass the result through HTML::TokeParser.
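
The preprocessing step described above might look roughly like the sketch below. It is written in Python purely for illustration rather than in Perl, and it makes two assumptions that the message does not spell out: only the run of spaces at the start of a line is converted (so spaces inside tags are left alone), and the page is already available as a single string. The function name preserve_leading_spaces is hypothetical.

```python
import re

# With MULTILINE, ^ matches at the start of every line, so this captures
# the run of spaces that begins each indented line.
LEADING_SPACES = re.compile(r"^( +)", re.MULTILINE)


def preserve_leading_spaces(page: str) -> str:
    """Replace each leading space with &nbsp; so indentation survives parsing.

    Assumption: only leading spaces are rewritten; the rest of the line,
    including any spaces inside HTML tags, is left untouched.
    """
    return LEADING_SPACES.sub(lambda m: "&nbsp;" * len(m.group(1)), page)


if __name__ == "__main__":
    sample = "int main()<BR>\n{<BR>\n    return 0;<BR>\n}<BR>\n"
    print(preserve_leading_spaces(sample))
```

The rewritten string would then be handed to the HTML parser in place of the raw page, so that the parser sees non-breaking spaces instead of whitespace it would otherwise discard.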

-Dave

----- Original Message -----
From: "Douglas Gregor" <gregod_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Monday, February 25, 2002 10:06 PM
Subject: Re: [boost] http savvy?

> On Tuesday 19 February 2002 06:57 pm, you wrote:
> > It's being handled. I'll have them all in mbox format by tomorrow morning.
> >
> > Here's a reference on the mbox format: http://www.qmail.org/man/man5/mbox.
> >
> > -cd
>
> It's probably all finished by now, but I just ran across this:
> http://www.lpthe.jussieu.fr/~zeitlin/yahoo2mbox.html
>
> Doug


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk