Subject: Re: [boost] [preprocessing] Feedback requested on a C99 preprocessor written in pure universal Python
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2017-03-09 15:18:04


Sorry for the late reply; your email was filed as spam because it failed
SPF. Here was the SPF failure cause:

Received-SPF: Softfail (domain owner discourages use of this host)
identity=mailfrom; client-ip=149.217.99.100; helo=mail2.mpi-hd.mpg.de;
envelope-from=hans.dembinski_at_[hidden]; receiver=s_sourceforge_at_[hidden]

You might want to fix this.

On 08/03/2017 08:29, Hans Dembinski wrote:
> Dear Niall,
>
>> Those of you who watch reddit/r/cpp will know I've been working for
>> the past month on a pure Python implementation of a C99 conforming
>> preprocessor. I am pleased to be able to ask for Boost feedback on
>> a fairly high quality implementation:
>>
>> https://github.com/ned14/pcpp
>
> did you do some benchmarks on how fast pcpp is compared to a "normal"
> C-based preprocessor?

Not with really large inputs yet, no. But its scaling curves are ideal:
linear in tokens processed, linear in macros expanded, linear in macros
defined. It's a minimal-copy implementation, made easy by Python never
copying anything unless asked, plus we keep token objects below 512
bytes so Python's small-object allocator is used instead of malloc.
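
The idea is roughly this (an illustrative sketch, not pcpp's actual
token class): give the object __slots__ instead of a per-instance dict,
so it stays far below CPython's 512-byte small-object (pymalloc)
threshold:

    import sys

    # Illustrative only, not pcpp's real token type. A slotted object
    # avoids the per-instance __dict__, keeping it well under the
    # 512-byte pymalloc threshold, so allocation never hits malloc.
    class Token(object):
        __slots__ = ('type', 'value', 'lineno', 'source')

        def __init__(self, type, value, lineno, source):
            self.type = type
            self.value = value
            self.lineno = lineno
            self.source = source

    tok = Token('CPP_ID', 'FOO', 42, 'example.h')
    print(sys.getsizeof(tok))  # typically well under 100 bytes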

In absolute terms it will always be far slower than a C or C++
implementation. But we're talking half a second versus a tenth of a
second here for small inputs. I would suspect the gap will close for
large inputs; Python ain't half bad at performance once objects are
allocated, especially Python 3, where pcpp runs noticeably faster than
on Python 2.

I haven't tried pcpp with PyPy (a JIT compiler) yet, but pcpp does
nothing weird, so it should work. That would substantially close the
absolute performance gap, I would guess.

> I understand that you wrote this implementation
> to get more features and better standard-compliance than some
> commercial preprocessors, but since some people around me have
> claimed that preprocessing takes a significant fraction of total
> compile time, I wonder about performance.

If a build step can pre-preprocess all the #includes into a single file
and run most of the preprocessing up front, compilers can parse the
result much more quickly. If that step takes a few seconds but saves
minutes on the overall build, you win.
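
Roughly, that flattening step looks like this with pcpp's Python API
(a sketch; see the README at the repo above for the exact details):

    import sys
    from pcpp import Preprocessor

    # See pcpp's README for the exact API; this is the rough shape.
    p = Preprocessor()
    p.add_path('include')    # directory to search for #include files
    with open('all_headers.h') as f:
        p.parse(f.read(), source='all_headers.h')
    p.write(sys.stdout)      # emit one flattened, preprocessed file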

I can't say anything about MSVC, but GCC and clang have a special fast
path in the preprocessor for chunks of text in which no macro expansion
is possible. With already-preprocessed input, each translation unit can
therefore save a lot, and the overall build time drops substantially.
That's why Facebook's Warp, HPX and other projects have implemented a
pre-preprocessing build step.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
