Subject: Re: [boost] [GGL] [geometry] Inexplicable speed benefit when using Visual C++ 2010
From: Stephan T. Lavavej (stl_at_[hidden])
Date: 2010-04-17 21:19:26


[Christian Buchner]
> We found that GCC 4.4 on Linux was about 100% faster than Visual C++
> 2008 on Linux without modifying the code.

Wow, I didn't know that VC9 could target that platform. ;-)

> we attributed much of the performance difference to sub-optimal memory
> heap management of Visual C++ 2008.

VC's default implementation of ::operator new() calls malloc(), which calls HeapAlloc() from the Windows API.

HeapAlloc()'s implementation is mostly a mystery to me, but there are a couple of things about it that you should probably be aware of.

First, on WinXP and higher (note that VC9 supported targeting Win2K, but VC10 doesn't), Windows implements the Low Fragmentation Heap (LFH). The LFH services "small" allocations (definitely sub-KB; I'd have to ask around in order to figure out the precise limit) from buckets, so applications that allocate lots of small chunks of memory (like nodes) can benefit significantly from the LFH. I believe it performs other wizardry which I'm not aware of.

However, on some platforms, the LFH is not enabled by default. You can request it by calling HeapSetInformation() on the CRT heap. There are two situations where the LFH is enabled by default. First, VC enables the LFH for its CRT heap when targeting x64. (For obscure reasons, we can't do this for x86.) Second, WinVista and higher automagically enable the LFH when they detect allocation patterns that would benefit from it. I've spoken to the LFH's maintainers, and they say that this automagic machinery is smart enough that enabling the LFH manually is no longer necessary (i.e. they have never seen manual enabling produce an observable benefit).
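
For anyone who wants to try requesting the LFH by hand (e.g. on XP with an x86 build), here is a minimal sketch. It assumes the CRT heap handle is obtained through _get_heap_handle() and that the value 2 passed with HeapCompatibilityInformation selects the LFH, as the Windows documentation describes; the function name try_enable_lfh_on_crt_heap is made up for this example. On non-Windows builds it simply reports failure.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <malloc.h>   // _get_heap_handle()
#endif

// Hypothetical helper: asks Windows to use the Low Fragmentation Heap
// for the CRT heap. Returns true only if the request succeeded.
bool try_enable_lfh_on_crt_heap() {
#ifdef _WIN32
    // The CRT exposes the heap behind malloc()/operator new().
    HANDLE crt_heap = reinterpret_cast<HANDLE>(_get_heap_handle());
    ULONG compatibility = 2;  // 2 requests the LFH
    return HeapSetInformation(crt_heap,
                              HeapCompatibilityInformation,
                              &compatibility,
                              sizeof(compatibility)) != 0;
#else
    return false;  // no Windows heap to configure on this platform
#endif
}
```

Call it once, early in main(), before the program starts allocating heavily. The request can fail (for example when a debug heap is in use), so treat the return value as informational only.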

> Then we tried recompiling the project with Visual C++ 2010 Ultimate
> Release Candidate (RC). The speed gain of the algorithm was 900% (not
> joking) and the results still appear to be correct.

This greatly amuses me, as I talked about seeing order-of-magnitude performance improvements but never expected to actually see one in the real world.

This is almost certainly due to rvalue references, as nothing else that we did could possibly have had so large an effect. VC10 implemented rvalue references (v2, but not the stuff in the FCD that automatically generates move ctors/assigns), and we updated the Standard Library to take advantage of them (both by providing move ctors/assigns on Standard Library types like vector, and by taking advantage of move ctors/assigns during operations like vector reallocation).
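
To make the vector reallocation point concrete, here is a small sketch that counts copy and move constructions while a vector grows. The Tracked type and grow_vector function are invented for illustration. It is written for a current C++11-or-later compiler: the move ctor is marked noexcept so that today's Standard Library implementations will pick it during reallocation, a keyword that VC10 itself did not support.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct MoveStats {
    std::size_t copies;
    std::size_t moves;
};

// Hypothetical element type with an expensive-to-copy payload.
struct Tracked {
    static MoveStats stats;
    std::vector<int> payload;

    Tracked() : payload(1024, 0) {}

    // Copying duplicates the whole payload.
    Tracked(const Tracked& other) : payload(other.payload) { ++stats.copies; }

    // Moving just steals the payload's buffer.
    Tracked(Tracked&& other) noexcept : payload(std::move(other.payload)) {
        ++stats.moves;
    }
};

MoveStats Tracked::stats = {0, 0};

// Pushes n temporaries into a vector without reserve(), forcing
// several reallocations, and reports how elements were transferred.
MoveStats grow_vector(std::size_t n) {
    Tracked::stats = MoveStats{0, 0};
    std::vector<Tracked> v;
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(Tracked());
    }
    return Tracked::stats;
}
```

With a move-aware Standard Library, grow_vector should report zero copies: every push_back and every reallocation moves elements instead. Without rvalue references, each of those transfers would be a deep copy of the payload.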

> Is anyone else seeing similar speedups in boost or in the geometry
> library when compiling with Visual C++ 2010 RC (HINT: it's a free
> download, so anyone can try it out until end of June 2010).

Note that VC10 RC is perfectly representative of VC10 RTM. The compiler and linker are unchanged, and the Standard Library received exactly one bugfix (for obnoxious compiler errors in shared_ptr).

Also note that while GCC 4.3 and higher support rvalue references (v1 in 4.3 and 4.4, v2 in 4.5), they are not enabled by default. You must compile with either -std=c++0x or -std=gnu++0x in order to activate their C++0x mode. I'd be interested to learn how VC10 fares against GCC in C++0x mode. (I'm unsure as to when and to what extent rvalue references were applied to GCC's Standard Library implementation libstdc++. As GCC 4.5 was released mere days after VC10, that would generate the most fair comparison.)
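
When benchmarking across compilers, it may help to confirm that rvalue references are actually switched on in each build. A rough sketch: GCC defines __GXX_EXPERIMENTAL_CXX0X__ in its C++0x modes, and _MSC_VER is 1600 for VC10. The function name rvalue_references_enabled is invented here, and the check is a heuristic about the compiler mode only; it says nothing about how far the Standard Library takes advantage of moves.

```cpp
// Hypothetical helper: true when the compiler mode used for this
// translation unit is expected to support rvalue references.
inline bool rvalue_references_enabled() {
#if defined(__GXX_EXPERIMENTAL_CXX0X__) \
    || (defined(__cplusplus) && __cplusplus >= 201103L) \
    || (defined(_MSC_VER) && _MSC_VER >= 1600)
    return true;
#else
    return false;   // e.g. GCC 4.4 compiled without -std=c++0x
#endif
}
```

Printing this value at the start of a benchmark run makes it harder to compare a C++0x build against a C++03 build by accident.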

About _SECURE_SCL, the worst performance penalty that I've ever seen it exact is 2x. It could have been responsible (along with allocator mumbo-jumbo) for the 100% difference you observed between GCC 4.4 and VC9, but not for the 900% difference between VC9 and VC10 (unless you have discovered a truly astounding pathological case).

In VC10, _SECURE_SCL (now combined with _HAS_ITERATOR_DEBUGGING into _ITERATOR_DEBUG_LEVEL) is disabled by default in release mode. (In debug mode, _HAS_ITERATOR_DEBUGGING, aka _ITERATOR_DEBUG_LEVEL=2, is still enabled by default. It can affect performance significantly, but it provides exhaustive correctness checks, and debug performance isn't especially important unless it's 1000x slower and renders the program too slow to debug.)
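
If you are unsure which setting a given build ended up with, a translation unit can report it. This is a minimal sketch; the function name iterator_debug_level is made up, and it assumes that including any Standard Library header is enough for VC10 and later to define _ITERATOR_DEBUG_LEVEL. Other Standard Library implementations do not define the macro, so the function returns -1 there.

```cpp
#include <vector>   // on VC10+, Standard Library headers define the macro

// Hypothetical helper: reports the iterator debug level this
// translation unit was compiled with, or -1 if it is not defined.
inline int iterator_debug_level() {
#ifdef _ITERATOR_DEBUG_LEVEL
    return _ITERATOR_DEBUG_LEVEL;   // 0 = off, 2 = full iterator debugging
#else
    return -1;                      // not a VC10+ Standard Library
#endif
}
```

Logging this value in a release-mode benchmark is a cheap way to rule out checked iterators as the cause of a slowdown.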

VC10's linker now deterministically detects when you're mixing translation units compiled with different settings of _ITERATOR_DEBUG_LEVEL, so that (combined with the new default of 0 in release mode) will put an end to the bad old days of incomprehensible ODR violations triggered by attempting to disable this machinery in some but not all libraries.

Do note that mixing major compiler versions (e.g. VC9 and VC10) is still completely forbidden when you're using the C++ Standard Library, as we break binary compatibility between every major version and will continue to do so for the foreseeable future. Unfortunately, the linker's new feature (#pragma detect_mismatch) is unable to detect VC9 and VC10 mixing, so you'll just have to be careful. (The machinery is in place for us to detect VC10 and VC11 mixing.)
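
Library authors can use the same pragma for their own settings. The following is only a sketch: the key "mylib_abi_version", its value, and the accessor function are invented for illustration. The pragma is guarded so that compilers other than VC10 and later ignore it.

```cpp
// Hypothetical library header. Every object file that includes it
// records the key/value pair below; the VC10+ linker reports an error
// if two object files disagree on the value recorded for the same key.
#if defined(_MSC_VER) && _MSC_VER >= 1600
#pragma detect_mismatch("mylib_abi_version", "2")
#endif

// Keep this in sync with the value recorded by the pragma above.
inline int mylib_abi_version() { return 2; }
```

Bumping the recorded value whenever the library's binary layout changes turns a silent ODR violation into a link-time error, which is the behavior described above for _ITERATOR_DEBUG_LEVEL.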

[Barend Gehrels]
> I've asked last year during BoostCon to Microsoft, present there, if
> they knew this issue, but apparently they didn't, and told me it would
> surprise him because the compiler basically was the same.

(That was me.)

> When I used after that once the VC 2008 command line compiler (so not
> from Visual Studio) the problem disappeared. So it was fast. So it was
> not the compiler itself, it was some setting in the IDE or VPROJ. I've
> never managed to find which setting, though I studied it carefully.

According to my understanding, the VS 2008 IDE had a bug when upgrading VS 2005 projects, where optimization settings would be lost. I'm unsure of what the exact symptoms of the bug looked like (I thought it was just the /O options), but your experience (that it disappeared when you used the command line) perfectly fits with this hypothesis. Alternatively, you could create a VS 2008 project from scratch.

I believe that VS 2010's project system (which underwent a massive overhaul) does not suffer from this bug when upgrading either VS 2005 or VS 2008 projects, but I don't know what happens when you take a project that was damaged by the VS 2005 to VS 2008 upgrade, and then upgrade it to VS 2010. I suspect that in that case, the damage persists.

My understanding of this is vague because I don't use IDE build systems.

Stephan T. Lavavej
Visual C++ Libraries Developer
