Subject: Re: [boost] [optional] generates unnessesary code for trivial types
From: Domagoj Saric (domagoj.saric_at_[hidden])
Date: 2012-02-02 10:41:50
On 30.1.2012. 21:30, Sebastian Redl wrote:
>> All Linux compilers on x64 platforms follow the AMD64 ABI, possibly with minor variations/bugs. This ABI specifies that classes are passed in registers if
>> - they are trivially copyable and destructible (optional should be specialized for types that fulfill these criteria to ensure this),
>> - they have no virtual functions or bases,
>> - they are smaller than 2 qwords (4 qwords if all members are float, double, or SSE types), and
>> - they don't contain any weird stuff, like 80-bit long doubles or unaligned fields.
>>
>> The Mac ABI for x64 is very close, though I don't know the differences.
Thanks for the summary (didn't know there was a separate OS X x64 ABI).
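As a rough illustration of the quoted checklist, here is a minimal sketch of a
compile-time check that optional could, in principle, use to pick a "trivial"
specialization. The trait name may_go_in_registers and the test types are made
up for this example, and the 16-byte limit simply encodes the "2 qwords" rule;
the real ABI classification looks at each 8-byte chunk separately, so treat
this as an approximation only.

```cpp
#include <type_traits>

// Rough approximation of the quoted checklist. The real AMD64 classification
// works on each 8-byte chunk of the object, which this trait does not model.
template <typename T>
struct may_go_in_registers
    : std::integral_constant<bool,
          std::is_trivially_copyable<T>::value &&
          std::is_trivially_destructible<T>::value &&
          !std::is_polymorphic<T>::value &&
          sizeof(T) <= 16> {};  // 16 bytes == 2 qwords

// Hypothetical test types (not from Boost).
struct small_pod    { int value; bool initialized; };  // fits in one qword
struct large_pod    { long long a, b, c; };            // 24 bytes, too big
struct with_dtor    { int value; ~with_dtor() {} };    // non-trivial destructor
struct with_virtual { int value; virtual void f() {} };  // has a vtable pointer

static_assert(may_go_in_registers<small_pod>::value,
              "small trivial structs pass the check");
static_assert(!may_go_in_registers<large_pod>::value,
              "structs larger than 2 qwords fail the check");
```

The trait only says that nothing in the type forces it into memory; whether a
given compiler actually keeps it in registers still has to be verified in the
generated assembly.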
>> The Win64 ABI is far less nice about registers. It passes the first four arguments in registers, and spills everything else onto the stack. It does not pack multiple values into a register. If a value is larger than 8 bytes, it is not split across registers. The ABI description says that "aggregates" can be passed in registers, but it doesn't elaborate on whether this refers to the C++ definition of aggregates (unlikely!) or whatever else the definition is. It sounds pretty useless.
Right, the Windows/MSVC x64 ABI is a major WTH. I just can't think of a
reason why they had to invest resources into making their own ABI that is so
complicated and so inferior to the AMD-proposed one (e.g. you can't pass an SSE
vector by value through an XMM register?).
> Correcting myself: the Common C++ ABI for x86-32 actually specifies that trivially copyable and destructible classes are treated just like simple values for parameter passing, so they can be passed and returned in registers. Of course, the far smaller register file of x86-32 makes that still not very useful.
Unfortunately, I have never seen MSVC pass or return any struct through
registers, even though its interprocedural optimization and link-time code
generation capabilities allow it to "invent" (as the documentation claims) its
own calling conventions for non-exported functions. I don't know whether any
other x86 compiler is able to do so...
In any case, the problem is that there is no even remotely portable, standard
or widespread way (pragma, decl specifier...) to tell the compiler to return
small PODs in registers, especially not just for a particular function and/or
POD type. GCC has -freg-struct-return, but that seems nearly useless because it
applies to the whole binary and changes the calling convention, so everything
the binary links against (including the OS libraries) would have to be built
with that option.
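For completeness, one manual workaround (not something optional does, just a
sketch) is to sidestep struct returns altogether: pack an 8-byte POD into a
64-bit integer, which the usual x86-32 conventions return in EDX:EAX and the
x64 ones in RAX. The names small_result, pack, unpack and find_answer below
are hypothetical.

```cpp
#include <cstdint>

// Hypothetical 8-byte POD, e.g. a value plus an "initialized" flag.
struct small_result { std::int32_t value; std::int32_t initialized; };

// Pack both fields into one 64-bit integer (value in the low half).
constexpr std::uint64_t pack(small_result r) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(r.value)) |
           (static_cast<std::uint64_t>(static_cast<std::uint32_t>(r.initialized)) << 32);
}

// Inverse of pack(). The unsigned-to-signed casts assume the usual
// two's complement wrap-around for negative values.
constexpr small_result unpack(std::uint64_t bits) {
    return small_result{
        static_cast<std::int32_t>(static_cast<std::uint32_t>(bits)),
        static_cast<std::int32_t>(static_cast<std::uint32_t>(bits >> 32))};
}

// Returns a plain integer, so the struct-return rules never come into play.
constexpr std::uint64_t find_answer() { return pack(small_result{42, 1}); }
```

The caller then writes unpack(find_answer()).value. This is obviously ugly and
only works for PODs that fit into 64 bits, which is exactly why a proper
per-function or per-type switch would be nicer.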
-- "What Huxley teaches is that in the age of advanced technology, spiritual devastation is more likely to come from an enemy with a smiling face than from one whose countenance exudes suspicion and hate." Neil Postman