Subject: Re: [boost] [operators] A modern SFINAE-based version of boost::operators?
From: Gavin Lambert (gavinl_at_[hidden])
Date: 2017-11-16 01:39:59


On 16/11/2017 05:41, Peter Dimov wrote:
> It's all slightly misleading anyway, because for the small matrix case
> the copy/move constructors don't actually have side effects and
> therefore get optimized out; they are only relevant in the case of
> something like std::string where copy/move aren't defaulted.

That was my point; your test code doesn't test copy elision because your
constructors have side effects, so can't be elided.

Of course, if you remove the counting side effects in that code then the
compiler just inlines everything to a single mov constant 10 :)

FWIW, in VC14.0 if you force it to use external parameters it can't
inline away then this code:

__declspec(noinline) T calc(int a, int b, int c, int d)
{
     return T(a) + T(b) + T(c) + T(d);
}

Turns into this with the rvalue-ref-return operators:

x86:
; _a$ = edx
   00003  03 55 08     add   edx, DWORD PTR _b$[ebp]
   00006  03 55 0c     add   edx, DWORD PTR _c$[ebp]
   00009  8b 45 10     mov   eax, DWORD PTR _d$[ebp]
   0000c  03 c2        add   eax, edx
   0000e  89 01        mov   DWORD PTR [ecx], eax
   00010  8b c1        mov   eax, ecx

x64:
; _a$ = edx
; _b$ = r8d
; _c$ = r9d
   00000  41 03 d0     add   edx, r8d
   00003  48 8b c1     mov   rax, rcx
   00006  41 03 d1     add   edx, r9d
   00009  03 54 24 28  add   edx, DWORD PTR d$[rsp]
   0000d  89 11        mov   DWORD PTR [rcx], edx

Whereas with the value-return operators:

x86:
; _a$ = edx
   00003  8b 45 08     mov   eax, DWORD PTR _b$[ebp]
   00006  03 c2        add   eax, edx
   00008  03 45 0c     add   eax, DWORD PTR _c$[ebp]
   0000b  03 45 10     add   eax, DWORD PTR _d$[ebp]
   0000e  89 01        mov   DWORD PTR [ecx], eax
   00010  8b c1        mov   eax, ecx

x64:
; _a$ = edx
; _b$ = r8d
; _c$ = r9d
   00000  42 8d 04 02  lea   eax, DWORD PTR [rdx+r8]
   00004  41 03 c1     add   eax, r9d
   00007  03 44 24 28  add   eax, DWORD PTR d$[rsp]
   0000b  89 01        mov   DWORD PTR [rcx], eax
   0000d  48 8b c1     mov   rax, rcx

The rvalue versions are very slightly more efficient, it looks like,
although they're pretty similar (and it's even sneaky enough to turn one
of the adds into an lea in the last one). Though, of course, counting
assembly ops means little with modern CPUs, so take that with a grain of
salt.

And again, granted, something bigger than an int or with non-trivial copy
constructors will get different results, but overall it looks like I was
wrong in my initial supposition.
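
For anyone wanting to reproduce the comparison, here is a minimal sketch of
the two operator styles. The test type isn't shown in this message, so this
assumes T is a trivial int wrapper. ValT, RefT and the templated calc are
hypothetical names used only to put both variants in one file, and the
MSVC-specific __declspec(noinline) is left off to keep it portable:

```cpp
#include <utility>

// Variant A (assumed shape): the rvalue overload of operator+ returns by value.
struct ValT {
    int v;
    explicit ValT(int x) : v(x) {}
    ValT& operator+=(const ValT& r) { v += r.v; return *this; }
};
inline ValT operator+(const ValT& l, const ValT& r) { ValT t(l); t += r; return t; }
inline ValT operator+(ValT&& l, const ValT& r) { l += r; return std::move(l); }

// Variant B (assumed shape): the rvalue overload returns an rvalue reference
// to its left operand instead of materialising a new object.
struct RefT {
    int v;
    explicit RefT(int x) : v(x) {}
    RefT& operator+=(const RefT& r) { v += r.v; return *this; }
};
inline RefT operator+(const RefT& l, const RefT& r) { RefT t(l); t += r; return t; }
inline RefT&& operator+(RefT&& l, const RefT& r) { l += r; return std::move(l); }

// Same expression as calc() above, parameterised on the type under test.
// In the RefT case the chain keeps referring to the first temporary, which
// lives until the end of the full expression, so the returned object is
// constructed from it before it is destroyed.
template <class U>
U calc(int a, int b, int c, int d)
{
    return U(a) + U(b) + U(c) + U(d);
}
```

Both calc<ValT>(1, 2, 3, 4) and calc<RefT>(1, 2, 3, 4) produce an object
whose v is 10; the variants differ only in how many temporaries the chain
creates, which is what the listings above are comparing.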

