From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2024-07-10 16:24:48
On 7/10/24 19:13, Daniela Engert via Boost wrote:
> Am 10.07.2024 um 17:37 schrieb Andrey Semashev via Boost:
>> On 7/10/24 18:26, Andrey Semashev wrote:
>>> On 7/10/24 14:59, Peter Dimov via Boost wrote:
>>>> Ruben Perez wrote:
>>>>> There is OPENSSL_cleanse, which does a similar job. I found it
>>>>> surprisingly simple - I wonder if this is just enough or there are
>>>>> any corner cases where this doesn't work:
>>>>> https://github.com/openssl/openssl/blob/b544047c99c4a7413f793afe82ab1c165f85b5b6/crypto/mem_clr.c#L22
>>>> That's an interesting approach. I can't offhand think of a reason
>>>> why it wouldn't work.
>>> The compiler might convert OPENSSL_cleanse to something like this:
>>>
>>>   void OPENSSL_cleanse(void *ptr, size_t len)
>>>   {
>>>     memset_t func = memset_func;
>>>     if (func != memset)
>>>       func(ptr, 0, len);
>>>     else
>>>       memset(ptr, 0, len);
>>>   }
>>>
>>> The purpose here is that, if memset_func is actually memset most of the
>>> time, it can further optimize the call to memset, including to
>>> completely remove it, depending on the call context. The
>>> (well-predictable) branch is typically cheaper than an indirect call. I
>>> think I've seen compilers do something along those lines as a result of
>>> call devirtualization, especially with IPO and LTCG.
>>>
>>> I'm not saying that's what actually happens in OpenSSL, just that
>>> something like this is possible. I think a dummy asm statement is more
>>> reliable and more efficient.
>>>
>>>   void secure_cleanse(void *ptr, size_t len)
>>>   {
>>>     memset(ptr, 0, len); // a normal memset, optimizations are welcome
>>>     __asm__ __volatile__ ("" : : "r" (ptr), "r" (len) : "memory");
>>>   }
>>>
>>> You can even make that function inline and it'll work, and in an optimal
>>> way, too.
>> And for compilers that don't support __asm__, I think you could replace
>> it with:
>>
>>   std::atomic_signal_fence(std::memory_order::acq_rel);
>>
> I think Peter is actually right in his assessment:
>
> While 'memset_func' is TU-local, it is also 'volatile'. This implies
> every look at it might reveal a different content. Therefore the
> necessity for non-constness and dynamic initialization. We humans see
> that the variable can't change its value other than in the
> initialization. The compiler can't reason that because said properties
> are the only ones known during the compilation of 'OPENSSL_cleanse()'.
> It can't perform unbounded look-ahead until the end of the TU like we do.
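With volatile, the compiler is not allowed to optimize away the loads
and stores of the variable, or to reorder them relative to other volatile
accesses. There's no restriction on what the compiler is allowed to do
with the loaded value, so it may still compare that value against memset
and specialize the call accordingly.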
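
For reference, a minimal sketch that combines the two variants quoted
above into one function. The compiler check and the null-pointer assert
are my assumptions here, not something taken from OpenSSL or any
existing library, and the fence fallback is only expected to act as a
compiler barrier in practice; the standard does not promise that dead
stores survive it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, named after the sketch quoted earlier in the thread.
inline void secure_cleanse(void* ptr, std::size_t len)
{
    assert(ptr != nullptr || len == 0);

    // A normal memset; the compiler is free to inline or vectorize it.
    std::memset(ptr, 0, len);

#if defined(__GNUC__) || defined(__clang__)
    // Empty asm statement that takes ptr and len as inputs and declares a
    // "memory" clobber, so the stores above cannot be treated as dead.
    __asm__ __volatile__("" : : "r"(ptr), "r"(len) : "memory");
#else
    // Assumed fallback for compilers without GNU-style inline asm.
    std::atomic_signal_fence(std::memory_order_acq_rel);
#endif
}
```

Since the function is inline, the memset can be merged with the
surrounding code at each call site, while the barrier keeps the zeroing
stores from being dropped.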