From: Gennaro Prota (gennaro_prota_at_[hidden])
Date: 2002-11-21 07:37:23


On Wed, 20 Nov 2002 10:00:14 -0500, Douglas Gregor <gregod_at_[hidden]>
wrote:

>On Saturday 16 November 2002 12:24 pm, Gennaro Prota wrote:
>> Sorry for the late reply (it's just my timezone).
>>
>> You wrote:
>> >I don't see the contradiction here. 5.2.10/7 says that you can cast from a
>> > T pointer to a U pointer and back to a T pointer and get the original
>> > pointer back.
>>
>> Unfortunately the standard is a great piece of work but fails
>> miserably to support you when you want to "deduce" things that are not
>> written explicitly there in plain English, especially when generic
>> expressions like "it's the same as", "is equivalent to", etc. are
>> used.
>
>I would generally agree that such an informal style doesn't allow any
>deduction, but I think we have to be realistic. We don't have a formal
>specification for C++, and it's unlikely that we will ever have one. We won't
>get anywhere if we require formal methods with an informal specification.

But, as you have seen, I wasn't applying formal methods; I'm just
particularly careful not to read too much into the standard. My
position is different from yours, though. You are one of the people
involved in the standardization, so you may be in a position to
know what they wanted to write, rather than what *is* written. From
my perspective, if I read that all I can do with the result of a
reinterpret_cast is cast it back to its original type, I try not to
do anything else :-) By following comp.std.c++ for a while I've
learnt that there are a lot of cases where the behavior is left
unspecified precisely because there are architectures where the
behavior is not the "obvious" one the naive programmer would expect
(for instance, because they do "unusual" hardware checks).

Frankly I don't know if there's some strange (but conforming)
implementation of references that can cause problems with the chain of
casts:

   reinterpret_cast<T*>(
         &const_cast<char&>(
           reinterpret_cast<const volatile char &>(v)));

The writers of the standard probably know though.
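
For reference, here is a minimal sketch of how that chain is usually wrapped, so it is clear what is being asked of the implementation. The names `Sneaky` and `my_addressof` are made up for illustration (they are not the Boost names), and whether the result is guaranteed by the standard is exactly the open question; the sketch only shows what mainstream compilers do in practice.

```cpp
#include <cassert>

// Hypothetical class that hijacks unary & (assumption, for illustration only).
struct Sneaky {
    int payload;
    Sneaky* operator&() { return 0; }  // deliberately useless overload
};

// Sketch of the cast chain: go through a char reference so that the
// built-in & is applied, then cast the resulting char* back to T*.
template <class T>
T* my_addressof(T& v) {
    return reinterpret_cast<T*>(
        &const_cast<char&>(
            reinterpret_cast<const volatile char&>(v)));
}
```

With this, `&s` on a `Sneaky s` calls the overload, while `my_addressof(s)` bypasses it.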

>> First of all the quotes:
>>
>> 5.2.10/7: "A pointer to an object can be explicitly converted
>> to a pointer to an object of different type.65) Except that
>> converting an rvalue of type "pointer to T1" to the type
>> "pointer to T2" (where T1 and T2 are object types and where
>> the alignment requirements of T2 are no stricter than those
>> of T1) and back to its original type yields the original pointer
>> value, the result of such a pointer conversion is unspecified."
>>
>>
>> Incidentally, what does "converting back" mean? However that's not
>> the main point.
>
>Given T* tp, reinterpret_cast<T*>(reinterpret_cast<U*>(tp)) is semantically
>equivalent to tp.

In that form, verbatim, yes. The problems begin when you put something
in the middle.
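
To make the distinction concrete, here is a small sketch of the one form 5.2.10/7 does cover: cast to a pointer type with no stricter alignment, then straight back. The type `Wide` and the function `round_trip_ok` are hypothetical names chosen for the example.

```cpp
#include <cassert>

// Hypothetical object type; char has no stricter alignment than Wide.
struct Wide { double d; };

// True when Wide* -> char* -> Wide* yields the original pointer value.
// Note that the intermediate char* is only cast back, never dereferenced
// or otherwise used: that is the "verbatim" form.
inline bool round_trip_ok(Wide* p) {
    char* c = reinterpret_cast<char*>(p);
    return reinterpret_cast<Wide*>(c) == p;
}
```

Anything done with `c` other than casting it back (dereferencing it, taking a reference through it) is the "something in the middle" that the quoted paragraph leaves unspecified.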

>> 5.2.10/10: An lvalue expression of type T1 can be cast to the type
>> "reference to T2" if an expression of type "pointer to T1" can be
>> explicitly converted to the type "pointer to T2" using a
>> reinterpret_cast. That is, a reference cast reinterpret_cast<T&>(x)
>> has the same effect as the conversion *reinterpret_cast<T*>(&x)
>> with the built-in & and * operators.
>>
>>
>> What does "has the same effect as" mean?
>
>Would you prefer "is semantically equivalent to"? You can rewrite one
>expression as the other. (Not in actual C++ code, because you can't say "I
>want the built-in operator&", but it's fine for exploring the properties of
>an expression).

But with a built-in type you can be sure that & has the built-in meaning,
and I'm not sure that

   double d = 2.0;
   *reinterpret_cast<int*>(&d);

is ok either (and not because of alignment issues).

>> 5.2.10/7 says that the
>> result of reinterpret_cast<T*>(&x) is unspecified. What is the effect
>> of dereferencing it?
>>
>> *reinterpret_cast<T*>(&x)
>
>The effect is the same as dereferencing any pointer. You get the lvalue
>associated with the address stored in the pointer.

I think everybody realizes what happens in terms of expressions (i.e.
in terms of sequences of signs written in the program text). The issue
IMHO is what kind of code the compiler is authorized to generate; the
type T could have, for instance, padding at the beginning: is it ok to
"dereference"? Are there architectures that would trap the attempt?

Come to think of it, the situation of addressof is similar to

    &array[array_size]

which is undefined behavior but is usually ok because no access past
the end is actually made.

Another example: suppose the implementation tags the result of
reinterpret_cast<U*> as "unusable except for pointer-casting". As long
as it uses an injective function for the mapping so that the cast-back
works as expected, is it conforming?

In conclusion, I think it's not a matter of what is written in the
standard but of what "should" be. The current wording is IMHO too weak
to give an answer here. But it can be changed according to the intent.
Off the top of my head I would say the requirements of

   *reinterpret_cast<U*> (&x)

should be the same as

    reinterpret_cast<U&> (x)

which is exactly the problem: you should be able to do type punning
with both forms, and in some cases the type punning should be
portable. One form that I think should be portable is indeed the
reinterpret_cast to char and unsigned char. According to 3.10/15 it
seems that I can access the value of an object through an lvalue of
char type, but I don't see any portable way to obtain such an lvalue.
(I say "it seems" because the text actually says the behavior is
undefined for the types *not* in the list, not that it is defined for
the types that are in the list.) I would say that's a hole in the
standard. What do you think?
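
Here is a sketch of the kind of access I mean. It obtains the unsigned char lvalues via reinterpret_cast (the step whose portability I'm questioning) and compares them with a copy made by std::memcpy. The name `bytes_match_memcpy` is hypothetical, and the comparison only shows what an implementation does in practice, not what the wording guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if the bytes read through an unsigned char lvalue
// (obtained with reinterpret_cast) agree with a std::memcpy copy.
template <class T>
bool bytes_match_memcpy(const T& obj) {
    unsigned char copy[sizeof(T)];
    std::memcpy(copy, &obj, sizeof(T));

    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (p[i] != copy[i]) return false;  // access through a char lvalue
    }
    return true;
}
```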

Genny.

