From: Martin Bonner (martin.bonner_at_[hidden])
Date: 2006-03-08 04:49:25


----Original Message----
From: boost-bounces_at_[hidden]
[mailto:boost-bounces_at_[hidden]] On Behalf Of Andy Little
Sent: 07 March 2006 17:43
To: boost_at_[hidden]
Subject: Re: [boost] [bitfield] Initial bitfield proposal available in
the vault

> "Martin Bonner" wrote
>> Andy Little wrote
>>> "Martin Bonner" <martin.bonner_at_[hidden]> wrote in message
>>> news:D997BF79D1E92C4793B7FCC04B4F90A51D79B6_at_pigeon.pi.local...
>>>> ----Original Message----
>>>> From: Emile Cormier
>>>>
>>>>> The bitfield mechanism relies on this assumption: Unions of
>>>>> non-polymorphic, non-derived objects, having the exact same
>>>>> underlying data member type, will have the same size as this
>>>>> underlying data member type. I'm no language lawyer, so please let
>>>>> me know if this is a safe and portable assumption.
>>>>
>>>> I'm not quite sure what you mean, but given:
>>>> struct a { unsigned char ch; };
>>>> struct b { unsigned char ch; };
>>>> union u { a theA; b theB; };
>>>> then you are not guaranteed that sizeof(u) == sizeof(unsigned
>>>> char).
>>>
>>> Though in practice you can use:
>>>
>>> BOOST_STATIC_ASSERT(sizeof(u) == sizeof(unsigned char))
>>
>> My point was exactly that you CANNOT use that. (On a certain class
>> of machine).
>
> Why not? Will it a) compile but be incorrect, b) not compile but be
> incorrect, or c) not compile but be correct?

The assert will fire.
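
To make the failure mode concrete, here is a minimal sketch using the
structs from earlier in the thread. It assumes a C++11 compiler
(static_assert and constexpr stand in for BOOST_STATIC_ASSERT), and the
name union_is_char_sized is made up for illustration. It separates what
the language guarantees from what the library would be assuming.

```cpp
#include <cstddef>

struct a { unsigned char ch; };
struct b { unsigned char ch; };
union u { a theA; b theB; };

// Guaranteed everywhere: a union is at least as large as each member,
// and a struct is at least as large as its only data member.
static_assert(sizeof(u) >= sizeof(a), "union smaller than a member");
static_assert(sizeof(a) >= sizeof(unsigned char), "struct smaller than its member");

// NOT guaranteed: this is true only where the implementation adds no
// padding to the wrapper structs. On a word-addressed target of the kind
// discussed here it would be false (hypothetical name, for illustration).
constexpr bool union_is_char_sized = (sizeof(u) == sizeof(unsigned char));
```

Turning union_is_char_sized into a static assertion is exactly the check
that would refuse to compile on the class of machine in question.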

> How do you store an unsigned char then? (And whatever way that is
> just pretend to the hardware that the struct is an unsigned char)

Storing an unsigned char is expensive. It involves extra bit twiddling.
(See below)
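
The bit twiddling can be sketched roughly like this. It is a simulation,
not code for a real word-addressed target: it models one 36-bit word
holding four 9-bit bytes inside a std::uint64_t, and the names
store_byte, load_byte, kBitsPerByte and kByteMask are invented for
illustration.

```cpp
#include <cstdint>

// Hypothetical model: a 36-bit word holding four 9-bit bytes,
// simulated in the low bits of a 64-bit integer.
const unsigned kBitsPerByte = 9;
const std::uint64_t kByteMask = 0x1FF;  // nine 1-bits

// Store `value` into byte slot `index` (0 = least significant) of `word`.
// The rest of the word has to be preserved, so a store is really a
// read-modify-write: mask out the old byte, then OR in the new one.
std::uint64_t store_byte(std::uint64_t word, unsigned index, unsigned value) {
    const unsigned shift = index * kBitsPerByte;
    const std::uint64_t cleared = word & ~(kByteMask << shift);
    return cleared | ((static_cast<std::uint64_t>(value) & kByteMask) << shift);
}

// Load the byte in slot `index`: shift it down and mask off its neighbours.
unsigned load_byte(std::uint64_t word, unsigned index) {
    return static_cast<unsigned>((word >> (index * kBitsPerByte)) & kByteMask);
}
```

A whole-word store needs none of this, which is why the smaller
pointer-to-word is the cheaper one to dereference.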
>
>>>> On word addressed machines (which /are/ still being built), it is
>>>> almost certain that the minimum size for a struct is a complete
>>>> word. This is because the C and C++ standards effectively promise
>>>> that pointers to structs are all of the same size (the size of a
>>>> pointer-to-struct does not depend on the contents of the struct).
>>>> It is desirable that a pointer-to-struct be the smaller,
>>>> cheaper-to-dereference pointer to word (rather than the larger
>>>> more-expensive-to-dereference pointer to char), so the smallest
>>>> struct has to occupy a whole word.
>>>
>>> I don't see why the size of a pointer to a struct affects the size of
>>> a struct, which in the case of an empty struct is often 1 byte?
>>
>> I don't think you have understood what a word addressed machine is!
>>
>> On most modern architectures there are 8 bits stored at (for
>> example) 0x100 and another 8 bits at 0x101. The 32 bits at 0x100
>> cover 0x100, 0x101, 0x102, and 0x103.
>>
>> On a word addressed machine, there may be 36 bits stored at 02000 and
>> another (different) 36 bits stored at 02001. A simple 36-bit pointer
>> can address individual words, but not sub-units within those words.
>> To address individual bytes, you need a double-word pointer. One
>> word identifies the word, and a few bits within the second word
>> identify which byte you are addressing.
>>
>> On such a machine, it makes sense for an empty struct to occupy a
>> whole word (which is four nine-bit bytes), so that a pointer to
>> struct can (always) be a single word pointer.
>
> Sounds like there is a choice. Either make unsigned char 36 bits and
> use a small pointer or make unsigned char 9 bits and use a large
> pointer.
Yup. And the COMPILER writer gets to make that choice.

> I don't know whether C++ will allow both?

It will allow the compiler writer to make either of those choices.
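
Portable code can at least ask which choice was made. A minimal sketch,
using only standard facilities (CHAR_BIT from <climits> and sizeof); the
names empty_struct, bits_per_char, min_struct_size and
char_pointer_is_wider are made up for illustration.

```cpp
#include <climits>
#include <cstddef>

struct empty_struct {};

// Bits in a char as chosen by the implementation. The standard only
// requires at least 8; in the example discussed it could be 9 or 36.
const int bits_per_char = CHAR_BIT;

// Size of the smallest possible struct, measured in chars.
// Often 1, but nothing forbids a whole word's worth.
const std::size_t min_struct_size = sizeof(empty_struct);

// True on implementations where a byte pointer has to carry more
// information than a pointer to struct, as on the word-addressed
// machine described in this thread.
const bool char_pointer_is_wider = sizeof(char*) > sizeof(empty_struct*);
```

On a typical byte-addressed desktop compiler this gives 8, 1 and false,
but none of those three values is promised by the language.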

> It reminds me of the old Microchip PIC architecture
> though. Last I looked they were working to make their hardware
> compatible with C FWIW and just increasing the number of address
> lines, because they were previously so difficult to deal with, with
> the separate extra bits in an address and so on (though that was a
> kind of paged memory IIRC). IOW in their case they
> realised the downside of the idea as I understand it and moved to one
> drop linear addressing for later designs.

I believe it is a similar sort of idea. This is the same sort of thing
as Prime changing their instruction set so that memset( &ptr, 0,
sizeof(ptr) ) set ptr to a null pointer (the natural representation of a
null pointer on a Prime was 07777/000000).
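
For anyone who has not met the memset issue: the language says that
assigning 0 to a pointer yields a null pointer, but it does not say that
a null pointer is stored as all-bits-zero. A rough sketch of the
difference follows; the function names are hypothetical, and the memset
variant is the non-portable idiom, shown only for contrast.

```cpp
#include <cstddef>
#include <cstring>

// Portable: the compiler converts the constant 0 to whatever bit
// pattern the implementation uses for a null pointer.
int* null_via_assignment() {
    int* p = 0;
    return p;
}

// Non-portable: produces an all-bits-zero pointer, which is a null
// pointer only if the implementation represents null that way.
int* null_via_memset() {
    int* p;
    std::memset(&p, 0, sizeof p);
    return p;
}

// Inspects the object representation of a genuine null pointer and
// reports whether every byte of it is zero on this implementation.
bool null_is_all_bits_zero() {
    int* p = 0;
    unsigned char bytes[sizeof p];
    std::memcpy(bytes, &p, sizeof p);
    for (std::size_t i = 0; i < sizeof p; ++i) {
        if (bytes[i] != 0) return false;
    }
    return true;
}
```

On mainstream platforms null_is_all_bits_zero() returns true and the two
idioms agree; on an implementation with a different null representation
they would not.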
>
> Maybe I have got the wrong end of the proverbial stick again though ?

My point is that assuming
>>> BOOST_STATIC_ASSERT(sizeof(u) == sizeof(unsigned char))

means that there is a class of C++ implementations where the library
will not work. It is then up to the library author to consider whether
that class is sufficiently important to change his implementation for (it
may well not be).

-- 
Martin Bonner
Martin.Bonner_at_[hidden]
Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ,
ENGLAND Tel: +44 (0)1223 203894
