From: Andy Glew (glew_at_[hidden])
Date: 2000-01-20 10:49:16

> How would that be possible? Surely the standard mandates that sizeof(T) is
> a compile time constant for all T

This is true for C/C++, and hence is a significant limitation for optimizing compilers.
Moreover, because it is desirable to use the same padding rules for all languages,
to facilitate the interchange of data, we tend to enforce this rule for all languages.
Legacy ABIs.

Nevertheless, there are other contexts where padding is not necessarily to
N*64 bits. Let me use uint8, uint16, uint32, uint64 for clarity:

struct { uint8 a,b,c; }
    only requires 3 bytes of storage, and has no alignment constraints.


struct { uint16 a; uint8 b; }
    requires 3 bytes of storage, but must be aligned to 16 bits to avoid
    misalignment - so padding to a multiple of 16 bits is much more strongly

Now, these are just small examples to illustrate the issue. Realistically,
our compilers always pad to 32 bits, and nearly always to 64 bits
- the legacy ABI that only required 32 bit alignment has been a perennial
performance problem.
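To see what a given compiler actually does with the two small structs above, one can simply ask it. This is a minimal sketch: the `<cstdint>` fixed-width types stand in for uint8 and uint16, and the names `Bytes3`, `Short1Byte1` and `tail_padding` are made up for illustration. The sizes mentioned in the comments are what mainstream ABIs typically produce, not something the language guarantees.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names for the two anonymous structs in the text.
struct Bytes3      { std::uint8_t a, b, c; };            // 3 bytes of data, byte aligned
struct Short1Byte1 { std::uint16_t a; std::uint8_t b; }; // 3 bytes of data, 16-bit aligned

// Bytes the compiler added to T beyond the data_bytes it actually holds.
template <typename T>
constexpr std::size_t tail_padding(std::size_t data_bytes) {
    return sizeof(T) - data_bytes;
}
```

On a typical implementation `sizeof(Bytes3)` is 3 while `sizeof(Short1Byte1)` is 4, so `tail_padding<Short1Byte1>(3)` reports the single byte added to keep array elements 16-bit aligned.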


struct { uint8 arr[27]; }
        only requires 27 bytes, no alignment constraints
        under the old i386 ABI, would be padded to 28 bytes (a multiple of 32 bits)
struct { uint64 arr[3]; uint8 arr2[3]; }
        requires 27 bytes, but should be aligned on a 64 bit boundary.
        under the old i386 ABI would be padded to 28 bytes;
        now we try to pad this to 32 bytes (a multiple of 64 bits, *and* a multiple
        of a 32 byte cache line)

struct { uint8 arr[51]; }
        only requires 51 bytes, no alignment
        under the old ABI would be padded to 52 bytes (13x4 bytes)
struct { uint64 arr[6]; uint8 arr2[3]; }
        only requires 51 bytes
        should be aligned to 8 bytes (64 bits)
        under the old ABI would be 52 bytes in size;
        under proposed new ABI might be 56 bytes (7x8 bytes)
        or even just rounded up to 2 full cache lines, 64 bytes, in size.
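The padded sizes quoted in these examples are all instances of rounding a byte count up to a multiple of some unit. A minimal sketch of that arithmetic, where the helper name `round_up` is an assumption of this example rather than anything from an ABI document:

```cpp
#include <cstddef>

// Round `size` up to the next multiple of `unit` (both in bytes, unit > 0).
constexpr std::size_t round_up(std::size_t size, std::size_t unit) {
    return (size + unit - 1) / unit * unit;
}
```

With this, `round_up(27, 4)` gives the 28 bytes of the old ABI, `round_up(27, 8)` gives 32, `round_up(51, 8)` gives 56, and `round_up(51, 32)` gives the 64 bytes of two full 32 byte cache lines.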

Anyway, my point is this: We may not always be just rounding up
to 32 or 64 bits in size. C++ constrains us to always round a struct
to the same size, independent of context, but similarly sized "real structs"
don't necessarily have the same alignment rules, and I don't see
any language constraints in this regard, except for unions.

Even if we fall back to the simple rule, of "structs that have the same
`real size' are padded to the same size", there are very strong reasons
to not always pad to a multiple of 32, 64, or 128 bits, or of 32 bytes. I think
that the following may well be the rule in the future:

    8 bits -> leave 8 bits
    16 bits -> leave 16 bits
    24 bits -> pad to 32 bits
    32 bits -> leave 32 bits
    40, 48, 56, 64 bits -> round to 64 bits
        round to 32 byte cache lines if amount of wasted
            space is less than 25%
        otherwise round to 64 bits
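One way to read the rule above as code is sketched below. Several things here are assumptions made for the example, not part of the original rule: sizes are in bytes rather than bits, the cache-line step applies only to structs larger than 64 bits, "wasted space" is measured as a fraction of the padded size, and the names `kCacheLine`, `round_up` and `padded_size` are invented.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLine = 32; // bytes; the cache line size used in the text

// Round `size` up to the next multiple of `unit` (both in bytes, unit > 0).
constexpr std::size_t round_up(std::size_t size, std::size_t unit) {
    return (size + unit - 1) / unit * unit;
}

// Padded size in bytes for a struct whose data occupies `real` bytes.
constexpr std::size_t padded_size(std::size_t real) {
    return real <= 2 ? real   // 8 or 16 bits: leave alone
         : real <= 4 ? 4      // 24 or 32 bits: 32 bits
         : real <= 8 ? 8      // 40..64 bits: 64 bits
         // Larger: take whole cache lines if that wastes less than 25%
         // of the padded size (assumed reading), else a multiple of 64 bits.
         : (round_up(real, kCacheLine) - real) * 4 < round_up(real, kCacheLine)
               ? round_up(real, kCacheLine)
               : round_up(real, 8);
}
```

Under these assumptions a 27 byte struct becomes 32 bytes and a 51 byte struct becomes 64 bytes, while a 33 byte struct is only rounded to 40 bytes because filling a second cache line would waste nearly half of it.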

16 byte cache lines, and 128 bit data types, just make things
more interesting.

Again, my point is that just assuming padding to a multiple of 4 or 8 bytes
is not realistic. Cache line padding is being done even now, with
the appropriate compiler switches.

By the way, this conversation causes me to wonder:
C and C++'s layout rules prohibit me from doing context sensitive padding.
I.e. if I have
    struct byte3 { char a, b, c; }
and I want to compute addresses efficiently in arrays, so that
    struct byte3 arr[128]; 
uses shifts, then I must pad struct byte3 to 4 bytes.
And then, if I do
        struct foo {
            struct byte3 b3;
            char x;
        };
I am effectively required to make foo be at least 5 bytes
in size, and probably 8 bytes after rounding, rather than the
4 bytes that it needs.
I.e. I must do
rather than
Mainly because there is code out there that assumes that you
can do
        struct foo f;
but also
        struct foo arr[128];
        assert( (char*)(&arr[1]) - (char*)(arr) == sizeof(struct foo) );
I.e. the need for efficient array indexing requires me to do inefficient
structure packing.
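The trade-off can be reproduced with a present-day compiler. This is only a sketch: `alignas(4)` (C++11) is used here as a stand-in for a compiler that chooses to pad byte3 to 4 bytes, and the names `byte3_padded`, `foo_packed`, `foo_padded` and `array_stride` are invented for the example. The sizes in the comments are what common ABIs give, not guarantees.

```cpp
#include <cstddef>

// byte3 as usually laid out: 3 bytes, so arr[i] lives at base + 3*i.
struct byte3 { char a, b, c; };

// Hypothetical padded variant: alignas(4) forces sizeof up to 4, so arr[i]
// lives at base + (i << 2), but every enclosing struct inherits the padding.
struct alignas(4) byte3_padded { char a, b, c; };

struct foo_packed { byte3 b3; char x; };        // 3 + 1 bytes
struct foo_padded { byte3_padded b3; char x; }; // 4 + 1 bytes, rounded up to 8

// Distance in bytes between consecutive elements of a T array. The language
// requires this to equal sizeof(T), which is the invariant existing code uses.
template <typename T>
constexpr std::size_t array_stride() {
    return sizeof(T[2]) / 2;
}
```

Here `sizeof(foo_packed)` is typically 4 and `sizeof(foo_padded)` is typically 8, which is the cost described above: making the array stride a power of two doubles the size of a struct that only needs 4 bytes.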
Anyway, my question is:  if arrays become less important, as everyone starts
using STL vectors and valarrays, maybe I can start letting sizeof be the
minimal value implied by alignment, and rely on specializations in my
vendor provided library to make indexing efficient?
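One possible reading of that question is a library container that keeps `sizeof(T)` minimal but stores elements at a power-of-two stride internally, so that indexing is a shift. The sketch below is purely hypothetical: `shift_indexed_array` and `next_pow2` are invented names, it is not how any vendor's `std::vector` works, and it only handles trivially copyable, default-constructible types because it copies elements with `memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

struct byte3 { char a, b, c; }; // sizeof is typically 3

// Smallest power of two >= n (for n >= 1).
constexpr std::size_t next_pow2(std::size_t n, std::size_t p = 1) {
    return p >= n ? p : next_pow2(n, p * 2);
}

// Hypothetical container: T keeps its minimal sizeof, while elements are
// stored `stride` bytes apart. Because stride is a compile-time power of
// two, the compiler can turn i * stride into a shift.
template <typename T>
class shift_indexed_array {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch copies elements with memcpy");
public:
    static constexpr std::size_t stride = next_pow2(sizeof(T));

    explicit shift_indexed_array(std::size_t n) : bytes_(n * stride), size_(n) {}

    std::size_t size() const { return size_; }

    // Returns a copy of element i.
    T get(std::size_t i) const {
        assert(i < size_);
        T value;
        std::memcpy(&value, bytes_.data() + i * stride, sizeof(T));
        return value;
    }

    // Overwrites element i.
    void set(std::size_t i, const T& value) {
        assert(i < size_);
        std::memcpy(bytes_.data() + i * stride, &value, sizeof(T));
    }

private:
    std::vector<unsigned char> bytes_; // n * stride bytes of raw storage
    std::size_t size_;
};
```

The design choice is that the padding lives in the container rather than in the type, so a `byte3` embedded in another struct still costs 3 bytes. The price is that the container can hand out copies but not plain `T*` pointers into a contiguous array of `T`.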
This doesn't help
        struct bar { uint16 ab; uint8 c; };
        struct baz { struct bar abc; uint8 x; };
which is forced to waste half of its memory because of alignment,
but it's some help.
