sizeof(boost::fusion::map)

I was very excited to find the boost::fusion library, expecting that I could use it to implement structs with introspection. I still think I can, but I was disappointed to see that I'm paying a size penalty. I'd have expected all the book-keeping to be handled at compile time.

The following code prints

    sizeof(Foo) = 12 sizeof(FooS) = 4

with g++ 4.2 or 4.4 on an OS X 10.6 or CentOS box. Here Foo is boost::fusion::map<boost::fusion::pair<Name, int> >.

Where are the extra 8 bytes going? With more complex boost::fusion objects I see an overhead of up to 16 bytes. gdb indicates that foo.data.vec has the desired size, but foo.data is larger, and foo is larger still; however, it doesn't indicate any data members --- is this some sort of alignment issue?

Is there a way to generate a "packed boost::fusion::map" without this space overhead?

R

    #include <iostream>
    #include "boost/fusion/container.hpp"

    struct Name {};

    int main() {
        typedef boost::fusion::map<boost::fusion::pair<Name, int> > Foo;
        struct FooS { int age; };

        std::cout << "sizeof(Foo) = " << sizeof(Foo)
                  << " sizeof(FooS) = " << sizeof(FooS) << std::endl;
    }

Robert Lupton the Good wrote:
[snip]
The following code prints sizeof(Foo) = 12 sizeof(FooS) = 4 with g++ 4.2 or 4.4 on an os/x 10.6 or Centos box.
Here Foo is boost::fusion::map<boost::fusion::pair<Name, int> >
[snip]

I have no idea why there is that space overhead. This problem is not Fusion-specific, though.

    #include <iostream>

    struct X{};
    struct A:X{int i;};
    struct B:X{A a;};

    int main() {std::cout << sizeof(A) << " " << sizeof(B) << std::endl;}

When compiled with gcc 4.5.1 i686-pc-mingw32 and -O2, the output is "4 8". When compiled with MSVC 2010 (x86) and /O2, "4 4" is printed.

-Christopher

Christopher Schmidt wrote:
[snip]
When compiled with gcc 4.5.1 i686-pc-mingw32 and -O2, the output is "4 8". When compiled with MSVC 2010 (x86) and /O2, "4 4" is printed.

I think gcc is correct. There are two distinct instances of X in B, and those may not have the same address. That's why there is that 4-byte displacement.

It is pretty much the same in Fusion. All built-in sequences and the vector data storage share a common base class, fusion::sequence_root. That's where the 4 or 8 extra bytes, respectively, come from.

-Christopher

On 11/05/10 18:56, Christopher Schmidt wrote:
Christopher Schmidt wrote: [snip] I think gcc is correct. There are two distinct instances of X in B, and those may not have the same address. That's why there is that 4 byte displacement. [snip]
I think you're right Christopher about two distinct instances not having the same address. There was a recent thread on that subject in comp.lang.c++: http://groups.google.com/group/comp.lang.c++/browse_thread/thread/847b23e27a...

On 11/6/2010 7:56 AM, Christopher Schmidt wrote:
[snip]
I think gcc is correct. There are two distinct instances of X in B, and those may not have the same address. That's why there is that 4 byte displacement. It is pretty much the same in Fusion. All inbuilt sequences and the vector data storage share a common base class - fusion::sequence_root - that's where the 4 resp. 8 bytes come from.

Good analysis, Christopher. We should do something about this. sequence_root is only there for easy tag detection. There should be a better way.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

On Nov 5, 9:49 pm, Joel de Guzman <j...@boost-consulting.com> wrote:
[snip]
Good analysis, Christopher. We should do something about this. sequence_root is only there for easy tag detection. There should be better way.

Joel,

If I am not wrong, this is related to a question I asked some weeks ago, when I was surprised (disappointed?) to find that fusion::vector<double, double, double> has a memory layout different from double[3] (or boost::array<double, 3>). I didn't look at the sizeof then, but now that Robert mentions it I tried:

    std::clog << sizeof(double) << std::endl;
    std::clog << sizeof(vector<double, double, double>) << std::endl;
    std::clog << sizeof(boost::array<double, 3>) << std::endl;

which on my platform emits:

    8
    28
    24

It would be interesting to know what Boost.Fusion stores in the extra bytes and why on earth neither the size nor the memory layout is the same as that of the contained data. It seems that the inconsistency in the size raises more eyebrows than the unexpected memory layout.

BTW, besides these sharp edges, I am also very excited to have found Boost.Fusion. Thank you, Joel.

Thank you,
Alfredo

alfC wrote:
[snip]
It would be interesting to know what does Boost.fusion stores in the extra bytes and why on earth neither the size nor the memory layout is the same as the contained data. It seems that the inconsistency on the size raises more eyebrows than the unexpected memory layout.

Nothing is stored in those 4 additional bytes. As I said above, one additional byte is added by the compiler to the 3*sizeof(double) bytes so that it can distinguish between the two instances of a common base class (fusion::sequence_root): one in the vector (fusion::vector<double, double, double>) and one in the internal vector data storage (fusion::vector3<double, double, double>). To get size and alignment right, this one byte actually results in a 4-byte overhead on your compiler.

-Christopher

On Nov 6, 5:17 am, Christopher Schmidt <mr.chr.schm...@online.de> wrote:
[snip]
Nothing is stored in those 4 additional bytes. As I said above one additional byte is added by the compiler to the 3*sizeof(double) bytes so one is able to distinguish between the two instances of a common base class (fusion::sequence_root) of the vector (fusion::vector<double, double, double>) and the internal vector data storage (fusion::vector3<double, double, double>). To get size and alignment right this one byte actually emits a 4-byte overhead on your compiler.

Isn't there something called the Empty Base Optimization that helps with this problem? (Not sure if it applies here.) So there is no way around it (unless the internal design is changed)? What do you mean by "my compiler"? Are there compiler options that can optimize this? (I use gcc 4.4.)
_______________________________________________
Boost-users mailing list
Boost-us...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

On 11/06/10 07:33, alfC wrote: [snip]
On Nov 6, 5:17 am, Christopher Schmidt <mr.chr.schm...@online.de> wrote:
[snip]
Nothing is stored in those 4 additional bytes. As I said above one additional byte is added by the compiler to the 3*sizeof(double) bytes so one is able to distinguish between the two instances of a common base class (fusion::sequence_root) of the vector (fusion::vector<double, double, double>) and the internal vector data storage (fusion::vector3<double, double, double>). To get size and alignment right this one byte actually emits a 4-byte overhead on your compiler.
Isn't something called Empty Base Optimization that helps with this problem? (not sure if it applies here).
So there is no way around it? (unless the internal design is changed?)
[snip]

Not if the compiler conforms to the standard, AFAICT. However, apparently, according to Christopher's earlier post:

http://article.gmane.org/gmane.comp.lib.boost.user/63553

MSVC 2010 (x86) believes avoiding this restriction has some advantage. There is a link in this post:

http://groups.google.com/group/comp.lang.c++/msg/ab0c802f7b5e9072

to comp.lang.c++ which maybe explains more. That post also has a code attachment illustrating that EBO doesn't always work as one might expect (see the struct empties_inherit). The code attachment also shows a workaround using:

https://svn.boost.org/trac/boost/browser/sandbox/variadic_templates/boost/co...

HTH.
-Larry

Joel de Guzman wrote:
On 11/6/2010 7:56 AM, Christopher Schmidt wrote: [snip]
I think gcc is correct. There are two distinct instances of X in B, and those may not have the same address. That's why there is that 4 byte displacement. It is pretty much the same in Fusion. All inbuilt sequences and the vector data storage share a common base class - fusion::sequence_root - that's where the 4 resp. 8 bytes come from.
Good analysis, Christopher. We should do something about this. sequence_root is only there for easy tag detection. There should be a better way.

I committed a preliminary fix to the trunk. See

https://svn.boost.org/trac/boost/changeset/66411

for more details. It does not break Fusion and Proto on gcc 4.5.1 -std=c++0x, gcc 4.5.1, msvc 10, msvc 9 and gcc 3.4.2.

I removed fusion::sequence_root and added an implicit conversion operator to fusion::sequence_base.

https://svn.boost.org/trac/boost/browser/trunk/boost/fusion/support/sequence...

Instead of using boost::is_base_of<fusion::sequence_root, Seq> to detect built-in Fusion containers, we can now use boost::is_convertible<Seq, fusion::detail::from_sequence_convertible_type>.

https://svn.boost.org/trac/boost/browser/trunk/boost/fusion/support/is_seque...

This solution is not pretty, but it does not make the situation worse either.

-Christopher

On 11/6/10 8:49 PM, Christopher Schmidt wrote:
[snip]
I committed a preliminary fix to the trunk. See
https://svn.boost.org/trac/boost/changeset/66411
for more details. It does not break Fusion and Proto on gcc 4.5.1 -std=c++0x, gcc 4.5.1, msvc 10, msvc 9 and gcc 3.4.2 .
[snip]
This solution is not pretty but it does not make the situation worse either.

Thanks, Christopher. That will do for now. I'm sure you can do better though ;-)

Cheers,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

On 11/05/10 13:01, Robert Lupton the Good wrote:
I was very excited to find the boost::fusion library, expecting that I could use it to implement structs with introspection. I still think I can, but I was disappointed to see that I'm paying a size penalty. I'd have expected all the book-keeping to be handled at compile time.
The following code prints sizeof(Foo) = 12 sizeof(FooS) = 4 with g++ 4.2 or 4.4 on an os/x 10.6 or Centos box.
Here Foo is boost::fusion::map<boost::fusion::pair<Name, int> >
Where are the extra 8 bytes going? With more complex boost::fusion objects I see an overhead of up to 16 bytes. gdb indicates that foo.data.vec has the desired size, but foo.data is larger, and foo is larger still; however, it doesn't indicate any data members --- is this some sort of alignment issue?
Is there a way to generate a "packed boost::fusion::map" without this space overhead?
Robert,

I'm not familiar with fusion::map and don't have any idea how to make a "packed boost::fusion::map"; however, I think the container template specialization found here:

http://svn.boost.org/svn/boost/sandbox/variadic_templates/boost/composite_st... /container_all_of_aligned.hpp

could be modified to provide what you want. The modification would simply replace the current "enum keys" with whatever key you might desire.

Let me explain "enum keys". Currently, the container template in container_all_of_aligned.hpp is much like a boost::fusion::vector; however, the syntax for accessing the component values is a bit different. For a fusion vector, the components are accessed with:

    at_c<KeyValue>(fusion_vector)

[where KeyValue is some integral constant, say of type unsigned, between 0 and size(fusion_vector)-1]

In contrast, a container_all_of_aligned accesses its components with a templated member function:

    aligned_container.project<KeyValue>()

where KeyValue is some integral type suitable as the 2nd argument to the boost::mpl::integral_c template. For example, the Index0 argument to the template:

    template < class Index0 , typename... Components >
    container < tags::all_of_aligned , Index0 , Components... >

located here:

https://svn.boost.org/trac/boost/browser/sandbox/variadic_templates/boost/co...

is actually an instance of boost::mpl::integral_c<KeyType,KeyValue> where the KeyValue corresponds to the 0-th component. KeyType could be unsigned (corresponding to the KeyType for fusion::vector) or, to be more user friendly, an enum type, e.g.

    enum component_indices { index_0 , index_1 ... , index_n };

Now, the method used by:

    aligned_container.project<KeyValue>()

to access the component is just to call the overloaded project method:

    static TailComponent const&
    project ( index_part index_arg , char const* buffer_composite )
    {
        void const* tail_buffer=buffer_composite+comp_part::offset;
        TailComponent const* tail_ptr=static_cast<TailComponent const*>(tail_buffer);
        return *tail_ptr;
    }

located here:

https://svn.boost.org/trac/boost/browser/sandbox/variadic_templates/boost/co...

where index_part might be boost::mpl::integral_c<component_indices,index_0>. The actual component values are stored at offsets in the 2nd arg, buffer_composite, to the project static function. That offset, comp_part::offset, is a compile-time function of index_part.

Now, instead of:

    boost::mpl::integral_c<KeyType,KeyValue>

any type could be used. The only important thing is that, because it's passed by value to project, it should be easy to copy or just empty, which boost::mpl::integral_c<KeyType,KeyValue> is. However, any other type can easily be adapted to this purpose by simply wrapping it in some templated empty type, such as that in boost/mpl/aux_/type_wrapper.hpp. Thus, instead of:

    typedef boost::mpl::integral_c<KeyType,KeyValue> index_part;

for some KeyType, and KeyValue which is a KeyType, an index_part could be:

    typedef boost::mpl::aux::type_wrapper<KeyClassType> index_part;

for some KeyClassType which would be the first element of one of the fusion::pair<KeyClassType,ValueType>'s making up the map. Actually, since no value is used, instead of fusion::pair, an mpl::pair could be used.

Thus, since KeyClassType would not occur in any data structure, but only as a template argument to a function, it could cause no memory use (except maybe in the call stack, I guess). IOW, the algorithm used to calculate the size of buffer_composite does not depend in any way on the KeyClassType.

Of course with this change, the layout_composite<,,C...>::scanned:

https://svn.boost.org/trac/boost/browser/sandbox/variadic_templates/boost/co...

would have to be changed to take mpl::pair's instead of just components as the pack argument, C... .

Robert, if you're interested in this approach, I'll start adding something similar to fusion::map to composite_storage. If not, then I'll just wait a while before doing that (I've been planning on doing it anyway).

-regards,
Larry
participants (5)
- alfC
- Christopher Schmidt
- Joel de Guzman
- Larry Evans
- Robert Lupton the Good