Boost Users :

From: Daryle Walker (darylew_at_[hidden])
Date: 2008-08-18 19:05:10


Have you ever read a thread in a programming news-group, where some
newbie asks how to implement some wacky scheme? Several far-out
solutions come forth, then flame-wars erupt over the pros & cons.
Then finally someone asks what the newbie really needs and gives a
completely different solution based on that response. Basically, the
newbie was trying to do something in a manner s/he shouldn't even
have thought of, let alone supposed was good enough to try
implementing. This is one of those times.

On Aug 16, 2008, at 1:32 PM, Zeljko Vrba wrote:

> Integer types in C and C++ are a mess. For example, I have made a
> library
> where a task is identified by a unique unsigned integer. The extra
> information
> about the tasks is stored in a std::vector. When a new task is
> created, I use
> the size() method to get the next id, I assign it to the task and then
> push_back the (pointer to) task structure into the vector. Now,
> the task
> structure has also an "unsigned int id" field. In 64-bit mode,
>
> sizeof(unsigned) == 4, sizeof(std::vector::size_type) == 8
>
> I get a warnings about type truncation, and obviously, I don't like
> them. But
> I like explicit casts and turning off warnings even less. No, I
> don't want to
> tie together size_type and task id type (unsigned int). One reason is
> "aesthetic", another reason is that I don't want the task id type
> to be larger
> than necessary (heck, even a 16-bit type would have been enough),
> because the
> task IDs will be copied verbatim into another std::vector<unsigned>
> for further
> processing (edge lists of a graph). Doubling the size of an
> integer type shall
> have bad effects on CPU caches, and I don't want to do it.
>
> What to do? Encapsulate into "get_next_id()" function? Have a
> custom size()
> function/macro that just casts the result of vector::size and
> returns it?

Well, using a custom "size" function will shut the compiler up. But
the "get_next_id" function is better because you can change the
implementation of the ID and external code shouldn't have to change.
(The ID is a typedef and not a naked "unsigned," right?) Anyway,
does it really matter; this ID generation code is only used during
task construction, right?
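
A minimal sketch of that "get_next_id" encapsulation, assuming a typedef'd ID and a registry class that owns the vector. Every name here (task_id, task_info, task_registry, add_task) is hypothetical, and the original stores pointers to task structures rather than the plain values used below. The point is only that the narrowing from size_type happens in exactly one checked place.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical names throughout; only the idea comes from the thread.
typedef unsigned int task_id;   // the one place to change the ID's representation

struct task_info
{
    task_id id;
};

class task_registry
{
    std::vector<task_info> tasks_;

public:
    // The single spot where vector's size_type is narrowed to task_id.
    task_id get_next_id() const
    {
        std::vector<task_info>::size_type const n = tasks_.size();

        if ( n > std::numeric_limits<task_id>::max() )
            throw std::overflow_error( "task id space exhausted" );
        return static_cast<task_id>( n );
    }

    // Creates a task record and returns the ID it was given.
    task_id add_task()
    {
        task_info t = { get_next_id() };

        tasks_.push_back( t );
        return t.id;
    }
};
```

Shrinking task_id to a 16-bit type later would then touch only the typedef, and the range check would still hold.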

Actually, writing this response is hard. I've read 20+ responses,
talking about how much built-in integers "suck." Then I decided to
look at the original post again, and something bugged me about it.
Why are you using a number to refer to a container element in the
first place? Then I realized that you can't use iterators because
they're not stable with vector's element adds or removes. Then I
wondered, why are you using a vector in the second place? Wouldn't a
list be better, so you can add or remove without invalidating
iterators, leaving them available to implement your ID type? And you
don't seem to need random-access to various task elements. (A deque
is unsuitable for the same reason as a vector.) Then I thought,
these tasks just store extra information, and have no relation to
each other (that you've revealed). So why are you using any kind of
container at all? You have no compunctions about using dynamic
memory, so just allocate with shared-pointers:

//================================================
#include <boost/shared_ptr.hpp>

class task
{
     struct task_data
     {
         // whatever...
     };
     typedef boost::shared_ptr<task_data> sp_type;

     sp_type data_;

     // Hidden member-wise constructor
     explicit task( sp_type d ) : data_( d ) {}

public:
     // Constructors of various configurations, possibly including
     // a default constructor; but use the automatically-defined
     // copy-constructor and destructor
     task(/*whatever*/) : data_( new task_data(/*whatever*/) ) {}

     // Forced copy
     task clone() const
     {
         sp_type result_data( new task_data(*this->data_) );
         task result( result_data );

         return result;
     }

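     // Use automatically defined copy-assignment operator

     // Equality: choose shallow (same task) or deep (equal contents);
     // the deep form needs task_data to define its own operator ==
     bool operator ==( task const &o ) const
     {
         //return this->data_ == o.data_; // shallow
         return *this->data_ == *o.data_; // deep
     }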
     bool operator !=( task const &o ) const
     { return !this->operator ==( o ); }

     // Regular task functionality follows...
};
//================================================
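
To make the handle semantics concrete, here is a filled-in variant of the sketch, with assumptions stated up front: it swaps in std::shared_ptr (C++11) for boost::shared_ptr so that it compiles with only the standard library, it invents a one-field task_data with name() and rename() accessors purely for illustration, and it picks the shallow comparison so that equality means "same task."

```cpp
#include <cassert>
#include <memory>
#include <string>

class task
{
    struct task_data
    {
        std::string name;   // hypothetical payload, not from the original post

        explicit task_data( std::string const &n ) : name( n ) {}
    };
    typedef std::shared_ptr<task_data> sp_type;   // stand-in for boost::shared_ptr

    sp_type data_;

    // Hidden member-wise constructor
    explicit task( sp_type d ) : data_( d ) {}

public:
    explicit task( std::string const &name )
        : data_( std::make_shared<task_data>( name ) ) {}

    // Forced copy: the result owns a separate task_data
    task clone() const
    { return task( std::make_shared<task_data>( *data_ ) ); }

    std::string const &name() const { return data_->name; }
    void rename( std::string const &n ) { data_->name = n; }

    // Shallow comparison: equal only when both handles share one task_data
    bool operator ==( task const &o ) const { return data_ == o.data_; }
    bool operator !=( task const &o ) const { return !( *this == o ); }
};
```

With this variant, an ordinary copy of a task compares equal to the original and sees its later changes, while the result of clone() compares unequal and keeps its own data.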

(I was going to suggest using Boost's pointer-containers, but then I
realized that you really don't need containment at all.) Now you'll
pass this class around instead of an integer type. The size may be
higher, though: two pointers (and an 'int' in debug-mode).
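
For comparison, the intermediate list idea considered earlier (iterators as IDs) can be sketched as follows. The names task_list, task_handle and add_task are hypothetical, and a std::string stands in for the real task data. The sketch relies on the standard guarantee that std::list insertions invalidate no iterators and that an erase invalidates only iterators to the erased element.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical names; a string stands in for the per-task information.
typedef std::list<std::string> task_list;
typedef task_list::iterator    task_handle;   // the "ID" is just an iterator

// Appends a task and returns a handle that stays valid until that
// particular task is erased.
task_handle add_task( task_list &tasks, std::string const &name )
{
    return tasks.insert( tasks.end(), name );
}
```

A handle like this is typically one pointer wide, so it is still larger than a 32-bit integer on a 64-bit platform.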

> ==
>
> Another example: an external library defines its interfaces with
> signed integer
> types, I work with unsigned types (why? to avoid even more warnings
> when
> comparing task IDs with vector::size() result, as in assert(task->id <
> tasks.size()), which are abundant in my code). Again, some
> warnings are
> unavoidable.
>
> What to do to have "clean" code?

Your task IDs are conceptually opaque, so why is any external code
wanting to mess with them? The external code shouldn't be doing
anything with the IDs besides comparing them with each other (only !=
and ==, not ordering) and using them as keys to your task functions.
This is why the ID's implementation should be hidden in a wrapping
class: external code can't mess with the IDs by default, and you have
to define every legal interaction yourself. In other words, define the main
task functionality in member functions and friends of the "task"
class I suggested, and have any ancillary code call that core code.

And if any code besides your test-invariant function is doing those
asserts, especially functions outside the task class, you're doing
your wrong method wrong.
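
If the IDs do stay integers, the same advice can be sketched as an opaque wrapper. Everything named here (task_id, task_table, create, priority, valid) is hypothetical, and a vector of ints stands in for the real task data. Outside code can only compare IDs with == and != and hand them back to the owning class, and the bounds assert lives behind a single invariant check.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class task_table;   // hypothetical owner of all tasks

class task_id
{
    unsigned value_;   // still a compact integer underneath

    explicit task_id( unsigned v ) : value_( v ) {}
    friend class task_table;   // only the owner can make or read raw values

public:
    // The only operations external code gets: equality and inequality
    friend bool operator ==( task_id a, task_id b )
    { return a.value_ == b.value_; }
    friend bool operator !=( task_id a, task_id b )
    { return a.value_ != b.value_; }
};

class task_table
{
    std::vector<int> priorities_;   // stand-in for the real per-task data

    // The one place that compares an ID against the container size
    bool valid( task_id id ) const
    { return id.value_ < priorities_.size(); }

public:
    task_id create( int priority )
    {
        // Unchecked narrowing, kept short for the sketch; a real version
        // would range-check before casting.
        task_id const id( static_cast<unsigned>( priorities_.size() ) );

        priorities_.push_back( priority );
        return id;
    }

    int priority( task_id id ) const
    {
        assert( valid( id ) );
        return priorities_[ id.value_ ];
    }
};
```

Because task_id has no ordering, arithmetic, or conversion to an integer, signed/unsigned comparisons against it cannot even be written outside task_table.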

> ==
>
> Does anyone know about an integer class that lets the user define
> the number of
> bits used for storage, lower allowed bound and upper allowed bound
> for the
> range? Like: template<int BITS, long low, long high> class Integer;

The integer library in Boost has this. The various class templates
only support one of your parameters at a time, though. (Either bit-
length, maximum, or minimum, not any two or all three.)

> BITS would be allowed to assume a value only equal to the one of
> the existing
> integer types (e.g. 8 for char, 16 for short, etc.), and the class
> would be
> constrained to hold values in range [low, high] (inclusive).
[SNIP rant with big ideas for integers he doesn't currently need]

You'll have to enforce a constraint range yourself. But there is a
numeric-conversion library in Boost to help you there, too.
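
A rough sketch of enforcing such a range by hand, using only the standard library. The ranged_int template and its members are hypothetical; the sketch assumes that Low <= High, that Storage can represent every value in [Low, High], and that sums of two values fit in a long. Boost's numeric_cast could take over the checked narrowing step, but it is not used here.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical class: stores a value in Storage and keeps it in [Low, High].
template < typename Storage, long Low, long High >
class ranged_int
{
    Storage value_;

    // Rejects anything outside the range before narrowing to Storage.
    static Storage checked( long v )
    {
        if ( v < Low || v > High )
            throw std::out_of_range( "ranged_int: value outside [Low, High]" );
        return static_cast<Storage>( v );
    }

public:
    ranged_int( long v = Low ) : value_( checked( v ) ) {}

    long get() const { return value_; }

    // On failure the exception is thrown before value_ is assigned,
    // so the old value is kept.
    ranged_int &operator +=( ranged_int o )
    {
        value_ = checked( get() + o.get() );
        return *this;
    }
};
```

For example, ranged_int<short, 0, 1000> occupies a short's worth of storage for its value and throws std::out_of_range as soon as an addition would pass 1000.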

> Or should I just listen to the devil on my shoulder and turn off the
> appropriate warnings?

No, you should rethink your design on why you need integers in the
first place.

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
