Subject: Re: [Boost-users] Using boost with 'c' level interfaces ?
From: Anthony Foiani (tkil_at_[hidden])
Date: 2013-06-07 07:10:51


Greetings --

"avib369 ." <avibahra_at_[hidden]> writes:

> I can not find any solution where a 'c' api can update/modify a
> std::vector

It is possible; see below for one possible solution.

I would recommend *against* my solution, however. If you're writing
or rewriting code to accommodate a new API anyway, just use C++. This
will allow you to use low-level routines if you need to, while still
providing you with a high abstraction level.

There are some cases where a mixed C/C++ interface can't be avoided,
or where it seems like the only way to avoid duplicating effort, but
please do some profiling to make sure that the added complication is
worthwhile.

(The example that comes to mind is reading incoming data into a
std::vector<char>; we don't want to scan it twice, so we don't want a
two-step "get the length, resize, copy into resized vector". However,
you can avoid the need for a resizable buffer by providing a large
buffer that the C-based function fills in as much as it can, and then
communicates the amount read back to the calling function. E.g., this
is exactly how the POSIX 'read(2)' call works.)
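A minimal sketch of that read(2)-style pattern, in case it helps. The names fill_buffer and read_into_vector are hypothetical, and the producer here just copies a fixed string so that the example stands on its own:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical C-style producer: writes at most `capacity` bytes into `buf`
// and returns how many bytes it actually wrote (same shape as read(2)).
extern "C" std::size_t fill_buffer( char * buf, std::size_t capacity )
{
    static const char payload[] = "hello";
    const std::size_t len = sizeof( payload ) - 1;   // drop the trailing NUL
    const std::size_t n = len < capacity ? len : capacity;
    if ( n != 0 )
        std::memcpy( buf, payload, n );
    return n;
}

// Caller side: size the vector up front, let the C code fill it, then
// shrink it to the number of bytes that were actually produced.
std::vector< char > read_into_vector( std::size_t capacity )
{
    std::vector< char > data( capacity );
    const std::size_t n = fill_buffer( data.data(), data.size() );
    assert( n <= data.size() );
    data.resize( n );
    return data;
}
```

The C side never sees the vector itself, only a pointer and a capacity, so size(), empty(), and the iterators all stay correct on the C++ side after the resize.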

If what you really want is safer dynamic objects (resizable vectors
etc), but purely in C, then look into the recommendation given
upthread -- something like Qt, glib, standalone tools like string
buffer libraries, etc. Calling in and out of C++ seems like overkill
for that need, unless your project already has C++ dependencies.

> Passing a std::vector<int>/std::vector<double> to a 'c' or fortran
> api function, that _reads_, works like a charm.

Right; contiguous memory is a part of the design rationale /
constraint for std::vector.
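For reference, the read-only case usually looks something like the sketch below. The function sum_ints is a hypothetical stand-in for whatever C or Fortran routine is being called; it only needs a pointer and a count:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style consumer: reads `count` ints starting at `values`.
extern "C" long sum_ints( const int * values, std::size_t count )
{
    assert( values != 0 || count == 0 );
    long total = 0;
    for ( std::size_t i = 0; i < count; ++i )
        total += values[i];
    return total;
}

// C++ caller: the elements are contiguous, so data() and size() describe
// the whole vector to the C routine.
long sum_vector( const std::vector< int > & vec )
{
    return sum_ints( vec.data(), vec.size() );
}
```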

> However we are out of luck, if we need the 'c' api, to populate the
> array. When passing a std::vector will 'kind of' work. But you can
> never use any of the following
>
> // 1/ size()
> // 2/ empty()

> // 3/ begin()/ end()/ rbegin()/rend()

I have to call this out in particular: what would you expect these to
return? If you're not in C++, you can't use the iterators in a
generic sense. Are you counting on the iterators just being pointers?
If so, that's not always the case!

(Ok, in production code, std::vector<T>::iterator is almost always
"T*"; but std::vector is allowed to implement that in other ways, and
sometimes does so, especially in debug modes.)

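The case of rbegin / rend is even murkier. They have the same "type"
as begin/end, from a C API point of view -- but the operations are
different: incrementing a reverse iterator moves toward the front of
the vector, which a bare pointer cannot express.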
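If plain pointers are what the C side needs, one option is to hand over a pointer range built from data() and size() instead of relying on what the iterators happen to be. This is only a sketch; int_range, as_range, and last_element_or are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

extern "C"
{
    // Half-open range [first, last) of plain pointers, usable from C.
    struct int_range
    {
        int * first;
        int * last;
    };
}

// Build the range from data()/size(), not from begin()/end().
int_range as_range( std::vector< int > & vec )
{
    int_range r;
    r.first = vec.data();
    r.last  = vec.data() + vec.size();
    return r;
}

// "Reverse" access from C means stepping back from `last` by hand.
extern "C" int last_element_or( int_range r, int fallback )
{
    assert( r.first <= r.last );
    return r.first == r.last ? fallback : *( r.last - 1 );
}
```

The pointers are only valid until the vector reallocates, so the range has to be rebuilt after anything that can grow the vector.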

> // 4/ push_back()
> // 5/ pop_back()
>
> You have to separately carry the size of the vector, in order to use
> it. You then just end up using it like a 'c' array. I thought there
> may have been way of updating the vectors begin/finish after the api
> call. But there does not appear to any way to do this.

As far as I can see, the cleanest way would be to write C++ stubs that
are exposed via C linkage, so you can call them from C programs.

Something like this (warning, not compiled, and something like "const
void" might explode, it's been a long time since I've done this sort
of thing):

    #include <cstddef>
    #include <vector>

    namespace
    {
        typedef std::vector< int > std_vector_int;
        typedef const std_vector_int const_std_vector_int;

        // Recover the typed pointer from the opaque handle given to C.
        inline
        std_vector_int * std_vector_int_ptr( void * vec_addr_raw )
        {
            return static_cast< std_vector_int * >( vec_addr_raw );
        }

        inline
        const_std_vector_int * const_std_vector_int_ptr( const void * vec_addr_raw )
        {
            return static_cast< const_std_vector_int * >( vec_addr_raw );
        }
    }

    extern "C"
    {
        std::size_t std_vector_int_size( const void * vec_addr_raw )
        {
            return const_std_vector_int_ptr( vec_addr_raw )->size();
        }

        int std_vector_int_empty( const void * vec_addr_raw )
        {
            return const_std_vector_int_ptr( vec_addr_raw )->empty() ? 1 : 0;
        }

        // see comments in the mail; this isn't entirely safe, although it
        // should be safe in most situations. These return raw pointers,
        // not iterators, and anything that reallocates the vector (such
        // as push_back) invalidates them. data() needs C++11.
        int * std_vector_int_begin( void * vec_addr_raw )
        {
            return std_vector_int_ptr( vec_addr_raw )->data();
        }

        const int * std_vector_int_cbegin( const void * vec_addr_raw )
        {
            return const_std_vector_int_ptr( vec_addr_raw )->data();
        }

        // One past the last element; must not be dereferenced.
        int * std_vector_int_end( void * vec_addr_raw )
        {
            std_vector_int * vec = std_vector_int_ptr( vec_addr_raw );
            return vec->data() + vec->size();
        }

        const int * std_vector_int_cend( const void * vec_addr_raw )
        {
            const_std_vector_int * vec = const_std_vector_int_ptr( vec_addr_raw );
            return vec->data() + vec->size();
        }

        void std_vector_int_push_back( void * vec_addr_raw,
                                       const int val )
        {
            std_vector_int_ptr( vec_addr_raw )->push_back( val );
        }
    }

And so on. You can do similar things for the iterators (especially
the reverse iterators), but it is pretty tedious.

Now do the same for each type that you are interested in (int, double,
float, etc). What you quickly discover is that you've re-invented C++
name mangling.
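One way to cut down on the typing, as a rough sketch: write the logic once as templates and let a macro stamp out the C-callable names per element type. The macro DEFINE_STD_VECTOR_C_API and the helpers vec_size / vec_push_back are hypothetical names of my own:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace
{
    // One generic implementation, written once in C++.
    template < typename T >
    std::size_t vec_size( const void * vec_addr_raw )
    {
        assert( vec_addr_raw != 0 );
        return static_cast< const std::vector< T > * >( vec_addr_raw )->size();
    }

    template < typename T >
    void vec_push_back( void * vec_addr_raw, T val )
    {
        assert( vec_addr_raw != 0 );
        static_cast< std::vector< T > * >( vec_addr_raw )->push_back( val );
    }
}

// Hypothetical macro: pastes the type name into the exported C names,
// e.g. std_vector_int_size and std_vector_double_push_back.
#define DEFINE_STD_VECTOR_C_API( T )                                   \
    extern "C" std::size_t std_vector_##T##_size( const void * v )     \
    { return vec_size< T >( v ); }                                     \
    extern "C" void std_vector_##T##_push_back( void * v, T val )      \
    { vec_push_back< T >( v, val ); }

DEFINE_STD_VECTOR_C_API( int )
DEFINE_STD_VECTOR_C_API( double )
```

This only works for type names that are a single token (so not "unsigned int" without a typedef), and the type is still encoded in each function name by hand, which is the same job the compiler's name mangling already does.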

Which leads me back to my original recommendation: if you're rewriting
code *anyway*, then rewrite it into C++. C++ is a more expressive,
powerful, safe, and precise language than either C or Fortran.

Anyway, given these calls, the overall C++ program might look like:

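    #include <iostream>
    #include <vector>

    // Defined in the C source file; declared with C linkage so the
    // linker can find it.
    extern "C" void fill_cpp_std_vector_int( void * vec_addr );

    int main()
    {
        std::vector< int > my_vec;

        fill_cpp_std_vector_int( &my_vec );

        for ( const int i : my_vec )
            std::cout << i << std::endl;

        return 0;
    }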

And the C function might look like:

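    #include <stddef.h>

    /* C-linkage stubs exported by the C++ side. The reserve stub is
       not shown above; its signature here is an assumption. */
    void std_vector_int_reserve( void * vec_addr_raw, size_t n );
    void std_vector_int_push_back( void * vec_addr_raw, int val );

    void fill_cpp_std_vector_int( void * vec_addr )
    {
        int i;

        /* optional; not shown above, but obvious */
        std_vector_int_reserve( vec_addr, 10 );

        for ( i = 0; i < 10; ++i )
            std_vector_int_push_back( vec_addr, i );
    }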
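One thing to watch with stubs like these: reserve and push_back can throw (std::bad_alloc, std::length_error), and the C caller has no way to handle a C++ exception. A possible way around that, sketched under my own assumptions, is to catch everything inside the stub and report failure through the return value. The name std_vector_int_try_reserve and the 0 / -1 convention are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical guarded stub: returns 0 on success, -1 if the request
// failed. No exception is allowed to escape into the C caller.
extern "C" int std_vector_int_try_reserve( void * vec_addr_raw, std::size_t n )
{
    assert( vec_addr_raw != 0 );
    try
    {
        static_cast< std::vector< int > * >( vec_addr_raw )->reserve( n );
        return 0;
    }
    catch ( ... )
    {
        return -1;
    }
}
```

The same try/catch wrapper would apply to a push_back stub; the cost is that every C call site now has a return code to check.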

But again, I feel that you're going to be better off using C++ to
manipulate C++ constructions. If you're using C++ constructions, then
you are already relying on an environment where you have C++ resources
(compiler, linker, library); why not leverage them?

Hope this helps.

Best regards,
Anthony Foiani

