From: David Abrahams (abrahams_at_[hidden])
Date: 2001-01-06 09:53:20


----- Original Message -----
From: "John Barnard" <barnard_at_[hidden]>

> Dave,
>
> Dave Abrahams writes:
> > Sorry about the delay in responding to this; for some reason I never
> > got the mail message. I only saw it by chance when browsing boost
> > messages on the web!
>
> I wonder if I'm doing something wrong -- this is the second time
> you've said that you didn't receive my message (I sent an unrelated
> message last month). I sent both mailings to boost_at_egroups.com.

That sounds right. And this one worked. I wonder what could have gone wrong?

> > I'm no expert on NumPy, but I would have thought that it had its own
> > data representation. Is it possible to create a NumPy array that uses
> > some external block of data? Even if this is possible, you run the
> > risk of crashing the Python interpreter if the NumPy array is used
> > after the DVArray is destroyed.
>
> Yes, it is possible to create a NumPy array that uses some external
> block of data, in this case, the data of valarray. If I increase the
> reference count to the DVArray instance when I create the NumPy array
> instance, the DVArray instance should exist as long as the NumPy array
> instance exists.

Okay, sorry if I seemed condescending here; BPL has users with a wide
range of levels of experience.
Just to be clear, though, no matter what you do with to_array(), code like
this will have problems:
  DVArray x;
  return to_array(x);
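
A minimal sketch of that lifetime problem, using only the standard library.
The names observe() and make_and_observe() are hypothetical, and the
std::weak_ptr merely stands in for the raw data pointer a NumPy array would
hold; it lets us detect safely that the local is gone, where a real raw
pointer would simply dangle:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <valarray>

using DVArray = std::valarray<double>;

// Hypothetical stand-in for to_array(): returns a non-owning observer.
std::weak_ptr<DVArray> observe(const std::shared_ptr<DVArray>& x) {
    return x;
}

// Plays the role of "DVArray x; return to_array(x);".
std::weak_ptr<DVArray> make_and_observe() {
    auto x = std::make_shared<DVArray>(std::size_t{4});
    std::weak_ptr<DVArray> view = observe(x);
    assert(!view.expired());  // still valid while x is in scope
    return view;              // x is destroyed as the function returns
}
```

Calling make_and_observe().expired() reports that the observed array no
longer exists, which is the state any array built over a local DVArray
would be in.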

> > > if (arr == NULL) return NULL;
> > > arr->flags |= OWN_DATA;
> >
> > Does the above line indicate to the NumPy array that it owns its
> > data? If so, you are lying to it ;-) The data is still owned by the
> > valarray.
>
> No. The flag indicates that the NumPy array doesn't own the data and
> shouldn't free the memory when the NumPy array is destroyed.

OK, that looks right.

> > Ah; I think I'm beginning to understand. You want the NumPy array to
> > manage the lifetime of the DVArray, so that the former holds a
> > reference count to the latter as long as it exists?
>
> Not quite. I want to use the NumPy array interface as a convenient way
> to manipulate in Python the data in DVArray. As I mentioned above,
> NumPy allows you to create a NumPy array that doesn't own the
> data. However, to ensure that the DVArray instance doesn't get
> destroyed while the NumPy array still exists, I need to increase the
> reference count of the DVArray-wrapped Python instance.

That sounds like exactly what I said above. Is there a difference? If so,
what is it?

> > I am not familiar with the internals of NumPy (and I can't seem to
> > get to the manual over the web), so I can't tell you whether this is
> > possible. The NumPy array would need to have some slot where you
> > could store a pointer to the DVArray (if you only store the &self[0]
> > you can't get back to the DVArray), and it would also need to give
> > you a way to hook its destruction.
>
> It has such a slot, base (a PyObject*), whose reference count gets
> decremented when the NumPy array is deleted, which is exactly what I
> want.

Whoops, sorry, this is a lie:

> > A DVArray object is-a PyObject, so you can always write:
> >
> > PyObject* to_array(DVArray& self) {
> > PyObject* p = &self; // get the PyObject*
> > Py_INCREF(p); // increment, decrement, whatever
>
> This is exactly what I needed to know. I was confused about the usage
> of DVArray in the to_array function. I thought DVArray referred to
> the valarray<double> class and not the wrapped valarray<double>
> class.

No, sorry; you weren't confused.

If you look at boost/libs/python/doc/data_structures.txt, you'll see that
there's no pointer back from an arbitrary C++ instance to the corresponding
ExtensionInstance object (which is-a PyObject).

An arbitrary DVArray may or may not be wrapped already, so there's no
reliable way to get the PyObject* back. Yes, I could keep a database of
every wrapped object and the corresponding PyObject so that it would be easy
to do this, but that seemed a bit too heavyweight to me. Still, it might be
a good idea to give users that option, at least for some wrapped classes.
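
For illustration only, such a database could be as small as a map from the
C++ instance's address to its wrapper. This is a sketch, not something BPL
provides: FakePyObject and WrapperRegistry are hypothetical names, and a
real version would store PyObject* and be updated whenever a wrapper is
created or destroyed.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for a Python object; real code would use PyObject.
struct FakePyObject {
    int refcount = 1;
};

// Hypothetical registry: address of a C++ instance -> its wrapper object.
class WrapperRegistry {
public:
    // Record that cpp_instance is held by wrapper.
    void add(const void* cpp_instance, FakePyObject* wrapper) {
        assert(wrapper != nullptr);
        table_[cpp_instance] = wrapper;
    }

    // Forget an instance, e.g. when its wrapper is destroyed.
    void remove(const void* cpp_instance) {
        table_.erase(cpp_instance);
    }

    // Returns nullptr when the instance was never wrapped.
    FakePyObject* find(const void* cpp_instance) const {
        auto it = table_.find(cpp_instance);
        return it == table_.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<const void*, FakePyObject*> table_;
};
```

The nullptr result from find() is the "may or may not be wrapped already"
case: a caller has to handle it, and every wrapped object pays for a table
entry, which is the weight I was worried about.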

In the absence of that feature, you might consider an approach like this:
1. Make a derived class of DVArray, "WrappedDVArray", and expose that to
Python instead of DVArray using BPL as you did for DVArray, except leave out
the constructor.

2. Write
  const DVArray& from_python(PyObject* p, boost::python::type<DVArray>);
  const DVArray& from_python(PyObject* p,
                             boost::python::type<const DVArray&>);

functions which implement the following pseudocode:

   PyArrayObject* numpy_array = (PyArrayObject*)p; // cast to NumPy array
   // get the PyObject* whose reference count is being managed
   // (the "base" slot mentioned above)
   PyObject* wrapped_dv_array = numpy_array->base;
   // Extract the valarray from the wrapped object, not from the array itself
   return boost::python::from_python(wrapped_dv_array,
                                     boost::python::type<WrappedDVArray>());

Now you just need a way to build such a NumPy array. Just write a C++
function like your to_array() function and expose it using BPL.
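
The ownership flow in these steps can be modeled in plain C++ without the
Python or NumPy headers. Everything named here (FakeObject, FakeArray,
make_view, release_view, owner_of) is a hypothetical stand-in, and the
integer counter only imitates Py_INCREF/Py_DECREF; treat it as a sketch of
the idea rather than working BPL code:

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Stand-in for the wrapped Python instance holding the C++ valarray.
struct FakeObject {
    int refcount = 1;
    std::valarray<double> value;
};

// Stand-in for a NumPy array that borrows its data from an owner.
struct FakeArray {
    double* data;       // points into base->value, not owned
    std::size_t size;
    FakeObject* base;   // owner kept alive for as long as the array exists
};

// Like to_array(): build the view and take a reference on the owner.
FakeArray make_view(FakeObject& owner) {
    ++owner.refcount;
    double* data = owner.value.size() ? &owner.value[0] : nullptr;
    return FakeArray{data, owner.value.size(), &owner};
}

// Like array destruction: drop the reference held through base.
// Returns true when the owner has no references left.
bool release_view(FakeArray& view) {
    assert(view.base != nullptr);
    FakeObject* owner = view.base;
    view.base = nullptr;
    view.data = nullptr;
    view.size = 0;
    return --owner->refcount == 0;
}

// Like from_python(): recover the valarray through base, not through data.
const std::valarray<double>& owner_of(const FakeArray& view) {
    assert(view.base != nullptr);
    return view.base->value;
}
```

The point of going through base in owner_of() is the one made earlier: the
data pointer alone cannot lead you back to the DVArray, but the owner can.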

> If I wanted a pointer to a valarray<double> (i.e., DVArray) and
> not the wrapped valarray<double> class, how would I get it in this
> function?

You already have it. My mistake :(

> In general, I'm confused about when a class is considered
> wrapped and when it's not in the context of functions added as methods
> of the wrapped class.

I'm sorry to have confused you. In the methods of the wrapped class, all you
have access to is the C++ object, nothing more.

> > In any case, Python usually doesn't care about the actual types of
> > objects (just their interfaces), so you might find it much easier to
> > build up the interface of DVArray so that it mimics a NumPy array and
> > can be used in its place.
>
> NumPy arrays have a rich interface and allow things like sin(x),
> where x is a NumPy array, to be computed very quickly. To implement
> that interface and functionality for valarray<> or vector<> would be a
> large amount of work, hence, my use of the above strategy.

I think you might find that it's already done for valarray<> ;-)
Actually, valarray<> has many problems, not least due to a lack of experts
on the C++ committee who could help us get it right. Is the NumPy
interface "done right"? If so, maybe we should cannibalize.

> Thanks very much for your help and for BPL -- it's a fantastic piece
> of work.

My pleasure.

-Dave

