
From: David Abrahams (abrahams_at_[hidden])
Date: 2001-01-06 09:53:20
----- Original Message -----
From: "John Barnard" <barnard_at_[hidden]>
> Dave,
>
> Dave Abrahams writes:
> > Sorry about the delay in responding to this; for some reason I never
> > got the mail message. I only saw it by chance when browsing boost
> > messages on the web!
>
> I wonder if I'm doing something wrong -- this is the second time
> you've said that you didn't receive my message (I sent an unrelated
> message last month). I sent both mailings to boost_at_egroups.com.
That sounds right. And this one worked. I wonder what could have gone wrong?
> > I'm no expert on NumPy, but I would have thought that it had its own
> > data representation. Is it possible to create a NumPy array that uses
> > some external block of data? Even if this is possible, you run the
> > risk of crashing the Python interpreter if the NumPy array is used
> > after the DVArray is destroyed.
>
> Yes, it is possible to create a NumPy array that uses some external
> block of data, in this case, the data of valarray. If I increase the
> reference count to the DVArray instance when I create the NumPy array
> instance, the DVArray instance should exist as long as the NumPy array
> instance exists.
Okay, sorry if I seemed condescending here; BPL has users with a wide
range of levels of experience.
Just to be clear, though, no matter what you do with to_array(), code like
this will have problems:
    DVArray x;
    return to_array(x);
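To make the hazard concrete, here is a minimal sketch in plain C++, with no Python or NumPy involved. `Tracked`, `View`, and `view_of_local` are hypothetical names invented for illustration: `Tracked` stands in for DVArray and `View` for a NumPy array that borrows a buffer. The sketch only shows that the local object is gone by the time the caller holds the view; it never reads through the dangling pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Hypothetical stand-in for DVArray that records when it is destroyed.
struct Tracked {
    std::valarray<double> data;
    bool* destroyed;  // set to true by the destructor
    Tracked(std::size_t n, bool* flag) : data(0.0, n), destroyed(flag) {}
    ~Tracked() { *destroyed = true; }
};

// Hypothetical stand-in for a NumPy array that borrows someone else's buffer.
struct View {
    const double* ptr;
    std::size_t size;
};

// Mirrors "DVArray x; return to_array(x);": the view outlives x.
View view_of_local(bool* destroyed) {
    assert(destroyed != nullptr);
    Tracked x(3, destroyed);
    return View{&x.data[0], x.data.size()};
}  // x is destroyed here, so the returned View::ptr now dangles
```

After `view_of_local` returns, the flag reports that the owner has been destroyed even though the caller still holds a `View`, so any later read through `ptr` would be undefined behaviour.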
> > > if (arr == NULL) return NULL;
> > > arr->flags = OWN_DATA;
> >
> > Does the above line indicate to the NumPy array that it owns its
> > data? If so, you are lying to it ;) The data is still owned by the
> > valarray.
>
> No. The flag indicates that the NumPy array doesn't own the data and
> shouldn't free the memory when the NumPy array is destroyed.
OK, that looks right.
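As a rough illustration of what such an ownership flag controls, here is a toy model in plain C++. It is not NumPy's actual struct: `FlaggedArray`, `TOY_OWN_DATA`, and `wrap_external` are hypothetical names, and the assumption that a new array starts out owning its data is made up for the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical flag bit, loosely modeled on the OWN_DATA flag discussed above.
constexpr int TOY_OWN_DATA = 0x1;

// Toy array object: frees its buffer on destruction only if it owns it.
struct FlaggedArray {
    double* data;
    std::size_t size;
    int flags;

    FlaggedArray(double* d, std::size_t n, int f) : data(d), size(n), flags(f) {}
    ~FlaggedArray() {
        if (flags & TOY_OWN_DATA) delete[] data;
    }
};

// Wrap an external buffer without taking ownership of it.
FlaggedArray* wrap_external(double* buffer, std::size_t n) {
    assert(buffer != nullptr);
    // Assumption for this sketch: new arrays default to owning their data.
    FlaggedArray* arr = new FlaggedArray(buffer, n, TOY_OWN_DATA);
    arr->flags &= ~TOY_OWN_DATA;  // clear the bit: the array must not free `buffer`
    return arr;
}
```

With the bit cleared, destroying the array leaves the external buffer untouched, which is the behaviour wanted when the data really belongs to a valarray.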
> > Ah; I think I'm beginning to understand. You want the NumPy array to
> > manage the lifetime of the DVArray, so that the former holds a
> > reference count to the latter as long as it exists?
>
> Not quite. I want to use the NumPy array interface as a convenient way
> to manipulate in Python the data in DVArray. As I mentioned above,
> NumPy allows you to create a NumPy array that doesn't own the
> data. However, to ensure that the DVArray instance doesn't get
> destroyed while the NumPy array still exists, I need to increase the
> reference count of the DVArray-wrapped Python instance.
That sounds like exactly what I said above. Is there a difference? If so,
what is it?
> > I am not familiar with the internals of NumPy (and I can't seem to
> > get to the manual over the web), so I can't tell you whether this is
> > possible. The NumPy array would need to have some slot where you
> > could store a pointer to the DVArray (if you only store the &self[0]
> > you can't get back to the DVArray), and it would also need to give
> > you a way to hook its destruction.
>
> It has such a slot, base (a PyObject*), whose reference count gets
> decremented when the NumPy array is deleted, which is exactly what
> I want.
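The lifetime arrangement described here can be sketched with a toy reference-counting model in plain C++. Nothing below is the Python or NumPy API: `ToyObject`, `toy_incref`, `toy_decref`, `ToyWrappedDVArray`, and `BorrowingArray` are hypothetical names, and the `base` member only imitates the slot mentioned in the quoted text.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Toy reference-counted object, standing in for a PyObject.
struct ToyObject {
    int refcount = 1;
    virtual ~ToyObject() = default;
};

inline void toy_incref(ToyObject* o) { ++o->refcount; }
inline void toy_decref(ToyObject* o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) delete o;
}

// Stand-in for the wrapped DVArray instance; reports its own destruction.
struct ToyWrappedDVArray : ToyObject {
    std::valarray<double> values;
    bool* destroyed;
    ToyWrappedDVArray(std::size_t n, bool* flag) : values(0.0, n), destroyed(flag) {}
    ~ToyWrappedDVArray() override { *destroyed = true; }
};

// Stand-in for an array that borrows data and holds a reference in `base`.
struct BorrowingArray : ToyObject {
    double* data;
    std::size_t size;
    ToyObject* base;
    explicit BorrowingArray(ToyWrappedDVArray* owner)
        : data(&owner->values[0]), size(owner->values.size()), base(owner) {
        toy_incref(base);  // keep the owner alive while this array exists
    }
    ~BorrowingArray() override { toy_decref(base); }  // release it on destruction
};
```

In this model the creator of the owner can drop its own reference as soon as the array exists; the owner is destroyed only when the array releases `base`.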
Whoops, sorry, this is a lie:
> > A DVArray object is-a PyObject, so you can always write:
> >
> > PyObject* to_array(DVArray& self) {
> > PyObject* p = &self; // get the PyObject*
> > Py_INCREF(p); // increment, decrement, whatever
>
> This is exactly what I needed to know. I was confused about the usage
> of DVArray in the to_array function. I thought DVArray referred to
> the valarray<double> class and not the wrapped valarray<double>
> class.
No, sorry; you weren't confused.
If you look at boost/libs/python/doc/data_structures.txt, you'll see that
there's no pointer back from an arbitrary C++ instance to the corresponding
ExtensionInstance object (which is-a PyObject).
An arbitrary DVArray may or may not be wrapped already, so there's no
reliable way to get the PyObject* back. Yes, I could keep a database of
every wrapped object and the corresponding PyObject so that it would be easy
to do this, but that seemed a bit too heavyweight to me. Still, it might be
a good idea to give users that option, at least for some wrapped classes.
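For illustration only, such a database could look roughly like the following plain C++ sketch. BPL does not provide this; `Payload`, `Wrapper`, and `WrapperRegistry` are hypothetical names, with `Payload` playing the role of the C++ object and `Wrapper` the role of the Python-side object that holds it.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins: a C++ value type and the wrapper object that holds it.
struct Payload { double value; };
struct Wrapper { Payload held; };

// Registry mapping the address of each wrapped C++ object to its wrapper.
class WrapperRegistry {
public:
    void add(Wrapper* w) {
        assert(w != nullptr);
        const Payload* key = &w->held;
        table_[key] = w;
    }

    void remove(Wrapper* w) {
        assert(w != nullptr);
        const Payload* key = &w->held;
        table_.erase(key);
    }

    // Returns the wrapper for `p`, or nullptr if `p` was never wrapped.
    Wrapper* find(const Payload* p) const {
        auto it = table_.find(p);
        return it == table_.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<const Payload*, Wrapper*> table_;
};
```

Every wrapper has to be registered on creation and removed on destruction, and every lookup pays for a hash probe, which is the per-object overhead referred to above as heavyweight.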
In the absence of that feature, you might consider an approach like this:
1. Make a derived class of DVArray, "WrappedDVArray", and expose that to
Python instead of DVArray using BPL as you did for DVArray, except leave out
the constructor.
2. Write
    const DVArray& from_python(PyObject* p, boost::python::type<DVArray>);
    const DVArray& from_python(PyObject* p, boost::python::type<const DVArray&>);
functions which implement the following pseudocode:
    PyArrayObject* numpy_array = (PyArrayObject*)p; // cast to NumPy array
    // get the PyObject* whose reference count is being managed
    PyObject* wrapped_dv_array = numpy_array->owned_py_object;
    // Extract the valarray from the wrapped object, not from the NumPy array
    return boost::python::from_python(wrapped_dv_array,
                                      boost::python::type<WrappedDVArray>());
Now you just need a way to build such a NumPy array. Just write a C++
function like your to_array() function and expose it using BPL.
> If I wanted a pointer to a valarray<double> (i.e., DVArray) and
> not the wrapped valarray<double> class, how would I get it in this
> function?
You already have it. My mistake :(
> In general, I'm confused about when a class is considered
> wrapped and when it's not in the context of functions added as methods
> of the wrapped class.
I'm sorry to have confused you. In the methods of the wrapped class, all you
have access to is the C++ object, nothing more.
> > In any case, Python usually doesn't care about the actual types of
> > objects (just their interfaces), so you might find it much easier to
> > build up the interface of DVArray so that it mimics a NumPy array and
> > can be used in its place.
>
> NumPy arrays have a rich interface and allow things like sin(x),
> where x is a NumPy array, to be computed very quickly. To implement
> that interface and functionality for valarray<> or vector<> would be a
> large amount of work, hence, my use of the above strategy.
I think you might find that it's already done for valarray<> ;)
Actually, valarray<> has many problems, not least due to a lack of experts
on the C++ committee who could help us get it right. Is the NumPy
interface "done right"? If so, maybe we should cannibalize.
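On the point that element-wise math already exists for valarray<>: the standard `<valarray>` header provides overloads such as `std::sin` that apply to every element. A small sketch, where `elementwise_sin` is a hypothetical helper name used only for this example:

```cpp
#include <cassert>
#include <cmath>
#include <valarray>

// Apply sin to every element using the std::sin overload for std::valarray.
std::valarray<double> elementwise_sin(const std::valarray<double>& x) {
    std::valarray<double> result = std::sin(x);
    assert(result.size() == x.size());
    return result;
}
```

This covers only the math functions; it says nothing about how the rest of valarray<> compares with the NumPy interface.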
> Thanks very much for your help and for BPL -- it's a fantastic piece
> of work.
My pleasure.
Dave