Subject: [Boost-users] [Thread] code segfaults/hangs when being called from matlab, not when run standalone
From: Dougal Sutherland (dougal_at_[hidden])
Date: 2012-01-30 13:21:27


Hi all,

I have some code using boost::thread that works fine when it's called
directly from a main() function, but (almost) always segfaults and/or hangs
forever when called from a Matlab mex interface. I don't know whether
that's due to the Matlab interface breaking something about boost::thread
or if the different environment is revealing a flaw with my code.

Here's a simplified version of the code I'm using. (The same question is
also posted on Stack Overflow at http://stackoverflow.com/q/9011780/344821
-- the code is prettier there, and you can get some Stack Overflow
reputation out of me if you answer there.)

    #include <cstdio>
    #include <cstdlib>
    #include <queue>
    #include <vector>

    #include <boost/thread.hpp>
    #include <boost/utility.hpp>

    #ifndef NO_MEX
    #include "mex.h"
    #endif

    class Worker : boost::noncopyable {
        boost::mutex &jobs_mutex;
        std::queue<size_t> &jobs;

        boost::mutex &results_mutex;
        std::vector<double> &results;

        public:

        Worker(boost::mutex &jobs_mutex, std::queue<size_t> &jobs,
               boost::mutex &results_mutex, std::vector<double> &results)
            :
                jobs_mutex(jobs_mutex), jobs(jobs),
                results_mutex(results_mutex), results(results)
        {}

        void operator()() {
            size_t i;
            float r;

            while (true) {
                // get a job
                {
                    boost::mutex::scoped_lock lk(jobs_mutex);
                    if (jobs.size() == 0)
                        return;

                    i = jobs.front();
                    jobs.pop();
                }

                // do some "work"
                r = rand() / 315.612;

                // write the results
                {
                    boost::mutex::scoped_lock lk(results_mutex);
                    results[i] = r;
                }
            }
        }
    };

    std::vector<double> doWork(size_t n) {
        std::vector<double> results;
        results.resize(n);

        boost::mutex jobs_mutex, results_mutex;

        std::queue<size_t> jobs;
        for (size_t i = 0; i < n; i++)
            jobs.push(i);

        Worker w1(jobs_mutex, jobs, results_mutex, results);
        boost::thread t1(boost::ref(w1));

        Worker w2(jobs_mutex, jobs, results_mutex, results);
        boost::thread t2(boost::ref(w2));

        t1.join();
        t2.join();

        return results;
    }

    #ifdef NO_MEX
    int main() {
    #else
    void mexFunction(int nlhs, mxArray **plhs, int nrhs, const mxArray **prhs) {
    #endif
        std::vector<double> results = doWork(10);
        for (size_t i = 0; i < results.size(); i++)
            printf("%g ", results[i]);
        printf("\n");
    }

If I compile and run this directly (with gcc or clang and -DNO_MEX), it
always completes successfully.

If I run this through Matlab's mex interface and link to a release-variant
boost (I've tried 1.48.0 on OSX and both 1.40 and 1.33.1 on CentOS), I get
a segfault. On 1.48.0, this is trying to access a pointer that's near 0
inside boost::thread::start_thread from inside of t1's constructor; on
1.33.1 and 1.40, it's inside pthread_mutex_lock being called from t1.join().

If I instead link to a debug-variant boost (1.48 on OSX or 1.33.1 on CentOS
-- the machine with 1.40 installed doesn't have a debug variant), it hangs
forever inside t1.join(), though the threads have completed successfully,
with results containing the expected elements that were calculated as r inside
Worker. (On OSX, these are different numbers than are printed by the
standalone version, but they're still consistent between threads.)

Am I doing something stupid that's causing this, or is this somehow due to
something on the matlab end? I'm pretty new to C++ and boost and so it
might well be something dumb that I did, but I don't know what that could
be.

I also don't really have any ideas on how to proceed from here in terms of
debugging, especially since the segfaults only occur when there are no
debugging symbols in the boost libraries themselves around to help figure
out what's going on. My only real idea is to try this without boost::thread
and using pthreads directly, but even if that worked it's nonideal since
I'd also like to support Windows.
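
For reference, a rough sketch of what that pthreads-only experiment might
look like, assuming a POSIX system. The names SharedState, worker_main and
doWorkPthreads are made up for illustration, and the "work" is replaced by a
deterministic i * 2.0 so the output can be checked. It only mirrors the
job-queue structure above; it says nothing about whether the mex problem
would go away.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

#include <pthread.h>

// Hypothetical bundle of the state the two workers share.
struct SharedState {
    pthread_mutex_t jobs_mutex;
    std::queue<std::size_t> jobs;

    pthread_mutex_t results_mutex;
    std::vector<double> results;
};

// Thread entry point; pthread_create expects void *(*)(void *).
extern "C" void *worker_main(void *arg) {
    SharedState *s = static_cast<SharedState *>(arg);

    while (true) {
        // get a job
        pthread_mutex_lock(&s->jobs_mutex);
        if (s->jobs.empty()) {
            pthread_mutex_unlock(&s->jobs_mutex);
            return NULL;
        }
        std::size_t i = s->jobs.front();
        s->jobs.pop();
        pthread_mutex_unlock(&s->jobs_mutex);

        // do some "work" (deterministic stand-in for rand())
        double r = i * 2.0;

        // write the results
        pthread_mutex_lock(&s->results_mutex);
        s->results[i] = r;
        pthread_mutex_unlock(&s->results_mutex);
    }
}

std::vector<double> doWorkPthreads(std::size_t n) {
    SharedState s;
    s.results.resize(n);
    for (std::size_t i = 0; i < n; i++)
        s.jobs.push(i);

    int rc = pthread_mutex_init(&s.jobs_mutex, NULL);
    assert(rc == 0);
    rc = pthread_mutex_init(&s.results_mutex, NULL);
    assert(rc == 0);

    // start two workers that share the same state
    pthread_t t1, t2;
    rc = pthread_create(&t1, NULL, worker_main, &s);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker_main, &s);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_mutex_destroy(&s.jobs_mutex);
    pthread_mutex_destroy(&s.results_mutex);

    return s.results;
}
```

Calling doWorkPthreads(10) from main() or mexFunction in place of doWork(10)
would exercise the same queue-draining logic without any Boost code involved
(built with the compiler's pthread option, e.g. -pthread for gcc).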

Thanks,
Dougal


