Hi all,

I have some code using boost::thread that works fine when called directly from a main() function, but (almost) always segfaults and/or hangs forever when called through a Matlab mex interface. I don't know whether the Matlab interface is breaking something in boost::thread or whether the different environment is exposing a flaw in my code.

Here's a simplified version of the code I'm using. (The same question is also posted on Stack Overflow at http://stackoverflow.com/q/9011780/344821 -- the code is prettier there, and you can get some Stack Overflow reputation out of me if you answer there.)


    #include <cstdio>
    #include <cstdlib>
    #include <queue>
    #include <vector>
    
    #include <boost/thread.hpp>
    #include <boost/utility.hpp>
    
    #ifndef NO_MEX
    #include "mex.h"
    #endif
    
    class Worker : boost::noncopyable {
        boost::mutex &jobs_mutex;
        std::queue<size_t> &jobs;
    
        boost::mutex &results_mutex;
        std::vector<double> &results;
    
        public:
    
        Worker(boost::mutex &jobs_mutex, std::queue<size_t> &jobs,
               boost::mutex &results_mutex, std::vector<double> &results)
            :
                jobs_mutex(jobs_mutex), jobs(jobs),
                results_mutex(results_mutex), results(results)
        {}
    
        void operator()() {
            size_t i;
            double r;
    
            while (true) {
                // get a job
                {
                    boost::mutex::scoped_lock lk(jobs_mutex);
                    if (jobs.empty())
                        return;
    
                    i = jobs.front();
                    jobs.pop();
                }
    
                // do some "work"
                r = rand() / 315.612;
    
                // write the results
                {
                    boost::mutex::scoped_lock lk(results_mutex);
                    results[i] = r;
                }
            }
        }
    };
    
    std::vector<double> doWork(size_t n) {
        std::vector<double> results;
        results.resize(n);
    
        boost::mutex jobs_mutex, results_mutex;
    
        std::queue<size_t> jobs;
        for (size_t i = 0; i < n; i++)
            jobs.push(i);
    
        Worker w1(jobs_mutex, jobs, results_mutex, results);
        boost::thread t1(boost::ref(w1));
    
        Worker w2(jobs_mutex, jobs, results_mutex, results);
        boost::thread t2(boost::ref(w2));
    
        t1.join();
        t2.join();
    
        return results;
    }
    
    #ifdef NO_MEX
    int main() {
    #else
    void mexFunction(int nlhs, mxArray **plhs, int nrhs, const mxArray **prhs) {
    #endif
        std::vector<double> results = doWork(10);
        for (size_t i = 0; i < results.size(); i++)
            printf("%g ", results[i]);
        printf("\n");
    }



If I compile and run this directly (with gcc or clang and -DNO_MEX), it always completes successfully.

If I run this through Matlab's mex interface and link against a release-variant boost (I've tried 1.48.0 on OSX and both 1.40 and 1.33.1 on CentOS), I get a segfault. With 1.48.0, the crash is an access through a near-null pointer inside boost::thread::start_thread, called from t1's constructor; with 1.33.1 and 1.40, it's inside pthread_mutex_lock, called from t1.join().

If I instead link against a debug-variant boost (1.48 on OSX or 1.33.1 on CentOS -- the machine with 1.40 installed doesn't have a debug variant), it hangs forever inside t1.join(), even though the threads have completed successfully and results contains the expected elements, i.e. the values calculated as r inside Worker. (On OSX, these are different numbers than the standalone version prints, but they're still consistent between threads.)


Am I doing something stupid that's causing this, or is it somehow due to something on the Matlab end? I'm pretty new to C++ and boost, so it might well be something dumb that I did, but I don't know what that could be.

I also don't have any good ideas for how to debug this further, especially since the segfaults only occur when the boost libraries have no debugging symbols to help figure out what's going on. My only real idea is to drop boost::thread and use pthreads directly, but even if that worked it wouldn't be ideal, since I'd also like to support Windows.
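
A minimal sketch of what that pthreads-only variant might look like, assuming POSIX threads are available. SharedState, workerMain and doWorkPthreads are hypothetical names that don't appear in the code above, and the "work" is replaced by a deterministic i / 315.612 so that rand() is out of the picture. It is only meant as a starting point for the experiment, not as a known fix.

```cpp
#include <pthread.h>

#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical names throughout: SharedState, workerMain and doWorkPthreads
// are illustrative only and are not part of the original code.
struct SharedState {
    pthread_mutex_t jobs_mutex;
    std::queue<std::size_t> jobs;

    pthread_mutex_t results_mutex;
    std::vector<double> results;
};

// Thread entry point; pthread_create expects a void *(*)(void *) function.
extern "C" void *workerMain(void *arg) {
    SharedState *state = static_cast<SharedState *>(arg);

    while (true) {
        std::size_t i;

        // get a job
        pthread_mutex_lock(&state->jobs_mutex);
        if (state->jobs.empty()) {
            pthread_mutex_unlock(&state->jobs_mutex);
            return NULL;
        }
        i = state->jobs.front();
        state->jobs.pop();
        pthread_mutex_unlock(&state->jobs_mutex);

        // do some "work" (assumption: deterministic stand-in for rand())
        double r = i / 315.612;

        // write the results
        pthread_mutex_lock(&state->results_mutex);
        state->results[i] = r;
        pthread_mutex_unlock(&state->results_mutex);
    }
}

std::vector<double> doWorkPthreads(std::size_t n) {
    SharedState state;
    pthread_mutex_init(&state.jobs_mutex, NULL);
    pthread_mutex_init(&state.results_mutex, NULL);

    state.results.resize(n);
    for (std::size_t i = 0; i < n; i++)
        state.jobs.push(i);

    // Only join threads that were actually created.
    pthread_t t1, t2;
    bool ok1 = pthread_create(&t1, NULL, workerMain, &state) == 0;
    bool ok2 = pthread_create(&t2, NULL, workerMain, &state) == 0;
    if (ok1) pthread_join(t1, NULL);
    if (ok2) pthread_join(t2, NULL);

    // Drain any jobs left over if thread creation failed; if the workers
    // ran, the queue is already empty and this returns immediately.
    workerMain(&state);

    pthread_mutex_destroy(&state.jobs_mutex);
    pthread_mutex_destroy(&state.results_mutex);
    return state.results;
}
```

Swapping the call to doWork(10) for doWorkPthreads(10) in main() or mexFunction should show whether the crash follows boost::thread specifically or threading under Matlab in general.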


Thanks,
Dougal