My algorithm has to read and write individual elements of a matrix, and this is making my application really slow. More specifically, I'm assembling a stiffness matrix for the Finite Element Method. The code looks like this:

    for(int i=0; i<m_tetrahedrons.size(); ++i)
    {
        btTetrahedron* t = m_tetrahedrons[i];
        t->computeCorotatedStiffness();

        // Scatter the 12x12 element matrices into the global matrices,
        // one 3x3 block per pair of tetrahedron nodes (j, k).
        for(unsigned int j=0; j<4; ++j)
        {
            const unsigned int jj = t->getNodeIndex(j); // global index of local node j

            for(unsigned int k=0; k<4; ++k)
            {
                const unsigned int kk = t->getNodeIndex(k); // global index of local node k

                for(unsigned int r=0; r<3; ++r)
                    for(unsigned int s=0; s<3; ++s)
                    {
                        m_RKR_1(3*jj+r, 3*kk+s) += t->getCorotatedStiffness0(3*j+r, 3*k+s);
                        m_RK(3*jj+r, 3*kk+s) += t->getCorotatedStiffness1(3*j+r, 3*k+s);
                    }
            }
        }
    }

Here m_RKR_1 and m_RK are both compressed_matrix<float>, and t->getCorotatedStiffness0/1 just return the (i,j) element of a 12x12 compressed_matrix<float>. If I skip the co-rotated matrices (basically by commenting out the element-wise operations in the code above), the simulation still runs and is very fast, but it is incorrect (linear strain only). As soon as I turn the co-rotational part on, it becomes extremely slow, and those element-wise operations are the culprit.
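
To isolate the access pattern from the rest of the application, the scatter step can be sketched without Boost. This is only a sketch under stated assumptions, not the actual code: `ElementMatrix`, `GlobalMatrix` and `scatterElement` are hypothetical names, the 12x12 element matrix is held as a dense row-major array instead of a compressed_matrix<float>, and a `std::map` keyed on (row, column) stands in for the global compressed_matrix<float>.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-ins, not part of the original code:
// ElementMatrix replaces the 12x12 compressed_matrix<float> of one tetrahedron,
// GlobalMatrix replaces the global compressed_matrix<float>.
using ElementMatrix = std::array<float, 12 * 12>;  // dense, row-major
using GlobalMatrix  = std::map<std::pair<std::size_t, std::size_t>, float>;

// Adds one element matrix into the global matrix.
// nodes[j] is the global index of local node j (the value getNodeIndex(j) returns).
void scatterElement(const ElementMatrix& ke,
                    const std::array<std::size_t, 4>& nodes,
                    GlobalMatrix& global)
{
    for (std::size_t j = 0; j < 4; ++j) {
        const std::size_t rowBase = 3 * nodes[j];      // first global row of node j
        for (std::size_t k = 0; k < 4; ++k) {
            const std::size_t colBase = 3 * nodes[k];  // first global column of node k
            for (std::size_t r = 0; r < 3; ++r) {
                for (std::size_t s = 0; s < 3; ++s) {
                    // Local entry (3*j+r, 3*k+s) goes to global entry (3*jj+r, 3*kk+s).
                    const std::size_t local = (3 * j + r) * 12 + (3 * k + s);
                    assert(local < ke.size());
                    global[{rowBase + r, colBase + s}] += ke[local];
                }
            }
        }
    }
}
```

Each call touches 144 entries per matrix, so with two matrices per tetrahedron the loop performs 288 sparse read-modify-write operations per element. Since the 12x12 element matrix is fully populated, reading it from dense storage at least removes the sparse lookup on the right-hand side; whether that is where most of the time goes would need to be measured.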

What am I doing wrong? Is there a faster way to do this? There must be one...

Thanks in advance.