Boost Users :

From: Patel Priyank-PPATEL1 (priyankpatel_at_[hidden])
Date: 2006-05-10 14:37:13


Hi Joaquin,

I delete procedures in only two places: the handle_timeout and handle methods shown below.
I grepped my trace output and handle_timeout is never called, so every deletion happens in one place, the procedure's own "handle" method. I am going to try the changes you mentioned. This is just to show you how I create and delete procedure objects.

Thanks
Priyank

// PROCEDURE and PROCEDURE DELETER

#include "../../procedure/Sample_Procedure.h"

int Sample_Procedure::ID = 1000;

Sample_Procedure::Sample_Procedure (Access_Point * _ap, int _recv_timer):
ap_(_ap), state_(Procedure_State_Enum::INITIAL), type_(Procedure_Type_Enum::SAMPLE_TYPE), recv_timer_(_recv_timer),
recv_timer_handler_(new Procedure_Timer_Handler(recv_timer_))
{
        LD_TRACE("Sample_Procedure::Sample_Procedure");
        id_ = ++Sample_Procedure::ID;
}

int Sample_Procedure::start()
{
        LD_TRACE("Sample_Procedure::start");
        // Create fake message to send
        char data[4];
        memcpy(data, &id_, 4);
        boost::scoped_ptr<ACE_Message_Block> mb(new ACE_Message_Block(4, ACE_Message_Block::MB_DATA, 0, data));
        mb->wr_ptr(4);
        // end creating fake message
        // schedule timer for incoming message
        recv_timer_handler_->schedule(this);
        // start thread that sends out message after certain interval
        // send link continuity status request
        Connection * ap_tcp = (Connection*) ap_->connection(Connection_Type_Enum::AP_TCP);
        if (ap_tcp) {
                if (ap_tcp->send(mb) != 0) {
                        ACE_ERROR ((LD_ERROR, "%N:%l Error in sending data.\n"));
                        return -1;
                }
                ACE_DEBUG((LM_INFO, "SENT=> P_ID: %d ", id_));
                ACE_HEX_DUMP((LM_INFO, data, 4));
        }
        // return success (an int function must return a value on every path)
        return 0;
}

void Sample_Procedure::handle_timeout()
{
        LD_TRACE("Sample_Procedure::handle_timeout");
        // update statistic for failed procedure
        // REALLY IMPORTANT
        // MAKE SURE TO DO FOLLOWING
        // remove this object from connection/s pool
        // cancel receive handler
        // delete this object
        delete this;
}

int Sample_Procedure::handle(boost::shared_ptr<ACE_Message_Block> _msg_block)
{
        LD_TRACE("Sample_Procedure::handle");
        // Log data first
        ACE_DEBUG((LM_INFO, "RECV<= P_ID: %d ", id_));
        ACE_HEX_DUMP((LM_INFO, _msg_block->rd_ptr(), 4));
        // update statistic for passed procedure
        // REALLY IMPORTANT
        // MAKE SURE TO DO FOLLOWING
        // remove this object from connection/s pool
        ap_->unsubscribe(Connection_Type_Enum::AP_TCP, this);
        // cancel receive timer handler
        recv_timer_handler_->cancel();
        // delete this object. The executor will not delete this object, since the
        // executor does not go out of scope until its total execution duration
        // finishes. So delete the procedure as soon as it is done executing, to
        // release memory. Also, the receive timer handler is deleted when this
        // object is deleted, so do not delete it explicitly
        ACE_DEBUG((LM_DEBUG, "DELETING PROCEDURE: %d\n", this->id()));
        delete this;
        // no member may be touched after this point; just report success
        return 0;
}

// PROCEDURE GENERATOR

#include "../include/Procedure_Executor.h"
#include "../include/Duration_Timer_Handler.h"
#include "../include/Executor_Reactor.h"

Procedure_Executor::Procedure_Executor (
        const Procedure_Type_Enum& _type, int _starting_sleep, int _hr, int _min, int _sec,
        int _iterations, int _interval) :
procedure_type_(_type), starting_sleep_(_starting_sleep), hr_(_hr), min_(_min), sec_(_sec),
duration_(0), num_iterations_(_iterations), interval_(_interval), duration_timer_handler_(0)
{
        LD_TRACE("Procedure_Executor::Procedure_Executor");
        // check assertion to make sure proper data is supplied
        ACE_ASSERT(Procedure_Type_Enum::locate(_type) == true);
        ACE_ASSERT(hr_ >= 0);
        ACE_ASSERT(min_ >= 0);
        ACE_ASSERT(sec_ >= 0);
        ACE_ASSERT(num_iterations_ > 0);
        ACE_ASSERT(interval_ > 0);
        // convert the duration to seconds
        duration_ += (60*60*hr_);
        duration_ += (60*min_);
        duration_ += sec_;
        ACE_DEBUG((LM_DEBUG, "%N:%l Duration to run this procedure: %d seconds.\n", duration_));
        // convert the interval to milliseconds
        unsigned long milli_sec = interval_ * 1000;
        // find out sleep time (interval time divided by number of iterations)
        sleep_ = (unsigned long) (milli_sec / num_iterations_);
        ACE_DEBUG((LM_DEBUG,
                        "%N:%l Procedure %s will run every %d milliseconds.\n",
                        _type.to_string().c_str(), sleep_));
        // create new reactor
        reactor_.reset(new Executor_Reactor());
        // create duration timer handler
        duration_timer_handler_.reset(new Duration_Timer_Handler(duration_));
}

int Procedure_Executor::start()
{
        LD_TRACE("Procedure_Executor::start");
        // sleep for starting period time
        if (starting_sleep_ != 0) {
                ACE_OS::sleep(starting_sleep_);
        }
        // TODO: Might require a high-resolution timer/stopwatch; based on its value
        // we might have to sleep longer (sleeping less is of course not possible).
        // start up the reactor
        reactor_->activate();
        // Register a recurring timer so that every time sleep_ expires,
        // handle_timeout of this object is called. schedule_timer takes the
        // time values by const reference, so a stack object avoids a leak.
        ACE_Time_Value interval(0, sleep_ * 1000); // sleep_ is in milliseconds
        reactor_->get_ace_reactor().schedule_timer(this, 0, interval, interval);
        // Start the duration timer as well and pass this pointer to it; that object
        // should call back the stop() method of this class once the duration expires,
        // and we will stop running this procedure
        duration_timer_handler_->schedule(this, &(reactor_->get_ace_reactor()));
        // return success
        return 0;
}

Procedure_Executor::~Procedure_Executor()
{
        LD_TRACE("Procedure_Executor::~Procedure_Executor");
}

int Procedure_Executor::handle_timeout(const ACE_Time_Value &_time, const void * _procedure)
{
        LD_TRACE("Procedure_Executor::handle_timeout");
        // select access point
        Access_Point * ap = (Access_Point*) (Loader_Config::getInstance()->get_aps()->next());
        if (ap == 0) {
                Loader_Config::getInstance()->get_aps()->reset();
                ap = (Access_Point*) (Loader_Config::getInstance()->get_aps()->next());
        }
        // create procedure based on procedure type specified
        if (procedure_type_ == Procedure_Type_Enum::SAMPLE_TYPE) {
                // create procedure
                Sample_Procedure * procedure = new Sample_Procedure(ap, 5);
                // subscribe to related connections
                ap->subscribe(Connection_Type_Enum::AP_TCP, procedure);
                // call start of that procedure
                procedure->start();
        }
        // return success, this will keep this handler in register for further calls
        return 0;
}

int Procedure_Executor::stop()
{
        LD_TRACE("Procedure_Executor::stop");
        // stop the reactor
        reactor_->stop();
        // remove duration handler from reactor
        reactor_->get_ace_reactor().remove_handler(duration_timer_handler_.get(), ACE_Event_Handler::TIMER_MASK);
        // remove this from reactor
        reactor_->get_ace_reactor().remove_handler(this, ACE_Event_Handler::TIMER_MASK);
        // return success
        return 0;
}
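
Both handle and handle_timeout above end in "delete this", but only handle unsubscribes from the pool first. A minimal sketch of one way to make every deletion path behave the same is to let the destructor do the unsubscribing. Pool and Procedure below are hypothetical stand-ins, not the classes from this post, and the pool is a plain std::set rather than a Boost.MultiIndex container:

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class Procedure;

// Hypothetical pool: just a set of raw pointers.
class Pool {
public:
    void subscribe(Procedure* p) { members_.insert(p); }
    void unsubscribe(Procedure* p) { members_.erase(p); }
    std::size_t size() const { return members_.size(); }
private:
    std::set<Procedure*> members_;
};

// Hypothetical procedure that registers itself on construction and
// removes itself in the destructor, so no deletion path can forget it.
class Procedure {
public:
    explicit Procedure(Pool& pool) : pool_(pool) { pool_.subscribe(this); }
    ~Procedure() { pool_.unsubscribe(this); }

    // Both completion paths simply delete; the destructor cleans up.
    void handle() { delete this; }
    void handle_timeout() { delete this; }

private:
    Procedure(const Procedure&);            // non-copyable (C++03 style)
    Procedure& operator=(const Procedure&); // non-assignable
    Pool& pool_;
};

// Runs one procedure through each path and reports what is left pooled.
std::size_t pooled_after_both_paths() {
    Pool pool;
    Procedure* a = new Procedure(pool);
    Procedure* b = new Procedure(pool);
    a->handle();
    b->handle_timeout();
    return pool.size();
}
```

With this arrangement the pool cannot end up holding a pointer to a destroyed object, whichever method ran last. Whether it fits the real Access_Point subscription API is an open question.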

   

-----Original Message-----
From: boost-users-bounces_at_[hidden] [mailto:boost-users-bounces_at_[hidden]] On Behalf Of JOAQUIN LOPEZ MUÑOZ
Sent: Wednesday, May 10, 2006 12:05 PM
To: boost-users_at_[hidden]
Subject: Re: [Boost-users] [multi_index] Core dump in multi-index library !!

Patel Priyank-PPATEL1 wrote:
> Hi Joaquin,
>
> Seems the problem was happening when I call add and my add method
> tries to first find out whether that id is already present in
> container or not. It is not during remove method call.

(My hunch is that the problem will randomly show at various locations, please keep on reading.)

> I was concentrating on wrong portion of the code before. Anyways, I
> have changed and put some print out in code also at the end I have
> included output that comes. I have changed container to ordered index
> as you have said, but seems to be the same problem.

This is a very valuable piece of information, and increases my suspicions that the problem lies in your code rather than in B.MI: hashed and ordered indices share almost no code at all, so if the problem persists after switching the index type it's probably because the error lies outside the library. Of course, I might be wrong.

> It seems that it is core dumping on following line:
>
> bool Procedure_Pool::find_by_id(Procedure * _procedure) {
>   LD_TRACE("Procedure_Pool::find_by_id");
>   ACE_ASSERT(_procedure != 0);
>   ACE_DEBUG((LM_DEBUG, "PROCEDURE TO FIND: %d\n", _procedure->id()));
>   Procedure_By_Id::iterator it = procedure_by_id_.find(_procedure->id());
>   ACE_DEBUG((LM_DEBUG, "AFTER FIND CALL\n"));
>   return (it != procedure_by_id_.end());
> }
>
> Please let me know if I am doing something wrong here. Also, let me
> know if you need any more information. Please see debug output at the
> end of the email.

I was considering two possible causes for the crash in my previous post (leaving out the possibility of a bug in B.MI):

1. Procedure id inadvertently changed.
2. Procedure object deleted before having been removed
   from the Procedure_Pool.

The fact that the crash is now happening in find seems to rule out 1, because inconsistent id's might lead to wrong results in a query, but in no case to a core dump.
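
To see why cause 2 can crash inside find, note that an index over pointers has to read the key through the stored pointers on every lookup. The sketch below only illustrates that mechanism. It uses std::set with a hypothetical Id_Less comparator instead of Boost.MultiIndex, and a hypothetical Procedure class that counts how often its id is read:

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the poster's Procedure class.
class Procedure {
public:
    explicit Procedure(int id) : id_(id), id_reads_(0) {}
    int id() const { ++id_reads_; return id_; }
    int id_reads() const { return id_reads_; }
private:
    int id_;
    mutable int id_reads_; // counts calls to id()
};

// Orders stored pointers by the id of the pointed-to object.
struct Id_Less {
    bool operator()(const Procedure* a, const Procedure* b) const {
        return a->id() < b->id();
    }
};

// Returns how often find() read the id of the *stored* element with id 2,
// or -1 if the lookup unexpectedly failed.
int stored_reads_during_find() {
    Procedure a(1), b(2), c(3);
    std::set<const Procedure*, Id_Less> pool;
    pool.insert(&a);
    pool.insert(&b);
    pool.insert(&c);

    Procedure probe(2);
    const int before = b.id_reads();
    const bool found = pool.find(&probe) != pool.end();
    return found ? b.id_reads() - before : -1;
}
```

The lookup dereferences elements already in the container, so if one of them had been deleted while still stored, the same lookup would read freed memory. That is undefined behaviour and can show up as a core dump at seemingly random places.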

So, I'd say you are somehow deleting Procedure objects without first removing them from the Procedure_Pool.
Maybe you can check this as follows:

If you can locate all the places in your code where a Procedure object is deleted, i.e., expressions of the form:

  delete p; // p is a Procedure *

try instrumenting *all of them* with the following check:

  if(p != 0){
    // Get the pool. I guess you've got some API to do this.
    Procedure_Hashed_Pool* pool=...;
    // Since we're deleting p, it mustn't still be in pool!
    // (find_by_id as quoted above takes the pointer and returns bool)
    assert(!pool->find_by_id(p));
  }
  delete p;

Do you get an assertion? If so, here's your bug.
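
If the deletions are scattered, the check can be wrapped in one helper so that every call site goes through it. This is only a sketch under assumptions: Procedure and Procedure_Pool here are simplified, hypothetical stand-ins, and the helper (delete_if_unpooled, a made-up name) returns false instead of asserting so that the offending call site can be logged:

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the real Procedure class.
class Procedure {
public:
    explicit Procedure(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

// Hypothetical stand-in for the pool; it only tracks ids.
class Procedure_Pool {
public:
    void add(Procedure* p) { ids_.insert(p->id()); }
    void remove(Procedure* p) { ids_.erase(p->id()); }
    bool find_by_id(const Procedure* p) const {
        return ids_.count(p->id()) != 0;
    }
private:
    std::set<int> ids_;
};

// Deletes p only if it is no longer in the pool. Returns false and
// leaves p untouched when p is still pooled (that call site is the bug).
bool delete_if_unpooled(Procedure_Pool& pool, Procedure* p) {
    if (p != 0 && pool.find_by_id(p)) {
        return false;
    }
    delete p;
    return true;
}

// Usage: the first attempt is refused, the second succeeds after removal.
bool demo() {
    Procedure_Pool pool;
    Procedure* p = new Procedure(1001);
    pool.add(p);
    const bool refused = !delete_if_unpooled(pool, p);
    pool.remove(p);
    const bool deleted = delete_if_unpooled(pool, p);
    return refused && deleted;
}
```

Replacing each "delete p;" (including "delete this;") with a call to such a helper makes it harder to miss one of the deletion sites.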

Please report back. If this doesn't clear up the problem we can explore some other paths.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo