I have a state [A] with three orthogonal substates [X, Y, Z]. Both [A] and [X] provide custom reactions for an event [E]. When event [E] is processed, the state machine first calls the custom reaction in [A], which returns forward_event(). After this, the custom reaction in [A] is called again, and only then is the custom reaction in [X] called.

From my experience, it doesn't behave like that.

 
It would be easier to work with the other way around: first the more "specialized" inner states do their work, and then the "generalized" outer states do everything else.

That's what usually happens.

Maybe there's some problem with your FSM definition?