[rabbitmq-discuss] Active-active crash report

Matthias Radestock matthias at rabbitmq.com
Sat Apr 28 08:13:18 BST 2012


Vadim,

(putting the list back on cc)

On 27/04/12 23:28, Vadim Chekan wrote:
> I've spent some time today playing with different client settings. It
> seems TTL does not affect the failures at all.

That's good to know. The more factors we can eliminate as possible 
causes the better.

> I managed to reproduce the crash many times today. The basic idea is:
> an application where 40 threads create a pub/sub exchange and publish a
> message every second. Under this load I bring down the master node (all
> queues are usually created on the same node), and often this causes
> another node to fail. Here is my load simulator (in C#):
> http://www.heypasteit.com/clip/0B5W

That code connects to "rabbitmq-dev". Is that a load balancer sitting in 
front of the three nodes?

How do you deal with the disconnects resulting from the shutting down of 
nodes? There doesn't seem to be any code to handle that.
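
For illustration only, here is a minimal sketch of what such handling
could look like. It is written in Python rather than C#, and `connect`
is a hypothetical callable standing in for whatever the client library
uses to open a connection; the host names in the usage note are made up
too.

```python
import time


def connect_with_failover(hosts, connect, attempts=3, delay=1.0, sleep=time.sleep):
    """Try each host in turn, for up to `attempts` rounds, and return the
    first connection that `connect(host)` yields.

    `connect` is hypothetical: it is assumed to raise OSError when the
    node is down or unreachable.
    """
    last_error = None
    for round_no in range(attempts):
        for host in hosts:
            try:
                return connect(host)
            except OSError as err:
                last_error = err  # this node failed; move on to the next
        if round_no < attempts - 1:
            # Back off before the next round: delay, 2*delay, 4*delay, ...
            sleep(delay * (2 ** round_no))
    raise ConnectionError(
        f"no node reachable after {attempts} rounds"
    ) from last_error
```

A publisher thread would call something like
`connect_with_failover(["node1", "node2", "node3"], connect)` again
whenever its connection drops, then redeclare its exchange before
resuming publishing.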

Mind you, I suspect that the failure should still be reproducible 
without any subscriptions and publishes. Would be good to try that and 
just watch the broker logs for errors.

> Since the mailing list does not allow zip attachments, I'm mailing
> them to you guys directly

Thanks for posting these. There is an error in the logs that we haven't 
seen before:

            {{badmatch,[]},
             [{rabbit_mirror_queue_misc,'-remove_from_queue/2-fun-0-',2},
              {mnesia_tm,apply_fun,3},
              {mnesia_tm,execute_transaction,5},
              {rabbit_misc,'-execute_mnesia_transaction/1-fun-0-',1},
              {worker_pool_worker,handle_call,3},
              {gen_server2,handle_msg,2},
              {proc_lib,wake_up,3}]}

Looking at the code, this appears to indicate that there are no 
master/mirror processes left for the queue. Which is...unexpected. That 
should give us something to go on.
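
To make that reading concrete, here is a rough Python analogue. It
assumes the function drops the processes on dead nodes and then matches
the survivors against a "new master plus remaining mirrors" pattern;
that structure is an assumption for illustration, not a copy of the
broker code, and `promote_survivor` is a made-up name.

```python
def promote_survivor(master, mirrors, dead_nodes):
    """Drop every process on a dead node, then split the survivors into
    a new master and the remaining mirrors.

    Processes are modelled as (node, name) pairs.
    """
    alive = [p for p in [master] + mirrors if p[0] not in dead_nodes]
    # Analogue of an Erlang match such as `[NewMaster | Rest] = Alive`.
    # With no survivors the unpacking raises ValueError, much as the
    # Erlang match would fail with {badmatch, []}.
    new_master, *rest = alive
    return new_master, rest
```

With one node down the oldest surviving mirror is promoted; with every
node listed as dead the unpacking fails, which is the situation the
error above seems to describe.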

Regards,

Matthias.

