[rabbitmq-discuss] Cluster Node Crashes

Cordell, Ron Ron.Cordell at RelayHealth.com
Wed Sep 18 17:25:24 BST 2013


Thanks, Simon. That explains some of the behavior I've seen, which matches
exactly what you describe: one node in the cluster is partitioned from
another node but is still seen by the remaining nodes. If I check the
cluster status from some nodes, all nodes in the cluster appear to be
functional with no partitions. If I check the cluster status from the
"affected" nodes, they show as partitioned.
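
To make the pattern concrete, here is a minimal Python sketch (not RabbitMQ
code) of spotting a partial partition. It assumes the per-node view of the
cluster has already been collected by some means; the `views` mapping and the
`find_partial_partitions` helper are hypothetical names used only for
illustration.

```python
from itertools import combinations


def find_partial_partitions(views):
    """Find node pairs that are cut off from each other but share a witness.

    `views` maps each node name to the set of peers that node reports as
    reachable (an assumed input format, gathered however you like).
    Returns a list of (node_a, node_b, witnesses) tuples.
    """
    findings = []
    for a, b in combinations(sorted(views), 2):
        # The link is healthy only if both sides can see each other.
        if b in views[a] and a in views[b]:
            continue
        # A witness is a third node that can still see both a and b.
        witnesses = sorted(
            n for n in views
            if n not in (a, b) and a in views[n] and b in views[n]
        )
        if witnesses:
            findings.append((a, b, witnesses))
    return findings


# The three-node case from Simon's reply: A and B cannot see each other,
# but C can see both.
views = {
    "A": {"C"},
    "B": {"C"},
    "C": {"A", "B"},
}
print(find_partial_partitions(views))
```

In this example only the A/B pair is reported, with C as the witness, which
is consistent with C's view of the cluster looking healthy while A and B each
see a partition.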

Now I just need to get to the bottom of what is going on with respect to
network partitions.

Cheers,

Ron

On 9/18/13 5:18 AM, "Simon MacMullen" <simon at rabbitmq.com> wrote:

>On 17/09/13 23:48, Cordell, Ron wrote:
>> =ERROR REPORT==== 15-Sep-2013::06:34:45 ===
>> ** Generic server <0.353.0> terminating
>> ** Last message in was {'DOWN',#Ref<0.0.0.3812>,process,<5111.350.0>,
>>                             {function_clause,
>>                                 [{orddict,fetch,
>>                                      [{9,<5111.350.0>},[]],
>>                                      [{file,"orddict.erl"},{line,72}]},
>>                                  {gm,check_neighbours,1,[]},
>>                                  {gm,handle_cast,2,[]},
>>                                  {gen_server2,handle_msg,2,[]},
>>                                  {proc_lib,wake_up,3,
>>                                      [{file,"proc_lib.erl"},{line,249}]}]}}
>
>Hi. This is a stack trace we're aware of. The cause of this is a partial
>network partition - e.g. in a cluster of three nodes ABC, where A is
>partitioned from B but C can see both A and B.
>
>We hope to mitigate this crash in a future release. In the meantime I'm
>afraid all I can suggest is that you try to ensure such partitions don't
>happen.
>
>Cheers, Simon
>
>-- 
>Simon MacMullen
>RabbitMQ, Pivotal


