In my environment, I have numerous RabbitMQ nodes for testing purposes, and I frequently change the cluster configuration (e.g. which nodes are part of a cluster).<br><br>Using 2.8.6 and subsequently 2.8.7 on Erlang R15B01, I have recently seen RabbitMQ fail to start on a node several times. The cause seems to be a node that is no longer part of the cluster.<br>
<br>For example, at one point I had a three-node cluster: play, play2, and util. I then removed util from the cluster. To be honest, I did this simply by changing the rabbitmq.config file, rather than by explicitly running rabbitmqctl stop_app while the cluster was still running.<br>
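<br>For reference, the explicit removal I skipped would, as far as I understand the 2.8.x clustering guide, look roughly like the sketch below, run on the node being removed (util here) while the other nodes are still up. The function name leave_cluster and the RABBITMQCTL override are my own hypothetical names, added only so the steps can be printed rather than executed; stop_app, reset and start_app are the actual rabbitmqctl commands. I have not verified that doing this would have avoided the failure below.<br>

```shell
#!/bin/sh
# Sketch only: explicitly detach a node from its cluster before editing
# rabbitmq.config. Run on the node that is leaving (util in my case).
# RABBITMQCTL is a hypothetical override, e.g. "echo rabbitmqctl" for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

leave_cluster() {
    # Stop the RabbitMQ application but keep the Erlang VM running,
    # then reset the node (clears its local Mnesia state and cluster
    # membership), then start it again as a standalone node.
    $RABBITMQCTL stop_app &&
    $RABBITMQCTL reset &&
    $RABBITMQCTL start_app
}
```

<br>Calling leave_cluster with RABBITMQCTL="echo rabbitmqctl" set only prints the three commands, which is a cheap way to check the order before touching a real node.<br>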
<br>My steps:<br><ul><li>With the three-node cluster running, stop all brokers.</li><li>Create a new rabbitmq.config file listing just two brokers.</li><li>Attempt to start the new cluster.</li></ul><br>Here's my revised RabbitMQ config file. Note the references to rabbit@util in the log output that follows it, even though util is no longer part of the cluster:<br>
--------<br>[<br>{rabbit, [{cluster_nodes, [rabbit@play,rabbit@play2]}, {disk_free_limit, 104857600}]},<br>{mnesia, [{debug, trace}]}<br>].<br>--------<br><br><br>The rabbit@play.log file:<br><br>=ERROR REPORT==== 28-Sep-2012::11:31:11 ===<br>
Mnesia(rabbit@play): ** ERROR ** (core dumped to file: "/home/mpietrek/MnesiaCore.rabbit@play_1348_857071_113813")<br> ** FATAL ** Failed to merge schema: Bad cookie in table definition mirrored_sup_childspec: rabbit@play = {cstruct,mirrored_sup_childspec,ordered_set,[rabbit@util,rabbit@play2,rabbit@play],[],[],0,read_write,false,[],[],false,mirrored_sup_childspec,[key,mirroring_pid,childspec],[],[],[],{{1346,266106,862481},rabbit@play},{{4,0},{rabbit@util,{1348,781672,983885}}}}, rabbit@util = {cstruct,mirrored_sup_childspec,ordered_set,[rabbit@util],[],[],0,read_write,false,[],[],false,mirrored_sup_childspec,[key,mirroring_pid,childspec],[],[],[],{{1348,854318,794171},rabbit@util},{{2,0},[]}}<br>
<br><br>=ERROR REPORT==== 28-Sep-2012::11:31:21 ===<br>** Generic server mnesia_monitor terminating <br>** Last message in was {'EXIT',<0.44.0>,killed}<br>** When Server state == {state,<0.44.0>,[],[],true,[],undefined,[]}<br>
** Reason for termination == <br>** killed<br><br>=ERROR REPORT==== 28-Sep-2012::11:31:21 ===<br>** Generic server mnesia_recover terminating <br>** Last message in was {'EXIT',<0.44.0>,killed}<br>** When Server state == {state,<0.44.0>,undefined,undefined,undefined,0,false,<br>
true,[]}<br>** Reason for termination == <br>** killed<br><br>=ERROR REPORT==== 28-Sep-2012::11:31:21 ===<br>** Generic server mnesia_snmp_sup terminating <br>** Last message in was {'EXIT',<0.44.0>,killed}<br>
** When Server state == {state,<br> {local,mnesia_snmp_sup},<br> simple_one_for_one,<br> [{child,undefined,mnesia_snmp_sup,<br> {mnesia_snmp_hook,start,[]},<br>
transient,3000,worker,<br> [mnesia_snmp_sup,mnesia_snmp_hook,<br> supervisor]}],<br> undefined,0,86400000,[],mnesia_snmp_sup,[]}<br>
** Reason for termination == <br>** killed<br><br>=ERROR REPORT==== 28-Sep-2012::11:31:21 ===<br>** Generic server mnesia_subscr terminating <br>** Last message in was {'EXIT',<0.44.0>,killed}<br>** When Server state == {state,<0.44.0>,57361}<br>
** Reason for termination == <br>** killed<br><br>=INFO REPORT==== 28-Sep-2012::11:31:21 ===<br> application: mnesia<br> exited: {shutdown,{mnesia_sup,start,[normal,[]]}}<br> type: permanent<br><br><br>And finally, the output from rabbitmq-server:<br>
<br>Mnesia(rabbit@play): mnesia_late_loader starting: <0.81.0><br>Mnesia(rabbit@play): Transaction {tid,21431,<0.82.0>} calling #Fun<mnesia_schema.42.93736344> with [] failed: <br> {aborted,{throw,[66,97,100,32,99,111,111,107,105,101,32,105,110,32,116,97,98,<br>
108,101,32,100,101,102,105,110,105,116,105,111,110,32,<br> "mirrored_sup_childspec",58,32,"rabbit@play",32,61,32,<br> [123,<br> ["cstruct",44,"mirrored_sup_childspec",44,"ordered_set",44,<br>
[91,["rabbit@util",44,"rabbit@play2",44,"rabbit@play"],93],<br> 44,"[]",44,"[]",44,"0",44,"read_write",44,"false",44,"[]",<br>
44,"[]",44,"false",44,"mirrored_sup_childspec",44,<br> [91,["key",44,"mirroring_pid",44,"childspec"],93],<br> 44,"[]",44,"[]",44,"[]",44,<br>
[123,<br> [[123,["1346",44,"266106",44,"862481"],125],<br> 44,"rabbit@play"],<br> 125],<br> 44,<br>
[123,<br> [[123,["4",44,"0"],125],<br> 44,<br> [123,<br> ["rabbit@util",44,<br> [123,["1348",44,"781672",44,"983885"],125]],<br>
125]],<br> 125]],<br> 125],<br> 44,32,"rabbit@util",32,61,32,<br> [123,<br> ["cstruct",44,"mirrored_sup_childspec",44,"ordered_set",44,<br>
[91,["rabbit@util"],93],<br> 44,"[]",44,"[]",44,"0",44,"read_write",44,"false",44,"[]",<br> 44,"[]",44,"false",44,"mirrored_sup_childspec",44,<br>
[91,["key",44,"mirroring_pid",44,"childspec"],93],<br> 44,"[]",44,"[]",44,"[]",44,<br> [123,<br> [[123,["1348",44,"854318",44,"794171"],125],<br>
44,"rabbit@util"],<br> 125],<br> 44,<br> [123,[[123,["2",44,"0"],125],44,"[]"],125]],<br> 125],<br>
"\n"]}}<br>Mnesia(rabbit@play): mnesia_monitor got FATAL ERROR from: <0.81.0><br>Mnesia(rabbit@play): mnesia_subscr terminated: killed<br>{"Kernel pid terminated",application_controller,"{application_start_failure,mnesia,{shutdown,{mnesia_sup,start,[normal,[]]}}}"}<br>
<br>Crash dump was written to: erl_crash.dump<br>Kernel pid terminated (application_controller) ({application_start_failure,mnesia,{shutdown,{mnesia_sup,start,[normal,[]]}}})<br><br>Matt<br>
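<br>P.S. The long list of integers in the rabbitmq-server output is just the same message as in the log file, printed as character codes. A minimal sketch for decoding such a list in a POSIX shell follows; decode_codes is a hypothetical helper name of my own, and the codes passed to it are the first 30 from the output above.<br>

```shell
#!/bin/sh
# Turn a list of decimal character codes back into readable text.
decode_codes() {
    for code in "$@"; do
        # The inner printf converts the code to three octal digits;
        # the outer printf interprets \ooo and emits that character.
        printf "\\$(printf '%03o' "$code")"
    done
    printf '\n'
}

# First 30 codes from the aborted transaction above
decode_codes 66 97 100 32 99 111 111 107 105 101 32 105 110 32 \
    116 97 98 108 101 32 100 101 102 105 110 105 116 105 111 110
# prints: Bad cookie in table definition
```

<br>So the thrown term is the same "Bad cookie in table definition" error that Mnesia reports in rabbit@play.log, not a separate failure.<br>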