[rabbitmq-discuss] Errors starting RabbitMQ when cluster membership changes
Matt Pietrek
mpietrek at skytap.com
Fri Sep 28 20:23:38 BST 2012
In my environment I have numerous RabbitMQ nodes for testing purposes, and
I frequently change the cluster configuration (e.g. which nodes are part
of a cluster).
Using 2.8.6 and subsequently 2.8.7 on Erlang R15B01, I've recently seen
RabbitMQ fail to start on a node several times. The cause seems to be
related to a node that is no longer part of the cluster.
For example, at one point I had a three-node cluster: play, play2, and
util. I then removed util from the cluster, although, to be honest, I did
so simply by changing the rabbitmq.config file rather than by explicitly
running rabbitmqctl stop_app on it while the cluster was still running.
My steps:
- Running as a three-node cluster, stop all brokers
- Create a new rabbitmq.config file with just two brokers
- Attempt to start the new cluster.
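For comparison, the explicit removal that these steps skip is usually described as stopping the RabbitMQ application on the departing node, resetting it, and starting it again while the remaining nodes are still up. The sketch below only builds and prints those rabbitmqctl invocations and does not run them. The helper name removal_commands is hypothetical, and whether this sequence would have avoided the failure shown below is an assumption, not something verified here.

```python
def removal_commands(node):
    """Return the rabbitmqctl invocations for explicitly removing `node`.

    Dry run only: the commands are returned as argument lists, not executed.
    """
    return [
        ["rabbitmqctl", "-n", node, "stop_app"],   # stop RabbitMQ, keep the Erlang VM up
        ["rabbitmqctl", "-n", node, "reset"],      # leave the cluster and wipe local state
        ["rabbitmqctl", "-n", node, "start_app"],  # restart as a standalone node
    ]


commands = removal_commands("rabbit@util")
for cmd in commands:
    print(" ".join(cmd))
```

Passing the argument lists to subprocess.run on the host of the departing node would execute them. Note that reset discards that node's data.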
Here's my revised RabbitMQ config file. Note that the logs further down
still reference rabbit@util, which is no longer part of the cluster:
--------
[
{rabbit, [{cluster_nodes, [rabbit@play, rabbit@play2]},
          {disk_free_limit, 104857600}]},
{mnesia, [{debug, trace}]}
].
--------
The rabbit@play.log file:
=ERROR REPORT==== 28-Sep-2012::11:31:11 ===
Mnesia(rabbit@play): ** ERROR ** (core dumped to file:
"/home/mpietrek/MnesiaCore.rabbit@play_1348_857071_113813")
** FATAL ** Failed to merge schema: Bad cookie in table definition
mirrored_sup_childspec: rabbit@play =
{cstruct,mirrored_sup_childspec,ordered_set,[rabbit@util,rabbit@play2,rabbit@play],[],[],0,read_write,false,[],[],false,mirrored_sup_childspec,[key,mirroring_pid,childspec],[],[],[],{{1346,266106,862481},rabbit@play},{{4,0},{rabbit@util,{1348,781672,983885}}}},
rabbit@util =
{cstruct,mirrored_sup_childspec,ordered_set,[rabbit@util],[],[],0,read_write,false,[],[],false,mirrored_sup_childspec,[key,mirroring_pid,childspec],[],[],[],{{1348,854318,794171},rabbit@util},{{2,0},[]}}
=ERROR REPORT==== 28-Sep-2012::11:31:21 ===
** Generic server mnesia_monitor terminating
** Last message in was {'EXIT',<0.44.0>,killed}
** When Server state == {state,<0.44.0>,[],[],true,[],undefined,[]}
** Reason for termination ==
** killed
=ERROR REPORT==== 28-Sep-2012::11:31:21 ===
** Generic server mnesia_recover terminating
** Last message in was {'EXIT',<0.44.0>,killed}
** When Server state ==
{state,<0.44.0>,undefined,undefined,undefined,0,false,
true,[]}
** Reason for termination ==
** killed
=ERROR REPORT==== 28-Sep-2012::11:31:21 ===
** Generic server mnesia_snmp_sup terminating
** Last message in was {'EXIT',<0.44.0>,killed}
** When Server state == {state,
{local,mnesia_snmp_sup},
simple_one_for_one,
[{child,undefined,mnesia_snmp_sup,
{mnesia_snmp_hook,start,[]},
transient,3000,worker,
[mnesia_snmp_sup,mnesia_snmp_hook,
supervisor]}],
undefined,0,86400000,[],mnesia_snmp_sup,[]}
** Reason for termination ==
** killed
=ERROR REPORT==== 28-Sep-2012::11:31:21 ===
** Generic server mnesia_subscr terminating
** Last message in was {'EXIT',<0.44.0>,killed}
** When Server state == {state,<0.44.0>,57361}
** Reason for termination ==
** killed
=INFO REPORT==== 28-Sep-2012::11:31:21 ===
application: mnesia
exited: {shutdown,{mnesia_sup,start,[normal,[]]}}
type: permanent
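The two cstruct records in the FATAL message carry different cookies: {{1346,266106,862481},rabbit@play} on one side and {{1348,854318,794171},rabbit@util} on the other. Assuming the first element of each cookie is an Erlang now()-style {MegaSecs, Secs, MicroSecs} timestamp taken when the table definition was created (an interpretation of the format, not something the log states), the following sketch converts both to UTC so they can be compared. The helper name now_tuple_to_utc is hypothetical.

```python
from datetime import datetime, timezone


def now_tuple_to_utc(mega, secs, micro):
    """Convert an Erlang now()-style {MegaSecs, Secs, MicroSecs} tuple to UTC."""
    whole_seconds = mega * 1_000_000 + secs
    return datetime.fromtimestamp(whole_seconds, tz=timezone.utc).replace(
        microsecond=micro
    )


# Cookie timestamps copied from the two cstruct records in the log.
cookies = {
    "rabbit@play": (1346, 266106, 862481),
    "rabbit@util": (1348, 854318, 794171),
}

created = {node: now_tuple_to_utc(*stamp) for node, stamp in cookies.items()}
for node, when in created.items():
    print(node, when.isoformat())
```

Under that assumption, the rabbit@play definition dates from late August 2012, while the rabbit@util one was created on 28 September 2012, the same day as the failed start. That would be consistent with util having rebuilt its own copy of the table independently of the cluster.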
And finally, the output from rabbitmq-server:
Mnesia(rabbit@play): mnesia_late_loader starting: <0.81.0>
Mnesia(rabbit@play): Transaction {tid,21431,<0.82.0>} calling
#Fun<mnesia_schema.42.93736344> with [] failed:
{aborted,{throw,[66,97,100,32,99,111,111,107,105,101,32,105,110,32,116,97,98,
108,101,32,100,101,102,105,110,105,116,105,111,110,32,
"mirrored_sup_childspec",58,32,"rabbit@play",32,61,32,
[123,
["cstruct",44,"mirrored_sup_childspec",44,"ordered_set",44,
[91,["rabbit@util",44,"rabbit@play2",44,"rabbit@play"],93],
44,"[]",44,"[]",44,"0",44,"read_write",44,"false",44,"[]",
44,"[]",44,"false",44,"mirrored_sup_childspec",44,
[91,["key",44,"mirroring_pid",44,"childspec"],93],
44,"[]",44,"[]",44,"[]",44,
[123,
[[123,["1346",44,"266106",44,"862481"],125],
44,"rabbit@play"],
125],
44,
[123,
[[123,["4",44,"0"],125],
44,
[123,
["rabbit@util",44,
[123,["1348",44,"781672",44,"983885"],125]],
125]],
125]],
125],
44,32,"rabbit@util",32,61,32,
[123,
["cstruct",44,"mirrored_sup_childspec",44,"ordered_set",44,
[91,["rabbit@util"],93],
44,"[]",44,"[]",44,"0",44,"read_write",44,"false",44,"[]",
44,"[]",44,"false",44,"mirrored_sup_childspec",44,
[91,["key",44,"mirroring_pid",44,"childspec"],93],
44,"[]",44,"[]",44,"[]",44,
[123,
[[123,["1348",44,"854318",44,"794171"],125],
44,"rabbit@util"],
125],
44,
[123,[[123,["2",44,"0"],125],44,"[]"],125]],
125],
"\n"]}}
Mnesia(rabbit@play): mnesia_monitor got FATAL ERROR from: <0.81.0>
Mnesia(rabbit@play): mnesia_subscr terminated: killed
{"Kernel pid terminated",application_controller,"{application_start_failure,mnesia,{shutdown,{mnesia_sup,start,[normal,[]]}}}"}
Crash dump was written to: erl_crash.dump
Kernel pid terminated (application_controller) ({application_start_failure,mnesia,{shutdown,{mnesia_sup,start,[normal,[]]}}})
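The long list of integers in the aborted transaction is an Erlang iolist: a nested list of character codes and strings that was printed without being flattened. The sketch below flattens the first part of it, copied from the output above with node names written as rabbit@play, to show that it is the same "Bad cookie in table definition" message that appears in the log file. The helper name flatten_iolist is hypothetical.

```python
def flatten_iolist(term):
    """Recursively flatten nested lists of character codes and strings."""
    if isinstance(term, int):
        return chr(term)  # a single character code, e.g. 66 is "B"
    if isinstance(term, str):
        return term       # already-readable fragment
    return "".join(flatten_iolist(part) for part in term)


# Leading part of the thrown term from the rabbitmq-server output.
prefix = [66, 97, 100, 32, 99, 111, 111, 107, 105, 101, 32, 105, 110, 32,
          116, 97, 98, 108, 101, 32, 100, 101, 102, 105, 110, 105, 116,
          105, 111, 110, 32,
          "mirrored_sup_childspec", 58, 32, "rabbit@play", 32, 61, 32]

# One nested fragment from the same term: the timestamp part of a cookie.
cookie_time = [123, ["1346", 44, "266106", 44, "862481"], 125]

message = flatten_iolist(prefix)
fragment = flatten_iolist(cookie_time)
print(message)
print(fragment)
```

The codes 123, 125, 91, 93 and 44 that recur throughout the term are the characters {, }, [, ] and the comma, which is why the flattened text reproduces the cstruct records.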
Matt