Hello,
Simon MacMullen. <div>Today RabbitMQ went down again.</div><div>The output of sasl.log is the same as last time, but there is something new in the .log file.</div><div>Could you help me find the problem? Thanks.</div>

<div><br></div><div><div><b>disc node&#39;s log</b></div><div><b><br></b></div><div>=INFO REPORT==== 27-Sep-2012::19:59:12 ===</div><div>rabbit on node rabbit@zw_124_156 down</div><div><br></div><div>=WARNING REPORT==== 27-Sep-2012::19:59:13 ===</div>

<div>Mnesia(rabbit@zw_124_177): ** WARNING ** Mnesia is overloaded: {dump_log,</div><div>                                                                time_threshold}</div><div><br></div><div>=INFO REPORT==== 27-Sep-2012::19:59:13 ===</div>

<div>    application: rabbitmq_management</div><div>    exited: shutdown</div><div>    type: temporary</div><div><br></div><div>-------------------------------------------------------<span class="Apple-tab-span" style="white-space:pre">        </span></div>

<div><b>ram node&#39;s log</b></div><div><b><br></b></div><div>=ERROR REPORT==== 27-Sep-2012::19:54:11 ===</div><div>** Node rabbit@zw_124_177 not responding **</div><div>** Removing (timedout) connection **</div><div><br>

</div><div>=INFO REPORT==== 27-Sep-2012::19:54:11 ===</div><div>rabbit on node rabbit@zw_124_177 down</div><div><br></div><div>=INFO REPORT==== 27-Sep-2012::19:54:53 ===</div><div>Statistics database started.</div><div><br>

</div><div>=INFO REPORT==== 27-Sep-2012::19:59:13 ===</div><div>global: Name conflict terminating {rabbit_mgmt_db,&lt;7123.24214.2613&gt;}</div><div><br></div><div>=INFO REPORT==== 27-Sep-2012::19:59:13 ===</div><div>    application: rabbitmq_management</div>

<div>    exited: shutdown</div><div>    type: temporary</div></div><div><br></div><div><br></div><div>thanks.</div><div><br></div><div><br></div><div><br></div><div><br></div><div><br><div class="gmail_quote">2012/9/21 Simon MacMullen <span dir="ltr">&lt;<a href="mailto:simon@rabbitmq.com" target="_blank">simon@rabbitmq.com</a>&gt;</span><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi.<br>
<br>
This looks like a race with management DB failover that we&#39;ve already fixed, but which has not yet made it into any release. This will be fixed in the next bugfix release. Until then if you are affected you can work around it by running the management plugin on only one node in the cluster (and just have rabbitmq_management_agent on the others).<br>


<br>
Cheers, Simon<br>
<br>
On 20/09/12 08:45, liubida wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi all,<br>
<br>
Could anybody point me in the right direction for finding the cause of<br>
a &quot;fake death&quot; (apparent hang) of RabbitMQ?<br>
It seems that at some point RabbitMQ stops receiving messages, without<br>
any warning.<br>
I have set up a cluster with 2 nodes, one disc and one RAM.<br>
If I run the command &quot;rabbitmqctl stop_app &amp;&amp; rabbitmqctl start_app&quot;,<br>
the problem disappears.<br>
<br>
<br>
Here is the disc node&#39;s sasl.log:<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:13 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: child_terminated<br>
Reason: killed<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: start_error<br>
Reason: {already_started,&lt;19704.13906.2335&gt;}<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {&lt;0.28802.1777&gt;,mirrored_supervisor}<br>
Context: shutdown<br>
Reason: reached_max_restart_intensity<br>
Offender: [{pid,&lt;0.28804.1777&gt;},<br>
{name,rabbit_mgmt_db},<br>
{mfa,{rabbit_mgmt_db,start_link,[]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {local,rabbit_mgmt_sup}<br>
Context: child_terminated<br>
Reason: shutdown<br>
Offender: [{pid,&lt;0.28803.1777&gt;},<br>
{name,mirroring},<br>
{mfa,<br>
{mirrored_supervisor,start_internal,<br>
[rabbit_mgmt_sup,<br>
[{rabbit_mgmt_db,<br>
{rabbit_mgmt_db,start_link,[]},<br>
permanent,4294967295,worker,<br>
[rabbit_mgmt_db]}]]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===<br>
Supervisor: {local,rabbit_mgmt_sup}<br>
Context: shutdown<br>
Reason: reached_max_restart_intensity<br>
Offender: [{pid,&lt;0.28803.1777&gt;},<br>
{name,mirroring},<br>
{mfa,<br>
{mirrored_supervisor,start_internal,<br>
[rabbit_mgmt_sup,<br>
[{rabbit_mgmt_db,<br>
{rabbit_mgmt_db,start_link,[]},<br>
permanent,4294967295,worker,<br>
[rabbit_mgmt_db]}]]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
The RAM node&#39;s sasl.log:<br>
=SUPERVISOR REPORT==== 18-Sep-2012::17:24:50 ===<br>
Supervisor: {local,rabbit_mgmt_sup}<br>
Context: child_terminated<br>
Reason: shutdown<br>
Offender: [{pid,&lt;0.14161.2335&gt;},<br>
{name,mirroring},<br>
{mfa,<br>
{mirrored_supervisor,start_internal,<br>
[rabbit_mgmt_sup,<br>
[{rabbit_mgmt_db,<br>
{rabbit_mgmt_db,start_link,[]},<br>
permanent,4294967295,worker,<br>
[rabbit_mgmt_db]}]]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
<br>
=SUPERVISOR REPORT==== 18-Sep-2012::17:24:50 ===<br>
Supervisor: {local,rabbit_mgmt_sup}<br>
Context: shutdown<br>
Reason: reached_max_restart_intensity<br>
Offender: [{pid,&lt;0.14161.2335&gt;},<br>
{name,mirroring},<br>
{mfa,<br>
{mirrored_supervisor,start_internal,<br>
[rabbit_mgmt_sup,<br>
[{rabbit_mgmt_db,<br>
{rabbit_mgmt_db,start_link,[]},<br>
permanent,4294967295,worker,<br>
[rabbit_mgmt_db]}]]}},<br>
{restart_type,permanent},<br>
{shutdown,4294967295},<br>
{child_type,worker}]<br>
<br>
Thanks for any help.<br>
<br>
--bidaliu<br>
<br>
<br>
<br>
_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com" target="_blank">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><span class="HOEnZb"><font color="#888888"><br>


</font></span></blockquote><span class="HOEnZb"><font color="#888888">
<br>
<br>
-- <br>
Simon MacMullen<br>
RabbitMQ, VMware<br>
</font></span></blockquote></div><br></div>