[rabbitmq-discuss] rabbitmq fake-death && no breath. key word in log : mirrored_supervisor rabbit_mgmt_db

Chao Liu dada.chao.liu at gmail.com
Thu Sep 27 14:54:45 BST 2012


Hello, Simon MacMullen.
Today RabbitMQ went down again.
The output in sasl.log is the same as last time, but there is
something new in the main .log.
Could you help me find the problem? Thanks.

disc node's log:

=INFO REPORT==== 27-Sep-2012::19:59:12 ===
rabbit on node rabbit@zw_124_156 down

=WARNING REPORT==== 27-Sep-2012::19:59:13 ===
Mnesia(rabbit@zw_124_177): ** WARNING ** Mnesia is overloaded: {dump_log, time_threshold}

=INFO REPORT==== 27-Sep-2012::19:59:13 ===
    application: rabbitmq_management
    exited: shutdown
    type: temporary

-------------------------------------------------------
ram node's log:

=ERROR REPORT==== 27-Sep-2012::19:54:11 ===
** Node rabbit@zw_124_177 not responding **
** Removing (timedout) connection **

=INFO REPORT==== 27-Sep-2012::19:54:11 ===
rabbit on node rabbit@zw_124_177 down

=INFO REPORT==== 27-Sep-2012::19:54:53 ===
Statistics database started.

=INFO REPORT==== 27-Sep-2012::19:59:13 ===
global: Name conflict terminating {rabbit_mgmt_db,<7123.24214.2613>}

=INFO REPORT==== 27-Sep-2012::19:59:13 ===
    application: rabbitmq_management
    exited: shutdown
    type: temporary


thanks.





2012/9/21 Simon MacMullen <simon at rabbitmq.com>

> Hi.
>
> This looks like a race with management DB failover that we've already
> fixed, but which has not yet made it into any release. This will be fixed
> in the next bugfix release. Until then if you are affected you can work
> around it by running the management plugin on only one node in the cluster
> (and just have rabbitmq_management_agent on the others).
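>
> A minimal sketch of that workaround with the standard rabbitmq-plugins
> tool (node roles are placeholders, and on this release plugin changes
> only take effect after a broker restart):
>
>     # on the single node that should serve the management UI
>     rabbitmq-plugins enable rabbitmq_management
>
>     # on every other node in the cluster, keep only the agent
>     rabbitmq-plugins disable rabbitmq_management
>     rabbitmq-plugins enable rabbitmq_management_agent
>
>     # then restart the broker on each node whose plugin set changed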
>
> Cheers, Simon
>
> On 20/09/12 08:45, liubida wrote:
>
>> hi all,
>>
>> Could anybody point me in the right direction for finding the reason
>> for a fake death of RabbitMQ?
>> It seems RabbitMQ stops receiving messages at some point, without
>> any warning.
>> I have constructed a cluster with 2 nodes, one disc and one ram.
>> If I run the command "rabbitmqctl stop_app && rabbitmqctl start_app",
>> the trouble disappears.
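>>
>> (For reference, a minimal sketch of building such a two-node disc/ram
>> cluster, assuming the pre-3.0 rabbitmqctl of that era; the host name
>> rabbit@disc_host is a placeholder, and releases from 3.0 on use
>> "rabbitmqctl join_cluster --ram" instead:)
>>
>>     # on the node that is to become the ram node, while the disc node
>>     # rabbit@disc_host is already running:
>>     rabbitmqctl stop_app
>>     rabbitmqctl reset                     # wipes this node's own state
>>     rabbitmqctl cluster rabbit@disc_host  # only the other node listed => ram node
>>     rabbitmqctl start_app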
>>
>>
>> Here is the disc node's sasl.log:
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:13 ===
>> Supervisor: {<0.28802.1777>,mirrored_supervisor}
>> Context: child_terminated
>> Reason: killed
>> Offender: [{pid,<0.28804.1777>},
>> {name,rabbit_mgmt_db},
>> {mfa,{rabbit_mgmt_db,start_link,[]}},
>> {restart_type,permanent},
>> {shutdown,4294967295},
>> {child_type,worker}]
>>
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>> Supervisor: {<0.28802.1777>,mirrored_supervisor}
>> Context: start_error
>> Reason: {already_started,<19704.13906.2335>}
>> Offender: [{pid,<0.28804.1777>},
>> {name,rabbit_mgmt_db},
>> {mfa,{rabbit_mgmt_db,start_link,[]}},
>> {restart_type,permanent},
>> {shutdown,4294967295},
>> {child_type,worker}]
>>
>> (the same start_error report appears nine more times at 14:53:14,
>> until the supervisor gives up, below)
>>
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>> Supervisor: {<0.28802.1777>,mirrored_supervisor}
>> Context: shutdown
>> Reason: reached_max_restart_intensity
>> Offender: [{pid,<0.28804.1777>},
>> {name,rabbit_mgmt_db},
>> {mfa,{rabbit_mgmt_db,start_link,[]}},
>> {restart_type,permanent},
>> {shutdown,4294967295},
>> {child_type,worker}]
>>
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>> Supervisor: {local,rabbit_mgmt_sup}
>> Context: child_terminated
>> Reason: shutdown
>> Offender: [{pid,<0.28803.1777>},
>> {name,mirroring},
>> {mfa,
>> {mirrored_supervisor,start_internal,
>> [rabbit_mgmt_sup,
>> [{rabbit_mgmt_db,
>> {rabbit_mgmt_db,start_link,[]},
>> permanent,4294967295,worker,
>> [rabbit_mgmt_db]}]]}},
>> {restart_type,permanent},
>> {shutdown,4294967295},
>> {child_type,worker}]
>>
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>> Supervisor: {local,rabbit_mgmt_sup}
>> Context: shutdown
>> Reason: reached_max_restart_intensity
>> Offender: [{pid,<0.28803.1777>},
>> {name,mirroring},
>> {mfa,
>> {mirrored_supervisor,start_internal,
>> [rabbit_mgmt_sup,
>> [{rabbit_mgmt_db,
>> {rabbit_mgmt_db,start_link,[]},
>> permanent,4294967295,worker,
>> [rabbit_mgmt_db]}]]}},
>> {restart_type,permanent},
>> {shutdown,4294967295},
>> {child_type,worker}]
>>
>>
>> The ram node's sasl.log:
>> =SUPERVISOR REPORT==== 18-Sep-2012::17:24:50 ===
>> Supervisor: {local,rabbit_mgmt_sup}
>> Context: child_terminated
>> Reason: shutdown
>> Offender: [{pid,<0.14161.2335>},
>> {name,mirroring},
>> {mfa,
>> {mirrored_supervisor,start_internal,
>> [rabbit_mgmt_sup,
>> [{rabbit_mgmt_db,
>> {rabbit_mgmt_db,start_link,[]},
>> permanent,4294967295,worker,
>> [rabbit_mgmt_db]}]]}},
>> {restart_type,permanent},
>> {shutdown,4294967295},
>> {child_type,worker}]
>>
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::17:24:50 ===
>> Supervisor: {local,rabbit_mgmt_sup}
>> Context: shutdown
>> Reason: reached_max_restart_intensity
>> Offender: [{pid,<0.14161.2335>},
>> {name,mirroring},
>> {mfa,
>> {mirrored_supervisor,start_internal,
>> [rabbit_mgmt_sup,
>> [{rabbit_mgmt_db,
>> {rabbit_mgmt_db,start_link,[]},
>> permanent,4294967295,worker,
>> [rabbit_mgmt_db]}]]}},
>> {restart_type,permanent},
>> {shutdown,4294967295},
>> {child_type,worker}]
>>
>> Thanks for any help.
>>
>> --bidaliu
>>
>>
>>
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>
>
>
> --
> Simon MacMullen
> RabbitMQ, VMware
>