[rabbitmq-discuss] rabbitmq fake-death && no breath. key word in log : mirrored_supervisor rabbit_mgmt_db

Simon MacMullen simon at rabbitmq.com
Thu Sep 20 18:01:01 BST 2012


Hi.

This looks like a race with management DB failover that we've already 
fixed, but which has not yet made it into any release. This will be 
fixed in the next bugfix release. Until then if you are affected you can 
work around it by running the management plugin on only one node in the 
cluster (and just have rabbitmq_management_agent on the others).
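
As a rough sketch of that workaround, something like the following could 
be sourced on each node. The function names and the RABBITMQ_PLUGINS 
variable are made up for illustration; only the rabbitmq-plugins 
enable/disable commands and the two plugin names are real. It assumes 
rabbitmq-plugins is on the PATH and that, on releases of this era, plugin 
changes only take effect once the broker has been restarted.

```shell
#!/bin/sh
# Sketch only: the function names and RABBITMQ_PLUGINS are hypothetical.
# RABBITMQ_PLUGINS can point at a non-default rabbitmq-plugins binary.
RABBITMQ_PLUGINS="${RABBITMQ_PLUGINS:-rabbitmq-plugins}"

# The single node that keeps the full management plugin (and its DB).
setup_management_node() {
    "$RABBITMQ_PLUGINS" enable rabbitmq_management
}

# Every other node: drop the full plugin, keep only the agent.
setup_agent_node() {
    "$RABBITMQ_PLUGINS" disable rabbitmq_management &&
    "$RABBITMQ_PLUGINS" enable rabbitmq_management_agent
}

# Call exactly one of the two on each node, then restart the broker, e.g.:
#   setup_agent_node
```

Defining functions rather than running the commands directly keeps the 
file safe to source; nothing changes until one of them is called.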

Cheers, Simon

On 20/09/12 08:45, liubida wrote:
> Hi all,
>
> Could anybody point me in the right direction for finding the cause
> of a "fake death" in RabbitMQ?
> At some point RabbitMQ seems to stop receiving messages, without
> any warning.
> I have built a cluster with 2 nodes, one disc and one RAM.
> If I run the command "rabbitmqctl stop_app && rabbitmqctl start_app",
> the problem disappears.
>
>
> here is the disc node's sasl.log
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:13 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: child_terminated
> Reason: killed
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: start_error
> Reason: {already_started,<19704.13906.2335>}
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {<0.28802.1777>,mirrored_supervisor}
> Context: shutdown
> Reason: reached_max_restart_intensity
> Offender: [{pid,<0.28804.1777>},
> {name,rabbit_mgmt_db},
> {mfa,{rabbit_mgmt_db,start_link,[]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {local,rabbit_mgmt_sup}
> Context: child_terminated
> Reason: shutdown
> Offender: [{pid,<0.28803.1777>},
> {name,mirroring},
> {mfa,
> {mirrored_supervisor,start_internal,
> [rabbit_mgmt_sup,
> [{rabbit_mgmt_db,
> {rabbit_mgmt_db,start_link,[]},
> permanent,4294967295,worker,
> [rabbit_mgmt_db]}]]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
> Supervisor: {local,rabbit_mgmt_sup}
> Context: shutdown
> Reason: reached_max_restart_intensity
> Offender: [{pid,<0.28803.1777>},
> {name,mirroring},
> {mfa,
> {mirrored_supervisor,start_internal,
> [rabbit_mgmt_sup,
> [{rabbit_mgmt_db,
> {rabbit_mgmt_db,start_link,[]},
> permanent,4294967295,worker,
> [rabbit_mgmt_db]}]]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> the ram node's sasl.log
> =SUPERVISOR REPORT==== 18-Sep-2012::17:24:50 ===
> Supervisor: {local,rabbit_mgmt_sup}
> Context: child_terminated
> Reason: shutdown
> Offender: [{pid,<0.14161.2335>},
> {name,mirroring},
> {mfa,
> {mirrored_supervisor,start_internal,
> [rabbit_mgmt_sup,
> [{rabbit_mgmt_db,
> {rabbit_mgmt_db,start_link,[]},
> permanent,4294967295,worker,
> [rabbit_mgmt_db]}]]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 18-Sep-2012::17:24:50 ===
> Supervisor: {local,rabbit_mgmt_sup}
> Context: shutdown
> Reason: reached_max_restart_intensity
> Offender: [{pid,<0.14161.2335>},
> {name,mirroring},
> {mfa,
> {mirrored_supervisor,start_internal,
> [rabbit_mgmt_sup,
> [{rabbit_mgmt_db,
> {rabbit_mgmt_db,start_link,[]},
> permanent,4294967295,worker,
> [rabbit_mgmt_db]}]]}},
> {restart_type,permanent},
> {shutdown,4294967295},
> {child_type,worker}]
>
> Thanks for any help.
>
> --bidaliu
>
>
>
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss


-- 
Simon MacMullen
RabbitMQ, VMware

