[rabbitmq-discuss] Strange start error issue

Samuel Chen samuel.net at gmail.com
Thu Sep 20 10:19:27 BST 2012


Here is some more information about the issue:

When the issue occurs, the RabbitMQ cluster still reports its status as
running, but messages from Celery are not processed until we restart
RabbitMQ. At the same time, the web admin UI shuts down, so I suspect the
management plugin may be causing this issue.
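
For context, my understanding of "max restart intensity" (an assumption
based on standard Erlang/OTP supervisor behaviour, not on the RabbitMQ
source) is that a supervisor gives up and shuts down once more than MaxR
restarts happen within MaxT seconds. Below is a rough Python sketch of that
rule. The class and function names, the limit of 3 restarts and the
10-second window are all made up for illustration; they are not RabbitMQ's
actual settings.

```python
from collections import deque


class RestartLimiter:
    """Sliding-window restart counter, loosely modelled on OTP's MaxR/MaxT rule.

    Hypothetical helper for illustration only; not part of RabbitMQ or Celery.
    """

    def __init__(self, max_restarts, window_seconds):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self.restarts = deque()

    def record_restart(self, now):
        """Record a restart at time `now`; return False once the limit is exceeded."""
        self.restarts.append(now)
        # Forget restarts that fall outside the time window.
        while self.restarts and now - self.restarts[0] > self.window_seconds:
            self.restarts.popleft()
        return len(self.restarts) <= self.max_restarts


def simulate(restart_times, max_restarts, window_seconds):
    """Return the time at which the supervisor would give up, or None."""
    limiter = RestartLimiter(max_restarts, window_seconds)
    for t in restart_times:
        if not limiter.record_restart(t):
            return t
    return None


# Assumed limits (3 restarts per 10 seconds), chosen only for the example.
rapid_failures = simulate([0, 0.2, 0.4, 0.6, 0.8, 1.0], 3, 10)
spread_out = simulate([0, 20, 40, 60], 3, 10)
```

If every restart attempt fails straight away, as the repeated start_error
reports in the log quoted below suggest, all the attempts land inside the
same window and the limit is reached almost at once.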

Could anybody help?
Thanks.

On Thu, Sep 20, 2012 at 12:10 AM, Samuel Chen <samuel.net at gmail.com> wrote:

> Hi all,
>
> I'm new to RabbitMQ. We are facing a strange issue with rabbit_mgmt_db.
> I could not find any similar issues by searching Google or
> Stack Overflow, so I wonder if anyone could help diagnose this problem.
> Thanks in advance.
>
> We have a 2-node RabbitMQ cluster integrated with Celery. It worked well
> for several months.
> The issue first occurred on Jul 4th. After a restart, it worked
> for about 2 months. Yesterday the issue occurred twice (one of them after
> a restart).
> What happened is that a child (rabbit_mgmt_db?) was killed. After several
> failed automatic restart attempts, the supervisor reached the max restart
> intensity and was eventually shut down.
> (Another factor: we moved from a single-node server to the 2-node cluster
> at the end of June. Not sure whether that caused this issue.)
>
> The hosts are virtual servers with 8/12G RAM and 30G disk.
> One node is a disc node and the other is a RAM node.
> The load was very low (around 100M RAM, few tasks). The disk has 2.5G
> of free space.
> The log is below.
>
> Thanks for any help.
>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:13 ===
>>      Supervisor: {<0.28802.1777>,mirrored_supervisor}
>>      Context:    child_terminated
>>      Reason:     killed
>>      Offender:   [{pid,<0.28804.1777>},
>>                   {name,rabbit_mgmt_db},
>>                   {mfa,{rabbit_mgmt_db,start_link,[]}},
>>                   {restart_type,permanent},
>>                   {shutdown,4294967295},
>>                   {child_type,worker}]
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>>      Supervisor: {<0.28802.1777>,mirrored_supervisor}
>>      Context:    start_error
>>      Reason:     {already_started,<19704.13906.2335>}
>>      Offender:   [{pid,<0.28804.1777>},
>>                   {name,rabbit_mgmt_db},
>>                   {mfa,{rabbit_mgmt_db,start_link,[]}},
>>                   {restart_type,permanent},
>>                   {shutdown,4294967295},
>>                   {child_type,worker}]
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>>      Supervisor: {<0.28802.1777>,mirrored_supervisor}
>>      Context:    start_error
>>      Reason:     {already_started,<19704.13906.2335>}
>>      Offender:   [{pid,<0.28804.1777>},
>>                   {name,rabbit_mgmt_db},
>>                   {mfa,{rabbit_mgmt_db,start_link,[]}},
>>                   {restart_type,permanent},
>>                   {shutdown,4294967295},
>>                   {child_type,worker}]
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>>      Supervisor: {<0.28802.1777>,mirrored_supervisor}
>>      Context:    start_error
>>      Reason:     {already_started,<19704.13906.2335>}
>>      Offender:   [{pid,<0.28804.1777>},
>>                   {name,rabbit_mgmt_db},
>>                   {mfa,{rabbit_mgmt_db,start_link,[]}},
>>                   {restart_type,permanent},
>>                   {shutdown,4294967295},
>>                   {child_type,worker}]
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>>      Supervisor: {<0.28802.1777>,mirrored_supervisor}
>>      Context:    start_error
>>      Reason:     {already_started,<19704.13906.2335>}
>>      Offender:   [{pid,<0.28804.1777>},
>>                   {name,rabbit_mgmt_db},
>>                   {mfa,{rabbit_mgmt_db,start_link,[]}},
>>                   {restart_type,permanent},
>>                   {shutdown,4294967295},
>>                   {child_type,worker}]
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>>      Supervisor: {<0.28802.1777>,mirrored_supervisor}
>>      Context:    start_error
>>      Reason:     {already_started,<19704.13906.2335>}
>>      Offender:   [{pid,<0.28804.1777>},
>>                   {name,rabbit_mgmt_db},
>>                   {mfa,{rabbit_mgmt_db,start_link,[]}},
>>                   {restart_type,permanent},
>>                   {shutdown,4294967295},
>>                   {child_type,worker}]
>>
>> =SUPERVISOR REPORT==== 18-Sep-2012::14:53:14 ===
>>      Supervisor: {local,rabbit_mgmt_sup}
>>      Context:    shutdown
>>      Reason:     reached_max_restart_intensity
>>      Offender:   [{pid,<0.28803.1777>},
>>                   {name,mirroring},
>>                   {mfa,
>>                       {mirrored_supervisor,start_internal,
>>                           [rabbit_mgmt_sup,
>>                            [{rabbit_mgmt_db,
>>                                 {rabbit_mgmt_db,start_link,[]},
>>                                 permanent,4294967295,worker,
>>                                 [rabbit_mgmt_db]}]]}},
>>                   {restart_type,permanent},
>>                   {shutdown,4294967295},
>>                   {child_type,worker}]
>>
>
> - Sam
>