[rabbitmq-discuss] Queue failure, potential loss of data.

Jason McIntosh mcintoshj at gmail.com
Fri Feb 14 16:24:14 GMT 2014


BTW, here are the SASL logs from another node in the cluster:

=CRASH REPORT==== 13-Feb-2014::05:14:36 ===
  crasher:
    initial call: gen:init_it/6
    pid: <0.987.0>
    registered_name: []
    exception exit: {{badmatch,{error,not_found}},
                     [{rabbit_amqqueue_process,i,2,[]},
                      {rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
                      {rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
                      {rabbit_amqqueue_process,emit_stats,2,[]},
                      {rabbit_event,if_enabled,3,[]},
                      {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',
                          6,[]},
                      {rabbit_amqqueue_process,terminate_shutdown,2,[]},
                      {gen_server2,terminate,3,[]}]}
      in function  gen_server2:terminate/3
    ancestors: [rabbit_mirror_queue_slave_sup,rabbit_sup,<0.782.0>]
    messages: [{'$gen_cast',
                      {run_backing_queue,rabbit_variable_queue,
                          #Fun<rabbit_variable_queue.26.70600163>}},
                  {'EXIT',<0.988.0>,normal}]
    links: [<0.954.0>]
    dictionary: [{{credit_from,<0.944.0>},1671},
                  {{credit_to,<0.24877.6355>},2},
                  {credit_blocked,[]},
                  {delegate,delegate_0},
                  {fhc_age_tree,{0,nil}},
                  {guid,{{2283490857,778293189,3964001052,3912480778},1}}]
    trap_exit: true
    status: running
    heap_size: 6772
    stack_size: 27
    reductions: 28827118159
  neighbours:

=SUPERVISOR REPORT==== 13-Feb-2014::05:14:36 ===
     Supervisor: {local,rabbit_mirror_queue_slave_sup}
     Context:    child_terminated
     Reason:     {{badmatch,{error,not_found}},
                  [{rabbit_amqqueue_process,i,2,[]},
                   {rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
                   {rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
                   {rabbit_amqqueue_process,emit_stats,2,[]},
                   {rabbit_event,if_enabled,3,[]},
                   {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',6,[]},
                   {rabbit_amqqueue_process,terminate_shutdown,2,[]},
                   {gen_server2,terminate,3,[]}]}
     Offender:   [{pid,<0.987.0>},
                  {name,rabbit_mirror_queue_slave},
                  {mfargs,{rabbit_mirror_queue_slave,start_link,undefined}},
                  {restart_type,temporary},
                  {shutdown,4294967295},
                  {child_type,worker}]





On Fri, Feb 14, 2014 at 9:43 AM, Jason McIntosh <mcintoshj at gmail.com> wrote:

>
> RabbitMQ 3.2.0
> Erlang R16B02-1
>
> We have a queue that basically stopped doing anything intelligent.  Here
> are the results.  What's bad about this: it appears that messages
> continued to be published but didn't hit the dead letter exchange; they
> just disappeared.  In this architecture, we've got a fanout exchange that
> publishes to two queues.  One of the queues is still working fine.  Our
> second queue is the one that dropped off.  Publishing hasn't failed, though,
> so I'm worried we've lost data for the last day.  Any input would be
> welcome on this.  Here's the second queue's information from the management
> GUI:
>  cluster at rabbitmqm10p DLX DLK D Args  Active ? ? ? 0.00/s
>
> When I try and select the queue, I just get an error message:
> TypeError: Cannot read property 'ram_msg_count' of undefined
>
> Any help/advice here?  Is there some way I can change this queue so that I
> do NOT lose messages and publishes fail instead?  I thought publisher
> confirms (I need to verify they're on) would have taken care of this
> situation: that the message would have to be consumed or persisted to disk
> for all queues, or the publish would be rejected.
> Jason
>
>
>
> =CRASH REPORT==== 13-Feb-2014::05:14:36 ===
>   crasher:
>     initial call: gen:init_it/6
>     pid: <0.367.0>
>     registered_name: []
>     exception exit: {{badmatch,{error,not_found}},
>                      [{rabbit_mirror_queue_master,stop_all_slaves,2,[]},
>                       {rabbit_mirror_queue_master,delete_and_terminate,2,[]},
>                       {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',
>                           6,[]},
>                       {rabbit_amqqueue_process,terminate_shutdown,2,[]},
>                       {gen_server2,terminate,3,[]},
>                       {proc_lib,wake_up,3,
>                           [{file,"proc_lib.erl"},{line,249}]}]}
>       in function  gen_server2:terminate/3
>     ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.154.0>]
>     messages: []
>     links: [<0.250.0>,#Port<0.17147>]
>     dictionary: [{{ch,<17654.9226.6150>},
>                    {cr,<17654.9226.6150>,#Ref<0.0.18055.20563>,
>                        {[],[26925191]},
>                        1,
>                        {queue,
>                            [{<17654.9226.6150>,
>                              {consumer,<<"amq.ctag-LPmzPvp2doZ9pYs-cEEcFg">>,
>                                  true,[]}}],
>                            [],1},
>                        {qstate,<17654.21979.6150>,suspended,{0,nil}},
>                        4}},
>                   {credit_blocked,[]},
>                   {{ch,<17659.4312.6334>},
>                    {cr,<17659.4312.6334>,#Ref<0.0.18273.227308>,
>                        {[],[26925208]},
>                        1,
>                        {queue,
>                            [{<17659.4312.6334>,
>                              {consumer,<<"amq.ctag--3Kwc_Q-QS9kcpZ9U--8-Q">>,
>                                  true,[]}}],
>                            [],1},
>                        {qstate,<17659.2894.6334>,suspended,{0,nil}},
>                        19}},
>                   {{ch,<17659.3911.6334>},
>                    {cr,<17659.3911.6334>,#Ref<0.0.18273.227286>,
>                        {[26925232,26925226],[26925214]},
>                        1,
>                        {queue,[],[],0},
>                        {qstate,<17659.2051.6334>,active,{0,nil}},
>                        22}},
>                   {{#Ref<0.0.0.36427>,fhc_handle},
>                    {handle,
>                        {file_descriptor,prim_file,{#Port<0.17147>,132}},
>                        118224,false,5136,infinity,
>                        [[<<192,0,0,0,1,154,216,155>>],
>                         [<<192,0,0,0,1,154,216,151>>],
>                         [<<192,0,0,0,1,154,216,150>>],
>                         [<<192,0,0,0,1,154,216,149>>],
>                         [<<192,0,0,0,1,154,216,148>>],
>                         [<<192,0,0,0,1,154,216,147>>],
>                         [<<192,0,0,0,1,154,216,146>>],
>                         [<<192,0,0,0,1,154,216,144>>],
>                         [<<192,0,0,0,1,154,216,142>>],
>                         [<<192,0,0,0,1,154,216,143>>],
>                         [<<192,0,0,0,1,154,216,141>>],
>                         [<<192,0,0,0,1,154,216,140>>],
> .,...
>
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::05:14:36 ===
>      Supervisor: {local,rabbit_amqqueue_sup}
>      Context:    child_terminated
>      Reason:     {{badmatch,{error,not_found}},
>                   [{rabbit_mirror_queue_master,stop_all_slaves,2,[]},
>                    {rabbit_mirror_queue_master,delete_and_terminate,2,[]},
>                    {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',6,[]},
>                    {rabbit_amqqueue_process,terminate_shutdown,2,[]},
>                    {gen_server2,terminate,3,[]},
>                    {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}
>      Offender:   [{pid,<0.367.0>},
>                   {name,rabbit_amqqueue},
>                   {mfargs,{rabbit_amqqueue_process,start_link,undefined}},
>                   {restart_type,temporary},
>                   {shutdown,4294967295},
>                   {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::10:59:28 ===
>      Supervisor: {<0.19778.5266>,amqp_channel_sup_sup}
>      Context:    shutdown_error
>      Reason:     shutdown
>      Offender:   [{nb_children,1},
>                   {name,channel_sup},
>                   {mfargs,
>                       {amqp_channel_sup,start_link,[direct,<0.20460.5266>]}},
>                   {restart_type,temporary},
>                   {shutdown,brutal_kill},
>                   {child_type,supervisor}]
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:02:34 ===
>      Supervisor: {<0.852.5267>,amqp_channel_sup_sup}
>      Context:    shutdown_error
>      Reason:     shutdown
>      Offender:   [{nb_children,1},
>                   {name,channel_sup},
>                   {mfargs,
>                       {amqp_channel_sup,start_link,[direct,<0.2623.5267>]}},
>                   {restart_type,temporary},
>                   {shutdown,brutal_kill},
>                   {child_type,supervisor}]
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:03:24 ===
>      Supervisor: {<0.4628.5267>,amqp_channel_sup_sup}
>      Context:    shutdown_error
>      Reason:     shutdown
>      Offender:   [{nb_children,1},
>                   {name,channel_sup},
>                   {mfargs,
>                       {amqp_channel_sup,start_link,[direct,<0.5878.5267>]}},
>                   {restart_type,temporary},
>                   {shutdown,brutal_kill},
>                   {child_type,supervisor}]
>
>
> =CRASH REPORT==== 13-Feb-2014::11:12:31 ===
>   crasher:
>     initial call: gen:init_it/6
>     pid: <0.4699.5268>
>     registered_name: []
>     exception exit: {{badmatch,true},
>                      [{rabbit_queue_index,init,2,[]},
>                       {rabbit_variable_queue,init,5,[]},
>                       {rabbit_mirror_queue_master,init,3,[]},
>                       {rabbit_amqqueue_process,declare,3,[]},
>                       {gen_server2,handle_msg,2,[]},
>                       {proc_lib,init_p_do_apply,3,
>                                 [{file,"proc_lib.erl"},{line,239}]}]}
>       in function  gen_server2:terminate/3
>     ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.154.0>]
>     messages: []
>     links: [<0.250.0>]
>     dictionary: [{{xtype_to_module,direct},rabbit_exchange_type_direct}]
>     trap_exit: true
>     status: running
>     heap_size: 1598
>     stack_size: 27
>     reductions: 1156
>   neighbours:
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:12:31 ===
>      Supervisor: {local,rabbit_amqqueue_sup}
>      Context:    child_terminated
>      Reason:     {{badmatch,true},
>                   [{rabbit_queue_index,init,2,[]},
>                    {rabbit_variable_queue,init,5,[]},
>                    {rabbit_mirror_queue_master,init,3,[]},
>                    {rabbit_amqqueue_process,declare,3,[]},
>                    {gen_server2,handle_msg,2,[]},
>                    {proc_lib,init_p_do_apply,3,
>                              [{file,"proc_lib.erl"},{line,239}]}]}
>      Offender:   [{pid,<0.4699.5268>},
>                   {name,rabbit_amqqueue},
>                   {mfargs,{rabbit_amqqueue_process,start_link,undefined}},
>                   {restart_type,temporary},
>                   {shutdown,4294967295},
>                   {child_type,worker}]
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:35:08 ===
>      Supervisor: {<0.6708.5271>,amqp_channel_sup_sup}
>      Context:    shutdown_error
>      Reason:     shutdown
>      Offender:   [{nb_children,1},
>                   {name,channel_sup},
>                   {mfargs,
>                       {amqp_channel_sup,start_link,[direct,<0.7855.5271>]}},
>                   {restart_type,temporary},
>                   {shutdown,brutal_kill},
>                   {child_type,supervisor}]
>
>
> --
> Jason McIntosh
> https://github.com/jasonmcintosh/
> 573-424-7612
>



-- 
Jason McIntosh
https://github.com/jasonmcintosh/
573-424-7612