[rabbitmq-discuss] Queue failure, potential loss of data.
Jason McIntosh
mcintoshj at gmail.com
Fri Feb 14 16:24:14 GMT 2014
BTW, here are the sasl logs from another node in the cluster:
=CRASH REPORT==== 13-Feb-2014::05:14:36 ===
crasher:
initial call: gen:init_it/6
pid: <0.987.0>
registered_name: []
exception exit: {{badmatch,{error,not_found}},
[{rabbit_amqqueue_process,i,2,[]},
{rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
{rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
{rabbit_amqqueue_process,emit_stats,2,[]},
{rabbit_event,if_enabled,3,[]},
{rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',
6,[]},
{rabbit_amqqueue_process,terminate_shutdown,2,[]},
{gen_server2,terminate,3,[]}]}
in function gen_server2:terminate/3
ancestors: [rabbit_mirror_queue_slave_sup,rabbit_sup,<0.782.0>]
messages: [{'$gen_cast',
{run_backing_queue,rabbit_variable_queue,
#Fun<rabbit_variable_queue.26.70600163>}},
{'EXIT',<0.988.0>,normal}]
links: [<0.954.0>]
dictionary: [{{credit_from,<0.944.0>},1671},
{{credit_to,<0.24877.6355>},2},
{credit_blocked,[]},
{delegate,delegate_0},
{fhc_age_tree,{0,nil}},
{guid,{{2283490857,778293189,3964001052,3912480778},1}}]
trap_exit: true
status: running
heap_size: 6772
stack_size: 27
reductions: 28827118159
neighbours:
=SUPERVISOR REPORT==== 13-Feb-2014::05:14:36 ===
Supervisor: {local,
rabbit_mirror_queue_slave_sup}
Context: child_terminated
Reason: {{badmatch,{error,not_found}},
[{rabbit_amqqueue_process,i,2,[]},
{rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
{rabbit_amqqueue_process,'-infos/2-lc$^0/1-0-',2,[]},
{rabbit_amqqueue_process,emit_stats,2,[]},
{rabbit_event,if_enabled,3,[]},
{rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',6,[]},
{rabbit_amqqueue_process,terminate_shutdown,2,[]},
{gen_server2,terminate,3,[]}]}
Offender: [{pid,<0.987.0>},
{name,rabbit_mirror_queue_slave},
{mfargs,{rabbit_mirror_queue_slave,start_link,undefined}},
{restart_type,temporary},
{shutdown,4294967295},
{child_type,worker}]
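
On the publisher confirms question in the quoted message: a minimal sketch, in Python, of a publish that raises instead of failing silently. It assumes a channel that behaves like a pika 1.x `BlockingChannel` on which `confirm_delivery()` has already been called; `publish_confirmed` and `PublishFailed` are hypothetical names, not part of any library. Note that `mandatory` only reports a message that could not be routed to any queue at all, so with a fanout exchange bound to two queues it would not flag one of the queues going missing, and nothing here establishes whether confirms would have caught the failure in these logs.

```python
class PublishFailed(Exception):
    """Raised when a publish was not confirmed (hypothetical name)."""


def publish_confirmed(channel, exchange, body, routing_key=""):
    """Publish one message; raise PublishFailed unless the publish succeeds.

    Assumption: `channel` behaves like a pika 1.x BlockingChannel in confirm
    mode, where basic_publish() blocks until the broker responds and raises
    if the message is nacked or returned as unroutable.
    """
    try:
        channel.basic_publish(
            exchange=exchange,
            routing_key=routing_key,
            body=body,
            mandatory=True,  # ask the broker to return unroutable messages
        )
    except Exception as exc:
        # Any error from the publish is surfaced to the caller, never dropped.
        raise PublishFailed(
            f"publish to {exchange!r} not confirmed: {exc}"
        ) from exc
```

With pika, the channel would be put into confirm mode once, before the first publish, by calling `channel.confirm_delivery()`; the caller can then retry or alert on `PublishFailed` rather than assuming the message arrived.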
On Fri, Feb 14, 2014 at 9:43 AM, Jason McIntosh <mcintoshj at gmail.com> wrote:
>
> RabbitMQ 3.2.0
> Erlang R16B02-1
>
> We have a queue that has effectively stopped working; the logs are below.
> What's bad about this is that messages apparently continued to be
> published but never reached the dead letter exchange - they just
> disappeared. In this architecture, a fanout exchange publishes to two
> queues. One of the queues is still working fine; the second is the one
> that dropped off. Publishing hasn't failed, though, so I'm worried we've
> lost data for the last day. Any input would be welcome. Here's the
> second queue's information from the management GUI:
> cluster@rabbitmqm10p DLX DLK D Args Active ? ? ? 0.00/s
>
> When I try to select the queue, I just get an error message:
> TypeError: Cannot read property 'ram_msg_count' of undefined
>
> Any help/advice here? Is there some way I can change this queue so that
> publishes fail instead of messages being lost? I thought publisher
> confirms (I need to verify they're on) would have taken care of this
> situation: that the message would have had to be consumed or persisted
> to disk for all queues, or the publish would have been rejected.
> Jason
>
>
>
> =CRASH REPORT==== 13-Feb-2014::05:14:36 ===
> crasher:
> initial call: gen:init_it/6
> pid: <0.367.0>
> registered_name: []
> exception exit: {{badmatch,{error,not_found}},
> [{rabbit_mirror_queue_master,stop_all_slaves,2,[]},
> {rabbit_mirror_queue_master,delete_and_terminate,2,[]},
> {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',
> 6,[]},
> {rabbit_amqqueue_process,terminate_shutdown,2,[]},
> {gen_server2,terminate,3,[]},
> {proc_lib,wake_up,3,
> [{file,"proc_lib.erl"},{line,249}]}]}
> in function gen_server2:terminate/3
> ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.154.0>]
> messages: []
> links: [<0.250.0>,#Port<0.17147>]
> dictionary: [{{ch,<17654.9226.6150>},
> {cr,<17654.9226.6150>,#Ref<0.0.18055.20563>,
> {[],[26925191]},
> 1,
> {queue,
> [{<17654.9226.6150>,
> {consumer,<<"amq.ctag-LPmzPvp2doZ9pYs-cEEcFg">>,
> true,[]}}],
> [],1},
> {qstate,<17654.21979.6150>,suspended,{0,nil}},
> 4}},
> {credit_blocked,[]},
> {{ch,<17659.4312.6334>},
> {cr,<17659.4312.6334>,#Ref<0.0.18273.227308>,
> {[],[26925208]},
> 1,
> {queue,
> [{<17659.4312.6334>,
> {consumer,<<"amq.ctag--3Kwc_Q-QS9kcpZ9U--8-Q">>,
> true,[]}}],
> [],1},
> {qstate,<17659.2894.6334>,suspended,{0,nil}},
> 19}},
> {{ch,<17659.3911.6334>},
> {cr,<17659.3911.6334>,#Ref<0.0.18273.227286>,
> {[26925232,26925226],[26925214]},
> 1,
> {queue,[],[],0},
> {qstate,<17659.2051.6334>,active,{0,nil}},
> 22}},
> {{#Ref<0.0.0.36427>,fhc_handle},
> {handle,
> {file_descriptor,prim_file,{#Port<0.17147>,132}},
> 118224,false,5136,infinity,
> [[<<192,0,0,0,1,154,216,155>>],
> [<<192,0,0,0,1,154,216,151>>],
> [<<192,0,0,0,1,154,216,150>>],
> [<<192,0,0,0,1,154,216,149>>],
> [<<192,0,0,0,1,154,216,148>>],
> [<<192,0,0,0,1,154,216,147>>],
> [<<192,0,0,0,1,154,216,146>>],
> [<<192,0,0,0,1,154,216,144>>],
> [<<192,0,0,0,1,154,216,142>>],
> [<<192,0,0,0,1,154,216,143>>],
> [<<192,0,0,0,1,154,216,141>>],
> [<<192,0,0,0,1,154,216,140>>],
> .,...
>
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::05:14:36 ===
> Supervisor: {local,rabbit_amqqueue_sup}
> Context: child_terminated
> Reason: {{badmatch,{error,not_found}},
> [{rabbit_mirror_queue_master,stop_all_slaves,2,[]},
> {rabbit_mirror_queue_master,delete_and_terminate,2,[]},
> {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',6,[]},
> {rabbit_amqqueue_process,terminate_shutdown,2,[]},
> {gen_server2,terminate,3,[]},
> {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}
> Offender: [{pid,<0.367.0>},
> {name,rabbit_amqqueue},
> {mfargs,{rabbit_amqqueue_process,start_link,undefined}},
> {restart_type,temporary},
> {shutdown,4294967295},
> {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::10:59:28 ===
> Supervisor: {<0.19778.5266>,
> amqp_channel_sup_sup}
> Context: shutdown_error
> Reason: shutdown
> Offender: [{nb_children,1},
> {name,channel_sup},
> {mfargs,
> {amqp_channel_sup,start_link,[direct,<0.20460.5266>]}},
> {restart_type,temporary},
> {shutdown,brutal_kill},
> {child_type,supervisor}]
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:02:34 ===
> Supervisor: {<0.852.5267>,amqp_channel_sup_sup}
> Context: shutdown_error
> Reason: shutdown
> Offender: [{nb_children,1},
> {name,channel_sup},
> {mfargs,
> {amqp_channel_sup,start_link,[direct,<0.2623.5267>]}},
> {restart_type,temporary},
> {shutdown,brutal_kill},
> {child_type,supervisor}]
>
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:03:24 ===
> Supervisor: {<0.4628.5267>,amqp_channel_sup_sup}
> Context: shutdown_error
> Reason: shutdown
> Offender: [{nb_children,1},
> {name,channel_sup},
> {mfargs,
> {amqp_channel_sup,start_link,[direct,<0.5878.5267>]}},
> {restart_type,temporary},
> {shutdown,brutal_kill},
> {child_type,supervisor}]
>
>
> =CRASH REPORT==== 13-Feb-2014::11:12:31 ===
> crasher:
> initial call: gen:init_it/6
> pid: <0.4699.5268>
> registered_name: []
> exception exit: {{badmatch,true},
> [{rabbit_queue_index,init,2,[]},
> {rabbit_variable_queue,init,5,[]},
> {rabbit_mirror_queue_master,init,3,[]},
> {rabbit_amqqueue_process,declare,3,[]},
> {gen_server2,handle_msg,2,[]},
> {proc_lib,init_p_do_apply,3,
> [{file,"proc_lib.erl"},{line,239}]}]}
> in function gen_server2:terminate/3
> ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.154.0>]
> messages: []
> links: [<0.250.0>]
> dictionary: [{{xtype_to_module,direct},rabbit_exchange_type_direct}]
> trap_exit: true
> status: running
> heap_size: 1598
> stack_size: 27
> reductions: 1156
> neighbours:
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:12:31 ===
> Supervisor: {local,rabbit_amqqueue_sup}
> Context: child_terminated
> Reason: {{badmatch,true},
> [{rabbit_queue_index,init,2,[]},
> {rabbit_variable_queue,init,5,[]},
> {rabbit_mirror_queue_master,init,3,[]},
> {rabbit_amqqueue_process,declare,3,[]},
> {gen_server2,handle_msg,2,[]},
> {proc_lib,init_p_do_apply,3,
> [{file,"proc_lib.erl"},{line,239}]}]}
> Offender: [{pid,<0.4699.5268>},
> {name,rabbit_amqqueue},
> {mfargs,{rabbit_amqqueue_process,start_link,undefined}},
> {restart_type,temporary},
> {shutdown,4294967295},
> {child_type,worker}]
>
> =SUPERVISOR REPORT==== 13-Feb-2014::11:35:08 ===
> Supervisor: {<0.6708.5271>,amqp_channel_sup_sup}
> Context: shutdown_error
> Reason: shutdown
> Offender: [{nb_children,1},
> {name,channel_sup},
> {mfargs,
> {amqp_channel_sup,start_link,[direct,<0.7855.5271>]}},
> {restart_type,temporary},
> {shutdown,brutal_kill},
> {child_type,supervisor}]
>
>
> --
> Jason McIntosh
> https://github.com/jasonmcintosh/
> 573-424-7612
>
--
Jason McIntosh
https://github.com/jasonmcintosh/
573-424-7612