[rabbitmq-discuss] Almost all statistics drop to 0 every hour

Simon MacMullen simon at rabbitmq.com
Tue Nov 13 16:07:04 GMT 2012


Ah, thank you.

So, yes, the rabbit_mgmt_db process is quite overwhelmed. It looks like 
it is too busy responding to queries to process new stats events for a 
while.

Many of those stack traces show rabbit_mgmt_db:augment_queue_pid/2, 
which we're already aware is a hotspot, I guess fixing that just got 
bumped up the priority list.

This will be mitigated slightly in RabbitMQ 3.0 since that will bump the 
priority of the rabbit_mgmt_db process.

But I'm afraid this isn't going to be fixed immediately, so the best I 
can suggest right now is that you try to find what it is that's hitting 
it once an hour.

Cheers, Simon

On 13/11/12 15:25, Irmo Manie wrote:
> Here goes, randomly executed the command. Last 5 calls are done when
> publish rate dropped to 0.
>
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,4},
> {status,runnable},
> {current_stacktrace,[{rabbit_mgmt_db,lookup_element,3,[]},
>                        {rabbit_mgmt_db,handle_stats,6,[]},
>                        {rabbit_mgmt_db,handle_event,2,[]},
>                        {rabbit_mgmt_db,handle_cast,2,[]},
>                        {gen_server,handle_msg,5,
>                                    [{file,"gen_server.erl"},{line,597}]},
>                        {proc_lib,init_p_do_apply,3,
>                                  [{file,"proc_lib.erl"},{line,227}]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,1887},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,1040},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,3165},
> {status,runnable},
> {current_stacktrace,[{dict,update_bkt,4,[{file,"dict.erl"},{line,321}]},
>                        {dict,update_bkt,4,[{file,"dict.erl"},{line,324}]},
>                        {dict,update_bkt,4,[{file,"dict.erl"},{line,324}]},
>                        {dict,on_bucket,3,[{file,"dict.erl"},{line,414}]},
>                        {dict,update,4,[{file,"dict.erl"},{line,317}]},
>                        {lists,foldl,3,[{file,"lists.erl"},{line,1197}]},
>                        {rabbit_mgmt_db,group_sum,2,[]},
>
> {rabbit_mgmt_db,'-fine_stats_fun/2-lc$^0/1-0-',2,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,2423},
> {status,runnable},
> {current_stacktrace,[{lists,suffix,2,[{file,"lists.erl"},{line,151}]},
>                        {rabbit_mgmt_db,maybe_zero_rate,2,[]},
>                        {rabbit_mgmt_db,'-zero_old_rates/2-lc$^0/1-0-',2,[]},
>
> {rabbit_mgmt_db,'-get_fine_stats_from_list/3-lc$^0/1-0-',
>                                        2,[]},
>
> {rabbit_mgmt_db,'-get_fine_stats_from_list/3-lc$^0/1-0-',
>                                        2,[]},
>                        {rabbit_mgmt_db,get_fine_stats_from_list,3,[]},
>                        {rabbit_mgmt_db,'-fine_stats_fun/2-lc$^0/1-0-',2,[]},
>
> {rabbit_mgmt_db,'-fine_stats_fun/2-lc$^0/1-0-',2,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,1454},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,0},
> {status,waiting},
> {current_stacktrace,[{gen_server,loop,6,[{file,"gen_server.erl"},{line,348}]},
>                        {proc_lib,init_p_do_apply,3,
>                                  [{file,"proc_lib.erl"},{line,227}]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,757},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,876},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,1677},
> {status,runnable},
> {current_stacktrace,[{rabbit_mgmt_db,if_unknown,2,[]},
>                        {rabbit_mgmt_db,gs_update_add,3,[]},
>                        {rabbit_mgmt_db,'-gs_update/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-gs_update/3-lc$^0/1-0-',3,[]},
>                        {lists,foldl,3,[{file,"lists.erl"},{line,1197}]},
>                        {dict,map_bucket,2,[{file,"dict.erl"},{line,459}]},
>                        {dict,map_bucket,2,[{file,"dict.erl"},{line,459}]},
>
> {dict,map_bkt_list,2,[{file,"dict.erl"},{line,455}]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,0},
> {status,waiting},
> {current_stacktrace,[{gen_server,loop,6,[{file,"gen_server.erl"},{line,348}]},
>                        {proc_lib,init_p_do_apply,3,
>                                  [{file,"proc_lib.erl"},{line,227}]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,735},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,157},
> {status,runnable},
> {current_stacktrace,[{lists,nthtail,2,[{file,"lists.erl"},{line,127}]},
>                        {lists,suffix,2,[{file,"lists.erl"},{line,153}]},
>                        {rabbit_mgmt_db,maybe_zero_rate,2,[]},
>                        {rabbit_mgmt_db,'-zero_old_rates/2-lc$^0/1-0-',2,[]},
>                        {rabbit_mgmt_db,'-zero_old_rates/2-lc$^0/1-0-',2,[]},
>
> {rabbit_mgmt_db,'-get_fine_stats_from_list/3-lc$^0/1-0-',
>                                        2,[]},
>
> {rabbit_mgmt_db,'-get_fine_stats_from_list/3-lc$^0/1-0-',
>                                        2,[]},
>                        {rabbit_mgmt_db,get_fine_stats_from_list,3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,0},
> {status,waiting},
> {current_stacktrace,[{gen_server,loop,6,[{file,"gen_server.erl"},{line,348}]},
>                        {proc_lib,init_p_do_apply,3,
>                                  [{file,"proc_lib.erl"},{line,227}]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,3419},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,3942},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,1940},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,0},
> {status,waiting},
> {current_stacktrace,[{gen_server,loop,6,[{file,"gen_server.erl"},{line,348}]},
>                        {proc_lib,init_p_do_apply,3,
>                                  [{file,"proc_lib.erl"},{line,227}]}]}]
> ...done.
> chvirrmqprd01:~ # rabbitmqctl eval
> 'process_info(global:whereis_name(rabbit_mgmt_db),
> [message_queue_len,status, current_stacktrace]).'
> [{message_queue_len,1102},
> {status,runnable},
> {current_stacktrace,[{mnesia_lib,db_fixtable,3,
>                                    [{file,"mnesia_lib.erl"},{line,1127}]},
>                        {mnesia_lib,db_match_object,3,
>                                    [{file,"mnesia_lib.erl"},{line,1087}]},
>                        {rpc,local_call,3,[{file,"rpc.erl"},{line,327}]},
>                        {mnesia,do_dirty_rpc,5,
>                                [{file,"mnesia.erl"},{line,1807}]},
>                        {rabbit_mgmt_db,augment_queue_pid,2,[]},
>                        {rabbit_mgmt_db,augment,4,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]},
>                        {rabbit_mgmt_db,'-augment/3-lc$^0/1-0-',3,[]}]}]
> ...done.
>
>
>
> On Tue, Nov 13, 2012 at 3:23 PM, Irmo Manie <irmo.manie at gmail.com
> <mailto:irmo.manie at gmail.com>> wrote:
>
>     On Tue, Nov 13, 2012 at 12:24 PM, Simon MacMullen
>     <simon at rabbitmq.com <mailto:simon at rabbitmq.com>> wrote:
>
>         On 13/11/12 10:35, Irmo Manie wrote:
>
>             I'm still seeing this behavior and it seems to get worse.
>             Every hour there is a drop in delivery rate to 0 and now
>             also the
>             publish rate drops to 0.
>             Sometimes even for a period of 5 minutes
>
>             I'm really sure that messaging continues and the statistics
>             are a lie.
>             We use rabbitmq as our main communication hub and certainly
>             a drop in
>             real-time market data going from ~500 msg/s to 0 for about 5
>             minutes
>             must be noticed somewhere if this is true.
>
>             I'm wondering if it might be the writing part of the
>             statistics to the
>             database. Could that timeout?
>
>
>         No.
>
>         But if the database is busy with something else then it could
>         take a while to process the updates it's received. I find it
>         hard to believe that it could be busy for 5 minutes though.
>
>         There is nothing scheduled once an hour in RabbitMQ though - so
>         whatever is happening once an hour must be outside the broker.
>         Figuring out what that is would help a lot.
>
>
>     Clear :-)
>
>
>             I don't really want to restart to see if that fixes the problem.
>             It's been running for 177 days now and apart from the weird
>             statistics
>             we don't have any problems.
>
>             Is there anything I could do as additional checks to figure
>             out where
>             the problem might be?
>
>
>         Just to be clear: is this just the rates in /api/overview or the
>         ones in /api/queues as well?
>
>
>     Both show 0 as the rates
>
>         I'd also be interested to see the output of:
>
>         rabbitmqctl eval
>         'process_info(global:whereis___name(rabbit_mgmt_db),
>         [message_queue_len,status, current_stacktrace]).'
>
>         when the statistics drop to 0.
>
>
>     I'll try to check this and let you know if I find anything weird
>
>     Thanks so far for the information and help.
>
>     - Irmo
>
>
>         Cheers, Simon
>
>         --
>         Simon MacMullen
>         RabbitMQ, VMware
>
>
>


-- 
Simon MacMullen
RabbitMQ, VMware


More information about the rabbitmq-discuss mailing list