[rabbitmq-discuss] separating wheat from chaff inside the stats from rabbitmq_management

Mon Apr 8 10:52:26 BST 2013

On 05/04/13 16:27, ed anderson wrote:
> I'm using rabbitmq version 2.8.7, and i'm curious about what some of the
> stats that come out of the management plugin mean..

Hi!

> These are the things that I don't grok ( please direct me to the decoder
> ring if it exists, I'm happy to read about this topic, but I haven't
> found much via google-search ):
> FWIW, I'm interested in monitoring operational readiness, and throughput.
>
> inside /api/nodes ..
>   *  Is it good/bad/neither that mem_atom_used is 99% of mem_atom? Does
> this tell me anything about performance or availability?

In 2.x most of these mem_ things are taken straight from the figures 
reported by the underlying Erlang VM, without much regard for usefulness 
- see http://www.erlang.org/doc/man/erlang.html#memory-0.

In 3.0 this was rewritten to give a (somewhat) more RabbitMQ-centric and 
thus hopefully useful view of things.

>          "mem_atom": 1547217,
>          "mem_atom_used": 1536812,

>   * mem_alarm ( not listed ) , is clearly valuable for availability
> monitoring, but how is it triggered?

See memory based flow control at http://www.rabbitmq.com/memory.html

>          "mem_binary": 6349904,
>          "mem_code": 17345156,
>          "mem_ets": 1961472,
>          "mem_limit": 787570688,
>          "mem_proc": 18586392,
>          "mem_proc_used": 18559872,
>          "mem_used": 47203120,
>   * mem_atom_used + mem_binary + mem_code + mem_ets + mem_proc is
> _almost_ equal to mem_used, but not quite. It doesn't appear to be off
> by any reported counter.   Do I even care about that?  What matters from
> these counters?
>   * is mem_used / mem_limit  enough to keep track of my memory-related
> utilization and health?

Pretty much so. We occasionally might want to use the others for 
debugging, but they're quite obscure.
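
As a rough illustration of that ratio, here is a minimal Python sketch. It assumes you have already fetched and JSON-decoded one entry from /api/nodes into a dict; the function name memory_utilization is made up for this example and is not part of the management plugin.

```python
def memory_utilization(node):
    """Return mem_used / mem_limit for one decoded /api/nodes entry.

    Returns None if either field is missing or mem_limit is zero.
    (Hypothetical helper, not part of RabbitMQ.)
    """
    used = node.get("mem_used")
    limit = node.get("mem_limit")
    if used is None or not limit:
        return None
    return used / limit


# Figures quoted in the question above.
node = {"mem_used": 47203120, "mem_limit": 787570688}
ratio = memory_utilization(node)
print(f"memory utilization: {ratio:.1%}")
```

With the figures from the question this comes out at roughly 6% of the limit. What threshold you alert on is up to you; the sketch only computes the ratio.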

>   * run_queue ?  What is this a reference to ?

Again, quite obscure. This is an approximate equivalent of load average 
for Erlang processes inside the VM.

> inside /api/queues/{vhost}
>   * sometimes message_stats is in the output, and sometimes not.

...depending on whether any channels currently exist that are publishing 
to / consuming from the queue.

>   * if present, message_stats has two possible formats.

...depending on whether that's been going on long enough to calculate rates.

>     When rate is present, is it messages/sec?

Yes.

>     What does "publish" mean?

That's the raw count of publishes to that queue - but only those from 
channels which are still open. We'll be fixing that and making it 
monotonic in 3.1 (as well as adding a history).
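
To cope with the three cases described above (no message_stats at all, a bare count, or a count plus publish_details), a monitoring script could do something like the following. This is only a sketch: it assumes a queue object already decoded from /api/queues/{vhost} into a dict, and publish_summary is a hypothetical name.

```python
def publish_summary(queue):
    """Return (publish_count, publish_rate) for one decoded queue object.

    Returns None when message_stats is absent, i.e. no channel is
    currently publishing to or consuming from the queue. The rate is
    None until enough time has passed for one to be calculated.
    (Hypothetical helper, not part of RabbitMQ.)
    """
    stats = queue.get("message_stats")
    if stats is None:
        return None
    count = stats.get("publish")
    details = stats.get("publish_details")
    rate = details.get("rate") if details else None
    return count, rate


# The shapes from the question, plus a queue with no stats at all.
no_stats = {}
count_only = {"message_stats": {"publish": 20}}
with_rate = {
    "message_stats": {
        "publish_details": {"rate": 2.2547070391502824},
        "publish": 40,
    }
}

for q in (no_stats, count_only, with_rate):
    print(publish_summary(q))
```

Since the 2.x count only covers channels that are still open, it can go down as well as up, so I would not derive a rate from differences between successive counts; use the reported rate when it is present.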

>    (1)  "message_stats": {
>             "publish": 20
>         },
>    (2)  "message_stats": {
>             "publish_details": {
>                 "rate": 2.2547070391502824,
>                 "interval": 4435166,
>                 "last_event": 1365091799843
>             },
>             "publish": 40
>         },
>   * inside the backing_queue_status block, does the difference between
> q1 .. q4 matter ( from an operational point of view )

Probably not. Again, backing_queue_status is mostly for debugging. q1 to
q4 and delta are the internal queues that make up the AMQP queue; they
represent different degrees of messages being paged out.

>   * messages vs messages_ready vs messages_unacknowledged - what's with
> that?  Also, could their respective _details have changed as in the
> example above for message.

messages_ready - messages which could be consumed.
messages_unacknowledged - messages which have already been consumed and
are held waiting for the client to acknowledge.
messages - the total of the two.
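
In code terms, the relationship between the three counters looks like this. It is a minimal sketch with made-up numbers, assuming a queue object already decoded from /api/queues/{vhost} into a dict; queue_depth is a hypothetical name.

```python
def queue_depth(queue):
    """Split a decoded queue object into ready, unacknowledged and total.

    The total is computed as ready + unacknowledged, which is how
    "messages" is described above. Missing fields are treated as 0.
    (Hypothetical helper, not part of RabbitMQ.)
    """
    ready = queue.get("messages_ready", 0)
    unacked = queue.get("messages_unacknowledged", 0)
    return {
        "ready": ready,
        "unacknowledged": unacked,
        "total": ready + unacked,
    }


# Illustrative values only, not taken from a real broker.
sample = {"messages_ready": 12, "messages_unacknowledged": 3, "messages": 15}
depth = queue_depth(sample)
print(depth)
print(depth["total"] == sample["messages"])
```

For throughput monitoring, a growing ready count suggests consumers are not keeping up, while a growing unacknowledged count suggests consumers are receiving messages but not acking them.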

And yes, the code that adds _details is generic.

>   * why doesn't message_stats have a correlating "message" key?

What would that key represent?

Cheers, Simon

-- 
Simon MacMullen
RabbitMQ, VMware