[rabbitmq-discuss] RabbitMQ writes non-stop to disk

Raphael Simon raphael at rightscale.com
Tue Oct 25 17:22:23 BST 2011


Hello Emile, thanks for the reply, comments inline.

--
Raphael.

On Tue, Oct 25, 2011 at 1:39 AM, Emile Joubert <emile at rabbitmq.com> wrote:

> Hi Raphael,
>
> On 25/10/11 02:07, Raphael Simon wrote:
> > Hello all,
> >
> > We are seeing an issue on a production broker where the RabbitMQ process
> > writes non-stop to files in mnesia/<node>/msg_store_persistent. It keeps
> > creating new files and the problem seems to be getting worse. Listing
> > the files in that directory shows that it's creating a new 16 MB file
> > every 2 to 4 minutes [1].
> >
> > The throughput of persistent messages on this broker is orders of
> > magnitude too low to account for that (maybe 20 msg/sec at most, each
> > in the tens of KB).
>
> > There are about 100 messages sitting in queues on that broker, so that
> > should not cause that many writes; iostat shows about 6000 writes/s.
>
> How did you determine this number? Is it constant? I would expect the
> behaviour you describe when some queues keep growing or when the broker
> needs to free up a lot of memory.
>

We use collectd in combination with rabbitmqctl (not great for performance,
but it lets us see what's going on in the brokers at a glance). These
numbers come straight from a combination of rabbitmqctl and iostat, in this
case collected over the course of days, and they are fairly constant.
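For reference, roughly the same counters can be pulled straight from the
broker's Erlang shell, which avoids forking rabbitmqctl for every sample
(a sketch; rabbit_amqqueue:info_all/2 is internal API, so the exact info
items can differ between releases):

%% List depth and memory use for every queue in the default vhost.
rabbit_amqqueue:info_all(<<"/">>, [name, messages, messages_ready,
                                   messages_unacknowledged, memory]).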


>
> If the broker is running rabbit version 2.5.0 or later, could you please
> supply the result of "rabbitmqctl report"?
>

We are running rabbit 2.4.1.

>
> If it is an older version please run "rabbitmqctl list_queues" with all
> queueinfoitems, and supply both the query and the result. The result of
> "erlang:memory()." from the Erlang shell will also be helpful, as well
> as a copy of the rabbit configuration file, if you have made any
> relevant changes.
>

As mentioned above, the result of rabbitmqctl list_queues is what we
graph/monitor. The box still has plenty of free memory (7 GB). Here is the
output of erlang:memory():

(rabbit at broker1-1)1> erlang:memory().
[{total,3240343872},
 {processes,1558984664},
 {processes_used,1545242416},
 {system,1681359208},
 {atom,1924057},
 {atom_used,1908273},
 {binary,1456029912},
 {code,12256962},
 {ets,101781296}]
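
Note that binary accounts for roughly 1.4 GB of the 3.2 GB total, which in
a broker is usually message payloads. To get a hint of which processes hold
references to the most binary data (a sketch using only stock Erlang;
shared binaries are counted once per referencing process, so the sizes
overstate the true footprint):

%% Rank processes by the total size of off-heap binaries they reference.
lists:sublist(
  lists:reverse(
    lists:keysort(2,
      [{P, lists:sum([Sz || {_, Sz, _} <- Bins])}
       || P <- erlang:processes(),
          {binary, Bins} <- [erlang:process_info(P, binary)]])),
  10).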

And here is our config:

[{rabbit, [{vm_memory_high_watermark, 0.5}]}].
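
With the watermark at 0.5, the memory alarm should only trigger once the
broker uses about half of the box's RAM. To double-check the limit the
broker actually computed (a sketch, assuming the vm_memory_monitor API
shipped with 2.4.x):

%% Configured fraction, and the absolute limit in bytes derived from it.
vm_memory_monitor:get_vm_memory_high_watermark().
vm_memory_monitor:get_memory_limit().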


> Are there any unusual entries in the broker logfile? How often does the
> memory alarm trigger? Are there any entries that appear at the onset of
> the disk activity?
>

Nothing unusual in the logs (the sasl log is empty, and the rabbit log just
has the usual connection start/stop entries).


>
>
> -Emile
>
>
Something I didn't mention in my first email is that we have about 8
brokers running in production and only one is showing these symptoms; the
throughput is about the same across all brokers.

I've dug deeper using an Erlang shell and can see that two queue processes
seem to account for most of the reductions. Looking at the corresponding
variable queue state, I see that the target_ram_count of the vqstate record
is 0. This is reminiscent of a couple of bugs we had identified with
Matthew Sackman that he fixed in 2.4. I'm happy to provide more information
on the queue processes' state if needed.
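
For reference, this is roughly how I dug that out (a sketch; the vqstate
record is internal to rabbit_variable_queue, so its exact layout is
release-specific):

%% Rank processes by reductions to spot the busy ones (the top entries are
%% not necessarily queue processes), then dump a suspect's state; the
%% gen_server2 status embeds the vqstate, including target_ram_count.
Busy = lists:sublist(
         lists:reverse(
           lists:keysort(2,
             [{P, R} || P <- erlang:processes(),
                        {reductions, R} <- [erlang:process_info(P, reductions)]])),
         5),
[{QPid, _} | _] = Busy,
sys:get_status(QPid).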

--
Raphael.