[rabbitmq-discuss] OOM kill

Dmitry Andrianov dmitry.andrianov at alertme.com
Sat Mar 8 11:17:40 GMT 2014


Thanks very much for the explanation, Simon.

When you say Rabbit will gracefully stop accepting connections when the ulimit is reached, is that an action explicitly taken by Rabbit itself when it sees the limit approaching, or is everything handled by the OS?
If Rabbit does something itself, wouldn't it make sense to do the same when memory is running low?

I will of course follow your advice and play with ulimit, but it does not feel very reliable - even if I find a ulimit value that works in this specific test case, it may stop working when the message size changes.
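One rough way to make the value less arbitrary is to derive the descriptor limit from a memory budget and the per-connection cost observed in the test. A hypothetical sketch (the function name and the reserve are made up for illustration; the ~120 kB per-connection figure comes from this thread):

```python
# Rough sizing sketch: pick a "ulimit -n" value so that connection
# memory alone cannot exhaust the box. Numbers are illustrative.

def fd_limit_for_budget(mem_budget_bytes, per_conn_bytes, reserved_fds=100):
    """Estimate a file-descriptor limit from a memory budget.

    per_conn_bytes: observed memory per connection process
    reserved_fds: descriptors kept aside for files, internal sockets, etc.
    """
    return mem_budget_bytes // per_conn_bytes + reserved_fds

# Example: allow connections to use at most half of the 0.8 GB
# vm_memory_limit from the status output, at ~120 kB per connection.
budget = 804_643_635 // 2
per_conn = 120 * 1024
print(fd_limit_for_budget(budget, per_conn))
```

The limit would of course need re-deriving if the per-connection cost changes, but it ties the number to a measurement rather than to one test run.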

Regarding plugin memory - it is worth mentioning that we have a couple of custom plugins, including a custom authentication backend. Could these be the root cause? Is there anything a plugin developer needs to do explicitly to make sure memory is accounted for correctly?

Thank you.

> On 7 Mar 2014, at 17:45, Simon MacMullen <simon at rabbitmq.com> wrote:
> 
>> On 07/03/2014 5:16PM, Dmitry Andrianov wrote:
>> Hello.
> 
> Hi.
> 
>> We are trying to load-test RabbitMQ server in different configurations
>> on Amazon EC2 node.
>> Most of our tests end with Linux OOM killer intervening and killing Rabbit.
>> That is something I cannot really understand especially given that it is
>> reproducible even with vm_memory_high_watermark set to 0.2 and no other
>> processes running on that box.
>> So if someone could shed some light onto that issue it would help a lot.
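For reference, a minimal rabbitmq.config fragment (classic Erlang-term format) setting the watermark mentioned above; 0.2 means 20% of installed RAM:

```erlang
%% rabbitmq.config - illustrative fragment only
[
  {rabbit, [
    {vm_memory_high_watermark, 0.2}
  ]}
].
```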
>> 
>> Below the status response not long before the process was killed:
> 
> <snip>
> 
>> Couple of strange things there are:
>> 
>> 1. {vm_memory_limit,804643635} but still memory {total,1984625336}. How
>> is that possible? https://www.rabbitmq.com/memory.html says that the
>> Erlang process can take twice the configured size, so I expected that,
>> but this is definitely more than twice.
> 
> The only ways RabbitMQ has of preventing memory use from increasing are to do with messages - when the memory alarm goes off it will stop accepting new messages, and before that point it will start trying to reduce memory use by paging messages out to disc.
> 
> Normally, messages are the biggest user of memory in RabbitMQ, so this approach works OK.
> 
> However, your test causes RabbitMQ to use the majority of its memory in connection processes - you have 11k connections open, at about 120kB each.
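A quick back-of-the-envelope check of those figures (illustrative arithmetic only, using the vm_memory_limit from the status output quoted earlier):

```python
# 11k connections at ~120 kB each, versus the 0.8 GB watermark.
connections = 11_000
per_conn_bytes = 120 * 1024
conn_total = connections * per_conn_bytes   # bytes held by connection processes
watermark = 804_643_635                     # vm_memory_limit from the status

print(conn_total)                # 1_351_680_000 bytes, ~1.26 GiB
print(conn_total > watermark)    # True: connections alone exceed the limit
```

So even before any message payloads are counted, the connection processes by themselves blow past the configured limit, which is consistent with the OOM kills observed.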
> 
> We don't prevent RabbitMQ from accepting new connections when the memory alarm goes off since our main worry is messages - and those connections could be intending to consume messages and thus reduce memory pressure.
> 
> So I guess you might want to reduce the ulimit, so that RabbitMQ runs out of file descriptors before it runs out of memory (when it runs out of FDs it *will* stop accepting network connections gracefully).
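On Debian/Ubuntu packages, one common place to adjust that limit is the service's environment file; a hypothetical sketch (the path and the value are illustrative, check your distribution's packaging):

```shell
# /etc/default/rabbitmq-server (illustrative; location varies by distro)
# Cap file descriptors so RabbitMQ refuses new connections before the
# OOM killer is ever in play. Derive the number from a memory budget
# divided by per-connection cost rather than guessing.
ulimit -n 3500
```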
> 
>> 2. {plugins,-44730984} - how is this one possible?
> 
> That's a good question!
> 
> That value is calculated from (memory used by all plugins including management) - (memory used by the management database). So somehow the memory counter managed to determine that "all plugins" used less memory than just the management plugin. I'll look into that.
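To make that arithmetic concrete, a sketch of how the figure appears to be derived, based on the description above (the function name and the sample numbers are illustrative, not RabbitMQ internals; only the -44730984 result comes from the status output):

```python
# Sketch of the reported {plugins, N} figure: total plugin memory minus
# the management database's memory. If the two numbers are sampled at
# different moments, or the management DB is also attributed elsewhere,
# the difference can go negative.

def plugins_figure(all_plugins_bytes, mgmt_db_bytes):
    return all_plugins_bytes - mgmt_db_bytes

# Illustrative inputs chosen to reproduce the value seen in the thread:
print(plugins_figure(10_000_000, 54_730_984))  # -44730984
```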
> 
> Cheers, Simon
> 
> -- 
> Simon MacMullen
> RabbitMQ, Pivotal