[rabbitmq-discuss] Erlang crashes reports

Emile Joubert emile at rabbitmq.com
Tue Sep 21 10:12:34 BST 2010

Hi Romary,

On 21/09/10 08:03, romary.kremer at gmail.com wrote:
> Hi Emile, thanks for the support and the suggestions.
> On 17 Sep 2010, at 11:11, Emile Joubert wrote:
>> Hi Romary,
>> On 16/09/10 16:38, romary.kremer at gmail.com wrote:
>>> On 16 Sep 2010, at 16:27, Emile Joubert wrote:
>>>> Hi Romary,
>>>> On 16/09/10 14:54, romary.kremer at gmail.com wrote:


>> These graphs show that you create the connections at a fast rate. Do you
>> get the same failure if you create connections at a lower rate?
> We have a ramp-up scenario to evaluate the kind of thing you're
> thinking of. The same crashes occur in the same way at about 4000
> connections established.

Ok, thanks for checking this.

>>>>> The memory used reached the threshold before all of the
>>>>> 10 000 peers had connected.
>>>>> Maybe the average memory occupied by a single, non-SSL connection
>>>>> has somehow got bigger between release 1.8.x and 2.x.x?
>>>> The data structures have changed from 1.8.1 to 2.0.0, but I don't think
>>>> that is the cause of the problem.
>>>>> Does anybody have experience with, or know, the impact of the new
>>>>> release on the memory occupied by connections?
>> It is possible that more memory is required per connection. If a
>> large number of connections approached your RAM budget in versions
>> prior to 2.0.0 then you may now exceed it.

I've tried to repeat your test of opening 10000 SSL connections from the
Java client, using rabbit versions 2.1.0 and 1.8.1. The Erlang VM (R14A)
memory consumption was no different: about 1GB in both cases. The 1GB
required is just to open the connections, without any queues or
exchanges being declared.
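As a rough back-of-envelope check (the 1GB and 10000-connection figures
are from the test above; the arithmetic is only an estimate of the
average per-connection cost):

```shell
# 1 GB spread evenly over 10000 connections, in KB per connection
total_kb=$((1024 * 1024))         # 1 GB expressed in KB
connections=10000
echo $((total_kb / connections))  # prints 104, i.e. roughly 100 KB each
```

That order of magnitude is consistent with hitting a memory alarm well
before 10 000 connections on a small-RAM host.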


> We are a bit worried that all the suggestions we have received to fix
> this issue involve giving the broker more memory, while we thought the
> purpose of 2.x.x was to allow the broker to rely more on disk.
> Moreover, we are wondering what has caused a connection to become so
> greedy that we can no longer establish 10 000 on the same
> configuration.

The broker relies more on disk for storing messages. In your case the
persistence layer is less of an issue, because memory consumption is due
to connections rather than messages.

> Since we do not have a 4GB server available to run the same tests,
> and, as we have quite a short deadline to set up a field test, we are
> considering going back to RabbitMQ release 1.8.1, with Erlang R14B,
> hoping that this will be the winning combination.

From the information you provided so far it appears that insufficient
memory is the most probable cause of the problem. If you are forced to
set the high-water mark above 50% then you probably need more RAM, even
on older versions of rabbit.
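For reference, the high-water mark is the vm_memory_high_watermark
setting in rabbitmq.config (an Erlang terms file). A minimal fragment,
with the default of 50% of installed RAM shown; raising it only delays
the alarm, it does not add memory:

```erlang
[{rabbit, [{vm_memory_high_watermark, 0.5}]}].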

One further thing you can do to rule out alternative causes is to make
100% sure the rabbit logfile does not contain more information. The
establishment of 10000 connections will generate a lot of entries which
you need to filter out. Is there anything that remains? You should also
check the OS log for entries around the time of the problem.
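One way to do that filtering is to keep only the non-INFO report headers
(and the line that follows each). The "=INFO REPORT====" / "=ERROR
REPORT====" headers are the standard Erlang log format; the sample lines
below are placeholders standing in for rabbit.log, and real report
bodies will differ:

```shell
# Hide routine INFO reports so warnings and errors stand out.
# Placeholder input; in practice pipe in your rabbit.log instead.
printf '%s\n' \
  '=INFO REPORT==== 21-Sep-2010::10:00:01 ===' \
  'routine entry' \
  '=ERROR REPORT==== 21-Sep-2010::10:00:02 ===' \
  'problem entry' |
grep -A1 '^=ERROR REPORT'
# prints the ERROR header line and 'problem entry'
```

Against the real file, something like
`grep -A1 -E '^=(ERROR|WARNING|CRASH) REPORT' rabbit.log` (path and
report types as appropriate for your setup) shows whatever remains.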

If an erl_crash.dump file was produced then it may be worth inspecting
it with the crashdump_viewer (part of start_webtool), although I seldom
find this information to be of value.


