[rabbitmq-discuss] Erlang crashes trying to allocate 583848200 bytes of memory

Sun Jan 23 18:23:21 GMT 2011

On Jan 18, 10:23 pm, Jerry Kuch <jer... at vmware.com> wrote:
Hi,
we have a similar problem that can be summarised as follows.
Our goal is to compare RabbitMQ (the last stable version) to ActiveMQ.
To this end we set up two minimal and similar testing environments
having both 1 publisher, 1 subscriber and 1 message broker (RabbitMQ/
ActiveMQ) server.
In both testing environments the message broker runs under a Windows
Hyper-V virtual machine (hosted on a Windows Server 23008 64 bits)
that has as OS a 32 bit Windows Server 2008 (if I'm correct we
allocated 2 GB, perhpas 4 GB, to the VM and 1 CPU).
The publishers/subscribers are run under Windows  XP/Vista and are on
the same LAN as the broker VM.

The publisher (whose code and details are reported on this thread
"http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2011-January/
010818.html") has a config file that specifies how many chunks of
messages have to be sent. For each chunk the config specifies how many
messages have to be published and of what size (our tests use a range
between 512 and 2K bytes).
For each chunk the publisher publishes an initial message that informs
the subscriber a new chunk has been started. This initial message
contains a timestamp of the chunk start. Immediately afterwards the
publisher sends the envisaged number of messages and then a final
message informing the subscriber the chunk is finished. Then the
publisher process the next chunk as specified in the config file.

On the subscriber side it connects to the broker waiting for messages.
When a new chuck started message is received, it saves the received
timestamp and then counts how many messages it has received till the
indication of the chunk end message. At this point it reads the system
clock and calculates how much time has elapsed since the chunck start
on the publisher side (as reported in the received timestamp) and the
arrival on the chunk end message on the subscriber side.
The subscriber then logs all these data (start timestamp, end
timestamp, elapsed time, number of messages), and is ready to process
a new message chunk.
Anyway further details and the Java code is available in the indicated
RabbitMQ newsgroup's message.
In our testing environment, therefore, there are no durable queues, no
permanent messages, a single subscriber, ...

What we experienced are crashes of the RabbitMQ server due to the
memory allocation issues. Indeed, the Windows Task Manager shows the
RabbitMQ process increasing memory, until it crashes.
On the RabbitMQ forum someone suggested to my colleague Cristoforo to
reduce the vm_memory_high_watermark value from its default 0.40 value.
We tried with 0.25, 0.20 and 0.10. The crash problem doesn't disappear
unless the subscriber runs on a fast machine (e.g. putting the
subscriber on the same VM as the message broker seems to have less
crashing issues).
With ActiveMQ we don't experience similar issues at all.
We are not using Erlnag under a 64 bit OS (our VM, as stated, is a 32
bit Windows Server 2008), and the vm_memory_high_watermark has been
lowered enough (we think).

Our conclusion is that the Windows Erlang implementation has some
serious issues on memory management under Windows, or that we have not
configured Erlang/RabbitMQ in a proper way.
Of course we need to run RabbitmQ under Windows (Server 2008 is not a
real constraint, any Windows OS could be OK, even if having Windows
Server 2008 would be preferrable).
So if someone has experience running RabbitMQ under Windows, please
help us in spotting where we are wrong.

Sorry for the long comment, but I think it was necessary to describe a
testing scenario that, in my opinion, can shed some lights on the
flagged issue.
Regards
    Dom
> A couple more small remarks on the somewhat less than optimal Windows
> situation...
>
> > Possible solutions:
> >  - upgrade to erlang >= R13B3
>
> Be warned that the results of doing so currently appear to be a bit on
> the "meh" side.  In some cases we've seen the onset of the problem be
> delayed a little bit, but happen anyway, even with version 14B01.
>
> >  - Configure 'vm_memory_high_watermark' value (as Jerry suggested).
> >    The smaller the value, the sooner Rabbit will stop accepting
> >    new messages and growing memory.
>
> This is probably the most applicable advice we have at this point. If
> you do try it with a lower watermark value, say something like 0.2,
> we're interested in hearing what results you experience as the
> behavior Marek mentions kicks in, and whether they're sufficient to
> keep the crashes at bay while not unacceptably compromising
> throughput.
>
> >  - Play with IMAGE_FILE_LARGE_ADDRESS_AWARE flag.
>
> Also, be aware that just hacking this bit on the executable doesn't
> necessarily mean that the code within the executable is going to do
> the right thing, even if you're running a version of Windows that's
> been booted with the appropriate settings to induce a 3GB/1GB
> user/kernel split rather than the default 2GB/2GB.  Historically, some
> 32-bit applications under Windows have played funny games with pointer
> arithmetic and made assumptions about pointer values that can result
> in them misbehaving if user space doesn't end around the usual 2GB
> boundary.  I have no idea if the Windows release of Erlang is one of
> them.
>
> Best regards,
> Jerry
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-disc... at lists.rabbitmq.comhttps://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss