[rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under Load

james.poole at rsa.com
Thu Jan 12 17:55:59 GMT 2012


Simone, that would be great if you could try to reproduce it.

 

As mentioned, we are creating 2000 consumers each with their own queue bound
to a fanout exchange.  After the queues have all been created and bound, a
producer publishes a 2 MB message to this fanout exchange once every second
for 50 seconds.
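
A rough sizing sketch of the test described above may help anyone trying to reproduce it. It is not taken from the original test program. The class and method names are hypothetical, and "2 MB" is assumed to mean 2 MiB. It only counts deliveries to consumers. It makes no claim about how the broker stores the payload internally.

```java
// Back-of-envelope delivery load for a fanout test.
// Illustrative only: names are hypothetical, not from the original test app.
final class FanoutLoadEstimate {
    static final long MIB = 1024L * 1024L;

    // Each published message is routed to every queue bound to the fanout exchange,
    // so total deliveries = queues * publishes.
    static long totalDeliveries(int queues, int publishes) {
        return (long) queues * publishes;
    }

    // Bytes the broker has to send to consumers for a single published message.
    static long outboundBytesPerPublish(int queues, long messageBytes) {
        return queues * messageBytes;
    }

    public static void main(String[] args) {
        int queues = 2000;            // one queue per consumer
        int publishes = 50;           // one message per second for 50 seconds
        long messageBytes = 2 * MIB;  // assumption: "2 MB" == 2 MiB

        long perPublish = outboundBytesPerPublish(queues, messageBytes);
        System.out.println("total deliveries: " + totalDeliveries(queues, publishes));
        System.out.println("outbound MiB per publish: " + perPublish / MIB);
        System.out.println("outbound MiB over the run: " + perPublish * publishes / MIB);
    }
}
```

With one publish per second, the per-publish figure is also the delivery rate consumers would need to sustain to keep up. Anything they cannot absorb stays queued or unacknowledged in the broker, since autoAck is off.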

 

All queues are non-durable, and autoAck was set to false in the Java
client.

 

Everything hums along until the vm_memory_high_watermark is triggered and
then we see the crash.  One interesting thing is that in the log it still
shows it accepting and starting tcp connections after the memory alarm is
triggered (for around 15 seconds before the crash).  I thought this was
supposed to block until the memory was under control?

 

-James

 

From: Simone Busoli [mailto:simone.busoli at gmail.com] 
Sent: Wednesday, January 11, 2012 3:02 PM
To: Poole, James
Cc: rabbitmq-discuss at lists.rabbitmq.com; Kuch, Jerry (VMware)
Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens
under Load

 

Hi James,

If you can provide more details about the load you're applying to the broker
I would be glad to try to reproduce it.
We've been using RabbitMQ on Windows in production for some months now and
didn't experience any weird behavior.
What I'm interested in is whether entities and messages are durable, if you
use transactions or publisher confirms and the like.

On Jan 11, 2012 7:52 PM, <james.poole at rsa.com> wrote:

Yeah, I should have mentioned that we started out testing with the 64-bit
version and found this issue... though the VM probably didn't have very much
more memory than a 32-bit address space would provide.  Then we backed down
to the 32-bit version to see if it went away, but it didn't.

I will see if we can send out the test program (it's just a simple Java app
using the rabbitmq-java-client-2.7.1).  If I can send it out, how would I go
about this... attach to the email or upload it to a server somewhere?

-James

-----Original Message-----
From: Jerry Kuch [mailto:jerryk at vmware.com]
Sent: Wednesday, January 11, 2012 1:44 PM
To: Poole, James
Cc: rabbitmq-discuss at lists.rabbitmq.com
Subject: Re: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens
under Load

James:  Out of curiosity, have you tried the new 64-bit release of
Erlang for Windows in your environment?  The address space size
limitations of the 32-bit version have been associated with crashy
Rabbits in the past (although bringing your memory high watermark
value down so that the back-pressure mechanisms engage when the
broker is in less trouble may help).  I think you can scare up the
new Erlang here:

http://www.erlang.org/download/otp_win64_R15B.exe
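
To make the watermark suggestion concrete, here is a minimal sketch of how the alarm threshold scales with the configured fraction. The numbers are assumptions, not values from James's setup: 0.4 is used as the commonly documented default for vm_memory_high_watermark, and 2 GiB stands in for the memory a 32-bit process can address on Windows. The class and method names are hypothetical.

```java
// Estimates the point at which the memory alarm would fire.
// Illustrative only: the fraction and memory size below are assumptions.
final class WatermarkEstimate {
    static final long MIB = 1024L * 1024L;

    // Threshold in bytes = watermark fraction * memory available to the broker.
    static long thresholdBytes(double fraction, long availableBytes) {
        return (long) (fraction * availableBytes);
    }

    public static void main(String[] args) {
        long available = 2048 * MIB; // assumption: 2 GiB usable by a 32-bit VM

        // Compare the assumed default with a lower, more conservative setting.
        for (double fraction : new double[] {0.4, 0.3}) {
            long mib = thresholdBytes(fraction, available) / MIB;
            System.out.println("watermark " + fraction + " -> about " + mib + " MiB");
        }
    }
}
```

A lower fraction makes the back-pressure engage earlier, which leaves more headroom between the alarm and the point where a large heap allocation can no longer be satisfied.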

Until recently there was no 64-bit Erlang, so even those running on
64-bit Windows boxes were still relegated to 32-bit VMs.

I am curious about the different results between a physical machine
and a virtualized one, with one showing a "clean" Erlang VM crash and
the other exhibiting a blue-screen, fatal OS-wrecker...

Is the traffic you're using to bring these systems down part of a
large or proprietary app, or can you extract a bare minimum piece
of code that brings the pain and share it with us?  If you could
do the latter we could more easily investigate the situation within
VMware since the difference in behavior between baremetal and
virtualization is disquieting...

Best regards,
Jerry

----- Original Message -----
From: "james poole" <james.poole at rsa.com>
To: rabbitmq-discuss at lists.rabbitmq.com
Sent: Wednesday, January 11, 2012 10:32:23 AM
Subject: [rabbitmq-discuss] Windows RabbitMQ Crashes and Blue Screens under
Load





We've let loose one of our testing ninjas on RabbitMQ for load testing, and
we're consistently running into issues when the high memory watermark is
hit.



Windows Server 2003 32-bit, Erlang R15B 32-bit, RabbitMQ 2.7.1



2,000 Consumers each with their own queue bound to a direct exchange

1 Producer, publishing a 2 MB message to the exchange, once every second,
for a total of 50 seconds



Everything behaves as expected, until the memory footprint hits the high
watermark, at which point:

On a physical machine: ERL process crashes and dump file is created

On a Virtual Machine: Blue Screen of Death is shown and server reboots



VM environment = VMware, Inc.(R) vCenter Lab Manager 4.0 (4.0.3.1318)



One other note is that we see the same problem with ERL R14B04 and Rabbit
2.7.0.



I have looked through the log file and also turned on the console debug
output, and nothing seems to be jumping out as an error. If needed, I can
upload the minidump from the Blue Screen and the ERL crash dump file, just
point me where to do it.



Let me know if there is anything else I can do to try and help get this
fixed.







In the rabbit log, there are no errors, and only a few warnings 20 seconds
before the crash:



=INFO REPORT==== 11-Jan-2012::10:55:53 ===

closing TCP connection <0.4405.0> from 10.6.64.104:57830



=WARNING REPORT==== 11-Jan-2012::10:55:53 ===

exception on TCP connection <0.20552.0> from 10.6.64.104:59521

connection_closed_abruptly





In the console output log file for the physical machine, this is the only
message I see:



starting direct_client ...done

starting notify cluster nodes ...done



broker running

Eshell V5.9 (abort with ^G)

(rabbit at QEDLP082)1>

Crash dump was written to: C:/Documents and
Settings/Administrator.QEDLP/Application Data/RabbitMQ/erl_crash.dump

eheap_alloc: Cannot allocate 6731340 bytes of memory (of type "heap").

in message_loop

win32sysinfo:Erlang has closed.




_______________________________________________
rabbitmq-discuss mailing list
rabbitmq-discuss at lists.rabbitmq.com
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss


