<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Massimo,<div>After quite some time of trials and tribulations, I have found a bug&nbsp;</div><div>in the RabbitMQ STOMP adapter that appears to cause&nbsp;</div><div>crashes under high message load.</div><div>&nbsp; &nbsp; The bug is related to the (rather primitive) throttling which&nbsp;</div><div>kicks in when the memory limit is reached.</div><div><br></div><div>After fixing this, I have successfully run your test case, and&nbsp;</div><div>RabbitMQ STOMP does manage to survive while throwing&nbsp;</div><div>away large numbers of SENDs to unsubscribed topics. I had to&nbsp;</div><div>make one alteration to the configuration in order to get the test&nbsp;</div><div>to stay up. I set vm_memory_high_watermark to 0.2 (instead of the&nbsp;</div><div>default of 0.4).</div><div><br></div><div>This setting determines the proportion of available memory&nbsp;</div><div>above which the 'memory-alarm' is triggered. The reader&nbsp;</div><div>process in RabbitMQ-STOMP will then stop processing&nbsp;</div><div>further bytes from clients until the memory alarm is switched&nbsp;</div><div>off. &nbsp;Because message processing is proceeding at a very&nbsp;</div><div>high rate (that is why memory is being used up), the&nbsp;</div><div>input queues for Erlang processes are likely to be large,&nbsp;</div><div>and so even blocking input will not result in a lowering of&nbsp;</div><div>memory consumption -- memory requirements increase in&nbsp;</div><div>the pipeline flow -- and so the actual memory consumed is&nbsp;</div><div>likely to reach 0.6 to 0.7 of available memory (found by observation). &nbsp;</div><div>This is why setting it to 0.4 (the default) doesn't work. 
&nbsp;Memory&nbsp;</div><div>for Erlang can then be completely exhausted, resulting in&nbsp;</div><div>crashes (which normally result in long pauses and large&nbsp;</div><div>amounts of dump data written to disk). Any recovery after&nbsp;</div><div>that is unlikely to succeed for very long, since the pause&nbsp;</div><div>allows time for more requests to be put in. (The benchmark&nbsp;</div><div>test runs with non-blocking I/O; I'm not sure what that means,&nbsp;</div><div>but if it means that requests will be put in even if the socket&nbsp;</div><div>blocks, then this is going to increase the load during the&nbsp;</div><div>pauses.)</div><div><div><br class="webkit-block-placeholder"></div><div>As I say, setting it to 0.2 worked, and I suspect that 0.3 would&nbsp;</div><div>still work.</div><div><br></div><div>However...</div><div><br></div><div>The test doesn't really tell us much. &nbsp;This is a poor benchmark,&nbsp;</div><div>because it consists of a single producer sending a 20-byte&nbsp;</div><div>message to a topic to which no one is currently subscribed. &nbsp;</div><div>This is therefore a measure of how fast a broker can throw&nbsp;</div><div>away messages. &nbsp;RabbitMQ does reasonably well here, but&nbsp;</div><div>cannot determine that there are no subscribers until a message reaches&nbsp;</div><div>the actual Rabbit exchange, so every message goes through AMQP&nbsp;</div><div>processing. &nbsp;In general, especially in the presence of potential&nbsp;</div><div>durable topic subscriptions, it is hard to see how to fix this --&nbsp;</div><div>though if it were a common scenario, perhaps some sort of&nbsp;</div><div>status of the broker subscription space could be cached near&nbsp;</div><div>to the client, allowing us to throw away messages close to&nbsp;</div><div>the client; but then we would need to update the cache when&nbsp;</div><div>(remote) subscribers come online, or risk throwing away&nbsp;</div><div>messages in error. 
&nbsp;This is very complex, and the gain would be&nbsp;</div><div>only in a low-probability use case.</div><div><br></div><div>I notice that the benchmark test runs for a very long time&nbsp;</div><div>(hours on my little 2 GB, 2-core Linux VM) and produces a noisy&nbsp;</div><div>graph as a result (there are lots of zeros in it because RabbitMQ&nbsp;</div><div>blocks reads quite often). The graph doesn't help us to guess&nbsp;</div><div>at an average rate. &nbsp;I attach the benchmark JSON output, so&nbsp;</div><div>you can view it with generic_report or something similar.</div><div><br></div><div>The bug fix is Rabbit Bugzilla 24426, which after QA will&nbsp;</div><div>probably go into the next release.</div><div><br></div><div>Thanks for spotting this,</div><div><br></div><div><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; "><div><div style="font-style: normal; "><font class="Apple-style-span" face="Georgia">Steve Powell</font></div><div style="font-style: normal; "></div><div><font class="Apple-style-span" face="Georgia"><a href="mailto:steve@rabbitmq.com">steve@rabbitmq.com</a></font></div></div><div style="font-style: normal; "><div><div><font class="Apple-style-span" face="Georgia" size="2"><span class="Apple-style-span" style="font-size: 10px; ">[</span><i>wrk</i><span class="Apple-style-span" style="font-size: 10px; ">: +44-2380-111-528] [</span><i>mob</i><span class="Apple-style-span" style="font-size: 10px; ">: +44-7815-838-558]</span></font></div><div><font class="Apple-style-span" face="Georgia" size="2"><span 
class="Apple-style-span" style="font-size: 10px; "></span></font></div></div></div></span></div></div></body></html>