Massimo,

After quite a time of trials and tribulations I have found a bug in the RabbitMQ STOMP adapter which appears to cause crashes under high message load. The bug is related to the (rather primitive) throttling which kicks in when the memory limit is reached.

After fixing this, I have successfully run your test case, and RabbitMQ STOMP does manage to survive while throwing away large numbers of unsubscribed topic SENDs. I had to make one alteration to the configuration to get the test to stay up: I set the memory high watermark (vm_memory_high_watermark) to 0.2 instead of the default of 0.4. (The exact configuration entry is given further down.)

This setting determines the proportion of available memory above which the 'memory alarm' is triggered. The reader process in RabbitMQ-STOMP then blocks, processing no further bytes from clients until the alarm is switched off. Because message processing is proceeding at a very high rate (that is why memory is being used up), the input queues of the Erlang processes are likely to be large, so even blocking input does not immediately lower memory consumption -- memory requirements increase as messages move along the pipeline -- and the memory actually consumed is likely to reach 0.6 to 0.7 of available memory (found by observation). This is why the default of 0.4 doesn't work: memory for Erlang can then be completely exhausted, resulting in crashes (which normally mean long pauses and large amounts of dump data written to disk). Any recovery after that is unlikely to last for long, since the pause allows time for more requests to be put in. (The benchmark runs with non-blocking I/O; I'm not sure exactly what that means here, but if it means that requests keep being submitted even when the socket blocks, then the load will only increase during the pauses.)

As I say, setting it to 0.2 worked, and I suspect that 0.3 would still work.

However,...

The test doesn't really tell us much. It is a poor benchmark, because it consists of a single producer sending a 20-byte message to a topic to which no one is currently subscribed. It is therefore a measure of how fast a broker can throw messages away. RabbitMQ does reasonably well here, but it cannot determine that there are no subscribers until the message reaches the actual rabbit exchange, so it goes through full AMQP processing. In general, especially in the presence of potential durable topic subscriptions, it is hard to see how to fix this -- though if it were a common scenario, perhaps some view of the broker's subscription space could be cached near the client, allowing us to throw messages away close to the client; but then we would need to update the cache when (remote) subscribers come online, or risk throwing messages away in error. This is very complex, and only for a gain in a low-probability use case.
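For reference, the watermark change I mention above is a one-line entry in rabbitmq.config (the Erlang-terms configuration file). This is only a sketch -- the file's location, and any other entries you already have, depend on your installation:

    %% rabbitmq.config
    [
      {rabbit, [
        %% trigger the memory alarm at 20% of memory instead of the 40% default
        {vm_memory_high_watermark, 0.2}
      ]}
    ].

The value is a fraction of the memory the broker believes the machine has, so 0.2 on my 2Gb VM arms the alarm at roughly 400Mb.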
I notice that the benchmark runs for a very long time (hours on my little 2Gb, 2-core Linux VM) and produces a noisy graph as a result (there are lots of zeros in it, because RabbitMQ blocks reads quite often). The graph doesn't help us guess at an average rate. I attach the benchmark JSON output so you can view it with generic_report, or something.

The bug fix is rabbit bugzilla 24426, which after QA will probably go into the next release.

Thanks for spotting this,

Steve Powell
steve@rabbitmq.com
[wrk: +44-2380-111-528] [mob: +44-7815-838-558]