<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>On 15 Sep 2010, at 16:11, Marek Majkowski wrote:</div><div><div><div><br class="Apple-interchange-newline"><blockquote type="cite"><div>On Tue, Sep 14, 2010 at 09:07, <a href="mailto:romary.kremer@gmail.com">romary.kremer@gmail.com</a><br><<a href="mailto:romary.kremer@gmail.com">romary.kremer@gmail.com</a>> wrote:<br><blockquote type="cite"><blockquote type="cite">The flow control was heavily modified between 1.8.1 and 2.0.0. In summary:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- 1.8.1 - we sent a Channel.flow AMQP message to everyone once<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"> rabbit reached the memory limit<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- 2.0.0 - once we reach the memory limit, the connections from which we hear<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"> publishes are stopped temporarily. We stop receiving bytes from tcp<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">sockets.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"> That 'stop' shouldn't take too long, as data should be swapped out to<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">disk<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"> and memory pressure will drop pretty quickly.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Do you mean that in 2.0.0 the Channel.flow AMQP message is no longer sent to<br></blockquote><blockquote type="cite">the producers that are stopped temporarily ? 
So that would explain why<br></blockquote><blockquote type="cite"> 1) Channel.publish() can be blocking on the client side when the<br></blockquote><blockquote type="cite">broker stops<br></blockquote><blockquote type="cite"> reading from the socket !<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"> 2) FlowListener.handleFlow() is no longer invoked on the registered<br></blockquote><blockquote type="cite">listener when<br></blockquote><blockquote type="cite"> the alarm handler is set or cleared<br></blockquote><blockquote type="cite">Are my deductions right ?<br></blockquote><br>Yes. You will never hear "FlowListener.handleFlow()" and it may be possible for<br>channel.publish to block (though I would need to consult the sources<br>to be sure).<br></div></blockquote><br><div><div>It seems to me that the FlowListener interface is likely to be deprecated then, isn't it ?</div><div>It does not really matter for us anyway, because we were on the wrong track using it.</div><div>Does this new implementation keep the broker compliant with the specification, then ?</div></div><br><blockquote type="cite"><div><br><blockquote type="cite">Do you have any figures to quantify "shouldn't take too long" ? Are there<br></blockquote><blockquote type="cite">some<br></blockquote><blockquote type="cite">test reports available about that major change ?<br></blockquote><br>That's the question we really avoided :) Oh, well. No, we haven't done any<br>'real' tests, it's only based on our intuition and experience. In most<br>cases the blocking goes away pretty quickly - after 30 seconds usually,<br>about two minutes sometimes.<br></div></blockquote><br><div>This would be acceptable for our needs, but only if we can somehow guarantee that this is an upper bound ! 
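Since the broker itself gives no such guarantee, the only upper bound we can enforce is on our own side. Below is a minimal sketch of the idea in Java: run the publish on a worker thread and cap how long we wait for it. Note that `doPublish()` is only a stand-in for the real `channel.basicPublish(...)` call (simulated here with a short sleep), so this is a pattern sketch, not RabbitMQ API code.

```java
import java.util.concurrent.*;

public class TimedPublish {

    // Stand-in for channel.basicPublish(...), which may block indefinitely
    // when the broker stops reading from the socket under memory pressure.
    // Simulated here by a short sleep.
    static void doPublish() throws InterruptedException {
        Thread.sleep(50);
    }

    // Run the publish on a worker thread and wait at most timeoutMs for it,
    // so the caller keeps control even if the connection is blocked.
    static boolean publishWithTimeout(long timeoutMs) throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Callable<Void> task = () -> { doPublish(); return null; };
            Future<Void> f = exec.submit(task);
            try {
                f.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;               // publish completed in time
            } catch (TimeoutException e) {
                f.cancel(true);            // give up: connection may be blocked
                return false;
            }
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(publishWithTimeout(500)); // ample budget -> true
        System.out.println(publishWithTimeout(10));  // too short -> false
    }
}
```

A timed-out attempt still leaves the message unsent (and possibly half-written on a blocked connection), so on timeout the connection should probably be closed and re-opened rather than reused.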
</div><blockquote type="cite"><div><br>But it is possible to create a very pessimistic environment in which the<br>memory usage will not drop - and the connection could be stuck for a long time.<br>(though it's pretty unlikely).<br></div></blockquote><br><div>... Not that unlikely, considering my experiments with the MultiCastMain sample (see my previous reply about it for details).</div><div>I get a blocked connection 100 % of the time.</div><div>What would be, based on your knowledge and your intuition, "a very pessimistic environment in which the memory usage will not drop" ?</div><div><br></div><div>I think the experiments I've done with MultiCastMain may be the beginning of an answer to that question, although I would</div><div>never have thought that a single producer could flood the broker like that.</div><br><blockquote type="cite"><div><br><blockquote type="cite">Sorry if I wasn't clear in the previous post, we are already on 2.0.0 for<br></blockquote><blockquote type="cite">both broker and<br></blockquote><blockquote type="cite">client library.<br></blockquote><br>Good.<br><br><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">It looks like not every listener is called back when the alarm handler is set<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">or<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">cleared, while the producers are still paused / resumed<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">like they are supposed to be.<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Interesting. Maybe we have a race there? 
Or maybe you're blocking<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">the main java client thread? (nothing blocking should be done from<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">the main thread)<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I am quite sure I am not blocking the main thread, nor the Connection<br></blockquote><blockquote type="cite">thread. All<br></blockquote><blockquote type="cite">the message-related logic is in a dedicated thread (some kind of<br></blockquote><blockquote type="cite">ProducerGroup<br></blockquote><blockquote type="cite">thread pool, actually).<br></blockquote><blockquote type="cite">Consumer callbacks are running within the Connection thread, if I refer to<br></blockquote><blockquote type="cite">the Javadoc !<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">With the same code using library version 1.8.1, the callbacks were invoked<br></blockquote><blockquote type="cite">when<br></blockquote><blockquote type="cite">the alarm handler was set or cleared anyway.<br></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">during long-running tests, we have encountered strange behaviour due to<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">flow control :<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">The queue depth starts to increase linearly for about 2 hours; this is<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">consistent, since the message throughput of the single consumer<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">is not enough to absorb the message ingress. Memory occupation grows faster<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">as<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">well, until the memory watermark is reached on the broker side.<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Are you sure your consumer is ACK-ing the messages it receives?<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">The consumer callback does ACK messages upon reception, one at a time<br></blockquote><blockquote type="cite">(multiple == false).<br></blockquote><blockquote type="cite">Is the basic.ack() method eligible to be blocked, like publish(),<br></blockquote><blockquote type="cite">under flow control ?<br></blockquote><br>Well, under the current implementation of flow control - yes, as it's the whole<br>tcp/ip connection that gets blocked. It will affect any commands, including<br>basic.ack.</div></blockquote><blockquote type="cite"><div><br>What we usually propose is to use one tcp/ip connection for receiving<br>and a different one for publishing. 
On memory pressure we only block the publishers.<br>Using a separate connection only for receiving, you may be sure it will<br>never be blocked.<br></div></blockquote><div><br></div><div>Weren't Channels designed for that ? In our environment, we have (naively ?) considered using Channels to </div><div>separate message production from consumption.</div><div>Since we are targeting 10 000 peers doing both production and consumption, doubling the number of</div><div>connections is not negligible at all in terms of scalability.</div><div>Moreover, as I reported later on, we use SSL to authenticate the broker, and we are still unclear about memory leaks</div><div>induced by SSL connections. Doubling the number of connections will not be negligible in terms of memory occupation either.</div><div>In conclusion, we are not likely to implement our peers using 2 connections to the same broker.</div><div>What would you recommend to us then ? And could you give us a better understanding of the use cases for channels ?</div><br><blockquote type="cite"><div><br><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">From that point, the producers are indeed paused, as a flow control<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">request<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">has been issued by the broker, but the consumer seems to be blocked<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">as well. 
The queue level flattens at its top value until the end of<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">the<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">test, even when memory occupation dropped below the threshold.<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">That's how 1.8.1 behaves. In 2.0.0 we introduced swapping out big queues<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">to disk, so the memory usage shouldn't be dependent on queue size.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Good news, because we had identified 2 scenarios in which memory-based channel<br></blockquote><blockquote type="cite">flow<br></blockquote><blockquote type="cite">was triggered :<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"> - the use of SSL<br></blockquote><blockquote type="cite"> - the use of larger messages (4kb, same ingress)<br></blockquote><blockquote type="cite">Now I hope that the message size will not be such a determining factor for flow<br></blockquote><blockquote type="cite">control, as long<br></blockquote><blockquote type="cite">as consumers are able to handle these messages regularly.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">By registering the FlowListener callback, we have noticed that not all<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">of<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">the producers are notified all the time the alarm handler is set.<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">Does this mean that the broker applies some heuristic to try not to<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">block<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">everybody every time ?<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">Or does it mean that some of the channels have been somehow blacklisted<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">by<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">the broker ?<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">No, in 1.8.1 the broker should send 'channel.flow' to all the channels.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Strange, there must be something very weird then.<br></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">Could anybody explain how the blocking of consumers is assumed to 
be<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">implemented ?<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">The best description is probably here:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"> <a href="http://www.rabbitmq.com/extensions.html#memsup">http://www.rabbitmq.com/extensions.html#memsup</a><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">But it covers 2.0.0. I'd suggest an upgrade to 2.0.0 and monitoring<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">not only queue size but also the number of unacknowledged messages<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">('Msg unack' in the status plugin). This number should be near zero.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite">We are already on 2.0.0.<br></blockquote><blockquote type="cite">Where can I find some documentation about the status plugin anyway ?<br></blockquote><br>I'm afraid the old blog post is the only source:<br><a href="http://www.lshift.net/blog/2009/11/30/introducing-rabbitmq-status-plugin">http://www.lshift.net/blog/2009/11/30/introducing-rabbitmq-status-plugin</a><br></div></blockquote><br></div><div>No worries, it was really straightforward to install after all. 
For those who run into issues, just</div><div>go to <b>/usr/lib/rabbitmq/lib/rabbitmq_server-2.0.0/plugins</b> then get </div><div><span class="Apple-tab-span" style="white-space: pre; ">        </span>-mochiweb-2.0.0.ez</div><div><span class="Apple-tab-span" style="white-space: pre; ">        </span>-rabbitmq-mochiweb-2.0.0.ez </div><div><span class="Apple-tab-span" style="white-space: pre; ">        </span>-rabbit_status-2.0.0.ez </div></div></div><div>from <a href="http://www.rabbitmq.com/releases/plugins/v2.0.0/">here</a>, and voilà !</div><div><br></div><div>B.R,</div><div><br></div><div>Romary.</div></body></html>