<div dir="ltr">I noticed something else very odd.<div><br></div><div>Currently, one queue has a backlog of 43,000 messages. But when I look at the exchange (there is only one exchange), I see that the message rate in exactly matches the message rate out.</div>
<div><br></div><div>With such a huge backlog, why would that be? I would have thought that the consumers (16 in total for that queue, distributed across 4 systems, with a prefetch of 100) would run at a much higher steady-state rate.</div>
<div><br></div><div>This exchange also seems to cycle regularly. It appears to swing from a low of around 60/s to a high of 500+/s, both in and out.</div></div><div class="gmail_extra"><br clear="all"><div><div dir="ltr">
<div><span><font color="#888888">-- <br></font></span><div dir="ltr"><font color="#888888"><span><font color="#888888"><font color="#cc0000" face="georgia, serif" style="font-size:small"><b>Mike Templeman<br></b></font><div style="font-family:arial;font-size:small">
<b><font face="georgia, serif">Head of Development</font></b><font face="georgia, serif"><br><br></font></div><div style="font-family:arial;font-size:small"><font color="#cc0000" face="georgia, serif">T: <a href="http://twitter.com/missdestructo" style="color:rgb(17,85,204)" target="_blank" rel="nofollow" link="external">@talkingfrog1950</a></font></div>
<div style="font-family:arial;font-size:small"><font color="#cc0000" face="georgia, serif">T: <a href="http://twitter.com/meshfire" style="color:rgb(17,85,204)" target="_blank" rel="nofollow" link="external">@Meshfire</a></font></div></font></span></font></div>
</div><img src="http://meshfire.wpengine.netdna-cdn.com/wp-content/uploads/2013/07/meshfire_logo_white_bg-140.png"><br></div></div>
<br><br><div class="gmail_quote">On Fri, Dec 13, 2013 at 10:40 AM, Mike Templeman <span dir="ltr"><<a href="/user/SendEmail.jtp?type=node&node=32088&i=0" target="_top" rel="nofollow" link="external">[hidden email]</a>></span> wrote:<br><blockquote style='border-left:2px solid #CCCCCC;padding:0 1em' class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Also, the Connections screen on the <span style="color:rgb(85,85,85);font-family:Verdana,sans-serif;font-size:13px;line-height:18px">web UI shows that flow control has not recently been turned on for any of the four current connections (four app servers).</span> </div>
<div class="gmail_extra"><div class="im">
<br><br></div><div><div class="h5"><div class="gmail_quote">On Fri, Dec 13, 2013 at 10:17 AM, Mike Templeman <span dir="ltr"><<a href="/user/SendEmail.jtp?type=node&node=32088&i=1" target="_top" rel="nofollow" link="external">[hidden email]</a>></span> wrote:<br>
<blockquote style='border-left:2px solid #CCCCCC;padding:0 1em' class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Hi Alvaro<div><br></div><div>I would be more than happy to provide logs, but all they contain is connection and shutdown information, nothing more. I have just enabled tracing on the vhost and will send the logs shortly. We now encounter this issue every day when under load.</div>
<div><br></div><div>Let me describe our architecture and deployment:</div><div><br></div><div>RabbitMQ:</div><div><ul><li>m1.large EC2 instance. Version: <span style="color:rgb(68,68,68);font-family:Verdana,sans-serif;font-size:12px;text-align:right">RabbitMQ 3.1.5, </span><span style="color:rgb(68,68,68);font-family:Verdana,sans-serif;font-size:12px;text-align:right"> </span><acronym title="Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:30] [kernel-poll:true]" style="color:inherit;font-family:Verdana,sans-serif;font-size:12px;text-align:right;background-image:none;padding:0px;border-top-left-radius:2px;border-top-right-radius:2px;border-bottom-right-radius:2px;border-bottom-left-radius:2px;border-style:none none dotted;border-bottom-width:1px">Erlang R14B04</acronym><br>
</li><li><span style="color:inherit;font-family:Verdana,sans-serif;font-size:12px;text-align:right">23 queues (transaction and direct)</span><br></li><li>3 exchanges used: two fanout and one topic<br></li><li>Topic exchange overview is attached.<br></li><li>46 total channels.<br></li></ul></div><div><br></div><div>AppServers:</div><div><ul><li>m1.large Tomcat servers running a Grails application<br></li><li>2-7 servers at any one time.<br>
</li><li>Consume + publish<br></li><li>On busy queues, each server has 16 consumers with prefetch at 100</li><li>Message sizes on busy queues are ~4 KB.</li><li>Publishing rate on the busiest queue ranges from 16/s to &gt;100/s. (We need to be able to support 1000/s.)</li>
</ul></div><div><br></div><div><acronym title="Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:30] [kernel-poll:true]" style="color:inherit;font-family:Verdana,sans-serif;font-size:12px;text-align:right;background-image:none;padding:0px;border-top-left-radius:2px;border-top-right-radius:2px;border-bottom-right-radius:2px;border-bottom-left-radius:2px;border-style:none none dotted;border-bottom-width:1px">Each AppServer connects to a sharded MongoDB cluster of 3 shards. Our first suspicion was that something in Mongo or AWS was causing the periodic delay, but AWS techs looked into our volume use and said we were only using 25% of available bandwidth.</acronym></div>
<div><acronym title="Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:30] [kernel-poll:true]" style="color:inherit;font-family:Verdana,sans-serif;font-size:12px;text-align:right;background-image:none;padding:0px;border-top-left-radius:2px;border-top-right-radius:2px;border-bottom-right-radius:2px;border-bottom-left-radius:2px;border-style:none none dotted;border-bottom-width:1px"><br>
</acronym></div><div style="text-align:left"><font face="Verdana, sans-serif"><span style="font-size:12px">At this moment, we have a modest publish rate (~50-60/s) but a backlog of 50,000 messages for the queue "user". The 10-minute snapshot of the queue shows the cycling.</span></font></div>
<div style="text-align:left"><br></div><div style="text-align:left"><font face="Verdana, sans-serif"><span style="font-size:12px">I turned on tracing, but the results don't seem to be coming into the log. Is there another way to enable reporting of flow control?</span></font></div>
<span><font color="#888888">
<div style="text-align:left"><font face="Verdana, sans-serif"><span style="font-size:12px"><br></span></font></div><div style="text-align:left"><font face="Verdana, sans-serif"><span style="font-size:12px">Mike Templeman</span></font></div>
<div style="text-align:left"><font face="Verdana, sans-serif"><span style="font-size:12px"><br></span></font></div></font></span></div><div class="gmail_extra"><span><font color="#888888"><br clear="all"><div>
</div></font></span><div><div>
<br><br><div class="gmail_quote">On Fri, Dec 13, 2013 at 6:03 AM, Alvaro Videla-2 [via RabbitMQ] <span dir="ltr"><<a href="/user/SendEmail.jtp?type=node&node=32088&i=2" target="_top" rel="nofollow" link="external">[hidden email]</a>></span> wrote:<br>
<blockquote style='border-left:2px solid #CCCCCC;padding:0 1em' class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Mike,
<br><br>Would you be able to provide more information to help us
<br>debug the problem?
<br><br>Tim (from the rabbitmq team) requested more info in order to try to
<br>find answers for this.
<br><br>For example, when consumption drops to zero, are there any logs on the
<br>rabbitmq server that might tell of a flow control mechanism being
<br>activated?
<br><br>Regards,
<br><br>Alvaro
<br><br><br>On Fri, Dec 13, 2013 at 2:19 AM, MikeTempleman <<a href="http://user/SendEmail.jtp?type=node&node=32063&i=0" rel="nofollow" link="external" target="_blank">[hidden email]</a>> wrote:
<div><div class='shrinkable-quote'><br>> Tyson
<br>>
<br>> Did you ever find an answer to this question? We are encountering virtually
<br>> the exact same problem.
<br>>
<br>> We have a variable number of servers set up as producers and consumers and
<br>> see our throughput drop to zero on a periodic basis. This is most severe
<br>> when there are a few hundred thousand messages on rabbit.
<br>>
<br>> Did you just drop Rabbit? Ours is running on an m1.large instance with RAID0
<br>> ephemeral drives, so size and performance of the disk subsystem is not an
<br>> issue (we are still in beta). We have spent untold hours tuning our sharded
<br>> mongodb subsystem only to find out that it is only being 25% utilized (at
<br>> least it will be blazing fast if we ever figure this out).
<br>>
<br>>
<br>>
<br>>
<br>>
<br>> --
<br>> View this message in context: <a href="http://rabbitmq.1065348.n5.nabble.com/Lower-delivery-rate-than-publish-rate-why-tp29247p32040.html" rel="nofollow" link="external" target="_blank">http://rabbitmq.1065348.n5.nabble.com/Lower-delivery-rate-than-publish-rate-why-tp29247p32040.html</a></div>
> Sent from the RabbitMQ mailing list archive at Nabble.com.
<br>> _______________________________________________
<br>> rabbitmq-discuss mailing list
<br>> <a href="http://user/SendEmail.jtp?type=node&node=32063&i=1" rel="nofollow" link="external" target="_blank">[hidden email]</a>
<br>> <a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" rel="nofollow" link="external" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a></div>
<br>
<br>
</blockquote></div><br></div></div></div>
</blockquote></div><br></div></div></div>
</blockquote></div><br></div>
<br/><hr align="left" width="300" />
View this message in context: <a href="http://rabbitmq.1065348.n5.nabble.com/Lower-delivery-rate-than-publish-rate-why-tp29247p32088.html">Re: Lower delivery rate than publish rate - why?</a><br/>
Sent from the <a href="http://rabbitmq.1065348.n5.nabble.com/">RabbitMQ mailing list archive</a> at Nabble.com.<br/>