[rabbitmq-discuss] Lower delivery rate than publish rate - why?
sergey at openbridge.com
Wed Sep 4 19:21:35 BST 2013
I would also like to hear the answer to this question, because my situation
is almost identical. We suspected a network problem and are about to
move to a bigger instance type, just as a test.
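
For anyone wanting to put numbers on the gap before changing hardware, here is a rough sketch of the backlog arithmetic. It assumes constant rates, which real traffic will not have. The 600 messages/second figure is the approximate ceiling from the quoted message below; the 800 messages/second publish rate is a made-up value for illustration, and both function names are hypothetical.

```python
def backlog_after(publish_rate, delivery_rate, seconds, initial_backlog=0):
    """Queue depth after `seconds` at constant rates (never below zero)."""
    growth_per_second = publish_rate - delivery_rate
    return max(0, initial_backlog + growth_per_second * seconds)


def seconds_to_drain(backlog, publish_rate, delivery_rate):
    """Seconds until the backlog is empty, or None if it never drains."""
    net_drain = delivery_rate - publish_rate
    if net_drain <= 0:
        return None  # consumers are not outpacing publishers
    return backlog / net_drain


# Hypothetical: publishing at 800/s against a 600/s delivery ceiling for an hour.
backlog = backlog_after(publish_rate=800, delivery_rate=600, seconds=3600)

# With publishing paused, the same ceiling determines how long draining takes.
drain_time = seconds_to_drain(backlog, publish_rate=0, delivery_rate=600)
```

If the measured drain time with publishing paused matches this estimate, the delivery rate really is capped independently of the publish rate, which is what the quoted chart seems to show.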
On Saturday, August 31, 2013 7:30:24 PM UTC-4, Tyson Stewart wrote:
>
> Hello everyone!
>
> We've been experiencing some behavior that I don't understand, and none of
> my searching or documentation-reading has been fruitful, so I'm here to ask
> you all for expert knowledge.
>
> Broadly, we're seeing a lower delivery rate than publish rate. I've
> attached an image to this message that shows how we're able to keep up when
> the publish rate is less than 600 messages/second, but above that,
> consumption falls behind publication. Around 16:00 on that chart, we
> doubled the number of consumers, and it made no difference that we could
> tell. The erratic publish rate comes from us turning off publishing to
> the most active queue because we were falling far enough behind that we
> became worried. When the backlog got low enough, we turned it
> back on, and we did that a few times.
>
> Here are some vitals to our cluster:
>
> - 2 nodes
> - Each node is an m1.xlarge instance hosted in EC2
> - We have 133 queues in the cluster (see note below)
> - All queues are mirrored (they all use a policy that makes them
>   highly available)
> - All queues are durable; we use AWS provisioned IOPS to guarantee
>   enough throughput
> - We only use the direct exchange
>
> Regarding the number of queues, there are four kinds: the "main" queues,
> retry-a queues, retry-b queues, and poison queues. Messages that fail for
> whatever reason during consumption will get put into the retry queues, and
> if they fail long enough, they'll wind up in the poison queue where they
> will stay until we do something with them manually much later. The main
> queues then see the majority of activity.
>
> The average message size is less than 1MB. At nearly one million messages,
> we were still under 1GB of memory usage, and our high watermark is 5.9GB.
>
> Disk IOPS don't appear to be the problem. Metrics indicated we still had
> plenty of headroom. Furthermore, if IOPS were the limitation, I would have
> expected the delivery rate to increase as the publish rate decreased while
> the consumers worked through the queue. It did not, however, as shown on
> the chart.
>
> My question primarily is: *What do you think is limiting our consumption
> rate?* I'm curious about what affects consumption rate in general,
> though. Any advice would be appreciated at this point. Questions for
> clarification are also welcome!
>
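
For reference, the four kinds of queue described in the quoted message can be sketched as a simple routing decision. This is only an illustration: the quoted message does not say how a message moves between the retry queues or when it is declared poison, so the failure-count thresholds and the `next_queue` helper below are assumptions, not the actual setup.

```python
MAIN, RETRY_A, RETRY_B, POISON = "main", "retry-a", "retry-b", "poison"

# Hypothetical limits; the original message gives no actual values.
RETRY_A_LIMIT = 3  # failures handled by the retry-a queue
RETRY_B_LIMIT = 6  # failures handled by the retry-b queue


def next_queue(failures: int) -> str:
    """Pick the queue kind for a message that has failed `failures` times."""
    if failures <= 0:
        return MAIN
    if failures <= RETRY_A_LIMIT:
        return RETRY_A
    if failures <= RETRY_B_LIMIT:
        return RETRY_B
    # Failed "long enough": park it until someone handles it manually.
    return POISON


# Queue kind after each failure count from 0 through 7.
path = [next_queue(n) for n in range(8)]
```

With a layout like this, most traffic stays on the main queues, which matches the description that they see the majority of the activity.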