[rabbitmq-discuss] Consumer slows down (a lot) after producer exceeds a certain publish rate, why?
Zhibo Wei
zweicmu at gmail.com
Mon Sep 30 10:22:52 BST 2013
Ah, maybe that's why.
I will investigate the CPU usage of each core and split the
messages across several queues to see if that helps.
Thanks a lot, Simon. That's really helpful.
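
A minimal sketch of how the split across several queues could look, assuming
each message carries some key to shard on. The base name "work", the shard
count, and both helper functions are hypothetical; they are not part of
RabbitMQ or any client library. Declaring the queues and publishing to them
is left to whichever client is in use.

```python
import zlib


def all_queue_names(base: str, shards: int) -> list[str]:
    """Return the names of every shard queue, e.g. work.0 .. work.3."""
    if shards < 1:
        raise ValueError("shards must be >= 1")
    return [f"{base}.{i}" for i in range(shards)]


def shard_queue_name(base: str, key: str, shards: int) -> str:
    """Pick one shard queue for a message key.

    CRC32 is used instead of the built-in hash() because hash() of a str
    changes between interpreter runs, while CRC32 stays stable, so every
    publisher process maps the same key to the same queue.
    """
    if shards < 1:
        raise ValueError("shards must be >= 1")
    index = zlib.crc32(key.encode("utf-8")) % shards
    return f"{base}.{index}"


if __name__ == "__main__":
    # Hypothetical usage: 4 queues, so the broker can spread the queue
    # work over up to 4 cores instead of 1.
    print(all_queue_names("work", 4))
    print(shard_queue_name("work", "order-42", 4))
```

When publishing without a named exchange (the default exchange), the returned
queue name would serve as the routing key, and each consumer channel would
subscribe to one of the shard queues. Ordering is then only preserved within
a single shard, not across all messages.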
On Mon, Sep 30, 2013 at 2:16 AM, Simon MacMullen <simon at rabbitmq.com> wrote:
> While you say that CPU usage is fine, bear in mind that a single queue can
> only use one CPU core at once.
>
> It is quite possible that you are simply giving the queue too much work to
> do. (And you're not hitting its limit when testing on your own machine
> because that's faster than a large EC2 instance.)
>
> Unfortunately when a queue is CPU-limited we don't guarantee that it will
> prioritise delivering messages to consumers over accepting messages from
> publishers. We used to, but this can have undesirable consequences in other
> contexts (a large queue which suddenly got consumers could use 100% of its
> CPU draining, thus blocking publishers for potentially a long time). So in
> particular if you have a decent number of publishers it's possible to just
> overwhelm the queue.
>
> If you are able to split your work up across several queues I suspect that
> would help a lot.
>
> Cheers, Simon
>
>
> On 27/09/2013 23:43, Zhibo Wei wrote:
>
>> Here is a very interesting chart.
>>
>> Before 22:10, publisher and consumer were both running in a stable
>> state (13000 msgs/sec);
>> 22:11, I increased the publish rate to 17000 msgs/sec;
>> the deliver rate went down while the publish rate went up;
>> 22:12, I shut down the publisher, the deliver rate shot up and
>> drained all the messages left in the queue in seconds.
>>
>> I know it must be bound by something but I don't know what that is yet.
>> CPU and memory usage are just fine, and I put the queue and consumer on
>> the same machine, so bandwidth is probably not the root cause. Local I/O
>> utilization was low as well. What else could be the root cause? Any hint
>> will be appreciated.
>>
>> By the way, everything looked good when I tested on my local machine
>> (20000+ msgs/sec on both sides without any problem), but when I moved
>> the setup onto EC2, this problem appeared.
>>
>> Thanks,
>> Zhibo
>>
>>
>> On Fri, Sep 27, 2013 at 3:05 PM, Zhibo Wei <zweicmu at gmail.com> wrote:
>>
>> Hi Alvaro,
>>
>> Thanks for pointing this out.
>>
>> But when I checked the connection status in the management plugin,
>> only the publisher connections were in the 'Flow/Blocked' state. All
>> consumer connections were still 'running', but at a very low rate.
>>
>> And, as far as I know, 15000 msgs/sec is still far from the ceiling
>> (based on
>> http://www.rabbitmq.com/blog/2012/04/25/rabbitmq-performance-measurements-part-2/
>> ).
>>
>> Did I miss anything?
>>
>> Thanks,
>> Zhibo
>>
>>
>> On Fri, Sep 27, 2013 at 2:52 PM, Alvaro Videla
>> <videlalvaro at gmail.com> wrote:
>>
>> Hi,
>>
>> Take a look at the flow control mechanism:
>> http://www.rabbitmq.com/memory.html
>>
>> "A per-connection mechanism that prevents messages being
>> published faster than they can be routed to queues." Perhaps
>> your publishers are hitting that.
>>
>> Regards,
>>
>> Alvaro
>>
>>
>> On Fri, Sep 27, 2013 at 2:44 AM, Zhibo Wei <zweicmu at gmail.com> wrote:
>>
>> 3.15 rabbitmq-server
>> 3.15 java-client (consumer)
>> 3.04 erlang (producer)
>> ec2 m1.large
>> 1 durable queue. no exchange.
>> Producer has 5 connections, each of which holds 20 channels.
>> Consumer has only 1 connection, which holds 20 channels,
>> autoack = true.
>>
>> If the publish rate is equal to or below 15,000 msgs/sec, then
>> the consumer can keep up (the queue never grows). However, if
>> the publish rate exceeds 15,000, say 18,000 msgs/sec, then
>> the delivery rate drops to 400~2000 msgs/sec, the queue
>> starts paging and blocks the producer, and then everything crashes.
>>
>> The CPU and memory usage are just fine, but I'm not sure
>> what else could cause such a problem. Bandwidth? Socket buffer
>> size?
>>
>> Has anyone seen this kind of issue before? Any clues? Any other
>> things I should check?
>>
>> Thanks,
>> Zhibo
>>
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>