[rabbitmq-discuss] Durability and consumer acknowledgement extremely slow

Thu Jun 6 13:43:30 BST 2013

We may have encountered the same issue, this time not on Amazon but on 
simple VMware instances. We are using the .NET client. This would indeed 
suggest that the cause is some behavioral aspect of the node. It would be 
very helpful if someone could explain what is happening here.

Regards,
Joost

On Wednesday, April 24, 2013 6:26:42 PM UTC+2, Karl Rieb wrote:
>
> Hi,
>
> I am trying to improve the message throughput for a RabbitMQ queue in an 
> Amazon cloud instance and am noticing a *significant* drop in performance 
> when enabling acknowledgements for consumers of a durable queue (with 
> persisted messages).  The real problem is that the bottleneck appears to be 
> on the rabbit node and not with the consumers, so adding more consumers 
> does not improve the throughput (or help drain the queue any quicker).  As 
> a matter of fact, adding new consumers will just slow down existing 
> consumers so everyone ends up consuming at a slower rate, preventing 
> overall throughput from changing.
>
> Trying to do batch acknowledgements using the Multiple flag helps a bit 
> (8k msgs/s vs 5.5k msgs/s) but not much compared to the initial drop.  It 
> is only when I turn on *auto_ack* for the consumers that I see the 
> performance shoot *way* back up and when I start seeing a linear increase 
> in throughput as I add more consumers.
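
For reference, the batch acknowledgement approach described above (the Multiple flag) might look roughly like the sketch below. This is not the code Karl ran: the `BatchAcker` class and its `batch_size` parameter are hypothetical names introduced here for illustration. Only the `basic_ack(delivery_tag=..., multiple=...)` call is taken from the consumer code quoted further down.

```python
class BatchAcker(object):
    """Hypothetical helper: acknowledge deliveries in batches."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size  # assumed value, tune as needed
        self.unacked = 0              # deliveries seen since the last ack

    def on_message(self, chan, method, properties, body):
        # Count the delivery; only ack once a full batch has arrived.
        self.unacked += 1
        if self.unacked >= self.batch_size:
            # multiple=True acknowledges this delivery tag and every
            # earlier unacknowledged tag on the same channel.
            chan.basic_ack(delivery_tag=method.delivery_tag, multiple=True)
            self.unacked = 0
```

It would be registered in place of `callback`, e.g. by passing `BatchAcker(100).on_message` to `basic_consume`. The batch size should stay at or below the prefetch count (1000 in the quoted consumer), otherwise the broker stops delivering before a batch ever fills. Deliveries in a partial batch stay unacknowledged until the batch completes.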
>
> Is this expected behavior?  Is there a way to configure the rabbit node so 
> it doesn't hit this bottleneck with acknowledgements?
>
> Here is the sample code I'm using to test the throughput:
>
> Publisher:
>
> #!/usr/bin/python
>
> import pika
>
> creds = pika.PlainCredentials('guest','guest')
> conn  = pika.BlockingConnection(
>     pika.ConnectionParameters(host='10.10.1.123', credentials=creds))
> chan  = conn.channel()
>
> while True:
>     chan.basic_publish(exchange='simple_exchange',
>                        routing_key='simple_queue',
>                        body='',
>                        properties=pika.BasicProperties(delivery_mode=2))
>
>
> Consumer:
>
> #!/usr/bin/python
>
> import pika
>
> def callback(chan, method, properties, body):
>     chan.basic_ack(delivery_tag=method.delivery_tag, multiple=False)
>
> creds = pika.PlainCredentials('guest','guest')
> conn  = pika.BlockingConnection(
>     pika.ConnectionParameters(host='10.10.1.123', credentials=creds))
> chan  = conn.channel()
>
> chan.basic_consume(callback, queue='simple_queue', no_ack=False)
> chan.basic_qos(prefetch_count=1000)
> chan.start_consuming()
>
>
> I spawn multiple processes for the producers and multiple for the consumers 
> (so there are no Python interpreter locking issues, since each runs in its 
> own interpreter instance).  I'm using an Amazon *c1.xlarge* (8 virtual 
> cores and "high" IO) Ubuntu 12.04 LTS instance with RabbitMQ version 
> 3.0.4-1 and an Amazon ephemeral disk (in production we would use an EBS 
> volume instead).  The queue is marked *Durable* and my messages all use 
> *delivery_mode* 2 (persist).
>
> Below are the performance numbers.  For each test I use 2 publisher 
> processes and 6 consumer processes (where 3 different machines host 2 
> consumers each).  The producers and consumers are all on *separate* 
> machines from the rabbit node.  Throughput measurements were done using the 
> RabbitMQ management UI and the Linux utility top.  Python was compiled to 
> pyc files before running.
>
> *no_ack = True:*  
>     rate = 24,000/s 
>     single consumer CPU   =  65% 
>     single publisher CPU  =  80% (flow control enabled and being enforced)
>     (beam.smp) rabbit CPU = 400% (of 800%, 8 cores) -> 0.0%wa 11.5%sy
>
> *no_ack = False (manual acks per message):*
>     rate =  5,500/s
>     single consumer CPU   =  20%
>     single publisher CPU  =  20% (flow control enabled and being enforced)
>     (beam.smp) rabbit CPU = 300% (of 800%, 8 cores) -> 4.5%wa 10.0%sy
>     
> The most notable difference besides the throughput is the I/O wait when 
> ACKs are enabled (4.5% vs 0.0%).  This leads me to believe that the rabbit 
> node is being bottlenecked by performing I/O operations for ACK 
> bookkeeping.  The I/O doesn't appear to be a problem for persisting the 
> published messages since I'm *guessing* that rabbit is buffering those 
> and syncing them to disk in batches.  Does this mean the acknowledgements 
> are not also being buffered before being synced to disk?  Can I configure 
> the rabbit node to change this behavior to help speed up the 
> acknowledgements?  I'm not using transactions in the example code above, so 
> I don't need any strict guarantees that ACKs were written to disk before 
> returning.
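
To put the reported rates side by side, here is a small calculation using only the figures from the message above (24,000/s with no_ack, 5,500/s with per-message acks, and roughly 8k/s with batched acks). The variable and function names are made up for illustration, and the 8k figure is approximate as reported.

```python
# Rates reported in the quoted message, in messages per second.
rate_no_ack = 24000.0      # no_ack = True
rate_per_msg_ack = 5500.0  # one basic_ack per message
rate_batch_ack = 8000.0    # acks batched with the Multiple flag (approx.)


def fraction_of_baseline(rate, baseline=rate_no_ack):
    """Return rate as a fraction of the no-ack baseline."""
    return rate / baseline


per_msg = fraction_of_baseline(rate_per_msg_ack)
batch = fraction_of_baseline(rate_batch_ack)
batch_gain = rate_batch_ack / rate_per_msg_ack

print("per-message acks: %.0f%% of no-ack rate" % (per_msg * 100))
print("batched acks:     %.0f%% of no-ack rate" % (batch * 100))
print("batched vs per-message: %.2fx" % batch_gain)
```

In other words, per-message acks retain a bit under a quarter of the no-ack throughput, and batching recovers only about a third of it, which matches the observation that batching "helps a bit" but does not remove the bottleneck.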
>
> Thanks,
> Karl
>
> P.S. I wrote the same sample consumer code in Ruby to see if there was a 
> difference (in case there was a Python issue), but the numbers were about 
> the same.
>