[rabbitmq-discuss] Durability and consumer acknowledgement extremely slow

Karl Rieb karl.rieb at gmail.com
Fri Apr 26 14:21:39 BST 2013


(Replying again, but using Reply-All to ensure the rabbitmq-discuss list sees
my response)

Hi Simon,

Thanks a lot for your response.  Okay, I just wanted to make sure I didn't
have something misconfigured.  If the throughput I'm seeing is considered
"normal" given the type of machines I'm running on, then that is a huge
help to me.  I had been wondering if those numbers were considered good,
bad, etc.  Thanks!




On Thu, Apr 25, 2013 at 6:59 AM, Simon MacMullen <simon at rabbitmq.com> wrote:

> Hi Karl. I suspect you are not really seeing a bottleneck with
> acknowledgements, but rather an optimisation in autoack mode. When you
> publish a persistent message to an empty queue with a non-blocked autoack
> consumer, RabbitMQ will not persist the message to disc - there's no point.
> The message can go straight to the consumer, and then it's gone; it can
> never be requeued.
>
> So I suspect that's the difference you're seeing. And I'm afraid 5-8k
> msg/s is roughly what I would expect for persistent messages on a
> reasonable machine.
>
> Cheers, Simon
>
> On 24/04/13 17:26, Karl Rieb wrote:
>
>> Hi,
>>
>> I am trying to improve the message throughput for a RabbitMQ queue on an
>> Amazon cloud instance and am noticing a *significant* drop in
>> performance when enabling acknowledgements for consumers of a durable
>> queue (with persistent messages).  The real problem is that the
>> bottleneck appears to be on the rabbit node and not with the consumers,
>> so adding more consumers does not improve the throughput (or help drain
>> the queue any quicker).  As a matter of fact, adding new consumers just
>> slows down the existing consumers, so everyone ends up consuming at a
>> slower rate and the overall throughput does not change.
>>
>> Trying to do batch acknowledgements using the Multiple flag helps a bit
>> (8k msgs/s vs 5.5k msgs/s), but not much compared to the initial drop.
>> It is only when I turn on *auto_ack* for the consumers that I see the
>> performance shoot *way* back up and start seeing a linear increase in
>> throughput as I add more consumers.
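>>
>> (A minimal sketch of how the Multiple-flag test batched acks - the
>> BATCH size of 100 is an assumption for illustration; only the callback
>> in the consumer code below changes:)
>>
>>     # Hypothetical batch-acking callback: ack once every BATCH messages,
>>     # with multiple=True covering everything up to this delivery tag.
>>     # BATCH must stay below prefetch_count or the consumer will stall.
>>     BATCH = 100
>>     outstanding = 0
>>
>>     def callback(chan, method, properties, body):
>>         global outstanding
>>         outstanding += 1
>>         if outstanding >= BATCH:
>>             chan.basic_ack(delivery_tag=method.delivery_tag, multiple=True)
>>             outstanding = 0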
>>
>> Is this expected behavior?  Is there a way to configure the rabbit node
>> so it doesn't hit this bottleneck with acknowledgements?
>>
>> Here is the sample code I'm using to test the throughput:
>>
>> Publisher:
>>
>>     #!/usr/bin/python
>>
>>     import pika
>>
>>     creds = pika.PlainCredentials('guest', 'guest')
>>     conn  = pika.BlockingConnection(
>>         pika.ConnectionParameters(host='10.10.1.123', credentials=creds))
>>     chan  = conn.channel()
>>
>>     # Assumes 'simple_exchange' and 'simple_queue' already exist and are
>>     # bound to each other.
>>     while True:
>>         chan.basic_publish(exchange='simple_exchange',
>>                            routing_key='simple_queue',
>>                            body='',
>>                            properties=pika.BasicProperties(delivery_mode=2))
>>
>>
>> Consumer:
>>
>>     #!/usr/bin/python
>>
>>     import pika
>>
>>     def callback(chan, method, properties, body):
>>         chan.basic_ack(delivery_tag=method.delivery_tag, multiple=False)
>>
>>     creds = pika.PlainCredentials('guest', 'guest')
>>     conn  = pika.BlockingConnection(
>>         pika.ConnectionParameters(host='10.10.1.123', credentials=creds))
>>     chan  = conn.channel()
>>
>>     # Set QoS before starting the consumer so the prefetch limit is in
>>     # effect for it.
>>     chan.basic_qos(prefetch_count=1000)
>>     chan.basic_consume(callback, queue='simple_queue', no_ack=False)
>>     chan.start_consuming()
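>>
>> (For the *no_ack = True* runs below, the consumer is the same except
>> that auto-ack is enabled and the callback does nothing.  A minimal
>> sketch of that variant:)
>>
>>     #!/usr/bin/python
>>
>>     import pika
>>
>>     # With no_ack=True, RabbitMQ treats a message as settled as soon as
>>     # it is delivered, so the callback never calls basic_ack.  basic_qos
>>     # is omitted because prefetch limits only apply to unacked messages.
>>     def callback(chan, method, properties, body):
>>         pass
>>
>>     creds = pika.PlainCredentials('guest', 'guest')
>>     conn  = pika.BlockingConnection(
>>         pika.ConnectionParameters(host='10.10.1.123', credentials=creds))
>>     chan  = conn.channel()
>>
>>     chan.basic_consume(callback, queue='simple_queue', no_ack=True)
>>     chan.start_consuming()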
>>
>>
>> I spawn multiple processes for the producers and multiple for the
>> consumers (so there are no Python interpreter locking issues, since
>> each runs in its own interpreter instance).  I'm using an Amazon
>> *c1.xlarge* (8 virtual cores and "high" I/O) Ubuntu 12.04 LTS instance
>> with RabbitMQ version 3.0.4-1 and an Amazon ephemeral disk (in
>> production we would use an EBS volume instead).  The queue is marked
>> *Durable* and my messages all use *delivery_mode* 2 (persistent).
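>>
>> (The producer and consumer processes were launched roughly like this -
>> a sketch using the subprocess module; the script name is an
>> assumption:)
>>
>>     #!/usr/bin/python
>>
>>     import subprocess
>>
>>     # Each consumer runs in its own interpreter process, so one
>>     # process's GIL cannot throttle the others.
>>     procs = [subprocess.Popen(['./consumer.py']) for _ in range(2)]
>>     for p in procs:
>>         p.wait()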
>>
>> Below are the performance numbers.  For each test I use 2 publisher
>> processes and 6 consumer processes (3 different machines host 2
>> consumers each).  The producers and consumers are all on *separate*
>> machines from the rabbit node.  Throughput was measured with the
>> RabbitMQ management UI and the Linux utility top.  Python was compiled
>> to .pyc files before running.
>>
>> *no_ack = True:*
>>      rate = 24,000/s
>>      single consumer CPU   =  65%
>>      single publisher CPU  =  80% (flow control enabled and being
>> enforced)
>>      (beam.smp) rabbit CPU = 400% (of 800%, 8 cores) -> 0.0%wa 11.5%sy
>>
>> *no_ack = False (manual acks per message):*
>>      rate =  5,500/s
>>      single consumer CPU   =  20%
>>      single publisher CPU  =  20% (flow control enabled and being
>> enforced)
>>      (beam.smp) rabbit CPU = 300% (of 800%, 8 cores) -> 4.5%wa 10.0%sy
>> The most notable difference besides the throughput is the I/O wait
>> when ACKs are enabled (4.5% vs 0.0%).  This leads me to believe the
>> rabbit node is bottlenecked by the I/O it performs for ACK
>> bookkeeping.  The I/O doesn't appear to be a problem for persisting
>> the published messages, since I'm *guessing* that rabbit buffers those
>> and syncs them to disk in batches.  Does this mean the
>> acknowledgements are not also being buffered before being synced to
>> disk?  Can I configure the rabbit node to change this behavior and
>> speed up the acknowledgements?  I'm not using transactions in the
>> example code above, so I don't need any strict guarantee that ACKs
>> were written to disk before returning.
>>
>> Thanks,
>> Karl
>>
>> P.S. I wrote the same sample consumer code in Ruby to see if there was a
>> difference (in case there was a Python issue), but the numbers were
>> about the same.
>>
>>
>
> --
> Simon MacMullen
> RabbitMQ, VMware
>