[rabbitmq-discuss] Occasional slow message on a local machine
Brennan Sellner
bsellner at seegrid.com
Fri Sep 14 21:07:54 BST 2012
On 09/10/2012 03:52 PM, Matthias Radestock wrote:
> On 10/09/12 20:42, Brennan Sellner wrote:
>> Well, there goes that theory. Thanks for the info!
>>
>> I'll see if I can reproduce this in a more controlled environment. If
>> anyone has further ideas in the meantime, I'm all ears.
>
> Set up some syscall tracing on the rabbit process to see what it's up to
> when you encounter the delay. That would reveal any unexpected fsync's,
> for example.
Short version
-------------
It appears that cleaning up a queue (e.g. auto-delete on channel close)
requires Rabbit to fsync, even if the queue is non-durable, contains
only non-persistent messages, and is bound to a non-durable exchange. A
portion of our system is still using a query-response architecture, so
short-lived queues are quite common. Is there any way to declare a
queue such that Rabbit doesn't touch the disk at any point in the
queue's life cycle?
Long version / background
-------------------------
All righty, I think I have a root cause. In a previous thread
(http://rabbitmq.1065348.n5.nabble.com/librabbitmq-c-and-amqp-channel-close-tt20334.html),
we reported that amqp_channel_close was occasionally taking a long time.
As a result of that conversation, we implemented an alternate version
that didn't wait for close-ok, and threw away the close-ok frame when it
eventually arrived.
It appears that the long-running cleanup done in response to
amqp_channel_close is at fault here. Because we no longer wait for
close-ok, the delay is instead attributed to one of the subsequent
calls; which one depends on when the fsync falls.
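For anyone who wants to check how long a single fsync can stall on their
own disk, here is a minimal sketch in plain POSIX C. It is not part of
the test programs described in this message; the function name
timed_fsync_ms and the probe file path are made up for illustration, and
the result is only meaningful if the probe file lives on the same disk
as Rabbit's data directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: writes one small block to `path`, then returns the
 * number of milliseconds spent inside fsync(), or -1.0 on any error.
 * The probe file is removed before returning. */
static double timed_fsync_ms(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    const char block[] = "probe";
    double result = -1.0;

    if (write(fd, block, sizeof block) == (ssize_t)sizeof block) {
        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);
        if (fsync(fd) == 0) {
            clock_gettime(CLOCK_MONOTONIC, &end);
            result = (double)(end.tv_sec - start.tv_sec) * 1000.0 +
                     (double)(end.tv_nsec - start.tv_nsec) / 1e6;
        }
    }

    close(fd);
    unlink(path);
    return result;
}
```

Calling this in a loop while the disk is busy (e.g. during a large file
copy) should show whether individual fsyncs take anywhere near as long
as the slow responses.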
I've implemented a pair of self-contained test programs that replicate
the problem, using only librabbitmq-c (the source is available if anyone
wants it):
amqp-test-recv:
- Declare a queue and bind it to testSendExchange
- Consume from the queue infinitely
- Send a reply to each arriving message
  - To the default exchange
  - Route with the reply-to header of the received message

amqp-test-send:
- Loops infinitely:
  - Open a channel
  - Declare a server-named queue
  - Consume from it
  - Send a message to testSendExchange, filling in reply-to
  - Wait for the method, header, and body frames of the reply
  - Close the channel *asynchronously*
  - Increment the channel number to be used
amqp-test-send's loop runs at 80 - 220 Hz, depending on system load.
If these are run on the same host as Rabbit, and we hit the hard drive
hard (e.g. continuously copy large files), we see a slow (0.5 - 7
seconds) response every few seconds.
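The slow iterations can be spotted by timing each pass of the loop. The
sketch below shows one way to do that; it is not the code from
amqp-test-send. The names count_slow_round_trips, round_trip_fn, and
noop_round_trip are hypothetical, and the no-op callback merely stands
in for the real open / declare / publish / wait / close sequence.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Milliseconds elapsed between two CLOCK_MONOTONIC readings. */
static double elapsed_ms(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec) * 1000.0 +
           (double)(end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Hypothetical callback: one request/reply round trip; returns 0 on success. */
typedef int (*round_trip_fn)(void *ctx);

/* Placeholder round trip that does nothing; a real test would talk to the
 * broker here. */
static int noop_round_trip(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Runs `iterations` round trips and returns how many took longer than
 * `threshold_ms`. Returns -1 as soon as a round trip fails. */
static int count_slow_round_trips(round_trip_fn fn, void *ctx,
                                  int iterations, double threshold_ms)
{
    assert(fn != NULL && iterations >= 0);

    int slow = 0;
    for (int i = 0; i < iterations; ++i) {
        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);
        if (fn(ctx) != 0)
            return -1;
        clock_gettime(CLOCK_MONOTONIC, &end);

        double ms = elapsed_ms(start, end);
        if (ms > threshold_ms) {
            fprintf(stderr, "iteration %d took %.1f ms\n", i, ms);
            ++slow;
        }
    }
    return slow;
}
```

With a threshold of 500 ms, this would flag exactly the 0.5 - 7 second
responses mentioned above while ignoring normal iterations.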
If we instead close the channel synchronously in amqp-test-send, the
slow responses are exclusively associated with amqp_channel_close.
I've experimented with a variety of parameters, and it's pretty
conclusive that the problem is with cleaning up the server-named reply
queues. If I declare them as non-auto-delete, Rabbit doesn't hit the
disk, and we don't see any timeouts. However, the queues accumulate,
and when amqp-test-send is killed, Rabbit is hugely slowed while
performing all the cleanup. Ditto if I operate in a single channel,
rather than closing and recreating the channel on each loop.
So. The question is: is it possible to declare a queue that's entirely
transitory, and doesn't result in Rabbit having to hit the disk at any
point in the queue's life cycle? I'm currently declaring non-passive,
non-durable, exclusive, auto-delete queues. Switching to non-exclusive
has no effect. It really seems like there should be *some* way to have
a lightweight, ephemeral queue...
Thanks,
-Brennan