[rabbitmq-discuss] rabbitmq recovery, channelflow, and clients

Matthew Sackman matthew at rabbitmq.com
Sun May 23 09:32:45 BST 2010


Hi Tyler,

On Sat, May 22, 2010 at 09:21:32AM -0400, Tyler Williams wrote:
> I've been playing with the 21673 branch of rabbitmq, and I have a few questions. 
> 
> How does rabbitmq handle bumping up against ulimit -n?

The long answer is in my blog post on the subject:
http://www.lshift.net/blog/2010/03/23/the-fine-art-of-holding-a-file-descriptor

The short answer is "it copes magically and seamlessly". At least
probabilistically.

> After overwhelming rabbitmq with what I think is a misbehaving client, and then restarting it, rabbitmq recovers some of my queues, and none of the messages that were in them. Is this the expected behavior? I see messages like this in rabbit.log for the unrecovered queues:

Durable queues will survive a broker crash/restart. Persistent messages
inside durable queues will too. However, due to buffering in various
places you may lose some of the persistent messages inside durable
queues, unless you happen to be using transactions.
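
As a rough illustration of that combination (durable queue, persistent
messages, transactional publish), here is a minimal Python sketch. It
assumes a recent pika with the BlockingConnection API and a broker on
localhost. The queue name "task_queue" and the helper name
publish_in_transaction are made up for the example; none of this is
taken from the setup discussed in this thread.

```python
def publish_in_transaction(channel, queue_name, bodies, properties):
    """Publish all bodies to a durable queue inside one AMQP transaction.

    `channel` is assumed to behave like a pika BlockingChannel.
    """
    bodies = list(bodies)
    # durable=True asks the broker to keep the queue across a restart.
    channel.queue_declare(queue=queue_name, durable=True)
    # Switch the channel into transactional mode.
    channel.tx_select()
    try:
        for body in bodies:
            channel.basic_publish(
                exchange="",             # default exchange routes by queue name
                routing_key=queue_name,
                body=body,
                properties=properties,
            )
        # The publishes only take effect once the commit succeeds.
        channel.tx_commit()
    except Exception:
        channel.tx_rollback()
        raise
    return len(bodies)


def main():
    # Imported here so the helper above can be read and exercised
    # without pika installed.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    try:
        channel = connection.channel()
        # delivery_mode=2 marks each message as persistent.
        persistent = pika.BasicProperties(delivery_mode=2)
        publish_in_transaction(channel, "task_queue", [b"one", b"two"], persistent)
    finally:
        connection.close()
```

Calling main() against a running broker would publish two persistent
messages and return only after the commit has gone through.
Transactions cost throughput, so this is a trade-off rather than a
default.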

> ** Reason for termination == 
> ** {{badmatch,{error,emfile}},

Hmm, you really have run out of file descriptors. This is not meant to
happen! On Thursday and Friday we added a few fixes to try to prevent
misbehaving clients from overwhelming Rabbit from the connection point
of view (rapidly opening new sockets etc.). I've just merged all that
into bug21673, so if you pull and try again, I'd be interested to see
whether things have improved there.
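
For checking how close a process is to its descriptor limit, a small
Python sketch along these lines may help. It inspects the process it
runs in (for example a test client), not the broker, and it assumes a
Unix-like system that exposes open descriptors under /proc/self/fd or
/dev/fd. The function names are made up for the example.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) limits on open descriptors (ulimit -n)."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    """Count descriptors currently open in this process.

    Listing the directory briefly opens one descriptor itself, so the
    result can be one higher than the steady-state count.
    """
    for path in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(path):
            return len(os.listdir(path))
    raise OSError("no per-process descriptor directory found")


def fd_headroom():
    """Return how many more descriptors can be opened, or None if unlimited."""
    soft, _hard = fd_limits()
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - open_fd_count()
```

Once the headroom reaches zero, further open() or accept() calls fail
with EMFILE, which is the error that shows up in the log above.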

But in general, what kinds of things were your misbehaving clients doing
to cause this to happen?

> With respect to ChannelFlow, when rabbit gets a high mem watermark event and sends out the channelflow command to producers, will it disconnect producers who don't honor this?

Not yet, but soon, yes. This is likely to be implemented next week.

> I've been playing with both pika and amqplib clients. With pika, I see the watermark get set, the flow exception thrown in my producer, and then eventually the watermark gets cleared. With amqplib I see the watermark set but never cleared. I see what looks like some support for flow in amqplib, but I can't tell how it's being used.

There is no support for flow in amqplib. Tony has produced an
experimental patch for it (search this list), but I've no idea how fully
tested it is.

The watermark will eventually get cleared - the goal of the bug21673
branch is to ensure that eventually, Rabbit will always be able to
accept another message. Now hard disks aren't as fast as RAM, so from
time to time, when we run out of space in RAM, we have no choice but to
use flow to stop clients from publishing so as to give ourselves some
breathing room to flush out to disk. But the high watermark should
always drop, eventually.
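
To make concrete what "honouring" flow means on the client side, here
is a minimal Python sketch of a producer that stops sending while flow
is off and resumes when it comes back on. It is not the pika or amqplib
API: the class FlowAwareProducer, its on_flow callback and the send
callable are all hypothetical, and a real client would wire on_flow to
whatever its library calls when a channel.flow frame arrives.

```python
from collections import deque


class FlowAwareProducer:
    """Holds messages back locally while the broker has switched flow off."""

    def __init__(self, send):
        self._send = send          # callable that transmits one message
        self._flow_active = True   # a channel starts with flow enabled
        self._pending = deque()    # messages held back while flow is off

    def on_flow(self, active):
        """Handle a flow notification: active=False means stop publishing."""
        self._flow_active = active
        if active:
            self._drain()

    def publish(self, message):
        """Send now if allowed; otherwise queue locally. Returns True if sent."""
        if self._flow_active and not self._pending:
            self._send(message)
            return True
        self._pending.append(message)
        return False

    def _drain(self):
        # Send held-back messages in their original order.
        while self._flow_active and self._pending:
            self._send(self._pending.popleft())

    @property
    def pending(self):
        """Number of messages currently held back."""
        return len(self._pending)
```

Buffering like this only moves the memory pressure to the client, so a
real producer would more likely block or slow down its source while
flow is off. The point of the sketch is just that nothing reaches the
broker between the "stop" and the "go".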

Best wishes,

Matthew


