[rabbitmq-discuss] Blocked Connections

Tim Watson tim at rabbitmq.com
Fri Dec 7 21:32:35 GMT 2012


On 7 Dec 2012, at 19:56, Rawat wrote:

> We use RabbitMQ in our production environment. There are 6 publishers
> and 2 consumers. Suddenly we have seen problems where RabbitMQ  blocks
> the connections from producers as well as consumers. We have 8 GB of
> RAM on the RabbitMQ host. And our queues are non-persistent. We are
> using Exchange/Fanout. Our RabbitMQ Server version is 2.8.6
> 

Just because your queues are non-persistent doesn't mean your messages are never written to disk. Messages can be marked persistent, but even if they're not, the broker may still page them to disk if memory becomes scarce. Having 8GB of RAM on the host does not mean all 8GB are available to the broker, and even if it were, the default vm_memory_high_watermark is 0.4 (i.e. 40% of the installed RAM) iirc.
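
The threshold can be changed in rabbitmq.config if you know the host has memory to spare - something like the fragment below (a sketch only; 0.6 is an arbitrary example value, and whether raising it makes sense depends on what else runs on that box):

    %% rabbitmq.config - allow the broker to use up to 60% of installed RAM
    %% before memory-based flow control kicks in (the default is 0.4, i.e. 40%).
    [
      {rabbit, [
        {vm_memory_high_watermark, 0.6}
      ]}
    ].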

> I am wondering :
> 
> -- Is it a known issue with version 2.8.6 ?
> -- Do you recommend upgrading to latest 3.0 version ?
> 

This is not a known 'issue' as such - it sounds like an example of flow control, which you can read about here: http://www.rabbitmq.com/memory.html. I would recommend upgrading to 3.0.1 (which is out early next week) anyway if you can, as there have been numerous bug fixes and performance improvements. Be aware, though, that 3.0.0 is a major release away from the 2.8.x series, so check for breaking changes that may affect you. The blog post at http://www.rabbitmq.com/blog/2012/11/19/breaking-things-with-rabbitmq-3-0/ provides some insight into this.

> 
> What is most severe in our case is that when RabbitMQ blocks producer
> connections, On producer side the thread that was publishing data to
> rabbitmq, gets blocked. And that thread is a critical thread that does
> other jobs too in Producer. Is there a way so that when connection is
> blocked, producer thread gets exception instead of getting infinitely
> blocked ?
> 

No, there isn't a way to get an exception thrown, afaik. Even if the broker has a lot of memory available to it, at some point you may still exhaust the server's resources, and memory-based flow control will *need* to kick in to prevent the broker from crashing. You can also run out of file descriptors, which can lead to throttling, though that is a bit less specific in how it is applied - unfortunately I'm not an expert in how it presents itself at runtime, but you can check whether it is happening using `lsof' or some such.

The same principle behind memory-based flow control applies to per-connection flow control: producers/connections that generate traffic faster than the queues can process it get blocked. The broker stops reading from the inbound connections (roughly speaking), which exerts TCP back-pressure on the producers. Flow control is actually 'credit' based inside the broker, but that detail aside, the result should look much the same from a producer's perspective.
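
To make that concrete: from the Java client's point of view, a publish that hits a blocked connection simply stalls inside basicPublish. A minimal sketch (the 'logs' fanout exchange and localhost broker are just illustrative assumptions):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    public class BlockingPublishExample {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");              // assumption: broker on localhost
            Connection conn = factory.newConnection();
            Channel channel = conn.createChannel();
            channel.exchangeDeclare("logs", "fanout"); // 'logs' is just an example name

            byte[] payload = "hello".getBytes("UTF-8");
            // If the broker has hit its memory high watermark it stops reading from
            // this socket, so this call can stall here for as long as the broker
            // stays over the threshold - there is no exception to catch.
            channel.basicPublish("logs", "", null, payload);

            channel.close();
            conn.close();
        }
    }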

> Any idea/suggestion is most welcome.

The solution is fairly simple if you ask me: don't do critical work in threads that call out to the network! I would advise the same if you were communicating with a relational database, a web service or any other external resource accessed over a network link that might take an arbitrarily long time to respond.

This situation can happen *any time* in a networked application, BTW, even if the external resource is not deliberately applying back-pressure: various bits of network infrastructure, or behaviour in the OS networking sub-system, can block producers too. To avoid it, I would suggest some simple refactoring along the lines of the steps below (a rough Java sketch follows the list):

1. spawn a new thread for producing messages
2. set up a lightweight channel for communicating with this 'producer thread' - I would suggest one of the Queue data structures in java.util.concurrent or a class from the System.Collections.Concurrent namespace if you're in .NET land.
3. keep the 'critical thread' separate and unblocked, and have it communicate with the producer thread by putting data into the shared (concurrent) data structure
4. put the interaction between the two threads into a class/object so the code the two threads use (and share) to communicate is neatly isolated in one place
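
Something along these lines, perhaps - a rough sketch only, assuming the Java client, a fanout exchange called 'logs' and a bounded LinkedBlockingQueue (the names and the capacity are placeholders):

    import com.rabbitmq.client.Channel;

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    /**
     * Isolates all broker I/O on a dedicated thread. The critical thread only
     * touches the in-memory queue, so it can never block on the network.
     */
    public class AsyncPublisher {

        // Bounded so a stuck publisher doesn't eat the heap; 10000 is arbitrary.
        private final BlockingQueue<byte[]> pending =
                new LinkedBlockingQueue<byte[]>(10000);

        private final Channel channel;   // created elsewhere; used only by the publisher thread
        private final Thread publisherThread;

        public AsyncPublisher(Channel channel) {
            this.channel = channel;
            this.publisherThread = new Thread(new Runnable() {
                public void run() {
                    publishLoop();
                }
            }, "rabbitmq-publisher");
            this.publisherThread.start();
        }

        /** Called from the critical thread: returns false rather than blocking
            forever if the queue is full (i.e. the publisher thread is stuck). */
        public boolean enqueue(byte[] message) throws InterruptedException {
            return pending.offer(message, 100, TimeUnit.MILLISECONDS);
        }

        private void publishLoop() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    byte[] message = pending.take();
                    // This call may block under flow control, but only this
                    // thread is affected - the critical thread carries on.
                    channel.basicPublish("logs", "", null, message);
                }
            } catch (Exception e) {
                // log and decide whether to reconnect / retry; omitted for brevity
            }
        }
    }

Note that the Channel is only ever touched by the publisher thread in this sketch, which is good practice anyway, since sharing a channel between publishing threads is best avoided.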

Then you won't have an issue if the producers get blocked. If you need to become aware that you're blocked, you can use timeouts (either when writing to the shared memory area, or by setting up a house-keeping thread that periodically checks the last time the producer thread made progress), and most of the data structures I mentioned earlier support 'try-with-timeout' semantics in some fashion or another. Similar facilities exist for Python/Ruby, although I'm not well versed in the client libraries for those languages so YMMV.
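
For instance, the try-timeout idea with a plain bounded queue (a standalone toy example, not tied to RabbitMQ at all):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class TryTimeoutExample {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> buffer = new LinkedBlockingQueue<String>(1);
            buffer.offer("first");  // succeeds immediately, queue is now full
            boolean accepted = buffer.offer("second", 100, TimeUnit.MILLISECONDS);
            if (!accepted) {
                // The queue stayed full for 100ms - whatever drains it (the
                // publisher thread) is probably stuck, so react here: drop the
                // message, spill to disk, raise an alert, etc.
                System.out.println("publisher appears to be blocked");
            }
        }
    }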

HTH 

Tim  

> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss


