[rabbitmq-discuss] Connection blocked by "flow" for more than 600 seconds
Jesper Louis Andersen
jesper.louis.andersen at gmail.com
Fri Oct 11 12:53:06 BST 2013
You are using Erlang/OTP R16B01. Upgrade to R16B02 at once! R16B01 has a bug which
means that async worker processes are not getting used correctly (too many
processes are hashed to the wrong async worker, more or less). This
severely hits disk I/O on a busy machine.
There are other problems with R16B01. It should be avoided if possible.
On Fri, Oct 11, 2013 at 1:29 PM, Simon MacMullen <simon at rabbitmq.com> wrote:
> OK, so your screenshot shows 750 queues and 753 connections. Was this from
> the same time as you had ~10k file descriptors in use? That sounds wrong.
> I think your publishing connections are going into flow control because
> there's a squeeze on file descriptors, which is causing the queues to have
> to share a small number of file descriptors between them - thus slowing
> them down.
> If you do have far more file descriptors in use than queues + connections,
> do you have any exotic plugins in use? What does "lsof -lnp <pid of server
> process>" say?
> Cheers, Simon
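
Simon's check above can be sketched roughly as follows: count the descriptors the server process holds and compare that with queues + connections. This is only an illustration, not something from the thread. The function names are made up, the /proc lookup assumes a Linux host, and the 10,000 figure is the approximate "~10k" value quoted above.

```python
import os


def count_open_fds(pid: int) -> int:
    """Count descriptors held by a process (assumption: Linux, /proc mounted)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_surplus(fds_in_use: int, queues: int, connections: int) -> int:
    """Descriptors in use beyond one per queue plus one per connection."""
    return fds_in_use - (queues + connections)


# Figures quoted in the thread: 750 queues, 753 connections, ~10k descriptors.
surplus = fd_surplus(fds_in_use=10_000, queues=750, connections=753)
print(surplus)  # 8497
```

A large positive surplus like this is what prompts the question about plugins and the request for the lsof listing.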
> On 11/10/2013 3:22AM, Choo wrote:
>> Hi Simon,
>> Memory is plentiful, but I found that file descriptors hit the default limit,
>> so I bumped the limit up to 5,120, and finally to 10,240 on each node.
>> It turned out that the file descriptors also reached the limit (around
>> and things started to go downhill.
>> I started the processes in reverse order: the subscriber side first
>> (1:42), then the bigger publishers (1:45). The number of published
>> messages bounced up and down; then, just after 1:48, most of the publishers
>> were blocked.
>> There are now more than 350 blocked connections like the ones below (and
>> file descriptors are running at 7,558 + 4,647 on the 2 nodes):
>> 10.95.212.11:33751 -> 10.95.212.13:5672 blocked 1261.558817 flow
>> 10.95.212.11:33752 -> 10.95.212.13:5672 blocked 1326.324919 flow
>> 10.95.212.11:33753 -> 10.95.212.13:5672 blocked 1326.45322 flow
>> 10.95.212.11:33754 -> 10.95.212.13:5672 blocked 1278.581221 flow
>> 10.95.212.11:33755 -> 10.95.212.13:5672 blocked 1312.584426 flow
>> 10.95.212.11:33756 -> 10.95.212.13:5672 blocked 1279.623625 flow
>> 10.95.212.11:33757 -> 10.95.212.13:5672 blocked 1294.492535 flow
>> 10.95.212.11:33758 -> 10.95.212.13:5672 blocked 1276.134377 flow
>> 10.95.212.11:33759 -> 10.95.212.13:5672 blocked 1292.862081 flow
>> 10.95.212.11:33760 -> 10.95.212.13:5672 blocked 1290.695249 flow
>> 10.95.212.11:33761 -> 10.95.212.13:5672 blocked 1255.599642 flow
>> 10.95.212.11:33762 -> 10.95.212.13:5672 blocked 1284.984752 flow
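
A listing in this shape can be filtered mechanically. The sketch below is an illustration only: it assumes each line is "source -> destination state seconds reason", as in the rows above. The command that produced the listing is not stated in the thread. The function name is hypothetical, and the 600-second threshold is taken from the subject line.

```python
def long_blocked(listing: str, threshold: float = 600.0):
    """Return (source, destination, seconds) for flow-blocked connections
    that have been blocked for longer than `threshold` seconds."""
    results = []
    for line in listing.splitlines():
        fields = line.split()
        # Assumed shape: <src> -> <dst> <state> <seconds> <reason>
        if len(fields) != 6 or fields[1] != "->":
            continue
        src, _, dst, state, seconds, reason = fields
        if state == "blocked" and reason == "flow" and float(seconds) > threshold:
            results.append((src, dst, float(seconds)))
    return results


# Two rows copied from the listing above.
sample = """\
10.95.212.11:33751 -> 10.95.212.13:5672 blocked 1261.558817 flow
10.95.212.11:33753 -> 10.95.212.13:5672 blocked 1326.45322 flow
"""
hits = long_blocked(sample)
print(len(hits))  # 2
```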
>> Please kindly suggest.
>> Thank you and Best Regards,
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
> Simon MacMullen
> RabbitMQ, Pivotal