[rabbitmq-discuss] Connection blocked by "flow" for more than 600 seconds
Simon MacMullen
simon at rabbitmq.com
Fri Oct 11 16:40:10 BST 2013
Ouch. Didn't realise it was that bad. So should we disrecommend R16B01
in general?
Cheers, Simon
On 11/10/2013 12:53PM, Jesper Louis Andersen wrote:
> You are using R16B01. Upgrade to R16B02 at once! R16B01 has a bug which
> means that async worker processes are not getting used correctly (too
> many processes are hashed to the wrong async worker, more or less). This
> severely hits disk I/O on a busy machine.
>
> There are other problems with R16B01. It should be avoided if possible.
>
>
> On Fri, Oct 11, 2013 at 1:29 PM, Simon MacMullen <simon at rabbitmq.com> wrote:
>
> OK, so your screenshot shows 750 queues and 753 connections. Was
> this from the same time as you had ~10k file descriptors in use?
> That sounds wrong.
>
> I think your publishing connections are going into flow control
> because there's a squeeze on file descriptors, which is causing the
> queues to have to share a small number of file descriptors between
> them - thus slowing them down.
>
> If you do have far more file descriptors in use than queues +
> connections, do you have any exotic plugins in use? What does "lsof
> -lnp <pid of server process>" say?
>
> Cheers, Simon
>
>
> On 11/10/2013 3:22AM, Choo wrote:
>
> Hi Simon,
>
> As memory is plentiful, I found that the file descriptors hit the default
> limit, so I bumped the limit up to 5,120, and finally to 10,240 on each
> node. It turned out that the file descriptors also reached that limit
> (around 10,086), and things started to go downhill.
>
> <http://rabbitmq.1065348.n5.nabble.com/file/n30402/ScreenShot.jpg>
>
> I started the processes in reverse order, starting the subscriber side
> first (1:42), then the bigger publishers later (1:45). The number of
> published messages bounced up and down, then just after 1:48 most of the
> publishers were blocked.
>
> There are now more than 350 blocked connections like the ones below (and
> file descriptors are running at 7,558 + 4,647 on the 2 nodes):
> 10.95.212.11:33751 -> 10.95.212.13:5672 blocked 1261.558817 flow
> 10.95.212.11:33752 -> 10.95.212.13:5672 blocked 1326.324919 flow
> 10.95.212.11:33753 -> 10.95.212.13:5672 blocked 1326.45322 flow
> 10.95.212.11:33754 -> 10.95.212.13:5672 blocked 1278.581221 flow
> 10.95.212.11:33755 -> 10.95.212.13:5672 blocked 1312.584426 flow
> 10.95.212.11:33756 -> 10.95.212.13:5672 blocked 1279.623625 flow
> 10.95.212.11:33757 -> 10.95.212.13:5672 blocked 1294.492535 flow
> 10.95.212.11:33758 -> 10.95.212.13:5672 blocked 1276.134377 flow
> 10.95.212.11:33759 -> 10.95.212.13:5672 blocked 1292.862081 flow
> 10.95.212.11:33760 -> 10.95.212.13:5672 blocked 1290.695249 flow
> 10.95.212.11:33761 -> 10.95.212.13:5672 blocked 1255.599642 flow
> 10.95.212.11:33762 -> 10.95.212.13:5672 blocked 1284.984752 flow
>
> Please kindly suggest.
>
> Thank you and Best Regards,
> Choo
>
>
>
> --
> View this message in context:
> http://rabbitmq.1065348.n5.nabble.com/Connection-blocked-by-flow-for-more-than-600-seconds-tp30349p30402.html
> Sent from the RabbitMQ mailing list archive at Nabble.com.
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>
>
> --
> Simon MacMullen
> RabbitMQ, Pivotal
>
> --
> J.
>
--
Simon MacMullen
RabbitMQ, Pivotal
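
The descriptor check suggested in the quoted thread ("lsof -lnp <pid of server process>") can be approximated on Linux by reading /proc directly. The sketch below is only an illustration: the function names are hypothetical, the grouping into socket/pipe/file is a rough convenience, and treating "queues + connections" as the expected baseline is an assumption taken from Simon's remark, not a statement of how RabbitMQ allocates descriptors.

```python
import os
from collections import Counter


def classify_fd_target(target: str) -> str:
    """Map a /proc/<pid>/fd symlink target to a coarse kind."""
    for prefix in ("socket", "pipe", "anon_inode"):
        if target.startswith(prefix + ":"):
            return prefix
    # Regular files show up as absolute paths; anything else is lumped together.
    return "file" if target.startswith("/") else "other"


def count_fds_by_kind(pid: int) -> Counter:
    """Count the open descriptors of `pid` by kind (Linux only, reads /proc)."""
    fd_dir = f"/proc/{pid}/fd"
    kinds = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        kinds[classify_fd_target(target)] += 1
    return kinds


def unexplained_fds(fds_in_use: int, queues: int, connections: int) -> int:
    """Descriptors in use beyond the assumed baseline of queues + connections."""
    return fds_in_use - (queues + connections)


if __name__ == "__main__":
    # Figures quoted in the thread: ~10,086 descriptors, 750 queues,
    # 753 connections (assuming they were observed at the same time).
    print(unexplained_fds(10086, 750, 753))
    # Inspect the current process as a stand-in for the broker's pid.
    print(count_fds_by_kind(os.getpid()))
```

A large surplus made up mostly of sockets would point towards connections (or a plugin holding them), while a surplus of regular files would point towards queue and message-store file handles. Reading /proc/<pid>/fd for another user's process normally requires the same privileges that lsof would need.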