[rabbitmq-discuss] Can a downed node affect responsiveness of HTTP queries to other nodes?

Thu Oct 25 11:30:59 BST 2012

Hi Matt.

In 2.8.x /api/nodes will make RPC calls to each node, but no other paths 
will. In 3.0 that's going to be removed as well.

However, /api/queues, like a lot of other paths, makes calls into Mnesia 
to figure out what things exist. When a node is unresponsive, calls into 
Mnesia can hang...

... but only for long enough for the VM to declare that the node is down 
(this is governed by net_ticktime, which defaults to 60 seconds, so 
detection takes about a minute). After that, the node should be declared 
down and Mnesia should ignore it.
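
To put rough numbers on "about a minute": as far as I recall from the 
Erlang kernel documentation, a silent node is detected somewhere between 
0.75 and 1.25 times net_ticktime. A minimal sketch of that arithmetic in 
Python (the function name is made up for this example, and the bounds are 
an assumption taken from that documentation rather than from RabbitMQ):

```python
def detection_window(net_ticktime=60.0):
    """Return (min_seconds, max_seconds) before a silent node is declared down.

    Assumes the 0.75x to 1.25x bounds described for the kernel parameter
    net_ticktime; the function name is hypothetical.
    """
    quarter = net_ticktime / 4.0
    return (net_ticktime - quarter, net_ticktime + quarter)


low, high = detection_window()  # default net_ticktime of 60 seconds
print(f"a silent node is declared down after {low:.0f} to {high:.0f} seconds")
```

With the default setting, a monitor polling every five seconds could 
therefore see a run of timeouts lasting up to roughly 75 seconds before 
Mnesia calls start returning again.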

So when you say the node was "stuck", was it completely unresponsive? I 
am wondering if it could be just responsive enough to prevent Erlang 
from considering it down, while still being unresponsive enough to... be 
unresponsive.
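
One way a monitoring script could tell these two cases apart is to put a 
client-side timeout on each /api/queues call and track how long the calls 
have been failing in a row. The following is only a sketch under stated 
assumptions: the base URL, credentials, 3 second timeout, 75 second 
threshold and all function and class names are placeholders chosen for 
illustration, not anything the management plugin prescribes.

```python
import base64
import http.client
import json
import time
import urllib.request


def fetch_queues(base_url, user, password, timeout=3.0):
    """Return the decoded /api/queues listing, or None on timeout or error."""
    req = urllib.request.Request(base_url.rstrip("/") + "/api/queues")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    try:
        # The timeout bounds each blocking socket operation, so a hung
        # request costs one poll instead of stalling the whole monitor.
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError and socket timeouts; ValueError covers bad JSON.
        return None


class StallTracker:
    """Track how long consecutive polls have been failing."""

    def __init__(self, max_detect=75.0, clock=time.monotonic):
        self.max_detect = max_detect  # assumed upper bound on node-down detection
        self.clock = clock
        self.first_failure = None

    def record(self, result):
        if result is not None:
            self.first_failure = None
            return "ok"
        now = self.clock()
        if self.first_failure is None:
            self.first_failure = now
        if now - self.first_failure <= self.max_detect:
            return "stalled"  # plausibly still waiting for the node to be declared down
        return "stalled-too-long"  # node may be half-alive and never declared down


def poll_forever(base_url, user, password, interval=5.0):
    """Poll every `interval` seconds and print the current state."""
    tracker = StallTracker()
    while True:
        print(tracker.record(fetch_queues(base_url, user, password)))
        time.sleep(interval)
```

A caller would run something like poll_forever("http://localhost:55672", 
"guest", "guest"), where the host, port and credentials are assumptions 
about a default 2.8.x install. A "stalled-too-long" result is a hint, not 
proof, that a node is responsive enough to avoid being declared down.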

Cheers, Simon

On 25/10/12 01:48, Matt Pietrek wrote:
> As part of our production monitoring support, we have a script that runs
> every five seconds and checks some information about the queues. In
> particular, it uses the "/api/queues/..." URL to query info about them.
>
> All of our queues are declared as HA. Recently we had some problems
> where a node just got stuck for 30+ minutes (a known Linux kernel bug).
> However, in the monitoring running on the healthy node, I was seeing my
> /api/queues queries time out.
>
> I'm guessing that there's some set of the HTTP APIs that when invoked,
> may cause network traffic to other nodes. And if those nodes are down,
> the HTTP API is essentially useless as it eventually times out waiting
> for communication with the downed node.
>
> Can you always-helpful RabbitMQ folks tell me if this is indeed the
> case, and if there's anything else useful to know when planning a
> monitoring strategy using the HTTP API?
>
> Thanks,
>
> Matt

-- 
Simon MacMullen
RabbitMQ, VMware