Is the queue set to be durable?<br><br><div class="gmail_quote">On Tue, Aug 30, 2011 at 6:50 AM, Matthew Sackman <span dir="ltr">&lt;<a href="mailto:matthew@rabbitmq.com">matthew@rabbitmq.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Hi Cezary,<br>
<br>
On Thu, Aug 25, 2011 at 01:09:47PM +0100, Cezary Siwek wrote:<br>
> I'm facing an issue with my RabbitMQ cluster where a queue<br>
> disappears from a node.<br>
> I have 2 nodes in a cluster working in "disk mode".<br>
> Node1 - only consumes messages from the queue.<br>
> Node2 - only publishes messages to the queue.<br>
><br>
> The consumer script on Node1 creates a durable queue and waits for<br>
> messages.<br>
> Everything works fine until a period of inactivity. I can't say how<br>
> long it needs to be but usually after more than 24h the queue<br>
> disappears from Node2 and the consumer stops receiving messages. The<br>
> list_queues command shows the queue exists on Node1 but not on<br>
> Node2.<br>
> I've done a packet trace when it happened and I can see some packets<br>
> are being exchanged between nodes. Also, the cluster_status command<br>
> shows that both nodes are up and running in the cluster.<br>
> When I try to declare_queue on Node2 I get "NOT_FOUND - no queue<br>
> 'msgs' in vhost '/vhost1'".<br>
> All I need to do to have the Node2 running is to run stop_app and start_app.<br>
><br>
> Both nodes are sitting behind firewalls (in separate networks) but<br>
> both firewalls have been configured to pass all traffic between<br>
> these two boxes.<br>
> It happens on my dev platform. On production I don't think I will<br>
> ever have such long quiet periods but I need to find out what is<br>
> causing this.<br>
<br>
Hmm, this is odd. It suggests that the two nodes have lost contact with<br>
each other which is why Node2 is responding with the NOT_FOUND when you<br>
try to redeclare the queue. However, if at this point the cluster_status<br>
on both nodes is suggesting everything is clustered and happy, then this<br>
is very odd indeed. Could you confirm that you can reproduce this set of<br>
circumstances?<br>
<br>
Also, do you have the logs for both nodes during the no-activity time?<br>
I'm curious whether there are entries in there that suggest the cluster<br>
has split apart. If they're large, then maybe send them to us off-list.<br>
<br>
Matthew<br>
_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>
</blockquote></div><br>
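For what it's worth, one way to answer both questions at once (whether the queue is durable, and which node can still see it) is to run `rabbitmqctl list_queues -p /vhost1 name durable` on each node and compare the results. Below is a minimal sketch in Python. It assumes the output is tab-separated with a "Listing queues ..." banner and a "...done." footer, which may differ between RabbitMQ versions, and the helper names `parse_list_queues` and `compare_nodes` are hypothetical, not part of any RabbitMQ tooling.

```python
def parse_list_queues(output):
    """Turn `rabbitmqctl list_queues name durable` text into {name: durable}.

    Assumption: one queue per line, tab-separated, plus a banner line
    starting with "Listing queues" and a footer starting with "...done".
    """
    queues = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the banner/footer lines.
        if not line or line.startswith("Listing queues") or line.startswith("...done"):
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            continue  # not a "name, durable" row; ignore it
        name, durable = fields
        queues[name] = (durable == "true")
    return queues


def compare_nodes(node1_output, node2_output):
    """Report queues visible on only one node, and queues that are not durable."""
    q1 = parse_list_queues(node1_output)
    q2 = parse_list_queues(node2_output)
    merged = dict(q2)
    merged.update(q1)
    return {
        "only_on_node1": sorted(set(q1) - set(q2)),
        "only_on_node2": sorted(set(q2) - set(q1)),
        "not_durable": sorted(name for name, durable in merged.items() if not durable),
    }


if __name__ == "__main__":
    # Made-up sample text shaped like the situation described in the thread:
    # the queue is listed on Node1 but missing on Node2.
    node1 = "Listing queues ...\nmsgs\ttrue\n...done.\n"
    node2 = "Listing queues ...\n...done.\n"
    print(compare_nodes(node1, node2))
```

If the queue shows up under `only_on_node1` while cluster_status still reports both nodes as running, that would match the symptom described above and is worth including alongside the logs.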