[rabbitmq-discuss] Federation and upstream cluster

Fri Dec 28 02:04:34 GMT 2012

Hi

On 27/12/2012 17:02, Vladislav Pernin wrote:
> Hi,
>
> I'm running RabbitMQ 3.0.1 on two clusters of Linux servers.
>
> Let's name the two clusters :
> - downstream cluster running a federation to get messages from upstream
> cluster
> - upstream cluster
>
> The documentation explains well that if a node fails, links to upstream
> exchanges will be recreated on a surviving node.
> There is no problem for the "client" side of the federation.
>
> I cannot use a load balancer in failover mode to get high availability
> of the upstream cluster.
>
> What would be the recommended solution in this case ?
>

I'm struggling to understand what the question is here, Vladislav. The
'failover' described in the federation plugin documentation applies
when using federation in a cluster: if the node on which the downstream
link is running dies, then another downstream node will take over
(i.e., re-establish the links). There is a choice between clustering
(i.e., ha/mirror queues) and federation - you do not get 'ha of the
upstream cluster' in the same sense that mirror queues in a cluster are
'ha'. You have federated exchanges, which copy data using AMQP (with
ACKs enabled and some other guarantees) and which will try to
re-establish links, and so on. Federation, however, provides only the
Availability and Partition tolerance parts of the CAP theorem, not the
same Consistency guarantees as clustering/ha.

> I have tried to set up two upstreams and group them in an upstream set,

Can you post the configuration you're using to do that?
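
For comparison, a set of two upstreams on 3.0.x is normally declared
with "rabbitmqctl set_parameter" and attached to exchanges with
"rabbitmqctl set_policy". The Python sketch below only builds the JSON
values and prints the command lines; it does not talk to a broker. The
upstream names (up-1, up-2), the host names, the set name, the policy
name and the exchange pattern are all placeholders I have made up, so
please check them against the federation plugin documentation for your
version rather than treating this as your actual setup.

```python
import json
import shlex

# Hypothetical upstream nodes; replace with the real upstream cluster hosts.
UPSTREAMS = {
    "up-1": "amqp://upstream-node-1",
    "up-2": "amqp://upstream-node-2",
}
UPSTREAM_SET = "both-upstreams"      # hypothetical set name
POLICY_NAME = "federate-me"          # hypothetical policy name
EXCHANGE_PATTERN = "^federated\\."   # hypothetical exchange name pattern


def federation_commands(upstreams, set_name, policy_name, pattern):
    """Return rabbitmqctl argument lists for one upstream set."""
    commands = []
    # One federation-upstream parameter per upstream node.
    for name, uri in sorted(upstreams.items()):
        value = json.dumps({"uri": uri})
        commands.append(["rabbitmqctl", "set_parameter",
                         "federation-upstream", name, value])
    # The set lists its member upstreams by name.
    members = json.dumps([{"upstream": n} for n in sorted(upstreams)])
    commands.append(["rabbitmqctl", "set_parameter",
                     "federation-upstream-set", set_name, members])
    # A policy attaches the set to every exchange matching the pattern.
    definition = json.dumps({"federation-upstream-set": set_name})
    commands.append(["rabbitmqctl", "set_policy",
                     policy_name, pattern, definition])
    return commands


def render(commands):
    """Quote each argument so the line can be pasted into a shell."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


for line in render(federation_commands(UPSTREAMS, UPSTREAM_SET,
                                       POLICY_NAME, EXCHANGE_PATTERN)):
    print(line)
```

If what you have differs from this shape (for example one upstream
pointing at a load balancer, or a set with a single member), that would
be useful to know.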

> but I have the following problem :
> - when I shut down one of the nodes, the federation status shows the
> matching upstream down as expected, but after restarting the first
> one, if I shut down the other one, the federation status shows both
> upstreams down

Just to confirm: you're saying that

1. you shut down one of the two upstream nodes
2. that node shows up dead in the web interface
3. you re-start that node
4. that node shows up alive in the web interface
5. you shut down the other upstream node
6. both nodes show up as dead in the web interface *but*
7. one of the upstream nodes *is* alive despite what the web admin says

Have I understood that correctly?

> - so I tried adding an ha-mode policy of "all" on the federated queue; it
> is now possible to shut down either one or the other node,

I'm not sure I understand this at all. Are you saying it was not 
possible to shut down one or both of the upstream nodes before? That 
seems different from your earlier comment.

> but it seems that I'm losing some messages.
>

When you say 'the federated queue', do you mean the queue created in the
upstream exchange's broker? Why would you want to add an ha-mode policy
to that? The upstream queue is internal to the federation mechanism, so
you should be binding to the downstream exchange only. Or are you saying
that you've bound a queue to the downstream exchange and made that
ha-enabled? In the latter case, that will make no difference to
reliability if, for example, both upstream nodes go down before
messages are delivered to and ack'ed by the downstream.

I'd be interested to hear how you've set this ha-mode policy, and why,
and also how you've determined that there was message loss. I suspect
that you have expectations about the reliability of federation (in
the face of node failures) that do not hold. If your messages sat in an
exchange on an upstream node (or pair of exchanges/nodes, etc.) and both
nodes die before successfully transmitting the messages, then they will
not arrive at the downstream exchange. The guarantees about message
delivery for ha/mirror queues apply to nodes in *that* cluster only. The
federation guarantees are different and orthogonal to ha/clustering.
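
To make that failure mode concrete, here is a toy model in Python of an
acknowledged transfer from an upstream to a downstream. It is not how
the federation plugin is implemented, and the class and function names
are invented for illustration. It also assumes the upstream nodes do
not come back with their messages intact. The only point it shows is
that a message is safe downstream once it has been received and ack'ed,
and that anything still waiting upstream when every upstream node is
gone never arrives.

```python
class ToyUpstream:
    """Hypothetical stand-in for the upstream side of a federation link."""

    def __init__(self, messages):
        self.pending = list(messages)  # published, not yet ack'ed downstream
        self.alive = True

    def crash(self):
        # Model losing every upstream node with no surviving copy:
        # whatever was still pending is gone.
        self.alive = False
        self.pending = []


def transfer(upstream, downstream, limit):
    """Copy up to `limit` messages, dropping each upstream once ack'ed."""
    moved = 0
    while upstream.alive and upstream.pending and moved < limit:
        message = upstream.pending[0]
        downstream.append(message)  # downstream has received the message
        upstream.pending.pop(0)     # the ack lets the upstream forget it
        moved += 1
    return moved


upstream = ToyUpstream(["m1", "m2", "m3", "m4"])
downstream = []

transfer(upstream, downstream, limit=2)   # the link forwards two messages
upstream.crash()                          # both upstream nodes die
transfer(upstream, downstream, limit=10)  # nothing is left to forward

print(downstream)  # ['m1', 'm2']
```

Making the downstream queue ha-enabled changes nothing in this picture:
m3 and m4 were lost before they ever reached the downstream cluster.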

Hope that makes sense.

Cheers,
Tim

> --
> Vladislav