[rabbitmq-discuss] Fully reliable setup impossible?
Simon MacMullen
simon at rabbitmq.com
Fri May 23 12:16:39 BST 2014
On 22/05/14 20:04, Steffen Daniel Jensen wrote:
> Yes, I know. But I am not asking for an unreasonable amount, IMO :-)
> I am aware of the CAP theorem, but I don't see how it is in violation. I
> am willing to live with eventual consistency.
Ah, right.
> What I mean when I say "reliable" is: All subscribers at the time of
> publish will eventually get the message.
>
> That should be possible, assuming that all live inconsistent nodes will
> eventually rejoin (without dumping messages). I know this is not the
> case in rabbitmq, but it is definitely theoretically possible. I guess
> this is what is usually referred to as eventual consistency.
So why did you originally say:
> It must be a cluster because we want clients to be able to connect to
> each node transparently. Federation is not an option.
...because it really sounds like federation is what you want :-)
> Yes, see http://www.rabbitmq.com/nettick.html
>
> Thank you! (!)
> I have been looking for that one. But I am surprised to see that it is
> actually 60sec. Then I really don't understand how I could have seen so
> many clusters ending up partitioned.
>
> Do you know what the consequence of doubling it might be?
It will take twice as long to conclude that a remote node that is no
longer connected has actually gone away. Until then, things can block.
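
To put rough numbers on "twice as long": as I understand the Erlang kernel documentation, a node that stops responding is declared down somewhere within 25% either side of net_ticktime. The sketch below only does that arithmetic; the function name is made up for illustration and is not part of RabbitMQ or Erlang.

```python
def detection_window(net_ticktime: float) -> tuple[float, float]:
    """Return (earliest, latest) seconds before a silent node is declared down.

    Assumes the documented Erlang behaviour: detection happens within
    net_ticktime +/- 25% (ticks are sent every net_ticktime / 4 seconds).
    """
    quarter = net_ticktime / 4
    return (net_ticktime - quarter, net_ticktime + quarter)


# Default setting versus a doubled setting.
print(detection_window(60))   # (45.0, 75.0)
print(detection_window(120))  # (90.0, 150.0)
```

So with the default, operations that depend on the missing node can block for up to about 75 seconds; doubling the setting pushes that to about 150 seconds.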
> RabbitMq writes:
> Increasing the net_ticktime across all nodes in a cluster will make the
> cluster more resilient to short network outages, but it will take
> longer for remaining nodes to detect crashed nodes.
>
> More specifically I wonder what happens in the time a node is actually
> in its own network, but before it finds out. In our setup all publishes
> have all-HA subscriber queues, with publisher confirm. So I will expect
> a distributed agreement that the msg has been persisted.
Yes.
> Will a
> publisher confirm then block until the node decides that other nodes are
> down, and then succeed?
Yes.
> The duplication is ok -- but assuming that rabbit is usually empty, it
> won't really happen, I think.
> But -- I am sure that rabbit does not guarantee exactly once delivery
> anyway.
> For that reason, we will build in idempotency for critical messages.
>
> Ordering can always get scrambled when nacking consuming messages, so we
> are not assuming ordering either.
OK.
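
For what it's worth, the idempotency described above could look something like the following minimal sketch. It assumes every publisher attaches a unique message id to each message, and it keeps the set of seen ids in memory; the class and method names are hypothetical, and a real consumer would need a durable, bounded store instead.

```python
class IdempotentHandler:
    """Wrap a message handler so redelivered messages are processed only once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in-memory only, lost on restart

    def handle(self, message_id, body):
        """Process body unless message_id was already handled.

        Returns True if the handler ran, False for a duplicate.
        """
        if message_id in self._seen:
            return False
        self._handler(body)
        # Record the id only after the handler succeeds, so a failed
        # attempt can be retried on redelivery.
        self._seen.add(message_id)
        return True


processed = []
h = IdempotentHandler(processed.append)
h.handle("m1", "hello")
h.handle("m1", "hello")  # duplicate delivery, ignored
print(processed)
```

The consumer would still ack the duplicate delivery; it just skips the side effects.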
> About the CAP theorem in relation to rabbit.
> Reliable messaging (zero message loss) is often preferred in
> SOA-settings. I wonder why vmware/pivotal/... chose not to prioritize
> this guarantee. The federation setup aims for it, but it is a little
> too weak in its synchronization. It would be preferred if it had a
> possibility of communicating consumption of messages. Then one could
> mirror queues between up/down-stream exchanges, and have even more
> "availability".
I'm not sure what you're talking about here. Federation certainly should
be able to ensure zero message loss! (Assuming you leave it in
"on-confirm" ack-mode).
So when you say "if it had a possibility of communicating consumption of
messages" you're talking about a sort of eventually consistent federated
mirrored queue? I have wondered about producing such a thing. But as
usual the list of things we could do is large, and the resources small.
And I suspect the cost of producing it would be quite large, mostly due
to the need to somehow reunify the queues after a partition.
Cheers, Simon
--
Simon MacMullen
RabbitMQ, Pivotal