[rabbitmq-discuss] Lost messages in cluster

Wed Jan 18 23:03:26 GMT 2012

Hello, Jerry.

> When you say the "topology" in this context, what do you mean exactly?
I mean the set of brokers, exchanges and queues, and the relations between them.

> Are you using the new HA/mirrored-queues feature, and have declared
> replication of queues across cluster nodes explicitly? Or are you
> using "regular" clustering of the older type? If the latter then the
> queue processes and queue contents will 'really' live only on a single
> node in the cluster, although of course they can be published by, or
> delivered to clients who happen to be connected to any node in the
> cluster.
No, I don't use mirrored queues.

> By "node" do you mean one of your producers or consumers, or one of
> the Rabbit cluster members? If you mean a Rabbit cluster member then
> you can't publish to, or consume from, a queue as long as the node on
> which it lives is down. If you use the new active-active,
> mirrored-queues HA system then you can specify that a queue be
> replicated across multiple nodes in a cluster, and the loss of the
> master queue replica leads to one of the replicated slaves taking
> over.  There's also an older active-passive HA system where you use
> something like Pacemaker to switch over from a failed cluster node to
> a hot standby, that was sharing storage with its backup brethren.
Each node is a physical or virtual machine with a RabbitMQ broker, a
consumer and a producer installed. The consumer and producer always
connect to the local broker. The consumer receives messages only from
queues that were declared on the local broker. The producer publishes
messages to a fanout exchange to which all queues are bound.
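
To make the layout concrete, here is a minimal sketch of what each node
declares. It is an illustration only, not my actual code: the exchange name
"events", the queue naming scheme "events.<node>" and both function names
are made up for the example, and the channel is assumed to be a channel
from a recent version of the pika client (any object with the same methods
works).

```python
def declare_local_topology(channel, node_name, exchange="events"):
    """Declare the shared fanout exchange and this node's own queue.

    `channel` is assumed to be a pika channel. The exchange and queue
    names are hypothetical and chosen only for this example.
    """
    queue = "events.%s" % node_name
    # Every node declares the same fanout exchange (the call is idempotent).
    channel.exchange_declare(exchange=exchange, exchange_type="fanout",
                             durable=True)
    # The queue is declared through the local broker, so it lives on this node.
    channel.queue_declare(queue=queue, durable=True)
    # A fanout exchange copies each message to every bound queue.
    channel.queue_bind(queue=queue, exchange=exchange)
    return queue


def publish(channel, body, exchange="events"):
    """Publish one message; fanout exchanges ignore the routing key."""
    channel.basic_publish(exchange=exchange, routing_key="", body=body)


# Possible usage with pika (not executed here, needs a running broker):
#   import pika
#   conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   ch = conn.channel()
#   declare_local_topology(ch, "node1")
#   publish(ch, b"hello")
```

Because the queue is not mirrored, its contents exist only on the node
where it was declared.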

> Answering this fully depends on what potential causes of message loss
> you want to immunize yourself against.  If you're worried about the
> failure of a cluster node on which a queue resides rendering that
> queue unavailable, your options are either:
>
>    - the new active/active mirrored queues HA
>
>    - the old active/passive system with shared storage and something
>      like Pacemaker handling the failover
>
> If you need messages to be moved over potentially high latency or
> flaky WAN links, then you want to consider Shovel or Federation for
> bridging the wild network waters between the islands on which your
> clusters live.
>
> Both are documented on the RabbitMQ website, and the latter is
> discussed in the Manning book "RabbitMQ in Action" (currently
> available as a preview eBook, final print version due out later this
> Spring).
>
> Does this help at all?
The number of brokers is unlimited.
All brokers are in the same datacenter, but any broker may become
unavailable to the others. If that outage lasts less than net_ticktime,
the queue declared on that broker misses messages.

I tried active/active mirrored queues, declaring them
with "x-ha-policy" = "all", and ran into an unpleasant issue:
if any node becomes unavailable, the command
# rabbitmqctl list_queues
hangs and I can't get any information.
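
As a stopgap for monitoring scripts, the hanging command can be wrapped in
a timeout so the caller at least gets control back. This is only a sketch:
the function name and the 10 second default are arbitrary choices for the
example, and it does nothing about the underlying cause of the hang.

```python
import subprocess


def list_queues(timeout_s=10.0, command=("rabbitmqctl", "list_queues")):
    """Run the command and return its stdout, or None if it timed out.

    The default command is the one from this thread. The timeout value
    is an assumption for illustration, not a recommended setting.
    """
    try:
        result = subprocess.run(list(command), capture_output=True,
                                text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None
    return result.stdout
```

A None result then means "the broker did not answer in time" rather than
leaving the script blocked indefinitely.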

That's why I looked at the federation plugin.

I'll continue playing with HA queues and hope they will solve my
problem. I will keep you informed of the results of my experiments.
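
For reference, declaring a queue with "x-ha-policy" = "all" looks roughly
like the sketch below. The function name is made up for the example, and
the channel is assumed to be a pika channel; the only part taken from my
setup is the "x-ha-policy" argument itself.

```python
def declare_mirrored_queue(channel, queue):
    """Declare a queue mirrored to all cluster nodes via a queue argument.

    `channel` is assumed to be a pika channel; `queue` is whatever name
    the caller picks.
    """
    channel.queue_declare(queue=queue, durable=True,
                          arguments={"x-ha-policy": "all"})
```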

--
Best regards,
Artsiom