[rabbitmq-discuss] RabbitMQ and a two-site deployment connected via WAN.

Daniel Pittman daniel at rimspace.net
Tue May 4 14:11:38 BST 2010


Simon MacMullen <simon at rabbitmq.com> writes:
> On 04/05/10 11:29, Daniel Pittman <daniel at rimspace.net> said:
>> G'day.
>>
>> I am looking at deploying RabbitMQ to provide messaging services inside, and
>> between, our two sites using both the AMQP and STOMP interfaces.
>>
>> As far as I can tell we have two choices for deploying this:
>>
>> 1. Use an Erlang cluster over the WAN, as a single instance of RabbitMQ
>>
>> 2. Use the 'shovel' extension to push messages between two distinct RabbitMQ
>>     clusters, one at each site.
>>
>> The only real discussion I can find about this is from 35 weeks ago, where the
>> comment was that using shovel might better handle a situation where the WAN
>> was down, but that it was not production ready.
>
> As an aside, Matthew now tells me that the shovel is "production ready", 
> although at the moment it still requires compiling the broker, Erlang 
> client and shovel from source.

Help from another user here, plus a fair bit of struggling, got me as far as
building rabbitmq-stomp from source into a Debian package, so building those
wouldn't be too terrible, if need be.

I figured in most of a year things might have moved on; is it still likely
that shovel will avoid losing messages across a WAN split where Erlang
clustering would lose them?
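
To make the question concrete, here is a toy model of the store-and-forward
behaviour being asked about: messages wait locally until the remote side
confirms them, so a WAN outage delays delivery rather than dropping it.  This
is only a sketch of the general idea, not the shovel's actual implementation;
the StoreAndForward class, the send callback and the simulated link are all
hypothetical names made up for illustration.

```python
from collections import deque


class StoreAndForward:
    """Toy model: hold messages locally until the remote side confirms them."""

    def __init__(self, send):
        self.send = send        # hypothetical callable; returns True on confirm
        self.pending = deque()  # messages not yet confirmed by the remote site

    def publish(self, message):
        self.pending.append(message)
        self.flush()

    def flush(self):
        # Forward in order and stop at the first failure, so nothing is
        # dropped or reordered while the link is down.
        while self.pending:
            if not self.send(self.pending[0]):
                break
            self.pending.popleft()


# Simulated WAN link that can be switched off and on.
link = {"up": False}
delivered = []


def send_over_wan(message):
    if link["up"]:
        delivered.append(message)
    return link["up"]


relay = StoreAndForward(send_over_wan)
relay.publish("a")
relay.publish("b")           # link is down: both messages wait locally
delivered_while_down = list(delivered)

link["up"] = True
relay.flush()                # link is back: both are forwarded, in order
```

In this model a message leaves the local buffer only after the far end has
accepted it, which is the property that matters during a WAN split.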

>> My impression is that, generally speaking, we are likely to be best served by
>> using an Erlang cluster over the network: the WAN is fairly reliable, and will
>> soon have two redundant paths, further reducing the risk of a split cluster.

[...]

>> Erlang and/or RabbitMQ will ensure that only one copy of a message
>> traverses the WAN, even if there are multiple recipients at the remote
>> site.
>
> Not in 1.7.2. Versions of this optimisation existed in previous versions
> of RabbitMQ, but have since been disabled since they broke some ordering
> guarantees.

OK.  I do value stability and correctness over performance.  So in 1.7.2, is
this going to be one copy per subscriber on the remote node over the WAN?
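
If the answer is yes, the cost is simple to estimate.  The sketch below
compares payload traffic with one copy per remote subscriber against one copy
per remote node.  The function name and the workload figures (2 KiB messages,
50 per second, 4 remote subscribers) are assumptions chosen for illustration,
and protocol overhead is ignored.

```python
def wan_bytes_per_second(message_bytes, messages_per_second,
                         remote_subscribers, one_copy_per_node=False):
    """Estimate payload bytes per second crossing the WAN for fanned-out messages."""
    copies = 1 if one_copy_per_node else remote_subscribers
    return message_bytes * messages_per_second * copies


# Hypothetical workload: 2 KiB messages, 50 per second, 4 remote subscribers.
per_subscriber = wan_bytes_per_second(2048, 50, 4)
single_copy = wan_bytes_per_second(2048, 50, 4, one_copy_per_node=True)

print(per_subscriber)  # 409600 bytes/s with one copy per subscriber
print(single_copy)     # 102400 bytes/s with one copy per remote node
```

The ratio between the two is just the number of remote subscribers, so the
difference only matters when fan-out across the WAN is large.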

> There's a new, correct, version of this optimisation on branch bug19844 
> in Mercurial (so, err, requiring you to compile from source again). It's 
> in a fairly reasonable state and hopefully should get merged into the 
> default branch soon and thus find its way into the next release. You are 
> very welcome, and indeed encouraged, to test it out before that happens.

Depending, I might well do so, but given we are testing the message queue
system, we are very unlikely to actually find problems with this.

[...]

>> Further, is there anything useful that I can read about Erlang and RabbitMQ
>> WAN clustering?  My research, to date, has not given me much to learn from,
>> which makes me wonder what hidden complexity I might be risking...
>
> I assume you've seen the clustering guide:
> http://www.rabbitmq.com/clustering.html

Yup.  Very useful, and a good guide.  I felt quite comfortable that it covered
pretty much everything RabbitMQ-specific I needed to know about clustering.

> (but that's pretty basic, I admit), and the Distributed Erlang manpage:
> http://www.erlang.org/doc/reference_manual/distributed.html
> (or at least the section on security).

Yup.  Again, useful, though I need to invest a little more time in
understanding how best to set and share the cookie if I do go down this path.
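
For reference, a minimal sketch of the "set" half of that problem.
Distributed Erlang reads the cookie from $HOME/.erlang.cookie of the user
running the node, expects the file to be accessible by its owner only, and
requires every node in the cluster to hold the same value.  The write_cookie
helper, the 20-letter length and the temporary demo path are assumptions for
illustration; getting the same file onto the other site securely is a
separate step not shown here.

```python
import os
import secrets
import string
import tempfile


def write_cookie(path, length=20):
    """Write a random cookie to `path`, readable only by its owner."""
    alphabet = string.ascii_uppercase
    cookie = "".join(secrets.choice(alphabet) for _ in range(length))
    # O_EXCL refuses to overwrite an existing cookie; 0o400 is owner read-only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "w") as handle:
        handle.write(cookie)
    return cookie


# Demo against a throwaway directory rather than a real home directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, ".erlang.cookie")
cookie = write_cookie(demo_path)
```

Refusing to overwrite an existing file is deliberate: replacing the cookie on
a node that is already clustered would stop it talking to its peers.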

What I am missing is not so much a reference or admin guide, but a practical
implementation FAQ:

I want to understand better what, for example, high latency or low bandwidth
links mean in the context of a WAN Erlang cluster: if latency gets high
enough, do messages time-out and get retried?  Is this congestion-controlled?

When the Erlang messages start to back up, do they take down the cluster?

I *think* the answer is that Erlang is good about this, and time-out control
is application specific, but there doesn't seem to be a good guide to what I
*should* be worrying about.

> Yes, we could probably do with more documentation...

The RabbitMQ documentation is actually pretty darn good, as far as I am
concerned; a large part of why your implementation was the one I chose to
trial was that it was easy to understand from what you have written.

Now, the Erlang platform, that could do with a guidebook in addition to the
reference manual. :)

        Daniel
-- 
✣ Daniel Pittman            ✉ daniel at rimspace.net            ☎ +61 401 155 707
               ♽ made with 100 percent post-consumer electrons



