[rabbitmq-discuss] RabbitMQ Clustering in a changing network environment

Wed Jun 17 09:13:35 BST 2009

Anthony

This presentation covers approaches to *some*, but not all, of your questions:

http://skillsmatter.com/podcast/cloud-grid/rabbitmq-internal-architecture-tony-garnock-jones

Federation-for-availability is required in a large number of
interesting cases - of which yours is a great example.  As we speak,
the AMQP working group is looking at some of these questions which -
as I think you imply - speak equally to networking concerns such as
addressing and multihoming.

Re your questions about 'nodes' below: the answer depends on whether
'node' explicitly means an Erlang node.  In the presentation above,
routing between clusters can be, but does not have to be, based on
Erlang's communication mechanisms.  So 'joining and leaving a group'
might have a different meaning for federated brokers (consisting of
"broker nodes") than it does within a cluster (of "erlang nodes").
We are looking at the semantics of this during summer 2009.  It's an
interesting area and there is no 'one' answer at the moment.
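
To make the "does not have to be based on erlang's communication
mechanisms" point concrete, here is a toy sketch of a relay that only
ever connects out from a site and holds messages locally while the
inter-site link is down.  Everything in it (Broker, OutboundRelay,
pump, the site names) is hypothetical and made up for illustration;
it is not RabbitMQ's API or a description of how federation will work.

```python
from collections import deque


class Broker:
    """Toy stand-in for a per-site broker (hypothetical, not RabbitMQ's API)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # messages waiting at this site

    def publish(self, message):
        self.queue.append(message)


class OutboundRelay:
    """Forwards messages from a local broker to a remote one.

    The relay is assumed to dial out from the local site, so it keeps
    working when that site is behind NAT and cannot accept connections.
    """

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.link_up = True

    def pump(self):
        """Forward everything queued locally; return the number forwarded."""
        if not self.link_up:
            return 0  # link is down: messages stay queued locally
        forwarded = 0
        while self.local.queue:
            self.remote.publish(self.local.queue.popleft())
            forwarded += 1
        return forwarded


def demo():
    site_a = Broker("site-a")  # assumed to be behind NAT on the 3G link
    hub = Broker("hub")        # assumed to be reachable from site-a
    relay = OutboundRelay(site_a, hub)

    relay.link_up = False      # inter-site link drops
    site_a.publish("reading-1")
    site_a.publish("reading-2")
    held = relay.pump()        # nothing crosses while the link is down

    relay.link_up = True       # alternate link comes up
    sent = relay.pump()        # backlog is forwarded in publish order
    return held, sent, list(hub.queue), list(site_a.queue)
```

Calling demo() shows the backlog staying at the cut-off site until
the link returns.  A real deployment would also need acknowledgements
and persistence, which this sketch leaves out.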

alexis

On Wed, Jun 17, 2009 at 8:10 AM, Anthony <anthony-rabbitmq at hogan.id.au> wrote:
> We have several sites which run applications which hook into a
> RabbitMQ implemented AMQP bus. At the moment, there's a single site
> with a rabbit server, and everything connects into that, but obviously
> as things grow and more apps request data etc., it seems like it might
> be more efficient for each site to have its own RabbitMQ server to act
> as somewhat of a concentrator and perform intelligent routing between
> the sites as well as reduce the need for the transmission of duplicate
> messages across the inter-site links.
>
> What isn't quite so clear is how one might interconnect several sites
> with respect to rabbit clustering when the link between one site and
> others changes, if one site can only "connect out" and sites can't
> connect back to it, and what happens if the node that contains a given
> message queue dies.
>
> In our specific case, 3G wireless data links in Australia often have
> internal, non-routable, pre-NAT'd addresses assigned to them, such
> that when a site's primary ADSL or other connection fails and the
> firewall router switches on the 3G link, other sites are no longer
> able to connect in - this site is only able to connect out.
>
> In the event of an inter-site link dropping, it's obviously important
> to us that rabbit keep going and where alternate links come up, that
> rabbit leverage those to keep the client network config simple and
> keep everything running.
>
> Reading through the rabbitmq docs, it seems the clustering is pretty
> much all done by Erlang - a language/environment I admit I have
> little experience with.
>
> When nodes join a cluster, do they then attempt to message one another
> directly or go through the node they identified with specifically when
> joining the cluster?
> If they communicate via the specific node they identified with - can
> they identify directly with several specific nodes?
> Is this through a persistent connection, or something transient?
> Are there any caveats with NAT (including the case where one can't map
> a port back in) or dynamic IPs?
> How does the whole nodename/hostname thing work/resolve?
>
> It also sounded like each site, if it could be arguably cut off from
> every other one, should be a disk node?
>
> I guess I'm askin' a lot of questions here - reading the clustering
> doc suggests it's primarily aimed for environments where each cluster
> is located at a site with a single, pretty reliable connection with a
> static IP and hostname which isn't always the case in our
> environment...