[rabbitmq-discuss] Question about initial cluster connect & client-side cluster awareness

Thu Sep 18 17:50:57 BST 2008

Hi Holger,

Holger Hoffstätte wrote:
> I've been experimenting with a cluster of rabbits (keeping the current
> limitations in mind) and was wondering about the client side of things.
> How should the problem of the initial connection be approached? As far as
> I can tell there are no built-in mechanisms in place to have a client
> 
> - dynamically discover a broker or a cluster of nodes on a network
> 
> - try to connect to a number of hosts in order to get an initial connection
Correct, there seems to be no built-in mechanism for this, probably because it's hard to 
come up with a one-size-fits-all solution for service discovery.

> - be aware of the available cluster nodes via the known_hosts field (as
> described in the docs) so that when that node dies, the client can
> automatically reconnect to a live member
Speaking of known_hosts, I ran into several issues in this area that you might want to 
know about.

1. Rabbit populates known_hosts with hostnames as *it* knows them. For example, if rabbit 
at host1 knows that its second cluster node runs on 10.1.1.1, and 10.1.1.1 is listed in 
host1:/etc/hosts as "10.1.1.1 foo", it will populate known_hosts with 
"rabbit@host1,rabbit@foo". Note that in this case foo needs to be resolvable by your 
clients as well, otherwise the entry is useless to them.

2. known_hosts appears useless if your clients reside in different network segments. 
Consider some clients inside a firewall connecting to rabbit as 10.1.1.1 and clients 
outside the firewall (from the Internet) connecting to rabbit as 216.216.216.216. In this 
case, the names appearing in known_hosts need to resolve differently depending on where 
the client is coming from (there are also issues with resolving a non-FQDN such as "foo" 
outside the firewall). With NAT and VIPs this becomes very difficult to manage in the 
general case.
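
To make the resolvability point concrete, here is a minimal client-side sketch (Python, 
not part of any RabbitMQ client library) that takes a known_hosts string and keeps only 
the names this particular client can resolve. It assumes comma- or space-separated 
entries with an optional "node@" prefix and no port suffix; parse_known_hosts and 
resolvable_hosts are hypothetical helper names.

```python
import socket


def parse_known_hosts(known_hosts):
    """Split a known_hosts string into bare host names.

    Assumed format: entries separated by commas and/or spaces,
    each optionally prefixed with an Erlang-style "node@".
    """
    hosts = []
    for entry in known_hosts.replace(",", " ").split():
        # Keep only the part after the last "@", if there is one.
        hosts.append(entry.rpartition("@")[2])
    return hosts


def resolvable_hosts(known_hosts, resolve=socket.gethostbyname):
    """Return the hosts from known_hosts that resolve on *this* client."""
    usable = []
    for host in parse_known_hosts(known_hosts):
        try:
            resolve(host)
        except OSError:
            # socket.gaierror is a subclass of OSError: name not resolvable here.
            continue
        usable.append(host)
    return usable
```

A name that resolves may still point at an address the client cannot reach (NAT, VIPs), 
so a check like this only filters out the obviously unusable entries.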

All in all, I personally found the best approach is not to rely on known_hosts at all, and 
to tell each client (individually or as a group) the IP addresses at which it can connect 
to rabbit. I also found the connection.redirect response to be useless in this scenario, 
so I always set insist to true when opening a connection. For load balancing, I use 
TCP-level tools like haproxy instead of AMQP-level techniques. My clients each implement 
reconnect logic individually.
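
As a rough illustration of that kind of client-side logic, the sketch below walks a 
configured list of broker addresses in order and retries the whole list with a growing 
delay. It only opens the TCP connection; the AMQP handshake (including the insist flag) 
is left to whatever client library is in use. The function name, the parameters and 
the backoff policy are assumptions for the example, not taken from any particular client.

```python
import socket
import time


def connect_first_available(addresses, attempts=3, delay=1.0,
                            connect=socket.create_connection,
                            sleep=time.sleep):
    """Try each (host, port) in order; retry the whole list with backoff.

    Returns the first connection that succeeds, or raises ConnectionError
    once every address has failed in every attempt.
    """
    last_error = None
    for attempt in range(attempts):
        for host, port in addresses:
            try:
                # Second positional argument is the timeout in seconds.
                return connect((host, port), 5.0)
            except OSError as exc:
                last_error = exc
        if attempt < attempts - 1:
            # Exponential backoff between passes: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** attempt)
    raise ConnectionError(
        "no broker reachable in %r" % (addresses,)
    ) from last_error
```

A client inside the firewall would be configured with something like 
[("10.1.1.1", 5672)], one outside with [("216.216.216.216", 5672)], and would call the 
same function again whenever its connection drops.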

Please note that the known_hosts issues I described above are not specific to RabbitMQ. 
This is just the way AMQP (0-8 at least) is - it provides a means for cluster node 
discovery after the initial bootstrap only if all your clients are in the same network 
address space and have access to the same name resolution services. AMQP experts - I would 
love to hear your opinion on this one.

I am hoping we get more people on this list running rabbit clusters :)

Regards,
Dmitriy Samovskiy