[rabbitmq-discuss] Clustering - just can't get it going

Allan Baker a.baker at irisat.mx
Wed Sep 25 19:03:00 BST 2013


Hello Derek

Here are three things I would recommend, based on my experience with 
issues similar to yours.

1) Make sure that every package on your system is updated to the latest 
version available for your Linux distribution (or OS version).
In my case, I apparently had permission problems with Erlang and a 
segmentation fault. I added the Debian repository along with the 
Squeeze repositories, installed the latest version, and it worked. 
Note that this applies only to Debian Squeeze (6) and RabbitMQ 3.1.5.

2) Make sure that the hosts' ports are not restricted by any firewall 
or TCP Wrappers software. Check that you have no restrictive entries 
in /etc/hosts.allow and /etc/hosts.deny, and that your network 
equipment (router/switch) doesn't impose any port or network restrictions.

3) Test the connectivity and the ports. You can run "telnet hostname 
port" to check whether each host can reach the ports on the other. 
I would particularly recommend testing this against the RabbitMQ ports 
and then simply disconnecting.
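
As a rough sketch of that check for hosts without telnet (assuming 
Python 3 is installed; the name check_port is only illustrative, and 
4369 and 5672 are the default epmd and AMQP ports, which your setup may 
override), something like this resolves the name and then attempts a 
TCP connection:

```python
import socket
import sys

EPMD_PORT = 4369  # default epmd port (assumption: not overridden)
AMQP_PORT = 5672  # default AMQP port (assumption: not overridden)


def check_port(host, port, timeout=3.0):
    """Return 'unresolvable', 'unreachable' or 'open' for host:port."""
    try:
        # Uses the operating system's resolver (hosts file, then DNS,
        # depending on the system configuration).
        address = socket.gethostbyname(host)
    except socket.gaierror:
        return "unresolvable"
    try:
        # Open a TCP connection and close it again immediately.
        with socket.create_connection((address, port), timeout=timeout):
            return "open"
    except OSError:
        # Refused, timed out, or otherwise blocked.
        return "unreachable"


if __name__ == "__main__":
    # Hypothetical usage: python check_ports.py RMQ1 RMQ2
    for host in sys.argv[1:]:
        for port in (EPMD_PORT, AMQP_PORT):
            print(host, port, check_port(host, port))
```

Keep in mind that this goes through the operating system's resolver, so 
an "open" result does not guarantee that Erlang resolves the name the 
same way. In the thread quoted below, ping and telnet worked while 
rabbitmqctl still reported nxdomain.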

I hope this helps.

Regards,
Allan Baker

On 25/09/2013 10:27 a.m., Derek Wyatt wrote:
> This is rabbitmq 3.1.5 - I'm not sure what the erlang version is, but 
> the erts version is 5.8.5.  I'm a little new to the whole erlang 
> thing, so I just picked a component with 'e' in it :P
>
> I'm planning to get these machines into the DNS proper to see if that 
> helps.  It would be pretty weird if it works, since everyone else 
> works OK with just /etc/hosts resolution, but it's worth a try.
>
>
> On 25 September 2013 11:16, Simon MacMullen <simon at rabbitmq.com> wrote:
>
>     You may (depending on Erlang version) need to make sure that each
>     machine can resolve its own hostname too, as well as the other one.
>
>     Cheers, Simon
>
>     On 25/09/13 16:04, Derek Wyatt wrote:
>
>         Damn. That's exactly the setup I have.
>
>
>         On 25 September 2013 10:55, Robin Lawrie - HostelBookers
>         <Robin.Lawrie at hostelbookers.com> wrote:
>
>             Hi,
>
>             In my case, I have a 2-node cluster (called cache1 and
>             cache2), and I needed to add an entry to the hosts file on
>             both nodes to ensure each node can resolve the name of the
>             other node before clustering worked for me.
>
>             My hosts file is in /etc and is called hosts
>
>             In there I entered the following:
>
>             On cache1.lon.hosting, enter the line 192.168.3.1 Cache2.domain.com Cache2
>
>             On cache2.lon.hosting, enter the line 192.168.3.0 Cache1.domain.com Cache1
>
>             Once done, I needed to confirm I could ping each node using
>             its hostname from the other node. I don't care about DNS or
>             nslookup working/resolving the name.
>
>             HTH
>
>             Robin
>
>             *From:* rabbitmq-discuss-bounces at lists.rabbitmq.com
>             [mailto:rabbitmq-discuss-bounces at lists.rabbitmq.com]
>             *On Behalf Of* Derek Wyatt
>             *Sent:* 25 September 2013 15:47
>             *To:* Discussions about RabbitMQ
>             *Subject:* Re: [rabbitmq-discuss] Clustering - just can't get it going
>
>             Ah, I do have more information though:
>
>             DIAGNOSTICS
>             ===========
>
>             nodes in question: ['RMQ1']
>
>             hosts, their running nodes and ports:
>             - unable to connect to epmd on RMQ1: nxdomain (non-existing domain)
>
>             current node details:
>             - node name: 'rabbitmqctl1577 at RMQ2'
>             - home dir: /var/lib/rabbitmq
>             - cookie hash: ohQKEF09peb6bAgNqawvKA==
>
>             And just to be clear, the cookie is the same:
>
>             01:~$ sudo md5sum /var/lib/rabbitmq/.erlang.cookie
>             a2140a105d3da5e6fa6c080da9ac2f28  /var/lib/rabbitmq/.erlang.cookie
>
>             02:~$ sudo md5sum /var/lib/rabbitmq/.erlang.cookie
>             a2140a105d3da5e6fa6c080da9ac2f28  /var/lib/rabbitmq/.erlang.cookie
>
>             Somehow, telnet to epmd works just fine, but something that
>             RMQ is doing fails to make that happen.  Is there some sort
>             of DNS work that it's doing, instead of using the hosts files?
>
>             i.e. one thing I found is that nslookup fails:
>
>             02:~$ nslookup RMQ1
>             ;; Got SERVFAIL reply from <ipaddress>, trying next server
>             Server:       <ipaddress>
>             Address:  <ipaddress>
>
>             ** server can't find RMQ1: SERVFAIL
>
>             But if I ping RMQ1 it works fine. /etc/nsswitch.conf
>             specifies that files should be tried first, before DNS,
>             w.r.t. hosts.
>
>             So, it looks like RMQ is doing something more rigorous to
>             resolve the host, and I don't know how to change that.  I
>             also don't have access to the DNS server configuration in
>             order to modify it in any way.
>
>             On 25 September 2013 09:57, Jason McIntosh
>             <mcintoshj at gmail.com> wrote:
>
>             Check your Erlang cookie on both servers to make sure it
>             matches. I think it's in /var/lib/rabbitmq/. Then you can
>             use rabbitmqctl from one machine and see if you can connect
>             to the other to list queues.  I THINK that's
>             rabbitmqctl -n <servernode> list_queues, for example.  If
>             both servers can talk to each other, then it should be
>             rabbitmqctl stop_app, join_cluster, start_app.
>
>             Jason
>
>             On Wed, Sep 25, 2013 at 8:50 AM, Derek Wyatt
>             <derek at derekwyatt.org> wrote:
>
>             Hi,
>
>             I've seen a number of people failing to get clustering
>             running and, unfortunately, I can't get it going either.
>             Here's the summary of what I've got:
>
>               * Two nodes - RMQ1 and RMQ2
>               * I can ping RMQ1 from RMQ2, and vice versa
>               * I can telnet from RMQ1 to RMQ2:epmd, and vice versa
>               * I can telnet from RMQ1 to RMQ2:amqp, and vice versa
>               * The cookie file is identical, as is clear from the startup INFO
>
>             My goal is to have RMQ2 join RMQ1 in a cluster.
>
>             The servers are started using the init script in Ubuntu
>             (i.e. service rabbitmq-server start).  This is different
>             than the script at http://www.rabbitmq.com/clustering.html,
>             which says to start with "rabbitmq-server -detached".  I've
>             tried that and it doesn't seem to make any difference so I
>             always use the init script instead.
>
>             So, the script says to stop the RMQ2 server and then join
>             the cluster.  The following transcript shows how well all
>             this goes:
>
>             02:~$ sudo rabbitmqctl stop_app
>             Stopping node 'rabbit at RMQ2' ...
>             ...done.
>
>             02:~$ sudo rabbitmqctl join_cluster --ram rabbit at RMQ1
>             Clustering node 'rabbit at RMQ2' with 'rabbit at RMQ1' ...
>             Error: {cannot_discover_cluster,"The nodes provided are either offline or not running"}
>
>             However, as I said above, telnetting to the ports works just fine:
>
>             02:~$ telnet RMQ1 epmd
>             Trying <ip address>...
>             Connected to RMQ1
>             Escape character is '^]'.
>             booger!
>             Connection closed by foreign host.
>
>             02:~$ telnet RMQ1 amqp
>             Trying <ip address>...
>             Connected to RMQ1
>             Escape character is '^]'.
>             booger!
>             AMQP Connection closed by foreign host.
>
>             I'm stuck for what else to test.  Does anyone know how to
>             troubleshoot this thing further?
>
>             Thanks,
>
>             Derek
>
>
>             _______________________________________________
>             rabbitmq-discuss mailing list
>             rabbitmq-discuss at lists.rabbitmq.com
>             https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>
>
>
>             --
>             Jason McIntosh
>             http://mcintosh.poetshome.com/blog/
>             573-424-7612
>
>
>
>     -- 
>     Simon MacMullen
>     RabbitMQ, Pivotal
