[rabbitmq-discuss] Clustering - just can't get it going

Derek Wyatt derek at derekwyatt.org
Wed Sep 25 16:04:28 BST 2013


Damn. That's exactly the setup I have.


On 25 September 2013 10:55, Robin Lawrie - HostelBookers <
Robin.Lawrie at hostelbookers.com> wrote:

>  Hi,
>
>
>
> In my case, I have a 2 node cluster (called cache1 and cache2) and I
> needed to add an entry to the hosts file on both nodes to ensure each node
> can resolve the name of the other node before clustering worked for me.
>
>
>
> My hosts file is in /etc and is called hosts
>
>
>
> In there I entered the following:
>
>
>
> On cache1.lon.hosting, enter the line 192.168.3.1 Cache2.domain.com Cache2
>
> On cache2.lon.hosting, enter the line 192.168.3.0 Cache1.domain.com Cache1
>
>
>
> Once done, I needed to confirm I could ping each node using its hostname
> from the other node. I don’t care about DNS or nslookup working/resolving
> the name.
>
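The hosts-file step described above can also be checked without ping by looking the name up in the file directly. This is only a minimal sketch and not something posted in the thread: the helper name `hosts_lookup` is made up for illustration, and it compares names case-insensitively on the assumption that host names are matched that way.

```shell
#!/bin/sh
# Print the address that a hosts-format file maps to a host name.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
# Returns non-zero when the name has no entry.
hosts_lookup() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v want="$name" '
        { sub(/#.*/, "") }          # strip comments before splitting fields
        NF >= 2 {
            # Field 1 is the address; every later field is a name or alias.
            for (i = 2; i <= NF; i++)
                if (tolower($i) == tolower(want)) { print $1; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

With the entry from this message in place on cache1, `hosts_lookup Cache2` would print 192.168.3.1, and a missing entry shows up as a non-zero exit status.
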
>
>
> HTH
>
>
>
> Robin
>
>
>
> *From:* rabbitmq-discuss-bounces at lists.rabbitmq.com [mailto:
> rabbitmq-discuss-bounces at lists.rabbitmq.com] *On Behalf Of *Derek Wyatt
> *Sent:* 25 September 2013 15:47
> *To:* Discussions about RabbitMQ
> *Subject:* Re: [rabbitmq-discuss] Clustering - just can't get it going
>
>
>
> Ah, I do have more information though:
>
>
>
> DIAGNOSTICS
> ===========
>
> nodes in question: ['RMQ1']
>
> hosts, their running nodes and ports:
> - unable to connect to epmd on RMQ1: nxdomain (non-existing domain)
>
> current node details:
> - node name: 'rabbitmqctl1577@RMQ2'
> - home dir: /var/lib/rabbitmq
> - cookie hash: ohQKEF09peb6bAgNqawvKA==
>
>
>
> And just to be clear, the cookie is the same:
>
>
>
> 01:~$ sudo md5sum /var/lib/rabbitmq/.erlang.cookie
> a2140a105d3da5e6fa6c080da9ac2f28  /var/lib/rabbitmq/.erlang.cookie
>
> 02:~$ sudo md5sum /var/lib/rabbitmq/.erlang.cookie
> a2140a105d3da5e6fa6c080da9ac2f28  /var/lib/rabbitmq/.erlang.cookie
>
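The cookie hash in the diagnostics and the md5sum output look unrelated, but with the values in this thread they appear to be the same digest in two encodings: base64-encoding the 16 bytes of a2140a105d3da5e6fa6c080da9ac2f28 gives ohQKEF09peb6bAgNqawvKA==. A minimal sketch of that conversion, assuming a POSIX shell and awk; `hex_to_base64` is a hypothetical helper name, not a RabbitMQ tool.

```shell
#!/bin/sh
# Convert a hex digest (as printed by md5sum) to base64.
# Usage: hex_to_base64 HEXSTRING
hex_to_base64() {
    printf '%s\n' "$1" | awk '
        BEGIN { b64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/" }
        {
            hex = tolower($1); out = ""
            # Walk the digest three bytes (six hex digits) at a time.
            for (i = 1; i <= length(hex); i += 6) {
                chunk = substr(hex, i, 6)
                nbytes = length(chunk) / 2
                while (length(chunk) < 6) chunk = chunk "0"   # zero-pad a short final group
                v = 0
                for (j = 1; j <= 6; j++)
                    v = v * 16 + index("0123456789abcdef", substr(chunk, j, 1)) - 1
                # Split the 24-bit value into four 6-bit alphabet indexes.
                quad = ""
                for (k = 3; k >= 0; k--)
                    quad = quad substr(b64, int(v / (64 ^ k)) % 64 + 1, 1)
                # Keep nbytes + 1 characters and pad the rest with "=".
                quad = substr(quad, 1, nbytes + 1)
                while (length(quad) < 4) quad = quad "="
                out = out quad
            }
            print out
        }'
}
```

Running `hex_to_base64 a2140a105d3da5e6fa6c080da9ac2f28` reproduces the cookie hash shown in the diagnostics, which is consistent with rabbitmqctl on RMQ2 having read the same cookie file.
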
>
>
> Somehow, telnet to epmd works just fine, but whatever RMQ does to reach
> epmd fails.  Is it doing some sort of DNS lookup of its own instead of
> using the hosts file?
>
>
>
> i.e. one thing I found is that nslookup fails:
>
>
>
> 02:~$ nslookup RMQ1
> ;; Got SERVFAIL reply from <ipaddress>, trying next server
> Server:       <ipaddress>
> Address:  <ipaddress>
>
> ** server can't find RMQ1: SERVFAIL
>
>
>
> But if I ping RMQ1 it works fine.  /etc/nsswitch.conf specifies that
> files should be tried first, before DNS w.r.t. hosts.
>
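The difference between ping and nslookup is expected on its own: ping resolves names through the system's name service switch, which reads /etc/hosts first when nsswitch.conf lists files before dns, while nslookup queries the DNS servers directly and never reads /etc/hosts. The sketch below shows both views side by side. The function names are illustrative, it assumes `getent` is installed and that `nslookup` exits non-zero on failure, and it says nothing about which path RabbitMQ or Erlang actually takes.

```shell
#!/bin/sh
# Resolve a name through NSS, honouring /etc/nsswitch.conf (the path ping uses).
nss_lookup() {
    getent hosts "$1"
}

# Resolve a name through DNS only; /etc/hosts is not consulted.
dns_lookup() {
    nslookup "$1" >/dev/null 2>&1
}

# Print one line per lookup path for the given host name.
resolve_report() {
    if nss_lookup "$1" >/dev/null; then
        echo "NSS: $1 resolves"
    else
        echo "NSS: $1 does not resolve"
    fi
    if dns_lookup "$1"; then
        echo "DNS: $1 resolves"
    else
        echo "DNS: $1 does not resolve"
    fi
}
```

On a host set up like RMQ2 in this message, `resolve_report RMQ1` would be expected to report success for NSS and failure for DNS.
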
>
>
> So, it looks like RMQ is doing something more rigorous to resolve the
> host, and I don't know how to change that.  I also don't have access to the
> DNS server configuration in order to modify it in any way.
>
>
>
>
>
> On 25 September 2013 09:57, Jason McIntosh <mcintoshj at gmail.com> wrote:
>
> Check the Erlang cookie on both servers to make sure it matches; I think
> it's in /var/lib/rabbitmq/. Then use rabbitmqctl from one machine to see
> whether you can connect to the other and list its queues.  I THINK that's
> rabbitmqctl -n <servernode> list_queues, for example.  If both servers
> can talk to each other, it should then be rabbitmqctl stop_app,
> join_cluster, start_app.
>
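The three-step sequence above can be written down as a small dry-run helper. This is a sketch only: `join_cluster_steps` and the `RUN` switch are invented for illustration, the peer is given as a full node name such as rabbit@RMQ1, and the optional --ram flag used elsewhere in this thread is left out.

```shell
#!/bin/sh
# Print the rabbitmqctl commands for joining a cluster, in order.
# With RUN=1 in the environment the commands are executed via sudo instead.
# Usage: join_cluster_steps PEER_NODE     (e.g. rabbit@RMQ1)
join_cluster_steps() {
    peer=$1
    for step in "stop_app" "join_cluster $peer" "start_app"; do
        if [ "${RUN:-0}" = 1 ]; then
            # $step is left unquoted on purpose so that
            # "join_cluster PEER" splits into two arguments.
            sudo rabbitmqctl $step || return 1
        else
            echo "rabbitmqctl $step"
        fi
    done
}
```

Calling `join_cluster_steps rabbit@RMQ1` on the joining node prints the three commands so they can be reviewed before anything is changed.
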
> Jason
>
>
>
> On Wed, Sep 25, 2013 at 8:50 AM, Derek Wyatt <derek at derekwyatt.org> wrote:
>
> Hi,
>
>
>
> I've seen a number of people failing to get clustering running and,
> unfortunately, I can't get it going either.  Here's the summary of what
> I've got:
>
>    - Two nodes - RMQ1 and RMQ2
>    - I can ping RMQ1 from RMQ2, and vice versa
>    - I can telnet from RMQ1 to RMQ2:epmd, and vice versa
>    - I can telnet from RMQ1 to RMQ2:amqp, and vice versa
>    - The cookie file is identical, as is clear from the startup INFO
>
>  My goal is to have RMQ2 join RMQ1 in a cluster.
>
>
>
> The servers are started using the init script in Ubuntu (i.e. service
> rabbitmq-server start).  This is different than the script at
> http://www.rabbitmq.com/clustering.html, which says to start with
> "rabbitmq-server -detached".  I've tried that and it doesn't seem to make
> any difference so I always use the init script instead.
>
>
>
> So, the script says to stop the RMQ2 server and then join the cluster.
>  The following transcript shows how well all this goes:
>
>
>
> 02:~$ sudo rabbitmqctl stop_app
> Stopping node 'rabbit@RMQ2' ...
> ...done.
>
> 02:~$ sudo rabbitmqctl join_cluster --ram rabbit@RMQ1
> Clustering node 'rabbit@RMQ2' with 'rabbit@RMQ1' ...
> Error: {cannot_discover_cluster,"The nodes provided are either offline or not running"}
>
>
>
> However, as I said above, telnetting to the ports works just fine:
>
>
>
> 02:~$ telnet RMQ1 epmd
> Trying <ip address>...
> Connected to RMQ1
> Escape character is '^]'.
> booger!
> Connection closed by foreign host.
>
> 02:~$ telnet RMQ1 amqp
> Trying <ip address>...
> Connected to RMQ1
> Escape character is '^]'.
> booger!
> AMQP Connection closed by foreign host.
>
>
>
> I'm stuck for what else to test.  Does anyone know how to troubleshoot
> this thing further?
>
>
>
> Thanks,
>
> Derek
>
>
>
>
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>
>
>
>
>
> --
> Jason McIntosh
> http://mcintosh.poetshome.com/blog/
> 573-424-7612
>
>