[rabbitmq-discuss] RabbitMQ: Joining a cluster - help required to understand rabbitmq commands: start_app / stop_app

Phil Lazarou phil.lazarou at gmail.com
Wed Dec 4 10:39:04 GMT 2013

Hi,

I've inherited a system from a departing user and it seems to work, but
there are intermittent problems and I feel that if I understood what was
going on, then I could help support it better.

Please note this is a complicated system using elastic cloud instances and
load balancers, but the principle is that when a new instance starts, it
needs to join the rabbit cluster. I do appreciate that asking for help when
trying to debug my system may be a "hiding to nothing" but I just want to
try and understand the joining cluster aspect.

Whenever a new server instance starts, a custom bash script (written by my
company) runs. Here is the rabbit part of the script:

2: echo "Stopping Rabbit on localhost"
3: while [[ -z $RABBIT_HOST ]]; do
4:    echo `date` "Trying to find Rabbit Host to cluster"
5:    sudo rabbitmqctl start_app
6:    sleep $WAIT_TIME
7:    sudo rabbitmqctl stop_app
8:    RABBIT_HOST=`curl -s -u guest:guest $PRIVATE_LOAD_BALANCER_HOSTNAME:15672/api/nodes | /home/ubuntu/bin/jq -r 'map(select(.running and .type == "disc")) | .[0].name'`
9:    if [ -n $RABBIT_HOST ]; then
10:        echo `date` "Found Rabbit Host"
11:   fi
12: done
14: echo `date` ": Joining Rabbit with cluster on " $RABBIT_HOST
15: sudo rabbitmqctl join_cluster $RABBIT_HOST
16: echo "Starting Rabbit on localhost"
17: sudo rabbitmqctl start_app

Here are my questions:

Line 2 : why do we want to "stop rabbit"? I feel this comment is very
misleading, since RabbitMQ as an entity will not stop. What would "stopping
rabbit" achieve?

Lines 5-7 : I can see that the rabbit command start_app is executed,
followed by a wait then a stop_app. Why? What does this achieve? Why start
then stop? Can these steps just be missed out?

Line 8: curl and jq are used to query the management API behind the load
balancer and assign the name of the first running disc node to the variable
RABBIT_HOST. I get this, since we want to cluster this instance with a node
behind the load balancer.
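
One detail I wanted to check in isolation is how the result of line 8 is
tested on line 9. Below is a minimal sketch, not the original script: the
helper name have_host and the node name rabbit@node1 are made up for
illustration. It assumes two things about the tools: that an unquoted
[ -n $VAR ] is always true when the variable is empty, and that jq -r prints
the literal text "null" when .[0].name has no value.

```shell
#!/usr/bin/env bash
# have_host is a hypothetical helper, not part of the original script.
# It succeeds only when its argument looks like a usable node name.
have_host() {
    # Quoting matters: an unquoted [ -n $1 ] collapses to [ -n ] when $1 is
    # empty, and that one-argument test is always true.
    # jq -r prints the text "null" when no node matched, so reject that too.
    [ -n "$1" ] && [ "$1" != "null" ]
}

# Try a plausible node name, an empty result, and jq's "null" output.
for candidate in "rabbit@node1" "" "null"; do
    if have_host "$candidate"; then
        echo "usable: '$candidate'"
    else
        echo "not usable: '$candidate'"
    fi
done
```

If this reading is right, the loop on line 3 could end with RABBIT_HOST set to
"null", and line 15 would then try to join a node with that name.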

Line 12: If this is successful, the while-do loop ends and the command to
join the cluster is issued.

Line 15: We join the cluster. Fine, although can we only join a cluster
after "stop_app" has run?

Line 17: We run start_app again! Why???

I really don't understand this whole stop_app / start_app business. Can
anyone shed any light onto what this is trying to achieve?
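
To make the question concrete: stripped of the polling loop, the script
appears to end with the three rabbitmqctl calls below. This is only a sketch
of my reading, not the original code. The function name join_rabbit_cluster
and the RABBITMQCTL override (which allows a dry run without touching a real
node) are my own additions.

```shell
#!/usr/bin/env bash
# RABBITMQCTL is a hypothetical override so the sequence can be dry-run;
# by default it is the same command the original script uses.
RABBITMQCTL=${RABBITMQCTL:-sudo rabbitmqctl}

# join_rabbit_cluster is a hypothetical wrapper around the calls on
# lines 7, 15 and 17 of the original script.
join_rabbit_cluster() {
    local target_node=$1
    # Stop the RabbitMQ application on this node (line 7).
    $RABBITMQCTL stop_app || return 1
    # Join the cluster that target_node belongs to (line 15).
    $RABBITMQCTL join_cluster "$target_node" || return 1
    # Start the RabbitMQ application again (line 17).
    $RABBITMQCTL start_app
}
```

Setting RABBITMQCTL=echo before calling join_rabbit_cluster rabbit@node1 (a
made-up node name) prints the three subcommands in order instead of running
them.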
