[rabbitmq-discuss] rabbitmq 2.6.1 cluster failure recovery
Simon MacMullen
simon at rabbitmq.com
Mon Oct 3 12:26:55 BST 2011
Hi Alain.
When you see timeout_waiting_for_tables, that should mean that the node
you're trying to start:
* Could not find any other cluster nodes running
* Was not the last node to shut down
From your explanation it sounds like node-1 *is* running while you
restart node-2 - is that correct? In that case, can node-2 definitely
see node-1? (i.e. it can ping cumulonimbus)
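If you want something a little stronger than ping, here is a minimal sketch of a reachability check you could run from node-2. It only tests that the host name resolves and that a TCP connection to the Erlang port mapper daemon (epmd, default port 4369) succeeds. The function name `can_reach` and the 3-second timeout are my own illustrative choices, not part of RabbitMQ. Passing this check does not prove clustering will work, since the nodes also need matching cookies and an open distribution port.

```python
import socket

EPMD_PORT = 4369  # default port of the Erlang port mapper daemon (epmd)


def can_reach(host: str, port: int = EPMD_PORT, timeout: float = 3.0) -> bool:
    """Return True if `host` resolves and accepts a TCP connection on `port`.

    Hypothetical helper for illustration only; it does not speak the epmd
    protocol and does not check the Erlang cookie.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name-resolution failures, refused connections and timeouts.
        return False


# Example usage on node-2 ("cumulonimbus" is node-1's host name in this thread):
#     print("node-1 reachable:", can_reach("cumulonimbus"))
```

A False result here would point at name resolution or a firewall rather than at RabbitMQ itself.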
Cheers, Simon
On 01/10/11 01:25, Alain Dazzi wrote:
> Hi,
>
> I can't get my rabbitmq cluster to recover from a dead node. So
> perhaps someone can help ...
>
> node-1 (cumulonimbus)
> Linux cumulonimbus 2.6.38-11-server #50-Ubuntu SMP Mon Sep 12 21:34:27
> UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
> ii rabbitmq-server 2.6.1-1
> root@cumulonimbus:~# ls -1 /usr/lib/rabbitmq/lib/rabbitmq_server-2.6.1/plugins/
> amqp_client-2.6.1.ez
> mochiweb-1.3-rmq2.6.1-git9a53dbd.ez
> rabbitmq_management-2.6.1.ez
> rabbitmq_management_agent-2.6.1.ez
> rabbitmq_management_visualiser-2.6.1.ez
> rabbitmq_mochiweb-2.6.1.ez
> README
> webmachine-1.7.0-rmq2.6.1-hg0c4b60a.ez
>
>
> node-2 (nuage-informatique)
> Linux nuage-informatique 2.6.38-11-generic #50-Ubuntu SMP Mon Sep 12
> 21:17:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
> ii rabbitmq-server 2.6.1-1
>
> 1/ stop both servers and set up the same .erlang.cookie value; restart nodes
>
> 2/ on node-1 I create a cluster
> rabbitmqctl stop_app
> rabbitmqctl reset
> rabbitmqctl cluster rabbit@nuage-informatique rabbit@cumulonimbus
> Clustering node rabbit@cumulonimbus with ['rabbit@nuage-informatique',
> rabbit@cumulonimbus] ...
> ...done.
>
> 3/ This creates 2 disc nodes !!!
>
> 4/ run a test and pass data successfully
>
> 5/ restart node-2 (service rabbitmq-server stop)
> service rabbitmq-server start ... fails with ...
> root@nuage-informatique:~/Desktop# service rabbitmq-server start
> Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_{log, _err}
> rabbitmq-server.
> Erlang has closed
>
> Crash dump was written to: erl_crash.dump
> Kernel pid terminated (application_controller)
> ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{rabbit,failure_during_boot}}}}})
>
> Activating RabbitMQ plugins ...
> 1 plugins activated:
> * rabbitmq_management_agent-2.6.1
>
>
> +---+   +---+
> |   |   |   |
> |   |   |   |
> |   |   |   |
> |   +---+   +-------+
> |                   |
> | RabbitMQ  +---+   |
> |           |   |   |
> |   v2.6.1  +---+   |
> |                   |
> +-------------------+
> AMQP 0-9-1 / 0-9 / 0-8
> Copyright (C) 2007-2011 VMware, Inc.
> Licensed under the MPL. See http://www.rabbitmq.com/
>
> node : rabbit@nuage-informatique
> app descriptor :
> /usr/lib/rabbitmq/lib/rabbitmq_server-2.6.1/sbin/../ebin/rabbit.app
> home dir : /var/lib/rabbitmq
> config file(s) : (none)
> cookie hash : qHpvLciGsi5o4f8ScVzyWg==
> log : /var/log/rabbitmq/rabbit@nuage-informatique.log
> sasl log : /var/log/rabbitmq/rabbit@nuage-informatique-sasl.log
> database dir : /var/lib/rabbitmq/mnesia/rabbit@nuage-informatique
> erlang version : 5.7.4
>
> -- rabbit boot start
> starting file handle cache server ...done
> starting worker pool ...done
> starting database
> ...BOOT ERROR: FAILED
> Reason: {error,
> {timeout_waiting_for_tables,
> [rabbit_user,rabbit_user_permission,rabbit_vhost,
> rabbit_durable_route,rabbit_durable_exchange,
> rabbit_durable_queue]}}
> Stacktrace: [{rabbit_mnesia,wait_for_tables,1},
> {rabbit_mnesia,check_schema_integrity,0},
> {rabbit_mnesia,ensure_schema_integrity,0},
> {rabbit_mnesia,init,0},
> {rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
> {rabbit,run_boot_step,1},
> {rabbit,'-start/2-lc$^0/1-0-',1},
> {rabbit,start,2}]
> {"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{rabbit,failure_during_boot}}}}}"}
>
> At this point I have to re-install node-2 to recover.
>
> Any idea why?
>
> Thank you,
>
> Next I would like to test mirrored queues, but obviously this has to work first...
>
> -Alain
--
Simon MacMullen
RabbitMQ, VMware