[rabbitmq-discuss] If 2 nodes out of a 3 node cluster go down, the third one becomes unresponsive until one of the nodes is brought back.

Yamil Einar Asusta Santos yamil.asusta at upr.edu
Tue Aug 6 17:53:29 BST 2013


I repeated the test with 4 nodes in the cluster, and after just 2 nodes 
went down the whole system became unresponsive.
The log is here: https://gist.github.com/elbuo8/e5171ec85608b7bd7842
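
One possible explanation, stated here as an assumption rather than a 
confirmed diagnosis: the quoted configuration sets 
cluster_partition_handling to pause_minority, and that mode is documented 
to pause any node that cannot see more than half of the cluster. The sketch 
below only works through that arithmetic for the cases in this thread; 
has_majority is a hypothetical helper written for illustration, not a 
RabbitMQ API.

```python
def has_majority(total_nodes: int, running_nodes: int) -> bool:
    """Return True when the running nodes are strictly more than half the cluster."""
    return 2 * running_nodes > total_nodes


# Scenarios described in this thread: (cluster size, nodes stopped).
scenarios = [(3, 1), (3, 2), (4, 2)]

for total, stopped in scenarios:
    running = total - stopped
    # Assumption: a side without a strict majority is paused by pause_minority.
    state = "keeps running" if has_majority(total, running) else "would pause"
    print(f"{total}-node cluster, {stopped} down: remaining side {state}")
```

If that assumption holds, both the 3-node cluster with 2 nodes down and the 
4-node cluster with 2 nodes down leave the survivors without a strict 
majority, which would match the behaviour described.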

Also, I noticed that when I bring the nodes back up, they rejoin as disc 
nodes even though they were configured specifically as RAM nodes.


On Tuesday, August 6, 2013 11:47:31 AM UTC-4, Yamil Einar Asusta Santos 
wrote:
>
> I have been testing my cluster and I have come across an unexpected 
> behavior. As explained in the subject, the cluster runs well on 3 nodes. If 
> I bring 1 node down, it still runs smoothly. But if I bring a second node 
> down, the third one becomes unresponsive.
> This is what the third node provides after running "rabbitmqctl report":
>
> Reporting server status on {{2013,8,6},{15,19,8}}
>  ...
> Error: {aborted,{no_exists,rabbit_vhost}}
>
>
> Then if I bring 1 or 2 of my nodes back up, the third one becomes 
> responsive and everything is back to normal. 
>
> Here is the report after bringing the nodes up again:
>
> Reporting server status on {{2013,8,6},{15,26,31}}
>  ...
> Status of node rabbit@qcluster1 ...
> [{pid,901},
>  {running_applications,
>      [{rabbitmq_management,"RabbitMQ Management Console","3.1.3"},
>       {rabbitmq_management_agent,"RabbitMQ Management Agent","3.1.3"},
>       {rabbit,"RabbitMQ","3.1.3"},
>       {os_mon,"CPO  CXC 138 46","2.2.7"},
>       {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.1.3"},
>       {webmachine,"webmachine","1.9.1-rmq3.1.3-git52e62bc"},
>       {mochiweb,"MochiMedia Web Server","2.3.1-rmq3.1.3-gitd541e9a"},
>       {xmerl,"XML parser","1.2.10"},
>       {inets,"INETS  CXC 138 49","5.7.1"},
>       {mnesia,"MNESIA  CXC 138 12","4.5"},
>       {amqp_client,"RabbitMQ AMQP Client","3.1.3"},
>       {sasl,"SASL  CXC 138 11","2.1.10"},
>       {stdlib,"ERTS  CXC 138 10","1.17.5"},
>       {kernel,"ERTS  CXC 138 10","2.14.5"}]},
>  {os,{unix,linux}},
>  {erlang_version,
>      "Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:30] [kernel-poll:true]\n"},
>  {memory,
>      [{total,31689208},
>       {connection_procs,5408},
>       {queue_procs,5408},
>       {plugins,340592},
>       {other_proc,9166952},
>       {mnesia,61984},
>       {mgmt_db,10248},
>       {msg_index,34160},
>       {other_ets,1080168},
>       {binary,6480},
>       {code,17557932},
>       {atom,1565809},
>       {other_system,1854067}]},
>  {vm_memory_high_watermark,0.4},
>  {vm_memory_limit,153295257},
>  {disk_free_limit,1000000000},
>  {disk_free,77554466816},
>  {file_descriptors,
>      [{total_limit,924},{total_used,3},{sockets_limit,829},{sockets_used,1}]},
>  {processes,[{limit,1048576},{used,187}]},
>  {run_queue,0},
>  {uptime,210}]
> Cluster status of node rabbit@qcluster1 ...
> [{nodes,[{disc,[rabbit@qcluster2,rabbit@qcluster0]},{ram,[rabbit@qcluster1]}]},
>  {running_nodes,[rabbit@qcluster0,rabbit@qcluster2,rabbit@qcluster1]},
>  {partitions,[]}]
> Application environment of node rabbit@qcluster1 ...
> [{auth_backends,[rabbit_auth_backend_internal]},
>  {auth_mechanisms,['PLAIN','AMQPLAIN']},
>  {backing_queue_module,rabbit_variable_queue},
>  {cluster_nodes,{[rabbit@qcluster0,rabbit@qcluster1],ram}},
>  {cluster_partition_handling,pause_minority},
>  {collect_statistics,fine},
>  {collect_statistics_interval,5000},
>  {default_permissions,[<<".*">>,<<".*">>,<<".*">>]},
>  {default_user,<<"guest">>},
>  {default_user_tags,[administrator]},
>  {default_vhost,<<"/">>},
>  {delegate_count,16},
>  {disk_free_limit,1000000000},
>  {enabled_plugins_file,"/etc/rabbitmq/enabled_plugins"},
>  {error_logger,{file,"/var/log/rabbitmq/rabbit@qcluster1.log"}},
>  {frame_max,131072},
>  {heartbeat,600},
>  {hipe_compile,false},
>  {included_applications,[]},
>  {log_levels,[{connection,info}]},
>  {msg_store_file_size_limit,16777216},
>  {msg_store_index_module,rabbit_msg_store_ets_index},
>  {plugins_dir,"/usr/lib/rabbitmq/lib/rabbitmq_server-3.1.3/sbin/../plugins"},
>  {plugins_expand_dir,"/var/lib/rabbitmq/mnesia/rabbit@qcluster1-plugins-expand"},
>  {queue_index_max_journal_entries,65536},
>  {reverse_dns_lookups,false},
>  {sasl_error_logger,{file,"/var/log/rabbitmq/rabbit@qcluster1-sasl.log"}},
>  {server_properties,[]},
>  {ssl_cert_login_from,distinguished_name},
>  {ssl_listeners,[]},
>  {ssl_options,[]},
>  {tcp_listen_options,[binary,
>                       {packet,raw},
>                       {reuseaddr,true},
>                       {backlog,128},
>                       {nodelay,true},
>                       {linger,{true,0}},
>                       {exit_on_close,false}]},
>  {tcp_listeners,[{"auto",5672}]},
>  {trace_vhosts,[]},
>  {vm_memory_high_watermark,0.4}]
> Status of node rabbit@qcluster2 ...
> [{pid,1940},
>  {running_applications,
>      [{rabbitmq_management,"RabbitMQ Management Console","3.1.3"},
>       {rabbitmq_management_agent,"RabbitMQ Management Agent","3.1.3"},
>       {rabbit,"RabbitMQ","3.1.3"},
>       {os_mon,"CPO  CXC 138 46","2.2.7"},
>       {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.1.3"},
>       {webmachine,"webmachine","1.9.1-rmq3.1.3-git52e62bc"},
>       {mochiweb,"MochiMedia Web Server","2.3.1-rmq3.1.3-gitd541e9a"},
>       {mnesia,"MNESIA  CXC 138 12","4.5"},
>       {amqp_client,"RabbitMQ AMQP Client","3.1.3"},
>       {xmerl,"XML parser","1.2.10"},
>       {inets,"INETS  CXC 138 49","5.7.1"},
>       {sasl,"SASL  CXC 138 11","2.1.10"},
>       {stdlib,"ERTS  CXC 138 10","1.17.5"},
>       {kernel,"ERTS  CXC 138 10","2.14.5"}]},
>  {os,{unix,linux}},
>  {erlang_version,
>      "Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:30] [kernel-poll:true]\n"},
>  {memory,
>      [{total,32032800},
>       {connection_procs,5408},
>       {queue_procs,5408},
>       {plugins,315040},
>       {other_proc,9172832},
>       {mnesia,61952},
>       {mgmt_db,89704},
>       {msg_index,34160},
>       {other_ets,1111736},
>       {binary,33384},
>       {code,17743902},
>       {atom,1603433},
>       {other_system,1855841}]},
>  {vm_memory_high_watermark,0.4},
>  {vm_memory_limit,153295257},
>  {disk_free_limit,1000000000},
>  {disk_free,77553033216},
>  {file_descriptors,
>      [{total_limit,924},{total_used,3},{sockets_limit,829},{sockets_used,1}]},
>  {processes,[{limit,1048576},{used,191}]},
>  {run_queue,0},
>  {uptime,617}]
> Cluster status of node rabbit@qcluster2 ...
> [{nodes,[{disc,[rabbit@qcluster0,rabbit@qcluster2]},{ram,[rabbit@qcluster1]}]},
>  {running_nodes,[rabbit@qcluster1,rabbit@qcluster0,rabbit@qcluster2]},
>  {partitions,[]}]
> Application environment of node rabbit@qcluster2 ...
> [{auth_backends,[rabbit_auth_backend_internal]},
>  {auth_mechanisms,['PLAIN','AMQPLAIN']},
>  {backing_queue_module,rabbit_variable_queue},
>  {cluster_nodes,{[rabbit@qcluster0,rabbit@qcluster1,rabbit@qcluster2],ram}},
>  {cluster_partition_handling,pause_minority},
>  {collect_statistics,fine},
>  {collect_statistics_interval,5000},
>  {default_permissions,[<<".*">>,<<".*">>,<<".*">>]},
>  {default_user,<<"guest">>},
>  {default_user_tags,[administrator]},
>  {default_vhost,<<"/">>},
>  {delegate_count,16},
>  {disk_free_limit,1000000000},
>  {enabled_plugins_file,"/etc/rabbitmq/enabled_plugins"},
>  {error_logger,{file,"/var/log/rabbitmq/rabbit@qcluster2.log"}},
>  {frame_max,131072},
>  {heartbeat,600},
>  {hipe_compile,false},
>  {included_applications,[]},
>  {log_levels,[{connection,info}]},
>  {msg_store_file_size_limit,16777216},
>  {msg_store_index_module,rabbit_msg_store_ets_index},
>  {plugins_dir,"/usr/lib/rabbitmq/lib/rabbitmq_server-3.1.3/sbin/../plugins"},
>  {plugins_expand_dir,"/var/lib/rabbitmq/mnesia/rabbit@qcluster2-plugins-expand"},
>  {queue_index_max_journal_entries,65536},
>  {reverse_dns_lookups,false},
>  {sasl_error_logger,{file,"/var/log/rabbitmq/rabbit@qcluster2-sasl.log"}},
>  {server_properties,[]},
>  {ssl_cert_login_from,distinguished_name},
>  {ssl_listeners,[]},
>  {ssl_options,[]},
>  {tcp_listen_options,[binary,
>                       {packet,raw},
>                       {reuseaddr,true},
>                       {backlog,128},
>                       {nodelay,true},
>                       {linger,{true,0}},
>                       {exit_on_close,false}]},
>  {tcp_listeners,[{"auto",5672}]},
>  {trace_vhosts,[]},
>  {vm_memory_high_watermark,0.4}]
> Status of node rabbit@qcluster0 ...
> [{pid,873},
>  {running_applications,
>      [{rabbitmq_management,"RabbitMQ Management Console","3.1.3"},
>       {rabbitmq_management_agent,"RabbitMQ Management Agent","3.1.3"},
>       {rabbit,"RabbitMQ","3.1.3"},
>       {os_mon,"CPO  CXC 138 46","2.2.7"},
>       {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.1.3"},
>       {webmachine,"webmachine","1.9.1-rmq3.1.3-git52e62bc"},
>       {mochiweb,"MochiMedia Web Server","2.3.1-rmq3.1.3-gitd541e9a"},
>       {xmerl,"XML parser","1.2.10"},
>       {inets,"INETS  CXC 138 49","5.7.1"},
>       {mnesia,"MNESIA  CXC 138 12","4.5"},
>       {amqp_client,"RabbitMQ AMQP Client","3.1.3"},
>       {sasl,"SASL  CXC 138 11","2.1.10"},
>       {stdlib,"ERTS  CXC 138 10","1.17.5"},
>       {kernel,"ERTS  CXC 138 10","2.14.5"}]},
>  {os,{unix,linux}},
>  {erlang_version,
>      "Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:30] [kernel-poll:true]\n"},
>  {memory,
>      [{total,33671216},
>       {connection_procs,5408},
>       {queue_procs,5408},
>       {plugins,452592},
>       {other_proc,10742928},
>       {mnesia,61984},
>       {mgmt_db,10248},
>       {msg_index,34160},
>       {other_ets,1117456},
>       {binary,35464},
>       {code,17740702},
>       {atom,1602625},
>       {other_system,1862241}]},
>  {vm_memory_high_watermark,0.4},
>  {vm_memory_limit,153295257},
>  {disk_free_limit,1000000000},
>  {disk_free,77554319360},
>  {file_descriptors,
>      [{total_limit,924},{total_used,3},{sockets_limit,829},{sockets_used,1}]},
>  {processes,[{limit,1048576},{used,196}]},
>  {run_queue,0},
>  {uptime,335}]
> Cluster status of node rabbit@qcluster0 ...
> [{nodes,[{disc,[rabbit@qcluster0,rabbit@qcluster2]},{ram,[rabbit@qcluster1]}]},
>  {running_nodes,[rabbit@qcluster1,rabbit@qcluster2,rabbit@qcluster0]},
>  {partitions,[]}]
> Application environment of node rabbit@qcluster0 ...
> [{auth_backends,[rabbit_auth_backend_internal]},
>  {auth_mechanisms,['PLAIN','AMQPLAIN']},
>  {backing_queue_module,rabbit_variable_queue},
>  {cluster_nodes,{[rabbit@qcluster0,rabbit@qcluster1],disc}},
>  {cluster_partition_handling,pause_minority},
>  {collect_statistics,fine},
>  {collect_statistics_interval,5000},
>  {default_permissions,[<<".*">>,<<".*">>,<<".*">>]},
>  {default_user,<<"guest">>},
>  {default_user_tags,[administrator]},
>  {default_vhost,<<"/">>},
>  {delegate_count,16},
>  {disk_free_limit,1000000000},
>  {enabled_plugins_file,"/etc/rabbitmq/enabled_plugins"},
>  {error_logger,{file,"/var/log/rabbitmq/rabbit@qcluster0.log"}},
>  {frame_max,131072},
>  {heartbeat,600},
>  {hipe_compile,false},
>  {included_applications,[]},
>  {log_levels,[{connection,info}]},
>  {msg_store_file_size_limit,16777216},
>  {msg_store_index_module,rabbit_msg_store_ets_index},
>  {plugins_dir,"/usr/lib/rabbitmq/lib/rabbitmq_server-3.1.3/sbin/../plugins"},
>  {plugins_expand_dir,"/var/lib/rabbitmq/mnesia/rabbit@qcluster0-plugins-expand"},
>  {queue_index_max_journal_entries,65536},
>  {reverse_dns_lookups,false},
>  {sasl_error_logger,{file,"/var/log/rabbitmq/rabbit@qcluster0-sasl.log"}},
>  {server_properties,[]},
>  {ssl_cert_login_from,distinguished_name},
>  {ssl_listeners,[]},
>  {ssl_options,[]},
>  {tcp_listen_options,[binary,
>                       {packet,raw},
>                       {reuseaddr,true},
>                       {backlog,128},
>                       {nodelay,true},
>                       {linger,{true,0}},
>                       {exit_on_close,false}]},
>  {tcp_listeners,[{"auto",5672}]},
>  {trace_vhosts,[]},
>  {vm_memory_high_watermark,0.4}]
> Connections:
> Channels:
> Queues on /:
> Exchanges on /:
> name type durable auto_delete internal arguments policy
> direct true false false []
> amq.direct direct true false false []
> amq.fanout fanout true false false []
> amq.headers headers true false false []
> amq.match headers true false false []
> amq.rabbitmq.log topic true false false []
> amq.rabbitmq.trace topic true false false []
> amq.topic topic true false false []
> Bindings on /:
> Consumers on /:
> Permissions on /:
> user configure write read
> guest .* .* .*
> Policies on /:
> Parameters on /:
> ...done. 
>
> Any help would be appreciated. 
> Thanks 
>