[rabbitmq-discuss] RabbitMQ hangs, does not accept connections

Alvaro Videla videlalvaro at gmail.com
Thu Dec 22 08:35:59 GMT 2011


Yes, you might get a pang if the cookies differ, so make sure the user that started the erl command has the same .erlang.cookie as the user running the RabbitMQ process.
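
A rough sketch of that check in shell. The function name cookies_match is mine, and the second path in the example is an assumption based on the -home /var/lib/rabbitmq flag in the quoted ps output, so adjust both paths for your system. That ps output also shows -setcookie riak, which as far as I know takes precedence over the cookie file, so the debugging node may need the same flag.

```shell
#!/bin/sh
# cookies_match FILE_A FILE_B
# Succeeds (exit status 0) only when both cookie files are byte-identical.
# The function name is illustrative, not part of Erlang or RabbitMQ.
cookies_match() {
    cmp -s "$1" "$2"
}

# Example call (hypothetical paths, kept as a comment so nothing runs on load):
#   cookies_match "$HOME/.erlang.cookie" /var/lib/rabbitmq/.erlang.cookie \
#       && echo "cookies match" || echo "cookies differ"
#
# If the broker was started with an explicit cookie, pass it to the
# debugging node as well and ping the full node name, host part included:
#   erl -sname qwer -setcookie riak
#   net_adm:ping('rabbit@dbx').
```

The cookie file is usually readable only by its owner, so the comparison may need sudo. Also, a bare net_adm:ping(rabbit) has no host part, so I would expect it to return pang regardless of the cookie.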

On Dec 22, 2011, at 8:05 AM, Dmitri Minaev wrote:

> Oh...
> 
> $ erl -sname qwer
> Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4]
> [async-threads:0] [hipe] [kernel-poll:false]
> 
> Eshell V5.7.4  (abort with ^G)
> (qwer@dbx)1> net_adm:names().
> {ok,[{"rabbit",60040},{"qwer",58043}]}
> (qwer@dbx)2> net_adm:ping(rabbit).
> pang
> 
> 
> 
> On 22 December 2011 10:55, Alvaro Videla <videlalvaro at gmail.com> wrote:
>> Hi,
>> 
>> A small note,
>> 
>> When connecting to a remote Erlang node, in this case the rabbit node, you have to choose a different node name.
>> 
>> For example:
>> 
>> erl -sname foo
>> 
>> Once you are in the Erlang REPL, you can try to connect remotely to the rabbit node using net_adm:ping.
>> 
>> -Alvaro.
>> 
>> Sent from my iFad
>> 
>> On Dec 22, 2011, at 7:32 AM, Dmitri Minaev <minaev at gmail.com> wrote:
>> 
>>> Now, I have a hanging Rabbit available for the autopsy.
>>> 
>>> Running processes (ps ax|grep rabbit):
>>> 
>>> -------------
>>> 29699 ?        Ss     0:00 sh -c
>>> RABBITMQ_PID_FILE=/var/run/rabbitmq/pid /usr/sbin/rabbitmq-server >
>>>         /var/log/rabbitmq/startup_log 2>
>>> /var/log/rabbitmq/startup_err
>>> 29702 ?        S      0:00 /bin/sh /usr/sbin/rabbitmq-server
>>> 29708 ?        S      0:00 su rabbitmq -s /bin/sh -c
>>> /usr/lib/rabbitmq/bin/rabbitmq-server
>>> 29710 ?        S      0:00 sh -c /usr/lib/rabbitmq/bin/rabbitmq-server
>>> 29711 ?        Sl   4715:59 /usr/lib/erlang/erts-5.7.4/bin/beam.smp -W
>>> w -K true -A30 -P 1048576 -- -root /usr/lib/erlang -progname erl --
>>> -home /var/lib/rabbitmq -- -noshell -noinput -sname rabbit@dbx
>>> -setcookie riak -boot
>>> /var/lib/rabbitmq/mnesia/rabbit@dbx-plugins-expand/rabbit -config
>>> /etc/rabbitmq/rabbitmq -kernel inet_default_connect_options
>>> [{nodelay,true}] -rabbit tcp_listeners [{"0.0.0.0",5672}] -sasl
>>> errlog_type error -kernel error_logger
>>> {file,"/var/log/rabbitmq/rabbit@dbx.log"} -sasl sasl_error_logger
>>> {file,"/var/log/rabbitmq/rabbit@dbx-sasl.log"} -os_mon start_cpu_sup
>>> true -os_mon start_disksup false -os_mon start_memsup false -mnesia
>>> dir "/var/lib/rabbitmq/mnesia/rabbit@dbx"
>>> -------------
>>> 
>>> Network sockets are available:
>>> $ sudo netstat -tunlp|grep beam
>>> tcp        0      0 0.0.0.0:5672            0.0.0.0:*
>>> LISTEN      29711/beam.smp
>>> tcp        0      0 0.0.0.0:60040           0.0.0.0:*
>>> LISTEN      29711/beam.smp
>>> 
>>> $ cat /etc/rabbitmq/rabbitmq.config
>>> [{rabbit, [{vm_memory_high_watermark, 0.7}]},
>>> {rabbit, [{tcp_listeners, [{"0.0.0.0", 5672}]}]}].
>>> 
>>> $ cat /etc/rabbitmq/rabbitmq-env.conf
>>> RABBITMQ_NODE_IP_ADDRESS=0.0.0.0
>>> 
>>> strace -p 29711 shows that the process is waiting in select():
>>> select(0, NULL, NULL, NULL, NULL
>>> 
>>> 
>>> Last lines in rabbit@dbx.log:
>>> ---------------------------
>>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>>> exception on TCP connection <0.367.0> from x.x.x.26:43157
>>> connection_closed_abruptly
>>> 
>>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>>> closing TCP connection <0.367.0> from x.x.x.26:43157
>>> 
>>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>>> exception on TCP connection <0.379.0> from x.x.x.26:43160
>>> connection_closed_abruptly
>>> 
>>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>>> closing TCP connection <0.379.0> from x.x.x.26:43160
>>> 
>>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>>> exception on TCP connection <0.335.0> from x.x.x.26:43154
>>> connection_closed_abruptly
>>> 
>>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>>> closing TCP connection <0.335.0> from x.x.x.26:43154
>>> 
>>> =WARNING REPORT==== 22-Dec-2011::09:55:44 ===
>>> exception on TCP connection <0.467.0> from x.x.x.26:43166
>>> connection_closed_abruptly
>>> 
>>> =INFO REPORT==== 22-Dec-2011::09:55:44 ===
>>> closing TCP connection <0.467.0> from x.x.x.26:43166
>>> ---------------------------
>>> 
>>> PHP clients cannot connect to RabbitMQ. When I run my test Python
>>> script which uses amqplib.client_0_8, it hangs on
>>> amqp.Connection(host, "guest", "guest", ssl=False)
>>> 
>>> strace shows the following:
>>> 
>>> connect(3, {sa_family=AF_INET, sin_port=htons(5672),
>>> sin_addr=inet_addr("127.0.0.1")}, 16) = 0
>>> fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
>>> fcntl(3, F_SETFL, O_RDWR)               = 0
>>> sendto(3, "AMQP\1\1\t\1", 8, 0, NULL, 0) = 8
>>> brk(0x1461000)                          = 0x1461000
>>> recvfrom(3,
>>> 
>>> Now, I try to connect to the RabbitMQ node using 'erl':
>>> $ erl -sname 'rabbit@dbx'
>>> {error_logger,{{2011,12,22},{10,26,33}},"Protocol: ~p: register error:
>>> ~p~n",["inet_tcp",{{badmatch,{error,duplicate_name}},[{inet_tcp_dist,listen,1},{net_kernel,start_protos,4},{net_kernel,start_protos,3},{net_kernel,init_node,2},{net_kernel,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}]}
>>> {error_logger,{{2011,12,22},{10,26,33}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.21.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[net_sup,kernel_sup,<0.9.0>]},{messages,[]},{links,[#Port<0.68>,<0.18.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,377},{stack_size,24},{reductions,442}],[]]}
>>> {error_logger,{{2011,12,22},{10,26,33}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfa,{net_kernel,start_link,[['rabbit@dbx',shortnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
>>> {error_logger,{{2011,12,22},{10,26,33}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfa,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
>>> {error_logger,{{2011,12,22},{10,26,33}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
>>> {"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}
>>> 
>>> Crash dump was written to: erl_crash.dump
>>> Kernel pid terminated (application_controller)
>>> ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})
>>> 
>>> Is there any other information that might be useful?
>>> 
>>> On 13 December 2011 18:26, Dmitri Minaev <minaev at gmail.com> wrote:
>>>> Thank you for the reply. Yes, a TCP connection could be established, but
>>>> not an AMQP one. We generally use a PHP library, but I also tested RabbitMQ
>>>> using Python amqplib. In both cases, the client side cannot get the
>>>> connection.
>>>> 
>>>> Besides the common information messages (starting/closing TCP
>>>> connection), there's only one type of message in the log files:
>>>> 
>>>> =WARNING REPORT==== 13-Dec-2011::16:56:51 ===
>>>> exception on TCP connection <0.14474.173> from x.x.x.x:xxx
>>>> connection_closed_abruptly
>>>> 
>>>> But then again, these messages also appear during normal
>>>> operation, which is why I don't think they're relevant.
>>>> 
>>>> 
>>>> On 13 December 2011 14:42, Simon MacMullen <simon at rabbitmq.com> wrote:
>>>>> Hmm. I can't really say anything from your description - can you post the
>>>>> logs somewhere? It's possible that your definition of "nothing unusual in
>>>>> the logs" differs from mine.
>>>>> 
>>>>> And when you say that "the server refused attempts to connect", what exactly
>>>>> do you mean? You say that a TCP connection *could* be established - so does
>>>>> your client hang during AMQP handshaking? Disconnect? Something else?
>>>>> 
>>>>> Cheers, Simon
>>>>> 
>>>>> 
>>>>> On 12/12/11 16:24, Dmitri Minaev wrote:
>>>>>> 
>>>>>> Hello,
>>>>>> 
>>>>>> We have been using RabbitMQ for about a year now. From time to time I
>>>>>> upgraded it and switched from one server to another. The last such
>>>>>> transition took place about a month ago: I installed a new RabbitMQ
>>>>>> (2.7) on a new server and reconfigured our web application. Quite soon
>>>>>> we faced new problems. After some days of stable work, clients could
>>>>>> not connect to RabbitMQ. I could run rabbitmqctl, list queues, and
>>>>>> kill connections, but the server refused attempts to connect. That is,
>>>>>> the TCP socket was available and telnet could connect to port 5672,
>>>>>> but the AMQP connection could not be established. There was nothing
>>>>>> unusual in the logs. vm_memory_high_watermark is set to 0.7 and there
>>>>>> is still plenty of free memory.
>>>>>> 
>>>>>> After a couple of such failures I tried to downgrade to 2.6.1, but the
>>>>>> problem remained. The last time I disabled IPv6, but today we hit the
>>>>>> same trouble again.
>>>>>> 
>>>>>> I think I must have done something wrong when setting up the
>>>>>> environment, but what could that be?
>>>>>> 
>>>>>> OS: Ubuntu 10.04 LTS.
>>>>>> 16GB RAM.
>>>>>> RabbitMQ 2.6.1
>>>>>> Erlang R13B03 (erts-5.7.4) (package erlang-nox from Ubuntu repository)
>>>>>> Client: php-amqplib
>>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Simon MacMullen
>>>>> RabbitMQ, VMware
>>>>> _______________________________________________
>>>>> rabbitmq-discuss mailing list
>>>>> rabbitmq-discuss at lists.rabbitmq.com
>>>>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>>> 
>>>> 
>>>> 
>>>> --
>>>> With best regards,
>>>> Dmitri Minaev
>>> 
>>> 
>>> 
>>> --
>>> With best regards,
>>> Dmitri Minaev
> 
> 
> 
> -- 
> With best regards,
> Dmitri Minaev

Sent from my Nokia 1100




