[rabbitmq-discuss] client connect problems

Robert Fuller fullergalway at gmail.com
Fri Dec 10 17:05:03 GMT 2010


Hi,

I found the hadoop task processes hanging around because my code did
not always close the rabbitmq connection. I fixed the bug in my code
and now, voila, no more client connect problems.
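
For anyone who finds this thread with the same symptom, here is a minimal sketch of the pattern, not the actual code from my job. `BrokerConnection`, `BrokerConnectionFactory` and `TaskPublisher` are hypothetical stand-ins so that the snippet compiles without the client jar. In real code their roles would be played by `com.rabbitmq.client.Connection` and `ConnectionFactory.newConnection()`, and publishing would go through a channel rather than the connection itself.

```java
import java.io.IOException;

/** Hypothetical stand-in for com.rabbitmq.client.Connection. */
interface BrokerConnection {
    void publish(String message) throws IOException;
    void close() throws IOException;
}

/** Hypothetical stand-in for the role of ConnectionFactory.newConnection(). */
interface BrokerConnectionFactory {
    BrokerConnection newConnection() throws IOException;
}

final class TaskPublisher {
    /** Runs one task attempt and closes the connection on every exit path. */
    static void runTask(BrokerConnectionFactory factory, String message) throws IOException {
        BrokerConnection connection = factory.newConnection();
        try {
            connection.publish(message);
        } finally {
            // Executed whether publish() returned normally or threw.
            connection.close();
        }
    }

    public static void main(String[] args) {
        final boolean[] closed = {false};
        // Fake connection whose publish() always fails, to exercise the failure path.
        BrokerConnectionFactory factory = () -> new BrokerConnection() {
            public void publish(String message) throws IOException {
                throw new IOException("simulated task failure");
            }
            public void close() {
                closed[0] = true;
            }
        };
        try {
            runTask(factory, "hello");
        } catch (IOException e) {
            System.out.println("task failed: " + e.getMessage());
        }
        System.out.println("connection closed: " + closed[0]);
    }
}
```

Running `main` prints "task failed: simulated task failure" followed by "connection closed: true", which shows that the connection is released even when the task itself fails.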

Thanks again!
Rob.

On 3 December 2010 18:25, Robert Fuller <fullergalway at gmail.com> wrote:
> Thanks Jerry,
>
> I used `rabbitmqctl list_connections` to find 436 connections. Hmmn.
> More than I expected.
>
> I logged into one of the (hadoop map/reduce) servers, and I can see
> that some processes for old hadoop task attempts are hanging around,
> so yes, it looks like I'm not playing nicely.
>
> I'll look into it further on Monday.
>
> Thanks,
> Rob
>
> On 3 December 2010 18:11, Jerry Kuch <jerryk at vmware.com> wrote:
>> Hi, Robert.
>>
>> With a broker version this new, these things are often either a misbehaving client, or the broker running in an environment where there's a resource limit configuration problem, e.g. running out of file descriptors.
>>
>> Here are a few things to investigate in such circumstances (derived from a support incident on an older Rabbit version, but most of this remains sensible stuff to check).
>>
>> Please let us know how it goes.  If none of the checks below yields meaningful clues, would you feel comfortable sharing some snippets of the client code with which you're encountering the problem?
>>
>> Jerry
>>
>> ==============
>>
>>  Here's a quick laundry list of things to investigate when one
>>  encounters problems of this flavor.
>>
>>  - check `ulimit -n` as whichever user is being used to run rabbit;
>>   depending on your system this may be the 'rabbitmq' user.  Be wary
>>   of raised limits not making it to the user/shell/process one
>>   wanted them to.  Again, newer Rabbits will announce at startup
>>   time what sort of file handle limits they think they're working under
>>   with a log message something like:
>>
>>        =INFO REPORT==== 2-Dec-2010::14:37:38 ===
>>        Limiting to approx 156 file handles (138 sockets)
>>
>>  This example is from my desktop machine where by default 'ulimit -n'
>>  gives 256, a value that's likely rather too low for a broker that's going
>>  to be serving a lot of clients.
>>
>>  - check the logs for memory alarms; severe memory pressure can cause
>>   Rabbit to start refusing connections.
>>
>>  - check that Rabbit hasn't run out of sockets; to do this use
>>   `rabbitmqctl list_connections` and netstat to see if there are
>>   suspiciously large numbers of existing connections.  Client
>>   misbehavior may cause this on an otherwise healthy Rabbit.
>>
>>  - check the rabbit-sasl.log and the main rabbit.log for any sign
>>   that the tcp listener/acceptor process has crashed or misbehaved.
>>   Look for 'CRASH REPORT' entries in the logs and mentions of
>>   'tcp_acceptor', 'cannot_accept', '{error,emfile}' and similar.
>>
>>  - is Rabbit spinning frantically?  Check 'top' and your preferred
>>   process monitoring tools to see if you have Rabbit beam.smp
>>   processes consuming large amounts of CPU while the problem is
>>   manifesting.  [In your current case it seems you've ruled this out
>>   already.]
>>
>>  - generally scour rabbit.log and rabbit-sasl.log for suspicious errors.
>>
>>  - use 'rabbitmqctl list_WHATEVER' for WHATEVERs including connections,
>>   channels, exchanges, queues, bindings, etc. to see if there are
>>   unexpectedly massive quantities of anything.  Also check your
>>   queues to see if any have absurd numbers of messages languishing
>>   in them, as this could indicate a misbehaving client application.
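
The log checks in the list above can be partly automated. What follows is a minimal sketch, not a RabbitMQ tool: the class name `RabbitLogScan` is made up for illustration, the markers are the ones named in the checklist, and the log file locations are an assumption that you supply on the command line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RabbitLogScan {
    // Markers taken from the checklist; extend the list as needed.
    static final List<String> MARKERS = Arrays.asList(
            "CRASH REPORT", "tcp_acceptor", "cannot_accept", "{error,emfile}");

    /** Returns "lineNumber: text" for every line that contains at least one marker. */
    static List<String> findSuspicious(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            for (String marker : MARKERS) {
                if (line.contains(marker)) {
                    hits.add((i + 1) + ": " + line);
                    break; // report each line once, even if several markers match
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to a log file, e.g. rabbit.log and rabbit-sasl.log.
        for (String path : args) {
            // ISO-8859-1 decodes any byte sequence, so odd bytes in a log cannot abort the scan.
            List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.ISO_8859_1);
            for (String hit : findSuspicious(lines)) {
                System.out.println(path + ":" + hit);
            }
        }
    }
}
```

A hit only tells you where to start reading; the surrounding log entries still need to be inspected by hand.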
>>
>> On Dec 3, 2010, at 4:03 AM, Robert Fuller wrote:
>>
>>> Hi,
>>>
>>> I had an issue for several hours where clients from many hosts were
>>> not able to connect to a running rabbitmq server (2.2.0). `rabbitmqctl
>>> list_queues` showed only a small number of items in the queues. Eventually,
>>> after I restarted the rabbitmq server, everything immediately returned to normal.
>>>
>>> The client stack trace is sometimes this:
>>> java.net.ConnectException: Connection timed out
>>>       at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>       at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>>>       at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>>>       at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>>>       at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>>       at java.net.Socket.connect(Socket.java:529)
>>>       at java.net.Socket.connect(Socket.java:478)
>>>       at com.rabbitmq.client.ConnectionFactory.createFrameHandler(ConnectionFactory.java:338)
>>>       at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:376)
>>>
>>>
>>> and sometimes this:
>>> java.io.IOException
>>>       at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:121)
>>>       at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:274)
>>>       at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:379)
>>>       at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:399)
>>> ...
>>> Caused by: com.rabbitmq.client.ShutdownSignalException: connection
>>> error; reason: java.net.SocketException: Connection reset
>>>       at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:81)
>>>       at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:47)
>>>       at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:342)
>>>       at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:258)
>>>       ... 39 more
>>> Caused by: java.net.SocketException: Connection reset
>>>       at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>>       at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>>>       at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>>>       at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:271)
>>>       at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:118)
>>>       at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:155)
>>>       at com.rabbitmq.client.impl.AMQConnection.readFrame(AMQConnection.java:393)
>>>       at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:421)
>>>
>>>
>>>
>>> I could see nothing unusual on the server using top (memory and cpu
>>> usage ok), or in the rabbitmq logs.
>>>
>>> What can I do to produce a log for you on the server should this situation recur?
>>>
>>> Thanks,
>>> Rob.
>>> _______________________________________________
>>> rabbitmq-discuss mailing list
>>> rabbitmq-discuss at lists.rabbitmq.com
>>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>
>>
>

