[rabbitmq-discuss] client connect problems

Jerry Kuch jerryk at vmware.com
Fri Dec 3 18:11:27 GMT 2010


Hi, Robert.

With a broker version this new, problems like this usually come down to either a misbehaving client or a resource limit configuration problem in the broker's environment, e.g. running out of file descriptors.

Below my signature are a few things to investigate in such circumstances (derived from a support incident on an older Rabbit version, but most of this remains sensible stuff to check).

Please let us know how it goes.  If none of those checks yields meaningful clues, would you feel comfortable sharing some snippets of the client code with which you're encountering the problem?

Jerry

==============

 Here's a quick laundry list of things to investigate when one
 encounters problems of this flavor.

 - check `ulimit -n` as whichever user is being used to run rabbit;
   depending on your system this may be the 'rabbitmq' user.  Be wary
   of raised limits not reaching the user/shell/process you intended
   them for.  Newer Rabbits announce at startup time what file handle
   limits they think they're working under, with a log message
   something like:

       =INFO REPORT==== 2-Dec-2010::14:37:38 ===
       Limiting to approx 156 file handles (138 sockets)

   This example is from my desktop machine, where by default `ulimit -n`
   gives 256, a value that's likely rather too low for a broker that's going
   to be serving a lot of clients.

 - check the logs for memory alarms; severe memory pressure can cause
   Rabbit to start refusing connections.

 - check that Rabbit hasn't run out of sockets; to do this use
   `rabbitmqctl list_connections` and netstat to see if there are
   suspiciously large numbers of existing connections.  Client
   misbehavior may cause this on an otherwise healthy Rabbit.

 - check the rabbit-sasl.log and the main rabbit.log for any sign
   that the TCP listener/acceptor process has crashed or misbehaved.
   Look for 'CRASH REPORT' entries in the logs and mentions of
   'tcp_acceptor', 'cannot_accept', '{error,emfile}' and similar.

 - is Rabbit spinning frantically?  Check 'top' and your preferred
   process monitoring tools to see if you have Rabbit beam.smp
   processes consuming large amounts of CPU while the problem is
   manifesting.  [In your current case it seems you've already ruled
   this out.]

 - generally scour rabbit.log and rabbit-sasl.log for suspicious errors.

 - use 'rabbitmqctl list_WHATEVER' for WHATEVERs including connections,
   channels, exchanges, queues, bindings, etc. to see if there are
   unexpectedly massive quantities of anything.  Also check your
   queues to see if any have absurd numbers of messages languishing
   in them, as this could indicate a misbehaving client application.
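
 The log-related checks in the list above could be scripted roughly as
 follows.  This is only a minimal sketch in Python, not anything shipped
 with Rabbit: the function name scan_rabbit_log and the SUSPICIOUS list
 are made up for illustration, the search strings are just the ones
 mentioned above, and the limit pattern assumes the startup message
 looks exactly like the sample quoted earlier.

```python
import re

# Strings named in the checklist above; illustrative, not exhaustive.
SUSPICIOUS = ["CRASH REPORT", "tcp_acceptor", "cannot_accept", "{error,emfile}"]

# Assumed to match the startup line quoted above, e.g.
# "Limiting to approx 156 file handles (138 sockets)"
LIMIT_RE = re.compile(r"Limiting to approx (\d+) file handles \((\d+) sockets\)")


def scan_rabbit_log(text):
    """Scan a chunk of log text.

    Returns (hits, limits):
      hits   -- list of (line_number, stripped_line) for every line that
                contains one of the SUSPICIOUS strings
      limits -- (file_handles, sockets) from the last limit announcement
                found in the text, or None if there was none
    """
    hits = []
    limits = None
    for number, line in enumerate(text.splitlines(), start=1):
        match = LIMIT_RE.search(line)
        if match:
            limits = (int(match.group(1)), int(match.group(2)))
        if any(token in line for token in SUSPICIOUS):
            hits.append((number, line.strip()))
    return hits, limits
```

 To use it, read rabbit.log and rabbit-sasl.log into strings (wherever
 they live on your system) and pass each one to scan_rabbit_log.  A low
 file handle figure in `limits`, or any entries in `hits`, is a prompt to
 look more closely, not a diagnosis.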

On Dec 3, 2010, at 4:03 AM, Robert Fuller wrote:

> Hi,
> 
> I had an issue for several hours where clients from many hosts were
> not able to connect to a running rabbitmq server (2.2.0). list_queues
> showed only a small number of items in the queues. Eventually after
> restarting rabbitmq server everything returned immediately to normal.
> 
> client stack seems to be sometimes this:
> java.net.ConnectException: Connection timed out
> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
> 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> 	at java.net.Socket.connect(Socket.java:529)
> 	at java.net.Socket.connect(Socket.java:478)
> 	at com.rabbitmq.client.ConnectionFactory.createFrameHandler(ConnectionFactory.java:338)
> 	at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:376)
> 
> 
> and sometimes this:
> java.io.IOException
> 	at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:121)
> 	at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:274)
> 	at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:379)
> 	at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:399)
> ...
> Caused by: com.rabbitmq.client.ShutdownSignalException: connection
> error; reason: java.net.SocketException: Connection reset
> 	at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:81)
> 	at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:47)
> 	at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:342)
> 	at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:258)
> 	... 39 more
> Caused by: java.net.SocketException: Connection reset
> 	at java.net.SocketInputStream.read(SocketInputStream.java:168)
> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> 	at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:271)
> 	at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:118)
> 	at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:155)
> 	at com.rabbitmq.client.impl.AMQConnection.readFrame(AMQConnection.java:393)
> 	at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:421)
> 
> 
> 
> I could see nothing unusual on the server using top (memory and cpu
> usage ok), or in the rabbitmq logs.
> 
> What can I do to produce you a log on the server should this situation re-occur?
> 
> Thanks,
> Rob.
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss


