[rabbitmq-discuss] Long timeout if server host becomes unreachable

Oleg Lyalikov oleg.lyalikov at gmail.com
Mon Oct 7 13:26:20 BST 2013


Hello,

I searched for an existing discussion of this topic but did not find one.

The problem: the client sends messages to a server on another host, and at
some point the server host becomes unreachable (network is down, host is
powered off, etc.).
As a result, the sender thread blocks while writing to the socket and only
detects the network failure after a long delay (more than 15 minutes on our
RHEL machine), which is too long for us. I tried tuning some TCP/socket
settings, but without success.
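
One plausible explanation for a delay of this size (an assumption, not
something confirmed on our machines) is TCP retransmission with exponential
backoff on Linux. The sketch below only does the arithmetic. The values used
(net.ipv4.tcp_retries2 = 15, a 200 ms initial retransmission timeout, a 120 s
cap) are assumed defaults, and the class and method names are made up for
illustration.

```java
import java.util.Locale;

// Illustrative arithmetic only; names and default values are assumptions.
class RetransmitTimeout {

    // Sums the waits for `retries` retransmissions plus the final wait after
    // the last one, doubling the timeout each time up to `maxRtoSeconds`.
    static double totalSeconds(int retries, double initialRtoSeconds, double maxRtoSeconds) {
        double total = 0.0;
        double rto = initialRtoSeconds;
        for (int i = 0; i <= retries; i++) {
            total += rto;
            rto = Math.min(rto * 2, maxRtoSeconds);
        }
        return total;
    }

    public static void main(String[] args) {
        double seconds = totalSeconds(15, 0.2, 120.0);
        System.out.println(String.format(Locale.ROOT, "%.1f s (%.1f min)", seconds, seconds / 60));
    }
}
```

With these assumed values the sum is about 924.6 seconds, a little over 15
minutes, which would be in line with the delay described above.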

Here is part of a thread dump taken while the thread is stuck:

"main" prio=10 tid=0xf7505800 nid=0x2c3e runnable [0xf7728000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        - locked <0xb4532608> (a java.io.BufferedOutputStream)
        at java.io.DataOutputStream.flush(DataOutputStream.java:123)
        at com.rabbitmq.client.impl.SocketFrameHandler.flush(SocketFrameHandler.java:142)
        at com.rabbitmq.client.impl.AMQConnection.flush(AMQConnection.java:488)
        at com.rabbitmq.client.impl.AMQCommand.transmit(AMQCommand.java:125)
        at com.rabbitmq.client.impl.AMQChannel.quiescingTransmit(AMQChannel.java:316)
        - locked <0xb4532ee8> (a java.lang.Object)
        at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:292)
        - locked <0xb4532ee8> (a java.lang.Object)
        at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:634)
        at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:617)
        at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:608)

"AMQP Connection 172.17.32.170:5672" prio=10 tid=0xac15d800 nid=0x2c5c
waiting for monitor entry [0xacb56000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.rabbitmq.client.impl.AMQChannel.processShutdownSignal(AMQChannel.java:263)
        - waiting to lock <0xb4532ee8> (a java.lang.Object)
        at com.rabbitmq.client.impl.ChannelN.startProcessShutdownSignal(ChannelN.java:259)
        at com.rabbitmq.client.impl.ChannelN.processShutdownSignal(ChannelN.java:283)
        at com.rabbitmq.client.impl.ChannelManager.handleSignal(ChannelManager.java:90)
        at com.rabbitmq.client.impl.AMQConnection.finishShutdown(AMQConnection.java:696)
        at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:669)
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:550)

"pool-2-thread-1" prio=10 tid=0xac167000 nid=0x2c60 waiting for monitor
entry [0xac77d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:94)
        - waiting to lock <0xb4532608> (a java.io.BufferedOutputStream)
        at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
        at com.rabbitmq.client.impl.Frame.writeTo(Frame.java:189)
        at com.rabbitmq.client.impl.SocketFrameHandler.writeFrame(SocketFrameHandler.java:137)
        - locked <0xb45325f0> (a java.io.DataOutputStream)
        at com.rabbitmq.client.impl.HeartbeatSender$HeartbeatRunnable.run(HeartbeatSender.java:133)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)

RabbitMQ server v3.0.2, Java client v3.0.1; both run on RHEL 6.4.

Do you know of any setting (at the OS, JVM, or client library level) that
controls this timeout period?
Also, the "AMQP Connection ..." thread appears to be blocked on an object
monitor while trying to shut down: it is waiting to lock <0xb4532ee8>, which
the stuck "main" thread holds. Could this be a bug that prevents the shutdown
from completing in time?
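
One client-level workaround that comes to mind, sketched here purely as an
assumption and not as something the RabbitMQ client provides, is to bound the
blocking call from a second thread. `BoundedCall`, `runWithTimeout` and the
`onTimeout` hook are hypothetical names. In a real publisher the action would
wrap `basicPublish`, and the hook would have to close the underlying socket or
connection, because a thread blocked in a `java.net` socket write does not
react to interruption. The sleep in `main` only stands in for the stuck write.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a blocking call on a daemon thread and gives up
// waiting for it after a fixed time.
class BoundedCall {

    // Returns true if blockingCall finished within timeoutMillis.
    // On timeout, runs onTimeout (for example, code that closes the socket)
    // and returns false.
    static boolean runWithTimeout(Runnable blockingCall, long timeoutMillis, Runnable onTimeout) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<?> result = executor.submit(blockingCall);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            onTimeout.run();
            return false;
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow(); // interrupts the worker if it is interruptible
        }
    }

    public static void main(String[] args) {
        // Stand-in for a write that hangs because the peer is unreachable.
        Runnable stuckWrite = () -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean completed = runWithTimeout(stuckWrite, 100,
                () -> System.out.println("timed out, closing connection here"));
        System.out.println("completed=" + completed);
    }
}
```

This does not shorten the TCP-level timeout itself; it only lets the caller
stop waiting and react sooner.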

Thanks,
Oleg




