[rabbitmq-discuss] Broken Pipe in RabbitMQ ConnectionFactory.newConnection()
John M.
approche.pratique at gmail.com
Sat Sep 7 07:15:14 BST 2013
Rarely, when under more load than usual my RabbitMQ application starts
returning SocketException: Broken pipe (and basically doesn't process any
further messages).
The system is using the RPC pattern, with workers listening on a few
predefined queues for jobs, clients submitting tasks on these jobs while
opening a temporary auto-delete queues that they specify as replyTo queue
where they listen for the replies on (and use a correlation ID as well to
match the messages).
The code that actually leads to the Broken pipe is quite simple, it is in
the client part and basically does:
factory = new ConnectionFactory();
factory.setUri(uri);
connection = factory.newConnection(); // this is when we get the
exception
The exception is as follows:
2013-09-06 21:37:03,947 +0000 [http-bio-8080-exec-350] ERROR
RabbitRpcClient:79 - IOException
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at
com.rabbitmq.client.impl.SocketFrameHandler.flush(SocketFrameHandler.java:142)
at com.rabbitmq.client.impl.AMQConnection.flush(AMQConnection.java:488)
at com.rabbitmq.client.impl.AMQCommand.transmit(AMQCommand.java:125)
at
com.rabbitmq.client.impl.AMQChannel.quiescingTransmit(AMQChannel.java:316)
at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:292)
at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:285)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:383)
at
com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:516)
at
com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:533)
...
I think this generally coincides with the workers taking longer than usual
about their business, and thus more temporary client queues concurrently
open (about 20-30 perhaps?), however as far as I know I'm not running into
any of the usual watermarks (memory, disk - I could be running into some
limit I don't know about).
I've reviewed the Rabbit logs and the only kind of errors I find there are:
=ERROR REPORT==== 6-Sep-2013::21:36:59 ===
closing AMQP connection <0.3105.1297> (10.118.69.132:42582 ->
10.12.111.134:5672):
{handshake_timeout,frame_header}
I checked both logs and the first "broken pipe" on the client appeared at
21:37:03, while the first ERROR of any kind in RabbitMQ logs on that date
appeared at 21:36:59, with regular errors of the same kind appearing
regularly thereafter until the systems were restarted. Thus I believe the
ones published are corresponding log entries.
I'm using the Rabbit Java client 3.1.4 (latest on Maven central) with
Rabbit server 3.1.4 running on Amazon Linux un AWS EC2.
Here is the rabbitmqctl status under normal situation (unfortunately not
during the failure, I will try to get one when it appears next):
[rabbitmq]$ sudo rabbitmqctl status
Status of node 'rabbit at ip-some-ip' ...
[{pid,2654},
{running_applications,
[{rabbitmq_management,"RabbitMQ Management Console","3.1.4"},
{rabbitmq_management_agent,"RabbitMQ Management Agent","3.1.4"},
{rabbit,"RabbitMQ","3.1.4"},
{os_mon,"CPO CXC 138 46","2.2.7"},
{rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.1.4"},
{webmachine,"webmachine","1.10.3-rmq3.1.4-gite9359c7"},
{mochiweb,"MochiMedia Web Server","2.7.0-rmq3.1.4-git680dba8"},
{xmerl,"XML parser","1.2.10"},
{inets,"INETS CXC 138 49","5.7.1"},
{mnesia,"MNESIA CXC 138 12","4.5"},
{amqp_client,"RabbitMQ AMQP Client","3.1.4"},
{sasl,"SASL CXC 138 11","2.1.10"},
{stdlib,"ERTS CXC 138 10","1.17.5"},
{kernel,"ERTS CXC 138 10","2.14.5"}]},
{os,{unix,linux}},
{erlang_version,
"Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2]
[async-threads:30] [kernel-poll:true]\n"},
{memory,
[{total,331967824},
{connection_procs,5389784},
{queue_procs,2669016},
{plugins,654768},
{other_proc,10063336},
{mnesia,90352},
{mgmt_db,2706344},
{msg_index,7148168},
{other_ets,3495648},
{binary,1952040},
{code,17696200},
{atom,1567425},
{other_system,278534743}]},
{vm_memory_high_watermark,0.4},
{vm_memory_limit,3126832332},
{disk_free_limit,1000000000},
{disk_free,1487147008},
{file_descriptors,
[{total_limit,349900},
{total_used,71},
{sockets_limit,314908},
{sockets_used,66}]},
{processes,[{limit,1048576},{used,930}]},
{run_queue,0},
{uptime,5680}]
...done.
Any ideas what could be wrong or at least what I can do to debug this / get
more clarity on what is happening?
Best regards,
John
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20130906/5fda7d81/attachment.htm>
More information about the rabbitmq-discuss
mailing list