[rabbitmq-discuss] tcp -> rabbitmq gives emfile + 541

Wed Jul 1 10:13:49 BST 2009

Dear Tony,

Thank you for your email concerning the 0.9.1 C client (*). It is obviously
the right way to go, but it would force me to deploy 0.9.1 in our
infrastructure, and I was hoping to go live with the current setup in the
next couple of weeks. So please forgive me if I insist on debugging the
current setup.

The reason I am bothering you guys with this is that, surely, you must have
come across situations where your system runs out of file descriptors while
you are opening/closing TCP connections in stress tests. Valentino, for
example, has posted some suggestions concerning kernel params (please see
the second link below).

Here's what I'm trying to do:

  +------------+     +----------------+      +-------------+
  | DB Trigger |     |  TCP listener  |      |  RabbitMQ   |
  |    (C)     |---->|    (Erlang)    |----->|  (Erlang)   |
  +------------+     +----------------+      +-------------+

I have got a C trigger function in a database (PostgreSQL) which opens and
closes a new TCP connection to an Erlang node every time it fires. That's
the nature of row-level trigger functions: there is no way I can pool
connections or maintain state across invocations. So this is 2 file
descriptors (for 2 minutes -- TIME_WAIT) just getting the message from the
trigger to the Erlang node, and then RabbitMQ will probably consume some
more file descriptors for opening/closing channels.

The trigger is executed several times per second, say every 50-100ms. I
have increased the user's nofile ulimit to 32768 and the number of
concurrent Erlang processes to millions. I have even tried running my TCP
acceptor code within the same VM as RabbitMQ, so that I can use a direct
client connection and avoid any TCP overhead between the TCP listener and
RabbitMQ. At this stage, I'm not consuming the (persistent) messages as
they come in; they are stored in Mnesia. I either run out of file
descriptors (separate VMs) or the RabbitMQ VM just dies after approx.
300000 messages (if the TCP listener is running within the RabbitMQ VM).
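
As a rough back-of-the-envelope check of the figures above, the following
Python sketch estimates how many connections would still be lingering at any
one time. It takes the numbers from this message at face value (a trigger
firing every 50-100ms, a 2-minute TIME_WAIT, 2 descriptors per connection, a
nofile limit of 32768). The function names are made up for illustration, and
whether a socket in TIME_WAIT really keeps a descriptor open is an
assumption of the estimate, not something it verifies.

```python
# Figures quoted in the message; real values depend on the OS and its tuning.
TIME_WAIT_SECONDS = 120      # "2 minutes -- TIME_WAIT"
FDS_PER_CONNECTION = 2       # "2 file descriptors" per trigger invocation
NOFILE_LIMIT = 32768         # the user's nofile ulimit


def lingering_connections(interval_ms: float,
                          linger_s: float = TIME_WAIT_SECONDS) -> int:
    """Connections opened within one linger window at a fixed firing interval."""
    per_second = 1000.0 / interval_ms
    return round(per_second * linger_s)


def descriptor_estimate(interval_ms: float,
                        fds_per_connection: int = FDS_PER_CONNECTION) -> int:
    """Upper-bound estimate, assuming every lingering connection holds descriptors."""
    return lingering_connections(interval_ms) * fds_per_connection


if __name__ == "__main__":
    for interval_ms in (50, 100):
        estimate = descriptor_estimate(interval_ms)
        print(f"every {interval_ms}ms: ~{estimate} descriptors "
              f"(limit {NOFILE_LIMIT})")
```

With these inputs the estimate comes to at most 4800 descriptors, well under
the 32768 limit. So, under these assumptions, the trigger connections alone
would not obviously account for the emfile errors. This is only a rough
bound, not a measurement.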

I guess what I am asking for is some feedback/stats: throughput limits you
have observed with RabbitMQ, Mnesia or the operating system, whether
deploying 0.9.1 would lift any barriers, and whether there is something
fundamentally wrong with the way I am trying to use RabbitMQ. I remember
from the Erlang Factory talks that you have been working on new
persistence/offload-to-disk mechanisms.

cheers, Michael

Links:
---------
(*)
http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-June/004337.html
http://www.nabble.com/Large-number-of-connections-td22994598.html
http://www.nabble.com/spawn-problem-td9684760.html