[rabbitmq-discuss] Possible process leak in RabbitMQ server
tim at rabbitmq.com
Fri Jun 7 11:20:06 BST 2013
On 5 Jun 2013, at 17:00, Daniel Luna wrote:
> We are experiencing a growth of processes in the rabbitmq server for long running connections. I have no idea if we are doing something wrong or if this is a bug in rabbitmq itself.
> In an idle system we have about 30k connections to rabbitmq, using around 31k channels.
> Sockets and file descriptors match more or less one-to-one to the connections, and so does the number of Erlang processes initially. My problem here is that the number of Erlang processes is steadily growing.
There should be quite a few more Erlang processes than there are connections + channels. For each connection, for example, there will be more than one process (i.e., a reader plus a writer). For each queue there will be a process, and so on.
> At the moment we have ~500k Erlang processes in our rabbitmq server with 47k connections using 54k channels. As soon as the load on our system lightens up the number of connections and channels will go down as expected, but the number of processes will stay high.
That figure doesn't necessarily look wrong, given that each of the 47k connections requires a reader and a writer process, plus all the channels, plus a process per queue, plus various other system processes, supervisors, mirror queue masters/coordinators/slaves, and so on. It might be a little high, but the key point is whether or not the process count goes down as the load lightens, which you're saying it doesn't.
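As a rough sanity check on that figure, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: it assumes exactly two processes per connection (reader plus writer) and one per channel, which is a simplification of the description above, and the function name is made up for illustration. Queue, supervisor and mirroring processes are left out because their numbers aren't known from the report.

```python
def baseline_processes(connections, channels, per_connection=2, per_channel=1):
    """Estimate processes attributable to connections and channels only.

    The per_connection and per_channel defaults are assumptions for
    illustration, not measured values.
    """
    return connections * per_connection + channels * per_channel


reported_total = 500_000  # "~500k Erlang processes" from the report
baseline = baseline_processes(47_000, 54_000)
unexplained = reported_total - baseline

print(baseline)     # 148000
print(unexplained)  # 352000
```

Under those assumptions, roughly 352k processes would have to come from queues, supervisors and everything else, which is why the queue count and plugin list matter here.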
Do note that processes associated with a client, i.e., related to connections and channels, won't necessarily die immediately upon client disconnect. It will depend on how the disconnect occurs, and various other factors.
> Restarting the client will drop the process count, so this is something related to keeping long-lived connections and/or channels around.
Possibly. I'm not saying "nothing's wrong", but as you've mentioned below, a bit more research is a good idea.
> Oh, yeah, upgrading from 2.8.2 to 3.1.1 made no difference.
> Connection counts are steady as are channel counts (i.e. they grow and shrink as expected depending on load). We use both a polling and a subscribing pattern depending on the particular consumer. I've seen a system with all empty queues have this issue so it's not based on that either.
> I suspect there's a missing exit that leaves dangling processes for some sort of operation. They're properly linked to the connection (or channel), though, since they go away on connection restart.
That seems unlikely. Can you tell us a bit more about your system? Are you clustered? Are you running any plugins? How many queues are there? How are you counting the number of Erlang processes? Anything interesting in the logs? The output of `rabbitmqctl report` on each node would be useful. Which client libraries are you using? Are we talking just AMQP connections, or other things (such as STOMP, MQTT, etc.) as well?
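On the counting question, one way to take a reading is to ask the node directly via `rabbitmqctl eval` and the Erlang BIF `erlang:system_info(process_count)`. The sketch below is hypothetical: the helper names are invented here, and it assumes the command prints the count as a bare integer on a line of its own, which may differ between versions.

```python
import re
import subprocess


def parse_process_count(output):
    """Return the first line that is a bare integer, or None if none is found.

    Assumes the count is printed on a line by itself (an assumption about
    the output format, not a documented guarantee).
    """
    for line in output.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"\d+", stripped):
            return int(stripped)
    return None


def current_process_count():
    """Ask the local node for its Erlang process count via rabbitmqctl eval."""
    result = subprocess.run(
        ["rabbitmqctl", "eval", "erlang:system_info(process_count)."],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_process_count(result.stdout)
```

Sampling `current_process_count()` at intervals alongside the connection and channel counts would show whether the process count really stays high once the load drops.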
> I'll research some more and see what I can dig up.
Thanks for that. The more info the better.
> Help would be appreciated.
If we're leaking processes, then we definitely want to know. I suspect not, but the more info you can provide the better.