[rabbitmq-discuss] queue hangs - timeout

Mon May 23 13:07:34 BST 2011

Hello Emile,

Thanks a lot for your very quick reply. Please find the requested details inline below.

This morning I saw a response on this mailing list which, I assume, could describe our case:
>> If you have a fast producer running constantly, accompanied by a slow consumer then the number of queued messages will grow. To prevent this from causing out of memory problems, Rabbit will throttle the producer once the memory high watermark is hit.
>> 
>> The throttling works by simply blocking the producer from sending any more data over the TCP socket. Once memory drops below the high watermark, due to the consumer catching up or messages being flushed to disk, the producer can send data again.
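
To make the quoted behaviour concrete, here is a toy model of a fast producer, a slow consumer, and a memory high watermark. It is only a sketch and not how RabbitMQ is implemented: `ToyBroker`, `simulate`, and all of the numbers are hypothetical, and memory is approximated by the sum of the queued message sizes.

```python
from collections import deque


class ToyBroker:
    """Toy model of memory-based producer throttling (not RabbitMQ's real code)."""

    def __init__(self, high_watermark_bytes):
        self.high_watermark_bytes = high_watermark_bytes
        self.queue = deque()
        self.used_bytes = 0

    def producer_blocked(self):
        # The producer is throttled while memory is at or above the watermark.
        return self.used_bytes >= self.high_watermark_bytes

    def publish(self, body):
        # Returns False when the publish is refused because of throttling.
        if self.producer_blocked():
            return False
        self.queue.append(body)
        self.used_bytes += len(body)
        return True

    def consume(self):
        # Consuming frees memory, which eventually unblocks the producer.
        if not self.queue:
            return None
        body = self.queue.popleft()
        self.used_bytes -= len(body)
        return body


def simulate(ticks, publish_per_tick, consume_per_tick, msg_size, high_watermark_bytes):
    """Return (ticks in which the producer was blocked, final queue length)."""
    broker = ToyBroker(high_watermark_bytes)
    blocked_ticks = 0
    for _ in range(ticks):
        was_blocked = False
        for _ in range(publish_per_tick):
            if not broker.publish(b"x" * msg_size):
                was_blocked = True
        if was_blocked:
            blocked_ticks += 1
        for _ in range(consume_per_tick):
            broker.consume()
    return blocked_ticks, len(broker.queue)


if __name__ == "__main__":
    # Hypothetical rates: producer five times faster than the consumer.
    print(simulate(ticks=10, publish_per_tick=5, consume_per_tick=1,
                   msg_size=100, high_watermark_bytes=1000))
```

In this model the producer is only ever paused and recovers as soon as the consumer frees memory. What we observe is different, because our connections stay hung until the broker is restarted.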

regards, tom

On May 20, 2011, at 1:02 PM, Emile Joubert wrote:

> Hi Thomas,
> 
> On 19/05/11 13:35, Thomas Stagl wrote:
>> Hello,
>> 
>> we are running a queue in our test environment with different vhosts.
>> When we do a long-running perf test, we notice that the responses
>> from the queue are timing out and from that moment on, it is not
>> possible to connect to the queue again. Neither with the java lib
>> (which we use in our application) nor with celery (which I use for
>> testing purposes.)
>> 
>> We have noticed that beam.smp is running at 30% CPU time from that
>> moment on.
>> 
>> When we trace a connection, we see this with tcpdump:
>>
>> 12:23:52.700093 IP CLIENT.43248 > RABBIT.amqp: Flags [P.], seq 2421288989:2421289002, ack 2617563671, win 54, options [nop,nop,TS val 2047452 ecr 1377663677], length 13
>> 12:23:52.700124 IP RABBIT.amqp > CLIENT.43248: Flags [.], ack 13, win 62, options [nop,nop,TS val 1377881158 ecr 2047452], length 0
>> 
>> From that moment on, the connection is hanging.
>> 
>> After a restart of the rabbitmq server, everything is running fine
>> again, for a certain amount of time.
>> 
>> We also have a second test stack and we switched to the second
>> rabbitmq test server, same behaviour here. It worked fine and after a
>> couple of hours it stopped responding.
>> 
>> Any ideas? Any thoughts about where we can start debugging would be
>> very much appreciated.
> 
> We have brokers that have been running for weeks without exhibiting the behaviour you describe. Can you help us to emulate your test environment more closely?
> 
> What version of Erlang, OS and rabbit are you using?
RabbitMQ 2.3.1 / Erlang R14A running on Debian Squeeze

> What are the contents of the rabbit config file?
We do not use the config file

> What exactly is the perf test doing?
We are creating new users in our application and passing each new profile to another application. We use the queue to send the message asynchronously.

> What is the rate and size of messages, number of producers, consumers, are the messages persistent?
> If you have the management plugin installed then a copy of the broker configuration may be useful.
We use persistent queues and send ~500 messages per second. There is one producer and one consumer, and the consumer is definitely much slower than the producer.
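
As a rough back-of-the-envelope check, the backlog growth from these rates can be estimated as below. This is only a sketch: apart from the ~500 messages per second, every number is an assumption (the consumer rate, the message size, and the memory threshold are all hypothetical), and it ignores that persistent messages may be flushed to disk, which would slow memory growth.

```python
def seconds_until_threshold(produce_rate, consume_rate, bytes_per_message, threshold_bytes):
    """Estimate how long until the queued backlog reaches threshold_bytes.

    Returns None when the consumer keeps up and the backlog does not grow.
    """
    growth_per_second = produce_rate - consume_rate  # messages per second
    if growth_per_second <= 0:
        return None
    return threshold_bytes / (growth_per_second * bytes_per_message)


if __name__ == "__main__":
    # 500 msg/s is from the test; 100 msg/s, 1000 bytes and 1.6e9 bytes are assumed.
    estimate = seconds_until_threshold(
        produce_rate=500,
        consume_rate=100,
        bytes_per_message=1000,
        threshold_bytes=1_600_000_000,
    )
    print(f"{estimate / 60:.0f} minutes until the assumed threshold is reached")
```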
> 
> Does rabbitmqctl still work when the problem occurs? The output of all the list_* commands at the time of failure will be useful.
No, we can connect, but we cannot send anything to the queue.

> 
> The broker logfile entries around the onset of the problem will also help.

> 
> Were there any other applications running on the same OS as rabbit?
No, there are no other applications running besides rabbitmq.
> 
> Were the consumers (if any) keeping up with producers at or just before the onset of the problem?
> 
> What was the memory consumption reported by the management plugin at this time?
1.6GB
> 
> If you can help us to narrow this down to a specific set of conditions that reliably trigger the problem that will be a big step towards a solution.
> 
> 
> 
> Thanks
> 
> Emile