[rabbitmq-discuss] flow control issues

Wed Sep 8 10:33:03 BST 2010

We are evaluating the RabbitMQ message broker, and we are currently
having difficulty understanding how flow control works.

- Our application involves 10 000 peers periodically producing messages
to a single queue. Another peer consumes from this queue asynchronously.
- All peers are written in Java, using amqp-client-1.8.1.
- The production rate of a single peer is 4 messages / hour
- We can simulate a time-consuming task in the consumer callback, to
emulate a faster or slower consumer.
- We are using an SSL certificate on the broker side to allow the peers
to authenticate the broker.
	- We have noticed that the use of SSL has a dramatic impact on the
memory used by the RabbitMQ process.

- We can optionally add a second consumer to load-balance the
consumption of messages.
	- We are using a prefetch window of 1 message to enable credit-based
flow control in this case.
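
To make clear what we mean by a prefetch window of 1, here is a minimal
toy model of it in plain Java. It is not RabbitMQ client code: the class
PrefetchWindow and its methods are hypothetical names of ours, and it
only covers a positive window size. In the real client we set the window
with Channel.basicQos(1) and acknowledge each message by hand.

```java
/** Toy model of a prefetch window; this is not RabbitMQ client code. */
final class PrefetchWindow {
    private final int prefetchCount;
    private int unacked = 0;

    PrefetchWindow(int prefetchCount) {
        if (prefetchCount < 1) {
            throw new IllegalArgumentException("prefetchCount must be >= 1");
        }
        this.prefetchCount = prefetchCount;
    }

    /** Broker side: returns true if one more message may be sent to the consumer. */
    synchronized boolean tryDeliver() {
        if (unacked >= prefetchCount) {
            return false; // window is full, the message stays in the queue
        }
        unacked++;
        return true;
    }

    /** Consumer side: acknowledging a message frees one slot in the window. */
    synchronized void ack() {
        if (unacked == 0) {
            throw new IllegalStateException("no unacknowledged message");
        }
        unacked--;
    }

    /** Number of messages delivered but not yet acknowledged. */
    synchronized int unacked() {
        return unacked;
    }

    public static void main(String[] args) {
        PrefetchWindow window = new PrefetchWindow(1);
        System.out.println(window.tryDeliver()); // true: first message is delivered
        System.out.println(window.tryDeliver()); // false: still waiting for an ack
        window.ack();
        System.out.println(window.tryDeliver()); // true: the ack freed the window
    }
}
```

With two consumers each holding such a window, a slow consumer never has
more than one message in flight, so the remaining messages can go to
whichever consumer acknowledges first.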

- We have set up several monitoring indicators on the broker side:
	- virtual memory used by the RabbitMQ process
	- CPU load
	- queue depth
	- disk usage

- Our test scenario is as follows:
	- during the first 5 hours, all peers (producers and consumers)
take part
	- after 5 hours, the producers stop publishing messages
	- the test goes on for a configurable duration to allow the consumer
to finish emptying the queue
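
To put rough numbers on this scenario, here is a small back-of-the-envelope
sketch. The producer count, per-producer rate and 5-hour duration are the
ones given above. The 200 ms consumer service time is purely an assumed
value for illustration (our real callback delay is configurable), and the
class and method names are hypothetical.

```java
/** Back-of-the-envelope load figures for the test scenario. */
final class ScenarioLoad {

    /** Aggregate publish rate of all producers, in messages per hour. */
    static double ingressPerHour(int producers, double messagesPerProducerPerHour) {
        return producers * messagesPerProducerPerHour;
    }

    /** Maximum consumption rate, in messages per hour, for a given service time. */
    static double drainPerHour(int consumers, double serviceTimeMillis) {
        return consumers * (3_600_000.0 / serviceTimeMillis);
    }

    /** Queue depth after the given time, starting empty, with constant rates. */
    static double backlogAfter(double hours, double ingressPerHour, double drainPerHour) {
        return Math.max(0.0, (ingressPerHour - drainPerHour) * hours);
    }

    public static void main(String[] args) {
        double ingress = ingressPerHour(10_000, 4.0);
        double drain = drainPerHour(1, 200.0); // assumed 200 ms per message
        System.out.println("ingress per hour:    " + ingress);
        System.out.println("published in 5 h:    " + ingress * 5.0);
        System.out.println("drain per hour:      " + drain);
        System.out.println("backlog after 2 h:   " + backlogAfter(2.0, ingress, drain));
    }
}
```

The ingress side is fixed by the scenario: 40 000 messages per hour, so
200 000 messages over the 5 hours. The drain side depends entirely on the
assumed service time, so the backlog figure is only indicative.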

During long-running tests, we have encountered strange behaviour due
to flow control:

The queue depth starts to increase linearly for about 2 hours. This
is expected, since the throughput of the single consumer
is not enough to absorb the message ingress. Memory usage grows faster
as well, until the memory watermark is reached on the broker side.

From that point, the producers are indeed paused, as a flow control
request has been issued by the broker, but the consumer seems to be
blocked
as well. The queue depth stays flat at its peak value until the end of
the test, even after memory usage has dropped below the threshold.

By registering a FlowListener callback, we have noticed that not all
of the producers are notified every time the alarm is set.
Does this mean that the broker applies some heuristic to avoid
blocking everybody every time?
Or does it mean that some of the channels have somehow been
blacklisted by the broker?

Could anybody explain how the blocking of consumers is supposed to be
implemented?
Does a call to Channel.basicPublish() somehow block the connection
thread?
How come the consumer connection is also blocked?
Would implementing the FlowListener interface help to handle
flow control requests?
(I thought at first glance that flow control had to be implemented
by hand using this interface,
but looking at http://hopper.squarespace.com/blog/2008/11/9/flow-control-in-rabbitmq.html
it seems that this is not the case after all.)
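
For reference, this is roughly what I had in mind by handling flow
control by hand on the producer side. It is only a sketch: FlowCallback
is a hypothetical stand-in with the shape I understand FlowListener to
have (a single method taking a boolean), and PublishGate is a name of my
own. I am not claiming this is how the client is meant to be used.

```java
/** Hypothetical stand-in for the client's flow callback (assumed shape). */
interface FlowCallback {
    void handleFlow(boolean active);
}

/** Lets a producer thread wait until the broker allows publishing again. */
final class PublishGate implements FlowCallback {
    private boolean active = true;

    /** Called when the broker pauses (false) or resumes (true) the flow. */
    @Override
    public synchronized void handleFlow(boolean active) {
        this.active = active;
        if (active) {
            notifyAll(); // wake up producers waiting in awaitActive()
        }
    }

    /**
     * Waits until publishing is allowed or the timeout expires.
     * Returns true if the caller may publish, false on timeout or interrupt.
     */
    synchronized boolean awaitActive(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!active) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PublishGate gate = new PublishGate();
        gate.handleFlow(false);                    // broker asks us to pause
        System.out.println(gate.awaitActive(50));  // false: still paused
        gate.handleFlow(true);                     // broker resumes the flow
        System.out.println(gate.awaitActive(50));  // true: publishing allowed
    }
}
```

Each producer would call awaitActive() before publishing and skip or
delay the message when it returns false.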

Best regards,

Romary.