[rabbitmq-discuss] Queue depth and no. of consumers.

Mon Oct 31 16:27:46 GMT 2011

On 31/10/11 14:31, michael neidhardt wrote:
> With no QoS set, I assume each consumer fetches as many messages as
> possible (whatever that means),
> and 'queues' them itself.

Yes, although the "queueing" doesn't mean there's an explicit queue in
the client; messages can simply back up in network buffers etc.

> I will try to set QoS/prefetch_count to 1,
> though I seem to remember
> having read that this can cause trouble. I would assume that it would
> be no problem, have you got any thoughts on whether it's a good idea?

I don't see why it should cause trouble. Setting prefetch to 1 does mean
that throughput will drop, since the broker cannot deliver another
message to a consumer until the previous one has been acknowledged. If
throughput is a concern, you could set the prefetch to some larger
number: less fairness, but more performance.

In general setting the prefetch count is a good idea if you're concerned 
with fair distribution of messages.
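
To make the fairness/throughput trade-off concrete, here is a toy
simulation in Python. It is only a sketch, not how the broker is
implemented: the round-robin dispatch rule, the simulate() helper and the
service times (one consumer taking 1 time unit per message, the other 4)
are all assumptions made up for illustration.

```python
import heapq


def simulate(num_messages, service_times, prefetch=None):
    """Toy model of a broker feeding several consumers from one queue.

    service_times[i] is how long consumer i takes per message (made-up units).
    prefetch is the max number of unacked messages per consumer; None means
    no limit. Returns (messages delivered per consumer, finish time).
    """
    n = len(service_times)
    remaining = num_messages
    unacked = [0] * n      # delivered to the consumer but not yet acked
    delivered = [0] * n
    free_at = [0.0] * n    # when each consumer finishes its buffered work
    acks = []              # min-heap of (ack_time, consumer)
    next_rr = 0            # round-robin pointer
    now = 0.0

    def dispatch():
        nonlocal remaining, next_rr
        while remaining > 0:
            # Next consumer in round-robin order that is below its limit.
            for offset in range(n):
                c = (next_rr + offset) % n
                if prefetch is None or unacked[c] < prefetch:
                    break
            else:
                return  # every consumer is at its prefetch limit
            remaining -= 1
            unacked[c] += 1
            delivered[c] += 1
            start = max(now, free_at[c])
            free_at[c] = start + service_times[c]
            heapq.heappush(acks, (free_at[c], c))
            next_rr = (c + 1) % n

    dispatch()
    while acks:
        now, c = heapq.heappop(acks)  # consumer c acks one message
        unacked[c] -= 1
        dispatch()
    return delivered, now


if __name__ == "__main__":
    for limit in (None, 1):
        counts, finished = simulate(100, [1.0, 4.0], prefetch=limit)
        print(f"prefetch={limit}: per-consumer={counts}, finished at t={finished}")
```

With these made-up numbers, no limit splits the 100 messages 50/50 and
the slow consumer is still working long after the fast one is idle. With
a prefetch of 1 the fast consumer ends up with most of the messages. The
toy model ignores network latency, which is exactly the cost a small
prefetch incurs in practice.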

> As I wrote earlier, we have a number (several thousand) of files to
> process, each of which may contain up to several hundred thousand
> records (around 2KB each).
>
> In an earlier test, I let the processes handling files push the ID of
> each record to a queue.
> (Simply add a publish to the above code after the ack). The consumer
> for this queue (which uses autoack)

(Of course, setting prefetch doesn't do anything in no-ack (autoack)
mode, since the server has no way to know how many messages have made it
all the way to the client.)

> would do a lookup in a Postgresql
> DB, and nothing else.
> After about 50 million records, the vm_memory_high_watermark would be
> set, and shortly after
> that, I got {"timeout waiting for channel.flow_ok{active=false}",none}.
> Eventually the whole system froze.
> I guess this timeout is caused by my client not reacting to the flow
> control from the RabbitMQ server. Is that correct? Unfortunately, the
> client I use does not have methods for that. Should I expect to handle
> this in normal operation, or could it be handled by a client for me?

I think you said you were using 1.8.1. Much has changed since then:

* We no longer use channel.flow to throttle producers since, as you're
seeing, many clients did not implement it correctly, or at all. We now
use TCP backpressure instead.

* Prior to 2.0 all messages in all queues had to fit in memory. Messages 
are now paged to disk when memory is low.
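
For readers unfamiliar with the TCP backpressure mentioned in the first
point, the general mechanism can be seen with a plain socket pair in
Python. This is only a sketch of the OS-level behaviour and says nothing
about the broker's internals; the fill_until_blocked() helper and the
chunk size are made up for illustration. When the reading side stops
reading, the writer's buffer fills and further sends stall (or, on a
non-blocking socket as here, raise BlockingIOError).

```python
import socket


def fill_until_blocked(chunk_size=4096):
    """Write to a socket nobody reads from until the OS refuses more data.

    Returns (bytes accepted before blocking, bytes drained by the reader,
    whether sending worked again after the reader drained its side).
    """
    producer, consumer = socket.socketpair()
    producer.setblocking(False)   # raise instead of stalling when full
    consumer.setblocking(False)
    chunk = b"x" * chunk_size
    sent = 0
    try:
        # The "consumer" is not reading, so the buffers eventually fill.
        while True:
            try:
                sent += producer.send(chunk)
            except BlockingIOError:
                break  # backpressure: the producer must wait

        # Now the reader catches up and drains whatever is buffered.
        drained = 0
        while True:
            try:
                data = consumer.recv(65536)
            except BlockingIOError:
                break
            drained += len(data)

        # With buffer space freed, the producer can make progress again.
        resumed = producer.send(chunk) > 0
    finally:
        producer.close()
        consumer.close()
    return sent, drained, resumed


if __name__ == "__main__":
    sent, drained, resumed = fill_until_blocked()
    print(f"accepted {sent} bytes before blocking, drained {drained}, "
          f"resumed={resumed}")
```

A client using ordinary blocking sockets needs no special code for this:
its publish call simply takes longer to return while the other side is
not reading.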

So I would strongly advise you to upgrade to 2.6.1. It's available in 
our apt repository:

http://www.rabbitmq.com/debian.html

> Oh, and the big question: Is it out of the question to handle approx.
> 300 mill. messages (where payload is essentially a bigint) over a few
> days?

That should be no big deal. My workstation can churn through that in a 
few hours.
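
As a back-of-the-envelope check in Python, taking "a few days" and "a few
hours" to mean three of each (both figures are assumptions, as is the
required_rate() helper):

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def required_rate(messages, seconds):
    """Average messages per second needed to move `messages` in `seconds`."""
    return messages / seconds


if __name__ == "__main__":
    total = 300_000_000
    print(f"over 3 days:  {required_rate(total, 3 * SECONDS_PER_DAY):,.0f} msg/s")
    print(f"over 3 hours: {required_rate(total, 3 * SECONDS_PER_HOUR):,.0f} msg/s")
```

That works out to roughly 1,157 messages per second sustained over three
days, against roughly 27,778 per second to finish in three hours.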

Cheers, Simon

-- 
Simon MacMullen
RabbitMQ, VMware