[rabbitmq-discuss] Queue depth and no. of consumers.
Simon MacMullen
simon at rabbitmq.com
Mon Oct 31 16:27:46 GMT 2011
On 31/10/11 14:31, michael neidhardt wrote:
> With no QoS set, I assume each consumer fetches as many messages as
> possible (whatever that means),
> and 'queues' them itself.
Yes, although the "queueing" doesn't mean there's an explicit queue in
the client. But messages can back up in networking buffers etc.
> I will try to set QoS/prefetch_count to 1,
> though I seem to remember
> having read that this can cause trouble. I would assume that it would
> be no problem, have you got any thoughts on whether it's a good idea?
I don't see why it should cause trouble. Setting prefetch to 1 does mean
that throughput will drop, since the broker cannot deliver another
message until the previous one has been acknowledged. If throughput
is a concern, you might set the prefetch to some larger number for
less fairness but more performance.
In general setting the prefetch count is a good idea if you're concerned
with fair distribution of messages.
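
To make the fairness point concrete, here is a toy simulation. It is only a
sketch under stated assumptions and not RabbitMQ's actual dispatch algorithm:
the `simulate` function, the round-robin delivery and the message durations
are all made up for illustration, and network round trips (the cost that
makes a prefetch of 1 slower in practice) are ignored.

```python
import heapq
from collections import deque


def simulate(durations, n_consumers, prefetch=None):
    """Return (makespan, messages handled per consumer) for a toy broker.

    prefetch=None models "no QoS": every message is handed out round-robin
    straight away. Otherwise a consumer holds at most `prefetch`
    unacknowledged messages and gets the next one only after an ack.
    """
    pending = deque(durations)          # messages still held by the broker
    limit = len(durations) if prefetch is None else prefetch
    outstanding = [0] * n_consumers     # delivered but not yet acked
    free_at = [0] * n_consumers         # time each consumer finishes its backlog
    handled = [0] * n_consumers
    acks = []                           # heap of (ack time, consumer)

    def deliver(consumer, now):
        duration = pending.popleft()
        finish = max(now, free_at[consumer]) + duration
        free_at[consumer] = finish
        outstanding[consumer] += 1
        handled[consumer] += 1
        heapq.heappush(acks, (finish, consumer))

    # Initial round-robin delivery until every consumer is at its limit.
    consumer = 0
    while pending and min(outstanding) < limit:
        if outstanding[consumer] < limit:
            deliver(consumer, 0)
        consumer = (consumer + 1) % n_consumers

    # Each ack frees a slot, so the broker may deliver one more message.
    makespan = 0
    while acks:
        now, consumer = heapq.heappop(acks)
        makespan = max(makespan, now)
        outstanding[consumer] -= 1
        if pending:
            deliver(consumer, now)
    return makespan, handled


# Alternating slow (10) and fast (1) messages, two consumers.
durations = [10, 1, 10, 1, 10, 1, 10, 1]
print(simulate(durations, 2, prefetch=None))  # (40, [4, 4])
print(simulate(durations, 2, prefetch=1))     # (22, [4, 4])
```

Without a limit, one consumer happens to receive every slow message and
finishes at time 40 while the other is idle after time 4. With a prefetch of
1 the work evens out and both finish at time 22.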
> As I wrote earlier, we have a number (several thousand) of files to
> process, each of which may contain up to several hundred thousand
> records (around 2KB each).
>
> In an earlier test, I let the processes handling files push the ID of
> each record to a queue.
> (Simply add a publish to the above code after the ack). The consumer
> for this queue (which uses autoack)
(Of course, setting prefetch doesn't do anything in noack mode since the
server has no way to know how many messages have made it all the way to
the client.)
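
As a rough illustration of combining a prefetch limit with explicit acks, the
sketch below uses method names modelled on the Python pika 1.x client
(`basic_qos`, `basic_consume`, `basic_ack`). That choice is an assumption:
the client used in this thread is not named, and `start_fair_consumer`,
`queue_name` and `handle` are hypothetical names.

```python
def start_fair_consumer(channel, queue_name, handle, prefetch_count=1):
    """Register a consumer that acks each message after handling it.

    `channel` is assumed to be a pika-style channel object; nothing here
    opens a connection.
    """
    # Cap the number of unacknowledged messages the broker will send us.
    channel.basic_qos(prefetch_count=prefetch_count)

    def on_message(ch, method, properties, body):
        handle(body)
        # Ack only once the work is done; this frees a prefetch slot so the
        # broker can deliver the next message.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    # auto_ack=False matters: in no-ack mode the prefetch limit has no effect.
    channel.basic_consume(
        queue=queue_name,
        on_message_callback=on_message,
        auto_ack=False,
    )
    return on_message
```

With a real pika channel you would then call `channel.start_consuming()`;
other clients expose the same AMQP operations under different names.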
> would do a lookup in a Postgresql
> DB, and nothing else.
> After about 50 million records, the vm_memory_high_watermark would be
> set, and shortly after
> that, I got <"timeout waiting for channel.flow_ok{active=false}",none}>>.
> Eventually the whole system froze.
> I guess this timeout is caused by my client not reacting to the flow
> control from the RabbitMQ server. Is that correct? Unfortunately, the
> client I use does not have methods for that. Should I expect to handle
> this in normal operation, or could it be handled by a client for me?
I think you said you were using 1.8.1. Much has changed since then:
* We no longer use channel.flow to throttle producers since, as you're
seeing, many clients did not implement it correctly, or at all. We now
use TCP backpressure instead.
* Prior to 2.0 all messages in all queues had to fit in memory. Messages
are now paged to disk when memory is low.
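
The backpressure idea can be sketched with a bounded buffer. This is an
analogy only, not how the broker is implemented: the `queue.Queue` stands in
for the socket buffers between publisher and broker, and
`publish_until_blocked` is a hypothetical helper. The point is that the
publisher needs no protocol-level cooperation; once the reader stops
draining, the writer simply cannot make progress.

```python
import queue


def publish_until_blocked(buffer, messages):
    """Put messages into a bounded buffer and return how many were accepted.

    Stops at the first message that does not fit, which is the point where
    a blocking socket write would stall the publisher.
    """
    accepted = 0
    for message in messages:
        try:
            buffer.put_nowait(message)
        except queue.Full:
            break
        accepted += 1
    return accepted


buffer = queue.Queue(maxsize=3)                # stand-in for socket buffers
sent = publish_until_blocked(buffer, range(10))
print(sent)                                    # 3: the buffer is full

buffer.get_nowait()                            # the reader drains one message
sent_more = publish_until_blocked(buffer, range(3, 10))
print(sent_more)                               # 1: exactly one slot was freed
```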
So I would strongly advise you to upgrade to 2.6.1. It's available in
our apt repository:
http://www.rabbitmq.com/debian.html
> Oh, and the big question: Is it out of the question to handle approx.
> 300 mill. messages (where payload is essentially a bigint) over a few
> days?
That should be no big deal. My workstation can churn through that in a
few hours.
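
For a sense of scale, the average rates involved can be worked out directly.
The figures of three days and three hours are assumptions standing in for
"a few days" and "a few hours"; `required_rate` is just a helper for the
arithmetic.

```python
def required_rate(messages, seconds):
    """Average messages per second needed to finish within `seconds`."""
    return messages / seconds


MESSAGES = 300_000_000
THREE_DAYS = 3 * 24 * 3600    # 259200 seconds (assumed "a few days")
THREE_HOURS = 3 * 3600        # 10800 seconds (assumed "a few hours")

print(round(required_rate(MESSAGES, THREE_DAYS)))   # 1157
print(round(required_rate(MESSAGES, THREE_HOURS)))  # 27778
```

So spreading the load over three days needs a little over a thousand
messages per second on average, roughly 24 times lower than the rate implied
by finishing in three hours.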
Cheers, Simon
--
Simon MacMullen
RabbitMQ, VMware