[rabbitmq-discuss] Scaling consumers

majek04 majek04 at gmail.com
Fri Mar 26 13:45:36 GMT 2010


On Wed, Mar 24, 2010 at 11:17, Suhail Doshi <suhail at mixpanel.com> wrote:
> How are people scaling consumers to handle tens of millions of messages
> inside a queue? We use Python but do you guys Twisted, Eventlet, Gevent,
> etc?
> Creating a bunch of Python processes seems very inefficient and most
> applications I would think our mostly network io bound. What are you guys
> doing to scale your consumers?

If I understand correctly, the situation is:
 - a _lot_ of messages
 - a _lot_ of consumers
 - but in the simplistic scenario every consumer is
   a separate process or thread.
 - as consumers are network bound, the question is how to
   run a lot of them without using too much RAM and
   without wasting CPU time on context switches.

A few ideas:

1) use a proper asynchronous Python client - txamqp for Twisted.
 * That should be the best option.
 * You will be fully asynchronous at the network level.
 * That's what hardcore Python people use. (Rough sketch below.)
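For reference, here's a rough sketch of a txamqp consumer loop. The
queue name, credentials and spec file path are placeholders, and the
exact txamqp API may differ a bit between versions:

  from twisted.internet import reactor
  from twisted.internet.defer import inlineCallbacks
  from twisted.internet.protocol import ClientCreator
  from txamqp.protocol import AMQClient
  from txamqp.client import TwistedDelegate
  import txamqp.spec

  @inlineCallbacks
  def consume():
      # the AMQP 0-8 spec file ships with txamqp; the path is a placeholder
      spec = txamqp.spec.load("amqp0-8.xml")
      client = yield ClientCreator(reactor, AMQClient, TwistedDelegate(),
                                   "/", spec).connectTCP("localhost", 5672)
      yield client.authenticate("guest", "guest")

      chan = yield client.channel(1)
      yield chan.channel_open()
      yield chan.basic_consume(queue="work", consumer_tag="worker1")
      queue = yield client.queue("worker1")

      while True:
          msg = yield queue.get()   # yields to the reactor while waiting
          # ... handle msg.content.body here ...
          chan.basic_ack(delivery_tag=msg.delivery_tag)

  consume()
  reactor.run()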

2) use a hybrid Python client - Pika, based on asyncore.
 * It is asynchronous in some respects: it can handle a lot
   of parallel consumers in one process (i.e. one event loop).
 * On the other hand, it's not really asynchronous
   for some other things, like publishing or declaring queues.
 * You'll need extra work to asynchronously support
   protocols other than AMQP.
 * But it should work fine if your app is doing AMQP-to-AMQP things.
   (Rough sketch below.)
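A callback-style Pika consumer looks roughly like this. The class and
function names below come from the 0.5-era asyncore adapter and the
callback signature may differ between Pika versions, so treat this as
an approximation rather than a reference:

  import pika

  conn = pika.AsyncoreConnection(pika.ConnectionParameters("localhost"))
  chan = conn.channel()
  chan.queue_declare(queue="work", durable=True,
                     exclusive=False, auto_delete=False)

  def handle_delivery(channel, method, header, body):
      # ... handle body here ...
      channel.basic_ack(delivery_tag=method.delivery_tag)

  # many basic_consume calls can share this one process / event loop
  chan.basic_consume(handle_delivery, queue="work")
  pika.asyncore_loop()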

3) use a synchronous Python client - py-amqplib:
 * writing consumer code is very simple
 * you can scale by spawning a large number of threads/processes
   (see the sketch after this list).
 * But you lose some CPU on context switches.
 * You can lose some RAM on the thread or process overhead.
 * This technique is widely used.
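With py-amqplib the consumer is plain blocking code, and you scale it
by starting more processes (here via multiprocessing; the queue name,
connection parameters and process count are placeholders):

  from multiprocessing import Process
  from amqplib import client_0_8 as amqp

  def worker():
      conn = amqp.Connection(host="localhost:5672",
                             userid="guest", password="guest",
                             virtual_host="/")
      chan = conn.channel()
      chan.queue_declare(queue="work", durable=True, auto_delete=False)

      def on_message(msg):
          # ... handle msg.body here ...
          msg.channel.basic_ack(msg.delivery_tag)

      chan.basic_consume(queue="work", callback=on_message)
      while True:
          chan.wait()   # blocks until the next delivery arrives

  # one OS process per consumer - simple, but each one costs RAM
  # and context switches
  if __name__ == "__main__":
      for i in range(8):
          Process(target=worker).start()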

> Creating a bunch of Python processes seems very inefficient and most
> applications I would think our mostly network io bound. What are you guys
> doing to scale your consumers?

Although it's 'inefficient', it's good enough for most scenarios.

Marek Majkowski



