[rabbitmq-discuss] Scalability?

Thu May 6 20:22:40 BST 2010

Something smells rotten in Denmark.  I can't comment on your particular 
implementation but what I can tell you is that we're pumping a lot of 
data through RabbitMQ (avg 320KiB/s 24x7), that we're far exceeding your 
volumes (as I can decipher from your post), and that we routinely run 
for days/weeks on end without any issues.

Given that your store is Cassandra, you probably shouldn't even be 
seeing a meaningful CPU load at those volumes (for the database).

And I don't understand your 'hold what fits in memory' comment at all.

Would someone who knows something about the Python API please steer this 
lost soul in the right direction?

On 5/6/2010 12:49 PM, Wayne Van Den Handel wrote:
> I am evaluating RabbitMQ for purposes of parallelization on top of a
> Cassandra data store. I have created a simple test scenario of a set of
> Queues that are given data to be loaded from a single Python publisher
> and 3-4 Python Consumer applications take the data from the Queues and
> load into Cassandra. The entire scenario was easily set up and runs
> great for about 10 minutes when RabbitMQ proceeds to use up all
> available memory and crashes.  I then discovered the passive mode to
> create a queue (and find out how many messages it has) and now only add
> more work to the queue when there are fewer than 1000 messages in the
> queue (which easily fit into memory). I start up my test again and still
> blow RabbitMQ up in 10 minutes. I am watching with the admin console the
> entire time and there are never more than a total of 1000 messages in all
> queues at any given time. Watching top I see RabbitMQ take up more and
> more memory over time. It seems that it can only process 30-40k messages
> in total/aggregate before it crashes (even though there are never more
> than 1000 messages in all queues at one time).
>
> Am I missing something here? The product seems very easy to use and
> works great but it is totally unscalable. Is RabbitMQ not meant for high
> data volumes/traffic? What would better serve this purpose? We need
> something on top of Cassandra to provide high volume parallelization. I
> understand that we can only hold what fits in memory right now (when
> will that be fixed?), but even that is not true as memory is never given
> back.
>
> Environment:
> CentOS 5.4 64 Bit
> RabbitMQ v1.7.2-1.el5 installed from yum
> py-amqplib
>
> Create Queue
> chan.queue_declare(queue="dr_load.1", durable=True, exclusive=False,
> auto_delete=False)
> chan.exchange_declare(exchange="dr_load", type="direct", durable=True,
> auto_delete=False)
> chan.queue_bind(queue="dr_load.1", exchange="dr_load",
> routing_key="Instance.1")
>
> Publish Data
> chan.basic_publish(msg, exchange="dr_load", routing_key="Instance.1",
> mandatory=True)
>
> Consume Data
> msg = chan.basic_get("dr_load.1")
> chan.basic_ack(msg.delivery_tag)
>
> Thanks!
>
>    
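
For readers following the thread, the throttling scheme described in the quoted message (a passive queue declare to read the depth, publishing only while fewer than 1000 messages are waiting) might be sketched roughly as below. This is a minimal sketch and not a diagnosis of the memory growth reported above. The function names publish_throttled and drain, the high_water and poll_seconds parameters, and the injectable sleep are all hypothetical. It assumes a single publisher and a py-amqplib style channel whose passive queue_declare returns (queue_name, message_count, consumer_count), and it passes the queue name (not the routing key) to basic_get.

```python
import time


def publish_throttled(chan, messages, queue="dr_load.1", exchange="dr_load",
                      routing_key="Instance.1", high_water=1000,
                      poll_seconds=0.5, sleep=time.sleep):
    """Publish messages, pausing whenever the queue holds high_water or more.

    Assumes this is the only publisher, so after one depth check it is safe
    to send (high_water - depth) messages before checking again.
    """
    sent = 0
    budget = 0  # messages we may still publish before re-checking the depth
    for msg in messages:
        while budget <= 0:
            # A passive declare never creates the queue; it only reports on it.
            _, depth, _ = chan.queue_declare(queue=queue, passive=True)
            budget = high_water - depth
            if budget <= 0:
                sleep(poll_seconds)  # give the consumers time to catch up
        chan.basic_publish(msg, exchange=exchange, routing_key=routing_key,
                           mandatory=True)
        budget -= 1
        sent += 1
    return sent


def drain(chan, handle, queue="dr_load.1", max_messages=None):
    """Pull messages with basic_get, acking each only after handle() returns."""
    handled = 0
    while max_messages is None or handled < max_messages:
        msg = chan.basic_get(queue)  # queue name; None when the queue is empty
        if msg is None:
            break
        handle(msg)  # e.g. write the record to the data store
        chan.basic_ack(msg.delivery_tag)
        handled += 1
    return handled
```

With a real broker, chan would be a channel obtained from an amqplib connection and each item in messages an amqplib Message. Polling with basic_get costs one round trip per message; AMQP also offers basic_consume, where the broker pushes deliveries to the consumer, which may suit a long-running loader better.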