[rabbitmq-discuss] Scalability?
Wayne Van Den Handel
wvandenhandel at dataraker.com
Thu May 6 18:49:50 BST 2010
I am evaluating RabbitMQ for parallelization on top of a Cassandra data
store. I created a simple test scenario: a set of queues is fed by a
single Python publisher, and 3-4 Python consumer applications take the
data from the queues and load it into Cassandra. The whole scenario was
easy to set up and runs great for about 10 minutes, at which point
RabbitMQ uses up all available memory and crashes.

I then discovered passive queue declaration (which also reports how many
messages a queue holds) and now only add more work to a queue when it
has fewer than 1000 messages, which easily fit into memory. I started my
test again and still blew RabbitMQ up in 10 minutes. I watched the admin
console the entire time, and there were never more than 1000 messages
across all queues at any given moment. Watching top, I see RabbitMQ take
up more and more memory over time. It seems it can only process 30-40k
messages in total/aggregate before it crashes, even though there are
never more than 1000 messages queued at one time.
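In case it helps, here is roughly what my throttling check looks like. The 1000-message cap is an arbitrary choice on my part, and the live-broker calls are shown commented so the decision logic itself is plain (a sketch, not my exact production code):

```python
# Sketch of the throttled publish check (py-amqplib).
# The broker interaction is commented out; the threshold test is a
# plain function.

MAX_BACKLOG = 1000  # arbitrary cap: only publish while the backlog is below this


def should_publish(message_count, max_backlog=MAX_BACKLOG):
    """Return True when the queue backlog is below the cap."""
    return message_count < max_backlog


# Against a live broker it would be used like this (py-amqplib's
# queue_declare returns (queue, message_count, consumer_count)):
#
#   name, message_count, consumers = chan.queue_declare(
#       queue="dr_load.1", passive=True)   # passive=True: inspect, don't create
#   if should_publish(message_count):
#       chan.basic_publish(msg, exchange="dr_load",
#                          routing_key="Instance.1")
```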
Am I missing something here? The product seems very easy to use and
works great, but it appears completely unscalable. Is RabbitMQ not meant
for high data volumes/traffic? What would better serve this purpose? We
need something on top of Cassandra to provide high-volume
parallelization. I understand that queues can currently only hold what
fits in memory (when will that be fixed?), but even that limit doesn't
explain this, since memory is never given back.
Environment:
CentOS 5.4 64 Bit
RabbitMQ v1.7.2-1.el5 installed from yum
py-amqplib
Create Queue
chan.queue_declare(queue="dr_load.1", durable=True, exclusive=False,
auto_delete=False)
chan.exchange_declare(exchange="dr_load", type="direct", durable=True,
auto_delete=False)
chan.queue_bind(queue="dr_load.1", exchange="dr_load",
routing_key="Instance.1")
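For the 3-4 parallel consumers, I repeat the declare/bind pattern above once per instance. A small helper keeps the queue name and routing key in sync (a sketch; the naming scheme just mirrors my snippet, and the broker calls are commented):

```python
# Sketch: one queue and routing key per consumer instance, bound to the
# dr_load direct exchange declared above.


def binding_for(instance):
    """Queue name and routing key for a given instance number."""
    return ("dr_load.%d" % instance, "Instance.%d" % instance)


# Against a live broker:
#
#   for i in range(1, 5):  # 4 consumer instances
#       queue, key = binding_for(i)
#       chan.queue_declare(queue=queue, durable=True, exclusive=False,
#                          auto_delete=False)
#       chan.queue_bind(queue=queue, exchange="dr_load", routing_key=key)
```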
Publish Data
chan.basic_publish(msg, exchange="dr_load", routing_key="Instance.1",
mandatory=True)
Consume Data
msg = chan.basic_get(queue="dr_load.1")  # queue name, not the routing key
if msg is not None:                      # basic_get returns None when empty
    # ... load into Cassandra ...
    chan.basic_ack(msg.delivery_tag)
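I have also tried sketching a push-style consumer with basic_consume and a prefetch cap instead of polling basic_get, in case the unacked backlog matters here. The load_into_cassandra function below is a placeholder for my real insert code, and the broker calls are commented (a sketch under those assumptions):

```python
# Sketch of a push-style consumer (py-amqplib) with a prefetch cap.

processed = []


def load_into_cassandra(body):
    """Placeholder for the real Cassandra write."""
    processed.append(body)


def on_message(msg):
    load_into_cassandra(msg.body)
    msg.channel.basic_ack(msg.delivery_tag)  # ack only after the write succeeds


# Against a live broker:
#
#   chan.basic_qos(prefetch_size=0, prefetch_count=50, a_global=False)
#   chan.basic_consume(queue="dr_load.1", callback=on_message)
#   while chan.callbacks:
#       chan.wait()
```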
Thanks!
--
Wayne Van Den Handel, DataRaker Inc
Phone: 703.996.4891
Mobile: 305.849.1794
Skype: wayne.van.den.handel
Email: wvandenhandel at dataraker.com