<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<font face="Calibri">Something smells rotten in Denmark. I can't
comment on your particular
implementation but what I can tell you is that we're pumping a lot of
data through RabbitMQ (avg 320KiB/s 24x7), that we're far exceeding your
volumes (as far as I can decipher from your post), and that we routinely run
for days/weeks on end without any issues. <br>
<br>
Given that your store is Cassandra, you probably shouldn't even be
seeing a meaningful CPU load at those volumes (for the database).<br>
<br>
And I don't understand your 'hold what fits in memory' comment at all.<br>
<br>
Would someone who knows something about the python API please steer
this lost soul in the right direction?</font><br>
<br>
On 5/6/2010 12:49 PM, Wayne Van Den Handel wrote:
<blockquote cite="mid:4BE3013E.9080308@dataraker.com" type="cite">
<pre wrap="">I am evaluating RabbitMQ for purposes of parallelization on top of a
Cassandra data store. I have created a simple test scenario of a set of
Queues that are given data to be loaded from a single Python publisher
and 3-4 Python Consumer applications take the data from the Queues and
load into Cassandra. The entire scenario was easily set up and runs
great for about 10 minutes when RabbitMQ proceeds to use up all
available memory and crashes. I then discovered the passive mode to
create a queue (and find out how many messages it has) and now only add
more work to the queue when there are fewer than 1000 messages in it
(which easily fit into memory; the check is sketched below). I start up my test again and still
blow RabbitMQ up in 10 minutes. I am watching with the admin console the
entire time and there are never more than a total of 1000 messages in all
queues at any given time. Watching top I see RabbitMQ take up more and
more memory over time. It seems that it can only process 30-40k messages
in total/aggregate before it crashes (even though there are never more
than 1000 messages in all queues at one time).
Am I missing something here? The product seems very easy to use and
works great, but it is totally unscalable. Is RabbitMQ not meant for high
data volumes/traffic? What would better serve this purpose? We need
something on top of Cassandra to provide high volume parallelization. I
understand that we can only hold what fits in memory right now (when
will that be fixed?), but even that is not true as memory is never given
back.
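
Roughly, the throttle check in the publisher looks like this (a sketch, not the
exact code; it relies on amqplib's passive queue_declare returning a
(queue, message_count, consumer_count) tuple, and work_remaining() /
build_message() are placeholders for our own loading logic):

import time

while work_remaining():                      # placeholder: "more rows left to load"
    name, depth, consumers = chan.queue_declare(queue="dr_load.1", passive=True)
    if depth >= 1000:
        time.sleep(0.5)                      # queue is deep enough; back off
        continue
    chan.basic_publish(build_message(),      # placeholder: returns an amqp.Message
                       exchange="dr_load", routing_key="Instance.1", mandatory=True)
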
Environment:
CentOS 5.4 64 Bit
RabbitMQ v1.7.2-1.el5 installed from yum
py-amqplib
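
(The chan used below comes from a py-amqplib connection along these lines;
host and credentials here are placeholders:)

from amqplib import client_0_8 as amqp

conn = amqp.Connection(host="localhost:5672", userid="guest",
                       password="guest", virtual_host="/")
chan = conn.channel()
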
Create Queue
chan.queue_declare(queue="dr_load.1", durable=True, exclusive=False,
                   auto_delete=False)
chan.exchange_declare(exchange="dr_load", type="direct", durable=True,
                      auto_delete=False)
chan.queue_bind(queue="dr_load.1", exchange="dr_load",
                routing_key="Instance.1")
Publish Data
chan.basic_publish(msg, exchange="dr_load", routing_key="Instance.1", mandatory=True)
Consume Data
msg = chan.basic_get(queue="dr_load.1")   # basic_get takes the queue name, not the routing key
chan.basic_ack(msg.delivery_tag)
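
In the consumers that get/ack pair runs inside a loop, roughly like this (a
sketch; the actual Cassandra write is elided):

import time

while True:
    msg = chan.basic_get(queue="dr_load.1")
    if msg is None:
        time.sleep(0.1)                  # queue empty; poll again shortly
        continue
    # ... load msg.body into Cassandra here ...
    chan.basic_ack(msg.delivery_tag)
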
Thanks!
</pre>
</blockquote>
</body>
</html>