[rabbitmq-discuss] RabbitMQ scalability design question
Flavio Pompermaier
pompermaier at okkam.it
Wed Jul 24 10:11:06 BST 2013
Thank you for the support, Michael.
So my guess was right. The problem is that in my use case I have to keep
receiving messages until I receive a special "end-message" (a true/false
attribute on a JSON object).
When I receive this end-message I should start a map/reduce job on the
received data (which gets stored in HBase during this first step).
Obviously I need a way to ensure that no worker is still working on the
data, so the consumer receiving the end-message should be able to wait for
all consumers to finish their work.
In my current implementation I use a single queue and a single
multi-threaded consumer (via Spring-AMQP), which allows me to achieve this
goal.
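
To make the "wait for all workers" step concrete, here is a minimal sketch of how a single multi-threaded consumer could track in-flight work using only the JDK. The class InFlightTracker and its method names are hypothetical and not part of Spring-AMQP; the sleep stands in for the HBase write, and the sketch assumes begin() is called on the dispatching thread before a message is handed to a worker.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: tracks messages still being processed by worker threads. */
class InFlightTracker {
    // One permanent party (the tracker itself) keeps the phaser alive between batches.
    private final Phaser phaser = new Phaser(1);

    /** Call on the dispatching thread before a message is handed to a worker. */
    void begin() { phaser.register(); }

    /** Call in a finally block once the worker is done, whether it succeeded or failed. */
    void end() { phaser.arriveAndDeregister(); }

    /** Call from the end-message handler; blocks until every begun message has ended. */
    void awaitQuiescence() { phaser.arriveAndAwaitAdvance(); }

    /** Demo: dispatch n fake messages, then wait as the end-message handler would. */
    static int runDemo(int n) {
        InFlightTracker tracker = new InFlightTracker();
        AtomicInteger stored = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < n; i++) {
            tracker.begin();
            pool.execute(() -> {
                try {
                    Thread.sleep(5);              // stand-in for storing the message
                    stored.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    tracker.end();
                }
            });
        }
        tracker.awaitQuiescence();                // the end-message is handled here
        int seen = stored.get();                  // all workers have finished by now
        pool.shutdown();
        return seen;
    }
}
```

This only works inside one JVM, which is exactly the limitation behind the scaling question that follows.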
But I'm not sure what the best way of scaling is. Is there some RabbitMQ
mechanism that I can exploit?
Should I change my design?
Maybe I should create a dedicated exchange per source and add more queues
as the load grows, but then I still have a problem when I receive this
end-message.
How can I ask the consumers whether they have finished their work? How do I
add more queues when a certain load is reached? How do I remove them when
they are no longer needed?
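
One possible way to answer the "have they finished?" question once consumers run in separate processes is to count rather than ask. The sketch below is an assumption-laden illustration, not a RabbitMQ feature: it assumes the producer can put the total number of messages it sent into the end-message, and that each worker periodically reports its running count of stored messages (for example over a separate control queue). The class BatchCompletion and all of its method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical coordinator: decides when every message of a batch has been stored. */
class BatchCompletion {
    // Latest running total reported by each worker, keyed by a worker id.
    private final ConcurrentMap<String, Long> processedByWorker = new ConcurrentHashMap<>();
    // Stays negative until the end-message has been seen.
    private volatile long expectedTotal = -1;

    /** A worker reports how many messages it has stored so far (a running total). */
    void report(String workerId, long processedSoFar) {
        // Keep the largest value so late or reordered reports cannot lower the count.
        processedByWorker.merge(workerId, processedSoFar, Long::max);
    }

    /** Assumption: the end-message carries the number of messages the producer sent. */
    void endMessageReceived(long totalSent) {
        expectedTotal = totalSent;
    }

    /** True once the end-message was seen and all sent messages are accounted for. */
    boolean readyForMapReduce() {
        if (expectedTotal < 0) {
            return false;
        }
        long sum = 0;
        for (long count : processedByWorker.values()) {
            sum += count;
        }
        return sum >= expectedTotal;
    }
}
```

With this shape the number of queues and workers can change freely, since the coordinator only compares totals. Redelivered messages would need to be counted once per message for the comparison to stay accurate.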
Best,
Flavio
On Wed, Jul 24, 2013 at 10:50 AM, Michael Klishin <mklishin at gopivotal.com> wrote:
> Flavio Pompermaier:
>
> > From what I understood, queues cannot be split among different
> > machines; a single queue resides on only a single node. Am I wrong?
>
> Queue contents are not sharded but can be mirrored.
>
> > What if I discover that my machine is not able to keep up with message
> > rate?
>
> Use multiple queues. A single queue certainly will have scalability
> limitations.
>
> > Maybe I should start mirroring the queue on multiple servers so I can
> > multiply the number of consumers?
>
> Mirroring won't help. Increasing the # of consumers and/or queues likely
> will. Mirroring is an HA feature.
>
> > Is this approach scalable or does the synchronization overhead kill the
> > performance?
>
> Mirroring involves moving more data between cluster nodes so yes, it does
> affect cluster throughput.
> --
> MK
>
>