[rabbitmq-discuss] Scaling rabbitmq on ec2

Laing, Michael michael.laing at nytimes.com
Tue Dec 31 13:20:10 GMT 2013


Hi Roman,

We use ec2 and autoscaling extensively as well.

If you are pushing 100% CPU, then you are out of the 'sweet spot' for that
instance.

The first thing I suggest is to switch to C1 medium instances for your
workload (1.7GB memory, 2 CPUs). These instances have 2.5 times the
processing power of the M1 mediums for about the same price. Given what you
have said about your workload, you can get by with less memory:
Erlang/RabbitMQ is very memory efficient (unlike Java) and can really
make use of the extra CPU.
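
Since your brokers are in an autoscaling group, switching types is mostly a
matter of pointing the group at a new launch configuration. A rough sketch
with the AWS CLI (the names and the AMI id below are placeholders for your
own setup):

    aws autoscaling create-launch-configuration \
        --launch-configuration-name rabbit-c1-medium \
        --image-id ami-xxxxxxxx \
        --instance-type c1.medium \
        --key-name your-keypair \
        --security-groups rabbit-sg

    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name rabbit-asg \
        --launch-configuration-name rabbit-c1-medium

New instances launched by the group then come up as c1.medium; the existing
ones have to be cycled out.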

Clustering will not buy you anything on smallish autoscaling instances,
based upon your description, and will add more complexity than it removes,
IMHO.

The second thing I would suggest is to experiment and test:

Spreading out across multiple instances, as you describe, is operationally
complex. If your base load warrants it, you could try larger instances and
fewer of them, e.g. C1 xlarge with 8 CPUs. We cluster 3 of these as the
'core' of a processing pipeline (we have variable numbers of pipelines in a
region).

An advantage of larger instances is that they have consistently higher
bandwidth and are less subject to 'noisy neighbors'.

However, in your architecture you may want to use multiple queues anyway, in
order to keep all the CPUs busy: each queue is a single Erlang process, so
one queue can only keep one core busy. We don't do that ourselves, but there
are many on the list who can offer good advice - and lots of previous
messages about it.
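
As a rough sketch of what that could look like (Python/pika; the queue names
and the crc32 sharding are only illustrative, not something we run):

    import zlib

    import pika

    # hypothetical shard names - one queue per consumer/CPU you want to keep busy
    QUEUES = ['events.0', 'events.1', 'events.2', 'events.3']

    connection = pika.BlockingConnection(pika.ConnectionParameters(host='rabbit-host'))
    channel = connection.channel()

    # declare each shard so publishes never hit a missing queue
    for q in QUEUES:
        channel.queue_declare(queue=q)

    def publish(message_id, body):
        # pick a shard deterministically from the message id
        shard = QUEUES[zlib.crc32(message_id.encode('utf-8')) % len(QUEUES)]
        channel.basic_publish(exchange='', routing_key=shard, body=body)

    publish('ad-impression-42', '{"campaign": 7}')
    connection.close()

With a consumer per shard, the queue processes spread across the cores
instead of serializing on one.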

We cluster and mirror our queues to avoid losing messages and to centralize
administration. I would experiment with it if I were you. There is a small
performance penalty.
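
For what it's worth, once the nodes are clustered, mirroring is just a
policy; something along these lines (the policy name and queue-name pattern
here are only examples):

    rabbitmqctl set_policy ha-events "^events\." '{"ha-mode":"all"}'

Queues whose names match the pattern are then mirrored to every node in the
cluster.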

BTW you don't need a TCP load balancer with a cluster: every node can see
every queue, regardless of whether the queues are mirrored or not.
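
So each Tomcat can simply keep the list of node hostnames and connect to
whichever answers. A quick Python/pika sketch of the idea (the hostnames are
placeholders):

    import random

    import pika
    from pika.exceptions import AMQPConnectionError

    # placeholder hostnames for the cluster nodes
    NODES = ['rabbit1.internal', 'rabbit2.internal', 'rabbit3.internal']

    def connect():
        # try the nodes in random order until one accepts the connection
        for host in random.sample(NODES, len(NODES)):
            try:
                return pika.BlockingConnection(pika.ConnectionParameters(host=host))
            except AMQPConnectionError:
                continue
        raise RuntimeError('no rabbitmq node reachable')

    connection = connect()
    channel = connection.channel()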

Hope this helps.

Michael




On Tue, Dec 31, 2013 at 4:20 AM, Roman Kournjaev <kournjaev at gmail.com>wrote:

> Hi
>
> I hope someone can help me out with some advice, because I have been
> struggling with this for quite a while now.
>
> I have an ad-server application running on AWS EC2. The servers are Tomcats
> that can scale; with the current load I have 10 servers running. The overall
> load on the servers is about 1K rps, and that traffic produces roughly twice
> as many messages, which eventually have to be persisted.
>
> Right now the servers send these messages to rabbitmq instances (that are
> not connected in a cluster); each Tomcat has an open connection to every
> rabbitmq broker and just picks one at random to send a message.
>
> On the other side I have a consumer that consumes all the messages from
> all the brokers.
>
> The brokers are running on medium instances (that's 1 CPU and 3.5 GB RAM)
> and the queues are not persistent or mirrored. The issue is that the brokers
> hit 100% CPU at rates around 700 messages/sec.
>
> I guess connecting the brokers into a cluster will only decrease
> performance, and I will also have to configure a TCP load balancer so the
> Tomcats can connect to the cluster. Also, creating a rabbitmq cluster is not
> trivial at all, especially if the brokers are in an auto-scaling group.
>
> So, for the questions:
>
>
>    1. I read some benchmarks out there and could not tell whether
>    700 msg/sec is slow or not. I can use a bigger instance with more CPU,
>    but will the throughput grow linearly?
>    2. Do you spot anything wrong with the architecture? Can you think of
>    an overall better approach for the message chain?
>
>
> Thanks
> Roman
>

