[rabbitmq-discuss] High Availability

Thu Mar 18 16:15:33 GMT 2010

Hi Alvaro,

On Thu, Mar 18, 2010 at 10:12 AM, Alvaro Videla <videlalvaro at gmail.com> wrote:

> Hi,
>
> While I don't know how HIGH your high is, I can tell you what we do for our
> live site.
>

Our high needs to be the HIGHEST. We handle about 3000 inbound messages per
second and about 9000 outbound messages per second, and we can't lose any
messages. We are working hard on that, and our concern is RabbitMQ's
cluster HA model; for high throughput it works very, very well. I will
report on our volumes in another e-mail. We are very happy with the power
of RabbitMQ: in one test it made our server cry processing 1.000.000
messages per second, and it did so very well. Of course we have some
problems that I have reported here: one with memory consumption (so far we
have been unable to run Rabbit with more than 3GB), another with queue
monitoring, and so on.

> Currently we have around 400.000 messages a day, which is not much, sent
> to 2 RabbitMQ servers running on low-end machines. The Unix load on those
> machines is always below 0.2.
>
> So here's my story of HA based on that setup:
>
> When we deployed the system we started the two servers, NOT using
> clustering. Their version was 1.5.2. This was mid 2009.
>
> Then we have 28 PHP machines publishing the messages to a SINGLE IP, using
> IP Failover (Heartbeat+LVS), this means that for the PHP publishers there's
> only one IP to connect-to/send-messages.
>
> Then we have 2 PHP machines consuming messages, but they connect directly
> to the IPs of the RabbitMQ servers, we do that to be sure that we connect
> directly to a specific broker.
>
> This also means that the queues, exchanges, bindings, etc, are all
> duplicated between the two servers.
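
A rough sketch of what "duplicated on both servers" amounts to: the same declarations applied to each broker in turn. Everything here is an assumption for illustration. The exchange and queue names, the addresses, and the RecordingChannel stand-in are hypothetical, and nothing in it talks to a real broker.

```python
from dataclasses import dataclass, field

# Hypothetical topology; the names are illustrative, not from the setup described.
TOPOLOGY = {
    "exchanges": [("events", "direct")],
    "queues": ["events.work"],
    "bindings": [("events.work", "events", "work")],
}


@dataclass
class RecordingChannel:
    """Stand-in for a broker channel; it only records what it is asked to declare."""
    host: str
    calls: list = field(default_factory=list)

    def exchange_declare(self, exchange, exchange_type, durable=True):
        self.calls.append(("exchange", exchange, exchange_type, durable))

    def queue_declare(self, queue, durable=True):
        self.calls.append(("queue", queue, durable))

    def queue_bind(self, queue, exchange, routing_key):
        self.calls.append(("bind", queue, exchange, routing_key))


def declare_topology(channel, topology=TOPOLOGY):
    """Apply one fixed set of declarations to a single broker's channel."""
    for name, kind in topology["exchanges"]:
        channel.exchange_declare(exchange=name, exchange_type=kind, durable=True)
    for name in topology["queues"]:
        channel.queue_declare(queue=name, durable=True)
    for queue, exchange, key in topology["bindings"]:
        channel.queue_bind(queue=queue, exchange=exchange, routing_key=key)


def declare_on_all(channels):
    """Repeat the declarations on every broker so each one can serve alone."""
    for channel in channels:
        declare_topology(channel)
    return channels


# Two independent (non-clustered) brokers; addresses are documentation placeholders.
channels = declare_on_all([RecordingChannel("192.0.2.10"), RecordingChannel("192.0.2.11")])
```

Keeping the topology in one place and replaying it per broker is one way to stop the two servers from drifting apart.
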
>
> But there was a problem! We wanted to upgrade to the latest version of
> RabbitMQ, whose storage format is not compatible, at least with our
> version of RabbitMQ.
>
> What we did was the following:
>
> The sysadmin took one of the brokers out of the load balancing, and we
> waited until the workers had consumed all the messages. When its queues were
> empty, we shut down the server and did the upgrade. Then we put it back into
> the load balancer and repeated the process with the other broker.
>
> In this way we didn't lose any message.
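
The drain-then-upgrade procedure above can be sketched as a small simulation. This is only a model under stated assumptions: the Broker class, the queue depths, and the "new" version label are hypothetical, and the real steps (load balancer changes, package upgrade) are reduced to flag and field updates.

```python
from dataclasses import dataclass


@dataclass
class Broker:
    name: str
    version: str
    queued: int           # messages still waiting in this broker's queues
    in_pool: bool = True  # whether the load balancer sends publishers here


def drain(broker, consume_batch=100):
    """Simulate the workers emptying the broker's queues."""
    while broker.queued > 0:
        broker.queued -= min(consume_batch, broker.queued)


def rolling_upgrade(brokers, new_version):
    """Upgrade one broker at a time, only after its queues are empty."""
    log = []
    for broker in brokers:
        # Never pull a broker if no other one is left to take the publishers.
        if not any(b.in_pool for b in brokers if b is not broker):
            raise RuntimeError("no other broker left to take publishers")
        broker.in_pool = False
        log.append(f"{broker.name}: removed from pool")
        drain(broker)
        log.append(f"{broker.name}: drained")
        broker.version = new_version
        log.append(f"{broker.name}: upgraded to {new_version}")
        broker.in_pool = True
        log.append(f"{broker.name}: back in pool")
    return log


# Hypothetical starting state: both brokers on 1.5.2 with some backlog.
brokers = [Broker("rabbit1", "1.5.2", 250), Broker("rabbit2", "1.5.2", 40)]
log = rolling_upgrade(brokers, "new")
```

The ordering is the point: a broker stops receiving new messages first, is emptied second, and is shut down only third, so nothing is stranded in the old storage format.
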
>
> But we wanted to test the native RabbitMQ clustering...
>
> The sysadmin ran the commands from the Clustering Guide and we had the
> cluster up and running, until we had another problem...
>
> Sometimes RabbitMQ sent a redirect response to the consumers, telling
> them to connect to the other node. The problem was that RabbitMQ uses
> Erlang's node() function, which, because of the way we had configured
> the /etc/hosts file, returned a node name that was unreachable from
> outside. (This only happened to the consumers, because they connect
> directly to the RabbitMQ nodes.)
>
> So here we did the same as before, we took one broker out of the load
> balancing, we took it out of the clustering, and put it back again, and the
> same thing with the other node.
>
> Again we didn't lose any message.
>
> Then on the connection configuration, we have a really simple .yml file to
> tell the PHP process where to connect, basically by providing an argument on
> the CLI.
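
A minimal sketch of picking the broker from a command-line argument, as described above. The original uses a .yml file read by PHP; here a plain dict stands in for the parsed file, and the broker keys and addresses are hypothetical.

```python
import argparse

# Stand-in for the parsed .yml file; keys and addresses are hypothetical.
BROKERS = {
    "broker_a": {"host": "192.0.2.10", "port": 5672},
    "broker_b": {"host": "192.0.2.11", "port": 5672},
}


def resolve_target(argv):
    """Map the broker name given on the CLI to a (host, port) pair."""
    parser = argparse.ArgumentParser(
        description="Pick the broker this consumer connects to"
    )
    parser.add_argument("broker", choices=sorted(BROKERS))
    args = parser.parse_args(argv)
    target = BROKERS[args.broker]
    return target["host"], target["port"]


# Equivalent to running the consumer script with the single argument: broker_b
host, port = resolve_target(["broker_b"])
```

With this shape, moving a consumer to the other broker is a change of argument rather than a change of code.
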
>
> I hope this helps,
>

I will take a look at LVS now. I would like to ask some questions about the
parts of your solution that I'm unsure of. Do you know what the overhead of
using LVS is? How many messages per second do you handle? Do you use
persistent queues? If so, how do you continue consuming from a queue when
one node is down?

You are using Heartbeat, so how do the nodes share the same disk? NFS?
Storage?

Thank you so much for your report; it helps a lot.

Best wishes.

>
> Alvaro
>
> On Mar 18, 2010, at 7:57 PM, Gustavo Aquino wrote:
>
> > Hi,
> >
> > I have asked many people this question before, without success,
> > because I haven't found (in documentation, discussion lists, etc.) any
> > way to do High Availability with RabbitMQ without a lot of workarounds.
> > So is there a way to do HA with RabbitMQ without implementing a lot of
> > things on the client side, like recreating queues when a node goes down,
> > recreating configuration, recreating client connections, etc.?
> >
> > What is RabbitMQ's recommendation for doing HA?
> >
> > Has anyone here done an HA implementation for RabbitMQ?
> >
> > Regards.
> >
> > Gustavo
> >
> > _______________________________________________
> > rabbitmq-discuss mailing list
> > rabbitmq-discuss at lists.rabbitmq.com
> > http://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>