[rabbitmq-discuss] High Availability

Fri Mar 19 19:32:15 GMT 2010

If you are looking for high availability, the only way that I am aware of is
to use something like DRBD (or SAN-backed storage) along with a cluster
resource manager (e.g. Corosync or Heartbeat), and to make all your messages
durable so that they are persisted to disk.

You can then scale out by running several such clusters (e.g. multiple
2-node clusters) using Erlang's native clustering capabilities, without
sacrificing high availability. A load balancer won't help with HA unless you
have some mechanism for ensuring that messages replicate between nodes.
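
To illustrate that last point, here is a minimal Python sketch. It is not
RabbitMQ code: the Node class and the reachable_messages helper are
hypothetical names made up for this example, and each node's "disk" is just
an in-memory queue standing in for a durable queue that is not replicated
anywhere else.

```python
from collections import deque


class Node:
    """Hypothetical stand-in for one broker with its own local, durable queue."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.disk = deque()  # messages persisted on this node only

    def publish(self, message):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.disk.append(message)

    def consume_all(self):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        drained = list(self.disk)
        self.disk.clear()
        return drained


def reachable_messages(nodes):
    """Return every message a consumer can fetch right now."""
    messages = []
    for node in nodes:
        if node.up:
            messages.extend(node.consume_all())
    return messages


node_a, node_b = Node("a"), Node("b")
node_a.publish("order-1")
node_b.publish("order-2")

node_a.up = False  # node a crashes; nothing was replicated to node b
during_outage = reachable_messages([node_a, node_b])

node_a.up = True  # node a restarts with its disk intact
after_recovery = reachable_messages([node_a, node_b])

print(during_outage)   # ['order-2']
print(after_recovery)  # ['order-1']
```

Durability keeps order-1 safe, but it stays out of reach for as long as node
a is down. Closing that gap is what the shared or replicated storage (DRBD,
SAN) is for.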

Mark Steele
Director of development
Bering Media Inc.

On Fri, Mar 19, 2010 at 1:38 AM, Alvaro Videla <videlalvaro at gmail.com> wrote:

> Gustavo,
>
> 1) We haven't benchmarked the overhead of using LVS
>
> 2) As you can tell from our numbers, not that many, and it's hard to say,
> since during peak time we have 20,000 concurrent users, but in the morning
> there may be only 5,000. In the end it depends on what they do on the
> site. Soon we plan to add another 300,000 messages a day.
>
> 3) The queues are persistent.
>
> This means that if one node is taken out of the LVS, we can still reach
> that node by its exact IP. In our setup that node won't receive any more
> messages (because the publishers connect to the LVS IP), but we are able
> to consume the old ones.
>
> 4) We don't share storage. Each message goes to one queue; there's no
> replication. The sysadmin is working on solving that. I would prefer to
> try something "native", i.e. provided or implemented inside RabbitMQ. He's
> researching some Linux filesystem technologies.
>
> Regards,
>
> Alvaro
>
> P. S. My second name is Gustavo, so this feels like replying to myself :)
>
> On Mar 18, 2010, at 11:59 PM, Gustavo Aquino wrote:
>
> Alvaro,
>
> Thank you so much for reporting your experience.
>
> I will take a look at LVS now, so I would like to ask some questions about
> the parts of your solution I'm unsure of. Do you know what the overhead of
> using LVS is? How many messages per second do you handle? Do you use a
> persistent queue? If so, how do you continue consuming from the queue when
> one node is down?
>
> You are using a heartbeat, so how do you share the same disk between
> nodes? NFS? Storage?
>
> Thank you so much for your report; it helps a lot.
>
> Best wishes.
>
> On Thu, Mar 18, 2010 at 12:17 PM, Alvaro Videla <videlalvaro at gmail.com> wrote:
>
>> Sorry, it's me again; I forgot to add something :)
>>
>> That LVS setup was tested live when we hit the dreaded out-of-memory
>> problem, where the Erlang VM simply crashes and takes the RabbitMQ node
>> down with it. One of the brokers went down, but we continued working from
>> the other.
>>
>> After that we decided to finally upgrade our RabbitMQ server to the latest
>> version.
>>
>>
>> On Thu, Mar 18, 2010 at 11:05 PM, Alvaro Videla <videlalvaro at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> After some feedback from @etrepum on Twitter, I feel that my earlier
>>> comment needs more detail, so I did some research on LVS, since that
>>> setup was done by one of our sysadmins.
>>>
>>> You can read about an LVS setup here:
>>>
>>>
>>> http://www.ibiblio.org/oswg/oswg-nightly/oswg/en_US.ISO_8859-1/articles/cluster-howto/cluster-howto/index.html
>>>
>>> From that page:
>>>
>>> "the role of the *active router* is to redirect service requests from
>>> the virtual server address to the real servers." [...]
>>>
>>> "The active router dynamically monitors the health of the real servers,
>>> and the workload on each." [...]
>>>
>>> "If a real server becomes disabled, the active router stops sending jobs
>>> to the server until it returns to normal operation."
>>>
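
The behaviour described in those quotes can be sketched in a few lines of
Python. This is a toy model under stated assumptions, not how LVS is
implemented: the ActiveRouter class, the is_healthy callback and the
addresses are all hypothetical, and round-robin is simply assumed as the
scheduling policy.

```python
from itertools import cycle


class ActiveRouter:
    """Toy model of the quoted behaviour; not how LVS is implemented."""

    def __init__(self, real_servers, is_healthy):
        self.real_servers = list(real_servers)
        self.is_healthy = is_healthy  # callable: server address -> bool
        self._ring = cycle(self.real_servers)

    def route(self):
        """Pick the next healthy real server in round-robin order."""
        for _ in range(len(self.real_servers)):
            server = next(self._ring)
            if self.is_healthy(server):
                return server
        raise RuntimeError("no healthy real server available")


# Hypothetical addresses and health table, for illustration only.
health = {"10.0.0.1": True, "10.0.0.2": True}
router = ActiveRouter(health, lambda server: health[server])

before = [router.route() for _ in range(4)]
health["10.0.0.2"] = False  # a real server becomes disabled
during = [router.route() for _ in range(3)]
health["10.0.0.2"] = True   # it returns to normal operation
after = [router.route() for _ in range(2)]
```

While 10.0.0.2 is marked unhealthy, every request goes to 10.0.0.1; once it
recovers, it rejoins the rotation. Note that this only redirects new
traffic. It does nothing for messages already sitting on the failed server.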
>>> I hope this mail clears things up,
>>>
>>> Alvaro
>>>
>>>
>>> On Mar 18, 2010, at 7:57 PM, Gustavo Aquino wrote:
>>>
>>> Hi,
>>>
>>> I have asked many people this question before, without success, because
>>> I haven't found (in the documentation, discussion lists, etc.) any way to
>>> do High Availability with RabbitMQ without a lot of workarounds. So, is
>>> there a way to do HA with RabbitMQ without implementing a lot of things
>>> on the client side, such as recreating queues when a node goes down,
>>> recreating configuration, recreating client connections, etc.?
>>>
>>> What does RabbitMQ recommend for HA?
>>>
>>> Has anyone here implemented HA with RabbitMQ?
>>>
>>> Regards.
>>>
>>> Gustavo
>>>
>>> _______________________________________________
>>> rabbitmq-discuss mailing list
>>> rabbitmq-discuss at lists.rabbitmq.com
>>> http://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss