[rabbitmq-discuss] HA between data centers

Fri Jan 14 16:20:24 GMT 2011

Hi Bill,

On Tue, Jan 11, 2011 at 07:57:19PM -0800, Bill Moseley wrote:
> The High Availability docs at http://www.rabbitmq.com/pacemaker.html seem
> pretty thorough.  Are there any other approaches commonly used for HA? What
> about for syncing between data centers? Can anyone discuss their HA approach
> if it differs from the link above?

There are a couple of other things to mention. One is that work is
currently being done on active/active HA, but that will provide
"mirrored" queues (think RAID-1) within a cluster only. The other thing
to mention is the shovel plugin, which can be used for a very
rudimentary form of federation between different brokers. The main
limitation of the shovel, though, is that its configuration can't be
changed dynamically, so it's only really a sensible solution if your
topology is mainly static.

> We are evaluating messaging systems and the question of HA has come up
> frequently.  The concern is that some items, once entered onto the queue,
> should never be lost -- even if the entire data center goes down.

Right, but is that "lost" as in "provided it's stored on disk, that's
ok, even if it takes us a month to get the data back off disk", or is it
"lost" as in "must always be (near) instantly available"?

> We are comparing RabbitMQ with writing a system in-house.  The in-house
> queue system would use a Postgresql table for the queue with replication
> (currently via Slony) for hot-backup (it's not really HA).  We also
> replicate to a secondary data center with the eventual goal of reasonably
> fast tip-over between data centers.  We are not in the financial or medical
> industry so nobody's life is at risk if we drop a few jobs.  I suspect we
> only need to handle three to five million messages a day -- nothing too big.
>  (Oddly, one argument against using RabbitMQ was it was overkill for our
> needs.)

Yeah, rather obviously, we're somewhat biased against building message
brokers on top of databases ;) I guess the thing I'd suggest you look
at is this: whilst you may be fine with postgres at the moment, what
happens in a couple of years' time? What will your messaging
requirements be then, and will you have sufficient flexibility in your
home-grown system to be able to cope with those needs?

> Postgresql and replication is what we use for application data currently, so
> it is a familiar technology for us.  Another reason we are considering
> building a custom message queue system is to put more functionality into the
> broker -- such as scheduling and job routing that would be specific to our
> business.  And there's fear that nobody knows Erlang if something broke and
> we needed to try and resolve.

Sure, those are valid concerns. There are ways of extending the
functionality of Rabbit, for example through exchange types, or even
custom plugins, but these do normally require writing in Erlang.

> My opinion is AMQP is very flexible and we should be able to make it meet
> our needs.  We are not doing anything that unusual.  And I suspect building
> something as reliable as RabbitMQ is no easy task -- especially if the point
> is to make a system more complex than what RabbitMQ provides.  Scheduling,
> for example, seems like something a simple database table and cron could
> solve easily with RabbitMQ.

Indeed - use the right tool for the job etc. Job scheduling and such are
probably on the boundary of what we consider pure messaging, so
providing that functionality normally requires additional client-side
applications on top of Rabbit. You might like to look at celery in this
space, which does job scheduling on top of Rabbit.
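
For what it's worth, the "database table and cron" idea from the quoted
paragraph can be sketched roughly as below. This is only an illustration
under assumptions: the scheduled_jobs table, its columns, and the
publish callable are all hypothetical names, and sqlite3 stands in for
whatever database you already run. A cron job would call
publish_due_jobs every minute or so.

```python
import sqlite3
import time


def create_schedule_table(conn):
    # Hypothetical schema: one row per job that should be published later.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scheduled_jobs ("
        " id INTEGER PRIMARY KEY,"
        " run_at REAL NOT NULL,"          # Unix timestamp when the job is due
        " routing_key TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )


def publish_due_jobs(conn, publish, now=None):
    """Publish every unpublished job whose run_at has passed.

    `publish` is any callable taking (routing_key, payload); in practice
    it would wrap your AMQP client's publish call. Returns the number of
    jobs handed to `publish`.
    """
    now = time.time() if now is None else now
    rows = conn.execute(
        "SELECT id, routing_key, payload FROM scheduled_jobs"
        " WHERE published = 0 AND run_at <= ? ORDER BY run_at, id",
        (now,),
    ).fetchall()
    for job_id, routing_key, payload in rows:
        publish(routing_key, payload)
        # Mark the job only after publishing. A crash between these two
        # steps means the job is published again on the next run, so
        # consumers should tolerate duplicates.
        conn.execute(
            "UPDATE scheduled_jobs SET published = 1 WHERE id = ?", (job_id,)
        )
        conn.commit()
    return len(rows)
```

The broker stays a plain message router here; all the scheduling logic
lives in the table and the small script that polls it.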

> Another argument for a custom broker was to make better use of workers --
> i.e. the broker would look at load and other factors when determining where
> to send jobs.  My feeling here is resources are limited so it's a matter of
> balancing the number and type of consumers with queue load -- and an
> external process can manage starting and stopping consumers easily as demand
> profile changes (by looking at queue sizes and rates) without having to be
> part of the broker.  Are there common approaches for dynamically adjusting
> workers?

I suspect that's something that falls squarely in the remit of tools
like celery. It's definitely outside the scope of Rabbit itself.
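
If you do end up writing the external process yourself, the decision
logic from the quoted paragraph (look at queue sizes and rates, then
start or stop consumers) might look something like the sketch below. The
function name, its parameters, and the default limits are all
assumptions made for illustration; how you measure queue depth and
rates, and how you actually start or stop consumers, is left out.

```python
import math


def desired_workers(queue_depth, arrival_rate, per_worker_rate,
                    drain_seconds=60.0, min_workers=1, max_workers=20):
    """Estimate how many consumers are needed for one queue.

    queue_depth      messages currently waiting in the queue
    arrival_rate     messages per second being published
    per_worker_rate  messages per second a single consumer can handle
    drain_seconds    how quickly the existing backlog should be cleared
    """
    if per_worker_rate <= 0:
        raise ValueError("per_worker_rate must be positive")
    # Throughput needed to keep up with new messages and also clear the
    # current backlog within drain_seconds.
    needed_rate = arrival_rate + queue_depth / drain_seconds
    wanted = math.ceil(needed_rate / per_worker_rate)
    # Clamp to the configured limits so a spike cannot start an
    # unbounded number of consumers.
    return max(min_workers, min(max_workers, wanted))
```

For scale: five million messages a day averages out at roughly 58
messages per second, so with consumers that each handle 10 messages per
second and no backlog, desired_workers(0, 58, 10) asks for 6 workers.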

Matthew