[rabbitmq-discuss] HA between data centers

Fri Jan 14 19:51:58 GMT 2011

Matthew, thanks very much for your comments.  I'll just add a brief top-post
note:

After posting the message below and reading your remarks, I now think a
better design is to simply not depend on messages never being "lost". (By
"never lost" I probably meant that, once published, a message would be
consumed within a few minutes regardless of a server failure.)

If we design with the knowledge that a message could be lost, and are thus
prepared to publish again when necessary, then our messaging system is much
simpler -- and probably less fragile and more reliable as a result.
That might not work for all businesses, but it likely will for ours.

We have a database for recording state.  The messaging system is about
changing state in many cases.  So, for important tasks we should be able to
look at the state and say "this was supposed to be done an hour ago, so queue
it again."  A trickier part is determining if a job was actually lost or is
just still in the queue.
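
A rough sketch of that check, in Python only because I have been testing
Celery. Everything here is an assumption for illustration: the `jobs` table,
its columns, the one-hour grace period, and the `publish` callable that
stands in for whatever actually puts a message on the broker. It does not
try to tell a lost job from one that is still sitting in the queue, so it
only makes sense if consumers can tolerate seeing the same job twice.

```python
import sqlite3

GRACE_SECONDS = 3600  # assumed: a job is "overdue" one hour past its due time


def find_overdue_jobs(conn, now, grace=GRACE_SECONDS):
    """Return ids of jobs still pending more than `grace` seconds past due."""
    rows = conn.execute(
        "SELECT id FROM jobs WHERE status = 'pending' AND due_at < ? ORDER BY id",
        (now - grace,),
    ).fetchall()
    return [row[0] for row in rows]


def requeue_overdue(conn, publish, now, grace=GRACE_SECONDS):
    """Publish every overdue job again and push its due time forward."""
    requeued = []
    for job_id in find_overdue_jobs(conn, now, grace):
        publish(job_id)  # hypothetical: hand the job id back to the broker
        # Reset due_at so the job is not republished on every pass.
        conn.execute(
            "UPDATE jobs SET due_at = ?, attempts = attempts + 1 WHERE id = ?",
            (now, job_id),
        )
        requeued.append(job_id)
    conn.commit()
    return requeued


# Small demonstration with an in-memory database and a fake publisher.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL,"
    " due_at INTEGER NOT NULL,"  # Unix timestamp in seconds
    " attempts INTEGER NOT NULL DEFAULT 0)"
)
now = 1_000_000
conn.executemany(
    "INSERT INTO jobs (id, status, due_at) VALUES (?, ?, ?)",
    [
        (1, "pending", now - 7200),  # two hours late: should be requeued
        (2, "pending", now - 60),    # one minute late: still within grace
        (3, "done", now - 7200),     # already completed: ignored
    ],
)
published = []
requeued = requeue_overdue(conn, published.append, now)
```

Something like this could run from cron every few minutes. The `attempts`
column is there so a job that keeps coming back can be flagged for a human
instead of being republished forever.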

A much more important task is just making sure that clients can publish
messages when needed.

Oh, and yes, Celery has a lot of nice features that are probably commonly
needed by applications.  I've been testing it for the last week, and even
though we are not a Python shop I suspect we could still use it to fork off
other jobs very easily.

Thanks,

On Fri, Jan 14, 2011 at 8:20 AM, Matthew Sackman <matthew at rabbitmq.com> wrote:

> Hi Bill,
>
> On Tue, Jan 11, 2011 at 07:57:19PM -0800, Bill Moseley wrote:
> > The High Availability docs at http://www.rabbitmq.com/pacemaker.html seem
> > pretty thorough.  Are there any other approaches commonly used for HA?  What
> > about for syncing between data centers? Can anyone discuss their HA approach
> > if it differs from the link above?
>
> There are a couple of other things to mention. One is that work is
> currently being done on active/active HA, but that will provide
> "mirrored" queues (think RAID-1) within a cluster only. The other thing
> to mention is the shovel plugin, which can be used for a very
> rudimentary form of federation between different brokers. The main
> limitation with the shovel, though, is that its configuration can't be
> dynamically changed, so it's only really a sensible solution if your
> topology is mainly static.
>
> > We are evaluating messaging systems and the question of HA has come up
> > frequently.  The concern is that some items, once entered onto the queue,
> > should never be lost -- even if the entire data center goes down.
>
> Right, but is that "lost" as in "provided it's stored on disk, that's
> ok, even if it takes us a month to get the data back off disk", or is it
> "lost" as in "must always be (near) instantly available"?
>
> > We are comparing RabbitMQ with writing a system in-house.  The in-house
> > queue system would use a Postgresql table for the queue with replication
> > (currently via Slony) for hot-backup (it's not really HA).  We also
> > replicate to a secondary data center with the eventual goal of reasonably
> > fast tip-over between data centers.  We are not in the financial or medical
> > industry so nobody's life is at risk if we drop a few jobs.  I suspect we
> > only need to handle three to five million messages a day -- nothing too big.
> > (Oddly, one argument against using RabbitMQ was it was overkill for our
> > needs.)
>
> Yeah, rather obviously, we're somewhat biased against building message
> brokers on top of databases ;) I guess the thing I'd suggest you look
> at is: whilst you may be fine with postgres at the moment, what happens
> in a couple of years' time? What will your messaging requirements be
> then, and will you have sufficient flexibility in your home-grown system
> to be able to cope with those needs?
>
> > Postgresql and replication is what we use for application data currently, so
> > it is a familiar technology for us.  Another reason we are considering
> > building a custom message queue system is to put more functionality into the
> > broker -- such as scheduling and job routing that would be specific to our
> > business.  And there's fear that nobody knows Erlang if something broke and
> > we needed to try and resolve.
>
> Sure, those are valid concerns. There are ways of extending the
> functionality of Rabbit, for example through exchange types, or even
> custom plugins, but these do normally require writing in Erlang.
>
> > My opinion is AMQP is very flexible and we should be able to make it meet
> > our needs.  We are not doing anything that unusual.  And I suspect building
> > something as reliable as RabbitMQ is no easy task -- especially if the point
> > is to make a system more complex than what RabbitMQ provides.  Scheduling,
> > for example, seems like something a simple database table and cron could
> > solve easily with RabbitMQ.
>
> Indeed - use the right tool for the job etc. Job scheduling and such are
> probably on the boundary of what we consider pure messaging, and so it
> does normally require additional client-side applications to add to
> Rabbit to provide such functionality. You might like to look at celery
> in this space which does job scheduling on top of Rabbit.
>
> > Another argument for a custom broker was to make better use of workers --
> > i.e. the broker would look at load and other factors when determining where
> > to send jobs.  My feeling here is resources are limited so it's a matter of
> > balancing the number and type of consumers with queue load -- and an
> > external process can manage starting and stopping consumers easily as demand
> > profile changes (by looking at queue sizes and rates) without having to be
> > part of the broker.  Are there common approaches for dynamically adjusting
> > workers?
>
> I suspect that's something that falls squarely in the remit of tools
> like celery. It's definitely outside the scope of Rabbit itself.
>
> Matthew
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>

-- 
Bill Moseley
moseley at hank.org