[rabbitmq-discuss] Active/active HA setup

Alexis Richardson alexis at rabbitmq.com
Fri Sep 3 10:04:37 BST 2010


Jiri

It makes some sense, but you *will* need to migrate resources, or recreate
them, to continue working on the secondary.  You will not need to migrate
messages, but there is other broker state that you will need: for example, a
consumer's exclusive queue on the primary will need to exist on the
secondary, with the correct bindings, connection, etc.

So my question is: do you plan to create that queue (etc) on the secondary
after the primary fails?  If "yes" then a potential issue is managing the
failover window size and (possibly, but less likely) resource limits on the
secondary.  If "no" then you will need to 'pre-build' the spare resources
somehow.
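
One way to pre-build the shared resources is to describe the
queue/exchange/binding set-up once and apply the same declarations to every
broker at start-up.  The following is only a sketch under assumptions: the
names `requests`, `request_queue`, `Topology` and `declare_topology` are made
up for illustration, and `channel` is taken to be any object offering
`exchange_declare`, `queue_declare` and `queue_bind` in the style of the
Python pika client's channel.  Exclusive per-client queues cannot be
pre-built this way, since they belong to a connection; clients would still
have to declare those again after reconnecting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    queue: str
    exchange: str
    routing_key: str


@dataclass(frozen=True)
class Topology:
    exchanges: tuple  # (name, type) pairs
    queues: tuple     # queue names
    bindings: tuple   # Binding instances


# Hypothetical shared set-up; the names are illustrative, not from the thread.
SHARED = Topology(
    exchanges=(("requests", "direct"),),
    queues=("request_queue",),
    bindings=(Binding("request_queue", "requests", "request"),),
)


def declare_topology(channel, topology=SHARED):
    """Apply the same declarations to whichever broker `channel` talks to."""
    for name, kind in topology.exchanges:
        channel.exchange_declare(exchange=name, exchange_type=kind, durable=True)
    for name in topology.queues:
        channel.queue_declare(queue=name, durable=True)
    for b in topology.bindings:
        channel.queue_bind(queue=b.queue, exchange=b.exchange,
                           routing_key=b.routing_key)


class RecordingChannel:
    """Stand-in for a real channel: records calls so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def exchange_declare(self, **kwargs):
        self.calls.append(("exchange_declare", kwargs))

    def queue_declare(self, **kwargs):
        self.calls.append(("queue_declare", kwargs))

    def queue_bind(self, **kwargs):
        self.calls.append(("queue_bind", kwargs))


if __name__ == "__main__":
    primary, secondary = RecordingChannel(), RecordingChannel()
    for ch in (primary, secondary):
        declare_topology(ch)
    # Both brokers end up with an identical set-up.
    print(primary.calls == secondary.calls)
```

With a real client the same function would be called once per broker
connection.  Declaring a queue or exchange that already exists with the same
properties is harmless in AMQP, so it can be re-run on every reconnect.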

These are not necessarily difficult issues to solve in any particular
instance; however, I am highlighting them in response to your request for
feedback.  These issues do become deeper when you try to generalise to
multiple different failure scenarios, of course.

alexis


On Fri, Sep 3, 2010 at 8:40 AM, <jiri at krutil.com> wrote:

> Alexis
>
> Our plan is to have two brokers with the same setup, both being actively
> used at the same time, but each broker serving a different set of clients.
>
> Our back-end service will be connected to and process requests from both
> brokers at the same time.
>
> When one broker fails, the clients will lose their connection and will
> have to reconnect, ending up on the other broker. Messages that were on
> the dead broker will be lost.
>
> So we don't need migration of resources between brokers in case of failure,
> we only need the clients to move their connections to the other broker.
>
> Hope that makes sense...
>
> Jiri
>
>
>
>> Jiri
>>
>> Cool.  So yes messages will then only arrive out of order in the case
>> where some arrive from the secondary before 'delayed' messages from
>> the failed primary; and then, for reordering them, it suffices to know
>> which broker they came from.  (In the absence of failure, TCP should
>> take care of reordering).
>>
>> I think the issues will be:
>>
>> 1. Deciding when to stop listening to a primary.  Given consumers
>> don't care about message loss, I would suggest "as soon as consumers
>> are aware of primary failure, then they should ignore further messages
>> from the primary"
>>
>> 2. Failover time.  AIUI you want to minimise this by having a copy of
>> the whole queue/exchange/binding set-up on both brokers.  But how
>> exactly do you plan to do this?
>>
>> alexis
>>
>>
>> On Fri, Sep 3, 2010 at 8:12 AM,  <jiri at krutil.com> wrote:
>>
>>> Alexis
>>>
>>> The answer is no - a client can send requests to only one broker at any
>>> given moment. The client connects via load balancer to one of the brokers
>>> and stays connected all the time. The client does not even know that
>>> there are two brokers (it only sees one IP address).
>>>
>>> I think requests may be delivered out of order only if a client fails
>>> over to another broker. Then messages sent to one broker can get mixed
>>> up with messages sent to the other.
>>>
>>> My concern was: are there any other issues with this kind of setup that I
>>> might have missed? Does anyone have experience with this?
>>>
>>> Thanks a lot for your help
>>> Jiri
>>>
>>>
>>>
>>>> Jiri
>>>>
>>>> You say that "Some clients send requests to one broker, some to the
>>>> other".
>>>>
>>>>
>>>> Does this mean that one client publisher can send messages (requests)
>>>> to both brokers, in such a way that a pair of messages may arrive out
>>>> of order if one is sent to each broker?
>>>>
>>>> If the answer is no, then I think my answer stands, because causal order
>>>> will be preserved even if messages are lost.  That is: messages that
>>>> arrive successfully will not be out of order with each other.
>>>>
>>>> If the answer is yes, then I am not sure how you can recover global
>>>> ordering without imposing it at the publisher using sequence numbers at
>>>> the app level.
>>>>
>>>> Does this make sense?
>>>>
>>>> alexis
>>>>
>>>>
>>>>
>>>> On Thu, Sep 2, 2010 at 9:46 PM, Jiri Krutil <jiri at krutil.com> wrote:
>>>>
>>>>> Alexis
>>>>>
>>>>> Sorry I probably didn't express myself well.
>>>>>
>>>>> We don't plan a primary and secondary broker, but a pair of brokers
>>>>> that are both active at the same time. A load balancer divides client
>>>>> connections between these brokers. A request queue with the same name
>>>>> exists on both brokers, but with different contents. Some clients send
>>>>> requests to one broker, some to the other. Our back-end server listens
>>>>> to both queues, processes requests and sends each response to an
>>>>> exclusive client queue on the broker from which the request came.
>>>>>
>>>>> Ideally this would be transparent to the clients, because the brokers
>>>>> would be hidden by a virtual IP address. Of course it can't be
>>>>> transparent to the back-end server, which needs to talk to both
>>>>> brokers at the same time.
>>>>>
>>>>> So (a) is correct, but (b) not.
>>>>>
>>>>> Hope that makes it a bit clearer...
>>>>>
>>>>> Cheers
>>>>> Jiri
>>>>>
>>>>>
>>>>>
>>>>>> Jiri
>>>>>>
>>>>>> That answered my questions.  Now, as I understood your example:
>>>>>>
>>>>>> a. you don't mind messages being lost
>>>>>> *and*
>>>>>> b. you don't use the secondary until after the primary has failed.
>>>>>>
>>>>>> Note that if consumption is completely 'fire and forget' then it is
>>>>>> possible that a message from the primary may *arrive* after a
>>>>>> message from the secondary.  But this can happen whether you use
>>>>>> sequence numbers or not.
>>>>>>
>>>>>> So if the primary broker fails, why not just forget all undelivered
>>>>>> messages?  Consumers will know that any message consumed from the
>>>>>> secondary must be later in *all* orderings than any message consumed
>>>>>> from the primary.  So, additional sequence numbering is not
>>>>>> necessary.
>>>>>>
>>>>>> alexis