[rabbitmq-discuss] Active/active HA setup

Fri Sep 3 11:15:57 BST 2010

Alexis

Yes, precisely. Each broker has one request queue bound to one request
exchange, plus one response exchange, all durable.

Responses go via the response exchange to client-specific exclusive
auto-delete queues, routed according to the message's reply-to attribute.
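
To make that concrete, here is a toy in-memory sketch of the per-broker
topology. It is plain Python, not a RabbitMQ client. The exchange and queue
names, the reply.<client> naming and all helper functions are made up for
illustration; in the real setup the reply queues are server-named.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory stand-in for one broker (hypothetical, not a real client)."""

    def __init__(self, name):
        self.name = name
        self.queues = {}                    # queue name -> deque of messages
        self.bindings = defaultdict(dict)   # exchange -> {routing key: queue name}

    def declare_queue(self, queue):
        self.queues.setdefault(queue, deque())

    def bind(self, exchange, routing_key, queue):
        self.bindings[exchange][routing_key] = queue

    def publish(self, exchange, routing_key, body, reply_to=None):
        queue = self.bindings[exchange].get(routing_key)
        if queue is None:
            return False                    # unroutable message is dropped
        self.queues[queue].append({"body": body, "reply_to": reply_to})
        return True


def make_broker(name):
    """Pre-declare the shared request exchange/queue, identical on each broker."""
    broker = Broker(name)
    broker.declare_queue("requests")
    broker.bind("request-exchange", "request", "requests")
    return broker


def connect_client(broker, client_id):
    """On every (re)connect the client re-creates and re-binds its reply queue."""
    reply_queue = f"reply.{client_id}"
    broker.declare_queue(reply_queue)
    broker.bind("response-exchange", reply_queue, reply_queue)
    return reply_queue


def serve(brokers):
    """Back-end: drain both request queues, reply on the broker of origin."""
    handled = 0
    for broker in brokers:
        requests = broker.queues["requests"]
        while requests:
            msg = requests.popleft()
            broker.publish("response-exchange", msg["reply_to"], "re: " + msg["body"])
            handled += 1
    return handled


broker_a, broker_b = make_broker("a"), make_broker("b")
q1 = connect_client(broker_a, "client-1")
q2 = connect_client(broker_b, "client-2")
broker_a.publish("request-exchange", "request", "ping", reply_to=q1)
broker_b.publish("request-exchange", "request", "pong", reply_to=q2)
handled = serve([broker_a, broker_b])
```

The point of the sketch is that nothing client-specific lives on the other
broker: a client that fails over just calls the equivalent of connect_client
again on whichever broker it lands on.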

Sorry for not being clear about that.

Cheers
Jiri

> Jiri
>
> Ahh.  So maybe I misunderstood something.  Is it the case that there
> is exactly one 'request' queue on each broker?
>
> alexis
>
>
> On Fri, Sep 3, 2010 at 10:33 AM,  <jiri at krutil.com> wrote:
>> Alexis
>>
>> I don't think that exclusive queues are a problem. Our clients use
>> auto-delete server-named exclusive queues for receiving responses, so every
>> time a client re-connects, it must re-create and re-bind its exclusive
>> queue(s) anyway, even with a single broker.
>>
>> The common exchange and queue where requests are sent are pre-declared as
>> durable on both brokers.
>>
>> So I really don't see any resources that require migration.
>>
>> Regards
>> Jiri
>>
>>
>>> Jiri
>>>
>>> It makes some sense, but you *will* need to migrate resources, or recreate
>>> them, to continue working on the secondary.  You will not need to migrate
>>> messages, but there is other broker state that you will need: for example, a
>>> consumer's exclusive queue on the primary will need to exist on the
>>> secondary, with the correct bindings, connection, etc.
>>>
>>> So my question is: do you plan to create that queue (etc) on the secondary
>>> after the primary fails?  If "yes" then a potential issue is managing the
>>> failover window size and (possibly, but less likely) resource limits on the
>>> secondary.  If "no" then you will need to 'pre-build' the spare resources
>>> somehow.
>>>
>>> These are not necessarily difficult issues to solve in any particular
>>> instance, however I am highlighting them in response to your request for
>>> feedback.  These issues do become deeper when you try to generalise to
>>> multiple different failure scenarios, of course.
>>>
>>> alexis
>>>
>>>
>>> On Fri, Sep 3, 2010 at 8:40 AM, <jiri at krutil.com> wrote:
>>>
>>>> Alexis
>>>>
>>>> Our plan is to have two brokers with the same setup, both being actively
>>>> used at the same time, but each broker serving a different set of clients.
>>>>
>>>> Our back-end service will be connected to and process requests from both
>>>> brokers at the same time.
>>>>
>>>> When one broker fails, the clients will lose connection and will have to
>>>> reconnect, ending up on the other broker. Messages that were on the dead
>>>> broker will be lost.
>>>>
>>>> So we don't need migration of resources between brokers in case of failure;
>>>> we only need the clients to move their connections to the other broker.
>>>>
>>>> Hope that makes sense...
>>>>
>>>> Jiri
>>>>
>>>>
>>>>
>>>>> Jiri
>>>>>
>>>>> Cool.  So yes messages will then only arrive out of order in the case
>>>>> where some arrive from the secondary before 'delayed' messages from
>>>>> the failed primary; and then, for reordering them, it suffices to know
>>>>> which broker they came from.  (In the absence of failure, TCP should
>>>>> take care of reordering).
>>>>>
>>>>> I think the issues will be:
>>>>>
>>>>> 1. Deciding when to stop listening to a primary.  Given consumers
>>>>> don't care about message loss, I would suggest "as soon as consumers
>>>>> are aware of primary failure, then they should ignore further messages
>>>>> from the primary"
>>>>>
>>>>> 2. Failover time.  AIUI you want to minimise this by having a copy of
>>>>> the whole queue/exchange/binding set-up on both brokers.  But how
>>>>> exactly do you plan to do this?
>>>>>
>>>>> alexis
>>>>>
>>>>>
>>>>> On Fri, Sep 3, 2010 at 8:12 AM,  <jiri at krutil.com> wrote:
>>>>>
>>>>>> Alexis
>>>>>>
>>>>>> The answer is no - a client can send requests to only one broker at any
>>>>>> given moment. The client connects via load balancer to one of the brokers
>>>>>> and stays connected all the time. The client does not even know that there
>>>>>> are two brokers (it only sees one IP address).
>>>>>>
>>>>>> I think requests may be delivered out of order only if a client fails over
>>>>>> to another broker. Then messages sent to one broker can get mixed up with
>>>>>> messages sent to the other.
>>>>>>
>>>>>> My concern was: are there any other issues with this kind of setup that I
>>>>>> might have missed? Does anyone have experience with this?
>>>>>>
>>>>>> Thanks a lot for your help
>>>>>> Jiri
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Jiri
>>>>>>>
>>>>>>> You say that "Some clients send requests to one broker, some to the
>>>>>>> other".
>>>>>>>
>>>>>>>
>>>>>>> Does this mean that one client publisher can send messages (requests) to
>>>>>>> both brokers, in such a way that a pair of messages may arrive out of order
>>>>>>> if one is sent to each broker?
>>>>>>>
>>>>>>> If the answer is no, then I think my answer stands, because causal order
>>>>>>> will be preserved even if messages are lost.  That is: messages that arrive
>>>>>>> successfully will not be out of order with each other.
>>>>>>>
>>>>>>> If the answer is yes, then I am not sure how you can recover global ordering
>>>>>>> without imposing it at the publisher using sequence numbers at the app
>>>>>>> level.
>>>>>>>
>>>>>>> Does this make sense?
>>>>>>>
>>>>>>> alexis
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 2, 2010 at 9:46 PM, Jiri Krutil <jiri at krutil.com> wrote:
>>>>>>>
>>>>>>>> Alexis
>>>>>>>>
>>>>>>>> Sorry I probably didn't express myself well.
>>>>>>>>
>>>>>>>> We don't plan a primary and secondary broker, but a pair of brokers that
>>>>>>>> are both active at the same time. A load balancer divides client
>>>>>>>> connections between these brokers. A request queue with the same name
>>>>>>>> exists on both brokers, but with different contents. Some clients send
>>>>>>>> requests to one broker, some to the other. Our back-end server listens to
>>>>>>>> both queues, processes requests and sends each response to an exclusive
>>>>>>>> client queue on the broker from where the request came.
>>>>>>>>
>>>>>>>> Ideally this would be transparent to the clients, because the brokers would
>>>>>>>> be hidden by a virtual IP address. Of course it can't be transparent to the
>>>>>>>> back-end server, which needs to talk to both brokers at the same time.
>>>>>>>>
>>>>>>>> So (a) is correct, but (b) not.
>>>>>>>>
>>>>>>>> Hope that makes it a bit clearer...
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Jiri
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Jiri
>>>>>>>>>
>>>>>>>>> That answered my questions.  Now, as I understood your example:
>>>>>>>>>
>>>>>>>>> a. you don't mind messages being lost
>>>>>>>>> *and*
>>>>>>>>> b. you don't use the secondary until after the primary has failed.
>>>>>>>>>
>>>>>>>>> Note that if consumption is completely 'fire and forget' then it is
>>>>>>>>> possible that a message from the primary may *arrive* after a message from
>>>>>>>>> the secondary.  But this can happen whether you use sequence numbers or
>>>>>>>>> not.
>>>>>>>>>
>>>>>>>>> So if the primary broker fails, why not just forget all undelivered
>>>>>>>>> messages?  Consumers will know that any message consumed from the secondary
>>>>>>>>> must be later in *all* orderings than any message consumed from the primary.
>>>>>>>>> So, additional sequence numbering is not necessary.
>>>>>>>>>
>>>>>>>>> alexis
>>>>>>>>>
>>>>>>>>>