[rabbitmq-discuss] Active/active HA setup
jiri at krutil.com
Fri Sep 3 10:33:59 BST 2010
Alexis
I don't think that exclusive queues are a problem. Our clients use
auto-delete server-named exclusive queues for receiving responses, so
every time a client re-connects, it must re-create and re-bind its
exclusive queue(s) anyway, even with a single broker.
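
A minimal sketch of that reconnect step, assuming a pika-style channel object in Python (the thread does not say which client library is in use). The function name `setup_reply_queue`, the exchange name `responses`, and the choice of the generated queue name as routing key are all hypothetical; passing an empty queue name asks the broker to generate one.

```python
def setup_reply_queue(channel, exchange="responses"):
    """Declare a fresh server-named exclusive queue and bind it.

    `channel` is assumed to behave like pika's BlockingChannel.
    The exchange name is a hypothetical placeholder.
    """
    # Empty name: the broker generates a unique queue name.
    result = channel.queue_declare(queue="", exclusive=True, auto_delete=True)
    queue_name = result.method.queue

    # Assumption: responses are routed by the generated queue name.
    channel.queue_bind(exchange=exchange, queue=queue_name, routing_key=queue_name)
    return queue_name
```

The client would call this once after every successful (re)connect, whichever broker it lands on, and put the returned name into the reply-to field of its requests.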
The common exchange and queue where requests are sent are pre-declared
as durable on both brokers.
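
The pre-declaration could be sketched like this, again assuming pika-style channels, one per broker. The names `declare_request_topology`, `declare_on_all`, and `requests`, and the direct exchange type, are hypothetical. Declaring an exchange or queue that already exists with the same properties is a no-op in AMQP 0-9-1, so the same code can be run against both brokers at any time.

```python
def declare_request_topology(channel, exchange="requests", queue="requests"):
    """Declare the shared durable exchange and request queue on one broker.

    `channel` is assumed to behave like pika's BlockingChannel.
    Exchange, queue, and routing key names are hypothetical.
    """
    channel.exchange_declare(exchange=exchange, exchange_type="direct", durable=True)
    channel.queue_declare(queue=queue, durable=True)
    channel.queue_bind(exchange=exchange, queue=queue, routing_key=queue)


def declare_on_all(channels):
    """Apply identical declarations to one channel per broker."""
    for channel in channels:
        declare_request_topology(channel)
```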
So I really don't see any resources that require migration.
Regards
Jiri
> Jiri
>
> It makes some sense, but you *will* need to migrate resources, or recreate
> them, to continue working on the secondary. You will not need to migrate
> messages, but there is other broker state that you will need: for example, a
> consumer's exclusive queue on the primary will need to exist on the
> secondary, with the correct bindings, connection, etc.
>
> So my question is: do you plan to create that queue (etc) on the secondary
> after the primary fails? If "yes" then a potential issue is managing the
> failover window size and (possible, but less likely) resource limits on the
> secondary. If "no" then you will need to 'pre build' the spare resources
> somehow.
>
> These are not necessarily difficult issues to solve in any particular
> instance, however I am highlighting them in response to your request for
> feedback. These issues do become deeper when you try to generalise to
> multiple different failure scenarios, of course.
>
> alexis
>
>
> On Fri, Sep 3, 2010 at 8:40 AM, <jiri at krutil.com> wrote:
>
>> Alexis
>>
>> Our plan is to have two brokers with the same setup, both being actively
>> used at the same time, but each broker serving a different set of clients.
>>
>> Our back-end service will be connected to and process requests from both
>> brokers at the same time.
>>
>> When one broker fails, the clients will lose their connection and will have to
>> reconnect, ending up on the other broker. Messages that were on the dead
>> broker will be lost.
>>
>> So we don't need migration of resources between brokers in case of failure,
>> we only need the clients to move their connections to the other broker.
>>
>> Hope that makes sense...
>>
>> Jiri
>>
>>
>>
>>> Jiri
>>>
>>> Cool. So yes messages will then only arrive out of order in the case
>>> where some arrive from the secondary before 'delayed' messages from
>>> the failed primary; and then, for reordering them, it suffices to know
>>> which broker they came from. (In the absence of failure, TCP should
>>> take care of reordering).
>>>
>>> I think the issues will be:
>>>
>>> 1. Deciding when to stop listening to a primary. Given consumers
>>> don't care about message loss, I would suggest "as soon as consumers
>>> are aware of primary failure, then they should ignore further messages
>>> from the primary"
>>>
>>> 2. Failover time. AIUI you want to minimise this by having a copy of
>>> the whole queue/exchange/binding set-up on both brokers. But how
>>> exactly do you plan to do this?
>>>
>>> alexis
>>>
>>>
>>> On Fri, Sep 3, 2010 at 8:12 AM, <jiri at krutil.com> wrote:
>>>
>>>> Alexis
>>>>
>>>> The answer is no - a client can send requests to only one broker at any
>>>> given moment. The client connects via a load balancer to one of the
>>>> brokers and stays connected all the time. The client does not even know
>>>> that there are two brokers (it only sees one IP address).
>>>>
>>>> I think requests may be delivered out of order only if a client fails over
>>>> to another broker. Then messages sent to one broker can get mixed up with
>>>> messages sent to the other.
>>>>
>>>> My concern was: are there any other issues with this kind of setup that I
>>>> might have missed? Does anyone have experience with this?
>>>>
>>>> Thanks a lot for your help
>>>> Jiri
>>>>
>>>>
>>>>
>>>>> Jiri
>>>>>
>>>>> You say that "Some clients send requests to one broker, some to the
>>>>> other".
>>>>>
>>>>> Does this mean that one client publisher can send messages (requests) to
>>>>> both brokers, in such a way that a pair of messages may arrive out of
>>>>> order if one is sent to each broker?
>>>>>
>>>>> If the answer is no, then I think my answer stands, because causal order
>>>>> will be preserved even if messages are lost. That is: messages that
>>>>> arrive successfully will not be out of order with each other.
>>>>>
>>>>> If the answer is yes, then I am not sure how you can recover global
>>>>> ordering without imposing it at the publisher using sequence numbers at
>>>>> the app level.
>>>>>
>>>>> Does this make sense?
>>>>>
>>>>> alexis
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 2, 2010 at 9:46 PM, Jiri Krutil <jiri at krutil.com> wrote:
>>>>>
>>>>>> Alexis
>>>>>>
>>>>>> Sorry, I probably didn't express myself well.
>>>>>>
>>>>>> We don't plan a primary and secondary broker, but a pair of brokers that
>>>>>> are both active at the same time. A load balancer divides client
>>>>>> connections between these brokers. A request queue with the same name
>>>>>> exists on both brokers, but with different contents. Some clients send
>>>>>> requests to one broker, some to the other. Our back-end server listens
>>>>>> to both queues, processes requests and sends each response to an
>>>>>> exclusive client queue on the broker that the request came from.
>>>>>>
>>>>>> Ideally this would be transparent to the clients, because the brokers
>>>>>> would be hidden by a virtual IP address. Of course it can't be
>>>>>> transparent to the back-end server, which needs to talk to both brokers
>>>>>> at the same time.
>>>>>>
>>>>>> So (a) is correct, but (b) not.
>>>>>>
>>>>>> Hope that makes it a bit clearer...
>>>>>>
>>>>>> Cheers
>>>>>> Jiri
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Jiri
>>>>>>>
>>>>>>> That answered my questions. Now, as I understood your example:
>>>>>>>
>>>>>>> a. you don't mind messages being lost
>>>>>>> *and*
>>>>>>> b. you don't use the secondary until after the primary has failed.
>>>>>>>
>>>>>>> Note that if consumption is completely 'fire and forget' then it is
>>>>>>> possible that a message from the primary may *arrive* after a message
>>>>>>> from the secondary. But this can happen whether you use sequence
>>>>>>> numbers or not.
>>>>>>>
>>>>>>> So if the primary broker fails, why not just forget all undelivered
>>>>>>> messages? Consumers will know that any message consumed from the
>>>>>>> secondary must be later in *all* orderings than any message consumed
>>>>>>> from the primary. So, additional sequence numbering is not necessary.
>>>>>>>
>>>>>>> alexis