[rabbitmq-discuss] Errors starting RabbitMQ when cluster membership changes

Matt Pietrek mpietrek at skytap.com
Mon Oct 1 20:56:36 BST 2012


Just so I understand, do you mean that I should be able to take the right
steps in 2.8.7, or that in some future version I'll be able to? I assume
the latter but just want to be sure.

Thanks for the quick reply!

On Mon, Oct 1, 2012 at 10:43 AM, Tim Watson <watson.timothy at gmail.com> wrote:

> Don't take this as gospel, yet, but my understanding is that you'll be
> able to make cluster changes like this successfully, but I'll make sure
> there's a test case to prove this scenario.
>
> On 1 Oct 2012, at 18:07, Matt Pietrek <mpietrek at skytap.com> wrote:
>
> Hey Tim,
>
> I was expecting that basic response. Thanks!
>
> However... I still think there may be an issue here that hinders RabbitMQ's
> deployment in production scenarios. Please correct me if I'm wrong or
> missing something.
>
> Hypothetically, what would happen if my 'util' node was zapped by
> lightning and I had no way to bring it up in a timely manner? Would I be
> able to start the existing cluster nodes (play, play2) "far enough" to run
> the proper rabbitmqctl command to remove util from the cluster?
>
> That is, once I've gotten into a bad situation, can I back out of it
> gracefully and without message loss? Or is the only option to reset the
> entire cluster?
>
> Thanks,
>
> Matt
>
> On Mon, Oct 1, 2012 at 3:04 AM, Tim Watson <tim at rabbitmq.com> wrote:
>
>> Hi Matt,
>>
>>
>> On 09/28/2012 08:23 PM, Matt Pietrek wrote:
>>
>> For example, at one point I had a three-node cluster: play, play2, and
>> util. I then removed util from the cluster, although to be honest I did
>> so simply by changing the rabbitmq.config file, rather than by explicitly
>> running rabbitmqctl stop_app while the cluster was still running.
>>
>>
>> I'm pretty sure you're not supposed to do that! :)
>>
>>
>> My steps:
>>
>>    - Running as a three-node cluster, stop all brokers.
>>    - Create a new rabbitmq.config file with just two brokers.
>>    - Attempt to start the new cluster.
>>
>>
>> If you don't take util offline using the right procedure, I suspect
>> mnesia will get out of sorts and this isn't something you want to happen.
>> It's important to make cluster changes using the right procedure, as mnesia
>> is rather a fussy beast.
>>
>> BTW we've made some improvements that (hopefully) simplify working with
>> clusters and these will be in the forthcoming feature release!
>>
>> Cheers,
>> Tim
>>
>
>