[rabbitmq-discuss] rabbitmq_mnesia:wait_for_tables() - make timeout of 30 sec configurable?

Thu Jun 30 15:07:52 BST 2011

Hi,

When I restart a whole RabbitMQ cluster, the current "main" node of the
cluster (if I understand correctly, there is a notion of leadership in a
RabbitMQ cluster?) sometimes takes longer to restart than the others
(e.g. the machine takes longer to reboot). When that happens, all the
other nodes fail to start up with an error like the one shown below.

I looked into the source and found this:

wait_for_tables(TableNames) ->
    case mnesia:wait_for_tables(TableNames, 30000) of
        ok ->
            ok;
        {timeout, BadTabs} ->
            throw({error, {timeout_waiting_for_tables, BadTabs}});
        {error, Reason} ->
            throw({error, {failed_waiting_for_tables, Reason}})
    end.

So, there's a hard-coded value of 30 seconds.

I propose to replace that line with:
WaitTimeout = case application:get_env(mnesia_wait_for_tables_timeout) of
                  {ok, T} -> T;      %% use the configured value if present
                  _       -> 30000   %% otherwise keep the current default
              end,
case mnesia:wait_for_tables(TableNames, WaitTimeout) of ...

Is my understanding of what's going on right? Does it make sense to
make this setting configurable in this way?
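
To make the proposal concrete, here is a minimal sketch of how the whole
function might look with the timeout read from the application
environment. The module name wait_timeout_sketch, the helper
wait_timeout/0 and the key mnesia_wait_for_tables_timeout are
hypothetical names of mine, and reading the key from the `rabbit`
application is an assumption; none of this is existing RabbitMQ code.

```erlang
-module(wait_timeout_sketch).
-export([wait_timeout/0, wait_for_tables/1]).

%% Default matches the value currently hard-coded in rabbit_mnesia.
-define(DEFAULT_TIMEOUT, 30000).

%% Look up the (hypothetical) mnesia_wait_for_tables_timeout key in the
%% `rabbit` application environment (assumed app name). Anything that is
%% not a non-negative integer or the atom `infinity` falls back to the
%% default.
wait_timeout() ->
    case application:get_env(rabbit, mnesia_wait_for_tables_timeout) of
        {ok, T} when is_integer(T), T >= 0 -> T;
        {ok, infinity}                     -> infinity;
        _                                  -> ?DEFAULT_TIMEOUT
    end.

%% Same structure as the original function; only the timeout argument
%% passed to mnesia:wait_for_tables/2 changes.
wait_for_tables(TableNames) ->
    case mnesia:wait_for_tables(TableNames, wait_timeout()) of
        ok ->
            ok;
        {timeout, BadTabs} ->
            throw({error, {timeout_waiting_for_tables, BadTabs}});
        {error, Reason} ->
            throw({error, {failed_waiting_for_tables, Reason}})
    end.
```

With something like this in place, a node could presumably raise the
limit by putting {rabbit, [{mnesia_wait_for_tables_timeout, 120000}]}
in its configuration file, again assuming that key name were adopted.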

=ERROR REPORT==== 30-Jun-2011::06:01:28 ===
FAILED
Reason: {error,
            {timeout_waiting_for_tables,
                [rabbit_user,rabbit_user_permission,rabbit_vhost,
                 rabbit_durable_route,rabbit_durable_exchange,
                 rabbit_durable_queue]}}
Stacktrace: [{rabbit_mnesia,wait_for_tables,1},
             {rabbit_mnesia,check_schema_integrity,0},
             {rabbit_mnesia,ensure_schema_integrity,0},
             {rabbit_mnesia,init_db,3},
             {rabbit_mnesia,init,0},
             {rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
             {rabbit,run_boot_step,1},
             {rabbit,'-start/2-lc$^0/1-0-',1}]

=INFO REPORT==== 30-Jun-2011::06:01:29 ===
    application: rabbit
    exited: {bad_return,{{rabbit,start,[normal,[]]},
                         {'EXIT',{rabbit,failure_during_boot}}}}
    type: permanent

-- 
Eugene Kirpichov
Principal Engineer, Mirantis Inc. http://www.mirantis.com/
Editor, http://fprog.ru/