[rabbitmq-discuss] HA queue disappears when a node rejoins the cluster

Matt Pietrek mpietrek at skytap.com
Wed Aug 22 23:06:42 BST 2012


I'm experiencing a consistently reproducible scenario where an HA queue
disappears when a node rejoins the cluster. This is using RabbitMQ 2.8.5
and Erlang R15B01.

The brokers here are play, play2 and util. I have a set of queues mirrored
on all the brokers, including a durable queue called 'unit_test'.

Using the management UI, I publish some dummy messages to unit_test and
verify that they show up in the queue, and that the queue is synched on all
nodes.

I then take down the play node and start it back up. Afterwards, I force
everything to be synchronized by doing a management API 'get messages" with
requeue=True. When this completes, everything shows up synched as expected.

I then bring down play2 and start it back up. Immediately after it starts,
the unit_test disappears. Interestingly, none of the other (empty) queues
disappear - Only the unit_test queue.

I have the following from rabbit at play.log file:

=ERROR REPORT==== 22-Aug-2012::14:42:42 ===
** Generic server <0.278.0> terminating
** Last message in was {'$gen_cast',{gm,{length,4}}}
** When Server state == {state,
                         {amqqueue,
                          {resource,<<"/">>,queue,<<"unit_test">>},
                          true,false,none,
                          [{<<"x-ha-policy">>,longstr,<<"all">>}],
                          <3299.4430.0>,
                          [<3298.2580.0>],
                          all},
                         <0.280.0>,<3299.4430.0>,rabbit_variable_queue,
                         {vqstate,
                          {0,{[],[]}},
                          {0,{[],[]}},
                          {delta,undefined,0,undefined},
                          {0,{[],[]}},
                          {0,{[],[]}},
                          0,
                          {0,nil},
                          undefined,
                          {0,nil},
                          {qistate,
                           "/home/mpietrek/work/var/lib/rabbit at play
/queues/5CPQWZ6GS6L07B10CDNMTM42C",
                           {{dict,0,16,16,8,80,48,

{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                             {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                               []}}},
                            []},
                           undefined,0,262144,
                           #Fun<rabbit_variable_queue.2.121033067>,
                           {0,nil}},
                          {{client_msstate,msg_store_persistent,
                            <<3,210,105,14,156,98,212,28,20,242,100,176,253,
                              212,5,205>>,
                            {dict,0,16,16,8,80,48,

{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                             {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                               []}}},
                            {state,327753,
                             "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_persistent"},
                            rabbit_msg_store_ets_index,
                            "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_persistent",
                            <0.264.0>,331850,323656,335947,340044},
                           {client_msstate,msg_store_transient,

<<178,59,10,119,58,171,186,219,211,141,219,166,46,
                              89,222,107>>,
                            {dict,0,16,16,8,80,48,

{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                             {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                               []}}},
                            {state,307268,
                             "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_transient"},
                            rabbit_msg_store_ets_index,
                            "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_transient",
                            <0.259.0>,311365,303166,315462,319559}},

true,0,#Fun<rabbit_mirror_queue_slave.1.52898136>,0,
                          0,infinity,0,0,0,0,0,
                          {rates,
                           {{1345,671562,425012},0},
                           {{1345,671562,425012},0},
                           0.0,0.0,
                           {1345,671606,221014}},
                          {0,nil},
                          {0,nil},
                          {0,nil},
                          {0,nil},
                          0,0,
                          {rates,
                           {{1345,671562,425012},0},
                           {{1345,671562,425012},0},
                           0.0,0.0,
                           {1345,671606,221014}}},
                         undefined,#Ref<0.0.0.8008>,
                         {dict,0,16,16,8,80,48,
                          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},

{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                         {dict,0,16,16,8,80,48,
                          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},

{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                         0,
                         {dict,0,16,16,8,80,48,
                          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},

{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                         {dict,0,16,16,8,80,48,
                          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},

{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                         true}
** Reason for termination ==
** {function_clause,
       [{rabbit_mirror_queue_slave,set_synchronised,
            [false,
             {state,
                 {amqqueue,
                     {resource,<<"/">>,queue,<<"unit_test">>},
                     true,false,none,
                     [{<<"x-ha-policy">>,longstr,<<"all">>}],
                     <3299.4430.0>,
                     [<3298.2580.0>],
                     all},
                 <0.280.0>,<3299.4430.0>,rabbit_variable_queue,
                 {vqstate,
                     {0,{[],[]}},
                     {0,{[],[]}},
                     {delta,undefined,0,undefined},
                     {0,{[],[]}},
                     {0,{[],[]}},
                     0,
                     {0,nil},
                     undefined,
                     {0,nil},
                     {qistate,
                         "/home/mpietrek/work/var/lib/rabbit at play
/queues/5CPQWZ6GS6L07B10CDNMTM42C",
                         {{dict,0,16,16,8,80,48,
                              {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                               []},

{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                []}}},
                          []},
                         undefined,0,262144,
                         #Fun<rabbit_variable_queue.2.121033067>,
                         {0,nil}},
                     {{client_msstate,msg_store_persistent,

<<3,210,105,14,156,98,212,28,20,242,100,176,253,212,
                            5,205>>,
                          {dict,0,16,16,8,80,48,
                              {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                               []},

{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                []}}},
                          {state,327753,
                              "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_persistent"},
                          rabbit_msg_store_ets_index,
                          "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_persistent",
                          <0.264.0>,331850,323656,335947,340044},
                      {client_msstate,msg_store_transient,

<<178,59,10,119,58,171,186,219,211,141,219,166,46,89,
                            222,107>>,
                          {dict,0,16,16,8,80,48,
                              {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                               []},

{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                []}}},
                          {state,307268,
                              "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_transient"},
                          rabbit_msg_store_ets_index,
                          "/home/mpietrek/work/var/lib/rabbit at play
/msg_store_transient",
                          <0.259.0>,311365,303166,315462,319559}},
                     true,0,#Fun<rabbit_mirror_queue_slave.1.52898136>,0,0,
                     infinity,0,0,0,0,0,
                     {rates,
                         {{1345,671562,425012},0},
                         {{1345,671562,425012},0},
                         0.0,0.0,
                         {1345,671606,221014}},
                     {0,nil},
                     {0,nil},
                     {0,nil},
                     {0,nil},
                     0,0,
                     {rates,
                         {{1345,671562,425012},0},
                         {{1345,671562,425012},0},
                         0.0,0.0,
                         {1345,671606,221014}}},
                 undefined,#Ref<0.0.0.8008>,
                 {dict,0,16,16,8,80,48,
                     {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                     {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                 {dict,0,16,16,8,80,48,
                     {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                     {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                 0,
                 {dict,0,16,16,8,80,48,
                     {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                     {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                 {dict,0,16,16,8,80,48,
                     {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                     {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
                 true}],
            []},
        {rabbit_mirror_queue_slave,process_instruction,2,[]},
        {rabbit_mirror_queue_slave,handle_cast,2,[]},
        {gen_server2,handle_msg,2,[]},
        {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,237}]}]}

=INFO REPORT==== 22-Aug-2012::14:42:42 ===
rabbit on node rabbit at play2 up
--------
On play2, the console spew includes this:

Mnesia(rabbit at play2): write performed by {tid,3817,<3297.298.0>} on record:
        {gm_group,{resource,<<"/">>,queue,<<"system_test">>},
                  83,
                  [{70,<3299.289.0>},{79,<3297.298.0>},{82,<0.295.0>}]}
Mnesia(rabbit at play2): write performed by {tid,3817,<0.277.0>} on record:
        {gm_group,{resource,<<"/">>,queue,<<"unit_test">>},
                  10,
                  [{12,<3299.4432.0>},{7,<0.277.0>}]}
done
-- message delivery logic ready
starting error log relay
...Mnesia(rabbit at play2): Restarting transaction {tid,3818,<0.165.0>}: in
8ms {cyclic,rabbit at play2
,{rabbit_queue,{resource,<<47>>,queue,<<117,110,105,116,95,116,101,115,116>>}},read,read,{tid,3816,<3297.184.0>}}
done
starting networking
...Mnesia(rabbit at play2): delete_object performed by {tid,3816,<3297.184.0>}
on record:
        {rabbit_reverse_route,
            {reverse_binding,
                {resource,<<"/">>,queue,<<"unit_test">>},
                <<"unit_test">>,
                {resource,<<"/">>,exchange,<<>>},
                []},
            const}
Mnesia(rabbit at play2): delete_object performed by {tid,3816,<3297.184.0>} on
record:
        {rabbit_route,{binding,{resource,<<"/">>,exchange,<<>>},
                               <<"unit_test">>,
                               {resource,<<"/">>,queue,<<"unit_test">>},
                               []},
                      const}
Mnesia(rabbit at play2): delete performed by {tid,3816,<3297.184.0>} on record:
        {rabbit_queue,{resource,<<"/">>,queue,<<"unit_test">>}}
Mnesia(rabbit at play2): delete_object performed by {tid,3816,<3297.184.0>} on
record:
        {rabbit_semi_durable_route,
            {binding,
                {resource,<<"/">>,exchange,<<>>},
                <<"unit_test">>,
                {resource,<<"/">>,queue,<<"unit_test">>},
                []},
            const}
Mnesia(rabbit at play2): delete_object performed by {tid,3816,<3297.184.0>} on
record:
        {rabbit_durable_route,
            {binding,
                {resource,<<"/">>,exchange,<<>>},
                <<"unit_test">>,
                {resource,<<"/">>,queue,<<"unit_test">>},
                []},
            const}
Mnesia(rabbit at play2): delete performed by {tid,3816,<3297.184.0>} on record:
        {rabbit_durable_queue,{resource,<<"/">>,queue,<<"unit_test">>}}
Mnesia(rabbit at play2): write performed by {dirty,<0.301.0>} on record:
        {rabbit_listener,rabbit at play2,amqp,
                         "play2.mpietrek.internal.illumita.com",
                         {0,0,0,0},
                         5672}
done
starting direct client
...done
starting notify cluster nodes
...done

broker running
Mnesia(rabbit at play2): write performed by {tid,3820,<3299.4432.0>} on record:
        {gm_group,{resource,<<"/">>,queue,<<"unit_test">>},
                  11,
                  [{12,<3299.4432.0>},{dead,{7,<0.277.0>}}]}
Mnesia(rabbit at play2): write performed by {tid,3821,<3299.4432.0>} on record:
        {gm_group,{resource,<<"/">>,queue,<<"unit_test">>},
                  12,
                  [{12,<3299.4432.0>}]}
Mnesia(rabbit at play2): Transaction log dump initiated by time_threshold:
{needs_dump,47}
Mnesia(rabbit at play2): Transaction log dump initiated by time_threshold:
already_dumped
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20120822/d45a9a9c/attachment.htm>


More information about the rabbitmq-discuss mailing list