Hi Francesco,<br><br>Argh... I was so happy that this bug appeared to have been reproduced.<br><br>I would happily give you a script to repro, except that the logic is intertwined with our own infrastructure. So instead, here's a brain dump of everything I can fathom would be relevant.<br>
<br>Each node (rabbit@play, rabbit@play2, rabbit@util) runs on its own separate Ubuntu 10.04 64-bit VM. The management and tracing plugins are enabled on all nodes. All nodes are disc nodes.<br><br>Clustering is performed by a config file that's the same on all nodes:<br>
<br>----<br>[<br> {rabbit, [{cluster_nodes, [rabbit@play,rabbit@play2,rabbit@util] },<br> {disk_free_limit, 104857600}<br> ]<br> },<br> {mnesia, [{debug, trace}<br> ]<br> }<br>
].<br>----<br><br>All queues are created as HA and durable.<br><br>Start from a fully operational cluster with all nodes running and a set of ~6 empty queues, then:<br><ul><li>Pick a queue to work with (say, "foo")</li>
<li>Using the management UI, select the queue tab and publish 4 messages to "foo", delivery-mode = persistent, but message content doesn't matter. I use "abc" for the content.</li><li>Using rabbitmqctl, bring the first node (play) down. In the management UI, note that the queues are now backed by only two nodes (play2, util)</li>
<li>Using rabbitmqctl, bring the first node (play) back up. In the management UI, note that the "foo" queue is synced on (play2, util), but not on play. (UI shows "+1 +1")</li><li>Using the management UI, retrieve all the messages from the queue in question, but with the requeue flag enabled. After this, the queue appears synced on all nodes ("+2").</li>
<li>Using rabbitmqctl, bring down the 2nd node (play2). If you wait long enough, you should see that the "foo" queue has vanished. However, all other queues remain. You should also have logs similar to what I included previously.</li>
<li>On the off chance that "foo" doesn't disappear within 10-15 seconds, use rabbitmqctl to start the second node up again. See if "foo" has vanished.<br></li></ul><br>Regarding the resync logic I have (HTTP API to retrieve all messages, with requeue), I'm not wedded to that. I'm just looking for some way to quickly resync all contents, preferably without reading and re-publishing all the messages myself. If there's some way to let the broker do most (all?) of the work, I'm all ears. My observation was that the "retrieve with requeue" seemed to work as intended, but you're obviously the expert on this.<br>
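<br>For what it's worth, here is a rough sketch of how the steps in the list above might be scripted without our infrastructure. It only builds the ordered list of HTTP requests and rabbitmqctl invocations; it does not send anything. Several details are my assumptions rather than facts about the setup: the management API being reachable on host "util" at port 55672, the default vhost "/", publishing through the default exchange, and "bring down/up" meaning stop_app/start_app. The function names are made up for illustration.<br>

```python
import json
from urllib.parse import quote

# Assumptions (not confirmed by the setup described above):
# management API on port 55672, default vhost "/", API host "util".
MGMT_PORT = 55672
VHOST = "/"
API_HOST = "util"


def api_url(host, *parts):
    """Build a management API URL, percent-encoding each path segment."""
    path = "/".join(quote(p, safe="") for p in parts)
    return "http://{}:{}/api/{}".format(host, MGMT_PORT, path)


def publish_request(queue, payload="abc"):
    """Publish one persistent message to `queue` via the default exchange."""
    body = {
        "properties": {"delivery_mode": 2},  # 2 = persistent
        "routing_key": queue,
        "payload": payload,
        "payload_encoding": "string",
    }
    url = api_url(API_HOST, "exchanges", VHOST, "amq.default", "publish")
    return url, json.dumps(body)


def get_with_requeue_request(queue, count):
    """Fetch up to `count` messages and requeue them (the 'resync' step)."""
    body = {"count": count, "requeue": True, "encoding": "auto"}
    url = api_url(API_HOST, "queues", VHOST, queue, "get")
    return url, json.dumps(body)


def ctl_command(node, action):
    """rabbitmqctl invocation; stop_app/start_app is an assumed reading
    of 'bring the node down/up'."""
    if action not in ("stop_app", "start_app"):
        raise ValueError("unsupported action: " + action)
    return ["rabbitmqctl", "-n", node, action]


def repro_plan(queue="foo", messages=4):
    """Ordered (kind, detail) steps mirroring the list above."""
    plan = [("http", publish_request(queue)) for _ in range(messages)]
    plan.append(("ctl", ctl_command("rabbit@play", "stop_app")))
    plan.append(("ctl", ctl_command("rabbit@play", "start_app")))
    plan.append(("http", get_with_requeue_request(queue, messages)))
    plan.append(("ctl", ctl_command("rabbit@play2", "stop_app")))
    return plan


if __name__ == "__main__":
    for kind, detail in repro_plan():
        print(kind, detail)
```

<br>Each "http" entry would be sent as a POST with a JSON body and the management credentials, and each "ctl" entry run on the relevant VM, with a pause after every start/stop so the cluster can settle. The optional last step (starting play2 again if "foo" hasn't vanished) is left out of the plan.<br>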
<br><br><div class="gmail_quote">On Wed, Aug 29, 2012 at 3:56 AM, Francesco Mazzoli <span dir="ltr">&lt;<a href="mailto:francesco@rabbitmq.com" target="_blank">francesco@rabbitmq.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi Matt,<br>
<br>
At Wed, 22 Aug 2012 15:06:42 -0700,<br>
<div class="im">Matt Pietrek wrote:<br>
> I then take down the play node and start it back up. Afterwards, I force<br>
> everything to be synchronized by doing a management API 'get messages" with<br>
> requeue=True. When this completes, everything shows up synched as expected.<br>
<br>
</div>This should not happen. Messages are requeued at the original position in the<br>
queue, see &lt;<a href="http://www.rabbitmq.com/semantics.html" target="_blank">http://www.rabbitmq.com/semantics.html</a>&gt;, and thus that has no effect<br>
to the syncing of slaves. Republishing would.<br>
<br>
I tried to reproduce the problem following exactly what you did, with no<br>
success. Can you describe, in detail, your setup and the steps you're taking to<br>
reproduce that? The most convenient thing would be to automate the procedure in<br>
a script.<br>
<br>
--<br>
Francesco * Often in error, never in doubt<br>
</blockquote></div><br>