Hi Matthew --<br><br>OK, thanks for explaining the issue in such detail. The queues aren't normally supposed to be as big as they were, so hopefully with smaller queues the problem will be less pronounced in future.<br>
<br>Thanks,<br>Scott<br><br><div class="gmail_quote">On Mon, Mar 1, 2010 at 9:49 AM, Matthew Sackman <span dir="ltr">&lt;<a href="mailto:matthew@lshift.net">matthew@lshift.net</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hi Scott,<br>
<br>
On Mon, Mar 01, 2010 at 09:35:29AM -0800, scott w wrote:<br>
> I've noticed the following behavior when there are a large number of<br>
> messages persisted to disk, e.g. > 1M, and I restart rabbitmq. Initially,<br>
> when I list the queues using rabbitmqctl it says there are no queues. Then<br>
> after a minute or two a couple queues begin to appear and then after ~5-10<br>
> minutes all the queues show up. I am using branch bug21637. Is this expected<br>
> behaviour? FWIW, I would have expected all of the queues themselves to show<br>
> up immediately assuming that the queues and their sizes are indexed and<br>
> stored separately from the actual data files.<br>
<br>
Yes, that is expected. Firstly, although rabbitmqctl will respond, the TCP<br>
listeners won't have been started at that point, so you are "safe" in the<br>
sense that nothing else can come along and start creating or deleting queues.<br>
<br>
Secondly, what is going on is essentially a thorough fsck of the<br>
data. I have spent a long time optimising this, and it really can't be<br>
made to go much faster. The main reason why recording a "clean"<br>
shutdown and then restoring that state isn't sufficient for an instant<br>
start-up is that on start-up we have to go through the disk and delete<br>
any messages that are transient (i.e. they only got pushed to disk due<br>
to memory pressure). As such, we have to walk over every queue on disk,<br>
which can take a long time.<br>
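<br>
As a rough illustration of that start-up pass, here is a minimal sketch in<br>
Python (not RabbitMQ's actual Erlang code). The StoredMessage type, the<br>
recover_queues function and the dict standing in for the on-disk layout are<br>
all hypothetical; the point is only that every stored message has to be<br>
visited before any queue length is known.<br>
<br>
```python
from dataclasses import dataclass


@dataclass
class StoredMessage:
    msg_id: int
    persistent: bool  # False = transient, on disk only due to memory pressure


def recover_queues(on_disk):
    """Walk every queue, drop transient messages, and count what is left.

    `on_disk` is a hypothetical stand-in for the stored queue data:
    a dict mapping queue name -> list of StoredMessage.
    """
    recovered = {}
    lengths = {}
    for name, messages in on_disk.items():
        # Every message must be inspected: O(total messages on disk).
        kept = [m for m in messages if m.persistent]
        recovered[name] = kept
        lengths[name] = len(kept)
    return recovered, lengths


on_disk = {
    "orders": [StoredMessage(1, True), StoredMessage(2, False), StoredMessage(3, True)],
    "logs": [StoredMessage(4, False)],
}
queues, lengths = recover_queues(on_disk)
```
<br>
In this sketch a queue can only be listed with a correct length once its<br>
whole message list has been filtered, which matches queues appearing one by<br>
one while the scan runs.<br>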
<br>
Finally, what you're suggesting would require that we write out, e.g.,<br>
queue size (and flush disk caches), on every single publish and<br>
deliver/get to and from the queue. Doing such a thing would be<br>
phenomenally expensive.<br>
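<br>
To make the cost concrete, here is a small Python sketch (again not<br>
RabbitMQ code) that counts how many flushes to disk each approach needs.<br>
LengthFile and run are hypothetical names, and a single small file stands in<br>
for wherever a queue's length would be recorded.<br>
<br>
```python
import os
import tempfile


class LengthFile:
    """Hypothetical on-disk record of one queue's length."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.syncs = 0

    def write_length(self, n):
        # Overwrite the stored length and force it out of the OS cache.
        os.lseek(self.fd, 0, os.SEEK_SET)
        os.ftruncate(self.fd, 0)
        os.write(self.fd, str(n).encode())
        os.fsync(self.fd)
        self.syncs += 1

    def close(self):
        os.close(self.fd)


def run(ops, eager):
    """Apply publish/get operations; return (final length, number of fsyncs)."""
    with tempfile.TemporaryDirectory() as d:
        f = LengthFile(os.path.join(d, "length"))
        length = 0
        for op in ops:
            length += 1 if op == "publish" else -1
            if eager:
                f.write_length(length)  # one flush per operation
        if not eager:
            f.write_length(length)  # single flush, at clean shutdown only
        f.close()
        return length, f.syncs


ops = ["publish"] * 100 + ["get"] * 40
eager_result = run(ops, eager=True)
lazy_result = run(ops, eager=False)
```
<br>
With these 140 operations the eager variant performs 140 flushes and the<br>
other performs one, but the single write is only trustworthy after a clean<br>
shutdown.<br>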
<br>
If it wasn't for the fact that people (and AMQP) expect queues to know<br>
their own length, we could probably drop transient messages lazily.<br>
However, at the end of the day, the same amount of work has to be done.<br>
It's really just a matter of when it gets done.<br>
<br>
Matthew<br>
<br>
</blockquote></div><br>