[rabbitmq-discuss] Queue slow to come up with a lot of messages stored on disk w/ bug21673 branch

Mon Mar 1 17:49:56 GMT 2010

Hi Scott,

On Mon, Mar 01, 2010 at 09:35:29AM -0800, scott w wrote:
> I've noticed the following behavior when there are a large number of
> messages persisted to disk, e.g. > 1M, and I restart rabbitmq. Initially,
> when I list the queues using rabbitmqctl it says there are no queues. Then
> after a minute or two a couple queues begin to appear and then after ~5-10
> minutes all the queues show up. I am using branch bug21637. Is this expected
> behaviour? FWIW, I would have expected all of the queues themselves to show
> up immediately assuming that the queues and their sizes are indexed and
> stored separately from the actual data files.

Yes, that is expected. Firstly, although rabbitmqctl will respond, the
TCP listeners won't have been started at that point, so you are "safe"
in the sense that no client can connect and start creating or deleting
queues.

Secondly, what is going on is essentially a pretty thorough fsck of the
data. I have spent a long time optimising this and it really can't be
made to go much faster. The main reason why recording state on a "clean"
shutdown and then restoring it isn't sufficient for instant start up is
that on start up we have to go through the disk and delete any messages
that are transient (i.e. they only got pushed to disk due to memory
pressure). As such, we have to walk over every queue on disk. This can
take a long time.
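
To make the shape of that start-up pass concrete, here is a minimal
Python sketch. It is not RabbitMQ's actual code (which is Erlang and
works against files on disk); the Entry record, the in-memory dict of
queues and the recover_queues function are all hypothetical stand-ins.
It only illustrates why every entry of every queue has to be visited
before a queue's length is known.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    msg_id: str
    persistent: bool  # False: transient, only on disk due to memory pressure


def recover_queues(queues):
    """Walk every entry of every queue, dropping transient messages.

    `queues` maps a queue name to the list of entries found on disk
    (a hypothetical in-memory stand-in for the real on-disk data).
    Returns (recovered, lengths, visited).
    """
    recovered = {}
    lengths = {}
    visited = 0
    for name, entries in queues.items():
        kept = []
        for entry in entries:
            visited += 1  # every entry is looked at exactly once
            if entry.persistent:
                kept.append(entry)
        recovered[name] = kept
        lengths[name] = len(kept)  # only known once the walk is complete
    return recovered, lengths, visited


on_disk = {
    "orders": [Entry("a", True), Entry("b", False), Entry("c", True)],
    "logs": [Entry("d", False), Entry("e", False)],
}
recovered, lengths, visited = recover_queues(on_disk)
print(lengths, visited)
```

The work is proportional to the total number of entries on disk, not to
the number of queues, which is why a queue only becomes visible once its
own walk has finished.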

Finally, what you're suggesting would require that we write out, e.g.,
the queue size (and flush disk caches) on every single publish and
deliver/get to and from the queue. Doing such a thing would be
phenomenally expensive.
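
As a rough illustration of where that cost comes from, the Python
sketch below keeps a queue length in a file and forces it to disk on
every change. The DurableLength class and run_demo helper are
hypothetical and have nothing to do with how RabbitMQ stores anything;
the point is only that the number of forced flushes grows one-for-one
with the number of publishes and delivers.

```python
import os
import tempfile


class DurableLength:
    """Hypothetical queue-length record, flushed to disk on every change."""

    def __init__(self, path):
        self.path = path
        self.length = 0
        self.syncs = 0  # number of forced flushes performed so far

    def _persist(self):
        with open(self.path, "w") as f:
            f.write(str(self.length))
            f.flush()
            os.fsync(f.fileno())  # push the data past the OS cache
        self.syncs += 1

    def publish(self):
        self.length += 1
        self._persist()

    def deliver(self):
        self.length -= 1
        self._persist()


def run_demo(n):
    """Publish n messages, deliver n messages, report (length, syncs)."""
    with tempfile.TemporaryDirectory() as d:
        q = DurableLength(os.path.join(d, "length"))
        for _ in range(n):
            q.publish()
        for _ in range(n):
            q.deliver()
        return q.length, q.syncs


final_length, sync_count = run_demo(100)
print(final_length, sync_count)
```

Each of those flushes has to wait for the disk, so the per-message cost
is paid on the hot path rather than once at start up.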

If it weren't for the fact that people (and AMQP) expect queues to know
their own length, we could probably drop transient messages lazily.
However, at the end of the day, the same amount of work has to be done.
It's really just a matter of when it gets done.
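
A small Python sketch of what "lazily" could look like, purely as an
assumption on my part: LazyQueue and drain are hypothetical names, and
the (msg_id, persistent) tuples stand in for whatever is really on disk.
Transient messages are skipped at get time instead of at start up, and
the count of inspected entries shows that the total work is unchanged.

```python
class LazyQueue:
    """Hypothetical queue that drops transient messages at get time."""

    def __init__(self, entries):
        # entries: list of (msg_id, persistent) tuples as found on disk
        self._entries = list(entries)
        self._pos = 0
        self.inspected = 0  # how many entries have been examined so far

    def get(self):
        """Return the next persistent msg_id, or None if none are left."""
        while self._pos < len(self._entries):
            msg_id, persistent = self._entries[self._pos]
            self._pos += 1
            self.inspected += 1
            if persistent:
                return msg_id
            # transient entry: dropped here rather than at start up
        return None


def drain(queue):
    """Fetch messages until the queue reports that it is empty."""
    out = []
    while True:
        msg = queue.get()
        if msg is None:
            return out
        out.append(msg)


entries = [("a", True), ("b", False), ("c", False), ("d", True), ("e", False)]
lazy = LazyQueue(entries)
delivered = drain(lazy)
print(delivered, lazy.inspected)
```

Note that this queue cannot report its true length up front: until the
remaining entries have been inspected it does not know how many of them
are transient.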

Matthew