[rabbitmq-discuss] RabbitMQ stopping when trying to delete locked mnesia files?

Simon MacMullen simon at rabbitmq.com
Wed May 30 11:49:54 BST 2012


On 29/05/12 13:01, Øyvind Tjervaag wrote:
> Hi!

Hi.

> I've had this problem a couple of times now where it seems, if I'm
> deciphering the logs correctly, RabbitMQ crashes when it tries to
> delete mnesia files that are locked by backup or virus-scanning. I
> thought I had seen something about this on this list a while back,
> but I can't seem to find it now.

That's a correct decipherment.

> I'm running RabbitMQ version 2.8.2 and Erlang R15B01 on Windows 2008
> R2 (64 bit). Now, I've told the IT-ops-people to stop locking any
> files in the mnesia folder, but I think RabbitMQ should not fail this
> badly when it tries deleting a locked file?

Hmm. We do in general assume that file operations on files owned by 
Rabbit will succeed; it's hard to know in the general case what else to do.

In the crash that you're seeing, Rabbit was not able to delete an old 
file from the message store. That's a *comparatively* benign event, but 
the question of what Rabbit should do in this case is still not obvious. 
Should it ignore the fact that the delete failed (and thus leak that 
file)? Should it maintain records of which deletions have failed with 
the intent of retrying them later? (And does that get persisted?) Should 
the message store hang until the file can be deleted? (That could be a 
long time, and you won't accept any new persistent messages until then.)
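
To make the second of those options concrete, here is a minimal sketch in
Python of "record failed deletions and retry them later". It is an
illustration only, not Rabbit's Erlang code: the class name DeferredDeleter
and its injectable remove parameter are made up for the example, and the
pending set is kept in memory, so it deliberately leaves the "does that get
persisted?" question unanswered.

```python
import os


class DeferredDeleter:
    """Track deletions that failed so they can be retried later.

    Hypothetical illustration; the pending set is in memory only,
    so it is lost on restart.
    """

    def __init__(self, remove=os.remove):
        self._remove = remove   # injectable so failures can be simulated
        self.pending = set()    # paths whose deletion has failed so far

    def delete(self, path):
        """Try to delete path; return True on success, False if deferred."""
        try:
            self._remove(path)
        except FileNotFoundError:
            # Already gone, so there is nothing left to retry.
            self.pending.discard(path)
            return True
        except OSError:
            # For example PermissionError while another process holds a lock.
            self.pending.add(path)
            return False
        self.pending.discard(path)
        return True

    def retry_pending(self):
        """Retry every deferred deletion; return the paths still failing."""
        for path in list(self.pending):
            self.delete(path)
        return set(self.pending)
```

A caller would use delete() wherever it deletes today and call
retry_pending() periodically. Until a retry succeeds the file is still
leaked, so this only bounds the problem rather than removing it.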

But it's worse than that - while Rabbit in general tries to open files 
and keep them open, it will close and reopen files when it is running 
low on file descriptors. If reopening (for example) a queue index file 
fails, it's *really* not obvious what our plan B could be.
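
The close-and-reopen behaviour can be sketched roughly as follows, again in
Python rather than Erlang and under stated assumptions: HandleCache,
max_open and opener are hypothetical names, and the least-recently-used
eviction policy is just one plausible choice, not a description of Rabbit's
actual policy. The point is the marked line: once a handle has been closed
to free a descriptor, getting it back depends on an open that another
process can cause to fail.

```python
from collections import OrderedDict


class HandleCache:
    """Keep at most max_open files open, closing the least recently used."""

    def __init__(self, max_open, opener=open):
        self.max_open = max_open
        self._opener = opener
        self._open = OrderedDict()  # path -> file object, oldest first
        self._offsets = {}          # path -> position saved when closed

    def get(self, path):
        """Return an open handle for path, reopening it if necessary."""
        if path in self._open:
            self._open.move_to_end(path)  # mark as most recently used
            return self._open[path]
        while len(self._open) >= self.max_open:
            old_path, old_handle = self._open.popitem(last=False)
            self._offsets[old_path] = old_handle.tell()
            old_handle.close()
        # This open can raise OSError if something else has locked the
        # file in the meantime, and there is no obvious fallback.
        handle = self._opener(path, "r+b")
        handle.seek(self._offsets.pop(path, 0))
        self._open[path] = handle
        return handle

    def close_all(self):
        """Close every handle that is currently open."""
        for handle in self._open.values():
            handle.close()
        self._open.clear()
```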

Rabbit really needs to know that its files are under its control.

Cheers, Simon

-- 
Simon MacMullen
RabbitMQ, VMware

