[rabbitmq-discuss] RabbitMQ stopping when trying to delete locked mnesia files?

Øyvind Tjervaag oyvind at tjervaag.com
Wed May 30 13:33:32 BST 2012


On May 30, 2012, at 12:49 PM, Simon MacMullen wrote:

> On 29/05/12 13:01, Øyvind Tjervaag wrote:
> 
>> I'm running RabbitMQ version 2.8.2 and Erlang R15B01 on Windows 2008
>> R2 (64 bit). Now, I've told the IT-ops-people to stop locking any
>> files in the mnesia folder, but I think RabbitMQ should not fail this
>> badly when it tries deleting a locked file?
> 
> Hmm. We do in general assume that file operations on files owned by Rabbit will succeed; it's hard to know in the general case what else to do.

Agreed.

> In the crash that you're seeing, Rabbit was not able to delete an old file from the message store. That's a *comparatively* benign event, but the question of what Rabbit should do in this case is still not obvious. Should it ignore the fact that the delete failed (and thus leak that file)? Should it maintain records of which deletions have failed with the intent of retrying them later? (And does that get persisted?) Should the message store hang until the file can be deleted? (That could be a long time, and you won't accept any new persistent messages until then.)
> 
> But it's worse than that - while Rabbit in general tries to open files and keep them open, it will close and reopen files when it is running low on file descriptors. If reopening (for example) a queue index file fails, it's *really* not obvious what our plan B could be.

I definitely see that the issue is bigger than I first thought it was; I just thought it should crash as seldom as possible ;).
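
To make the "keep records of failed deletions and retry them later" option concrete, here is a rough sketch in Python. It is purely illustrative and not how Rabbit's message store works: the class name DeferredDeleter, its methods, and the injectable remove function are all made up for this example. The pending set also lives only in memory, so it sidesteps the persistence question raised above.

```python
import os


class DeferredDeleter:
    """Hypothetical helper: remembers deletions that failed so they can be retried."""

    def __init__(self, remove=os.remove):
        self._remove = remove   # injectable so the behaviour can be tested
        self.pending = set()    # kept in memory only; NOT persisted across restarts

    def delete(self, path):
        """Try to delete path. Return True on success, False if it was deferred."""
        try:
            self._remove(path)
        except FileNotFoundError:
            pass  # already gone, so treat it as deleted
        except OSError:
            # For example PermissionError, which is what a file held open by
            # another process typically raises on Windows.
            self.pending.add(path)
            return False
        self.pending.discard(path)
        return True

    def retry_pending(self):
        """Retry every deferred deletion and return the paths that still fail."""
        for path in list(self.pending):
            self.delete(path)
        return set(self.pending)
```

Something would still have to call retry_pending() periodically, and a file that never gets unlocked stays in the pending set forever, which is the leak mentioned above in a different form.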

> Rabbit really needs to know that its files are under its control.

Now that I know, I'll be more careful to let the people in charge of backups and virus scans know not to touch the files!

Thanks,
Øyvind

