[rabbitmq-discuss] Mnesia Corruption Bug
Lee Hambley
lee.hambley at gmail.com
Thu Jun 13 12:36:04 BST 2013
Posting this to the list after some discussion on IRC with bob2351 on
irc.freenode.net.
We have a *slightly* strange setup with RabbitMQ: we start it under
`runit`, so it effectively believes that it's running in the foreground.
I have anecdotal evidence that this causes other problems, but at least
nothing that hurts too often (i.e. you lose "persistent messages" in this
setup).
That all aside, attached ( https://gist.github.com/leehambley/5773039 ) is
a stacktrace from a problematic box. We couldn't get it to recover (single
node, single replica, etc.), so we simply deleted the mnesia database,
which worked well enough.
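Deleting the database also throws away the evidence. A minimal sketch of moving the directory aside instead, so that a snapshot is still around to attach to a report: the function name `set_aside_mnesia` and the `.corrupt-<timestamp>` suffix are made up for illustration, and the example path in the comments is an assumption about a Debian package install, not something taken from the affected box.

```shell
#!/bin/sh
# Move a node's Mnesia directory aside instead of deleting it, so the broken
# database can be archived and attached to a bug report later.
#
# Assumptions (not from the original report):
#   - the node is already stopped before this runs;
#   - the argument is the node's Mnesia directory, e.g. something like
#     /var/lib/rabbitmq/mnesia/ourproject@carla on a Debian package install.

set_aside_mnesia() {
    mnesia_dir=$1
    if [ ! -d "$mnesia_dir" ]; then
        echo "no such directory: $mnesia_dir" >&2
        return 1
    fi
    # Strip a trailing slash so the suffix lands on the directory name.
    mnesia_dir=${mnesia_dir%/}
    backup_dir="${mnesia_dir}.corrupt-$(date +%Y%m%d-%H%M%S)"
    if [ -e "$backup_dir" ]; then
        echo "refusing to overwrite: $backup_dir" >&2
        return 1
    fi
    mv -- "$mnesia_dir" "$backup_dir" || return 1
    # Print the new location so it can be tarred up or uploaded.
    echo "$backup_dir"
}
```

With the node stopped (under `runit`, via the supervisor, so that it is not restarted mid-move), this would be called as `set_aside_mnesia /path/to/mnesia/dir`; the node should then come up with a fresh, empty database on the next start, just as it did after the delete.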
Some information about our environment:
$ erl --version
Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8] [async-threads:0] [kernel-poll:false]
$ dpkg --list | grep rabbit
ii rabbitmq-server 3.0.4-1 AMQP server written in Erlang
$ sudo RABBITMQ_NODENAME=ourproject rabbitmqctl status
Status of node ourproject at carla ...
[{pid,8055},
 {running_applications,
     [{rabbitmq_management,"RabbitMQ Management Console","3.0.4"},
      {rabbitmq_management_agent,"RabbitMQ Management Agent","3.0.4"},
      {rabbit,"RabbitMQ","3.0.4"},
      {os_mon,"CPO CXC 138 46","2.2.7"},
      {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.0.4"},
      {webmachine,"webmachine","1.9.1-rmq3.0.4-git52e62bc"},
      {mochiweb,"MochiMedia Web Server","2.3.1-rmq3.0.4-gitd541e9a"},
      {xmerl,"XML parser","1.2.10"},
      {inets,"INETS CXC 138 49","5.7.1"},
      {mnesia,"MNESIA CXC 138 12","4.5"},
      {amqp_client,"RabbitMQ AMQP Client","3.0.4"},
      {sasl,"SASL CXC 138 11","2.1.10"},
      {stdlib,"ERTS CXC 138 10","1.17.5"},
      {kernel,"ERTS CXC 138 10","2.14.5"}]},
 {os,{unix,linux}},
 {erlang_version,
     "Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8] [async-threads:30] [kernel-poll:true]\n"},
 {memory,
     [{total,33984216},
      {connection_procs,756760},
      {queue_procs,325576},
      {plugins,218728},
      {other_proc,9518440},
      {mnesia,93728},
      {mgmt_db,148472},
      {msg_index,71528},
      {other_ets,1145600},
      {binary,604208},
      {code,17266925},
      {atom,1550457},
      {other_system,2283794}]},
 {vm_memory_high_watermark,0.4},
 {vm_memory_limit,6656894566},
 {disk_free_limit,1000000000},
 {disk_free,11247643770880},
 {file_descriptors,
     [{total_limit,924},
      {total_used,23},
      {sockets_limit,829},
      {sockets_used,12}]},
 {processes,[{limit,1048576},{used,345}]},
 {run_queue,0},
 {uptime,2692}]
...done.
I believe this bug is already being tracked internally, and I post the
report here in the hope that I'll have a place to attach a snapshot of an
mnesia database the next time this happens to us, or that someone else
might find this report and be able to contribute. Finally, selfishly, in
the hope that I'll get notified when this gets fixed, and I upgrade, and
sleep at night again.
- Lee Hambley