[rabbitmq-discuss] Mnesia Corruption Bug

Lee Hambley lee.hambley at gmail.com
Thu Jun 13 12:36:04 BST 2013


Posting this to the list after some discussion on IRC with bob2351 on 
irc.freenode.net.

We have a *slightly* strange setup with RabbitMQ: we start it under 
`runit`, so it effectively believes that it's running in the foreground. 
I have anecdotal evidence that this causes other problems, but nothing 
that hurts too often (i.e. you lose "persistent messages" in this setup).

That all aside, attached ( https://gist.github.com/leehambley/5773039 ) is 
a stacktrace from a problematic box. We couldn't get it to recover (single 
node, single replica, etc.), so we simply deleted the mnesia database, 
which worked well enough.
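
Since a snapshot of the broken database would be useful for a report like 
this one, a variation on deleting it is to move it aside first. The 
following is only a minimal sketch: the function name `set_aside_mnesia` 
is hypothetical, and it assumes the broker has already been stopped and 
that you know where the node's mnesia directory lives on your system.

```shell
#!/bin/sh
# Hypothetical helper (not part of RabbitMQ): rename a node's mnesia
# directory instead of deleting it, so the corrupt files can still be
# attached to a bug report later.
# Assumption: the broker is already stopped when this is called.
set_aside_mnesia() {
    dir=${1%/}    # drop a trailing slash, if any
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # A timestamp suffix keeps repeated incidents from overwriting each other.
    backup="${dir}.corrupt.$(date +%Y%m%d%H%M%S)"
    mv -- "$dir" "$backup" || return 1
    # Print the new location so the caller can archive it.
    echo "$backup"
}
```

It would be called with the node's database directory as its only 
argument, for example something like 
`set_aside_mnesia /var/lib/rabbitmq/mnesia/ourproject` (that path is an 
assumption, check where your installation keeps it). On the next start the 
broker should then build a fresh, empty database, just as it did after the 
delete.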

Some information about our environment:

$ erl --version
Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8] [async-threads:0] [kernel-poll:false]
$ dpkg --list | grep rabbit
ii  rabbitmq-server     3.0.4-1     AMQP server written in Erlang
$ sudo RABBITMQ_NODENAME=ourproject rabbitmqctl status
Status of node ourproject@carla ...
[{pid,8055},
 {running_applications,
     [{rabbitmq_management,"RabbitMQ Management Console","3.0.4"},
      {rabbitmq_management_agent,"RabbitMQ Management Agent","3.0.4"},
      {rabbit,"RabbitMQ","3.0.4"},
      {os_mon,"CPO  CXC 138 46","2.2.7"},
      {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.0.4"},
      {webmachine,"webmachine","1.9.1-rmq3.0.4-git52e62bc"},
      {mochiweb,"MochiMedia Web Server","2.3.1-rmq3.0.4-gitd541e9a"},
      {xmerl,"XML parser","1.2.10"},
      {inets,"INETS  CXC 138 49","5.7.1"},
      {mnesia,"MNESIA  CXC 138 12","4.5"},
      {amqp_client,"RabbitMQ AMQP Client","3.0.4"},
      {sasl,"SASL  CXC 138 11","2.1.10"},
      {stdlib,"ERTS  CXC 138 10","1.17.5"},
      {kernel,"ERTS  CXC 138 10","2.14.5"}]},
 {os,{unix,linux}},
 {erlang_version,
     "Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8] [async-threads:30] [kernel-poll:true]\n"},
 {memory,
     [{total,33984216},
      {connection_procs,756760},
      {queue_procs,325576},
      {plugins,218728},
      {other_proc,9518440},
      {mnesia,93728},
      {mgmt_db,148472},
      {msg_index,71528},
      {other_ets,1145600},
      {binary,604208},
      {code,17266925},
      {atom,1550457},
      {other_system,2283794}]},
 {vm_memory_high_watermark,0.4},
 {vm_memory_limit,6656894566},
 {disk_free_limit,1000000000},
 {disk_free,11247643770880},
 {file_descriptors,
     [{total_limit,924},
      {total_used,23},
      {sockets_limit,829},
      {sockets_used,12}]},
 {processes,[{limit,1048576},{used,345}]},
 {run_queue,0},
 {uptime,2692}]
...done.


I believe this bug is already being tracked internally. I'm posting the 
report here in the hope that I'll have a place to attach a snapshot of an 
mnesia database the next time this happens to us, or that someone else 
might find this report and be able to contribute. Finally, and selfishly, 
I hope to get notified when this gets fixed, so that I can upgrade and 
sleep at night again.

- Lee Hambley