[rabbitmq-discuss] Mnesia Corruption Bug

Simon MacMullen simon at rabbitmq.com
Thu Jun 13 13:49:42 BST 2013


Hi Lee. I would be interested to know how you got the machine into that 
state.

There is a bug with a similar stack trace that will be fixed in the next 
release - but I don't think it's the same bug. In your case we are 
seeing a message which has been published and delivered according to the 
queue index, but only published (and not delivered) according to the 
queue index's journal. As the journal should always record either the 
same state as the main index or a newer one, this should be impossible.
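
To make the invariant above concrete, here is a minimal sketch in Python. 
It is a simplified model only, not RabbitMQ's code or on-disk format: the 
names MsgState and find_violations, and the idea of representing each 
store as a dict from sequence id to state, are assumptions made purely 
for illustration.

```python
from enum import IntEnum


class MsgState(IntEnum):
    """Hypothetical ordering of per-message states (later = newer)."""
    PUBLISHED = 1
    DELIVERED = 2
    ACKED = 3


def find_violations(index, journal):
    """Return sequence ids whose journal state is older than the index state.

    Both arguments are dicts mapping a sequence id to a MsgState. Under the
    invariant described above, the result should always be empty.
    """
    return sorted(
        seq_id
        for seq_id, journal_state in journal.items()
        if seq_id in index and journal_state < index[seq_id]
    )


# The situation described in this thread: the index says "delivered",
# the journal only says "published".
index = {7: MsgState.DELIVERED}
journal = {7: MsgState.PUBLISHED}
violations = find_violations(index, journal)
```

In this model, a non-empty result corresponds to the "impossible" state 
seen in the stack trace.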

So to eliminate obvious causes of weirdness first: are you using an 
unusual filesystem, or mounting the filesystem with unusual options?
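
One way to answer that on Linux is to look up the filesystem type and 
mount options for the mount that holds the mnesia directory. The sketch 
below is only a rough illustration: the helper names are made up, the 
path /var/lib/rabbitmq/mnesia is an assumed location that may not match 
your setup, and escaped characters in mount points (such as \040 for a 
space) are not handled.

```python
import os


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fs_type, options)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        entries.append((mount_point, fs_type, options.split(",")))
    return entries


def mount_for_path(path, entries):
    """Return the entry with the longest mount point containing path."""
    path = os.path.normpath(path)
    best = None
    for mount_point, fs_type, options in entries:
        mp = mount_point.rstrip("/") or "/"
        prefix = mp.rstrip("/") + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best[0]):
                best = (mp, fs_type, options)
    return best


# Example with made-up mount data; on a real box, read /proc/mounts instead:
#   with open("/proc/mounts") as f:
#       entries = parse_mounts(f.read())
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /var/lib xfs rw,nobarrier 0 0\n"
)
result = mount_for_path("/var/lib/rabbitmq/mnesia", parse_mounts(sample))
```

The filesystem type and the option list in the result are the two things 
worth reporting back to the list.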

Cheers, Simon

On 13/06/13 12:36, Lee Hambley wrote:
> Posting this to the list after some discussion on IRC with bob2351 on
> irc.freenode.net.
>
> We have a *slightly* strange situation with our RabbitMQ setup: we
> start it under `runit`, and it effectively believes that it's running
> in the foreground. I have anecdotal evidence that this causes other
> problems, but at least not anything that hurts too often (i.e. you lose
> "persistent messages" in this setup).
>
> That all aside, attached ( https://gist.github.com/leehambley/5773039 )
> is a stack trace from a problematic box. We couldn't get it to recover
> (single node, single replica, etc.), so we simply deleted the mnesia
> database, which worked well enough.
>
> Some information about our environment:
>
>     $ erl --version
>     Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8]
>     [async-threads:0] [kernel-poll:false]
>     $ dpkg --list | grep rabbit
>     ii  rabbitmq-server     3.0.4-1     AMQP server written in Erlang
>     $ sudo RABBITMQ_NODENAME=ourproject rabbitmqctl status
>     Status of node ourproject at carla ...
>     [{pid,8055},
>       {running_applications,
>           [{rabbitmq_management,"RabbitMQ Management Console","3.0.4"},
>            {rabbitmq_management_agent,"RabbitMQ Management Agent","3.0.4"},
>            {rabbit,"RabbitMQ","3.0.4"},
>            {os_mon,"CPO  CXC 138 46","2.2.7"},
>            {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.0.4"},
>            {webmachine,"webmachine","1.9.1-rmq3.0.4-git52e62bc"},
>            {mochiweb,"MochiMedia Web Server","2.3.1-rmq3.0.4-gitd541e9a"},
>            {xmerl,"XML parser","1.2.10"},
>            {inets,"INETS  CXC 138 49","5.7.1"},
>            {mnesia,"MNESIA  CXC 138 12","4.5"},
>            {amqp_client,"RabbitMQ AMQP Client","3.0.4"},
>            {sasl,"SASL  CXC 138 11","2.1.10"},
>            {stdlib,"ERTS  CXC 138 10","1.17.5"},
>            {kernel,"ERTS  CXC 138 10","2.14.5"}]},
>       {os,{unix,linux}},
>       {erlang_version,
>           "Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:8:8] [rq:8]
>     [async-threads:30] [kernel-poll:true]\n"},
>       {memory,
>           [{total,33984216},
>            {connection_procs,756760},
>            {queue_procs,325576},
>            {plugins,218728},
>            {other_proc,9518440},
>            {mnesia,93728},
>            {mgmt_db,148472},
>            {msg_index,71528},
>            {other_ets,1145600},
>            {binary,604208},
>            {code,17266925},
>            {atom,1550457},
>            {other_system,2283794}]},
>       {vm_memory_high_watermark,0.4},
>       {vm_memory_limit,6656894566},
>       {disk_free_limit,1000000000},
>       {disk_free,11247643770880},
>       {file_descriptors,
>           [{total_limit,924},
>            {total_used,23},
>            {sockets_limit,829},
>            {sockets_used,12}]},
>       {processes,[{limit,1048576},{used,345}]},
>       {run_queue,0},
>       {uptime,2692}]
>     ...done.
>
>
> I believe this bug is already being tracked internally, and I post the
> report here in the hope that I'll have a place to attach a snapshot of
> an mnesia database the next time this happens to us, or that someone
> else might find this report and be able to contribute. Finally,
> selfishly, in the hope that I'll get notified when this gets fixed, and
> I upgrade, and sleep at night again.
>
> - Lee Hambley
>
>
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>


-- 
Simon MacMullen
RabbitMQ, Pivotal

