[rabbitmq-discuss] RabbitMQ crashed in ets:insert_new - looks like a genuine bug...

Matthew Sackman matthew at rabbitmq.com
Fri Aug 12 16:47:26 BST 2011


Hi Eugene,

On Fri, Aug 12, 2011 at 08:38:18AM -0700, Eugene Kirpichov wrote:
> Thanks a lot for taking the time for investigation.

No problem. I rather enjoy trying to track down such bugs.

> Do you refer to reproducing the INTERNAL_ERROR bug in tx.commit (which
> only happened once and didn't cause a node crash), or to the bug that
> was causing the node crash (and happened on a different node of the
> cluster)?

Well, a msg_store on a node crashed when one of the queues using it
was deleted. That crash then took out every other queue using the same
msg_store, and the loss of those queues would have caused all in-flight
tx.commits to abort with an INTERNAL_ERROR.
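
As a rough illustration of that chain of events, here is a toy model in
Python. It is not RabbitMQ's code: the class and function names
(MsgStore, Queue, delete_queue, tx_commit) and the crash_on_delete flag
are all made up for the sketch, and the crash is simply forced rather
than reproducing the real cause.

```python
class MsgStore:
    """Toy stand-in for a message store shared by several queues."""

    def __init__(self):
        self.alive = True
        self.queues = []

    def crash(self):
        # When the shared store dies, every queue still using it dies too.
        self.alive = False
        for queue in self.queues:
            queue.alive = False


class Queue:
    """Toy queue that registers itself with a shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.alive = True
        store.queues.append(self)


def delete_queue(queue, crash_on_delete=False):
    # Deleting a queue detaches it from the store. The hypothetical
    # crash_on_delete flag forces the store to crash during the delete.
    queue.alive = False
    queue.store.queues.remove(queue)
    if crash_on_delete:
        queue.store.crash()


def tx_commit(queue):
    # A commit against a dead queue aborts; the string mirrors the error
    # name reported in this thread.
    return "ok" if queue.alive else "INTERNAL_ERROR"


store = MsgStore()
a, b, c = (Queue(name, store) for name in "abc")

# Deleting one queue crashes the store, which takes out the other two.
delete_queue(a, crash_on_delete=True)
results = [tx_commit(b), tx_commit(c)]
```

Here only queue a is deleted, yet commits on b and c both abort, which
is the same pattern as a single crash showing up as errors elsewhere.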

So I strongly suspect the two are one and the same thing.

> By the way, I ran the stress test with even more stress on the cluster
> several times afterwards and wasn't able to cause it to crash again,
> though before that 2 of 2 tests crashed. So I was "lucky" in a sense.

All the best bugs have this property :D

Matthew

