[rabbitmq-discuss] RabbitMQ crashed in ets:insert_new - looks like a genuine bug...
Eugene Kirpichov
ekirpichov at gmail.com
Fri Aug 12 04:49:20 BST 2011
...I also found this error in the log of one of the nodes that didn't crash.
What does it mean?
=ERROR REPORT==== 11-Aug-2011::19:57:27 ===
connection <0.745.0>, channel 2 - error:
{amqp_error,internal_error,
"commit failed:
[{<7566.257.0>,{exit,{{noproc,{gen_server2,call,[msg_store_persistent,{client_terminate,<<64,136,153,249,138,79,147,63,147,97,52,189,50,189,255,66>>},infinity]}},{gen_server2,call,[<7566.257.0>,{commit,<<131,20,0,33,243,147,31,205,97,98,116,217,18,139,207,116>>,<0.19127.8>},infinity]}},[{gen_server2,call,3},{rabbit_misc,with_exit_handler,2},{delegate,safe_invoke,2},{delegate,'-safe_invoke/2-lc$^0/1-0-',2},{delegate,handle_call,3},{gen_server2,handle_msg,2},{proc_lib,wake_up,3}]}}]",
'tx.commit'}
On Thu, Aug 11, 2011 at 8:45 PM, Eugene Kirpichov <ekirpichov at gmail.com> wrote:
> I ran another test like this and looked a little more closely at the
> logs. This time just 1 of 4 nodes crashed and some new errors
> appeared.
> I'm attaching a slightly snipped version of the logs (all binaries and
> some overly repetitive parts removed).
>
> So:
> * The failure is not only in ets:insert_new; there is also one in ets:lookup
> * There are supervisor reports in sasl.log about
> reached_max_restart_intensity; they happen after a few similar
> child_terminated reports about rabbit_channels and amqp_queues
> * After these things, apparently msg_store_persistent crashes, and so
> everything crashes.
>
> Again, the failed rabbitmq node started successfully after a manual restart.
>
> (Folks, is this the right place to report this kind of thing? Is it
> OK to attach files of several hundred KB?)
>
> On Thu, Aug 11, 2011 at 6:23 PM, Eugene Kirpichov <ekirpichov at gmail.com> wrote:
>> A lot of clients (a thousand or more) were rapidly publishing 1 KB
>> messages to a queue, and then RabbitMQ crashed.
>>
>> In fact I had a cluster of 4 rabbits, and 2 of them crashed as a
>> result. The remaining 2 continued working ok.
>>
>> Here's a crash report from rabbit-sasl.log. I do not give the full log
>> because it's large, contains message data (which my employer might not
>> like) and I'm too lazy to automatically snip it.
>> But the log is full of entries exactly like the one I show. This
>> exact message is repeated many times within the same second, and then
>> RabbitMQ finally crashed.
>>
>> What other information can I provide to help resolve this? Could this
>> be an error on my part rather than RabbitMQ's? Having a sudden RabbitMQ
>> crash is not really what I'd like to have in production :-|
>>
>> =CRASH REPORT==== 11-Aug-2011::17:56:19 ===
>> crasher:
>> initial call: gen:init_it/6
>> pid: <0.16624.0>
>> registered_name: []
>> exception exit: {badarg,
>> [{ets,insert_new,
>> [303172,
>> {<<223,221,16,201,23,190,196,251,169,11,157,145,
>> 94,36,1,105>>,
>> {basic_message,
>> {resource,<<"/">>,exchange,<<>>},
>> [<<"results-8808E5FBBC714C9E880F9FD30F443151.TestApp.rmq002">>],
>> {content,60,none,
>> <<....>>, % (snipped)
>> rabbit_framing_amqp_0_9_1,
>> [<<....>>]}, % (snipped too)
>> <<223,221,16,201,23,190,196,251,169,11,157,
>> 145,94,36,1,105>>,
>> true},
>> 1}]},
>> {rabbit_msg_store,update_msg_cache,3},
>> {rabbit_msg_store,write,3},
>> {rabbit_variable_queue,
>> '-with_immutable_msg_store_state/3-fun-0-',2},
>> {rabbit_variable_queue,with_msg_store_state,3},
>> {rabbit_variable_queue,
>> with_immutable_msg_store_state,3},
>> {rabbit_variable_queue,maybe_write_msg_to_disk,3},
>> {rabbit_variable_queue,maybe_write_to_disk,4}]}
>> in function gen_server2:terminate/3
>> ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.137.0>]
>> messages: [{'$gen_cast',{ack,none,[46689],<0.16623.0>}},
>> {'$gen_cast',{ack,none,[46690],<0.16623.0>}}]
>> links: [<0.263.0>]
>> dictionary: [{fhc_age_tree,{0,nil}},
>> {{ch,<0.16623.0>},
>> {cr,1,<0.16623.0>,<0.16628.0>,#Ref<0.0.0.16807>,
>> {set,2,16,16,8,80,48,
>> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
>> {{[],[],[],[],[],[],[],[],
>> [46690],
>> [],[],[],[],
>> [46689],
>> [],[]}}},
>> false,none,0}},
>> {guid,{{7,<0.16624.0>},1}}]
>> trap_exit: true
>> status: running
>> heap_size: 1682835
>> stack_size: 24
>> reductions: 1260360700
>> neighbours:
>>
>>
>> --
>> Eugene Kirpichov
>> Principal Engineer, Mirantis Inc. http://www.mirantis.com/
>> Editor, http://fprog.ru/
>>
--
Eugene Kirpichov
Principal Engineer, Mirantis Inc. http://www.mirantis.com/
Editor, http://fprog.ru/