[rabbitmq-discuss] RabbitMQ (branch bug21673) running out of file descriptors
Andrey Smirnov
smirnov.andrey at gmail.com
Thu Feb 18 08:02:42 GMT 2010
Forgot to add: at the time of the crash we had more than 10k queues...
(That looks like more than the file descriptor limit.)
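
To make that comparison concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each queue needs one file handle (the real per-queue usage in this branch is not known to me), and the names handles_short and nofile_limits as well as the 10500 queue count are hypothetical. The 10190 figure is the one from the "Limiting to approx 10190 file handles" log line.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def handles_short(queue_count, handle_limit, handles_per_queue=1):
    """Return how many handles are missing, or 0 if the limit suffices.

    handles_per_queue=1 is an assumption made for illustration only.
    """
    needed = queue_count * handles_per_queue
    return max(0, needed - handle_limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    # 10500 stands in for "more than 10k queues"; it is not a measured value.
    print("handles short:", handles_short(10500, 10190))
```

Under that one-handle-per-queue assumption, any queue count above the handle limit leaves a shortfall, which would be consistent with the emfile errors in the log.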
2010/2/18 Andrey Smirnov <smirnov.andrey at gmail.com>:
> Hello!
>
> We're running RabbitMQ from branch bug21673 (built yesterday). We have
> a lot of non-persistent queues (2k - 10k+) that are created and destroyed
> all the time. Some queues have unconsumed messages, and they get
> destroyed with those messages still unconsumed.
>
> When the file descriptor limit for RabbitMQ was left at the default of
> 1024 files, it died within 2 hours. I've raised the limit to 10240 files,
> and it worked for 16 hours and then crashed.
>
> To clients, the crash looks like an "INTERNAL_ERROR" error on connections.
> RabbitMQ is still running and eating 100% of CPU. Restarting RabbitMQ
> revives it.
>
> Adding a log file excerpt (the log is huge, it contains a dump of all
> internal state; I have it if anyone is interested):
>
> =INFO REPORT==== 16-Feb-2010::04:42:05 ===
> Limiting to approx 10190 file handles
>
> =INFO REPORT==== 16-Feb-2010::04:42:05 ===
> Memory limit set to 3196MB.
>
> =INFO REPORT==== 16-Feb-2010::04:42:05 ===
> Using rabbit_msg_store_ets_index to provide index for message store
>
> =INFO REPORT==== 16-Feb-2010::04:42:05 ===
> started TCP Listener on 0.0.0.0:5672
>
> =INFO REPORT==== 16-Feb-2010::04:42:26 ===
> accepted TCP connection on 0.0.0.0:5672 from <redacted>:35525
>
> .....
>
> =ERROR REPORT==== 17-Feb-2010::17:32:16 ===
> ** Generic server <0.7466.264> terminating
> ** Last message in was {'$gen_cast',
> {method,
> {'queue.declare',0,
>
> <<"qik.item.session.83c5e30a-180e-4f2a-b59d-c6bd7938faac">>,
> false,false,false,false,false,[]},
> none}}
> ** When Server state == {ch,running,3,<0.1622.0>,<0.7464.264>,undefined,none,
> {set,0,16,16,8,80,48,
> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],
> [],[]},
> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
> [],[]}}},
> 1,
> {[],[]},
> {[],[]},
> <<"guest">>,<<"localhost">>,
>
> <<"qik.item.session.6c6a3c8e-38d7-4c69-bb1c-26ae3cc00d67">>,
> {dict,0,16,16,8,80,48,
> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],
> [],[]},
> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
> [],[]}}}}
> ** Reason for termination ==
> ** {{badmatch,{error,{{badmatch,{error,emfile}},
> [{rabbit_queue_index,get_journal_handle,1},
> {rabbit_queue_index,load_journal,1},
> {rabbit_queue_index,init,1},
> {rabbit_variable_queue,init,1},
> {rabbit_amqqueue_process,init,1},
> {gen_server2,init_it,6},
> {proc_lib,init_p_do_apply,3}]}}},
> [{rabbit_amqqueue,start_queue_process,1},
> {rabbit_amqqueue,declare,4},
> {rabbit_channel,handle_method,3},
> {rabbit_channel,handle_cast,2},
> {gen_server2,handle_msg,7},
> {proc_lib,wake_up,3}]}
>
> =ERROR REPORT==== 17-Feb-2010::17:32:16 ===
> connection <0.1622.0> (running), channel 3 - error:
> {{badmatch,{error,{{badmatch,{error,emfile}},
> [{rabbit_queue_index,get_journal_handle,1},
> {rabbit_queue_index,load_journal,1},
> {rabbit_queue_index,init,1},
> {rabbit_variable_queue,init,1},
> {rabbit_amqqueue_process,init,1},
> {gen_server2,init_it,6},
>
> ....
>
> =ERROR REPORT==== 17-Feb-2010::17:33:08 ===
> Mnesia(rabbit@SM2): ** ERROR ** (core dumped to file:
> "/usr/local/sbin/MnesiaCore.rabbit@SM2_1266_456788_594733")
> ** FATAL ** Cannot open log file
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL":
> {file_error,
>
>
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL",
>
> emfile}
>
> =ERROR REPORT==== 17-Feb-2010::17:33:08 ===
> ** gen_event handler rabbit_error_logger crashed.
> ** Was installed in error_logger
> ** Last event was: {error,<0.58.0>,
> {<0.61.0>,
> "Mnesia(~p): ** ERROR ** (core dumped to file:
> ~p)~n ** FATAL ** Cannot open log file ~p: ~p~n",
> [rabbit@SM2,
>
> "/usr/local/sbin/MnesiaCore.rabbit@SM2_1266_456788_594733",
>
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL",
> {file_error,
>
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL",
> emfile}]}}
> ** When handler state == {resource,<<"/">>,exchange,<<"amq.rabbitmq.log">>}
> ** Reason == {{function_clause,
> [{gen,call,
> [{rex,{error,{node_not_running,rabbit@SM2}}},
> '$gen_call',
> {call,mnesia_lib,db_select_cont,
> [ram_copies,
> {rabbit_route,
> {binding,
> {resource,<<"localhost">>,exchange,
> <<"qik.exc.session">>},
>
> <<"qik.item.session.22ff9df4-c115-4cdf-a721-27f99eba9aaa">>,
> {resource,<<"localhost">>,queue,
>
> <<"qik.item.session.22ff9df4-c115-4cdf-a721-27f99eba9aaa">>},
> []},
> [],100,<<>>,[],0,0},
> [{'$1',[],['$1']}]],
> <0.0.0>},
> infinity]},
> {gen_server,call,3},
> {rpc,do_call,3},
> {mnesia,do_dirty_rpc,5},
> {mnesia,select_cont,3},
> {mnesia,'-qlc_select/1-fun-0-',1},
> {rabbit_exchange,'-match_bindings/2-fun-1-',10},
> {rabbit_variable_queue,flush_journal,1},
> {rabbit_amqqueue_process,handle_pre_hibernate,1}]}
>
> =ERROR REPORT==== 17-Feb-2010::17:33:08 ===
> Mnesia(rabbit@SM2): ** ERROR ** (core dumped to file:
> "/usr/local/sbin/MnesiaCore.rabbit@SM2_1266_456788_594733")
> ** FATAL ** Cannot open log file
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL":
> {file_error,
>
>
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL",
>
> emfile}
>
> =ERROR REPORT==== 17-Feb-2010::17:33:08 ===
> ** gen_event handler rabbit_error_logger crashed.
> ** Was installed in error_logger
> ** Last event was: {error,<0.58.0>,
> {<0.61.0>,
> "Mnesia(~p): ** ERROR ** (core dumped to file:
> ~p)~n ** FATAL ** Cannot open log file ~p: ~p~n",
> [rabbit@SM2,
>
> "/usr/local/sbin/MnesiaCore.rabbit@SM2_1266_456788_594733",
>
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL",
> {file_error,
>
> "/var/lib/rabbitmq/mnesia/rabbit/rabbit_durable_route.DCL",
> emfile}]}}
>
> ** When handler state == {resource,<<"/">>,exchange,<<"amq.rabbitmq.log">>}
> ** Reason == {{function_clause,
> [{gen,call,
> [{rex,{error,{node_not_running,rabbit@SM2}}},
> '$gen_call',
> {call,mnesia_lib,db_select_cont,
> [ram_copies,
> {rabbit_route,
> {binding,
> {resource,<<"localhost">>,exchange,
> <<"qik.exc.session">>},
>
> <<"qik.item.session.22ff9df4-c115-4cdf-a721-27f99eba9aaa">>,
> {resource,<<"localhost">>,queue,
>
> <<"qik.item.session.22ff9df4-c115-4cdf-a721-27f99eba9aaa">>},
> []},
> [],100,<<>>,[],0,0},
> [{'$1',[],['$1']}]],
> <0.0.0>},
> infinity]},
> {gen_server,call,3},
> {rpc,do_call,3},
> {mnesia,do_dirty_rpc,5},
> {mnesia,select_cont,3},
> {mnesia,'-qlc_select/1-fun-0-',1},
> {rabbit_exchange,'-match_bindings/2-fun-1-',10},
> {qlc,collect,1}]},
> {gen_server,call,
> [{rex,{error,{node_not_running,rabbit@SM2}}},
> {call,mnesia_lib,db_select_cont,
> [ram_copies,
> {rabbit_route,
>
> .....
>
>
> ** Reason for termination ==
> ** {{badmatch,{error,emfile}},
> [{rabbit_queue_index,get_segment_handle,1},
> {rabbit_queue_index,append_journal_to_segment,2},
> {rabbit_queue_index,'-flush_journal/1-fun-0-',3},
> {lists,foldl,3},
> {rabbit_queue_index,segment_fold,3},
> {rabbit_queue_index,flush_journal,1},
> {rabbit_variable_queue,flush_journal,1},
> {rabbit_amqqueue_process,handle_pre_hibernate,1}]}
>
>
>
> --
> Andrey Smirnov
>
--
Andrey Smirnov,
phone. +7 (905) 769-83-20