<div dir="ltr"><div>Karl,<br><br>Wild speculation would be that messages have been consumed from a queue but that queue (name = document_queue?) is for some reason missing in action (gone, inaccessible, ...) when the consumer subsequently tries to ack the message(s). <br>
<br></div>Brett<br><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, May 18, 2013 at 3:20 PM, Karl Rieb <span dir="ltr"><<a href="mailto:karl.rieb@gmail.com" target="_blank">karl.rieb@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<div><br></div><div>I am running a RabbitMQ cluster of 4 nodes in AWS EC2 on <b>c1.xlarge </b>instances using Ubuntu 12.04 LTS (kernel 3.2.0-38). Periodically two of my nodes holding the largest queues will crash and require me to restart the rabbitmq-server service. Looking at the logs, I'm at a loss as to why the crash is occurring. I've have tried reproducing the crash manually by flooding my nodes with messages, but that doesn't seem to trigger the issue. Below is output from various RabbitMQ logs:</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"># cat /var/log/rabbitmq/rabbit@ip-10-123-123-123-sasl.log<br>
=CRASH REPORT==== 18-May-2013::01:41:51 ===<br> crasher:<br> initial call: gen:init_it/6<br> pid: <0.301.0><br> registered_name: []<br> exception exit: {{case_clause,{empty,{[],[]}}},<br> [{rabbit_amqqueue_process,subtract_acks,3,[]},<br>
{rabbit_amqqueue_process,subtract_acks,4,[]},<br> {rabbit_amqqueue_process,handle_cast,2,[]},<br> {gen_server2,handle_msg,2,[]},<br> {proc_lib,wake_up,3,<br>
[{file,"proc_lib.erl"},{line,249}]}]}<br> in function gen_server2:terminate/3<br> ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.207.0>]<br> messages: [{'$gen_cast',{resume,<0.676.0>}},<br>
{'$gen_cast',{ack,[27583,27606,27629,27652,27675,27698,<br> 27721,27744,27767,27790,27813,27836,<br> 27859,27882,27905,27928,27951,27974,<br>
27997,28020,28043,28066,28089,28112,<br> 28135,28158,28179,28200,28221,28242,<br> 28263,28284,28305,28324,28342,28360,<br>
28378,28396,28414,28432,28450,28468,<br> 28486,28504,28522,28540,28558,28576,<br> 28594,28612],<br> <0.737.0>}},<br>
{'$gen_cast',{resume,<0.737.0>}},<br> {'$gen_cast',{ack,[27608,27631,27654,27677,27700,27723,<br> 27746,27769,27792,27815,27838,27861,<br>
27884,27907,27930,27953,27976,27999,<br> 28022,28045,28068,28091,28114,28137,<br> 28160,28181,28202,28223,28244,28265,<br>
28286,28307,28326,28344,28362,28380,<br> 28398,28416,28434,28452,28470,28488,<br> 28506,28524,28542,28560,28578,28596,<br>
28614],<br> <0.676.0>}},<br> {'$gen_cast',{ack,[27578,27601,27624,27647,27670,27693,<br> 27716,27739,27762,27785,27808,27831,<br>
27854,27877,27900,27923,27946,27969,<br> 27992,28015,28038,28061,28084,28107,<br> 28130,28153,28174,28195,28216,28237,<br>
28258,28279,28300,28321,28340,28358,<br> 28376,28394,28412,28430,28448,28466,<br> 28484,28502,28520,28538,28556,28574,<br>
28592,28610],<br> <0.716.0>}},<br> {'$gen_cast',{resume,<0.716.0>}},<br> {'$gen_cast',{ack,[27582,27605,27628,27651,27674,27697,<br>
27720,27743,27766,27789,27812,27835,<br> 27858,27881,27904,27927,27950,27973,<br> 27996,28019,28042,28065,28088,28111,<br>
28134,28157,28178,28199,28220,28241,<br> 28262,28283,28304,28323,28341,28359,<br> 28377,28395,28413,28431,28449,28467,<br>
28485,28503,28521,28539,28557,28575,<br> 28593,28611],<br> <0.737.0>}}]<br> links: [<0.300.0>]<br>
dictionary: [{{ch,<0.729.0>},<br> {cr,<0.729.0>,#Ref<0.0.0.12296>,<br> {[28715,28696,28677,28658,28639,28620,28602,28584,<br> 28566,28548,28530,28512,28494,28476,28458,28440,<br>
28422,28404,28386,28368,28350,28332,28313,28292,<br> 28271,28250,28229,28208,28187,28166,28143,28120,<br> 28097,28074,28051,28028,28005,27982,27959,27936,<br>
27913,27890,27867,27844,27821,27798,27775,27752,<br> 27729,27706,27683,27660,27637,27614],<br> [27591]},<br> 1,<br> {[],[]},<br>
{qstate,<0.728.0>,active,{0,nil}},<br> 11}},<br> {{credit_to,<7350.877.0>},22},<br> {{ch,<0.608.0>},<br> {cr,<0.608.0>,#Ref<0.0.0.9041>,<br>
{[28708,28689,28670,28651,28632,28607,28589,28571,<br> 28553,28535,28517,28499,28481,28463,28445,28427,<br> 28409,28391,28373,28355,28337,28318,28297,28276,<br>
28255,28234,28213,28192,28171,28150,28127,28104,<br> 28081,28058,28035,28012,27989,27966,27943,27920,<br> 27897,27874,27851,27828,27805,27782,27759,27736,<br>
27713,27690,27667,27644,27621,27598],<br> [27575]},<br> 1,<br> {[],[]},<br> {qstate,<0.607.0>,active,{0,nil}},<br>
38}},<br> {{ch,<0.624.0>},<br> {cr,<0.624.0>,#Ref<0.0.0.9436>,<br> {[28722,28703,28684,28665,28646],[28627]},<br> 1,<br>
{[],[]},<br> {qstate,<0.623.0>,active,{0,nil}},<br> 36}},<br> {{ch,<0.696.0>},<br> {cr,<0.696.0>,#Ref<0.0.0.11414>,<br>
{[],[]},<br> 2,<br> {[],[]},<br> {qstate,<0.695.0>,active,{0,nil}},<br> 24}},<br> {{ch,<0.564.0>},<br>
{cr,<0.564.0>,#Ref<0.0.0.7285>,<br> {[28718,28699,28680,28661,28642],[28623]},<br> 1,<br> {[],[]},<br> {qstate,<0.563.0>,active,{0,nil}},<br>
12}},<br> {{ch,<0.644.0>},<br> {cr,<0.644.0>,#Ref<0.0.0.10071>,<br> {[28713,28694,28675,28656,28637,28618,28600,28582,<br> 28564,28546,28528,28510,28492,28474,28456,28438,<br>
28420,28402,28384,28366,28348,28330,28311,28290,<br> 28269,28248,28227,28206,28185,28164,28141,28118,<br> 28095,28072,28049,28026,28003,27980,27957,27934,<br>
27911,27888,27865,27842,27819,27796,27773,27750,<br> 27727,27704,27681,27658,27635,27612],<br> [27589]},<br> 1,<br> {[],[]},<br>
{qstate,<0.643.0>,active,{0,nil}},<br> 11}},<br> {{ch,<0.676.0>},<br> {cr,<0.676.0>,#Ref<0.0.0.11309>,<br> {[28614,28613,28596,28595,28578,28577,28560,28559,<br>
28542,28541,28524,28523,28506,28505,28488,28487,<br> 28470,28469,28452,28451,28434,28433,28416,28415,<br> 28398,28397,28380,28379,28362,28361,28344,28343,<br>
28326,28325,28307,28306,28286,28285,28265,28264,<br> 28244,28243,28223,28222,28202,28201,28181,28180,<br> 28160,28159,28137,28136,28114,28113,28091,28090,<br>
28068,28067,28045,28044,28022,28021,27999,27998,<br> 27976,27975,27953,27952,27930,27929,27907,27906,<br> 27884,27883,27861,27860,27838,27837,27815,27814,<br>
27792,27791,27769,27768,27746,27745,27723,27722,<br> 27700,27699,27677,27676,27654,27653,27631,27630,<br> 27608,27607,27585],<br> [27584]},<br>
2,<br> {[{<0.676.0>,<br> {consumer,<<"amq.ctag-0cvDGmyaAbEW_j15AfhGHQ">>,<br> true}}],<br>
[{<0.676.0>,<br> {consumer,<<"amq.ctag-YqU04SIfQv6P-QVueEjbRg">>,<br> true}}]},<br> {qstate,<0.675.0>,suspended,{0,nil}},<br>
25}},<br> {{xtype_to_module,direct},rabbit_exchange_type_direct},<br> {{credit_to,<7350.833.0>},31},<br> {{ch,<0.745.0>},<br> {cr,<0.745.0>,#Ref<0.0.0.12674>,<br>
{[28716,28697,28678,28659,28640,28621,28603,28585,<br> 28567,28549,28531,28513,28495,28477,28459,28441,<br> 28423,28405,28387,28369,28351,28333,28314,28293,<br>
28272,28251,28230,28209,28188,28167,28144,28121,<br> 28098,28075,28052,28029,28006,27983,27960,27937,<br> 27914,27891,27868,27845,27822,27799,27776,27753,<br>
27730,27707,27684,27661,27638,27615],<br> [27592]},<br> 1,<br> {[],[]},<br> {qstate,<0.744.0>,active,{0,nil}},<br>
37}},<br> {{ch,<0.660.0>},<br> {cr,<0.660.0>,#Ref<0.0.0.10650>,<br> {[28712,28693,28674,28655,28636,28617,28599,28581,<br> 28563,28545,28527,28509,28491,28473,28455,28437,<br>
28419,28401,28383,28365,28347,28329,28310,28289,<br> 28268,28247,28226,28205,28184,28163,28140,28117,<br> 28094,28071,28048,28025,28002,27979,27956,27933,<br>
27910,27887,27864,27841,27818,27795,27772,27749,<br> 27726,27703,27680,27657,27634,27611],<br> [27588]},<br> 1,<br> {[],[]},<br>
{qstate,<0.659.0>,active,{0,nil}},<br> 11}},<br> {{credit_to,<7350.837.0>},32},<br> {{credit_to,<7350.881.0>},17},<br> {{ch,<0.737.0>},<br>
{cr,<0.737.0>,#Ref<0.0.0.12429>,<br> {[28612,28611,28594,28593,28576,28575,28558,28557,<br> 28540,28539,28522,28521,28504,28503,28486,28485,<br>
28468,28467,28450,28449,28432,28431,28414,28413,<br> 28396,28395,28378,28377,28360,28359,28342,28341,<br> 28324,28323,28305,28304,28284,28283,28263,28262,<br>
28242,28241,28221,28220,28200,28199,28179,28178,<br> 28158,28157,28135,28134,28112,28111,28089,28088,<br> 28066,28065,28043,28042,28020,28019,27997,27996,<br>
27974,27973,27951,27950,27928,27927,27905,27904,<br> 27882,27881,27859,27858,27836,27835,27813,27812,<br> 27790,27789,27767,27766,27744,27743,27721,27720,<br>
27698,27697,27675,27674,27652,27651,27629,27628,<br> 27606,27605,27583],<br> [27582]},<br> 2,<br> {[{<0.737.0>,<br>
{consumer,<<"amq.ctag-ITZqhulzHhUzx1uy1Eot3g">>,<br> true}}],<br> [{<0.737.0>,<br> {consumer,<<"amq.ctag-hpo5ejdoJRdLrT6JS42jGw">>,<br>
true}}]},<br> {qstate,<0.736.0>,suspended,{0,nil}},<br> 11}},<br> {credit_blocked,[]},<br> {{credit_to,<7350.854.0>},3},<br>
{{ch,<0.716.0>},<br> {cr,<0.716.0>,#Ref<0.0.0.11820>,<br> {[28610,28609,28592,28591,28574,28573,28556,28555,<br> 28538,28537,28520,28519,28502,28501,28484,28483,<br>
28466,28465,28448,28447,28430,28429,28412,28411,<br> 28394,28393,28376,28375,28358,28357,28340,28339,<br> 28321,28320,28300,28299,28279,28278,28258,28257,<br>
28237,28236,28216,28215,28195,28194,28174,28173,<br> 28153,28152,28130,28129,28107,28106,28084,28083,<br> 28061,28060,28038,28037,28015,28014,27992,27991,<br>
27969,27968,27946,27945,27923,27922,27900,27899,<br> 27877,27876,27854,27853,27831,27830,27808,27807,<br> 27785,27784,27762,27761,27739,27738,27716,27715,<br>
27693,27692,27670,27669,27647,27646,27624,27623,<br> 27601,27600,27578],<br> [27577]},<br> 2,<br> {[{<0.716.0>,<br>
{consumer,<<"amq.ctag-BTlS-GmbXNpcU-wUlMI60Q">>,<br> true}}],<br> [{<0.716.0>,<br> {consumer,<<"amq.ctag-loc5monF8M1pz_Bb8ypumw">>,<br>
true}}]},<br> {qstate,<0.715.0>,suspended,{0,nil}},<br> 44}},<br> {{ch,<0.782.0>},<br> {cr,<0.782.0>,#Ref<0.0.0.15649>,<br>
{[28714,28695,28676,28657,28638,28619,28601,28583,<br> 28565,28547,28529,28511,28493,28475,28457,28439,<br> 28421,28403,28385,28367,28349,28331,28312,28291,<br>
28270,28249,28228,28207,28186,28165,28142,28119,<br> 28096,28073,28050,28027,28004,27981,27958,27935,<br> 27912,27889,27866,27843,27820,27797,27774,27751,<br>
27728,27705,27682,27659,27636,27613],<br> [27590]},<br> 1,<br> {[],[]},<br> {qstate,<0.781.0>,active,{0,nil}},<br>
35}},<br> {{credit_to,<7350.846.0>},22},<br> {{ch,<0.684.0>},<br> {cr,<0.684.0>,#Ref<0.0.0.11198>,<br> {[28711,28692,28673,28654,28635,28616,28598,28580,<br>
28562,28544,28526,28508,28490,28472,28454,28436,<br> 28418,28400,28382,28364,28346,28328,28309,28288,<br> 28267,28246,28225,28204,28183,28162,28139,28116,<br>
28093,28070,28047,28024,28001,27978,27955,27932,<br> 27909,27886,27863,27840,27817,27794,27771,27748,<br> 27725,27702,27679,27656,27633,27610],<br> [27587]},<br>
1,<br> {[],[]},<br> {qstate,<0.683.0>,active,{0,nil}},<br> 11}},<br> {{ch,<0.692.0>},<br> {cr,<0.692.0>,#Ref<0.0.0.11088>,<br>
{[28710,28691,28672,28653,28634,28615,28597,28579,<br> 28561,28543,28525,28507,28489,28471,28453,28435,<br> 28417,28399,28381,28363,28345,28327,28308,28287,<br>
28266,28245,28224,28203,28182,28161,28138,28115,<br> 28092,28069,28046,28023,28000,27977,27954,27931,<br> 27908,27885,27862,27839,27816,27793,27770,27747,<br>
27724,27701,27678,27655,27632,27609],<br> [27586]},<br> 1,<br> {[],[]},<br> {qstate,<0.691.0>,active,{0,nil}},<br>
12}},<br> {{ch,<0.648.0>},<br> {cr,<0.648.0>,#Ref<0.0.0.10283>,<br> {[28717,28698,28679,28660,28641],[28622]},<br> 1,<br>
{[],[]},<br> {qstate,<0.647.0>,active,{0,nil}},<br> 12}},<br> {guid,{{<a href="tel:2296174761" value="+12296174761" target="_blank">2296174761</a>,591959305,1562876388,1822892473},1}},<br>
{{ch,<0.720.0>},<br> {cr,<0.720.0>,#Ref<0.0.0.12036>,<br> {[28723,28704,28685,28666,28647],[28628]},<br> 1,<br> {[],[]},<br>
{qstate,<0.719.0>,active,{0,nil}},<br> 37}},<br> {{credit_to,<7350.869.0>},39},<br> {{ch,<0.636.0>},<br> {cr,<0.636.0>,#Ref<0.0.0.9960>,<br>
{[28705,28686,28667,28648,28629,28604,28586,28568,<br> 28550,28532,28514,28496,28478,28460,28442,28424,<br> 28406,28388,28370,28352,28334,28315,28294,28273,<br>
28252,28231,28210,28189,28168,28147,28124,28101,<br> 28078,28055,28032,28009,27986,27963,27940,27917,<br> 27894,27871,27848,27825,27802,27779,27756,27733,<br>
27710,27687,27664,27641,27618,27595],<br> [27572]},<br> 1,<br> {[],[]},<br> {qstate,<0.635.0>,active,{0,nil}},<br>
15}},<br> {{credit_to,<7350.850.0>},50},<br> {{ch,<0.616.0>},<br> {cr,<0.616.0>,#Ref<0.0.0.9300>,<br> {[28719,28700,28681,28662,28643],[28624]},<br>
1,<br> {[],[]},<br> {qstate,<0.615.0>,active,{0,nil}},<br> 12}},<br> {{ch,<0.509.0>},<br> {cr,<0.509.0>,#Ref<0.0.0.2243>,<br>
{[28302,28281,28260,28239,28218,28197,28176,28155,<br> 28132,28109,28086,28063,28040,28017,27994,27971],<br> [27580,27603,27626,27649,27672,27695,27718,27741,<br>
27764,27787,27810,27833,27856,27879,27902,27925,<br> 27948]},<br> 3,<br> {[],[]},<br> {qstate,<0.508.0>,active,{0,nil}},<br>
36}},<br> {{credit_to,<7350.873.0>},17},<br> {{ch,<0.588.0>},<br> {cr,<0.588.0>,#Ref<0.0.0.7808>,<br> {[28721,28702,28683,28664,28645],[28626]},<br>
1,<br> {[],[]},<br> {qstate,<0.587.0>,active,{0,nil}},<br> 12}},<br> {{ch,<0.664.0>},<br> {cr,<0.664.0>,#Ref<0.0.0.10772>,<br>
{[28720,28701,28682,28663,28644],[28625]},<br> 1,<br> {[],[]},<br> {qstate,<0.663.0>,active,{0,nil}},<br> 13}},<br>
{fhc_age_tree,{0,nil}},<br> {{ch,<0.708.0>},<br> {cr,<0.708.0>,#Ref<0.0.0.11695>,<br> {[28707,28688,28669,28650,28631,28606,28588,28570,<br>
28552,28534,28516,28498,28480,28462,28444,28426,<br> 28408,28390,28372,28354,28336,28317,28296,28275,<br> 28254,28233,28212,28191,28170,28149,28126,28103,<br>
28080,28057,28034,28011,27988,27965,27942,27919,<br> 27896,27873,27850,27827,27804,27781,27758,27735,<br> 27712,27689,27666,27643,27620,27597],<br> [27574]},<br>
1,<br> {[],[]},<br> {qstate,<0.707.0>,active,{0,nil}},<br> 35}},<br> {{credit_to,<7350.829.0>},13},<br>
{{ch,<0.757.0>},<br> {cr,<0.757.0>,#Ref<0.0.0.13150>,<br> {[28706,28687,28668,28649,28630,28605,28587,28569,<br> 28551,28533,28515,28497,28479,28461,28443,28425,<br>
28407,28389,28371,28353,28335,28316,28295,28274,<br> 28253,28232,28211,28190,28169,28148,28125,28102,<br> 28079,28056,28033,28010,27987,27964,27941,27918,<br>
27895,27872,27849,27826,27803,27780,27757,27734,<br> 27711,27688,27665,27642,27619,27596],<br> [27573]},<br> 1,<br> {[],[]},<br>
{qstate,<0.756.0>,active,{0,nil}},<br> 49}},<br> {{ch,<0.572.0>},<br> {cr,<0.572.0>,#Ref<0.0.0.7536>,<br> {[28709,28690,28671,28652,28633,28608,28590,28572,<br>
28554,28536,28518,28500,28482,28464,28446,28428,<br> 28410,28392,28374,28356,28338,28319,28298,28277,<br> 28256,28235,28214,28193,28172,28151,28128,28105,<br>
28082,28059,28036,28013,27990,27967,27944,27921,<br> 27898,27875,27852,27829,27806,27783,27760,27737,<br> 27714,27691,27668,27645,27622,27599],<br> [27576]},<br>
1,<br> {[],[]},<br> {qstate,<0.571.0>,active,{0,nil}},<br> 35}},<br> {{credit_to,<7350.825.0>},2},<br> {{credit_to,<7350.858.0>},3}]<br>
trap_exit: true<br> status: running<br> heap_size: 121536<br> stack_size: 27<br> reductions: 9363545<br> neighbours:<br><br>=SUPERVISOR REPORT==== 18-May-2013::01:41:51 ===<br> Supervisor: {local,rabbit_amqqueue_sup}<br>
Context: child_terminated<br> Reason: {{case_clause,{empty,{[],[]}}},<br> [{rabbit_amqqueue_process,subtract_acks,3,[]},<br> {rabbit_amqqueue_process,subtract_acks,4,[]},<br>
{rabbit_amqqueue_process,handle_cast,2,[]},<br> {gen_server2,handle_msg,2,[]},<br> {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}<br> Offender: [{pid,<0.301.0>},<br>
{name,rabbit_amqqueue},<br> {mfa,<br> {rabbit_amqqueue_process,start_link,<br> [{amqqueue,<br> {resource,<<"/">>,queue,<br>
<<"document_queue">>},<br> true,false,none,[],<0.7622.0>,[],[],undefined,<br> []}]}},<br> {restart_type,temporary},<br>
{shutdown,4294967295},<br> {child_type,worker}]</blockquote><div><br></div><div>Notice how the <b>case_clause </b>is <b>empty</b>. The other rabbit log looks similar:</div><div><br></div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"># cat /var/log/rabbitmq/rabbit@ip-10-123-123-123.log </blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
=ERROR REPORT==== 18-May-2013::01:41:48 ===</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">** Generic server <0.301.0> terminating</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">** Last message in was {'$gen_cast',</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
{ack,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> [27585,27584,27607,27630,27653,27676,27699,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 27722,27745,27768,27791,27814,27837,27860,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 27883,27906,27929,27952,27975,27998,28021,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 28044,28067,28090,28113,28136,28159,28180,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 28201,28222,28243,28264,28285,28306,28325,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 28343,28361,28379,28397,28415,28433,28451,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 28469,28487,28505,28523,28541,28559,28577,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 28595,28613],</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> <0.676.0>}}</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">** When Server state == {q,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
{amqqueue,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {resource,<<"/">>,queue,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> <<"document_queue">>},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> true,false,none,[],<0.301.0>,[],[],undefined,[]},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> none,true,rabbit_variable_queue,</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {vqstate,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
{0,{[],[]}},</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {0,{[],[]}},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {delta,undefined,0,undefined},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {0,{[],[]}},</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
{0,{[],[]}}, </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><b><font color="#0000ff"> ... [ thousands of lines of state ] ... </font></b></blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
** Reason for termination ==</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">** {{case_clause,{empty,{[],[]}}},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> [{rabbit_amqqueue_process,subtract_acks,3,[]},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {rabbit_amqqueue_process,subtract_acks,4,[]},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {rabbit_amqqueue_process,handle_cast,2,[]},</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> {gen_server2,handle_msg,2,[]},</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">=ERROR REPORT==== 18-May-2013::01:41:51 ===</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">connection <0.430.0>, channel 22 - soft error:</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">{amqp_error,not_found,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
"no queue 'document_queue' in vhost '/'",</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
'queue.declare'}</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br></blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">=ERROR REPORT==== 18-May-2013::01:41:51 ===</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
connection <0.384.0>, channel 12 - soft error:</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
{amqp_error,not_found,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> "no queue 'document_queue' in vhost '/'",</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 'queue.declare'}</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">=ERROR REPORT==== 18-May-2013::01:41:51 ===</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">connection <0.384.0>, channel 15 - soft error:</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">{amqp_error,not_found,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
"no queue 'document_queue' in vhost '/'",</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
'queue.declare'}</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br></blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">=ERROR REPORT==== 18-May-2013::01:41:51 ===</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
connection <0.373.0>, channel 23 - soft error:</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
{amqp_error,not_found,</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> "no queue 'document_queue' in vhost '/'",</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> 'queue.declare'} </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<b><font color="#0000ff"> ... [ repeated a bunch of times ] ...</font></b></blockquote><div><br></div><div>I also see this in my <b>/var/log/syslog</b> with regards to <b>beam.smp</b>:</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
# cat /var/log/syslog<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176041] INFO: task beam.smp:18971 blocked for more than 120 seconds.<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176053] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.<br>
May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176060] beam.smp D ffff8801bfd93700 0 18971 18962 0x00000000<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176065] ffff8801b11cdcb8 0000000000000282 0000000000000000 ffffffffffffffe0<br>
May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176070] ffff8801b11cdfd8 ffff8801b11cdfd8 ffff8801b11cdfd8 0000000000013700<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176075] ffff8801b3180000 ffff8801b35244a0 00007f500e2079e0 ffff8801b146bb80<br>
May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176079] Call Trace:<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176089] [<ffffffff8165434f>] schedule+0x3f/0x60<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176094] [<ffffffff8106aef5>] exit_mm+0x85/0x130<br>
May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176097] [<ffffffff8106b10e>] do_exit+0x16e/0x450<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176102] [<ffffffff810797aa>] ? __dequeue_signal+0x6a/0xb0<br>
May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176105] [<ffffffff8106b594>] do_group_exit+0x44/0xa0<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176108] [<ffffffff8107c36c>] get_signal_to_deliver+0x21c/0x420<br>
May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176113] [<ffffffff81014825>] do_signal+0x45/0x130<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176116] [<ffffffff8105539d>] ? set_next_entity+0xad/0xd0<br>
May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176120] [<ffffffff81014ad5>] do_notify_resume+0x65/0x80<br>May 17 22:46:23 ip-10-123-123-123 kernel: [244800.176124] [<ffffffff8165ea90>] int_signal+0x12/0x17<br>
<b><font color="#0000ff">... [ repeated 9 more times with same timestamp ] ...</font></b></blockquote><div><br></div><div>There are no strange CPU spikes on my EC2 node based on AWS monitoring. The only thing that seems to stand out is that my writes to my EBS volume spike during the crash (~90 write OPs/sec, 4 MiB/sec write bandwidth, ~30ms/op write latency).</div>
<div><br></div><div>My team and I are completely stumped and would appreciate any insight you might have to the crash reports. Is this a problem with writes hanging when going against the EBS volume since EBS volumes are essentially network mounted? Is there a way to configure rabbit to be more tolerant of these delays? </div>
<div><br></div><div><br></div><div>For debugging purposes, below is the status output of one of the nodes when it is healthy (so you can see the versions of everything i'm running):<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
# rabbitmqctl status<br>Status of node 'rabbit@ip-10-123-123-123' ...<br>[{pid,28978},<br> {running_applications,<br> [{rabbitmq_management,"RabbitMQ Management Console","3.1.0"},<br> {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.1.0"},<br>
{webmachine,"webmachine","1.9.1-rmq3.1.0-git52e62bc"},<br> {mochiweb,"MochiMedia Web Server","2.3.1-rmq3.1.0-gitd541e9a"},<br> {rabbitmq_management_agent,"RabbitMQ Management Agent","3.1.0"},<br>
{rabbit,"RabbitMQ","3.1.0"},<br> {os_mon,"CPO CXC 138 46","2.2.11"},<br> {inets,"INETS CXC 138 49","5.9.4"},<br> {mnesia,"MNESIA CXC 138 12","4.8"},<br>
{amqp_client,"RabbitMQ AMQP Client","3.1.0"},<br> {xmerl,"XML parser","1.3.3"},<br> {sasl,"SASL CXC 138 11","2.3.1"},<br> {stdlib,"ERTS CXC 138 10","1.19.1"},<br>
{kernel,"ERTS CXC 138 10","2.16.1"}]},<br> {os,{unix,linux}},<br> {erlang_version,<br> "Erlang R16B (erts-5.10.1) [source-05f1189] [64-bit] [smp:8:8] [async-threads:30] [hipe] [kernel-poll:true]\n"},<br>
{memory,<br> [{total,60179048},<br> {connection_procs,5274808},<br> {queue_procs,<a href="tel:13107264" value="+13107264" target="_blank">13107264</a>},<br> {plugins,354272},<br> {other_proc,9741830},<br>
{mnesia,150456},<br> {mgmt_db,12248},<br> {msg_index,34768},<br> {other_ets,1148248},<br> {binary,4905584},<br> {code,19565863},<br> {atom,703377},<br> {other_system,5180330}]},<br>
{vm_memory_high_watermark,0.8},<br> {vm_memory_limit,5828434329},<br> {disk_free_limit,1000000000},<br> {disk_free,3953344512},<br> {file_descriptors,<br> [{total_limit,924},{total_used,6},{sockets_limit,829},{sockets_used,4}]},<br>
{processes,[{limit,1048576},{used,348}]},<br> {run_queue,0},<br> {uptime,3796}]<br>...done.</blockquote><div><br></div><div>Note that I am using Erlang v16 from the Erlang Solutions repo (esl-erlang package). I originally was using Erlang v14 from Ubuntu repos (erlang-base), but decided to upgrade hoping it would resolve the issue (which it did not). I have also tried running RabbitMQ 3.0.4 without success. This issue seems to affect both versions.</div>
<div><br></div><div>Thanks,</div><div>Karl</div><br>_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>
<br></blockquote></div><br></div>