No subject


Tue Apr 12 10:32:41 BST 2011


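Hi Alex, thanks for your email. That helped a lot.

To answer your question about the hang in the "starting exchange, queue
and binding recovery.." step on creating 100,000 durable queues and
restarting the broker,

> Is it really hung?  Is it using the CPU or disk at all at this time?  Is
> there anything in the logs (both the rabbit and SASL logs)?

The SASL log doesn't have anything. But the rabbit log has something.

I have attached the .log file for your reference.

It says that it is rebuilding the index from scratch, and that mnesia is
overloaded with write_threshold and then time_threshold.
I'm not very sure I understand what they really mean. :(

My /etc/rabbitmq/rabbitmq.config file entry is as follows:

[ {mnesia, [{dump_log_write_threshold, 50000}, {dc_dump_limit, 40}]},
{rabbit, [{vm_memory_high_watermark, 0.34}]}].

Can you please tell me if these configs are ok, or am I missing something?

Also, I checked the IO and CPU. When I just start the broker after the
100,000 queues creation, both IO and CPU shoot up for the first minute,
but then when everything required is fetched to memory there is no
activity in IO. But CPU consistently stays up.

From top the values are like below, and the CPU almost always stays up;
it never goes down.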

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
13997 root      20   0 3184m 2.5g 2324 S  210 21.2  8:17.97 beam.smp

Feel free to let me know if you need more info. I can provide you with
memory dumps and stack traces if required.

Thanks a lot for your help.
Praveen



On Fri, Sep 9, 2011 at 3:12 AM, Alexandru Scvorţov
<alexandru at rabbitmq.com> wrote:

> Hi Praveen,
>
> > However I realized when i wanted to shutdown the broker before starting a
> > new test, the stop command (rabbitmqctl stop) took a long time
> > to complete.
>
> We are aware of this problem.  The fix is currently going through QA and
> will probably be in the next release, which should be around fairly
> soon.
>
> > Query 1)
> > I am curious as to what causes the latency to stop the broker when issued a
> > rabbitmqctl stop command. It seems to be something to do with the number of
> > queues created as the stop time increase proportionally as the number of
> > queues increases.
>
> Internally, when we terminate a queue, we do a few file operations.  This
> is usually not a problem, but when you close a connection with 100 000s
> of queues, the same order of file operations get scheduled.  Erlang's
> IO system then does some expensive operations on this long queue and
> it ends up processing the operations in quadratic time.  The fix going
> through QA brings this down to linear time; for instance, I can delete
> 40k queues in 20s (compared to 211s on the latest release).
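
As a rough analogy for the quadratic versus linear behaviour described above, here is a minimal Python sketch. It is not RabbitMQ's or Erlang's actual code; the function names and the "close-file" operation labels are made up for illustration. It drains the same list of pending operations twice: once with `list.pop(0)`, which shifts every remaining element on each removal, and once with `collections.deque.popleft()`, which is constant time per removal.

```python
from collections import deque
import timeit


def drain_quadratic(ops):
    """Drain pending operations from the front of a plain list.

    Each pop(0) shifts all remaining elements, so draining n
    operations costs on the order of n**2 element moves.
    """
    pending = list(ops)
    done = []
    while pending:
        done.append(pending.pop(0))
    return done


def drain_linear(ops):
    """Drain the same operations from a deque; popleft() is O(1)."""
    pending = deque(ops)
    done = []
    while pending:
        done.append(pending.popleft())
    return done


if __name__ == "__main__":
    # Hypothetical workload: one pending file operation per queue.
    for n in (10_000, 40_000):
        ops = [f"close-file-{i}" for i in range(n)]
        slow = timeit.timeit(lambda: drain_quadratic(ops), number=1)
        fast = timeit.timeit(lambda: drain_linear(ops), number=1)
        print(f"{n} ops: list.pop(0) {slow:.3f}s, deque.popleft() {fast:.3f}s")
```

Both functions return the operations in the same order; only the cost of taking the next one off the front differs, which is the kind of change a quadratic-to-linear fix makes.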
>
> > Query 2)
> > In the case of durable queues, I measured the time taken to restart the
> > broker after stopping it (a clean and unclean stop).
> > I found that even after a clean/unclean stop the time to restart the broker
> > was just about 20 seconds on an average.
> > However, in the case where i created 50000 durable queues and did an unclean
> > stop(just aborted the broker) and tried to restart the broker it didn't
> > start for over to 6 minutes (when I gave up)...
> > It was hung in the step of "starting exchange,queue and binding recovery.."
> > It will be great if someone could explain why this could be caused.
>
> I can't reproduce this.  Declaring 100 000 durable queues, killing the broker
> and re-starting it seems to work fine.  It takes about 1 min on my
> machine.
>
> Is it really hung?  Is it using the CPU or disk at all at this time?  Is
> there anything in the logs (both the rabbit and SASL logs)?
>
> > It will be great if someone could answer the above queries or provide me
> > with some pointers about the same.
>
> There's not much you can do at the moment except avoiding terminating a
> large number of queues at the same time.
>
> Hope this clears things up.
>
> Cheers,
> Alex
>
> On Thu, Sep 08, 2011 at 07:38:14PM -0700, Praveen M wrote:
> > Hi,
> >
> > I'm a rabbitmq newbie and am trying to run some experiments to figure out if
> > rabbitmq would serve my use case.
> >
> > I would like to create queues in the order of 100,000s. (one for each of my
> > customers).
> >
> > I ran various tests,
> >
> > I'm using the latest 2.6.0 server and 2.6.0 client, and the following tests
> > in durable queues mode and in non-durable queues mode.
> >
> > Tests,
> > 1) to create 1000 queues , produce, consume
> > 2) to create 10000 queues , produce, consume
> > 3) to create 50000 queues, produce and consume.
> >
> > It works like a charm and the memory usage even with 50,000 queues seem very
> > reasonable. (the order of 1-1.7G)
> >
> > However I realized when i wanted to shutdown the broker before starting a
> > new test, the stop command (rabbitmqctl stop) took a long time
> > to complete.
> >
> > I made a small chart of how long the stop command on the broker takes to
> > execute after the test creates 'N' queues listed below.
> > Also, in the case of durable queues, i found some weird numbers for the time
> > taken to restart the queues after a clean/unclean(aborting broker) stop
> >
> > *NON_DURABLE_QUEUES TEST*
> > *No of Queues   Stop Time*
> > 1000            10.7 seconds
> > 10000           2 minutes
> > 50000           11 minutes
> >
> > *DURABLE_QUEUES TEST*
> > *No of Queues   Start Time   Stop Time*
> > 1000            2 seconds    10 seconds
> > 10000           24 seconds   2 minutes
> > 10000 after crash it recovers in 20 seconds (on improper shutdown).
> > 50000 even at 6 minutes the queues doesn't start on a improper shutdown
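
One way to read the non-durable stop times in the chart above is to estimate how fast they grow with the queue count. The sketch below is only a back-of-the-envelope check: it takes the three measured points (with "2 minutes" and "11 minutes" converted to seconds, so they are rounded values) and computes the log-log slope between consecutive points. The helper name `growth_exponents` is invented for this example. On these three points both slopes come out a little above 1.

```python
import math

# (number of queues, stop time in seconds) from the non-durable chart.
# "2 minutes" and "11 minutes" are rounded figures, so treat the
# result as approximate.
measurements = [(1000, 10.7), (10000, 120.0), (50000, 660.0)]


def growth_exponents(points):
    """Return the log-log slope between each pair of consecutive points.

    A slope near 1 means the time grows linearly with the queue count;
    a slope near 2 means it grows quadratically.
    """
    slopes = []
    for (n1, t1), (n2, t2) in zip(points, points[1:]):
        slopes.append(math.log(t2 / t1) / math.log(n2 / n1))
    return slopes


slopes = growth_exponents(measurements)
for (n1, _), (n2, _), k in zip(measurements, measurements[1:], slopes):
    print(f"{n1} -> {n2} queues: exponent ~ {k:.2f}")
```

With only three rounded data points this says little about behaviour at 100,000s of queues; it just gives a number to compare against after a broker upgrade.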
> >
> >
> > Query 1)
> > I am curious as to what causes the latency to stop the broker when issued a
> > rabbitmqctl stop command. It seems to be something to do with the number of
> > queues created as the stop time increase proportionally as the number of
> > queues increases.
> >
> > Query 2)
> > In the case of durable queues, I measured the time taken to restart the
> > broker after stopping it (a clean and unclean stop).
> > I found that even after a clean/unclean stop the time to restart the broker
> > was just about 20 seconds on an average.
> > However, in the case where i created 50000 durable queues and did an unclean
> > stop(just aborted the broker) and tried to restart the broker it didn't
> > start for over to 6 minutes (when I gave up)...
> > It was hung in the step of "starting exchange,queue and binding recovery.."
> > It will be great if someone could explain why this could be caused.
> >
> > It will be great if someone could answer the above queries or provide me
> > with some pointers about the same.
> >
> > Thank you for your help,
> > --
> > -Praveen
>
> > _______________________________________________
> > rabbitmq-discuss mailing list
> > rabbitmq-discuss at lists.rabbitmq.com
> > https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>
>


-- 
-Praveen

Attachment: rabbitmq.config.txt

[ {mnesia, [{dump_log_write_threshold, 50000}, {dc_dump_limit, 40}]},
{rabbit, [{vm_memory_high_watermark, 0.34}]}].

Attachment: rabbit at pmurugesan-wsl.log

=INFO REPORT==== 9-Sep-2011::10:33:33 ===
Limiting to approx 924 file handles (829 sockets)

=INFO REPORT==== 9-Sep-2011::10:33:33 ===
Memory limit set to 4092MB.

=INFO REPORT==== 9-Sep-2011::10:33:37 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index

=INFO REPORT==== 9-Sep-2011::10:33:37 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index

=WARNING REPORT==== 9-Sep-2011::10:33:37 ===
msg_store_persistent: rebuilding indices from scratch

=WARNING REPORT==== 9-Sep-2011::10:34:00 ===
Mnesia('rabbit@pmurugesan-wsl'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                      write_threshold}

=WARNING REPORT==== 9-Sep-2011::10:39:33 ===
Mnesia('rabbit@pmurugesan-wsl'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                      time_threshold}

=WARNING REPORT==== 9-Sep-2011::10:42:33 ===
Mnesia('rabbit@pmurugesan-wsl'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                      time_threshold}
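
To see how long the broker spends between the "rebuilding indices from scratch" warning and the Mnesia overload warnings, the attached rabbit log can be summarised with a small script. The following is a minimal Python sketch, assuming the report header format seen in the attachment (`=LEVEL REPORT==== D-Mon-YYYY::HH:MM:SS ===`); the function names are illustrative and the embedded sample is abridged from the attached log.

```python
import re
from datetime import datetime

# Abridged sample in the format of the attached rabbit log.
SAMPLE_LOG = """\
=INFO REPORT==== 9-Sep-2011::10:33:33 ===
Memory limit set to 4092MB.

=WARNING REPORT==== 9-Sep-2011::10:33:37 ===
msg_store_persistent: rebuilding indices from scratch

=WARNING REPORT==== 9-Sep-2011::10:34:00 ===
Mnesia('rabbit@pmurugesan-wsl'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                     write_threshold}

=WARNING REPORT==== 9-Sep-2011::10:39:33 ===
Mnesia('rabbit@pmurugesan-wsl'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                     time_threshold}
"""

# Matches a report header line and captures the level and timestamp.
HEADER = re.compile(r"^=(\w+) REPORT==== (\S+) ===\s*$")
OVERLOAD = re.compile(r"Mnesia is overloaded: \{dump_log,\s*(\w+)\}")


def parse_reports(text):
    """Split a log into (level, timestamp, body) tuples.

    Month names are parsed with %b, which assumes an English/C locale.
    """
    reports = []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            ts = datetime.strptime(m.group(2), "%d-%b-%Y::%H:%M:%S")
            reports.append([m.group(1), ts, []])
        elif reports and line.strip():
            # Continuation lines are joined into a single body string.
            reports[-1][2].append(line.strip())
    return [(lvl, ts, " ".join(body)) for lvl, ts, body in reports]


def overload_warnings(reports):
    """Return (seconds since the first report, reason) per Mnesia overload."""
    if not reports:
        return []
    start = reports[0][1]
    found = []
    for _level, ts, body in reports:
        m = OVERLOAD.search(body)
        if m:
            found.append(((ts - start).total_seconds(), m.group(1)))
    return found


for seconds, reason in overload_warnings(parse_reports(SAMPLE_LOG)):
    print(f"+{seconds:.0f}s Mnesia overloaded: {reason}")
```

Running it against the full log file instead of `SAMPLE_LOG` only needs the file contents passed to `parse_reports`.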

