<div dir="ltr">Hi, <br><br>Simon, as soon as you said 'fsync' I knew what the problem was. I'm testing on KVM guests with virtual disks, hence the very slow fsync. Moving /var/lib/rabbitmq to tmpfs (in memory) on nodeB showed that there is no problem to speak of. <br>
My production machines will have either real disks or LVM LVs exported to Xen, which means there should be no problem. <br><br>My target design is a "star": local collectors publish to a central node. The idea is to avoid reconfiguring the central node (NodeB in my test) every time a local collector is created or destroyed. <br>
<br>Simon, many thanks!<br><br>Boris. <br><br><div class="gmail_quote">On Thu, Feb 21, 2013 at 5:57 PM, Simon MacMullen <span dir="ltr"><<a href="mailto:simon@rabbitmq.com" target="_blank">simon@rabbitmq.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi. If the shovel is on node A then {ack_mode, on_publish} is not particularly safe - if the network connection goes down then you will lose messages that were on the wire.<br>
<br>
If the shovel were to be running on node B then {ack_mode, on_publish} would be safer, as it would tolerate network failures (but not a crash at node B).<br>
<br>
on_confirm would still be better. Of course, since you're consuming in autoack mode in the php script you can lose messages there anyway...<br>
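<br>
A minimal sketch of explicit acknowledgement in the PHP reader, assuming the php-amqp extension that the script quoted below already uses. drainWithAcks and $handler are hypothetical names introduced only for illustration. With AMQP_NOPARAM instead of AMQP_AUTOACK the broker keeps each message until ack() is called, so a crash while processing should not lose it:<br>
<pre>
<?php
// Sketch only: drain a queue with explicit acks instead of AMQP_AUTOACK.
// $queue is expected to behave like php-amqp's AMQPQueue (get/ack/nack).
// drainWithAcks and $handler are hypothetical names, not part of the extension.
function drainWithAcks($queue, callable $handler): int
{
    $handled = 0;
    // AMQP_NOPARAM disables auto-ack, so each message stays unacknowledged
    // on the broker until ack() is called below.
    while ($envelope = $queue->get(AMQP_NOPARAM)) {
        try {
            $handler($envelope);
            $queue->ack($envelope->getDeliveryTag());
            $handled++;
        } catch (Exception $e) {
            // Processing failed: ask the broker to requeue the message and stop.
            $queue->nack($envelope->getDeliveryTag(), AMQP_REQUEUE);
            break;
        }
    }
    return $handled;
}
</pre>
Calling drainWithAcks($queue, $handler) in place of the inner get(AMQP_AUTOACK) loop would be one way to wire this in; it trades some throughput for not losing messages in the consumer.<br>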
<br>
You didn't say which version of RabbitMQ you were running. The script seems perfectly reasonable (I assume that you are not doing anything AMQPish in the part elided by "/* do stuff here, count messages, run mps stats etc... */").<br>
<br>
So it's a bit of a puzzle. If confirms + persistence are so much slower than persistence alone, then I wonder if somehow you have a machine that fsyncs very slowly, since that's the primary difference in what node B will be doing.<br>
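<br>
One rough way to check the fsync theory is to time a series of small synced writes on the filesystem that holds the broker's data directory. The sketch below is an assumption-laden probe, not a benchmark: the file name, the 4 KB write size and the 50 iterations are arbitrary choices, and fsync() requires PHP 8.1 or later:<br>
<pre>
<?php
// Rough fsync latency probe. Pass the directory to test as the first
// argument, e.g. a directory on the same filesystem as /var/lib/rabbitmq.
$dir  = $argv[1] ?? '.';
$path = rtrim($dir, '/') . '/fsync_probe.tmp'; // hypothetical scratch file
$fh   = fopen($path, 'wb');
if ($fh === false) {
    fwrite(STDERR, "cannot open $path" . PHP_EOL);
    exit(1);
}
$n     = 50;                      // arbitrary number of synced writes
$block = str_repeat('x', 4096);   // arbitrary 4 KB payload
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    fwrite($fh, $block);
    fsync($fh);                   // force the data to stable storage (PHP 8.1+)
}
$elapsed = microtime(true) - $start;
fclose($fh);
unlink($path);
printf("average write+fsync latency: %.2f ms" . PHP_EOL, 1000 * $elapsed / $n);
</pre>
Comparing the figure for a directory on the virtual disk with one on an in-memory filesystem should show whether fsync is the bottleneck.<br>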
<br>
Cheers, Simon<div><div class="h5"><br>
<br>
<br>
On 21/02/13 15:10, bratner bratner wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
Hi!<br>
<br>
Sorry for the long delay. I did some more tests and found a way to speed<br>
things up.<br>
After changing {ack_mode, on_confirm} to {ack_mode, on_publish} things<br>
started to work well.<br>
By that I mean that messages were read from the queue on NodeB as fast<br>
as they were published to the queue on NodeA.<br>
<br>
Before this change (with 'on_confirm' set) messages were piling up<br>
in the queue on NodeB.<br>
So if I published 100 messages per sec into a queue on nodeA, the queue<br>
on nodeB would receive all of them and the queue on nodeA would be<br>
empty (as it should). But the PHP script on nodeB (below) would read<br>
only about 3-8 messages per sec and the rest (92-97 messages per<br>
sec) would keep piling up in the queue on nodeB.<br>
That continued until I completely stopped writing new messages to the queue on<br>
nodeA. Then the script would read messages like crazy until the queue on<br>
nodeB was completely empty.<br>
<br>
So I am left with a couple of questions. What is causing the write and the<br>
read rates to be so uneven with 'on_confirm', given that there is no real<br>
resource problem? Is this a bug or a feature?<br>
How much reliability do I lose if I use 'on_publish'? Is there a chance<br>
messages can just disappear with 'on_publish'?<br>
<br>
Here is the outline of my php test reader, very simple:<br>
<br>
<?php<br>
$connection = new AMQPConnection();<br>
$connection->setLogin("<login_nodeb>");<br>
$connection->setPassword("<password_nodeb>");<br>
$connection->setVhost("vhostB");<br>
$connection->connect();<br>
<br>
if (!$connection->isConnected()) {<br>
die('Not connected :('. PHP_EOL);<br>
}<br>
$channel = new AMQPChannel($connection);<br>
$queue = new AMQPQueue($channel);<br>
$queue->setName('queueB');<br>
<br>
while(true) {<br>
while($queue->get(AMQP_AUTOACK)) {<br>
/* do stuff here, count messages, run mps stats etc... */<br>
}<br>
/* sleep some not to eat the whole cpu */<br>
sleep(1);<br>
}<br>
<br>
<br>
Thanks,<br>
Boris<br>
<br>
On Wed, Feb 13, 2013 at 5:52 PM, Simon MacMullen <<a href="mailto:simon@rabbitmq.com" target="_blank">simon@rabbitmq.com</a>> wrote:<br></div></div><div><div class="h5">
<br>
That sounds very wrong.<br>
<br>
Which version of RabbitMQ are you using? Can you post (a cut down<br>
version of) your PHP script somewhere? (I have no familiarity with<br>
the PHP client but I'd like to see what it's doing...)<br>
<br>
Cheers, Simon<br>
<br>
<br>
On 13/02/13 14:33, bratner bratner wrote:<br>
<br>
Hi!<br>
<br>
My setup includes a rabbitmq-c application that publishes messages on<br>
node A.<br>
The publishing rate is about 200mps, each message can be up to 10Kb.<br>
<br>
Shovel (node A) is configured to move them to node B.<br>
On node B I'm dequeuing the messages with a PHP test script using<br>
$queue->get(AMQP_AUTOACK).<br>
If I keep the publishing rate at 200mps, I can see that the queue on<br>
node A is empty and the queue on node B is filling up. The read rate<br>
of my test script is really low.<br>
<br>
If I stop publishing, then 3-5 seconds later my test script starts<br>
reading like crazy until the queue on node B is empty.<br>
<br>
Even if I slow the publishing rate down to 5mps, the same thing<br>
happens: messages pile up on node B until I dial down the pressure.<br>
<br>
This problem disappears if I set ack_mode to on_publish or no_ack. In<br>
that case the reader script reads at the publishing speed.<br>
<br>
My configuration :<br>
<br>
<br>
<br>
[<br>
{rabbit, [<br>
{log_levels, [{connection, error}]}<br>
]},<br>
{rabbitmq_shovel,<br>
[ {shovels, [ {messagemover, [<br>
{sources, [<br></div></div>
{broker, "amqp://usera:pass@localhost/vhosta"},<br>
<br>
{declarations, [<br>
{'exchange.declare',[{exchange, <<"my-fanout">>},{type, <<"fanout">>},durable]},<br>
{'queue.declare',[{queue, <<"messages">>},durable]},<div class="im"><br>
{'queue.bind',[{exchange, <<"my-fanout">>},{queue, <<"messages">>}]}<br>
]}<br>
]},<br>
{destinations, [<br></div>
{brokers, [ "amqp://userb:pass@nodeB/vhostb" ]},<br>
<br>
{declarations, [<br>
{'exchange.declare',[{exchange, <<"my-fanout">>},{type, <<"fanout">>},durable]},<br>
{'queue.declare',[{queue, <<"messages">>},durable]},<div class="im"><br>
{'queue.bind',[{exchange, <<"my-fanout">>},{queue, <<"messages">>}]}<br>
]}<br>
]},<br>
{queue, <<"messages">>},<br>
{prefetch_count, 200},<br>
{ack_mode, on_confirm},<br>
{publish_properties, [{delivery_mode, 2}]},<br>
{reconnect_delay, 5}<br>
]}<br>
]}<br>
]}<br>
].<br>
<br>
<br>
Thank you,<br>
Boris.<br>
<br>
<br></div>
_________________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com" target="_blank">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><div class="im"><br>
<br>
<br>
<br>
--<br>
Simon MacMullen<br>
RabbitMQ, VMware<br>
<br>
<br>
</div></blockquote><div class="HOEnZb"><div class="h5">
<br>
<br>
-- <br>
Simon MacMullen<br>
RabbitMQ, VMware<br>
</div></div></blockquote></div><br></div>