[rabbitmq-discuss] Unexplainable behaviour with shovel plugin.
simon at rabbitmq.com
Thu Feb 27 12:03:34 GMT 2014
On 26/02/14 08:57, Claire Fautsch wrote:
> This leads to the fact that we have on the destination servers a total
> of two handfuls of messages that are not yet confirmed, however on the
> source servers we have millions of messages that are waiting for
> confirmation (acknowledgement)
> We would expect with some threshold that
> delivery rate on source = publish rate on destination (which is the case)
> confirm rate on destination = acknowledge rate on source (which shows
> considerable difference)
> Does anyone have an idea or suggestion what could be the reason for
> this? Is it a bad idea to have load balancer as destination in the
> shovel, or should that work fine? Network issue?
I doubt the load balancer is the problem. I think I have a reasonable
idea where the problem lies.
The issue is that the shovel does not enforce any form of flow control
other than (optionally) using prefetch limiting, which you are not using.
So your source servers are delivering messages into the shovel as fast
as they can, and your destination servers are accepting messages as fast
as *they* can, but they are ending up being a bit slower. Nothing is
creating any back pressure on the source servers, and so messages are
queuing up inside the shovel. Since you are using on_confirm ack mode,
these show as unacknowledged messages on the source.
> Here some more details on our shovel setup:
> prefetch_count=0 (default)
I suspect that if you set prefetch_count to some high-but-not-insane
number (exactly how high depends on your message size + rate but I might
start the bidding at 1,000) this might solve your problem.
Of course, if your destination servers are actually slower than your
source ones, then you might need to do something about that. But turning
on prefetch limiting would make the system better-behaved and make it
clearer where your issues are.
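
To make the reasoning above concrete, here is a toy model of the
situation. It is only a sketch, not shovel code: the function name, the
tick-based timing and the rates are all made up for illustration. The
only thing it takes from the above is that prefetch_count=0 means "no
limit" and that a non-zero value caps the number of unacknowledged
messages.

```python
def simulate_unacked(source_rate, dest_rate, ticks, prefetch_count=0):
    """Toy model: return the number of unacked messages on the source.

    Hypothetical helper for illustration only. Each tick the source
    delivers up to source_rate messages and the destination confirms up
    to dest_rate messages. prefetch_count=0 means no limit.
    """
    unacked = 0
    for _ in range(ticks):
        deliverable = source_rate
        if prefetch_count > 0:
            # With a prefetch limit the source stops delivering once
            # prefetch_count messages are awaiting acknowledgement.
            deliverable = min(deliverable, prefetch_count - unacked)
        unacked += deliverable
        # In on_confirm mode a message is acked on the source only once
        # the destination has confirmed it.
        unacked -= min(unacked, dest_rate)
    return unacked


if __name__ == "__main__":
    # Destination 10% slower than source, no prefetch limit:
    # the backlog grows by 100 messages every tick.
    print(simulate_unacked(1000, 900, ticks=100))                       # 10000
    # Same rates with prefetch_count=1000: the backlog stays bounded.
    print(simulate_unacked(1000, 900, ticks=100, prefetch_count=1000))  # 100
```

In the second case the messages that are not delivered stay in the
source queue instead of piling up unacknowledged inside the shovel,
which is what makes it easier to see that the destination is the slow
side.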
There might be another issue though - on all released versions of
RabbitMQ turning on prefetch limiting reduces performance somewhat. This
will get fixed in the next release.