[rabbitmq-discuss] Inconsistent Shovel Delivery

Thu Mar 13 09:40:43 GMT 2014

On 12/03/2014 21:27, rparker wrote:
> We have a simple shovel that takes messages published to a queue on a
> rabbitmq cluster by a custom Windows service (using MassTransit) and then
> shovels these messages across a WAN to a fanout exchange on a remote
> rabbitmq cluster that routes to a queue of the same name on that remote
> cluster.  Here's the basic config:

<snip>

The config looks fine.

> So, what's happening is that only every other message makes it to the
> ultimate queue on the remote cluster.  We originally were addressing the
> remote cluster via a load balancer and we thought this was an easy culprit,
> but after changing the shovel config to go directly to the master node of
> the remote cluster bypassing the LB, the problem still persisted.  We tried
> decoupling both clusters and shoveling between two standalone nodes, and the
> problem still persisted.  We tried unconfiguring shovel and published
> directly to the exchanges/queues on both the local and remote clusters and
> the correct number of messages were published.  It's only when shoveling is
> used that we are getting an inconsistent number of messages, and when
> examining the messages it shows that exactly every other message is making
> it to the remote cluster.  No errors are evident in the log file and when I
> show a status of the shovel, it shows it as running:

<snip>

Which also looks fine.

> What on earth would cause this behavior other than some sort of faulty
> firewall?  Why would only shovel messages be affected and not directly
> published messages?

It is very, very hard to believe that the shovel is throwing away every
other message: there's absolutely nothing in the shovel code that would 
even be able to do that.

Your shovel is in on_confirm mode - that means that messages will only 
get acked to the source broker once the destination broker has accepted 
them. So even if somehow the shovel were losing messages, it would 
manifest as the unacked message count growing forever on the source 
queue. I assume this is not happening!
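
One way to check this, as a minimal Python sketch: it assumes the
management plugin is enabled on the source broker and listening on port
15672. The host, credentials, vhost and queue name in the usage note are
placeholders, not values from the original config. The field names are
the ones the management HTTP API returns for a queue.

```python
import base64
import json
import urllib.parse
import urllib.request


def queue_url(host, vhost, queue, port=15672):
    """Build the management API URL for one queue.

    The vhost and queue name are URL-encoded, so the default vhost "/"
    becomes "%2F". Port 15672 is an assumption about the installation.
    """
    return "http://{}:{}/api/queues/{}/{}".format(
        host,
        port,
        urllib.parse.quote(vhost, safe=""),
        urllib.parse.quote(queue, safe=""),
    )


def summarize(queue_info):
    """Pick out the counters that matter for this diagnosis."""
    return {
        "ready": queue_info.get("messages_ready", 0),
        "unacked": queue_info.get("messages_unacknowledged", 0),
        "consumers": queue_info.get("consumers", 0),
    }


def fetch_queue(host, vhost, queue, user, password):
    """GET the queue details as a dict. Needs a reachable broker."""
    request = urllib.request.Request(queue_url(host, vhost, queue))
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling summarize(fetch_queue("source-host", "/", "my-queue", "guest",
"guest")) a few times while publishing should show "unacked" returning
to zero if the shovel is getting its confirms. A "consumers" value above
1 would point at something else reading from the queue.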

The fact that you're losing exactly every other message strongly hints 
towards there being a second consumer somewhere, probably on the source 
queue. That could be another shovel, or a client app of yours. That's 
the only thing in RabbitMQ which will distribute messages in a
round-robin fashion like that. So first of all I'd check for a consumer
you haven't seen. "rabbitmqctl list_consumers" or the queue details page
in the management UI should help here.
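
To illustrate why a second consumer would produce exactly this symptom,
here is a toy Python model of round-robin dispatch. It is not RabbitMQ
code and it ignores prefetch and acknowledgements; the consumer names
are made up for the example.

```python
from itertools import cycle


def dispatch(messages, consumers):
    """Hand each message to the next consumer in turn (toy round-robin)."""
    received = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        received[name].append(message)
    return received


# Eight messages, two consumers on the same queue: the shovel and a
# hypothetical consumer nobody knew about.
result = dispatch(range(1, 9), ["shovel", "unknown-consumer"])
print(result["shovel"])            # [1, 3, 5, 7]
print(result["unknown-consumer"])  # [2, 4, 6, 8]
```

With a single consumer the same function hands over every message,
which matches what is seen when publishing directly without the shovel.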

> Is there a way to turn up the verbosity of the
> shoveling logging?

Not in the shovel itself. But you could check what's happening on the 
wire with Wireshark, or turn on the firehose tracer
(http://www.rabbitmq.com/firehose.html). Depending on your message
volumes that could be a lot of extra verbosity though.
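
For the firehose route, a rough Python sketch of a trace listener
follows. It assumes tracing has been switched on for the vhost with
"rabbitmqctl trace_on" and that the third-party pika client (1.x) is
installed. The firehose publishes copies to the topic exchange
amq.rabbitmq.trace, with routing keys of the form publish.<exchange>
and deliver.<queue>. The AMQP URL and the queue name in the usage note
are placeholders.

```python
TRACE_EXCHANGE = "amq.rabbitmq.trace"


def deliver_key(queue):
    """Routing key for messages delivered from a queue to a consumer."""
    return "deliver." + queue


def publish_key(exchange):
    """Routing key for messages published to an exchange."""
    return "publish." + exchange


def watch(amqp_url, binding_key):
    """Print the routing key and body size of each traced message.

    Blocks until interrupted. Needs a reachable broker with tracing on.
    """
    import pika  # third-party client, imported here so the helpers work without it

    connection = pika.BlockingConnection(pika.URLParameters(amqp_url))
    channel = connection.channel()
    # Server-named, exclusive queue that goes away when we disconnect.
    queue = channel.queue_declare(queue="", exclusive=True).method.queue
    channel.queue_bind(queue=queue, exchange=TRACE_EXCHANGE, routing_key=binding_key)

    def on_message(ch, method, properties, body):
        print(method.routing_key, len(body))

    channel.basic_consume(queue=queue, on_message_callback=on_message, auto_ack=True)
    channel.start_consuming()
```

Something like watch("amqp://guest:guest@source-host/%2F",
deliver_key("my-queue")) on the source broker would show every delivery
from the source queue, including deliveries to consumers other than the
shovel.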

Cheers, Simon