[rabbitmq-discuss] Shovel bug? Consumer hangs after network failure.

Øyvind Tjervaag oyvind at tjervaag.com
Thu Jan 20 08:03:13 GMT 2011


Hi!

I don't know whether I've found a bug or whether something in my setup needs changing.
I have two RabbitMQ servers (v2.2.0 of everything, Windows 2003 R2, Erlang 5.8.1.1) connected via a very unstable radio link (well, they're on my development LAN at the moment, but will be moved to an unstable link within a few weeks). My test setup is something like this:

Rabbit1 --> Network switch --> Horrible bad network link --> Network switch --> Rabbit2

I needed to move messages in both directions between the two Rabbit servers. I first installed the shovel plugin on Rabbit2 only and configured it to move messages in both directions. If the shovel on Rabbit2 consumes from a queue on Rabbit1 and I disconnect the network between them, the consumer doesn't seem to be able to reconnect to Rabbit1 when the network comes back up, and generates errors like this:

=SUPERVISOR REPORT==== 19-Jan-2011::12:57:30 ===
     Supervisor: {<0.1973.0>,amqp_connection_sup}
     Context:    child_terminated
     Reason:     {error,etimedout}
     Offender:   [{pid,<0.1976.0>},
                  {name,connection},
                  {mfa,
                      {amqp_gen_connection,start_link,
                          [amqp_network_connection,
                           {amqp_params,<<"guest">>,<<"guest">>,<<"/">>,
                               "192.168.0.2",5672,0,0,0,none,[]},
                           #Fun<amqp_connection_sup.0.64044238>,
                           #Fun<amqp_connection_sup.2.105785818>,[]]}},
                  {restart_type,intrinsic},
                  {shutdown,brutal_kill},
                  {child_type,worker}]

The shovel that moves messages from Rabbit2 to Rabbit1, however, detects that the network went down and reconnects properly. As a temporary workaround, I now always run the shovel plugin on the same server where I consume messages.

This seems to be the same "issue" I had with the .NET client a while ago, where I had to issue a passive queue declare in my consumer code to trigger a (channel?) fault in order to detect that a connection had gone away.
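
To illustrate the idea behind that workaround, here is a minimal sketch in Python (not the actual .NET code). It assumes the consumer can make some call that forces a request/response on the existing connection, such as a passive queue declare; the class name RoundTripProbe and the round_trip callable are hypothetical and not part of any RabbitMQ client API.

```python
import time
from typing import Callable


class RoundTripProbe:
    """Force a periodic round trip on an otherwise idle connection.

    `round_trip` is a hypothetical callable standing in for something like
    a passive queue declare: it must block until the broker answers and
    raise OSError (which includes timeouts) if the link is gone.
    """

    def __init__(self, round_trip: Callable[[], None], interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._round_trip = round_trip
        self._interval = interval      # seconds between probes
        self._clock = clock            # injectable so the sketch is testable
        self._last_ok = clock()

    def check(self) -> bool:
        """Return True if the connection still looks alive."""
        if self._clock() - self._last_ok < self._interval:
            return True                # probed recently enough, skip
        try:
            self._round_trip()
        except OSError:
            return False               # caller should tear down and reconnect
        self._last_ok = self._clock()
        return True
```

A consumer loop would call check() between deliveries and rebuild its connection whenever it returns False. The point is only that an idle consumer never writes to the socket, so without some forced round trip it may not notice that the peer has gone away.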

So I suppose the question is: is there any way to configure the shovel plugin differently so that it detects when a connection to a remote Rabbit server has gone away?

Thanks,

Øyvind

