I'd do a few things:

Since you're using 0.5.2, I'd use the SimpleReconnectionStrategy handler, and instead of extending that class as you are now, create an on_connected method in your app and register it with the Connection.addStateChangeHandler() method so you're notified when you are connected; do all of your post-connection setup in that event handler.
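
Something like this is what I have in mind. It's only a rough sketch against the 0.5.2-era asyncore adapter, not a drop-in: the queue name, host, and port are placeholders, and the exact state-change handler signature, the AsyncoreConnection constructor arguments, and the SimpleReconnectionStrategy import path may differ slightly, so check them against the pika source you have installed:

import pika

QUEUE = 'test'  # placeholder queue name

def handle_delivery(channel, method, header, body):
    # Replace with your real message handling.
    channel.basic_ack(delivery_tag=method.delivery_tag)

def on_state_change(connection, is_connected):
    # 0.5.2 uses a single handler with a bool flag; only run setup
    # when we have just (re)connected.
    if is_connected:
        on_connected(connection)

def on_connected(connection):
    # All post-connection setup lives here, not in the reconnection
    # strategy: open the channel, declare the queue, start consuming.
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE, durable=True)
    channel.basic_consume(handle_delivery, queue=QUEUE)

parameters = pika.ConnectionParameters(host='127.0.0.1', port=5670)
# The import path for SimpleReconnectionStrategy may differ in 0.5.2;
# adjust to match your install.
connection = pika.AsyncoreConnection(
    parameters,
    reconnection_strategy=pika.SimpleReconnectionStrategy())
connection.addStateChangeHandler(on_state_change)
pika.asyncore_loop()
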

There was a bug report against the reconnection handler system on the 0.5.2 branch; I haven't traced it back, though.

You might be better served by checking out the current version, as I've tested the SimpleReconnectionStrategy pretty extensively, including cases where the server goes away and comes back. Instead of the single callback function with a bool flag in 0.5.2, 0.9.3 (or the soon-to-be-released 0.9.4) has a callback handler split out for each state:

Connection.add_on_close_callback(callback)
Connection.add_on_open_callback(callback)

In either version, I would suggest that the reconnection strategy class is not the right place to handle your post-connection setup.
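
With 0.9.3, a minimal consumer skeleton would look roughly like this. Again, just a sketch: the queue name, host, and port are placeholders, and the callback signatures are simplified, so check the docs for whichever version you end up on:

import pika

QUEUE = 'test'  # placeholder queue name

def handle_delivery(channel, method, properties, body):
    # Replace with your real message handling.
    channel.basic_ack(delivery_tag=method.delivery_tag)

def on_channel_open(channel):
    # Declare/bind queues and start consuming once the channel is up.
    channel.basic_consume(handle_delivery, queue=QUEUE)

def on_open(connection):
    # Fires when the connection is fully open, so all setup starts here.
    connection.channel(on_channel_open)

def on_close(connection, *args):
    # Fires when the connection drops; the extra arguments vary by version.
    pass

parameters = pika.ConnectionParameters(host='127.0.0.1', port=5670)
connection = pika.SelectConnection(parameters)
connection.add_on_open_callback(on_open)
connection.add_on_close_callback(on_close)
connection.ioloop.start()
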

Regards,

Gavin

On Sun, Feb 20, 2011 at 7:13 PM, Jason J. W. Williams <jasonjwwilliams@gmail.com> wrote:
BTW this problem only seems to occur when rabbit@Phantome is restarted
while the consumer is connected to it. Reconnections occur properly
when restarting rabbit_1@Phantome and rabbit_2@Phantome while the
consumer is connected.

Status of node rabbit_1@Phantome ...
[{running_applications,[{rabbit,"RabbitMQ","2.3.0"},
{mnesia,"MNESIA CXC 138 12","4.4.16"},
{os_mon,"CPO CXC 138 46","2.2.5"},
{sasl,"SASL CXC 138 11","2.1.9.2"},
{stdlib,"ERTS CXC 138 10","1.17.2"},
{kernel,"ERTS CXC 138 10","2.14.2"}]},
{nodes,[{disc,[rabbit_1@Phantome,rabbit@Phantome]},
{ram,[rabbit_2@Phantome]}]},
{running_nodes,[rabbit@Phantome,rabbit_2@Phantome,rabbit_1@Phantome]}]
...done.

-J

On Sun, Feb 20, 2011 at 5:04 PM, Jason J. W. Williams
<jasonjwwilliams@gmail.com> wrote:
> Hi Guys,
>
> Having some issues with a test cluster setup. It's a 3-node cluster
> with two disk nodes and one RAM node. The RAM node is joined to both
> disk nodes.
>
> rabbit@Phantome - disk
> rabbit_1@Phantome - disk
> rabbit_2@Phantome - RAM
>
> In front of the cluster is an HAProxy instance listening on port 5670
> doing simple round-robin TCP load balancing between the cluster
> members.
>
> As long as all the nodes are running, I can have my consumer connect
> through the load balancer to any node and successfully consume.
> Likewise with the producer. However, when I shut down the node the
> consumer is currently connected to and allow Pika to do the
> reconnection, I get this error:
>
> error: uncaptured python exception, closing channel
> <pika.asyncore_adapter.RabbitDispatcher connected at 0x632350> (<class
> 'pika.exceptions.ChannelClosed'>:Connection.Close(class_id = 0,
> method_id = 0, reply_code = 541, reply_text = 'INTERNAL_ERROR')
> [/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/asyncore.py|read|74]
> [/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/asyncore.py|handle_read_event|413]
> [build/bdist.macosx-10.6-universal/egg/pika/asyncore_adapter.py|handle_read|86]
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|on_data_available|268]
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|_login2|375]
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|handle_connection_open|201]
> [cluster_test_consumer.py|on_connection_open|53]
> [build/bdist.macosx-10.6-universal/egg/pika/spec.py|queue_declare|3003]
> [build/bdist.macosx-10.6-universal/egg/pika/channel.py|_rpc|187]
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|_rpc|326]
> [build/bdist.macosx-10.6-universal/egg/pika/channel.py|wait_for_reply|125]
> [build/bdist.macosx-10.6-universal/egg/pika/channel.py|_ensure|84])
>
> Every few seconds Pika will attempt to reconnect the consumer and the
> error repeats. The only unusual error in the Rabbit logs occurs on the
> first reconnection: https://gist.github.com/836441
>
> The rest of the time during these errors the logs look like this:
> https://gist.github.com/836442
>
> The consumer is disconnecting and reconnecting to HAProxy, because the
> stats show the connection moving between backend nodes.
>
> The producer continues to operate correctly through the load balancer.
>
> Consumer code is here: https://gist.github.com/836445
>
> Any help/ideas are greatly appreciated. Thank you in advance.
>
> -J
>
_______________________________________________
rabbitmq-discuss mailing list
rabbitmq-discuss@lists.rabbitmq.com
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss