<div>I'd do a few things:</div><div><br></div><div>Since you're using 0.5.2, I'd use the SimpleReconnectionStrategy handler, and instead of extending the class as you are, create an on_connected method in your app, using the Connection.addStateChangeHandler() method to be notified when you are connected, and do all of your setup in that event handler.</div>
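For illustration, here is a minimal sketch of that pattern. FakeConnection is a stand-in for pika's Connection class, since the exact 0.5.2 internals aren't shown here; the point is that the handler receives the connection and a boolean connected flag, and all post-connection setup lives in the handler so it re-runs after every reconnect:

```python
# Sketch of the 0.5.2 state-change-handler pattern. FakeConnection is a
# stand-in for pika's Connection; in real code you would register the
# handler on the actual connection with addStateChangeHandler().
class FakeConnection(object):
    def __init__(self):
        self._handlers = []

    def addStateChangeHandler(self, handler):
        # Register a callable invoked as handler(connection, is_connected).
        self._handlers.append(handler)

    def _set_connected(self, connected):
        # Simulates the connection state changing (connect/disconnect).
        for handler in self._handlers:
            handler(self, connected)


def on_state_change(connection, is_connected):
    # Do all post-connection setup (exchange/queue declares, consume)
    # here, so it re-runs automatically after every reconnect.
    if is_connected:
        print("connected: declare queues and start consuming")
    else:
        print("disconnected: let the reconnection strategy retry")


conn = FakeConnection()
conn.addStateChangeHandler(on_state_change)
conn._set_connected(True)  # prints the "connected" setup message
```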
<div><br></div><div>There was a bug report on the 0.5.2 branch about the reconnection handler system; I did not trace it back, however.</div><div><br></div><div>You might be better served by checking out the current version, as I have tested the SRS pretty extensively, including cases where the server goes away and comes back. If you use 0.9.3 (or the soon-to-be-released 0.9.4), instead of the single callback function with a bool flag in 0.5.2, it has a separate callback handler for each state:</div>
<div><br></div><div>Connection.add_on_close_callback(callback)</div><div>Connection.add_on_open_callback(callback)</div><div><br></div><div>In either version, I would suggest that the reconnection strategy class is not the right place to handle your post-connection setup.</div>
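As a rough sketch of the split-callback pattern (StubConnection here is a stand-in, not pika's actual class, but the two registration methods mirror the 0.9.x names above):

```python
# Sketch of the 0.9.x split-callback pattern: one callback list per
# connection state, instead of one handler taking a boolean flag.
class StubConnection(object):
    def __init__(self):
        self._on_open = []
        self._on_close = []

    def add_on_open_callback(self, callback):
        self._on_open.append(callback)

    def add_on_close_callback(self, callback):
        self._on_close.append(callback)

    def _fire_open(self):
        # Simulates the connection opening successfully.
        for cb in self._on_open:
            cb(self)

    def _fire_close(self):
        # Simulates the connection closing.
        for cb in self._on_close:
            cb(self)


def on_open(connection):
    # Post-connection setup belongs here, not in the
    # reconnection strategy class.
    print("open: queue_declare / basic_consume go here")


def on_close(connection):
    print("closed: the reconnection strategy handles retry")


conn = StubConnection()
conn.add_on_open_callback(on_open)
conn.add_on_close_callback(on_close)
conn._fire_open()
```

Because setup is registered on the open callback rather than baked into the reconnection strategy, it runs again on every successful reconnect without any extra logic.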
<div><br></div>Regards,<div><br></div><div>Gavin<br><br><div class="gmail_quote">On Sun, Feb 20, 2011 at 7:13 PM, Jason J. W. Williams <span dir="ltr"><<a href="mailto:jasonjwwilliams@gmail.com">jasonjwwilliams@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">BTW this problem only seems to occur when rabbit@Phantome is restarted<br>
while the consumer is connected to it. Reconnections occur properly<br>
when restarting rabbit_1@Phantome and rabbit_2@Phantome while the<br>
consumer is connected.<br>
<br>
Status of node rabbit_1@Phantome ...<br>
[{running_applications,[{rabbit,"RabbitMQ","2.3.0"},<br>
                        {mnesia,"MNESIA  CXC 138 12","4.4.16"},<br>
                        {os_mon,"CPO  CXC 138 46","2.2.5"},<br>
                        {sasl,"SASL  CXC 138 11","2.1.9.2"},<br>
                        {stdlib,"ERTS  CXC 138 10","1.17.2"},<br>
                        {kernel,"ERTS  CXC 138 10","2.14.2"}]},<br>
 {nodes,[{disc,[rabbit_1@Phantome,rabbit@Phantome]},<br>
         {ram,[rabbit_2@Phantome]}]},<br>
 {running_nodes,[rabbit@Phantome,rabbit_2@Phantome,rabbit_1@Phantome]}]<br>
...done.<br>
<font color="#888888"><br>
-J<br>
</font><div><div></div><div class="h5"><br>
On Sun, Feb 20, 2011 at 5:04 PM, Jason J. W. Williams<br>
<<a href="mailto:jasonjwwilliams@gmail.com">jasonjwwilliams@gmail.com</a>> wrote:<br>
> Hi Guys,<br>
><br>
> Having some issues with a test cluster setup. It's a 3-node cluster<br>
> with 2-disk and 1-RAM nodes. The RAM node is joined to both disk<br>
> nodes.<br>
><br>
> rabbit@Phantome - disk<br>
> rabbit_1@Phantome - disk<br>
> rabbit_2@Phantome - RAM<br>
><br>
> In front of the cluster is an HAProxy instance listening on port 5670<br>
> doing simple round-robin TCP load balancing between the cluster<br>
> members.<br>
><br>
> As long as all the nodes are running I can have my consumer connect<br>
> through the load balancer to any node and successfully consume.<br>
> Likewise with the producer. However, when I shut down the node the<br>
> consumer is currently connected to and allow Pika to do the<br>
> reconnection, I get this error:<br>
><br>
> error: uncaptured python exception, closing channel<br>
> <pika.asyncore_adapter.RabbitDispatcher connected at 0x632350> (<class<br>
> 'pika.exceptions.ChannelClosed'>:Connection.Close(class_id = 0,<br>
> method_id = 0, reply_code = 541, reply_text = 'INTERNAL_ERROR')<br>
> [/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/asyncore.py|read|74]<br>
> [/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/asyncore.py|handle_read_event|413]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/asyncore_adapter.py|handle_read|86]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|on_data_available|268]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|_login2|375]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|handle_connection_open|201]<br>
> [cluster_test_consumer.py|on_connection_open|53]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/spec.py|queue_declare|3003]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/channel.py|_rpc|187]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/connection.py|_rpc|326]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/channel.py|wait_for_reply|125]<br>
> [build/bdist.macosx-10.6-universal/egg/pika/channel.py|_ensure|84])<br>
><br>
> Every few seconds Pika will attempt to reconnect the consumer and the<br>
> error repeats. The only unusual error in the Rabbit logs occurs on the<br>
> first reconnection: <a href="https://gist.github.com/836441" target="_blank">https://gist.github.com/836441</a><br>
><br>
> The rest of the time during these errors the logs look like this:<br>
> <a href="https://gist.github.com/836442" target="_blank">https://gist.github.com/836442</a><br>
><br>
> The consumer is disconnecting and reconnecting through HAProxy; the<br>
> stats show the connection moving between backend nodes.<br>
><br>
> The producer continues to operate correctly through the load balancer.<br>
><br>
> Consumer code is here: <a href="https://gist.github.com/836445" target="_blank">https://gist.github.com/836445</a><br>
><br>
> Any help/ideas are greatly appreciated. Thank you in advance.<br>
><br>
> -J<br>
><br>
_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>
</div></div></blockquote></div><br></div>