[rabbitmq-discuss] HA active/active cluster in a bad state
Bryan Murphy
bmurphy1976 at gmail.com
Thu Oct 13 21:28:39 BST 2011
On Thu, Oct 13, 2011 at 10:48 AM, Matthew Sackman <matthew at rabbitmq.com> wrote:
> On Thu, Oct 13, 2011 at 10:44:43AM -0500, Bryan Murphy wrote:
> > I'll try to get it into a bad state later today. If I can manage that, I
> > can easily grant temporary remote access to anybody who needs it.
>
> That'd be great, many thanks. However, please note that almost all of
> the Rabbit team is in London, UK, and so there's the usual fun-and-games
> with timezones...
>
> Matthew
>
I've managed to get it into a bad state, but yet again the behavior
differs from what I've seen before. This time I'm seeing the following:
  /etc/init.d/rabbitmq-server stop: works
  /etc/init.d/rabbitmq-server start: never exits
  rabbitmqctl list_queues: works
  rabbitmqctl cluster_status: works

Sending messages to the server fails:
WARNING:root:Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/mediafly/bus/__init__.py", line 111, in publish
    connection = pika.BlockingConnection(host)
  File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 32, in __init__
    BaseConnection.__init__(self, parameters, None, reconnection_strategy)
  File "/usr/local/lib/python2.7/dist-packages/pika/adapters/base_connection.py", line 50, in __init__
    reconnection_strategy)
  File "/usr/local/lib/python2.7/dist-packages/pika/connection.py", line 170, in __init__
    self._connect()
  File "/usr/local/lib/python2.7/dist-packages/pika/connection.py", line 228, in _connect
    self.parameters.port or spec.PORT)
  File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 36, in _adapter_connect
    BaseConnection._adapter_connect(self, host, port)
  File "/usr/local/lib/python2.7/dist-packages/pika/adapters/base_connection.py", line 58, in _adapter_connect
    self.socket.connect((host, port))
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
I was able to get it into this state by repeatedly stopping and/or killing
the first node in the cluster and restarting it while simultaneously trying
various ways of tickling it with our application.
startup_log is getting stuck at "starting database", but there's no activity
going on in the cluster, and I'd be surprised if I've sent more than 100
messages since I provisioned it.
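For anyone who wants to check the same symptom without our application or
pika in the way: Errno 111 is ECONNREFUSED, i.e. the TCP connect itself was
rejected. Below is a minimal sketch of a plain-socket probe. The helper name
probe_port is hypothetical, and the default of 5672 assumes the broker is on
the standard AMQP port; adjust it if your node listens elsewhere.

```python
import errno
import socket


def probe_port(host, port=5672, timeout=3.0):
    """Try a bare TCP connect and report "open", "refused", or "error: ...".

    probe_port is a hypothetical helper for illustration; 5672 is assumed
    to be the AMQP listener port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns an errno value instead of raising on a
        # failed connect; address lookup failures still raise.
        code = sock.connect_ex((host, port))
    except socket.error as exc:
        return "error: %s" % exc
    finally:
        sock.close()
    if code == 0:
        return "open"
    if code == errno.ECONNREFUSED:
        return "refused"
    return "error: %s" % errno.errorcode.get(code, code)
```

Running print(probe_port("localhost")) on the affected node (the host name
here is only a placeholder) should say "refused" if nothing is accepting
connections on that port, which would match the traceback above.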
I can provide remote access to whoever needs it; I just need their public
ssh key.
Thanks,
Bryan