[rabbitmq-discuss] Outage with 3-node RabbitMQ 3.1.3 Cluster
Matt Wise
matt at nextdoor.com
Wed Nov 6 16:23:16 GMT 2013
Sure .. but I'm wondering if there is native syslog support in RabbitMQ.
It's much cleaner that way.
Matt Wise
Sr. Systems Architect
Nextdoor.com
On Wed, Nov 6, 2013 at 8:18 AM, Frank Shearar <frank.shearar at gmail.com> wrote:
> On 6 November 2013 16:02, Matt Wise <matt at nextdoor.com> wrote:
> > See comments inline.
> >
> >
> > On Wed, Nov 6, 2013 at 2:37 AM, Tim Watson <tim at rabbitmq.com> wrote:
> >>
> >> Hi Matt,
> >>
> >> Sorry to hear you've been running into problems.
> >>
> >> On 5 Nov 2013, at 22:05, Matt Wise wrote:
> >>
> >> > (sorry if this gets posted twice.. first email never seemed to make it
> >> > to the list)
> >> >
> >> > Hey... I had a pretty rough time today with a 3-node RabbitMQ 3.1.3
> >> > cluster that's under pretty heavy use (6-7 million messages per day -- 100MB
> >> > peak bandwidth per node). I want to pose a few questions here. First off,
> >> > here's the basic configuration though.
> >> >
> >> > Configuration:
> >> > serverA, serverB and serverC are all configured with RabbitMQ 3.1.3.
> >> > They are each configured via Puppet ... and Puppet uses a dynamic node
> >> > discovery plugin (ZooKeeper) to find the nodes. The node lists are
> >> > hard-coded into the rabbitmq.config file. A dynamic server-list generator
> >> > supplies Puppet with this list of servers (and is not really necessary to
> >> > describe here in this email).
> >> >
> >> > Scenario:
> >> > A momentary configuration blip caused serverA and serverB to begin
> >> > reconfiguring their rabbitmq.config files... when they did this, they also
> >> > both issued a 'service rabbitmq restart' command. This command took
> >> > 40+ minutes and ultimately failed. During this failure, RabbitMQ was
> >> > technically running and accepting connections on the TCP ports, but it
> >> > would not actually answer any queries. Commands like list_queues would hang
> >> > indefinitely.
> >> >
> >>
> >> What HA recovery policy (if any) do you have set up? A and B might get a
> >> different "view of the world" set up in their respective rabbitmq.config
> >> files (either to each other and/or to C) and then get restarted, but this
> >> shouldn't affect their view of the cluster, because as per
> >> http://www.rabbitmq.com/clustering.html:
> >>
> >> "Note that the cluster configuration is applied only to fresh nodes. A
> >> fresh node is a node which has just been reset or is being started for the
> >> first time. Thus, the automatic clustering won't take place after restarts
> >> of nodes. This means that any change to the clustering via rabbitmqctl will
> >> take precedence over the automatic clustering configuration."
> >>
> >
> > So far we've taken the approach that clustering configuration should be
> > hard-coded into the rabbitmq.config files. This works well in explicitly
> > defining all of the hosts in a cluster on every machine, but it also means
> > that adding a 4th node to a 3-node cluster will cause the 3 running live
> > nodes to do a full service restart, which is bad. Our rabbitmq.config, though,
> > is identical on all of the machines (other than the server list, which may
> > have been in flux when Puppet was restarting these services)
> >
> >> [
> >>  {rabbit, [
> >>    {log_levels, [{connection, warning}]},
> >>    {cluster_partition_handling, pause_minority},
> >>    {tcp_listeners, [ 5672 ] },
> >>    {ssl_listeners, [ 5673 ] },
> >>    {ssl_options, [{cacertfile, "/etc/rabbitmq/ssl/cacert.pem"},
> >>                   {certfile,   "/etc/rabbitmq/ssl/cert.pem"},
> >>                   {keyfile,    "/etc/rabbitmq/ssl/key.pem"},
> >>                   {verify, verify_peer},
> >>                   {fail_if_no_peer_cert, true}
> >>                  ]},
> >>    {cluster_nodes, ['rabbit@i-23cf477b', 'rabbit@i-07d8bc5f',
> >>                     'rabbit@i-a3291cf8']}
> >>  ]}
> >> ].
> >
> >
> >>
> >> > Questions:
> >> > 1. We only had ~2500 messages in the queues (they are HA'd and
> >> > durable). The policy is { 'ha-mode': 'all' }. When serverA and serverB
> >> > restarted, why did they never come up? Unfortunately, in the restart process,
> >> > they blew away their log files as well, which makes this really tough to
> >> > troubleshoot.
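An aside on the { 'ha-mode': 'all' } policy mentioned above: in RabbitMQ 3.x, mirroring policies live outside rabbitmq.config and are applied with rabbitmqctl. A minimal sketch; the policy name "ha-all" and the default vhost "/" are assumptions, not taken from this thread:

```shell
# Sketch: mirror every queue to all nodes in the cluster.
# Policy name and vhost are illustrative assumptions.
rabbitmqctl set_policy -p / ha-all "" '{"ha-mode":"all"}'
```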
> >>
> >> It's nigh on impossible to guess what might've gone wrong without any log
> >> files to verify against. We could sit and stare at all the relevant code for
> >> weeks and not spot a bug that's been triggered here, since if it were
> >> obvious we would've fixed it already.
> >>
> >> If you can give us a very precise set of steps (and timings) that led to
> >> this situation, I can try and replicate what you've seen, but I don't fancy
> >> my chances, to be honest.
> >
> >
> > It's a tough one for us to reproduce.. but I think the closest steps would
> > be:
> >
> > 1. Create a 3-node cluster, configured with a config similar to the one I
> > pasted above.
> > 2. Create enough publishers and subscribers that you have a few hundred
> > messages/sec going through the three machines.
> > 3. On MachineA and MachineC, remove MachineB from the config file.
> > 4. Restart MachineA's rabbitmq daemon using the init script.
> > 5. Wait 3 minutes... theoretically #4 is still in progress. Now issue the
> > same restart to MachineC.
> >
> > Fail.
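The steps above as a shell sketch, for anyone trying to replicate. Hostnames, paths, and the init-script name are assumptions; this is destructive and is meant to mirror the repro, not to be run against a live cluster as-is:

```shell
# Repro sketch for the outage described above. machineA/B/C and the
# init script name are assumptions. Do NOT run on a live cluster.

# Step 3: on machineA and machineC, remove machineB from the
# cluster_nodes list in /etc/rabbitmq/rabbitmq.config (by hand or
# via Puppet, as in the original setup).

# Step 4: on machineA, restart via the init script (this is the call
# that reportedly took 40+ minutes and ultimately failed):
sudo service rabbitmq-server restart &

# Step 5: wait ~3 minutes while step 4 is theoretically still in
# progress, then issue the same restart on machineC:
sleep 180
sudo service rabbitmq-server restart
```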
> >
> > That's our best guess right now.. but agreed, the logs are a problem. Can we
> > configure RabbitMQ to log through syslog for the future?
>
> Syslog-ng can tail logs, dumping them to some arbitrary
> destination (another file, Papertrail, etc.)
>
> frank
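For anyone finding this thread later: to my knowledge RabbitMQ 3.1.x has no native syslog output (the server writes plain log files), so tailing the log as Frank suggests is the usual workaround. A minimal syslog-ng sketch; the log path and the destination host are assumptions:

```
# Sketch: tail the RabbitMQ server log and forward it to a remote
# syslog collector. File path and "logs.example.com" are assumptions.
source s_rabbit {
  file("/var/log/rabbitmq/rabbit@HOSTNAME.log"
       follow-freq(1) flags(no-parse));
};
destination d_remote {
  syslog("logs.example.com" transport("tcp") port(514));
};
log { source(s_rabbit); destination(d_remote); };
```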
>
> >> > 2. I know that restarting serverA and serverB at nearly the same time
> >> > is obviously a bad idea -- we'll be implementing some changes so this
> >> > doesn't happen again -- but could this have led to data corruption?
> >>
> >> It's possible, though obviously that shouldn't really happen. How close
> >> were the restarts to one another? How many HA queues were mirrored across
> >> these nodes, and were they all very busy (as your previous comment about
> >> load seems to suggest)? We could try replicating that scenario in our tests,
> >> though it's not easy to get the timing right, and obviously the existence of
> >> network infrastructure on which the nodes are running won't be the same (and
> >> that can make a surprisingly big difference IME).
> >
> >
> > The restarts were within a few minutes of each other. There are 5 queues,
> > and all 5 queues are set to mirror to 'all' nodes in the cluster. They were
> > busy, but no more than maybe 100 messages/sec coming in/out.
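One plausible contributing factor, given the cluster_partition_handling setting of pause_minority in the config earlier in the thread: with three nodes, two restarting near-simultaneously can leave the surviving node seeing only a strict minority of the cluster, at which point it pauses itself as well, which could be consistent with the "listening but not answering" symptom. A toy model of the rule as documented, not RabbitMQ's actual implementation:

```python
# Toy model of RabbitMQ's pause_minority partition-handling rule:
# a node pauses itself when the partition it can see (including
# itself) contains half the cluster or fewer nodes, i.e. it is not
# in a strict majority.
def pauses(visible_nodes: int, cluster_size: int) -> bool:
    """True if a node seeing `visible_nodes` of `cluster_size` pauses."""
    return visible_nodes * 2 <= cluster_size

# In a 3-node cluster, a node that can only see itself pauses:
assert pauses(1, 3) is True
# A node that can see one peer (a 2-of-3 majority) keeps running:
assert pauses(2, 3) is False
```

Note that an exact half (e.g. 2 of 4 nodes) also pauses under this rule, which is one reason odd-sized clusters are preferred with pause_minority.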
> >
> >>
> >>
> >> > Once the entire RabbitMQ farm was shut down, we actually were forced to
> >> > move the rabbitmq data directory out of the way and start up the farm
> >> > completely with blank databases. It seemed that RabbitMQ 3.1.3 really did
> >> > not want to recover from this failure. Any thoughts?
> >> >
> >> > 3. Lastly .. in the event of future failures, what tools are there for
> >> > recovering our Mnesia databases? Is there any way we can dump out the data
> >> > into some raw form, and then import it back into a new fresh cluster?
> >> >
> >>
> >> I'm afraid there are not, at least not "off the shelf" ones anyway. If you
> >> are desperate to recover important production data, however, I'm sure we
> >> could explore the possibility of trying to help with that somehow. Let me
> >> know and I'll make some enquiries at this end.
> >
> >
> > At this point we can move on from the data loss... but it does make for an
> > interesting issue. Having tools to analyze the Mnesia DB and get "most of"
> > the messages out in some format where they could be re-injected into a fresh
> > cluster would be an incredibly useful tool. I wonder how hard it is to do?
> >
> >>
> >> Cheers,
> >> Tim
> >>
> >>
> >> _______________________________________________
> >> rabbitmq-discuss mailing list
> >> rabbitmq-discuss at lists.rabbitmq.com
> >> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
> >
> >
> >
>