[rabbitmq-discuss] Feature Req / Bug list

Thu Oct 24 19:34:54 BST 2013

"single node with ~35k active open filehandles"

I assume all those file handles point to the same content? If that is
the case, the programmers should go back to school and learn from scratch.

Where does this come from?
On 03.10.2013 23:04, "Graeme N" <graeme at sudo.ca> wrote:

> Hey everyone,
>
> I've recently been doing a deployment of a 5 node rabbit cluster, and
> found some rough edges I thought I should share. I realize many of these
> are feature reqs, but I'm hoping that I just haven't discovered the proper
> configuration to deal with some of these issues, or have misunderstood
> Rabbit's behaviour. If not, hopefully they can become feature req items
> that'll make a dev schedule at some point.
>
> All items below were discovered while deploying 3.1.5 over the past few
> days. Hosts in question have 24 sandy bridge HT cores, 64GB of RAM, XFS
> filesystem, running on CentOS 6. Cluster is 5 nodes, with a default HA
> policy on all queues of exact/3/automatic-sync.
>
> HA / Clustering:
>
> - expected queues to be distributed evenly among cluster machines, instead
> got all queues on first 3 machines in the cluster, nothing on the last 2.
> - expected message reads from a mirror machine for a queue to do the read
> i/o locally, so as to spread out workload, but it appears to always go to
> the host where the queue was created.
> - this led to a single node with ~35k active open filehandles, and 4 nodes
> with ~90. not an optimum distribution of read workload.
> - expected that if the system a queue was created on is permanently removed
> (shut down and "rabbitmqctl forget_cluster_node hostname"'d), automatic
> sync would ensure there's the right number of copies replicated, but
> instead it just left every single queue under-replicated.
> - when a new policy is applied that defines specific replication nodes, or
> a number of copies using 'exact', and auto-sync is set, sometimes it just
> syncs the first replica and leaves any others unsynced and calls it job
> done. This is bad.
> - had to add a new global HA policy and delete the existing one before
> rabbit fixed my queue replication.
> - Attempted to create small per-queue policies to redistribute messages
> and then delete the per-queue policies, but this often leads to an
> inconsistent cluster state where queues continued to show as being part of
> a policy that was already deleted, attempt to resync, and get stuck, unable
> to complete or switch back to the global default policy.
> - sometimes the cluster refuses to accept any more policy commands. Have
> to fully shut down and restart the cluster to clear this condition.
> - sometimes policies applied to empty and inactive queues don't get
> correctly applied, and the queue hangs on "resyncing / 100%". this makes no
> sense, given the queue is empty, and requires a full cluster restart to
> clear.
> - would like to see a tool to redistribute queues amongst available
> cluster machines according to HA policy. Ideally something that happens
> automatically on queue creation, cluster membership and policy changes, but
> would take something manual I could run out of cron.
> - I've managed to get the cluster into an inconsistent state a /lot/ using
> the HA features, so it feels like they need more automated stress testing
> and bulletproofing.
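
To make the "evenly distributed" expectation above concrete, here is a
minimal sketch of the bookkeeping such a redistribution tool might do. It
is only an illustration under stated assumptions: it does not talk to
RabbitMQ at all, and the function names (masters_per_node,
plan_even_distribution) and the node and queue names are hypothetical. It
counts how many queues sit on each node, then proposes a placement that
rotates the first node for each queue.

```python
from collections import Counter


def masters_per_node(queue_to_node, nodes):
    """Count queues per node, keeping nodes that currently hold none."""
    counts = Counter({node: 0 for node in nodes})
    counts.update(queue_to_node.values())
    return dict(counts)


def plan_even_distribution(queues, nodes, copies=3):
    """Give each queue `copies` distinct nodes, rotating the starting node.

    Hypothetical planner: the first node in each list stands for the node
    the queue would live on, the rest for its mirrors.
    """
    if copies > len(nodes):
        raise ValueError("copies cannot exceed the number of nodes")
    plan = {}
    for i, queue in enumerate(sorted(queues)):
        plan[queue] = [nodes[(i + k) % len(nodes)] for k in range(copies)]
    return plan


# Usage: a skewed cluster like the one described, then a proposed plan.
nodes = ["n1", "n2", "n3", "n4", "n5"]
current = {"q1": "n1", "q2": "n1", "q3": "n2"}
print(masters_per_node(current, nodes))
print(plan_even_distribution(current.keys(), nodes))
```

Applying such a plan would still have to go through whatever policy
mechanism the broker offers; the sketch covers only the arithmetic.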
>
> Persistent message storage:
>
> - it appears as if messages are put into very small batch files on the
> filesystem (1-20 MB)
> - this causes the filesystem to thrash if your IO isn't good at random IO
> (SATA disks) and you have lots of persistent messages (>200k messages
> 500kB-1MB in size) that don't fit in RAM.
> - this caused the CentOS 6 kernel to kill Erlang after stalling the XFS
> filesystem for > 120s.
> - if a node crashes, Rabbit seems to rescan the entire on-disk datastore
> before continuing, instead of using some sort of checkpointing or
> journaling system to quickly recover from a crash.
> - all of the above should be solvable by using an existing append-only
> datastore like eLevelDB or Bitcask.
> - we solved for now by using SSDs, but this bumps up the cost of each RMQ
> node, and doesn't solve the node crash recovery problem, just speeds up the
> process somewhat.
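
For readers unfamiliar with the checkpointing idea mentioned above, the
following is a generic sketch of an append-only log with a saved offset.
It is not how RabbitMQ's message store, eLevelDB or Bitcask work
internally; the record format (a 4-byte length prefix) and the function
names are assumptions made purely for illustration. The point is that
recovery only has to scan records written after the last checkpoint, and
that a half-written trailing record can be detected and skipped.

```python
import io
import struct

HEADER = struct.Struct(">I")  # assumed format: 4-byte big-endian length prefix


def append_record(log, payload):
    """Append one length-prefixed record; return the offset just past it."""
    log.seek(0, io.SEEK_END)
    log.write(HEADER.pack(len(payload)))
    log.write(payload)
    return log.tell()


def recover(log, checkpoint_offset=0):
    """Read complete records starting at checkpoint_offset.

    Returns (records, new_offset). A truncated trailing record, such as a
    crash in mid-write might leave behind, is ignored.
    """
    log.seek(checkpoint_offset)
    records = []
    offset = checkpoint_offset
    while True:
        header = log.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of log, or a torn header
        (length,) = HEADER.unpack(header)
        payload = log.read(length)
        if len(payload) < length:
            break  # torn payload: stop at the last complete record
        records.append(payload)
        offset = log.tell()
    return records, offset


# Usage with an in-memory file standing in for a file on disk.
log = io.BytesIO()
checkpoint = append_record(log, b"first")
append_record(log, b"second")
records, _ = recover(log, checkpoint)  # scans only past the checkpoint
```

A real store would also need to persist the checkpoint itself durably
and flush writes to disk, which this sketch leaves out.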
>
> Web API:
> - API seems to block when cluster is busy, even for informational GETs, so
> you can't determine what's going on with the cluster.
> - Some API operations seem to block until they complete (like putting a
> new policy), while others return immediately even though they're definitely
> not completed yet (like deleting a policy). It's not documented which have
> which behaviour, or why they don't just all block until op is completed.
>
> Hopefully you guys can educate me on what I'm doing wrong in some of these
> scenarios, or how to mitigate some of these issues. Any issue that requires
> taking down and restarting the cluster to fix is especially troubling.
>
> Thanks,
> Graeme