[rabbitmq-discuss] RabbitMQ production setup questions around clustering

Thu Jul 22 14:01:01 BST 2010

I sent this message yesterday directly to Aaron and tried to send it to the list as well, but sent the wrong message again... Heh.

Off-topic: Why isn't the reply-to address set to the mailing list address?

On Wed Jul 21st, 2010 5:44 PM EDT Dave Greggory wrote:

>Thanks for the reply, Aaron. I look forward to reading your blog post.
>
>Your postal service analogy is a good one, and that is mostly how I understood it.
>From what I understood, if a client requests a message from a queue that is not 
>on the node he has the current connection to, because of clustering, you will 
>get the message back indirectly. But your analogy makes it sound like the client 
>will actually need to establish a second connection to the node that has the 
>queue. I'm fine with that, that's not a big deal. 
>
>
>I declare my (durable) queues from both the producer and consumer ends because 
>exchanges would just drop the messages if we left it up to consumers and no 
>consumers were up at the time a producer sends a message. But I'm trying to 
>understand where a queue will be created when it's declared (will it be on the 
>node the client has the connection open to, or could it end up existing on 
>another node)? 
>
>
>I may have misused the term load-balancer in this context (just because that's 
>what we most often use that piece of hardware for). Let's just call it a
>node-switcher for this purpose. I intended it to serve two purposes:
>a) Hide the hostnames/ip-addresses of rabbitmq nodes from producers and clients
>b) Serve as a switch for determining which rabbitmq nodes consumers/producers
>   connect to, so that we can:
>     i. Switch all new consumer/producer connections automatically to the
>        other node if the first node goes down
>     ii. Upgrade rabbitmq nodes without affecting producer/consumer nodes,
>         because we can switch out different rabbitmq nodes
>
>If rabbitmq goes down on 1 node and the node-switcher switches out that node, 
>I'm okay with queues on the downed node not being available. But would a new 
>queue automatically be created on the switched-in node without having to be 
>declared? I'm hoping that because it was part of the cluster before the first 
>node went down, it knew that a certain queue existed on the other node and since 
>the first node is no longer available, it will take over receiving messages for 
>that queue from producers? If that's not true, how do people handle that 
>scenario?
>
>Thanks for the heads up on not using redirects.
>
>Dave
>
>
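
For readers following the thread: the "declare from both ends" pattern
described above can be sketched roughly as below. This is only an
illustration under assumptions, not the setup actually used in this thread.
It assumes the Python pika client and a broker on localhost; the queue name
"work" and the helper names are hypothetical. Declaring a queue again with
identical arguments is a no-op, which is why both producers and consumers
can call the same helper. The sketch does not settle which node ends up
hosting the queue.

```python
QUEUE = "work"  # hypothetical queue name


def ensure_queue(channel, name=QUEUE):
    """Declare the durable queue; safe to call from producers and consumers."""
    channel.queue_declare(queue=name, durable=True)


def publish(channel, body, properties=None, name=QUEUE):
    """Declare the queue first so a message is not dropped when no consumer is up."""
    ensure_queue(channel, name)
    # The default exchange ("") routes on the queue name.
    channel.basic_publish(
        exchange="", routing_key=name, body=body, properties=properties
    )


def main():
    # pika is an assumed client library; it is imported here so the helpers
    # above can be used without it.
    import pika

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host="localhost")
    )
    channel = connection.channel()
    # delivery_mode=2 marks the message as persistent.
    publish(channel, b"hello", pika.BasicProperties(delivery_mode=2))
    connection.close()
```

Calling main() needs a running broker; a consumer would call
ensure_queue(channel) on its own channel before it starts consuming.
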
>----- Original Message ----
>From: Aaron Westendorf <aaron at agoragames.com>
>To: Dave Greggory <davegreggory at yahoo.com>
>Cc: rabbitmq-discuss at lists.rabbitmq.com
>Sent: Wed, July 21, 2010 5:02:46 PM
>Subject: Re: [rabbitmq-discuss] RabbitMQ production setup questions around clustering
>
>Dave,
>
>I intend to write up a blog post on our cluster setup which will
>hopefully address some of your questions.  You're trying to set up
>your cluster much the way we originally did, but it's not quite the
>way Rabbit or AMQP operate.
>
>Think of it as the postal service, where exchanges are your
>street-corner deposit boxes and office tellers, and the postal service
>is the cluster itself.  Your address is your routing scheme, and that
>routing key will deliver a message to your postal box.  The box/queue
>is the ultimate endpoint, it lives in a discrete location, and you
>have to go to your local post office and consume the mail out of it.
>
>Your load balancer will be of little use for consumers, as your
>consumers must be connecting to the node on which your queues reside.
>Load balancing can be used for your producers though, in cases where
>they are easily separated.
>
>I've heard redirect is removed from later specifications.  Regardless,
>I recommend turning the "insist" flag on and avoiding redirects
>altogether, because they're inconsistent with the way AMQP clustering
>works.
>
>I hope that gives you a good start.
>
>-Aaron
>
>
>On Wed, Jul 21, 2010 at 4:43 PM, Dave Greggory <davegreggory at yahoo.com> wrote:
>> Hi,
>>
>> We are setting up a RabbitMQ cluster as our message broker for servicing our
>> guaranteed delivery needs. We will have a local data store on our producers
>> for guaranteeing delivery, but we don't want to resort to that except as a
>> last ditch effort in cases of catastrophic failures. I would like to better
>> understand how the cluster setup in RabbitMQ works.
>>
>> Our setup:
>> 2 rabbitmq nodes clustered together sitting behind a hardware load balancer.
>> Upon initial release, our message volume will not be that high, but it will
>> grow real fast once we offload more work to the message broker from our
>> current non-messaging based infrastructure. Since the initial volume is not
>> very high, we do not intend to use the load balancer for actual load
>> balancing but to always send connections to a specific rabbitmq node in the
>> cluster. If RabbitMQ does not respond or the port is not open, it will
>> automatically switch to the second node for new connections.
>>
>> Questions:
>> 1. Are there any case studies for setting up clustering in RabbitMQ?
>> 2. When a queue is declared (by one of our producers), is the queue always
>> created on the rabbitmq node the producer client has a connection to or is the
>> queue created on a randomly selected node within the cluster?
>> 3. If the queue is durable and the messages sent to it are marked persistent,
>> will these messages always be persisted to disk and be available after a
>> restart of the node that has that queue, regardless of whether the node is a
>> disk node or RAM node? (This line "Should you do this, and suffer a power
>> failure to the entire cluster, the entire state of the cluster, including
>> all messages, will be lost." in http://www.rabbitmq.com/clustering.html is
>> confusing)
>> 4. Should I configure both my nodes as disk nodes or will one disk node be
>> sufficient? In other words, if only 1 disk node was there in the cluster and
>> its hard drive went bust, what can I recover from the RAM node? If nothing
>> can be recovered from the RAM node, is it mainly for increasing the number of
>> connections without taking any hits to disk throughput?
>> 5. Are connection redirects actually supported by the current version? The
>> FAQ and Clustering documents on the site contradict each other. (FAQ says
>> "Future releases will support live failover using, for instance, a
>> combination of the "known hosts" field in connection.open-ok and the
>> connection.redirect message.")
>> 6. If redirects are supported, when all connections are being sent to a
>> specific rabbitmq node in the cluster by the loadbalancer, will that rabbitmq
>> node still send a redirect request to the client if it's getting too taxed?
>> If so, then is it possible to limit redirects to only the disk nodes within
>> the cluster so that we don't lose any data?
>>
>> 7. Will the connections be redirected solely based on a RabbitMQ node's
>> ability to serve them or is it more round-robin?
>> 8. Is there a way to ensure that the cluster configuration is correct because
>> 'disc/RAM' is not reported correctly by rabbitmqctl (as per
>> http://old.nabble.com/Rabbitmq-v.-1.8.1---Bug-report---Could-not-start-a-node-as-RAM-node-in-cluster-ts29211668.html)?
>>
>>
>> Thanks so much,
>> Dave
>
>
>
>-- 
>Aaron Westendorf
>Senior Software Engineer
>Agora Games
>359 Broadway
>Troy, NY 12180
>Phone: 518.268.1000
>aaron at agoragames.com
>www.agoragames.com
>
>
>
>
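
The node-switcher behaviour described in the thread (always prefer one node,
and send new connections to the second when the first does not respond or
its port is not open) could be approximated on the client side roughly as
follows. This is a minimal sketch under assumptions: the hostnames are
hypothetical, 5672 is the standard AMQP port, and pick_node and
tcp_port_open are illustrative names rather than part of any RabbitMQ
client. It only chooses where new connections go; it does not move or
recreate queues that lived on the node that went down.

```python
import socket

# Hypothetical node list, in order of preference.
NODES = [
    ("rabbit1.example.internal", 5672),
    ("rabbit2.example.internal", 5672),
]


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def pick_node(nodes, probe=tcp_port_open):
    """Return the first (host, port) that passes the probe, or None if none do."""
    for host, port in nodes:
        if probe(host, port):
            return (host, port)
    return None
```

A client would call pick_node(NODES) before opening each new connection. The
probe argument is injectable so the selection logic can be exercised without
a network, for example pick_node(NODES, probe=lambda host, port: False).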