[rabbitmq-discuss] Fwd: rabbit MQ high-through put stock quotes

Fri Jun 5 15:09:14 BST 2009

Oops, I meant to send this to the ML. Forwarding...

cr

Begin forwarded message:

> From: Chuck Remes <cremes.devlist at mac.com>
> Date: June 4, 2009 4:16:17 PM CDT
> To: smittycb10 <msmith1638 at gmail.com>
> Subject: Re: [rabbitmq-discuss] rabbit MQ high-through put stock  
> quotes
>
>
> On Jun 4, 2009, at 2:48 PM, smittycb10 wrote:
>
>>
>> This is a follow up to my previous message with some more details  
>> about how I
>> envision using Rabbit MQ.
>>
>> Currently I have a system where a server takes in all messages from  
>> our data
>> provider and then streams the messages via UDP to a consumer  
>> application.
>> This setup can handle the message load but is proving to be
>> difficult and costly to scale: one producer feeds one consuming
>> server, which then filters data to desktop clients.
>>
>> My idea for a future system is to have the same server simply push  
>> the
>> messages to a rabbit MQ Broker or Brokers and then have all clients  
>> simply
>> listen to whatever queues they need to get their jobs done. I need  
>> to design
>> for 100,000 (one hundred thousand) messages to be made available for  
>> consuming
>> per second. Most consumers will only consume a fraction of these  
>> messages,
>> although some of our server pieces may need to subscribe to the whole
>> universe of incoming messages. Messages can and will be batched on  
>> the
>> publishing side.
>>
>> So the use case would be: one publisher pushing messages to the
>> "Broker Cloud", and many consumers, some needing to listen to all
>> messages while others would only be interested in a particular
>> stock symbol at a time.
>>
>> I will run some fanout experiments next; any advice or direction
>> you have would be greatly appreciated. In a fanout pattern, if, say,
>> a quote comes in for IBM, how do I distribute it only to people
>> interested in IBM?
>
> Let me take a crack at this. Be forewarned that everything I write  
> here might be wrong. :)
>
> We have 3 kinds of exchanges to choose from: direct, fanout and topic.
>
> In terms of performance, they behave in the order listed above from  
> fastest to slowest.
>
> For direct exchanges, you bind your queue to it with a specific  
> routing key. Any messages published to that exchange need to include  
> a routing key. If the message's routing key *exactly* matches your  
> binding's routing key, the message is delivered to your queue.
>
> For fanout exchanges, you bind a queue to it without any routing
> key. Any messages published to it are *unconditionally* delivered to
> every bound queue. The reason I didn't mark this as the fastest
> exchange type is because you generally only pick it when you have
> lots of bound queues, so the message needs to be copied to each. (Of
> course, this is implementation specific and could be very fast or
> slow; I don't know how Rabbit performs this operation.) If you only
> have one subscriber, this will be faster than the direct exchange
> because you skip the routing key comparison.
>
> For topic exchanges, you bind a queue to the exchange with a routing  
> pattern. Any messages published with a key that matches your pattern  
> are delivered to the queue.
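>
To make the three matching rules concrete, here is a rough sketch in
plain Python. It is not RabbitMQ code and says nothing about how Rabbit
implements routing; the function names and the "quote.nyse.IBM" style
keys are made up for illustration. The topic rules follow the AMQP
spec: keys are dot-separated words, "*" matches exactly one word and
"#" matches zero or more words.

```python
def direct_matches(binding_key, routing_key):
    """Direct exchange: deliver only on an exact key match."""
    return binding_key == routing_key


def fanout_matches(binding_key, routing_key):
    """Fanout exchange: the routing key is ignored entirely."""
    return True


def topic_matches(pattern, routing_key):
    """Topic exchange: compare dot-separated words, honouring * and #."""
    def match(p, k):
        if not p:
            # Pattern exhausted: succeed only if the key is exhausted too.
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words of the key.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            # '*' matches exactly one word; otherwise words must be equal.
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))
```

With these, a binding of "quote.*" accepts "quote.IBM" but not
"quote.nyse.IBM", while "quote.#" accepts both.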
>
> With that out of the way, I would suggest a fanout exchange per high- 
> volume symbol. Any interested subscribers can bind a uniquely-named  
> queue to that exchange and get all messages for IBM. Note I said you  
> need a uniquely-named queue per subscriber. If you use a single  
> queue for all subscribers, the subscribers will get messages in a  
> round-robin fashion (usually). So N subscribers would only see an  
> update every N messages which is probably not what you want.
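>
A toy model of the difference, again in plain Python rather than
against a real broker. It assumes strict round-robin on the shared
queue, which (as noted) is only the usual behaviour; the function and
subscriber names are invented for the example.

```python
from itertools import cycle


def shared_queue(messages, subscribers):
    """One queue, many consumers: each message goes to exactly one of them."""
    seen = {name: [] for name in subscribers}
    turn = cycle(subscribers)
    for msg in messages:
        seen[next(turn)].append(msg)
    return seen


def queue_per_subscriber(messages, subscribers):
    """One uniquely named queue per consumer on a fanout exchange:
    every queue receives its own copy of every message."""
    return {name: list(messages) for name in subscribers}
```

With three subscribers and six quotes, shared_queue hands each
subscriber two of the six, while queue_per_subscriber hands each
subscriber all six.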
>
> Alternately, you could use a direct exchange for the low-volume  
> stocks. Interested subscribers could bind their (again) uniquely- 
> named queue with a routing key matching the symbol (or some other  
> identifier). Rabbit is smart enough that it will only deliver  
> messages to queues where the keys match exactly. The reason I  
> suggest this is due to potential performance concerns on the server  
> side if it has to handle hundreds or thousands of unique exchanges.  
> I don't know if Rabbit suffers from any performance degradation in  
> this situation, so make sure to benchmark it.
>
> Additionally, you'll want to publish messages as "not persistent,"
> consume them with "noack," use "nowait" where your library offers
> it, and make sure the queues are not "durable." See the API docs for
> your chosen library to see how to set those flags. "Nowait" means
> the client should not wait for the server's reply to a method such
> as a queue declare, bind or consume (a plain publish gets no reply
> anyway). "No ack" is set when consuming and means the message itself
> doesn't need to be acknowledged as received; the server will
> "pre-ack" each message pulled from the queue. And the "not
> persistent" setting interacts with the "durability" setting on the
> queue. Persistent messages (inside durable queues) are stored on
> disk so that a crash and restart doesn't lose any messages. For
> transient messages like stock quotes, you don't want them persisted
> at this level.
>
> Someone else on the list will have to answer your questions about  
> scaling up to 100k/sec. I think Erlang processes are still single- 
> threaded so you probably need to run multiple instances to take  
> advantage of multi-core servers. Perhaps this is Rabbit  
> clustering... I honestly don't know.
>
> FYI, tests on my hardware show that publishing from one process to  
> another on the same box using a direct exchange and the suggestions  
> listed above introduces around 600 microseconds in delivery latency.  
> Pay close attention to your serialization/deserialization code; its  
> performance could be a gating factor.
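>
As an illustration of the serialization point, a fixed-width binary
record packed with Python's struct module is one format that is cheap
to encode and decode, and it batches naturally. This is only a sketch:
the thread does not say what format is in use, and the field layout
(8-byte symbol, double price, 32-bit size) is hypothetical.

```python
import struct

# Hypothetical layout: 8-byte ASCII symbol, float64 price, uint32 size.
# "!" selects network byte order with no padding, so a record is 20 bytes.
QUOTE = struct.Struct("!8sdI")


def pack_batch(quotes):
    """Encode (symbol, price, size) tuples into a single message body."""
    return b"".join(
        QUOTE.pack(symbol.encode("ascii"), price, size)
        for symbol, price, size in quotes
    )


def unpack_batch(body):
    """Decode a body produced by pack_batch back into a list of tuples."""
    return [
        (raw.rstrip(b"\x00").decode("ascii"), price, size)
        for raw, price, size in QUOTE.iter_unpack(body)
    ]
```

Timing pack_batch and unpack_batch against whatever format you use
today (with timeit, for example) is a quick way to see whether
serialization is the bottleneck before blaming the broker.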
>
> Let us know how things turn out.
>
> cr
>
