[rabbitmq-discuss] Fwd: rabbit MQ high-through put stock quotes
Chuck Remes
cremes.devlist at mac.com
Fri Jun 5 15:09:14 BST 2009
Oops, I meant to send this to the ML. Forwarding...
cr
Begin forwarded message:
> From: Chuck Remes <cremes.devlist at mac.com>
> Date: June 4, 2009 4:16:17 PM CDT
> To: smittycb10 <msmith1638 at gmail.com>
> Subject: Re: [rabbitmq-discuss] rabbit MQ high-through put stock
> quotes
>
>
> On Jun 4, 2009, at 2:48 PM, smittycb10 wrote:
>
>>
>> This is a follow up to my previous message with some more details
>> about how I
>> envision using Rabbit MQ.
>>
>> Currently I have a system where a server takes in all messages from
>> our data
>> provider and then streams the messages via UDP to a consumer
>> application.
>> This set up can handle the message load but is proving to be
>> difficult and
>> costly to scale. One producer to one Consuming server which then
>> filters
>> data to desktop clients.
>>
>> My idea for a future system is to have the same server simply push
>> the
>> messages to a rabbit MQ Broker or Brokers and then have all clients
>> simply
>> listen to whatever queues they need to get their jobs done. I need
>> to design
>> for 100,000 (one hundred thousand) messages to be made available for
>> consuming
>> per second. Most consumers will only consume a fraction of these
>> messages,
>> although some of our server pieces may need to subscribe to the whole
>> universe of incoming messages. Messages can and will be batched on
>> the
>> publishing side.
>>
>> So the use case would be, one publisher pushing messages to the
>> "Broker
>> Cloud" many consumers some needing to listen to all messages others
>> would
>> only be interested in a particular Stock symbol at a time.
>>
>> I will run some fanout experiments next; any advice or
>> direction
>> you have would be greatly appreciated. In a fanout pattern if say a
>> quote
>> comes in for IBM how do I distribute only to people interested in
>> IBM?
>
> Let me take a crack at this. Be forewarned that everything I write
> here might be wrong. :)
>
> We have 3 kinds of exchanges to choose from: direct, fanout and topic.
>
> In terms of performance, they behave in the order listed above from
> fastest to slowest.
>
> For direct exchanges, you bind your queue to it with a specific
> routing key. Any messages published to that exchange need to include
> a routing key. If the message's routing key *exactly* matches your
> binding's routing key, the message is delivered to your queue.
>
> For fanout exchanges, you bind a queue to it without any routing
> key. Any messages published to it are *unconditionally* delivered to
> every bound queue. The reason I didn't mark this as the fastest
> exchange type
> is because you generally only pick it when you have lots of bound
> queues so the message needs to be copied to each. (Of course, this
> is implementation specific and could be very fast or slow; I don't
> know how Rabbit performs this operation.) If you only have one
> subscriber, this will be faster than the direct exchange because you
> skip the routing key comparison.
>
> For topic exchanges, you bind a queue to the exchange with a routing
> pattern. Any messages published with a key that matches your pattern
> are delivered to the queue.
>
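To make the three exchange types concrete, here is a minimal pure-Python sketch of the routing rules described above. It does not talk to a broker. The helper names `route` and `topic_matches`, the queue names and the `quote.<symbol>` key layout are all made up for illustration, and the `*` / `#` wildcard rules are the standard AMQP topic conventions rather than anything stated in this thread.

```python
def _match(pattern_words, key_words):
    """Recursively match AMQP topic pattern words against routing key words."""
    if not pattern_words:
        return not key_words
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "#":
        # '#' stands for zero or more words
        return any(_match(rest, key_words[i:]) for i in range(len(key_words) + 1))
    if not key_words:
        return False
    if head == "*" or head == key_words[0]:
        # '*' stands for exactly one word
        return _match(rest, key_words[1:])
    return False


def topic_matches(pattern, routing_key):
    return _match(pattern.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """Return the queue names that would receive a message.

    bindings is a list of (queue_name, binding_key) pairs.
    """
    if exchange_type == "fanout":
        # binding key is ignored, every bound queue gets a copy
        return [q for q, _ in bindings]
    if exchange_type == "direct":
        # exact string comparison
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unknown exchange type: {exchange_type}")


bindings = [("alice", "quote.IBM"), ("bob", "quote.*"), ("carol", "quote.MSFT")]
for kind in ("direct", "fanout", "topic"):
    print(kind, route(kind, bindings, "quote.IBM"))
```

The per-message work grows from nothing (fanout) to a string comparison (direct) to pattern matching (topic), which is one plausible reason for the performance ordering given above; a real broker may well optimise these differently.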
> With that out of the way, I would suggest a fanout exchange per high-
> volume symbol. Any interested subscribers can bind a uniquely-named
> queue to that exchange and get all messages for IBM. Note I said you
> need a uniquely-named queue per subscriber. If you use a single
> queue for all subscribers, each message goes to only one of them, in
> round-robin fashion (usually). So each of N subscribers would see only
> about one update in every N, which is probably not what you want.
>
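The difference between one shared queue and one queue per subscriber can be illustrated with a small simulation. This is only a sketch under the assumption of strict round-robin dispatch (the text above says "usually"); the function names and the sample ticks are hypothetical.

```python
from itertools import cycle


def shared_queue(messages, consumers):
    """One queue, several consumers: each message goes to exactly one of them."""
    received = {c: [] for c in consumers}
    for consumer, msg in zip(cycle(consumers), messages):
        received[consumer].append(msg)
    return received


def queue_per_subscriber(messages, consumers):
    """Fanout exchange with a uniquely named queue per consumer: everyone gets a copy."""
    return {c: list(messages) for c in consumers}


ticks = [f"IBM {100 + i}" for i in range(6)]
subscribers = ["a", "b", "c"]

print(shared_queue(ticks, subscribers))
print(queue_per_subscriber(ticks, subscribers))
```

With three subscribers on the shared queue, each one sees only every third tick, while the per-subscriber layout gives all six ticks to everyone.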
> Alternately, you could use a direct exchange for the low-volume
> stocks. Interested subscribers could bind their (again) uniquely-
> named queue with a routing key matching the symbol (or some other
> identifier). Rabbit is smart enough that it will only deliver
> messages to queues where the keys match exactly. The reason I
> suggest this is due to potential performance concerns on the server
> side if it has to handle hundreds or thousands of unique exchanges.
> I don't know if Rabbit suffers from any performance degradation in
> this situation, so make sure to benchmark it.
>
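On the publishing side, the hybrid layout suggested above (a fanout exchange per high-volume symbol, one shared direct exchange for everything else) comes down to a small lookup. The sketch below assumes a naming scheme of `quotes.<SYMBOL>` and `quotes.direct`, which is invented here for illustration; which symbols count as high-volume would have to come from your own measurements.

```python
def exchange_for(symbol, high_volume_symbols):
    """Return (exchange_name, exchange_type, routing_key) for a quote on `symbol`."""
    if symbol in high_volume_symbols:
        # dedicated fanout exchange: the routing key is ignored
        return (f"quotes.{symbol}", "fanout", "")
    # shared direct exchange: subscribers bind with the symbol as routing key
    return ("quotes.direct", "direct", symbol)


high_volume = {"IBM", "MSFT"}
for sym in ("IBM", "ZZZZ"):
    print(sym, exchange_for(sym, high_volume))
```

Subscribers would apply the same rule to decide which exchange to bind their uniquely named queue to.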
> Additionally, you'll want to publish messages as "not persistent",
> consume them with "noack", and make sure the queues are not
> "durable." See the API docs for your chosen library to see how to
> set those flags. "No ack" means a delivered message doesn't need to
> be acknowledged by the consumer; the server will "pre-ack" each
> message pulled from the queue. "Nowait", where your library offers
> it, means the client should not wait for the server's response to a
> method such as declare, bind or consume. And the "not persistent"
> setting interacts with the "durability" setting on the queue.
> Persistent messages (inside durable queues) are stored on disk so
> that a crash and restart doesn't lose any messages. For transient
> messages like stock quotes, you don't want them persisted at this
> level.
>
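The interaction between message persistence and queue durability can be written down as a tiny truth table. This is a model of the rule, not broker code: `survives_restart` is a hypothetical name, and the constants are the AMQP delivery-mode values (1 for non-persistent, 2 for persistent).

```python
NON_PERSISTENT = 1  # AMQP delivery-mode: message may be kept in memory only
PERSISTENT = 2      # AMQP delivery-mode: message should be written to disk


def survives_restart(queue_durable, delivery_mode):
    """A queued message outlives a broker restart only if the queue is
    durable AND the message was published as persistent."""
    return queue_durable and delivery_mode == PERSISTENT


for durable in (False, True):
    for mode in (NON_PERSISTENT, PERSISTENT):
        print(f"durable={durable!s:5} delivery_mode={mode} "
              f"survives={survives_restart(durable, mode)}")
```

For stock quotes the row to aim for is the first one: a non-durable queue with non-persistent messages, so nothing touches the disk.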
> Someone else on the list will have to answer your questions about
> scaling up to 100k/sec. Each Erlang process is single-threaded, but
> an SMP-enabled Erlang VM schedules its many processes across cores,
> so one broker can use a multi-core server. A single queue is still
> one process, though, so spreading load over several queues (or
> several clustered nodes) may help. I honestly don't know how far
> that scales.
>
> FYI, tests on my hardware show that publishing from one process to
> another on the same box using a direct exchange and the suggestions
> listed above introduces around 600 microseconds in delivery latency.
> Pay close attention to your serialization/deserialization code; its
> performance could be a gating factor.
>
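Since serialization cost can rival a sub-millisecond delivery latency, it is worth timing it in isolation before blaming the broker. The sketch below compares a JSON encoding with a fixed-width binary layout using only the standard library. The quote fields (symbol, price, size) and the 8-byte symbol width are assumptions made for the example, not a format from this thread.

```python
import json
import struct
import timeit

# network byte order: 8-byte symbol, 8-byte double price, 4-byte unsigned size
QUOTE = struct.Struct("!8sdI")


def encode_json(symbol, price, size):
    return json.dumps({"s": symbol, "p": price, "z": size}).encode("ascii")


def decode_json(payload):
    d = json.loads(payload)
    return d["s"], d["p"], d["z"]


def encode_binary(symbol, price, size):
    # struct pads the symbol with NUL bytes up to 8 bytes
    return QUOTE.pack(symbol.encode("ascii"), price, size)


def decode_binary(payload):
    raw, price, size = QUOTE.unpack(payload)
    return raw.rstrip(b"\0").decode("ascii"), price, size


ROUNDS = 20000
for name, enc, dec in (("json", encode_json, decode_json),
                       ("binary", encode_binary, decode_binary)):
    secs = timeit.timeit(lambda: dec(enc("IBM", 101.25, 300)), number=ROUNDS)
    print(f"{name}: {secs / ROUNDS * 1e6:.2f} microseconds per round trip, "
          f"{len(enc('IBM', 101.25, 300))} bytes on the wire")
```

The absolute numbers depend entirely on your hardware and interpreter, so treat them as a relative comparison only.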
> Let us know how things turn out.
>
> cr
>