[rabbitmq-discuss] fastest way to process messages?

Mon Jul 12 11:57:04 BST 2010

On 09/07/10 21:43, Jon Brisbin wrote:
> I pushed a rough draft/alpha version of the asynchronous distributed
> cache I blogged about last week to GitHub. It works pretty well for a
> rough draft. I am noticing some pretty large variances in my metrics
> from one run to another, though. I'm throwing 25 messages at a time at
> an object with 25 workers waiting for those objects (so there's no
> waiting for objects to be processed). I'm seeing run times as fast as
> 10ms and as slow as 150-200ms. It really varies from one run to the next.

Hi Jon. First of all, I would suggest using a profiler to check whether
the bottlenecks are in the server or in your client, and where exactly
within it. But I guess you already knew that :)

> I'm sure there are bottlenecks in my code. I'm starting to investigate
> that right now. But I was wondering if there was some consensus on
> maximizing throughput (using the Java client). Since this is a cache,
> I'm wanting to keep load times to an absolute minimum (duh). In several
> runs, I've gotten down to 10-15 ms but I can't get it to do that
> consistently.
>
> Will I be able to process more messages if each of my workers has its
> own queue, but binds to the exchange using a common key--or should I do
> what I'm currently doing, which is to use the QueueingConsumer and
> simply use multiple workers to pull messages off the single
> BlockingQueue? Which has the potential for higher throughput while not
> increasing the likelihood I'll end up with duplicate messages?

If each worker has its own queue, each worker will get a copy of all
messages (or a subset, depending on how you do routing), so the work is
duplicated rather than shared. It sounds like this is not what you want.
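
As a rough illustration of the difference, here is a sketch in plain
Java that models the routing in memory. It does not use the RabbitMQ
client API; RoutingSketch and its method names are made up for this
example, and round-robin is assumed as the way one queue's messages are
spread over its consumers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of routing; not the RabbitMQ client API.
class RoutingSketch {

    // One queue per worker, all bound with a common key:
    // every queue receives its own copy of every message.
    static List<List<String>> copyToEachQueue(int queueCount, List<String> messages) {
        List<List<String>> queues = new ArrayList<>();
        for (int q = 0; q < queueCount; q++) {
            queues.add(new ArrayList<>(messages));
        }
        return queues;
    }

    // Several workers consuming from ONE queue: each message is handed
    // to exactly one worker (round-robin is an assumption of this sketch).
    static List<List<String>> shareOneQueue(int consumerCount, List<String> messages) {
        List<List<String>> received = new ArrayList<>();
        for (int c = 0; c < consumerCount; c++) {
            received.add(new ArrayList<>());
        }
        for (int i = 0; i < messages.size(); i++) {
            received.get(i % consumerCount).add(messages.get(i));
        }
        return received;
    }
}
```

With three workers and four messages, copyToEachQueue gives every
worker all four messages, while shareOneQueue splits the four messages
among the workers with no duplicates.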

> Is there a better way to process messages if performance is the primary
> consideration than using the QueueingConsumer? I haven't looked at
> the code to see what it's doing, but I would think fewer method calls
> between delivery of the message and calling the callback provided by the
> application code would give me greater throughput and shorter run times.

QueueingConsumer implements the Consumer interface and stores messages
inside a LinkedBlockingQueue. I'm not sure how performant that is,
but it shouldn't be too bad.

If you determine that's where the problems lie, you can implement
Consumer yourself. Make sure your handleDelivery() returns very
quickly: that method is invoked by the Connection main loop, so the
Connection will block while it runs (this is the motivation for the
existence of QueueingConsumer).
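
A minimal sketch of that hand-off pattern, in plain Java so it runs
without a broker. HandOffConsumerSketch is a hypothetical stand-in, not
the client library's Consumer interface: its handleDelivery() takes only
a byte[] body, where the real method has more parameters. The point is
that the delivery thread only enqueues, and a pool of workers does the
slow part.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a custom consumer; not the RabbitMQ client API.
class HandOffConsumerSketch {
    // Sentinel telling a worker to stop (an assumption of this sketch).
    private static final byte[] POISON = new byte[0];

    private final BlockingQueue<byte[]> inbox = new LinkedBlockingQueue<>();
    private final AtomicInteger processed = new AtomicInteger();
    private final ExecutorService workers;
    private final int workerCount;

    HandOffConsumerSketch(int workerCount) {
        this.workerCount = workerCount;
        this.workers = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            workers.submit(this::drain);
        }
    }

    // Runs on the (simulated) connection thread: enqueue and return at once.
    void handleDelivery(byte[] body) {
        inbox.add(body);
    }

    // Runs on a worker thread: take messages until the sentinel arrives.
    private void drain() {
        try {
            while (true) {
                byte[] body = inbox.take();
                if (body == POISON) {
                    return;
                }
                process(body);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Placeholder for the application's slow work.
    private void process(byte[] body) {
        processed.incrementAndGet();
    }

    // Stops the workers after the queued messages and returns the count processed.
    int shutdownAndCount() {
        for (int i = 0; i < workerCount; i++) {
            inbox.add(POISON);
        }
        workers.shutdown();
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```

Whether this beats QueueingConsumer is something only a profiler can
tell you; the structure is much the same, so the gain, if any, would
come from what you do inside process().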

> I should probably also investigate alternative languages. I started this
> as a Java object cache, so naturally I used Java. But I'm wondering if I
> could get better performance by skipping the serialize/deserialize step
> in the cache provider (I'm storing actual objects in memory rather than
> the byte array, which I considered doing at first).

I don't see why Java shouldn't be fast enough, but I don't know how 
optimised serialisation is.
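
One way to find out is to time it. The sketch below assumes the cache
uses standard Java serialisation (ObjectOutputStream and
ObjectInputStream); the class name SerializationTimingSketch and the
iteration counts are made up for this example. It is a crude
measurement, not a proper benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper for timing a serialise/deserialise round trip.
class SerializationTimingSketch {

    static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Average time in nanoseconds for one serialise + deserialise round trip.
    static long averageRoundTripNanos(Serializable value, int iterations) {
        // Warm-up so the JIT has a chance to compile the hot path first.
        for (int i = 0; i < iterations; i++) {
            deserialize(serialize(value));
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            deserialize(serialize(value));
        }
        return (System.nanoTime() - start) / iterations;
    }
}
```

Calling averageRoundTripNanos() with one of your real cached objects
and a few thousand iterations should show whether that step is a
meaningful share of the 10-15 ms you are seeing.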

Cheers, Simon

-- 
Simon MacMullen
Staff Engineer, RabbitMQ
SpringSource, a division of VMware