[rabbitmq-discuss] one thread is consistently slower than another

jonathan.newbrough at gyregroup.com
Fri Mar 22 18:39:31 GMT 2013


I'm seeing really odd application performance and hoping for good ideas.

My application starts lots of threads that each subscribe to a queue.  When 
they receive a message, they persist to an NFS store, then publish a new 
message back to rabbit.  As the application runs, new "pathways" of data 
are created -- additional sources of data, additional queues, additional 
threads to persist the data.  Each thread reports the time taken to process 
each message (averaged over 100 messages).
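
For illustration, the per-thread timing could be broken down by stage so that persist time and publish time are reported separately. This is only a minimal sketch, not our actual code: `BatchTimer`, the stage names, and the injectable `clock` parameter are hypothetical; only the 100-message averaging window comes from the description above.

```python
import time
from collections import defaultdict


class BatchTimer:
    """Accumulate per-stage wall-clock time and report averages per batch.

    Hypothetical helper: one instance per consumer thread.
    """

    def __init__(self, batch_size=100, clock=time.monotonic):
        self.batch_size = batch_size
        self.clock = clock
        self.totals = defaultdict(float)  # stage name -> seconds in this batch
        self.count = 0                    # messages seen in this batch
        self.reports = []                 # one dict of averages per full batch

    def time_stage(self, stage, func, *args, **kwargs):
        """Run func and add its duration to the running total for `stage`."""
        start = self.clock()
        try:
            return func(*args, **kwargs)
        finally:
            self.totals[stage] += self.clock() - start

    def message_done(self):
        """Call once per message; returns averages every batch_size messages."""
        self.count += 1
        if self.count < self.batch_size:
            return None
        averages = {s: t / self.batch_size for s, t in self.totals.items()}
        self.reports.append(averages)
        self.totals.clear()
        self.count = 0
        return averages
```

In a message callback this would wrap the two steps, e.g. `timer.time_stage("persist", write_to_nfs, body)` followed by `timer.time_stage("publish", publish_result, body)` and then `timer.message_done()`, where `write_to_nfs` and `publish_result` stand in for whatever the application really calls. Separate averages would show whether the constant offset between threads comes from the NFS write or from the publish.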

I expected a distribution of time to handle each message, with the average 
gradually increasing with increasing load.  When I look at a graph of all 
these data points (average time to process message versus clock time) it 
looks much as I would expect.  You can see an example here 
<https://confluence.oceanobservatories.org/display/CIDev/Scale+Test+Results+Mar21> 
(second graph on page), and the same pattern has been reproduced each time I 
put load on the system.

But then when I single out individual threads, they follow very distinct 
paths on this graph.  For example, if I compare two threads A and B, I see 
that one thread may always take 0.5sec longer to process messages.  The 
processing times of both threads rise and fall in parallel, but one thread 
is consistently slower than the other!  It seems that the earlier a thread 
is started within a VM, the faster it can publish.  But as the 
cloud-based system grows and starts new VMs running the same application, 
new threads on a new VM perform as fast as the first threads on an 
earlier VM (so performance seems to degrade with how long the VM has been 
running, not with system or broker load).
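
One way to test the start-order observation more rigorously would be to group the reported averages by thread within a single VM and order the threads by start time. The following is a minimal sketch under assumptions: the `(thread_id, thread_started_at, avg_latency_sec)` sample layout and both function names are hypothetical, not part of the application.

```python
from collections import defaultdict
from statistics import mean


def latency_by_start_order(samples):
    """Summarize samples from ONE VM, oldest thread first.

    samples: iterable of (thread_id, thread_started_at, avg_latency_sec).
    Returns a list of (thread_id, thread_started_at, mean_latency_sec).
    """
    started = {}
    latencies = defaultdict(list)
    for thread_id, started_at, latency in samples:
        started.setdefault(thread_id, started_at)
        latencies[thread_id].append(latency)
    rows = [(tid, started[tid], mean(latencies[tid])) for tid in started]
    rows.sort(key=lambda row: row[1])  # order by thread start time
    return rows


def slows_with_start_order(rows):
    """True if mean latency never decreases from older to newer threads."""
    means = [row[2] for row in rows]
    return all(a <= b for a, b in zip(means, means[1:]))
```

If `slows_with_start_order` holds on every VM while the first threads on different VMs have similar means, that would support the idea that the slowdown is tied to something inside the application process rather than to the broker.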

The broker VM shows very low utilization (<10% CPU used on a 4-core system, 
only 25% of 2GB RAM used).  The application VM's CPU usage grows until it 
reaches about 65%, and its memory shows 35% of 4GB used.  Message rates are 
relatively low -- growing from 0 to ~150 msgs/sec with avg size ~500 bytes.

I wish I could give you a nice easy code snippet to reproduce, but it will 
take some effort to extract a faithful reproduction from our complex 
publisher/subscriber/data process OO wrappings.  I'm hoping that, from the 
description, someone can point me toward some possible explanations.

Many thanks for any suggestions or ideas!
Jonathan
