[rabbitmq-discuss] RMQ performance between high MHz/low core vs lower MHz/high core servers

Chris Schmidt cischmidt77 at gmail.com
Fri Jan 4 23:08:04 GMT 2013


Thanks for the pointers, Matthias. I wasn't able to make many changes
before the holidays, but I've done a few things since the last email.

I've updated to RMQ 3.0.1 and Erlang R15B03. I noticed a slight
improvement in performance, but not much. Increasing the number of queues
also didn't seem to make a large impact. One thing I'm noticing while
running in HiPE mode is the amount of system (kernel) CPU utilization on
the server: system time is at 85-90% while user processes are only at 8%.
I went back to the stock Erlang configuration and just changed the +swt
setting. Setting it to very_high caused a dip in performance, while
very_low was about the same as not modifying the setting.
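
For reference, this is roughly the sort of thing I'm doing to pass the
flag (a sketch of my setup; RABBITMQ_SERVER_ERL_ARGS replaces the packaged
defaults, so the usual flags have to be repeated -- check the
rabbitmq-server script on your install for the exact set):

    # Sketch: add +swt to the emulator flags. The "default" flags shown
    # here are assumptions; copy whatever your rabbitmq-server script uses.
    export RABBITMQ_SERVER_ERL_ARGS="+K true +A30 +P 1048576 +swt very_low"
    /usr/sbin/rabbitmq-server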

The server has 4 NUMA nodes configured (one for each 10-core CPU).
Interestingly, running against a single NUMA node by wrapping the start of
RMQ with numactl almost doubles performance: kernel utilization drops to
15% and user utilization stays about the same. As soon as I run RMQ across
more than one NUMA node, performance suffers. Is there anything within the
Erlang settings that would allow me to run RMQ across more cores but keep
the kernel utilization down? I'm guessing that schedulers are being spread
across NUMA nodes and passing data back and forth between them, causing
the kernel time to increase. If there is some way to keep schedulers that
work together localized to the same node, that would be ideal. I tried a
few +sbt options again, but they didn't seem to make a difference.
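
For reference, the numactl wrapping and the +sbt experiments look roughly
like this (a sketch; the node number and paths are from my box, and the
scheduler count and bind type shown are just examples, not necessarily the
exact ones I tried):

    # Sketch: pin the whole broker (CPU and memory) to NUMA node 0.
    numactl --cpunodebind=0 --membind=0 /usr/sbin/rabbitmq-server

    # Sketch: cap the scheduler count at one node's core count and pick a
    # bind type that, as I understand it, fills one NUMA node at a time
    # (nnts). The other flags are assumed defaults -- see the
    # rabbitmq-server script on your install.
    export RABBITMQ_SERVER_ERL_ARGS="+K true +A30 +P 1048576 +S 10 +sbt nnts"
    /usr/sbin/rabbitmq-server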

Thanks,

 Chris


On Fri, Dec 21, 2012 at 10:46 AM, Matthias Radestock
<matthias at rabbitmq.com> wrote:

> Chris,
>
>
> On 20/12/12 17:27, Chris Schmidt wrote:
>
>> RHEL 6.2 doesn't seem to have the latest version of Erlang available
>> in the repositories. Does anyone have experience with manually
>> building Erlang on that platform with success? We're on R14B04 right
>> now. There are a huge number of Erlang-related RPMs that get
>> installed; I'm not certain whether a single Erlang version for CentOS
>> would work properly or not.
>>
>
> You may want to give
> https://my.vmware.com/web/vmware/details?downloadGroup=VFEL_15B02&productId=267
> a try - it's R15B02 (so not quite the latest but close enough) with
> "batteries included", i.e. no dependencies.
>
>
>> I had +K true -smp enable +native +sbtps plus the other RMQ defaults
>>
>
> "+K true" is already part of the standard config. So is "-smp auto",
> which enables smp when there is more than one core. "+native" is a
> compile-time option and I would not recommend using it. As for +sbtps, I
> suggest you leave that out and stick with the default.
>
> As I said in my earlier email, playing with the +swt option may yield
> some benefits.
>
>
>> I also had 15 consumers running against the initial queue that was
>> having problems.
>>
>
> Increasing the number of consumers only helps when the consumers, rather
> than the queue, are the bottleneck.
>
>
>> I also tried sending the data into 2-40 exchanges (and updated the
>> queue to read from those exchanges) in order to see if I could spread
>> the load that way.
>>
>
> Exchanges are not represented by processes, so you don't gain anything
> by spreading the load over them. Increasing the number of producers can
> increase the message ingestion rate, but only if the messages get routed to
> different queues.
>
>
>> Does a single process manage each queue?
>>
>
> yes.
>
>
>> If so, I can see that being a potential bottleneck due to the core's
>> speed difference.
>>
>
> Exactly.
>
>
>> Is there an Erlang or rabbitmqctl command that I can run against the
>> RMQ server to determine if the queues themselves are the bottleneck?
>>
>
> If the queues are growing then the bottleneck is on the outbound side
> (rabbit's AMQP encoding, network, consumers).
>
> If the queues stay short/empty then the bottleneck is on the inbound side
> (producer, network, rabbit's AMQP decoding, routing, the queue). If
> connections get blocked due to flow control then the bottleneck is in the
> latter three. Narrowing it down any further is tricky.
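
A concrete way to watch for those two symptoms from the shell, sketched
with rabbitmqctl (the info items below are the standard ones, though the
exact set may vary slightly between versions):

    # Growing message counts => the bottleneck is on the outbound side.
    rabbitmqctl list_queues name messages messages_ready messages_unacknowledged consumers

    # Connections sitting in the 'flow' or 'blocked' state => the inbound
    # path is being throttled by flow control.
    rabbitmqctl list_connections name state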
>
> Regards,
>
> Matthias.
>