[rabbitmq-discuss] performance
Matthias Radestock
matthias at lshift.net
Mon Sep 1 13:01:57 BST 2008
Edwin,
Edwin Fine wrote:
>
>> Some testing we did in the past indicates that generally a clustered
>> broker - with one node per core and smp disabled for the Erlang VM -
>> performs significantly better than a single smp-enabled node.
>
>
> Now /that/ is very interesting. I have seen the same kind of thing in
> some ad-hoc (non-RabbitMQ related) experiments I did some time back
> (better performance from multiple VMs, each in single-cpu configuration,
> on a multi-core system) and thought it was my imagination because SMP
> was supposed to be the way to go. I haven't seen much discussion on this
> on the erlang-questions mailing lists, and quite frankly, I'm not going
> to start one without some solid, repeatable evidence. If you have seen
> this behavior, have you brought it up with the Erlang gurus, and if so,
> have they said anything enlightening about it?

I have mentioned our observations to a few folks, but, as you say, there
is no point in pursuing this further until we have solid, repeatable
evidence. Now, our results *are* repeatable, but they are all in the
context of RabbitMQ. To start a fruitful discussion on the Erlang list /
with the Erlang gurus we'd need to construct a simpler, standalone test
exhibiting the same behaviour. In the process we may well discover the
root cause of the problem ourselves.

Btw, one issue with performance testing of RabbitMQ is that it is really
difficult to measure the maximum throughput. RabbitMQ is a message
*queuing* system, and any test setup will have several message buffers
at various levels - the OS's network stack at the test client and
RabbitMQ server, various process message queues at the server and
buffers in the test client, and the queue processes at the server.
Optimum throughput is achieved when all these buffers contain just the
right amount of data so that the processing hanging off them never has
to wait for data and yet no data is buffered unnecessarily. There are
lots of tweakable parameters that affect buffering in the OS, the
Erlang/Java VM, and the client/server apps. Furthermore, due to JIT
compilation and variations in scheduling decisions (by the VMs and the
OS), the optimal settings shift over time.

As others have discovered, if a test just blasts messages at RabbitMQ,
the broker will likely start queuing up most of them, consume increasing
amounts of memory, and eventually grind to a halt. To get a sensible max
throughput measurement a more sophisticated approach is required that
controls and adapts the sending rate to the prevailing conditions.
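
To make the idea of an adaptive sending rate concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a proposed implementation: SimulatedBroker is a hypothetical stand-in that delivers a fixed number of messages per tick, and probe_max_throughput, with its additive-increase / multiplicative-decrease rule and all parameter values, is one possible control scheme chosen for simplicity. Nothing here talks to a real AMQP broker.

```python
from dataclasses import dataclass


@dataclass
class SimulatedBroker:
    """Hypothetical stand-in for a broker: delivers at most `capacity` messages per tick."""
    capacity: int
    backlog: int = 0

    def tick(self, sent: int) -> int:
        """Accept `sent` messages, deliver what capacity allows, return the delivered count."""
        self.backlog += sent
        delivered = min(self.backlog, self.capacity)
        self.backlog -= delivered
        return delivered


def probe_max_throughput(broker, ticks=500, start_rate=10.0, step=10.0,
                         backoff=0.9, max_backlog=50):
    """Raise the send rate by `step` while the backlog stays small and scale it
    by `backoff` once the backlog exceeds `max_backlog`.

    Returns (mean delivered rate over the second half of the run, peak backlog).
    """
    rate = start_rate
    peak_backlog = 0
    delivered_log = []
    for _ in range(ticks):
        delivered_log.append(broker.tick(int(rate)))
        peak_backlog = max(peak_backlog, broker.backlog)
        if broker.backlog > max_backlog:
            rate = max(1.0, rate * backoff)  # broker is falling behind: slow down
        else:
            rate += step                     # broker is keeping up: push harder
    steady = delivered_log[ticks // 2:]      # ignore the ramp-up phase
    return sum(steady) / len(steady), peak_backlog


if __name__ == "__main__":
    estimate, peak = probe_max_throughput(SimulatedBroker(capacity=1000))
    print(f"estimated max throughput: {estimate:.0f} msgs/tick, peak backlog: {peak}")
```

In a real test the tick call would presumably be replaced by publishing a batch and counting what the consumer received during the interval, with the backlog taken as sent minus received. Because the sender backs off whenever the backlog grows, the estimate sits slightly below the true capacity, but the queues stay short instead of growing without bound.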

We'd love to get the help of the community to put together a really
simple "run this and it will report the maximum throughput" test
program. Initially this can be for just the simplest (and fastest)
routing scenario - single producer, single consumer (running in the same
OS process if that is convenient), single queue, direct exchange,
auto-ack basic.consume.

Note that this test app would work against all AMQP brokers, not just
RabbitMQ, so it could be used for performance comparisons.

Any takers?

Matthias.