[rabbitmq-discuss] rabbitmq performance

Sat Dec 1 10:26:46 GMT 2007

Hi everyone,

Recently we did some load testing of RabbitMQ, working with Intel.
Their press release is reported here:

http://www.intelfasterfs.com/trading/articles/071128-intellowlatency.aspx

The use case was a simulated OPRA data feed using a combination of:

- Pantor FAST client (essentially a codec) combined with an AMQP
client written in C (yes, we are hoping to get this into the community)
- RabbitMQ AMQP broker version 1.2 on the server

Please don't read too much into the latency numbers here: the timed
path included two network hops as well as message processing at the
broker; also, somewhat annoyingly, the numbers are averaged over
multiple scenarios.  We wanted to look at throughput because OPRA
feeds are heading to 1 million ticks per second and it's a good load
testing case.

We shall publish more info soon but the numbers are as follows:

1. Ingress of about 1.3 million OPRA messages per second
2. Replicated out to four clients at once (unicast pub/sub not multicast)
3. So simultaneously, egress of about 5 million OPRA messages per second
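
To make the unicast fan-out concrete, here is a minimal in-memory sketch.
It is not RabbitMQ code: `FanoutBroker` and its methods are names made up
for illustration.  The broker copies each published message once per
subscriber, which is why egress comes out at roughly four times ingress.

```python
from collections import deque


class FanoutBroker:
    """Toy stand-in for a broker doing unicast pub/sub.

    Hypothetical class for illustration only; not the RabbitMQ API.
    """

    def __init__(self):
        self.queues = []   # one queue per subscriber
        self.ingress = 0   # messages received from publishers
        self.egress = 0    # message copies handed to subscribers

    def subscribe(self):
        """Register a new subscriber and return its private queue."""
        q = deque()
        self.queues.append(q)
        return q

    def publish(self, message):
        """Accept one message and copy it to every subscriber queue."""
        self.ingress += 1
        for q in self.queues:
            q.append(message)
            self.egress += 1


broker = FanoutBroker()
subscribers = [broker.subscribe() for _ in range(4)]
for i in range(1000):
    broker.publish(i)

print(broker.ingress, broker.egress)  # prints: 1000 4000
```

With multicast the network would do the copying instead, so the broker
would send each message only once.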

The broker cluster was on one multicore box with 16 cores.  We ran over
a full TCP/IP stack on a standard 1GigE network, which was the bottleneck.

The setup was:

1 Client Box --> 1 Server Box --> 1 Client Box

We used Intel's 16 core Caneland box for the server and the FAST/AMQP
client was delivered by Pantor, working with us.

How come the numbers are so high?  Well, one reason is that we used
FAST, which is a codec.  Each OPRA message was FAST-compressed
and batched into a block or 'datagram' of 16 compressed OPRA messages.
This is gradually becoming normal practice in the world of market data
feeds because the loads are high and people do not have enough
bandwidth to cope.

So in our test, each datagram contained 16 OPRA messages and was
sent as one 256-byte AMQP message.
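
The batching step can be sketched roughly as follows.  This is only an
illustration under loud assumptions: zlib stands in for the FAST codec
(which works quite differently), the block is compressed as a whole for
simplicity, and the tick records and the `make_datagrams` helper are
invented for the example.

```python
import zlib

BATCH_SIZE = 16  # OPRA messages per datagram, as in the test


def make_datagrams(messages, batch_size=BATCH_SIZE):
    """Group messages into blocks of batch_size and compress each block.

    zlib is a placeholder only; the real test used FAST encoding.
    """
    datagrams = []
    for start in range(0, len(messages), batch_size):
        block = b"".join(messages[start:start + batch_size])
        datagrams.append(zlib.compress(block))
    return datagrams


# Hypothetical fixed-width tick records, invented for illustration.
ticks = [b"IBM   C 0110.25 0100" for _ in range(64)]
datagrams = make_datagrams(ticks)

# Each datagram would then be published as the body of one AMQP message.
print(len(ticks), "ticks ->", len(datagrams), "AMQP message bodies")
```

The broker then handles one message per 16 ticks, so its per-message
overhead is paid 16 times less often.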

So the throughput can also be seen as:

1. Ingress of 80,000 AMQP messages per second (256 bytes per message)
2. Replicated out to four clients at once (unicast pub/sub)
3. So simultaneously, egress of 320,000 AMQP messages per second (256
bytes per message)

I.e. the real load on the broker is about 400,000 AMQP messages per
second, ingress and egress combined.
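
For anyone who wants to check the arithmetic, here is a short sketch
using only the figures quoted above.  The bandwidth line counts payload
bytes only and ignores AMQP framing, TCP/IP and Ethernet overhead, so
treat it as a lower bound.

```python
OPRA_PER_DATAGRAM = 16         # OPRA messages per AMQP message
AMQP_MSG_BYTES = 256           # size of each AMQP message
INGRESS_AMQP_PER_SEC = 80_000  # AMQP messages per second into the broker
SUBSCRIBERS = 4                # unicast copies of each message

egress_amqp_per_sec = INGRESS_AMQP_PER_SEC * SUBSCRIBERS
total_amqp_per_sec = INGRESS_AMQP_PER_SEC + egress_amqp_per_sec

ingress_opra_per_sec = INGRESS_AMQP_PER_SEC * OPRA_PER_DATAGRAM
egress_opra_per_sec = egress_amqp_per_sec * OPRA_PER_DATAGRAM

# Payload only; protocol overhead is not included.
egress_mbit_per_sec = egress_amqp_per_sec * AMQP_MSG_BYTES * 8 / 1e6

print("AMQP msgs/s: in", INGRESS_AMQP_PER_SEC,
      "out", egress_amqp_per_sec, "total", total_amqp_per_sec)
print("OPRA msgs/s: in", ingress_opra_per_sec, "out", egress_opra_per_sec)
print("egress payload: %.2f Mbit/s" % egress_mbit_per_sec)
```

That works out to 1.28 million OPRA messages per second in, 5.12 million
out, and about 655 Mbit/s of egress payload.  If all four subscribers sit
behind a single 1GigE link (an assumption; it is not spelled out above),
that is already a large share of the wire before overhead, which fits
with the network being the bottleneck.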

There are several ways to get these numbers higher:

- tune RabbitMQ for speed
- use multicast
- use Infiniband
- use faster cores

We just did some more tests using Intel's 45nm cores which look
promising in this regard.

The point is: for most use cases you can get good performance using
COTS hardware.  This means you can spend your valuable project
investment dollars on making the user experience better instead of
messing about with deep tech.

We think scalability, stability and ease of use are more important than
raw speed.  If you try to run RabbitMQ and do not see what you
expect along any of these metrics, please let us know and we'll help you.

alexis

-- 
Alexis Richardson
+44 20 7617 7339 (UK)
+44 77 9865 2911 (cell)
+1 650 206 2517 (US)