[rabbitmq-discuss] [Platform Team] Preview Release of Haigha

Mon May 16 08:01:17 BST 2011

Hey Aaron, cool work. Using libevent is a nice approach. Given Pika's modular approach, I'll have to see what that looks like as an option in Pika v2. 

Given Jason's questions, I decided to port your bench to Pika and threw in py-amqplib (single channel only) to boot.

As I had expected, Haigha benched faster than Pika, but surprisingly only by 2.84% with 500 channels. It does, however, start to show its performance benefits on a single connection with multiple channels when the single socket is not over-saturated. It was 13.29% faster than Pika with 50 channels and 21.57% faster with 10 channels. 

I was somewhat surprised to find that Pika benched 13.08% faster with only 1 channel. So surprised that I had to triple check the numbers ;-) 
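
For anyone who wants to derive the same kind of percentages from raw message counts, here is a minimal sketch. It assumes "X% faster" means the relative difference in messages handled over the same test duration; that formula, the function name, and the sample counts are assumptions for illustration only, not the actual benchmark figures.

```python
def percent_faster(candidate_msgs, baseline_msgs):
    """Return how much faster the candidate is than the baseline, in percent.

    Both arguments are message counts over the same test duration.
    A negative result means the candidate was slower than the baseline.
    """
    if baseline_msgs <= 0:
        raise ValueError("baseline_msgs must be positive")
    return 100.0 * (candidate_msgs - baseline_msgs) / baseline_msgs


if __name__ == "__main__":
    # Made-up counts, purely to show the arithmetic.
    print(round(percent_faster(1200, 1000), 2))
    print(round(percent_faster(900, 1000), 2))
```

Note that the result depends on which library is treated as the baseline, so "A is 13% faster than B" is not the same statement as "B is 13% slower than A".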

I used the 500 number as the initial baseline because it is the default value in your test app. I am curious what the use case for so many channels is in the test. Is it to simulate concurrency in a client app?

Anyway here is the info on my tests:

Box:
Dual quad core AMD 2.2GHz (Model 2356)
8GB RAM
Max CPU utilization during all tests: 19%
Max IOWait during all tests: 0.2%
Gentoo Linux with a custom compiled 2.6.28 kernel

RabbitMQ:
2.4.1
Erlang R14B02
Default settings

Client libraries:
Haigha 0.2.1
Pika 0.9.6p0
py-amqplib 0.6.1

Test Parameters:
Duration: 300 seconds
Channel Count: 1, 10, 50, 500
No-Ack: True
Transactions: Off
Single threaded
Single Process

The 1 channel option was to accommodate py-amqplib and BlockingConnection. BlockingConnection can handle multiple channels, but the overhead for setting up 500 connections on BlockingAdapter in Pika was prohibitive: the test ended before all queues were bound, and a single full setup takes roughly 1.2 seconds in my environment. For the Pika/TornadoConnection test, I used Tornado 1.2.

A few important things were uncovered in Pika with this test:
There was an interesting recursion bug under heavy load in all 0.9.x versions. The bug would only present itself if there was always a frame to read or a frame to publish in the buffer. If the IOLoop ever had a cycle without inbound or outbound data, the bug would not present itself. But if there was always data to process for 1,000 frames in a row, there would be a RuntimeError, since 1,000 is the default Python recursion limit. This bug has been fixed.
BlockingConnection is very slow! It is roughly 1/8th the speed of py-amqplib. While Asyncore can do roughly 1,800 messages a second on a single channel, Blocking can only do roughly 4!
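
To make the recursion issue concrete, here is a toy sketch of the general pattern. This is not Pika's actual code; the ToyLoop class and its methods are hypothetical. It only shows how a handler that calls itself whenever more data is waiting runs into the interpreter's recursion limit, while a plain loop does not.

```python
import sys


class ToyLoop(object):
    """Hypothetical stand-in for an IO loop with a backlog of frames."""

    def __init__(self, pending_frames):
        self.pending = pending_frames
        self.processed = 0

    def process_recursive(self):
        # Handle one frame, then re-enter if more data is waiting.
        # Each waiting frame adds a stack frame, so an unbroken run of
        # data eventually exceeds sys.getrecursionlimit().
        if self.pending == 0:
            return
        self.pending -= 1
        self.processed += 1
        self.process_recursive()

    def process_iterative(self):
        # Same work, but the stack depth stays constant.
        while self.pending > 0:
            self.pending -= 1
            self.processed += 1


def demo():
    frames = sys.getrecursionlimit() + 100
    try:
        ToyLoop(frames).process_recursive()
        recursive_failed = False
    except RuntimeError:
        # On Python 3.5+ this is RecursionError, a RuntimeError subclass.
        recursive_failed = True
    loop = ToyLoop(frames)
    loop.process_iterative()
    return recursive_failed, loop.processed == frames


if __name__ == "__main__":
    print(demo())
```

If the backlog ever drains before the limit is reached, the recursive version returns normally, which matches why the bug only showed up when there was always data to process.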

I didn't make much of an effort to clean up the code or remove the parts I don't use, but the stress_test app hack for pika is at https://gist.github.com/974021

Haigha looks nice. You're doing some of the things I have been thinking about with future versions of Pika such as in your haigha/classes modules (More natural AMQP mapping of classes for use in the client). I look forward to seeing where you take it.

Regards,

Gavin 
On Saturday, May 14, 2011 at 9:40 PM, Jason J. W. Williams wrote:
> How does this differ from Pika's asyncore or Tornado bindings in terms of performance?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Benchmark.png
Type: image/png
Size: 44296 bytes
Desc: not available
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20110516/b5c28ee4/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Benchmark.csv
Type: application/octet-stream
Size: 1813 bytes
Desc: not available
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20110516/b5c28ee4/attachment-0001.obj>