[rabbitmq-discuss] RabbitMQ experience
matthias at lshift.net
Sat Jan 24 12:15:30 GMT 2009
[I am cross-posting this to the list with the permission of the OP]
>> I tried to benchmark both products [ActiveMQ and RabbitMQ] with
>> this initial test scenario:
>> - each test produces 100'000 messages from one of these sets
>> - +/-5KB, 50KB, 100KB, 200KB
>> - randomly into 5 queues
>> - test is executed with persistent messages (queues) and
>> then non-persistent
>> (total number of test types 4*2*number of iterations)
>> - consumer reads all the messages
>> - measure:
>> - time to enqueue all messages
>> - time to consume all messages
>> - run several iterations and get average numbers
>> Contestant            +/-5KB     fix 50KB   fix 100KB  fix 200KB
>> ActiveMQ/stomp (np)   1398/1250  140/138    68/69      34/34
>>         (persistent)  1247/967   127/125    66/66      34/34
>> RabbitMQ/stomp (np)   4475/360   362/33     175/17     85/8
>>         (persistent)  3920/371   375/35     180/18     88/8
>> values are "processed avg. messages per second" on producer/consumer.
>> Performance of the RabbitMQ producer is quite impressive, but the
>> consumer falls behind.
This is not surprising. The tests push the system to its limits. When
that happens, RabbitMQ (and the OS) must choose what work to do: drain
messages off the TCP socket, push them through the publication
pipeline, store them, push them through the delivery (consumer)
pipelines, hand them over to the network card, and so on. Add to that
the fact that each pipeline has buffers (where messages may end up
being stored temporarily), and that the stages require different
amounts of work.
The upshot is that when one pushes the system hard, it is impossible
to predict exactly how it will behave (within the bounds of behaving
correctly, which it does), and results can vary considerably between
tests. Chances are RabbitMQ will end up buffering messages at various
stages of the pipelines, since buffering costs very little compared to
the other tasks, so producer rates can exceed consumer rates. [As an
aside, your test is *not* measuring the "time to enqueue all
messages"; it is measuring the time it takes the client to hand over
all messages to the client's network stack. The messages are a long
way from being enqueued at that point.] And this happens regardless of
the number of consumers, because the requests from consumers, and the
messages due for delivery to them, are just more things that end up
getting buffered.
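The asymmetry is easy to see in a toy simulation: accepting a message
into a buffer is cheap, while completing delivery is expensive, so
under sustained overload the buffers (and memory use) grow without
bound. The per-tick rates below are invented for illustration, not
measured RabbitMQ figures:

```python
from collections import deque

# Toy model: producer hands over more messages per tick than the
# delivery pipeline can complete; the buffer absorbs the difference.
buffer = deque()
PRODUCE_PER_TICK = 100   # messages accepted from producers each tick
CONSUME_PER_TICK = 60    # messages the delivery pipeline completes

depth = []
for tick in range(10):
    for _ in range(PRODUCE_PER_TICK):
        buffer.append(tick)              # enqueue is O(1): looks "fast"
    for _ in range(min(CONSUME_PER_TICK, len(buffer))):
        buffer.popleft()                 # drain is bounded by capacity
    depth.append(len(buffer))

print(depth)  # each tick adds 100 and drains 60, so depth grows by 40
```

The producer sees a high "throughput" the whole time; only the buffer
depth reveals that the system is falling behind.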
This doesn't matter when the high load is temporary - RabbitMQ can cope
with that just fine - but if the load is sustained over long periods
of time then RabbitMQ can run out of memory. More on that below.
The question is whether long periods of producers injecting messages at
a rate that exceeds the broker's capacity is really something that
occurs in the use case you envisage. Most systems need to be able to
cope with some specified average throughput, plus peaks over short
periods of time; not sustained ingress rates that exceed the
throughput capacity.
One way to measure the average sustainable throughput is to control the
rate at which the producers inject messages. Start with a low rate and
increase it incrementally, giving the system some time to settle at each
step. Then measure the consumption rate. Stop at the point where the
consumption rate falls behind the production rate.
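Pacing the producer at a chosen rate is the key ingredient of that
measurement. A minimal token-bucket sketch follows; this is a generic
client-side helper I'm inventing for illustration, not part of
RabbitMQ or its client libraries:

```python
import time

class RateLimiter:
    """Token bucket pacing a producer to `rate` messages per second.

    A sketch for the ramp-up measurement described above: run the whole
    benchmark at one rate, then again at a higher one, and so on.
    """
    def __init__(self, rate, burst=1):
        self.rate = float(rate)        # sustained messages per second
        self.burst = float(burst)      # headroom for short spikes
        self.tokens = float(burst)
        self.last = time.monotonic()

    def acquire(self):
        # Refill from elapsed time, then either spend a token or
        # sleep until the next one is due.
        while True:
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return
            time.sleep((1.0 - self.tokens) / self.rate)
```

The producer calls `acquire()` before every publish; the sustainable
throughput is the highest rate at which the measured consumption rate
still keeps up.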
>> And this is the main reason for my disappointment -- memory
>> management of messages is too greedy and RabbitMQ easily goes down.
>> I know that you have experimental memory-based flow control, but it
>> only _tells_ producers to slow down, and furthermore it is not
>> propagated to STOMP clients. I think that this must be solved mainly
>> on the side of the middleware; it has to slow down clients like
>> ActiveMQ does.
As you say, we *do* have an experimental flow control feature that
throttles producers. However, as Alexis has already mentioned, this
doesn't work for our STOMP adapter, and, more generally, the STOMP
adapter is experimental and certainly not the way to go if maximum
performance is the objective.
As for the middleware *forcing* producers to slow down, there is only
one way to do that: TCP's flow control. Unfortunately that doesn't work
too well in AMQP because a single TCP connection can be shared by both
producers and consumers, and consuming messages generally involves the
client *sending* some data on the network connection (e.g. for
acknowledgements). Therefore putting back-pressure on these connections
will slow down both producers and consumers, and hence may not
alleviate the problem.
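That transport-level back-pressure is easy to see in miniature. Below,
a Unix socket pair stands in for the client-broker TCP connection; the
mechanics (full kernel buffers stop the sender) are the same:

```python
import socket

# When the receiving end stops reading, the kernel buffers fill and
# the sender can no longer send -- the only "forcing" flow control
# available at the transport layer, and it is indiscriminate.
a, b = socket.socketpair()          # stand-in for client <-> broker
a.setblocking(False)
a.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 4096)
b.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4096)

sent = 0
try:
    while True:
        sent += a.send(b"x" * 1024)  # receiver (b) never reads
except BlockingIOError:
    pass                             # kernel buffers full: sender stopped

print(f"back-pressure kicked in after {sent} bytes")
a.close(); b.close()
```

With AMQP, "the sender" on a stalled connection can be a consumer
trying to send an acknowledgement, which is exactly the problem
described above.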
By contrast, AMQP's channel.flow command (which is what our
experimental flow control implements) allows the server to ask clients
to stop publishing messages without affecting message consumption.
That does require cooperation from clients, but note that a) this is
fairly easy to implement and has been done in all the clients we
officially support, and b) the spec permits the server to cut off
clients that do not abide by the channel.flow instruction (though the
RabbitMQ server currently does not do this).
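The client-side cooperation amounts to very little code. Here is a
sketch of the idea, with `send_frame` and the callback wiring as
hypothetical stand-ins for a real client's transport layer:

```python
import threading

class FlowAwareProducer:
    """Sketch of a publisher that honours channel.flow.

    The broker's flow frames toggle an event; the publish path simply
    waits on it. Consumption (acks etc.) is unaffected because only
    publishing checks the flag.
    """
    def __init__(self, send_frame):
        self.send_frame = send_frame      # hypothetical transport hook
        self.active = threading.Event()
        self.active.set()                 # flow starts in the "on" state

    def on_channel_flow(self, active):
        # Invoked when the broker sends channel.flow(active=True/False).
        if active:
            self.active.set()
        else:
            self.active.clear()

    def publish(self, msg):
        self.active.wait()                # block while flow is off
        self.send_frame(msg)
```

A real client would wire `on_channel_flow` into its frame handler; the
point is only that throttling publishes need not touch the consuming
side of the connection.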
Finally, as Alexis mentioned, we are in the process of enabling RabbitMQ
to page messages to disk, thus eliminating the memory-bounded nature of
the current message queuing implementation. Note however that under high
load messages could still build up in memory in various buffers
throughout the system, as described above. So this new feature won't be
a substitute for flow control.