[rabbitmq-discuss] The *old* persister: call for help in experimenting

Tony Garnock-Jones tonyg at lshift.net
Mon Apr 5 19:39:48 BST 2010


Hi all,

The current persister (*not* Matthew's new persister) is very much
TSTTCPW ("the simplest thing that could possibly work"), and uses a few
overly simple heuristics to decide when to start a GC run. Some of them
are arbitrarily chosen, and I suspect that many readers of this list
*might* see a considerable improvement in persistent-message
performance by tweaking them. The tradeoff seems to be increased RAM
usage.

If you can help by reporting experiences with tweaked persister-control
variables, please do try out the suggestions in the rest of this message
and let us know what happens!


The main control variables, all of which are compiled (!) into the
code, are:

 -define(LOG_BUNDLE_DELAY, 5).
 -define(COMPLETE_BUNDLE_DELAY, 2).
 -define(HIBERNATE_AFTER, 10000).
 -define(MAX_WRAP_ENTRIES, 500).

The *_BUNDLE_DELAY variables give the number of milliseconds to wait,
in differing calling contexts, for additional work to arrive at the
persister before performing a bundled I/O to the journal.
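
In Erlang terms the effect is roughly the following; a minimal sketch
of the bundling idea, not the actual persister code:

 -module(bundle_sketch).
 -export([collect/2]).

 %% Once work has started arriving, wait up to Delay ms for more before
 %% doing a single journal write for the whole batch.
 collect(Pending, Delay) ->
     receive
         {work, W} ->
             collect([W | Pending], Delay)
     after Delay ->
         flush(lists:reverse(Pending))
     end.

 %% flush/1 stands in for the persister's real journal I/O.
 flush([])    -> ok;
 flush(Items) -> io:format("journal write of ~b entries~n", [length(Items)]).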

The HIBERNATE_AFTER variable is the number of milliseconds of persister
idling after which we choose to GC the journal.

The MAX_WRAP_ENTRIES variable is a completely arbitrary count of I/O
operations after which we choose to GC the journal.
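
Putting the two GC triggers together, the overall shape is roughly the
following (hypothetical, not the real persister loop):

 -module(gc_sketch).
 -export([loop/1]).

 -define(HIBERNATE_AFTER, 10000).
 -define(MAX_WRAP_ENTRIES, 500).

 %% GC the journal after MAX_WRAP_ENTRIES journal I/Os, or after
 %% HIBERNATE_AFTER ms with no work at all.
 loop(IoCount) when IoCount >= ?MAX_WRAP_ENTRIES ->
     gc_journal(),
     loop(0);
 loop(IoCount) ->
     receive
         {journal_io, Work} ->
             do_journal_io(Work),
             loop(IoCount + 1)
     after ?HIBERNATE_AFTER ->
         gc_journal(),
         loop(0)
     end.

 gc_journal() -> io:format("GC'ing the journal~n").
 do_journal_io(_Work) -> ok.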

So here's the issue: MAX_WRAP_ENTRIES really is pretty arbitrary, and I
think the value 500 is probably one or maybe even two orders of
magnitude too low.

To see why, imagine that we're submitting a constant stream of work to
the persister in a situation that results in it waiting for
COMPLETE_BUNDLE_DELAY (i.e. 2) milliseconds for more work before
performing (and counting!) a bundled I/O to the journal. Since we will
GC after MAX_WRAP_ENTRIES I/Os, we will GC every MAX_WRAP_ENTRIES *
COMPLETE_BUNDLE_DELAY = 1000 milliseconds.

A garbage collection of the whole message store once a second, at worst!
Ouch.
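
For comparison, the same arithmetic in the Erlang shell, for the
current value and for the tenfold increase I suggest below:

 1> 500 * 2.
 1000
 2> 5000 * 2.
 10000

That is, a worst case of one GC per second now, versus one every ten
seconds.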

If people are in a position such that

 - they use the persister heavily (durable queues, persistent messages)
 - they are submitting moderate load to Rabbit (> 100 messages/sec)
 - they are experiencing excessive persister log rollover
 - they run RabbitMQ built from source code

then I'd *love* to hear if changing MAX_WRAP_ENTRIES for your local
rabbit to something like 5000 makes a difference in terms of rollover
frequency and general rabbit performance for you.
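
Concretely, that means editing the define and rebuilding; a sketch,
assuming the define lives in src/rabbit_persister.erl in your checkout
(adjust if your tree differs):

 %% was:
 -define(MAX_WRAP_ENTRIES, 500).
 %% try instead:
 -define(MAX_WRAP_ENTRIES, 5000).

then rebuild and restart the broker as usual.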

Note that RAM requirements will rise a bit if you increase
MAX_WRAP_ENTRIES. I've just run some experiments on my MacBook. Using
ProducerMain/ConsumerMain to provide a 300 message-per-second
persistent load through a single durable queue, I notice:

|------------------+-------------------+--------------+----------+----------|
| MAX_WRAP_ENTRIES | rollover interval | max log size | real mem | virt mem |
|------------------+-------------------+--------------+----------+----------|
|              500 | 3.5 s             | 700 kB       | 13 MB    | 609 MB   |
|             5000 | 34 s              | 7.3 MB       | 19 MB    | 615 MB   |
|            50000 | ~5 min            | 72 MB        | ~60 MB   | ~660 MB  |
|------------------+-------------------+--------------+----------+----------|

(I'm not sure why Rabbit seems to use RAM in proportion to the size of
the journal.)

Changing MAX_WRAP_ENTRIES from 500 to 5000 gave a tenfold reduction in
rollover frequency, at a cost of about 6 MB of RAM, which seems pretty
reasonable to me.

Tony




