[rabbitmq-discuss] Priority Queue Implementation

Fri Nov 11 13:38:04 GMT 2011

Hi,

On Thu, Nov 10, 2011 at 01:14:21PM +0100, Alvaro Videla wrote:
> I just wanted to bring to your attention this priority queue implementation
> in Erlang that somehow compares itself against RabbitMQ and Riak
> implementations.
> 
> https://github.com/okeuday/pqueue/
> 
> More info in Russian Here:
> http://erlanger.ru//page/1886/ochered-s-prioritetom

In general, beware of large constant overheads. For simple use cases,
it's often the case that a less-efficient but simpler algorithm is
faster.

Our priority_queue is not intended to be a general purpose priority
queue that performs brilliantly for billions of items with uniformly
distributed random priorities. It's intended for use with a very small
number of priorities (we never use it with more than ten), and very
heavily skewed use to one particular priority (normally 0).

I've not analysed it, but my instinct is that a priority queue built
from a 2,3-fingertree will have better theoretical characteristics than
the pqueue2 linked above simply because it avoids the possibility of a
heavily skewed unbalanced tree forming. However, having implemented such
a thing myself (
http://hg.rabbitmq.com/erlang-data-structures/file/default/finger_tree/src/
) I know that in reality it doesn't perform brilliantly because it tends
to produce quite a lot of garbage and intermediate structures and whilst
its theoretical characteristics are vastly better than our
priority_queue.erl module, for our use case, it performs worse (not to
mention being the proverbial hammer to crack a nut).

I note from the wikipedia page for skew heaps:
"In contrast with binary heaps, there are no structural constraints, so
there is no guarantee that the height of the tree is logarithmic."

This suggests that pqueue2:in is not even guaranteed O(log_2(N)).
priority_queue:in will be O(M/2) where M is the number of priorities
(the actual insert to the sub-queue will be amortized O(1)). The
2,3-finger_tree as a priority queue is amortized O(1) for insert and
O(log_2(N)) for out. That said, except in rather extreme use cases, I'd
expect pqueue2 and priority_queue to beat it.

I'd be curious to see what the benchmark tests were that Michael Truog
used. I suspect they'd have been rather random and wide ranging in
nature.

Thanks for the heads up though - always interesting stuff.

Matthew