[rabbitmq-discuss] Unexpected Behavior When Using the "X-Consistent-Hash" Exchange Type

Thu Oct 17 00:13:54 BST 2013

An interesting update. In re-reading the x-consistent-hash exchange type
documentation, I came across the following passage:

"The more points in the hash space each binding has, the closer the actual
distribution will be to the desired distribution (as indicated by the ratio
of points by binding). However, large numbers of points (many thousands)
will substantially decrease performance of the exchange type."

It appears that my initial understanding of the integer used to bind queues
to the x-consistent-hash exchange as a "weight" was incorrect. Rather, it
represents the number of points along the overall hash continuum at which
that particular queue is represented, with the points being (seemingly)
pseudorandomly distributed across the overall hash space. So I updated the
binding on each of the queues from 1 to 100 and re-ran my tests, this time
with 50K messages (in the interest of expediency).
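
To illustrate the "points on a hash continuum" idea, here is a minimal
simulation sketch. It is not the plugin's actual implementation: the MD5-based
hash, the "queue#i" point labels, and the helper names (build_ring, route,
spread) are all assumptions made for illustration. It only shows why more
points per queue should tend to give a more even spread.

```python
import bisect
import hashlib
import uuid
from collections import Counter


def _hash(value: str) -> int:
    # Stand-in hash: first 4 bytes of MD5. The real exchange may hash differently.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def build_ring(queues, points):
    # Place `points` pseudorandom markers on the ring for every queue.
    return sorted((_hash(f"{q}#{i}"), q) for q in queues for i in range(points))


def route(ring, keys, routing_value):
    # A message goes to the queue owning the first marker at or after its hash,
    # wrapping around to the start of the ring if necessary.
    idx = bisect.bisect_left(keys, _hash(routing_value)) % len(ring)
    return ring[idx][1]


def spread(points, messages=50_000):
    # Count how many UUID-keyed messages each of 16 queues would receive.
    queues = [f"error{n:02d}" for n in range(1, 17)]
    ring = build_ring(queues, points)
    keys = [h for h, _ in ring]
    return Counter(route(ring, keys, str(uuid.uuid4())) for _ in range(messages))
```

Comparing spread(1) with spread(100) should show the per-queue counts moving
closer to the equal share of 3125, at the cost of a larger ring to search.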

The results are as follows:

**FIRST TEST (Using UUID as the Hash Header)**

Queue Name,Message Count,Percent of Total,Percent of Equality
error01,3505,7.01%,89.16%
error02,2919,5.84%,107.06%
error03,3418,6.84%,91.43%
error04,2912,5.82%,107.31%
error05,2945,5.89%,106.11%
error06,3146,6.29%,99.33%
error07,3057,6.11%,102.22%
error08,2868,5.74%,108.96%
error09,2968,5.94%,105.29%
error10,2866,5.73%,109.04%
error11,3513,7.03%,88.96%
error12,4077,8.15%,76.65%
error13,3054,6.11%,102.32%
error14,2648,5.30%,118.01%
error15,2718,5.44%,114.97%
error16,3386,6.77%,92.29%
ALL,50000,100.00%,N/A

**SECOND TEST (Using Random Int Between 1 and 1000000000 as the Hash Header)**

Queue Name,Message Count,Percent of Total,Percent of Equality
error01,3313,6.63%,94.33%
error02,3255,6.51%,96.01%
error03,3077,6.15%,101.56%
error04,3048,6.10%,102.53%
error05,2384,4.77%,131.08%
error06,2954,5.91%,105.79%
error07,3586,7.17%,87.14%
error08,3055,6.11%,102.29%
error09,3030,6.06%,103.14%
error10,3517,7.03%,88.85%
error11,2954,5.91%,105.79%
error12,2921,5.84%,106.98%
error13,3046,6.09%,102.59%
error14,3710,7.42%,84.23%
error15,3024,6.05%,103.34%
error16,3126,6.25%,99.97%
ALL,50000,100.00%,N/A
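
For anyone reading the tables, the two percentage columns appear to be derived
as sketched below: "Percent of Total" is the queue's share of all messages, and
"Percent of Equality" is the equal share (50000 / 16 = 3125) divided by the
queue's actual count. This is inferred from the posted numbers; the function
name and defaults are illustrative assumptions.

```python
def summarize_row(count, total=50_000, queues=16):
    # Equal share each queue would get under a perfectly even distribution.
    fair_share = total / queues
    pct_total = round(100 * count / total, 2)
    # Above 100% means the queue received fewer messages than its equal share.
    pct_equality = round(100 * fair_share / count, 2)
    return pct_total, pct_equality
```

For example, summarize_row(3505) reproduces the error01 row of the first test
(7.01, 89.16).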

So we now have a much more even distribution. I then changed the bindings
of the queues from 100 to 1000 and re-ran the test. I saw a significant
increase in CPU usage, and flow control started kicking in (not unexpected,
since this VM runs on a single core on my laptop). This is in line with the
warning in the excerpt above.
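
For reference, a rough sketch of how the number of points is set at bind time:
it is passed as the binding's routing key, as a string. The function takes an
already-open channel with pika 1.x-style methods; the exchange name, the
"hash-on" header name, and the helper name are assumptions for illustration,
not the setup used in these tests.

```python
def bind_with_points(channel, exchange, queues, points, hash_header="hash-on"):
    # Declare the exchange; "hash-header" tells it which message header to hash
    # (assumed name; adjust to whatever header the publisher sets).
    channel.exchange_declare(
        exchange=exchange,
        exchange_type="x-consistent-hash",
        durable=True,
        arguments={"hash-header": hash_header},
    )
    for queue in queues:
        channel.queue_declare(queue=queue, durable=True)
        # The routing key is the number of points for this queue, as a string.
        channel.queue_bind(queue=queue, exchange=exchange, routing_key=str(points))
```

Note that a binding with a different routing key is a separate binding, so when
moving from 100 to 1000 the old bindings would presumably need to be removed
first.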

This leaves me with the following questions:

1) Is it true to say that it is now just a matter of choosing what tradeoff
we want to make in terms of performance vs. uniformity of distribution?
2) Can anyone elaborate on Michael's earlier comment about how a UUID would
be handled as the value in the hashed header? The numbers above don't seem
to show a large difference between using a random integer and a UUID.

Regards,

Richard Raseley

On Wed, Oct 16, 2013 at 3:34 PM, Michael Klishin <mklishin at gopivotal.com> wrote:

>
> On oct 17, 2013, at 2:33 a.m., Richard Raseley <richard at raseley.com>
> wrote:
>
> > Do you mean to say that those "voided" messages didn't make it to a
> > queue or weren't hashed but simply put to an arbitrary queue?
>
> The former but it was just a guess.
> --
> MK
>
> Software Engineer, Pivotal/RabbitMQ
>
>