<div dir="ltr">As a final question, can anyone on the engineering team recommend the best way to generate a header value for hashing, in terms of what will give the most even distribution?<div>
<br></div><div>For example, if a random integer is the best way to go, should the range of possible integers run from 1 to Q*B, where Q is the number of queues bound to the x-consistent-hash exchange and B is the sum of all the integers used as routing keys when binding those queues to the exchange?</div>
<div><br></div><div>Regards,</div><div><br></div><div>Richard</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Oct 16, 2013 at 4:13 PM, Richard Raseley <span dir="ltr">&lt;<a href="mailto:richard@raseley.com" target="_blank">richard@raseley.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">An interesting update: in re-reading the x-consistent-hash exchange type documentation, I came across the following passage:</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">&quot;The more points in the hash space each binding has, the closer the actual distribution will be to the desired distribution (as indicated by the ratio of points by binding). However, large numbers of points (many thousands) will substantially decrease performance of the exchange type.&quot;</div>


<div class="gmail_extra"><br></div><div class="gmail_extra">It appears that my initial understanding of the integer used to bind queues to the x-consistent-hash exchange as a &quot;weight&quot; was incorrect. Rather, it represents the number of points along the overall hash continuum at which that particular queue is placed, with the points being (seemingly) pseudorandomly distributed across the hash space. So I updated the binding on each of the queues from 1 to 100 and re-ran my tests, this time with 50K messages (in the interest of expediency).</div>
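<div class="gmail_extra"><br></div><div class="gmail_extra">To illustrate why more points per binding should smooth out the distribution, here is a minimal sketch of a generic consistent-hash ring. It is not the plugin's implementation: the MD5-based <code>stand_in_hash</code>, the wrap-around lookup, and every function name below are assumptions made purely for illustration.</div>

```python
import bisect
import hashlib
import random
from collections import Counter


def stand_in_hash(value):
    # Assumption: MD5 truncated to 32 bits stands in for the plugin's real hash.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def build_ring(queues, points_per_queue):
    # Each queue is placed at `points_per_queue` pseudorandom ring positions.
    return sorted(
        (stand_in_hash(f"{queue}#{i}"), queue)
        for queue in queues
        for i in range(points_per_queue)
    )


def route(ring, positions, key):
    # Pick the first point at or after the key's hash, wrapping around the ring.
    index = bisect.bisect_left(positions, stand_in_hash(key)) % len(ring)
    return ring[index][1]


def simulate(points_per_queue, n_messages=50_000, n_queues=16, seed=0):
    # Route random-integer keys and count how many messages each queue receives.
    queues = [f"error{i:02d}" for i in range(1, n_queues + 1)]
    ring = build_ring(queues, points_per_queue)
    positions = [position for position, _ in ring]
    rng = random.Random(seed)
    counts = Counter({queue: 0 for queue in queues})
    for _ in range(n_messages):
        key = str(rng.randint(1, 1_000_000_000))
        counts[route(ring, positions, key)] += 1
    return counts


if __name__ == "__main__":
    # Compare the smallest and largest queue for 1 vs. 100 points per queue.
    for points in (1, 100):
        counts = simulate(points)
        print(points, min(counts.values()), max(counts.values()))
```

<div class="gmail_extra">Running it with different values of <code>points_per_queue</code> gives a rough feel for the uniformity side of the tradeoff. It says nothing about the broker's CPU cost.</div>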


<div class="gmail_extra"><br></div><div class="gmail_extra">The results are as follows:<br><br>**FIRST TEST (Using UUID as the Hash Header)**<br><br></div><div class="gmail_extra"><div class="gmail_extra">Queue Name,Message Count,Percent of Total,Percent of Equality</div>


<div class="gmail_extra">error01,3505,7.01%,89.16%</div><div class="gmail_extra">error02,2919,5.84%,107.06%</div><div class="gmail_extra">error03,3418,6.84%,91.43%</div><div class="gmail_extra">error04,2912,5.82%,107.31%</div>


<div class="gmail_extra">error05,2945,5.89%,106.11%</div><div class="gmail_extra">error06,3146,6.29%,99.33%</div><div class="gmail_extra">error07,3057,6.11%,102.22%</div><div class="gmail_extra">error08,2868,5.74%,108.96%</div>


<div class="gmail_extra">error09,2968,5.94%,105.29%</div><div class="gmail_extra">error10,2866,5.73%,109.04%</div><div class="gmail_extra">error11,3513,7.03%,88.96%</div><div class="gmail_extra">error12,4077,8.15%,76.65%</div>


<div class="gmail_extra">error13,3054,6.11%,102.32%</div><div class="gmail_extra">error14,2648,5.30%,118.01%</div><div class="gmail_extra">error15,2718,5.44%,114.97%</div><div class="gmail_extra">error16,3386,6.77%,92.29%</div>


<div class="gmail_extra">ALL,50000,100.00%,N/A</div><div class="gmail_extra"><br></div><div class="gmail_extra">**SECOND TEST (Using Random Int Between 1 and 1000000000 as the Hash Header)**</div><div class="gmail_extra">


<br></div><div class="gmail_extra"><div class="gmail_extra">Queue Name,Message Count,Percent of Total,Percent of Equality</div><div class="gmail_extra">error01,3313,6.63%,94.33%</div><div class="gmail_extra">error02,3255,6.51%,96.01%</div>


<div class="gmail_extra">error03,3077,6.15%,101.56%</div><div class="gmail_extra">error04,3048,6.10%,102.53%</div><div class="gmail_extra">error05,2384,4.77%,131.08%</div><div class="gmail_extra">error06,2954,5.91%,105.79%</div>


<div class="gmail_extra">error07,3586,7.17%,87.14%</div><div class="gmail_extra">error08,3055,6.11%,102.29%</div><div class="gmail_extra">error09,3030,6.06%,103.14%</div><div class="gmail_extra">error10,3517,7.03%,88.85%</div>


<div class="gmail_extra">error11,2954,5.91%,105.79%</div><div class="gmail_extra">error12,2921,5.84%,106.98%</div><div class="gmail_extra">error13,3046,6.09%,102.59%</div><div class="gmail_extra">error14,3710,7.42%,84.23%</div>


<div class="gmail_extra">error15,3024,6.05%,103.34%</div><div class="gmail_extra">error16,3126,6.25%,99.97%</div><div class="gmail_extra">ALL,50000,100.00%,N/A</div><div class="gmail_extra"><br></div><div class="gmail_extra">


So we now have a much more even distribution. I then changed the bindings of the queues from 100 to 1000 and re-ran the test. I saw a significant increase in CPU usage, and flow control started kicking in (I am running this VM on a single core on my laptop, so that is not unexpected). This is in line with the warning in the excerpt above and makes sense.</div>
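<div class="gmail_extra"><br></div><div class="gmail_extra">For anyone wanting to reproduce the table columns, the following sketch shows how they appear to be derived: &quot;Percent of Total&quot; is a queue's share of all messages, and &quot;Percent of Equality&quot; is the perfectly even share (total / number of queues) divided by the queue's actual count. The function name <code>distribution_stats</code> is hypothetical, and the counts are those of the first (UUID) test above.</div>

```python
def distribution_stats(counts):
    # counts: mapping of queue name -> number of messages received.
    total = sum(counts.values())
    fair_share = total / len(counts)  # messages per queue if perfectly even
    stats = {}
    for queue, n in counts.items():
        pct_total = round(100 * n / total, 2)
        # Undefined for an empty queue, so report None in that case.
        pct_equality = round(100 * fair_share / n, 2) if n else None
        stats[queue] = (pct_total, pct_equality)
    return stats


# Message counts from the first test (UUID as the hash header).
uuid_counts = {
    "error01": 3505, "error02": 2919, "error03": 3418, "error04": 2912,
    "error05": 2945, "error06": 3146, "error07": 3057, "error08": 2868,
    "error09": 2968, "error10": 2866, "error11": 3513, "error12": 4077,
    "error13": 3054, "error14": 2648, "error15": 2718, "error16": 3386,
}

if __name__ == "__main__":
    for queue, (pct_total, pct_equality) in distribution_stats(uuid_counts).items():
        print(f"{queue},{uuid_counts[queue]},{pct_total:.2f}%,{pct_equality:.2f}%")
```

<div class="gmail_extra">A value above 100% in the last column means the queue received fewer messages than an even split would give it. A value below 100% means it received more.</div>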


<div class="gmail_extra"><br></div><div class="gmail_extra">This leaves me with the following questions:<br><br></div><div class="gmail_extra">1) Is it true to say that it is now just a matter of choosing what tradeoff we want to make in terms of performance vs. uniformity of distribution?</div>


<div class="gmail_extra">2) Can anyone expand on Michael&#39;s previous comment about how a UUID would be handled as the value in the hashed header? The numbers above don&#39;t seem to show a large difference between using a random integer and using a UUID.</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">Regards,</div><div class="gmail_extra"><br></div><div class="gmail_extra">Richard Raseley</div></div></div><div><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">

On Wed, Oct 16, 2013 at 3:34 PM, Michael Klishin <span dir="ltr">&lt;<a href="mailto:mklishin@gopivotal.com" target="_blank">mklishin@gopivotal.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><br>
On Oct 17, 2013, at 2:33 a.m., Richard Raseley &lt;<a href="mailto:richard@raseley.com" target="_blank">richard@raseley.com</a>&gt; wrote:<br>
<br>
&gt; Do you mean to say that those &quot;voided&quot; messages didn&#39;t make it to a queue or weren&#39;t hashed but simply put to an arbitrary queue?<br>
<br>
</div>The former, but it was just a guess.<br>
--<br>
MK<br>
<br>
Software Engineer, Pivotal/RabbitMQ<br>
<br>
</blockquote></div><br></div></div></div></div>
</blockquote></div><br></div>