<p>Why can&#39;t you use a checksum instead? Each time you create a set of n subtasks from some task T, attach a fraction m/n to each subtask, where m is the fraction attached to T. Start with m = 1 on the root task. The fractions on the outstanding subtasks will always sum to 1, so the job is done once the fractions on the results you have collected add up to 1. No need for shared counters...<br>

</p>
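<p>A minimal sketch of the fraction idea, assuming a single collector that receives the fraction attached to every finished leaf subtask. The names <code>split</code> and <code>Collector</code> are made up for illustration, and exact rationals are used on the assumption that floating-point rounding should not decide whether the total is 1:</p>

```python
from fractions import Fraction


def split(weight, n):
    """Give each of n subtasks an equal share (weight / n) of the parent's fraction."""
    return [weight / n for _ in range(n)]


class Collector:
    """Adds up the fractions of finished leaf subtasks (hypothetical helper)."""

    def __init__(self):
        self.total = Fraction(0)

    def report(self, weight):
        # Returns True once everything that was handed out has come back.
        self.total += weight
        return self.total == 1


# The root task carries the fraction 1 and fans out into 3 map subtasks.
map_weights = split(Fraction(1), 3)
# The first map subtask fans out again into 4 subtasks; the other two are leaves.
reduce_weights = split(map_weights[0], 4)
leaves = reduce_weights + map_weights[1:]

collector = Collector()
done_flags = [collector.report(w) for w in leaves]
print(done_flags)
```

<p>Only the last report returns True here. The collector never needs to know how many subtasks exist, only the fraction carried by each result it receives.</p>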

<p><blockquote type="cite">On Oct 1, 2010 3:35 PM, &quot;Jon Brisbin&quot; &lt;<a href="mailto:jon.brisbin@npcinternational.com">jon.brisbin@npcinternational.com</a>&gt; wrote:<br><br><div style="word-wrap:break-word">I&#39;m also wondering if anyone uses counts to determine when a job is finished or not. By that I mean, increment a counter for every outgoing message and decrement the counter when a response is received. In the case of a map/reduce job, I&#39;d need to do something like:<div>

<br></div><div>SQL -&gt; Map phase = +1 (per row)</div><div>Map phase -&gt; Reduce phase = -1 (that we got the original msg) +1 * (num of emits)</div><div>Reduce phase -&gt; Response|ReReduce = -1 (for emits) +1 (for response/rereduce)</div>

<div>[ReReduce -&gt; Response] = -1 +1 (for sending response)</div><div>Response = -1</div><div><br></div><div>Essentially, each step would decrement a counter for the incoming message and increment the counter for the outgoing message. A reduce phase might decrement the counter 1000 times and increment it once. But since the map phase incremented it 1000 times prior, the count after map/reduce would be &quot;1&quot;. The response listener would then decrement the counter when it processed the response, see that it&#39;s now zero, and know to continue.</div>
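<div><br></div><div>A single-process sketch of the counting scheme described above, using 1000 rows and one emit per row as example numbers. <code>PendingCounter</code> and <code>step</code> are hypothetical names, and a local lock stands in for whatever distributed counter would really be used:</div>

```python
import threading


class PendingCounter:
    """Counts messages in flight (illustrative, in-process only)."""

    def __init__(self):
        self._count = 0
        self._lock = threading.Lock()

    def step(self, consumed, produced):
        """Apply -consumed and +produced as one atomic update; return the new count."""
        with self._lock:
            self._count += produced - consumed
            return self._count


counter = PendingCounter()
counter.step(0, 1000)                 # SQL to map phase: +1 per row
for _ in range(1000):
    counter.step(1, 1)                # each map consumes one row and emits once
after_reduce = counter.step(1000, 1)  # reduce consumes 1000 emits, sends 1 response
finished = counter.step(1, 0) == 0    # response listener decrements and checks for zero
print(after_reduce, finished)
```

<div>Applying the decrement and the increment of a step as one update is a choice made in this sketch: if they were separate operations, the count could briefly touch zero between them and signal completion early.</div>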

<div><br></div><div>If my goal is to beat processing times on the AS/400 when doing large financial calculations (daily acct&#39;g reports take several hours to generate), I can&#39;t really depend on timeouts to make sure I&#39;ve gathered all my results. I want the job to return as soon as results are ready. I&#39;d like to go to management and show them a 2 hr -&gt; 15 min improvement by using parallel processing.</div>

<div><br></div><div>I&#39;m just wondering if using ZooKeeper or similar to do distributed, synchronized counters will have enough atomicity to not miss a count incr/decr. If I miss even one, I&#39;m screwed because it&#39;ll never get back to zero (or get there prematurely).</div>

<div><br></div><div>I need a sentence with a question mark or this will definitely go unanswered: are message counters like this a good way to monitor asynchronous, distributed processing state?</div><div><br></div><div>Thanks! :)<p>

<font color="#500050">

Jon Brisbin

Portal Webmaster

NPC International, Inc.

</font></p><p><font color="#500050">On Oct 1, 2010, at 8:11 AM, Jon Brisbin wrote:

&gt; I had not really looked at the spring integration ...</font></p></div></div><br>_______________________________________________<br>

rabbitmq-discuss mailing list<br>

<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>

<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>

<br></blockquote></p>