Matthew,<div>  an excellent response, and thank you for it! Yes, difficult it is!</div><div><br></div><div>It raises a somewhat philosophical question about where the onus lies for guarantees such as &#39;guaranteed once&#39;: on the client side or on the server side? The JMS standard offers guaranteed once, whereby the onus is on the server (the JMS implementation) and not on the client. </div>

<div><br></div><div>What I am trying to say is that, in my opinion, client programs should be as &#39;simple&#39; as possible, with the servers doing all the hard work. This is what the JMS standard forces on implementors and, perhaps to a lesser extent today, so does AMQP.</div>

<div><br></div><div>Note: the word &#39;server&#39; is horribly overloaded these days. It is used here to indicate the software with which clients, producers and consumers, communicate.</div><div><br></div><div>Oh well, off to librabbitMQ and some example programs written in COBOL...</div>

<div><br></div><div>Cheers, John<br><div class="gmail_quote">On Thu, Aug 5, 2010 at 13:22, Matthew Sackman <span dir="ltr">&lt;<a href="mailto:matthew@rabbitmq.com">matthew@rabbitmq.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Hi Mike,<br>

<br>

On Tue, Aug 03, 2010 at 04:43:56AM -0400, Mike Petrusis wrote:<br>

&gt; In reviewing the mailing list archives, I see various threads which state that ensuring &quot;exactly once&quot; delivery requires deduplication by the consumer.  For example the following:<br>

&gt;<br>

&gt; &quot;Exactly-once requires coordination between consumers, or idempotency,<br>

&gt; even when there is just a single queue. The consumer, broker or network<br>

&gt; may die during the transmission of the ack for a message, thus causing<br>

&gt; retransmission of the message (which the consumer has already seen and<br>

&gt; processed) at a later point.&quot;  <a href="http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-July/004237.html" target="_blank">http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-July/004237.html</a><br>


&gt;<br>

&gt; In the case of competing consumers which pull messages from the same queue, this will require some sort of shared state between consumers to de-duplicate messages (assuming the consumers are not idempotent).<br>

&gt;<br>

&gt; Our application is using RabbitMQ to distribute tasks across multiple workers residing on different servers, this adds to the cost of sharing state between the workers.<br>

&gt;<br>

&gt; Another message in the email archive mentions that &quot;You can guarantee exactly-once delivery if you use transactions, durable queues and exchanges, and persistent messages, but only as long as any failing node eventually recovers.&quot;<br>


<br>

All the above is sort of wrong. You can never *guarantee* exactly once<br>

(there&#39;s always some argument about whether receiving message duplicates<br>

but relying on idempotency is achieving exactly once. I don&#39;t feel it<br>

does, and this should become clearer as to why further on...)<br>

<br>

The problem is publishers. If the server on which RabbitMQ is running<br>

crashes after committing a transaction containing publishes, it&#39;s<br>

possible the commit-ok message may get lost. Thus the publishers still<br>

think they need to republish, so they wait until the broker comes back up and<br>

then republish. This can happen an unbounded number of times: the<br>

publishers connect, start a transaction, publish messages, commit the<br>

transaction and then the commit-ok gets lost and so the publishers<br>

repeat the process.<br>
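<br>

A minimal sketch of that loop, in Python. The FlakyBroker class and publish_until_confirmed function are hypothetical in-memory stand-ins invented for illustration, not part of RabbitMQ or any client library. The sketch assumes the commit itself always succeeds and only the commit-ok reply is lost.<br>

```python
import uuid


class FlakyBroker:
    """Hypothetical in-memory stand-in for a broker (not a RabbitMQ API)."""

    def __init__(self, lost_confirms):
        self.queue = []
        # Number of commit-ok replies that will be lost before one gets through.
        self.lost_confirms = lost_confirms

    def commit(self, messages):
        # The commit takes effect on the broker...
        self.queue.extend(messages)
        # ...but the confirmation may never reach the publisher.
        if self.lost_confirms > 0:
            self.lost_confirms -= 1
            raise ConnectionError("commit-ok lost")
        return "commit-ok"


def publish_until_confirmed(broker, body):
    """Republish the same message until a commit-ok is actually received."""
    msg_id = str(uuid.uuid4())  # the same id is reused on every republish
    attempts = 0
    while True:
        attempts += 1
        try:
            broker.commit([{"id": msg_id, "body": body}])
            return msg_id, attempts
        except ConnectionError:
            # The publisher cannot tell whether the commit happened, so it retries.
            continue
```

Every lost commit-ok leaves one more copy of the message in the queue, and the publisher has no way of knowing how many copies exist.<br>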

<br>

As a result, on the clients, you need to detect duplicates. Now this is<br>

really a barrier to making all operations idempotent. The problem is<br>

that you never know how many copies of a message there will be. Thus you<br>

never know when it&#39;s safe to remove messages from your dedup cache. Now<br>

things like redis apparently have the means to delete entries after an<br>

amount of time, which would at least allow you to avoid the database<br>

eating up all the RAM in the universe, but there&#39;s still the possibility<br>

that after the entry&#39;s been deleted, another duplicate will come along<br>

which you now won&#39;t detect as a duplicate.<br>
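<br>

To make the expiry problem concrete, here is a small sketch of a dedup cache with a time-to-live. TtlDedupCache is a hypothetical in-memory class written for illustration; it does not use redis, and the injectable clock is only there so the behaviour can be shown without sleeping.<br>

```python
import time


class TtlDedupCache:
    """Hypothetical in-memory dedup cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.expiry_by_id = {}  # msg_id -> time at which the entry expires

    def is_duplicate(self, msg_id):
        now = self.clock()
        # Drop expired entries so the cache cannot grow without bound.
        self.expiry_by_id = {
            k: expiry for k, expiry in self.expiry_by_id.items() if expiry > now
        }
        if msg_id in self.expiry_by_id:
            return True
        self.expiry_by_id[msg_id] = now + self.ttl
        return False


# Usage with a fake clock: a duplicate arriving after expiry goes undetected.
fake_now = [0.0]
cache = TtlDedupCache(ttl_seconds=10, clock=lambda: fake_now[0])
first = cache.is_duplicate("m1")   # False: first copy
fake_now[0] = 5.0
second = cache.is_duplicate("m1")  # True: caught within the TTL
fake_now[0] = 11.0
late = cache.is_duplicate("m1")    # False: the entry expired, duplicate missed
```

Whatever TTL is chosen, a republish that arrives later than that is treated as a new message.<br>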

<br>

This isn&#39;t just a problem with RabbitMQ - in any messaging system, if<br>

any message can be lost, you cannot achieve exactly once semantics. The<br>

best you can hope for is a probability of a large number of 9s that you<br>

will be able to detect all the duplicates. But that&#39;s the best you can<br>

achieve.<br>

<br>

Scaling horizontally is thus more tricky because, as you say, you may<br>

now have multiple consumers which each receive one copy of a message.<br>

Thus the dedup database would have to be distributed. With high message<br>

rates, this might well become prohibitive because of the amount of<br>

network traffic due to transactions between the consumers.<br>

<br>

&gt; What&#39;s the recommended way to deal with the potential of duplicate messages?<br>

<br>

Currently, there is no &quot;recommended&quot; way. If you have a single consumer,<br>

it&#39;s quite easy - something like tokyocabinet should be more than<br>

sufficiently performant. For multiple consumers, you&#39;re currently going<br>

to have to look at some sort of distributed database.<br>

<br>

&gt; Is this a rare enough edge case that most people just ignore it?<br>

<br>

No idea. But one way of making your life easier is for the producer to<br>

send slightly different messages on every republish (they would still<br>

obviously need to have the same msg id). That way, if you detect a msg<br>

with &quot;republish count&quot; == 0, then you know it&#39;s the first copy, so you<br>

can insert async into your shared database and then act on the message.<br>

You only need to do a query on the database whenever you receive a msg<br>

with &quot;republish count&quot; &gt; 0 - thus you can tune your database for<br>

inserts and hopefully save some work - the common case will then be the<br>

first case, and lookups will be exceedingly rare.<br>
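<br>

A rough sketch of the consumer side of that scheme. The message layout (a dict with "id" and "republish_count" keys), the handle function and the use of a plain Python set in place of the shared database are all assumptions made for illustration. The insert is done synchronously here rather than async, and the sketch covers a single consumer only.<br>

```python
def handle(message, seen_ids, act):
    """Process one message, consulting the dedup store only for republishes.

    seen_ids stands in for the shared database; act is the real work.
    """
    msg_id = message["id"]

    if message["republish_count"] == 0:
        # Common case: first copy. Record it and act, with no lookup.
        seen_ids.add(msg_id)
        act(message)
        return "processed-first-copy"

    # Rare case: a republish. Only now is a lookup needed.
    if msg_id in seen_ids:
        return "dropped-duplicate"

    # A republish with no recorded first copy. With a single consumer it is
    # safe to act on it; with several consumers this branch needs coordination.
    seen_ids.add(msg_id)
    act(message)
    return "processed-republish"
```

For example, calling handle twice with the same id, first with republish_count 0 and then with republish_count 1, acts on the message once and drops the second copy.<br>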

<br>

The question then is: if you&#39;ve received a msg, republish count &gt; 0 but<br>

there are no entries in the database, what do you do? It shouldn&#39;t have<br>

overtaken the first publish (though if consumers disconnected without<br>

acking, or requeued messages, it could have), but you need to cause some<br>

sort of synchronisation operation between all the consumers to ensure none<br>

are in the process of adding to the database - it all gets a bit hairy<br>

at this point.<br>

<br>

Thus if your message rate is low, you&#39;re much safer doing the insert and<br>

select on every message. If that&#39;s too expensive, you&#39;re going to have<br>

to think very hard indeed about how to avoid races between different<br>

consumers thinking they&#39;re both/all responsible for acting on the same<br>

message.<br>
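<br>

One way to sketch the check on every message is to fold the insert and the select into a single insert against a unique key, so that only one consumer can claim a given message id. This is a swapped-in variation, not necessarily what was meant above. SQLite is used purely as a local stand-in for whatever shared database the consumers would really use, and the table name and the open_dedup_db and claim helpers are hypothetical.<br>

```python
import sqlite3


def open_dedup_db(path=":memory:"):
    """Open the dedup store; the primary key enforces one row per message id."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (msg_id TEXT PRIMARY KEY)")
    conn.commit()
    return conn


def claim(conn, msg_id):
    """Return True if this caller is the first to record msg_id, else False."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute("INSERT INTO seen (msg_id) VALUES (?)", (msg_id,))
        return True
    except sqlite3.IntegrityError:
        # Another copy of this message has already been claimed.
        return False
```

A consumer would call claim before acting on a message and skip the message when it returns False. The cost is one database round trip per message, which is why this only suits low message rates.<br>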

<br>

This stuff isn&#39;t easy.<br>

<br>

Matthew<br>


</blockquote></div><br><br clear="all"><br>-- <br>---<br>John Apps<br>(49) 171 869 1813<br>

</div>