<div dir="ltr">Yes - that's actually what we do currently, using Cassandra, and it scales well.<div><br></div><div>And we also do it in memory, at the retail level, and it is very fast as well.</div><div><br></div><div>
I am just trying to shave a millisecond off at the retail level.</div><div><br></div><div>Cheers,</div><div><br></div><div>Michael</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Nov 11, 2013 at 2:22 PM, Matthias Reik <span dir="ltr">&lt;<a href="mailto:maze@reik.se" target="_blank">maze@reik.se</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
    Even though it sounds like a nice feature, it is probably difficult
    to implement properly anywhere but on the client side, because
    duplicates can still arise during delivery to the client. On the
    client side, however, the filtering should be quite easy:<br>
    * Get a message from the queue.<br>
    * Check memcached (or Couchbase, or some other cache technology)
    for the messageID; if it is already there, discard the message as a
    duplicate.<br>
    * Add the new messageID to memcached (this can be combined with the
    previous step).<br>
    * Set the timeout in memcached to your window size.<br>
<br>
    This should be straightforward, should scale to quite a lot of
    messages, and (depending on your window size) should remove all
    duplicates.<br>
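    <br>
    The steps above could be sketched roughly as follows in Python.
    This is only an illustration under stated assumptions:
    <code>TtlCache</code> is a hypothetical in-memory stand-in for
    memcached (its <code>add</code> method mimics memcached's
    store-only-if-absent behaviour with an expiry), and
    <code>make_dedup_filter</code> and <code>window_seconds</code> are
    made-up names, not part of any client library.<br>

```python
import time


class TtlCache:
    """Hypothetical in-memory stand-in for memcached.

    `add` stores a key only if it is absent or expired, which lets the
    "check" and "add" steps happen in a single call.
    """

    def __init__(self, clock=time.monotonic):
        self._expiry = {}    # message_id -> expiry time in seconds
        self._clock = clock  # injectable so the window can be tested

    def add(self, key, ttl_seconds):
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False  # key is still live: the message is a duplicate
        # Expired entries are overwritten here but never purged; a real
        # cache would evict them on its own.
        self._expiry[key] = now + ttl_seconds
        return True


def make_dedup_filter(cache, window_seconds):
    """Return a predicate that is True only for the first sighting of an ID."""

    def is_first_delivery(message_id):
        # Check and insert in one step; the TTL is the dedup window.
        return cache.add(message_id, window_seconds)

    return is_first_delivery
```

    A consumer would call the returned predicate with each messageID
    and process the message only when it returns True.<br>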
<br>
Is there a good reason why you wouldn't want to do this on the
client side as described?<br>
<br>
Cheers<br>
Matthias<br>
<br>
    PS: As a caching technology you could of course build your own
    in-memory solution, but that's probably more work than using an
    out-of-the-box one.<div><div class="h5"><br>
<br>
<div>On 2013-11-11 12:35 , Laing, Michael
wrote:<br>
</div>
</div></div><blockquote type="cite"><div><div class="h5">
      <div dir="ltr">In our scenarios, messages are ultimately routed
        to a 'retail' RabbitMQ instance for delivery to a client. The
        pipelines that process and deliver messages are purposefully
        redundant, so multiple replicas of each message may be
        'racing' to the endpoint.
<div>
<br>
</div>
<div>Usually, the replicas are resolved before getting to the
retail rabbit. When components fail, however, duplicates can
leak through during a small window of time. We eliminate those
duplicates at the retail layer by looking at each message_id.
Ultimately, our client contract allows duplicates as well in
case one slips by.</div>
<div><br>
</div>
<div>It seems to me that this is a generic issue.</div>
<div><br>
</div>
<div>What would be useful in our case, and hopefully for many
others, would be a 'Duplicate Message ID Window' in
milliseconds, as an exchange attribute.</div>
<div><br>
</div>
<div>If non-zero, the exchange would drop any message with a
duplicate message_id that appeared within the specified window
of time, possibly routing it to the alternate exchange, if
set.</div>
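        <div><br></div>
        <div>To make the proposed semantics concrete, here is a toy
          Python model. It is not a RabbitMQ API: the class
          <code>DedupWindowExchange</code>, its parameters, and the
          returned strings are all hypothetical. It also assumes the
          window is measured from the first sighting of a message_id
          and that timestamps never go backwards.</div>

```python
from collections import deque


class DedupWindowExchange:
    """Toy model of the proposed exchange attribute; not a RabbitMQ API."""

    def __init__(self, window_ms, alternate=None):
        self.window_ms = window_ms  # 0 disables duplicate detection
        self.alternate = alternate  # list standing in for an alternate exchange
        self.routed = []            # messages passed on to bound queues
        self._seen = {}             # message_id -> first-seen time (ms)
        self._order = deque()       # (first-seen time, message_id), oldest first

    def publish(self, message_id, body, now_ms):
        if self.window_ms == 0:
            self.routed.append((message_id, body))
            return "routed"
        # Forget IDs whose window has fully elapsed.
        while self._order and now_ms - self._order[0][0] >= self.window_ms:
            _, old_id = self._order.popleft()
            del self._seen[old_id]
        if message_id in self._seen:
            # Duplicate inside the window: divert it if an alternate
            # exchange is set, otherwise drop it.
            if self.alternate is not None:
                self.alternate.append((message_id, body))
                return "alternate"
            return "dropped"
        self._seen[message_id] = now_ms
        self._order.append((now_ms, message_id))
        self.routed.append((message_id, body))
        return "routed"
```

        <div>With a 1000 ms window, a second publish of the same
          message_id 500 ms after the first would be diverted or
          dropped, while one arriving at 1000 ms or later would be
          routed normally.</div>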
<div><br>
</div>
        <div>In our case, a window of a few seconds, perhaps up to a
          minute, would suffice.</div>
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div>Michael</div>
<div><br>
</div>
</div>
<br>
<br>
</div></div><pre>_______________________________________________
rabbitmq-discuss mailing list
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com" target="_blank">rabbitmq-discuss@lists.rabbitmq.com</a>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a>
</pre>
</blockquote>
<br>
</div>
<br></blockquote></div><br></div>