[rabbitmq-discuss] Exchange Feature request: Drop Duplicates

Laing, Michael michael.laing at nytimes.com
Mon Nov 11 19:28:20 GMT 2013


Yes - that's actually what we do currently, using Cassandra, and it scales
well.

And we also do it in memory, at the retail level, and it is very fast as
well.

I am just trying to shave a millisecond off at the retail level.
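
For anyone following along, the in-memory window check mentioned above can be
sketched roughly as below. This is only an illustration, not our production
code: the class name DedupWindow, the method is_duplicate and the injectable
clock parameter are all made up for the example. It assumes every message
carries a message_id and that the window is measured from the first sighting
of that id.

```python
import time
from collections import OrderedDict


class DedupWindow:
    """Remember message ids for a fixed time window (hypothetical helper)."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock  # injectable so the window can be tested
        # message_id -> expiry time. Every entry gets the same window, so
        # insertion order is also expiry order (oldest entry first).
        self._seen = OrderedDict()

    def is_duplicate(self, message_id):
        now = self.clock()
        # Evict expired ids from the front of the ordered dict.
        while self._seen:
            oldest_id = next(iter(self._seen))
            if self._seen[oldest_id] > now:
                break
            del self._seen[oldest_id]
        if message_id in self._seen:
            return True  # seen within the window: caller should drop it
        self._seen[message_id] = now + self.window
        return False


if __name__ == "__main__":
    window = DedupWindow(window_seconds=5.0)
    for mid in ["m1", "m2", "m1"]:
        action = "drop" if window.is_duplicate(mid) else "deliver"
        print(mid, action)
```

A duplicate does not refresh the expiry here, so an id is forgotten one window
after it was first seen. Memory use is bounded by the message rate times the
window size.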

Cheers,

Michael


On Mon, Nov 11, 2013 at 2:22 PM, Matthias Reik <maze at reik.se> wrote:

>  Even though it sounds like a nice feature, it is probably difficult to
> really implement unless it is done on the client side, because duplicates
> can still arise when delivering to the client. On the client side, though,
> the filtering should be quite easy:
> * get a message from the queue,
> * check against memcached (couchbase, or some other cache technology)
> whether the messageID exists.
> * Add the new message to memcached (can be done with the previous step)
> * Set the timeout in memcached to your window size.
>
> This should be straightforward, would scale up to quite a lot of
> messages, and should remove (depending on your window size) all duplicates.
>
> Is there a good reason why you wouldn't want to do this on the client side
> as described?
>
> Cheers
> Matthias
>
> PS: as a caching technology you could of course build your own
> in-memory solution, but that's probably more work than using an
> out-of-the-box solution.
>
>
> On 2013-11-11 12:35 , Laing, Michael wrote:
>
> In our scenarios, messages are ultimately delivered to a 'retail' rabbitmq
> instance to be delivered to a client. The pipelines that process and
> deliver messages are purposefully redundant, hence there may be multiple
> replicas of each message 'racing' to the endpoint.
>
>  Usually, the replicas are resolved before getting to the retail rabbit.
> When components fail, however, duplicates can leak through during a small
> window of time. We eliminate those duplicates at the retail layer by
> looking at each message_id. Ultimately, our client contract allows
> duplicates as well in case one slips by.
>
>  It seems to me that this is a generic issue.
>
>  What would be useful in our case, and hopefully for many others, would
> be a 'Duplicate Message ID Window' in milliseconds, as an exchange
> attribute.
>
>  If non-zero, the exchange would drop any message with a duplicate
> message_id that appeared within the specified window of time, possibly
> routing it to the alternate exchange, if set.
>
>  In our case, a window of a few seconds, perhaps up to a minute would
> suffice.
>
>  Thanks,
>
>  Michael
>
>
>
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>