[rabbitmq-discuss] Exchange Feature request: Drop Duplicates
Laing, Michael
michael.laing at nytimes.com
Tue Nov 12 10:54:15 GMT 2013
Yes, I can see the point about statelessness.
It seems to me that in a messaging fabric, it is generally useful to have
ways of dampening duplicates.
It occurred to me this morning that federation uses hop counts - in some
topologies, especially those with planned redundancy, this does not work
so well, and perhaps a feature like this would help.
Michael
On Tue, Nov 12, 2013 at 4:48 AM, Simon MacMullen <simon at rabbitmq.com> wrote:
> The trouble is, exchanges are meant to be stateless. So it's possible to
> introduce some state into an exchange, but we have to choose between having
> per-node state (in which case dedup only works per-node), or having
> cluster-global state (where we either funnel all messages through one node
> in the cluster before they get routed to queues, or distribute the state
> around the cluster, making updates into expensive 2PC).
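To illustrate the per-node option, here is a minimal Python sketch. It is
not RabbitMQ code: the class NodeLocalDedup and its accept method are
hypothetical names invented for this example. Each node keeps its own table
of recently seen message IDs, so a duplicate arriving via a different node
is not caught:

```python
import time


class NodeLocalDedup:
    """Dedup state held by one (hypothetical) node; nothing is shared."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.seen = {}  # message_id -> time the ID was first accepted

    def accept(self, message_id, now=None):
        """Return True if the message should be routed, False if dropped."""
        now = time.monotonic() if now is None else now
        # Forget IDs that have aged out of the window.
        self.seen = {m: t for m, t in self.seen.items() if now - t < self.window}
        if message_id in self.seen:
            return False
        self.seen[message_id] = now
        return True


node_a = NodeLocalDedup(window_seconds=5.0)
node_b = NodeLocalDedup(window_seconds=5.0)

results = [
    node_a.accept("m1", now=0.0),  # first copy, published via node A
    node_a.accept("m1", now=1.0),  # second copy via node A: dropped
    node_b.accept("m1", now=2.0),  # third copy via node B: not recognised
]
```

Catching the third publish would need the shared or funnelled state
described above, which is where the cost comes in.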
>
> So this is doable but it's not obvious where compromises should be made.
> And as Matthias sort of pointed out, duplication can still happen due to
> redelivery, so this has to be an optimisation rather than something that
> guarantees duplicates won't happen.
>
> Having said all that, it wouldn't be hideously difficult to implement, so
> I might give it a go. Depends on whether anybody else would find such a
> feature useful...
>
> Cheers, Simon
>
>
> On 11/11/2013 19:28, Laing, Michael wrote:
>
>> Yes - that's actually what we do currently, using Cassandra, and it
>> scales well.
>>
>> And we also do it in memory, at the retail level, and it is very fast as
>> well.
>>
>> I am just trying to shave a millisecond off at the retail level.
>>
>> Cheers,
>>
>> Michael
>>
>>
>> On Mon, Nov 11, 2013 at 2:22 PM, Matthias Reik <maze at reik.se> wrote:
>>
>> Even though it sounds like a nice feature, it is probably difficult
>> to really implement if not done on the client side. Duplicates
>> might still happen when delivering to the client, but on the client
>> side it should be quite easy to do the filtering:
>> * Get a message from the queue.
>> * Check against memcached (Couchbase, or some other cache
>>   technology) whether the message ID exists.
>> * Add the new message ID to memcached (can be done together with
>>   the previous step).
>> * Set the timeout in memcached to your window size.
>>
>> This should be straightforward, would scale up to quite a lot of
>> messages, and should remove (depending on your window size) all
>> duplicates.
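
The steps above can be sketched roughly as follows. This is an illustration
only: TtlCache is a hypothetical in-memory stand-in for memcached, written
so the example runs without a server, and filter_duplicates is a made-up
helper name. Its add method mimics memcached's add command, which stores a
key only if it is not already present, so the check and the insert happen
in one step:

```python
import time


class TtlCache:
    """In-memory stand-in for memcached (assumption: one process, no server)."""

    def __init__(self, clock=time.monotonic):
        self._expires_at = {}  # key -> time at which the key stops counting
        self._clock = clock

    def add(self, key, ttl_seconds):
        """Store key only if absent or expired; return True if it was stored."""
        now = self._clock()
        expires = self._expires_at.get(key)
        if expires is not None and expires > now:
            return False  # key still live: this is a duplicate
        self._expires_at[key] = now + ttl_seconds
        return True


def filter_duplicates(messages, cache, window_seconds):
    """Yield only messages whose ID was not seen within the window."""
    for message_id, body in messages:
        if cache.add(message_id, window_seconds):
            yield message_id, body


incoming = [("id-1", "a"), ("id-2", "b"), ("id-1", "a"), ("id-3", "c")]
delivered = list(filter_duplicates(incoming, TtlCache(), window_seconds=30))
```

With a real cache shared by several consumers, the same pattern applies as
long as the add operation is atomic on the server side.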
>>
>> Is there a good reason why you wouldn't want to do this on the
>> client side as described?
>>
>> Cheers
>> Matthias
>>
>> PS: as a caching technology you could of course build your own
>> in-memory solution, but that's probably more work than using an
>> out-of-the-box solution.
>>
>>
>> On 2013-11-11 12:35 , Laing, Michael wrote:
>>
>>> In our scenarios, messages are ultimately delivered to a 'retail'
>>> rabbitmq instance to be delivered to a client. The pipelines that
>>> process and deliver messages are purposefully redundant, hence
>>> there may be multiple replicas of each message 'racing' to the
>>> endpoint.
>>>
>>> Usually, the replicas are resolved before getting to the retail
>>> rabbit. When components fail, however, duplicates can leak through
>>> during a small window of time. We eliminate those duplicates at
>>> the retail layer by looking at each message_id. Ultimately, our
>>> client contract allows duplicates as well in case one slips by.
>>>
>>> It seems to me that this is a generic issue.
>>>
>>> What would be useful in our case, and hopefully for many others,
>>> would be a 'Duplicate Message ID Window' in milliseconds, as an
>>> exchange attribute.
>>>
>>> If non-zero, the exchange would drop any message with a duplicate
>>> message_id that appeared within the specified window of time,
>>> possibly routing it to the alternate exchange, if set.
>>>
>>> In our case, a window of a few seconds, perhaps up to a minute
>>> would suffice.
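
A rough model of the proposed behaviour, assuming the window is measured
from the last accepted copy of a message_id (the proposal does not say
which). DedupExchange, publish, and the plain list standing in for the
alternate exchange are all hypothetical names for illustration; this is not
an existing RabbitMQ API:

```python
class DedupExchange:
    """Toy model of an exchange with a 'Duplicate Message ID Window'."""

    def __init__(self, window_ms, alternate=None):
        self.window_ms = window_ms  # 0 disables duplicate dropping
        self.alternate = alternate  # list standing in for an alternate exchange
        self.routed = []            # messages routed normally
        self._accepted_at = {}      # message_id -> time (ms) last accepted

    def publish(self, message_id, body, now_ms):
        """Return True if routed normally, False if treated as a duplicate."""
        if self.window_ms > 0:
            last = self._accepted_at.get(message_id)
            if last is not None and now_ms - last < self.window_ms:
                # Duplicate inside the window: divert it if an alternate is set.
                if self.alternate is not None:
                    self.alternate.append((message_id, body))
                return False
            self._accepted_at[message_id] = now_ms
        self.routed.append((message_id, body))
        return True


duplicates = []
exchange = DedupExchange(window_ms=5000, alternate=duplicates)
outcomes = [
    exchange.publish("m1", "first copy", now_ms=0),
    exchange.publish("m1", "second copy", now_ms=1000),  # inside the window
    exchange.publish("m1", "late copy", now_ms=6000),    # window has passed
]
```

Here the second copy ends up in the duplicates list and the other two are
routed normally.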
>>>
>>> Thanks,
>>>
>>> Michael
>>>
>>>
>>>
>>> _______________________________________________
>>> rabbitmq-discuss mailing list
>>> rabbitmq-discuss at lists.rabbitmq.com
>>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>>
>>
>>
>