[rabbitmq-discuss] Unique Messages in Queue

Tue Aug 4 17:39:05 BST 2009

Hi Vidit,

You wrote, at the start of this thread, that you "have a message source
that may provide duplicate messages". What kind of duplicates are we
talking about here? One per minute for the next six years, or the
occasional duplicate within a minute of the original followed by no more
duplicates ever?

If it's the former, then a long-term memory is clearly required; if the
latter (i.e. you're coping with the normal possibility of
duplication-because-of-connection-failure-etc), then a simple memory of,
say, an hour's worth of processed message IDs ought to be enough.
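
As a rough sketch of that short-term memory (Python here; the names
RecentIds and filter_duplicates are made up for illustration, messages
are assumed to arrive as (id, body) pairs, and the one-hour window is
just the figure suggested above):

```python
import time
from collections import OrderedDict


class RecentIds:
    """Remembers message IDs seen within the last `window_seconds`."""

    def __init__(self, window_seconds=3600.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock  # injectable so the window can be tested
        self._seen = OrderedDict()  # message ID -> time first seen, oldest first

    def _expire(self, now):
        # Entries are only ever appended, so the oldest is always at the front.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self.window:
                break
            self._seen.popitem(last=False)

    def is_duplicate(self, msg_id):
        """Return True if msg_id was seen within the window, else record it."""
        now = self.clock()
        self._expire(now)
        if msg_id in self._seen:
            return True
        self._seen[msg_id] = now
        return False


def filter_duplicates(messages, memory):
    """Yield only the (msg_id, body) pairs whose ID is not in recent memory."""
    for msg_id, body in messages:
        if not memory.is_duplicate(msg_id):
            yield msg_id, body
```

The window here runs from the first sighting of an ID and is not
refreshed by later duplicates; whether that is the right choice depends
on how the duplicates actually arrive. The memory lives only in the
process, so a crash or restart forgets it.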

Regards,
  Tony

Vidit Drolia wrote:
> Matthias,
> 
> Minimizing the probability of sending out a duplicate message is the
> practical objective. So you are right in saying that the best we can
> do is to make it very unlikely that a duplicate mail is sent out.
> 
> A filtering proxy would make most sense because I wanted to move away
> from expensive I/O for persistence in the first place, plus if needed,
> redundancy can be introduced later for fault-tolerance.
> 
> Thanks for all the help!
> 
> Best,
> 
> Vidit
> 
> On Fri, Jul 31, 2009 at 2:08 PM, Matthias Radestock <matthias at lshift.net> wrote:
>> Vidit,
>>
>> Vidit Drolia wrote:
>>> The primary problem is that since the action being triggered by the
>>> message is an email, I can't revert the action. So I am trying to
>>> ensure that the application sending emails gets a message only once.
>>> Is there another approach I can take to this problem?
>> If you replace "process(msg)" in my last email with "send_email(msg)", you
>> will see that what you are asking for is impossible. The best one can do (in
>> any system, involving rabbit or not) is to make it *very unlikely* that an
>> email is sent more than once. As long as we can agree on that, let's proceed
>> ...
>>
>> If your main concern is removing the duplicates the senders can produce,
>> then I suggest inserting a filtering proxy, i.e. a process that consumes
>> messages from one queue, de-dups them and publishes the non-dups to another
>> exchange.
>>
>> This process does need to keep some state, so, as you say, if it crashes and
>> the state is lost then you may get some dups. The process is very simple
>> though, so the likelihood of it crashing should be low. Given that we have
>> established that there can be no 100% no-dup guarantee, is it really worth
>> worrying about that? If the answer is yes, then persisting that state or
>> replicating it across several redundant nodes are both possible options.
>>
>>
>> Regards,
>>
>> Matthias.
>>

-- 
 [][][] Tony Garnock-Jones     | Mob: +44 (0)7905 974 211
   [][] LShift Ltd             | Tel: +44 (0)20 7729 7060
 []  [] http://www.lshift.net/ | Email: tonyg at lshift.net