[rabbitmq-discuss] Unique Messages in Queue

Vidit Drolia viditdrolia at gmail.com
Tue Aug 4 17:56:00 BST 2009


Hi Tony,

There *may* be one or more duplicates per day. The message source is
Amazon SQS, and it does not guarantee that a message is deleted even
after a delete command is issued; nor do I get an acknowledgement
confirming the deletion. Thus, I am trying to make my system immune to
the constraints imposed by SQS. I am assuming that I will be able to
delete the message within a day, but until I do so, my application
needs to be sure that no duplicates are introduced into the system.
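
Concretely, the check I have in mind on the consuming side is something
like this (rough Python sketch; the one-day window and the idea of keying
on the SQS message ID are my own assumptions):

    import time

    DEDUP_WINDOW = 24 * 60 * 60   # remember message IDs for roughly a day
    seen = {}                     # message id -> time first seen

    def is_duplicate(msg_id, now=None):
        """True if msg_id was already processed within the window."""
        now = time.time() if now is None else now
        # Expire old entries so the memory stays bounded.
        for mid, ts in list(seen.items()):
            if now - ts > DEDUP_WINDOW:
                del seen[mid]
        if msg_id in seen:
            return True
        seen[msg_id] = now
        return False

If that in-memory state turns out to be too fragile, it can be swapped
for something persistent later, along the lines Matthias suggested.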

Best,

Vidit

On Tue, Aug 4, 2009 at 12:39 PM, Tony Garnock-Jones<tonyg at lshift.net> wrote:
> Hi Vidit,
>
> You wrote, at the start of this thread, that you "have a message source
> that may provide duplicate messages". What kinds of duplicates are we
> talking here? One per minute for the next six years, or the occasional
> duplicate within a minute of the original followed by no more duplicates
> ever?
>
> If it's the former, then a long-term memory is clearly required; if the
> latter (i.e. you're coping with the normal possibility of
> duplication-because-of-connection-failure-etc), then a simple memory of
> say an hour's worth of processed message IDs ought to be enough.
>
> Regards,
>  Tony
>
>
> Vidit Drolia wrote:
>> Matthias,
>>
>> Minimizing the probability of sending out a duplicate message is the
>> practical objective. So you are right in saying that the best we can
>> do is to make it very unlikely that a duplicate mail is sent out.
>>
>> A filtering proxy makes the most sense because I wanted to move away
>> from expensive I/O for persistence in the first place; and, if needed,
>> redundancy can be introduced later for fault-tolerance.
>>
>> Thanks for all the help!
>>
>> Best,
>>
>> Vidit
>>
>> On Fri, Jul 31, 2009 at 2:08 PM, Matthias Radestock<matthias at lshift.net> wrote:
>>> Vidit,
>>>
>>> Vidit Drolia wrote:
>>>> The primary problem is that since the action being triggered by the
>>>> message is an email, I can't revert the action. So I am trying to
>>>> ensure that the application sending emails gets a message only once.
>>>> Is there another approach I can take to this problem?
>>> If you replace "process(msg)" in my last email with "send_email(msg)", you
>>> will see that what you are asking for is impossible. The best one can do (in
>>> any system, involving rabbit or not) is to make it *very unlikely* that an
>>> email is sent more than once. As long as we can agree on that, let's proceed
>>> ...
>>>
>>> If your main concern is removing the duplicates the senders can produce,
>>> then I suggest inserting a filtering proxy, i.e. a process that consumes
>>> messages from one queue, de-dups them and publishes the non-dups to another
>>> exchange.
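>>>
>>> To make that concrete, a rough sketch of such a proxy (Python, using the
>>> pika client purely for illustration; the queue and exchange names are
>>> invented, and it assumes publishers stamp each message with a unique
>>> message_id property):
>>>
>>> import pika
>>>
>>> seen = set()  # de-dup memory; expire or persist this in a real deployment
>>>
>>> conn = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
>>> ch = conn.channel()
>>> ch.queue_declare(queue='incoming')
>>> ch.exchange_declare(exchange='deduped', exchange_type='fanout')
>>>
>>> def handle(chan, method, properties, body):
>>>     msg_id = properties.message_id
>>>     if msg_id not in seen:  # only republish messages not seen before
>>>         seen.add(msg_id)
>>>         chan.basic_publish(exchange='deduped', routing_key='',
>>>                            body=body, properties=properties)
>>>     chan.basic_ack(delivery_tag=method.delivery_tag)
>>>
>>> ch.basic_consume(queue='incoming', on_message_callback=handle)
>>> ch.start_consuming()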
>>>
>>> This process does need to keep some state, so, as you say, if it crashes and
>>> the state is lost then you may get some dups. The process is very simple
>>> though, so the likelihood of it crashing should be low. Given that we have
>>> established that there can be no 100% no-dup guarantee, is it really worth
>>> worrying about that? If the answer is yes, then persisting that state, or
>>> replicating it across several redundant nodes, are possible options.
>>>
>>>
>>> Regards,
>>>
>>> Matthias.
>>>
>>
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> http://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>
>
> --
>  [][][] Tony Garnock-Jones     | Mob: +44 (0)7905 974 211
>   [][] LShift Ltd             | Tel: +44 (0)20 7729 7060
>  []  [] http://www.lshift.net/ | Email: tonyg at lshift.net
>



-- 
Vidit Drolia



