[rabbitmq-discuss] Fwd: One Producer, X Consumers where X can change

Mon Jan 14 15:15:29 GMT 2013

Putting the list back on CC - I didn't realise I hadn't hit reply-all.

Begin forwarded message:

> From: "Ryan R." <ryan.rajkomar at gmail.com>
> Subject: Re: [rabbitmq-discuss] One Producer, X Consumers where X can change
> Date: 14 January 2013 10:21:48 GMT
> To: Tim Watson <tim at rabbitmq.com>
> 
> Hi,
> 
> First of all, thanks for the quick reply.
> 
> I'm well aware that trying to use such a solution has many issues (mostly data integrity, should an error arise somewhere along the process), and that is indeed problematic.
> I was actually just looking for a potential way of doing what I need: I have one (or more) webservices whose objects might contain references to objects of another webservice (ideally each WS has its own DB on a different server).
> 
> The issue here is how to keep data integrity when trying to delete the referenced element: if one of the references can't be deleted (for any reason), nothing should be deleted. However, since the data is on different databases/webservices/servers, I can't do a simple DB transaction.
> 
> Initially I did not really intend to use RabbitMQ for this, but only for a notification system.
> But I figured, since a delay between the request and the actual deletion wasn't an issue in my deletion process, I might be able to find a solution there.
> 
> Finally, and this has nothing to do with my research, for some reason I do not see your post in Google Groups. How come?
> 
> Thanks again,
> Cheers.
> 
> 
> 2013/1/14 Tim Watson <tim at rabbitmq.com>
> Hi,
> 
> 
> On 01/14/2013 08:34 AM, Shadowalker wrote:
>> 
>> Hi again, 
>> Been doing a lot of googling on queue/topic listeners and consumed-message counts, and found this for ActiveMQ:
>> 
>> http://activemq.apache.org/cms/handling-advisory-messages.html
>> It allows one to check the count of consumers currently listening on a queue/topic.
>> 
> 
> I would not recommend an architecture for distributed resource tracking based on that. What happens if a consumer is temporarily disconnected when you go into the check, but reconnects after (or whilst) the rest of the participants are being updated? You've introduced even more possibilities for race conditions than before.
> 
> What I would suggest is that you carefully consider whether you actually need synchronous communications here, as messaging based architectures inherently de-couple producers from consumers, yet you've repeatedly attempted to force some 'awareness' of consumers into the producer whilst discussing this design. I would like to posit that this reveals an 'impedance mismatch' between your requirements and the inherently disconnected nature of a queue based solution. Of course distributed locking is often implemented using asynchronous communication protocols, but this is usually done at a *much lower protocol level* - I'd suggest researching Paxos or similar distributed consensus algorithms to get an idea of what's involved in designing a reliable solution to this kind of problem.
> 
> 
>> Is there anything like this in RabbitMQ?
> 
> Not that I know of, although it's possible to use the HTTP APIs in order to track consumers but that is, as I mentioned above, subject to lots of *nasty* race conditions. You *could* look at using Tony G's presence exchange (https://github.com/tonyg/presence-exchange) to track bindings - although this would complicate your topology quite a lot, it might make tracking the various participants plausibly useful, providing you use a known set of binding keys.
> 
> 
>> This might allow me to create a listener that would only send a message to notify the first manager that the references were removed.
> 
> I'm not clear on how that helps!? I did have a bit of an early start this morning though... ;)
> 
> 
>> Another could be to define the "delete references" message to live for x consumptions (x being the number of listeners on the "delete references" queue) and add an advisory listener on the deletion of the message from the queue to process deletion of the initial data.
> 
> That doesn't help at all unless you've actually tracked the number of acquired messages in the first place. Plus you *can* do that without 'detecting' the number of consumers. You just insist on getting a 'make-ref' message from the consumer (with some unique id) before incrementing the reference count. There's no real difference between *detecting* the consumer's connection/channel and providing a ref/lock acquisition queue, except that the latter is probably more structured, architecturally clearer and quite likely to be more reliable.
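
As a rough illustration of the 'make-ref' idea above, here is a minimal in-memory sketch in Python. It is not RabbitMQ code and leaves out the messaging layer entirely; the names `RefRegistry`, `make_ref`, `release_ref` and `can_delete` are hypothetical, and it assumes each consumer sends a stable unique id with its request.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RefRegistry:
    """Hypothetical registry of which consumers hold a reference to a resource."""

    holders: dict[str, set[str]] = field(default_factory=dict)

    def make_ref(self, resource_id: str, consumer_id: str) -> int:
        # Ids are kept in a set, so a redelivered make-ref from the same
        # consumer does not increase the count a second time.
        refs = self.holders.setdefault(resource_id, set())
        refs.add(consumer_id)
        return len(refs)

    def release_ref(self, resource_id: str, consumer_id: str) -> int:
        # Releasing a reference that is not held is a no-op.
        refs = self.holders.get(resource_id, set())
        refs.discard(consumer_id)
        return len(refs)

    def can_delete(self, resource_id: str) -> bool:
        # The resource may be deleted only when no consumer holds a reference.
        return not self.holders.get(resource_id)


registry = RefRegistry()
registry.make_ref("obj-1", "ws-a")
registry.make_ref("obj-1", "ws-a")  # duplicate delivery, still one holder
registry.make_ref("obj-1", "ws-b")
registry.release_ref("obj-1", "ws-a")
still_referenced = not registry.can_delete("obj-1")
registry.release_ref("obj-1", "ws-b")
deletable = registry.can_delete("obj-1")
```

The count here comes only from explicit requests, not from detecting connections. The sketch says nothing about what happens when a holder disappears without releasing, which is the hard part discussed below.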
> 
> Even if you used ActiveMQ's detection functionality or RabbitMQ's management HTTP APIs, the fundamental problem of race conditions wouldn't go away. Before we go much further discussing various ways you can design a solution - and I *am* interested in this discussion BTW - please read http://en.wikipedia.org/wiki/Byzantine_fault_tolerance#Byzantine_failures and make sure you've understood the consequences of nodes *just disappearing* and then *maybe* coming back later on.
> 
> You've also still not explained what the consequences of losing track of resources actually are. If one of your nodes dies, when it comes back to life has any state been persisted and will that state thus be used to try and re-acquire or release the 'lock count' for this resource? What happens if your node sends an 'acquire' request asynchronously, then starts to write the resource/lock state to its own local database and dies (e.g., the machine crashes) before committing the transaction? Because the 'acquire' request was not synchronous, the master now thinks that your node holds the lock, whilst the node does *not* think the same. If you bring the node back online and then start asking for the resource lock, you're breaking the contract for lock acquisition on that node unless you're willing to make 'acquire' idempotent, which has its own pitfalls. If you don't make 'acquire' idempotent, then acquisition will fail. If you try to handle this by making 'acquire' re-entrant and then try to release the node's lock, the master will be confused as it thinks you hold the lock twice and the *lost lock acquisition* will never be released.
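
The crash-and-retry scenario above can be played out with a small Python simulation of the master's bookkeeping. This is only a sketch under assumptions: the classes `IdempotentLock` and `ReentrantLock` and the helper `crash_and_retry` are hypothetical names, everything is in memory, and it does not address the pitfalls of idempotent acquisition mentioned above.

```python
class IdempotentLock:
    """Master-side state where repeating 'acquire' has no further effect."""

    def __init__(self):
        self.holders = set()

    def acquire(self, node_id):
        self.holders.add(node_id)

    def release(self, node_id):
        self.holders.discard(node_id)

    def held(self):
        return bool(self.holders)


class ReentrantLock:
    """Master-side state where every 'acquire' increments a per-node count."""

    def __init__(self):
        self.counts = {}

    def acquire(self, node_id):
        self.counts[node_id] = self.counts.get(node_id, 0) + 1

    def release(self, node_id):
        remaining = self.counts.get(node_id, 0) - 1
        if remaining > 0:
            self.counts[node_id] = remaining
        else:
            self.counts.pop(node_id, None)

    def held(self):
        return bool(self.counts)


def crash_and_retry(lock):
    lock.acquire("node-1")  # request sent, node crashes before committing locally
    lock.acquire("node-1")  # after restart the node asks for the lock again
    lock.release("node-1")  # the node releases the one lock it knows about
    return lock.held()      # does the master still think the lock is held?


idempotent_leaks = crash_and_retry(IdempotentLock())
reentrant_leaks = crash_and_retry(ReentrantLock())
```

With the re-entrant variant the master is left holding one acquisition that the node never knew about, so the lock is never freed; with the idempotent variant the single release clears it.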
> 
> tl;dr: this is not a simple problem.
> 
> Cheers,
> Tim
> 
> 
