<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi,<br>
<br>
On 01/14/2013 08:34 AM, Shadowalker wrote:
<blockquote
cite="mid:762d6737-941d-44ed-b9b8-83d9e1c3781a@googlegroups.com"
type="cite">Hi again, <br>
Been doing a lot of googling on the queue/topic/listening for
consumed messages count and found this on ActiveMQ: <br>
<br>
<a moz-do-not-send="true"
href="http://activemq.apache.org/cms/handling-advisory-messages.html"
target="_blank">http://activemq.apache.org/<wbr>cms/handling-advisory-<wbr>messages.html</a><br>
It allows one to check the count of consumers currently listening
on a queue/topic. <br>
<br>
</blockquote>
<br>
I would not recommend basing an architecture for distributed
resource tracking on that. What happens if a consumer is temporarily
disconnected when you perform the check, but reconnects after (or
whilst) the rest of the participants are being updated? You've
introduced even more possibilities for race conditions than before.<br>
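To make that race concrete, here's a deliberately simplified Python sketch - the 'broker' is just a set of consumer ids, not any real broker API, but from the caller's point of view the advisory/management APIs behave the same way: any count you read is stale the moment you act on it.

```python
# Deliberately simplified: the 'broker' is just a set of consumer ids.
broker_consumers = {"node-a", "node-b", "node-c"}

# Step 1: the producer checks how many consumers are listening.
snapshot = len(broker_consumers)

# Step 2: node-c drops its connection (network blip)...
broker_consumers.discard("node-c")

# ...and reconnects while the producer is still acting on the snapshot.
broker_consumers.add("node-c")

# Step 3: the producer 'updates' participants based on the snapshot.
# During steps 2-3 the real count was 2, so any decision taken in that
# window was wrong - and the final count matches the snapshot again,
# so nothing tells the producer that it ever happened.
assert snapshot == len(broker_consumers)
```

Note that the check even 'succeeds' at the end, which is exactly what makes this kind of race so hard to detect after the fact.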
<br>
What I would suggest is that you carefully consider whether you
actually need synchronous communications here, as messaging-based
architectures inherently decouple producers from consumers, yet
you've repeatedly attempted to force some 'awareness' of consumers
into the producer whilst discussing this design. I would posit that
this reveals an 'impedance mismatch' between your requirements and
the inherently disconnected nature of a queue-based solution. Of
course distributed locking is often implemented using asynchronous
communication protocols, but this is usually done at a *much lower
protocol level* - I'd suggest researching Paxos or similar
distributed consensus algorithms to get an idea of what's involved
in designing a reliable solution to this kind of problem.<br>
<br>
<blockquote
cite="mid:762d6737-941d-44ed-b9b8-83d9e1c3781a@googlegroups.com"
type="cite">Is there anything like this in RabbitMQ?<br>
</blockquote>
<br>
Not that I know of. It's possible to use the management HTTP API to
track consumers, but that is, as I mentioned above, subject to lots
of *nasty* race conditions. You *could* look at using Tony G's
presence exchange (<a class="moz-txt-link-freetext" href="https://github.com/tonyg/presence-exchange">https://github.com/tonyg/presence-exchange</a>)
to track bindings - although this would complicate your topology
quite a lot, it might make tracking the various participants
feasible, provided you use a known set of binding keys.<br>
<br>
<blockquote
cite="mid:762d6737-941d-44ed-b9b8-83d9e1c3781a@googlegroups.com"
type="cite">This might allow me to create a listener that would
only send a message to notify the first manager that the
references were removed.<br>
</blockquote>
<br>
I'm not clear on how that helps!? I did have a bit of an early start
this morning though... ;)<br>
<br>
<blockquote
cite="mid:762d6737-941d-44ed-b9b8-83d9e1c3781a@googlegroups.com"
type="cite">
Another could be to define the "delete references" message to
live for x consumptions (x being the number of listeners on the
"delete references" queue) and add an advisory listener on the
deletion of the message from the queue to process deletion of the
initial data.<br>
</blockquote>
<br>
That doesn't help at all unless you've actually tracked the number
of acquired messages in the first place. Plus you *can* do that
without 'detecting' the number of consumers. You just insist on
getting a 'make-ref' message from the consumer (with some unique id)
before incrementing the reference count. There's no real difference
between *detecting* the consumer's connection/channel and providing
a ref/lock acquisition queue, except that the latter is probably
more structured, architecturally clearer and quite likely to be more
reliable.<br>
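For illustration, here's a minimal Python sketch of that ref/lock acquisition idea - the names, message shapes, and the resource/ref ids are mine (not from any library), and the actual queue plumbing is omitted; this is just the handler logic on the tracking side:

```python
class RefCounter:
    """Reference counts driven by explicit 'make-ref' / 'release-ref'
    messages, each carrying a unique id chosen by the consumer.
    No consumer *detection* is needed: a consumer that never sends
    make-ref simply never counts."""

    def __init__(self):
        self.holders = {}  # resource -> set of consumer-chosen ref ids

    def make_ref(self, resource, ref_id):
        # Idempotent per ref_id: a redelivered or replayed message
        # cannot inflate the count, because sets ignore duplicates.
        self.holders.setdefault(resource, set()).add(ref_id)
        return len(self.holders[resource])

    def release_ref(self, resource, ref_id):
        refs = self.holders.get(resource, set())
        refs.discard(ref_id)  # releasing an unknown ref is a no-op
        return len(refs)

counter = RefCounter()
counter.make_ref("doc-42", "node-a:1")
counter.make_ref("doc-42", "node-b:1")
counter.make_ref("doc-42", "node-b:1")   # duplicate delivery: no effect
counter.release_ref("doc-42", "node-a:1")
```

The point of keying on a unique ref id rather than counting messages is that acquisition becomes naturally idempotent, which matters a great deal once you consider redelivery and node crashes (see below).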
<br>
Even if you used ActiveMQ's detection functionality or RabbitMQ's
management HTTP APIs, the fundamental problem of race conditions
wouldn't go away. Before we go much further discussing various ways
you can design a solution - and I *am* interested in this discussion
BTW - please read
<a class="moz-txt-link-freetext" href="http://en.wikipedia.org/wiki/Byzantine_fault_tolerance#Byzantine_failures">http://en.wikipedia.org/wiki/Byzantine_fault_tolerance#Byzantine_failures</a>
and make sure you've understood the consequences of nodes *just
disappearing* and then *maybe* coming back later on.<br>
<br>
You've also still not explained what the consequences of losing
track of resources actually are. If one of your nodes dies, when it
comes back to life has any state been persisted and will that state
thus be used to try and re-acquire or release the 'lock count' for
this resource? What happens if your node sends an 'acquire' request
asynchronously, then starts to write the resource/lock state to its
own local database and dies (e.g., the machine crashes) before
committing the transaction? Because the 'acquire' request was not
synchronous, the master now thinks that your node holds the lock,
whilst the node does *not* think the same. If you bring the node
back online and then start asking for the resource lock, you're
breaking the contract for lock acquisition on that node unless
you're willing to make 'acquire' idempotent, which has its own
pitfalls. If you don't make 'acquire' idempotent, then acquisition
will fail. If you try to handle this by making 'acquire' re-entrant
and then try to release the node's lock, the master will be confused
as it thinks you hold the lock twice and the *lost lock acquisition*
will never be released.<br>
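That last failure mode is easy to demonstrate with a toy (entirely hypothetical) master that naively counts re-entrant acquisitions:

```python
# Hypothetical master state: resource -> per-node acquisition count.
locks = {}

def acquire(resource, node):
    # Re-entrant, *not* idempotent: every acquire bumps the count.
    locks.setdefault(resource, {}).setdefault(node, 0)
    locks[resource][node] += 1

def release(resource, node):
    locks[resource][node] -= 1

# node-a acquires, then crashes before persisting the lock locally.
acquire("doc-42", "node-a")

# After restarting, node-a has no record of holding the lock,
# so it acquires 'again'...
acquire("doc-42", "node-a")

# ...does its work, and releases exactly once.
release("doc-42", "node-a")

# The master still shows one outstanding acquisition that no node
# remembers holding: the lost lock will never be released.
assert locks["doc-42"]["node-a"] == 1
```

Under the set-based scheme sketched earlier, the replayed acquire would have been absorbed as a duplicate and the single release would have cleaned up completely.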
<br>
tl;dr: this is not a simple problem.<br>
<br>
Cheers,<br>
Tim<br>
<br>
</body>
</html>