[rabbitmq-discuss] Use a mysql database table as the provider for rabbitmq queue
tim at rabbitmq.com
Wed Oct 17 14:32:33 BST 2012
On 10/17/2012 02:03 PM, Ryan R. wrote:
> Ok I've definitely got what you're trying to say concerning the use of
> DBs and message broker.
Good, and your post has helped me to understand what you're doing more
clearly, so this should get easier now.
> To explain a bit more, I'll answer the few questions you've scattered
> across your reply :
> Is it a distributed architecture ?
> Definitely yes, each app will run on a different server
Ok, so that's more the classic integration architecture for sure.
> Why do I need to keep apps separate ?
> Simply because my project can be viewed as an application suite
> composed of about 5 different apps. The end-user will be able to
> choose which of the apps he wants and I want to only provide what is
> absolutely necessary for the end-user based on what he asked for,
> nothing more nothing less. Therefore if the end user only wants one
> app out of five I'm not going to provide him with a RabbitMQ, he will
> have no use for it. However should he ask for a set of apps that
> need to communicate, then I'll install a RabbitMQ to let them communicate.
Ok, well that is certainly do-able and I can understand the reasons why
a particular customer site that doesn't need the middle-ware might as
well skip it.
> Now like I said, I'm looking for a way to do this without changing
> anything in the apps' code.
> What I'm looking for is a solution where I would have something like a
> "watcher" external app that would keep an eye on what's going on in
> each app and manage the content of each RabbitMQ queues accordingly.
Right, and in fact that's probably quite a common pattern in integration
architectures. What I would ask is this: how are these applications
supposed to know about the incoming data? And how, for that matter, is
the data supposed to get into the queues in the first place?
Now, one solution to this is to do what you say and write a new
application, a copy of which resides alongside each main app in the
system and basically does two things:
1. periodically read data from the database for application 1 and write
it to a queue
2. constantly (in a worker thread, or periodically in the main one) read
data from a queue and write it to the application database
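To make those two steps concrete, here is a minimal sketch of such a 'watcher'. Everything in it is an assumption for illustration: sqlite3 stands in for MySQL and a plain in-process queue.Queue stands in for a RabbitMQ queue, and the table names (app_events, app_inbox) and function names are hypothetical. A real watcher would use a MySQL driver and an AMQP client instead.

```python
import queue
import sqlite3


def make_db():
    """Create an in-memory stand-in for one application's database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE app_events (id INTEGER PRIMARY KEY, payload TEXT)")
    db.execute("CREATE TABLE app_inbox (id INTEGER PRIMARY KEY, payload TEXT)")
    return db


def publish_new_rows(db, outbox, last_id):
    """Step 1: read rows newer than last_id and put each one on the outgoing queue."""
    rows = db.execute(
        "SELECT id, payload FROM app_events WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    for row_id, payload in rows:
        outbox.put(payload)
        last_id = row_id
    # The caller keeps this value so the next poll only picks up newer rows.
    return last_id


def apply_incoming(db, inbox):
    """Step 2: drain the incoming queue and write each message to the local database."""
    applied = 0
    while True:
        try:
            payload = inbox.get_nowait()
        except queue.Empty:
            break
        db.execute("INSERT INTO app_inbox (payload) VALUES (?)", (payload,))
        applied += 1
    db.commit()
    return applied
```

Each copy of the watcher would call publish_new_rows on a timer and apply_incoming either in a worker thread or on the same timer. Note that the watcher has to remember last_id somewhere durable, which is already a hint that it is doing replication work.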
This is fine, in theory, as long as you bear in mind that what you're
basically doing is making the overall system into a sort of 'shared
database where tables are replicated/synchronised via queues'. Think
about it a while and it's clear that, even if the tables being read from
and/or written to are different, you basically end up with the same
thing. Then ask yourself if you'd design a distributed database that
uses a messaging system as its replication back-bone.
Now I don't want to completely put you off that approach. It certainly
has some things going for it: you do not need to change the application
code anywhere, and if the application doesn't need the messaging system
then you just avoid installing the 'watcher' process (and broker) in the
first place. Job done!
There are things to consider with this approach though, which will
reinforce my point about it being like a distributed database that
isn't. Consider some rhetorical questions about the systems that may or
may not be communicating:
- do they rely on incoming information in order to do their job?
- does incoming information lead (causally) to more outgoing information?
- does outgoing information rely on future incoming information?
- and for any of the above, does the order in which information is
handled ever matter?
If the answer to any of those questions is 'yes', then it might be worth
reconsidering the use of a separate 'watcher'. All of those scenarios
can lead to races, deadlocks and all manner of other problems usually
associated with concurrent application development, because a
distributed system that uses asynchronous message passing is
*inherently* concurrent in nature and races/deadlocks/etc are all *much*
harder to spot (and fix) when they're distributed!
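As a small illustration of why ordering can matter, the sketch below replays the same two messages in two different arrival orders. The message shapes ("set" and "add") and the function names are made up for the example and are not taken from any of the applications being discussed.

```python
def apply(state, message):
    """Apply one hypothetical update message to a numeric state."""
    kind, value = message
    if kind == "set":
        return value
    if kind == "add":
        return state + value
    raise ValueError(kind)


def replay(messages, initial=0):
    """Apply messages one after another, in the order they arrive."""
    state = initial
    for message in messages:
        state = apply(state, message)
    return state


sent = [("set", 10), ("add", 5)]
in_order = replay(sent)                    # 15
reordered = replay(list(reversed(sent)))   # 10
```

The two receivers end up with different values even though both saw exactly the same messages. If your applications exchange anything that behaves like this, the order of delivery is part of the design.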
> That peculiar aspect of the apps is for my longterm evolution of my
> app : for my personal use, the simpler approach of the code for
> pushing a message in the queues embedded in the app would be fine. But
> since I'd like to make it evolve to something else, I'd rather think
> ahead and try to do so right now.
Well, I'm unable to comment on the design choices for a technology I
know nothing about, but I will offer this. If messaging is made into a
specific (service) layer for an application, that application should
usually be able to evolve without ever changing the messaging code. The
event publication subsystem is, after all, a fairly static design point
- you choose what to publish and when (which can, of course, be made
configurable in any language/platform) and then just leave the
publication code alone. The event listening subsystem is likewise fairly
static. You start a worker thread (or external process) and register
callbacks that get run when a message arrives. Just as with the
publication subsystem, you can pass whatever callbacks you like and
these can be chosen at build time *or* at runtime, possibly driven by
configuration settings if that is required. Finally, the remaining
problem is what to do when there *is no broker* because the application
is standalone. I think this is fairly simple - just set a flag (or
configuration value) when starting your application and let the event
handling subsystems (both the listener and publisher) treat all function
calls as a no-op when there is no broker to communicate with.
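A rough sketch of what such a layer might look like follows. It is only an illustration under stated assumptions: the class name Messaging, its methods and the enabled flag are all hypothetical, and an in-process queue.Queue stands in for the connection to the broker so that the example runs on its own.

```python
import queue


class Messaging:
    """Hypothetical messaging layer: every call is a no-op when there is no broker."""

    def __init__(self, enabled, transport=None):
        self.enabled = enabled
        # Assumption: queue.Queue stands in for a real broker connection.
        self.transport = (transport or queue.Queue()) if enabled else None
        self.callbacks = []

    def publish(self, event):
        """Send an event; returns False (and does nothing) in standalone mode."""
        if not self.enabled:
            return False
        self.transport.put(event)
        return True

    def listen(self, callback):
        """Register a callback to run for each incoming message."""
        if self.enabled:
            self.callbacks.append(callback)

    def poll(self):
        """Deliver pending messages to the callbacks; returns how many were handled."""
        if not self.enabled:
            return 0
        handled = 0
        while True:
            try:
                event = self.transport.get_nowait()
            except queue.Empty:
                return handled
            for callback in self.callbacks:
                callback(event)
            handled += 1
```

The application code calls publish and listen in exactly the same way in both deployments; only the flag passed at start-up differs. In a real system poll would be replaced by a worker thread or the client library's own consume loop.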
The great advantage of this approach, IMO, is precisely the one you seem
to want - that the applications can be evolved over time to publish new
data and/or respond to new (or existing) messages in varying ways.
Providing you have the infrastructure to make decisions about what to do
in callbacks based on configuration settings, you *might* even be able
to add new behaviours to your applications without writing any code! ;)
But... there is the up-front cost of embedding the messaging
technology into your applications in the first place. Personally, I
would not make that cost the primary driver behind the design decision
though. In your shoes, I would consider carefully how the interactions
between these applications work and understand the distribution and
information sharing model they need to conform to first. Once that is
clear, then you can decide whether the timing, ordering and/or
existential questions about the system architecture as a whole really
matter. If they don't, then you might as well choose the external
'watcher' as this minimises the impact on your code base and simplifies
your deployments. If, on the other hand, they do matter, then you should
*very* carefully consider whether the decoupling that the 'watcher'
offers is likely to be a hindrance or a help. If enough of those causal
relationships in the information sharing model require coupling of some
kind, then interacting directly with a messaging system (where you can
choose whether or not you care about acknowledgements/receipts,
batching, transactions/confirms, and other such features) might well
prove to be architecturally important.
HTH and makes some degree of sense!
> Hope these explanations help a little more.
> 2012/10/17 Tim Watson <tim at rabbitmq.com>
> On 10/17/2012 11:40 AM, Ryan R. wrote:
> I think I understand what you mean with the shared library.
> However, in my case, RabbitMQ would only be installed if need
> be (meaning more than one of the apps are present, and two of
> those need to be synchronised for part of their data).
> That actually complicates the picture somewhat - is there a reason
> why this is the case? In a typical integration architecture, the
> messaging broker is deployed centrally (on the LAN somewhere) and
> clients choose whether or not they want to connect to it from
> whatever machine they're running on. To me, it is sounding like
> you're describing an architecture where both applications reside
> on the same machine and assuming that the broker will also need to
> be co-resident with them, which is not really the case, though
> there's nothing to prohibit that either.
> That said, using a shared library would require me to
> "include/import" said library when I need to, therefore making
> me change my app code depending on the situation I'm in.
> Well yes, if you're going to add messaging capabilities to your
> applications that don't currently support it, then you are going
> to have to write *some* code and integrate it into them! :)
> And said library would only be required when there's a
> RabbitMQ available anyway.
> I think you're making your life more complicated than it needs to
> be by thinking about whether the messaging broker is available vs.
> not. The broker should *always* be available when applications
> residing on different machines need to communicate with one
> another, regardless of whether those applications are running or
> not. Again, it feels like you're trying to deal with applications
> running on the same machine - have I picked that up correctly? It
> might help if you explained your architecture in a bit more
> detail, so I can understand exactly what you're trying to achieve.
> Now a bit further in your message you talk about a listener
> I'd like to know a bit more about this.
> How would an external library be able to listen to anything
> happening within my app ?
> Would it be listening on the DB queries ?
> No, not at all. Let's say you've got two applications, App1 and
> App2. You'll write some library code that both applications share,
> that probably looks something like this (with *wide* variations
> depending on language/platform - I've just written pseudo code to
> keep things simple):
> function init = do
> function listen = do
> function publish = do
> Now in your applications, you'll call the shared 'init' library
> function when you're starting up to bootstrap the connection to
> the broker. When your application is publishing data, it calls
> publish and if/when you need to subscribe to data then you'll call
> 'listen'. The fact is that 'how to listen' for incoming messages
> really depends on how you're going to use them. But the point is
> that the applications read from and write to the messaging broker,
> and do so independently of database tables. You *may* decide to do
> something like write a middle-man application that periodically
> reads a database table and publishes each row to the messaging
> broker so it can be read from a queue, or do that with a worker
> thread instead of a separate application. I would *not* do
> anything here with the database though. If applications need to
> share data, then **they should send it to one another via message
> queues.** If they need to persist data, they should persist their
> own data in their own tables in the database, but they should
> **not use the database to communicate with one another.** That is
> the key thing with using messaging instead of shared data(bases).
> There is an overhead in sending (and in some cases, duplicating)
> data between applications of course. This is *more* than
> compensated for by the reduced coupling that comes from
> integrating using messaging technology. This approach may not be
> suited to integrating applications that are running on the same
> physical machine and are tightly and deliberately coupled however.
> I can't really elaborate on the suitability of messaging for your
> project without understanding a good deal more about it I'm afraid.
> I hope that clears a few things up at least! :)