[rabbitmq-discuss] Massive distributed pub/sub system
Jim Irrer
irrer at umich.edu
Tue Feb 15 17:20:04 GMT 2011
BTW - Thanks for the article reference. That helped.
The only real disadvantage of AMQP is that it does not monitor
whether clients are present/active, and that
requirement is application-specific.
- Jim
Jim Irrer irrer at umich.edu (734) 647-4409
University of Michigan Hospital Radiation Oncology
519 W. William St. Ann Arbor, MI 48103
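
Since presence monitoring is left to the application, one way to picture it is a small heartbeat tracker. This is only a sketch under assumptions: PresenceTracker, heartbeat and is_active are hypothetical names, not part of AMQP or any RabbitMQ client, and timestamps are passed in explicitly so the example stays deterministic.

```python
class PresenceTracker:
    """Hypothetical application-level presence tracker (not an AMQP feature)."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # client id -> timestamp of the last heartbeat

    def heartbeat(self, client_id, now):
        # Record that the client was heard from at time `now` (in seconds).
        self.last_seen[client_id] = now

    def is_active(self, client_id, now):
        # A client counts as active if it sent a heartbeat within the timeout.
        seen = self.last_seen.get(client_id)
        return seen is not None and now - seen <= self.timeout


tracker = PresenceTracker(timeout_seconds=30)
tracker.heartbeat("alice", now=100)
alice_at_120 = tracker.is_active("alice", now=120)  # within the 30 s window
alice_at_200 = tracker.is_active("alice", now=200)  # heartbeat is too old
bob_at_120 = tracker.is_active("bob", now=120)      # never seen
```

How the heartbeats travel (a dedicated queue, a side channel, something else) is an application choice that this sketch does not address.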
On Fri, Jan 28, 2011 at 7:51 AM, Jim Apperly <jim at rabbitmq.com> wrote:
> On 27 January 2011 19:25, Jim Irrer <irrer at umich.edu> wrote:
>
>> Would XMPP be more appropriate for this application? I'm not real
>> familiar with it but I've heard that it has different strengths than AMQP.
>> I'm interested in the pros and cons.
>>
>
> This may help shed some light:
>
> http://www.opensourcery.co.za/2009/04/19/to-amqp-or-to-xmpp-that-is-the-question/
>
> Jim
>
>
>
> Thanks,
>>
>> - Jim
>>
>> Jim Irrer irrer at umich.edu (734) 647-4409
>> University of Michigan Hospital Radiation Oncology
>> 519 W. William St. Ann Arbor, MI 48103
>>
>>
>>
>> On Thu, Jan 27, 2011 at 10:06 AM, Michael Bridgen <mikeb at rabbitmq.com> wrote:
>>
>>> Hi Kaiduan,
>>>
>>> Your questions suggest you're attempting something very interesting,
>>> which I would love to hear more about. Federation and distribution are very
>>> much on our minds here at Rabbit Towers, as you might imagine.
>>>
>>> In the meantime ---
>>>
>>>
>>> 1) Any user can subscribe to a topic of interest, but only the topic
>>>> owner can publish messages to the group. There is no limit on the
>>>> number of subscribers in each group. Potentially it can be huge, for
>>>> example, the fans of U2 around the world.
>>>>
>>>
>>> Is there anything that determines who can own topics? For example, is it
>>> just the first person to declare a topic that is the exclusive publisher to
>>> that topic? Can the publishing rights, so to speak, be handed to another
>>> publisher?
>>>
>>>
>>> 2) Users, both subscribers and publishers, are not always connected to
>>>> the system, and not always connected to the same node in the system,
>>>> and message delivery should be guaranteed. When a publisher
>>>> publishes a message, the system should deliver the message to all
>>>> subscribers. If the subscriber is connected to the system, the message
>>>> should be delivered immediately. If the subscriber is not connected,
>>>> system should hold the message, and the next time the user comes
>>>> connected, system will deliver the message to the user. Just imagine
>>>> the user can be any mobile user who moves out of cellular coverage.
>>>>
>>>
>>> So far as I know, this is still an area of active research. Typically,
>>> distributed pub/sub systems like Scribe and Hermes give rather weak
>>> guarantees of delivery and ordering and so on.
>>>
>>> In particular, "If the subscriber is not connected, system should hold
>>> the message, and the next time the user comes connected, system will deliver
>>> the message to the user" is difficult if the subscriber can connect to any
>>> node, and I don't think most pub/sub systems would allow that.
>>>
>>>
>>> 3) The system should be able to support tens to hundreds of millions
>>>> of users spread around the world, so the system will consist of hundreds
>>>> of nodes located in different physical locations.
>>>>
>>>
>>> This is rather ambitious. But systems on this kind of scale are indeed
>>> being built: http://ci.oceanobservatories.org/ for instance.
>>>
>>>
>>> 4) The number of topics/groups in the system is unlimited.
>>>>
>>>> 5) As to latency, it should be in the range of 1 minute if
>>>> the subscriber is connected.
>>>>
>>>> It looks like RabbitMQ already has functionality to meet the above
>>>> requirements, for example, fanout exchanges and persistent messages.
>>>> The following is my understanding of how to build the above system
>>>> with RabbitMQ:
>>>>
>>>
>>> Perhaps, in the sense that it can be a building block. But it doesn't
>>> fulfill all the requirements you've given above, "out of the box". In other
>>> words, you (or we) would have to invent a substantial part of the
>>> technology.
>>>
>>>
>>> a) Publisher creates an exchange. For example, U2 creates an exchange
>>>> noted as "U2" for "U2's next world wide tour" on Node 1.
>>>>
>>>> b) Each subscriber creates a queue in the system. For example, Alice
>>>> creates a queue noted as "Alice" on Node 2 and binds to exchange U2;
>>>> and Bob creates a queue noted as "Bob" on Node 3 and binds to exchange
>>>> "U2" on Node 1.
>>>>
>>>> c) U2 publishes a message, m1 on Node 1 to exchange "U2"; and RabbitMQ
>>>> will deliver the message m1 to queue "Alice" on Node 2 and to queue
>>>> "Bob" on Node 3.
>>>>
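The exchange-and-queue layout in steps a) to c) can be sketched as a toy in-memory model. This is not the RabbitMQ client API: the FanoutExchange class and its bind, publish and drain methods are hypothetical names used only to show the intended semantics, where each subscriber's queue holds messages until that subscriber connects. It models a single process and says nothing about node failure or cross-node delivery.

```python
from collections import deque


class FanoutExchange:
    """Toy model of a fanout exchange: every bound queue gets a copy."""

    def __init__(self, name):
        self.name = name
        self.queues = {}  # queue name -> deque of pending messages

    def bind(self, queue_name):
        # Declaring and binding a queue is idempotent in this model.
        self.queues.setdefault(queue_name, deque())

    def publish(self, message):
        # Copy the message to every bound queue, whether or not the
        # subscriber behind it is currently connected.
        for pending in self.queues.values():
            pending.append(message)

    def drain(self, queue_name):
        # Called when a subscriber (re)connects: hand over everything held.
        pending = self.queues[queue_name]
        delivered = list(pending)
        pending.clear()
        return delivered


u2 = FanoutExchange("U2")
u2.bind("Alice")
u2.bind("Bob")
u2.publish("m1")
alice_first = u2.drain("Alice")  # Alice is connected and reads m1
u2.publish("m2")                 # Bob is still offline; both messages wait
bob_backlog = u2.drain("Bob")    # Bob connects later and gets m1 and m2
```

Keeping one queue per subscriber is what lets an offline subscriber catch up later; the open question in the thread is what happens to that queue when its node is unreachable.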
>>>> How do we handle the following scenarios?
>>>>
>>>> 1) When U2 wants to publish a message, but Node 1 is down.
>>>>
>>>> 2) When message m1 is being delivered to queue "Alice" on Node 2, and
>>>> Node 2 crashes or the network link between Node 1 (the publisher's node)
>>>> and Node 2 is disconnected? Will exchange "U2" on Node 1 persist the message?
>>>>
>>>
>>> No; the message will be lost so far as Alice is concerned. In AMQP
>>> terms, you're asking for queues to be replicated. Rabbit doesn't do this,
>>> yet.
>>>
>>> (Actually, we are working on queue replication right now. I think you
>>> would need both replication and some kind of queue migration or
>>> distribution.)
>>>
>>>
>>> 3) After message m1 has arrived in queue "Alice", but the connection
>>>> between Alice and Node 2 is gone, the message will be stored on Node
>>>> 2, right? The next time Alice connects to the system, she is
>>>> connected to Node n instead of Node 2. How is this handled?
>>>>
>>>
>>> In Rabbit's clustering as it works now, the messages will be delivered
>>> across the cluster to Node n.
>>>
>>>
>>> 4) What is the multicast technology used in RabbitMQ to deliver the
>>>> message to queues in different locations spread across
>>>> different countries?
>>>>
>>>
>>> There isn't any right now. Clustering is really for nodes that are
>>> co-located and have reliable connections. It uses Erlang's distribution
>>> mechanism, which essentially forms a fully-connected graph of nodes. It
>>> doesn't really scale beyond a handful of nodes.
>>>
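To put a number on the fully-connected graph mentioned above: a full mesh of n nodes has n(n-1)/2 links, so the link count grows quadratically. The short calculation below is only that arithmetic; mesh_links is a hypothetical helper name, and the link count is a rough proxy rather than a measurement of Erlang distribution overhead.

```python
def mesh_links(nodes: int) -> int:
    """Number of pairwise links in a fully-connected graph of `nodes` nodes."""
    return nodes * (nodes - 1) // 2


# Link counts for a handful of nodes versus the hundreds discussed in the thread.
link_counts = {n: mesh_links(n) for n in (3, 10, 100)}
# 3 nodes -> 3 links, 10 nodes -> 45 links, 100 nodes -> 4950 links
```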
>>> There /is/ a plugin called the "shovel", which will relay messages from
>>> one broker to another. However, it is statically configured, and
>>> constrained by using AMQP to do the relaying (i.e., you cannot tell it to
>>> relay all messages from a direct exchange; only to relay, e.g., messages
>>> with a particular routing key).
>>>
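The kind of statically configured, key-filtered relaying described for the shovel can be pictured with a toy model. This is not the shovel plugin's configuration or code: relay is a hypothetical function, queues are plain deques, and messages are assumed to be (routing_key, body) tuples. It only illustrates that a relay fixed to one routing key moves matching messages and leaves the rest behind.

```python
from collections import deque


def relay(source, destination, routing_key):
    """Move messages with the given routing key from source to destination.

    Messages are (routing_key, body) tuples. Non-matching messages stay on
    the source in their original order. Returns the number of messages moved.
    """
    kept = deque()
    moved = 0
    while source:
        key, body = source.popleft()
        if key == routing_key:
            destination.append((key, body))
            moved += 1
        else:
            kept.append((key, body))
    source.extend(kept)
    return moved


broker_a = deque([("u2.tour", "m1"), ("other", "x"), ("u2.tour", "m2")])
broker_b = deque()
moved = relay(broker_a, broker_b, "u2.tour")
```

Because the routing key is fixed up front in this sketch, relaying a new topic would mean adding another relay, which is the static-configuration limit raised above.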
>>>
>>> Regards,
>>> Michael
>>>
>>> _______________________________________________
>>> rabbitmq-discuss mailing list
>>> rabbitmq-discuss at lists.rabbitmq.com
>>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>>