[rabbitmq-discuss] Massive distributed pub/sub system

Thu Jan 27 15:06:28 GMT 2011

Hi Kaiduan,

Your questions suggest you're attempting something very interesting, 
which I would love to hear more about.  Federation and distribution are 
very much on our minds here at Rabbit Towers, as you might imagine.

In the meantime ---

> 1) Any user can subscribe to the interested topic, but only topic
> owner can publish message to the group. There is no limit on the
> number of subscribers in each group. Potentially it can be huge, for
> example, the fans of U2 around the world.

Is there anything that determines who can own topics?  For example, is 
it just the first person to declare a topic that is the exclusive 
publisher to that topic?  Can the publishing rights, so to speak, be 
handed to another publisher?

> 2) User including subscriber and publisher is not always connected to
> the system, and not always connected to the same node in the system,
> and the message delivery should be guaranteed. When publisher
> publishes a message, the system should deliver the message to all
> subscribers. If the subscriber is connected to the system, the message
> should be delivered immediately. If the subscriber is not connected,
> system should hold the message, and the next time the user comes
> connected, system will deliver the message to the user. Just imagine
> the user can be any mobile user and moves out of cellular coverage.

So far as I know, this is still an area of active research.  Typically, 
distributed pub/sub systems like Scribe and Hermes give rather weak 
guarantees about delivery and ordering.

In particular, "If the subscriber is not connected, system should hold 
the message, and the next time the user comes connected, system will 
deliver the message to the user" is difficult if the subscriber can 
connect to any node, and I don't think most pub/sub systems would allow 
that.

> 3) The system should be able to support tens to hundreds of millions
> of users spread around the world, so the system will consist of
> hundreds of nodes located in different physical locations.

This is rather ambitious.  But systems on this kind of scale are indeed 
being built: http://ci.oceanobservatories.org/ for instance.

> 4) The number of topics/groups in the system is unlimited.
>
> 5) As to the latency, it should be in the range of 1 minute if
> subscriber is connected.
>
> It looks like RabbitMQ already has functionalities to meet the above
> requirement, for example, fan out exchange, and persistent message.
> The following is my understanding on how to build the above system
> with RabbitMQ,

Perhaps, in the sense that it can be a building block.  But it doesn't 
fulfill all the requirements you've given above, "out of the box".  In 
other words, you (or we) would have to invent a substantial part of the 
technology.
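
To picture the building blocks you mention (a fanout exchange plus one 
queue per subscriber), here is a toy in-memory sketch in Python.  It is 
not RabbitMQ and uses no client library; the class and method names are 
made up for illustration.  It shows only the easy part, a queue holding 
messages while its subscriber is away, and deliberately leaves out 
nodes, failures and distribution, which is where the invention would be 
needed.

```python
class ToyFanoutExchange:
    """Hypothetical model: every published message is copied to every bound queue."""

    def __init__(self):
        self.queues = {}  # subscriber name -> list of undelivered messages

    def bind(self, subscriber):
        # Each subscriber gets its own queue; binding twice is harmless.
        self.queues.setdefault(subscriber, [])

    def publish(self, body):
        # Fanout: no routing key, every bound queue receives a copy.
        for pending in self.queues.values():
            pending.append(body)

    def drain(self, subscriber):
        # Called when the subscriber (re)connects: hand over what was held.
        held, self.queues[subscriber] = self.queues[subscriber], []
        return held


u2 = ToyFanoutExchange()
u2.bind("Alice")
u2.bind("Bob")

u2.publish("m1")
alice_first = u2.drain("Alice")   # Alice is online and reads straight away
u2.publish("m2")
bob_catch_up = u2.drain("Bob")    # Bob was away and catches up later
alice_second = u2.drain("Alice")

print(alice_first, bob_catch_up, alice_second)
```
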

> a) Publisher creates an exchange. For example, U2 creates an exchange
> noted as "U2" for "U2's next world wide tour" on Node 1.
>
> b) Each subscriber creates a queue in the system. For example, Alice
> creates a queue noted as "Alice" on Node 2 and binds to exchange U2;
> and Bob creates a queue noted as "Bob" on Node 3 and binds to exchange
> "U2" on Node 2.
>
> c) U2 publishes a message, m1 on Node 1 to exchange "U2"; and RabbitMQ
> will deliver the message m1 to queue "Alice" on Node 2 and to queue
> "Bob" on Node 3.
>
> How do we handle the following scenarios?
>
> 1) When U2 wants to publish a message, but Node 1 is down.
>
> 2) When message m1 is delivered to queue "Alice" on Node 2, Node 2
> crashed or the network link between Node 1 (publisher's node) and
> Node 2 is disconnected? Will exchange "U2" on Node 1 persist the message?

No; the message will be lost so far as Alice is concerned.  In AMQP 
terms, you're asking for queues to be replicated.  Rabbit doesn't do 
this, yet.

(Actually, we are working on queue replication right now.  I think you 
would need both replication and some kind of queue migration or 
distribution.)

> 3) After message m1 is arrived on queue "Alice", but the connection
> between Alice and Node 2 is gone, the message will be stored on Node
> 2, right? Next time, Alice connects to the system, but she is
> connected to Node n instead of Node 2, how to handle this?

In Rabbit's clustering as it works now, the queue stays on Node 2, but 
Alice can consume from it while connected to Node n; the messages will 
be delivered across the cluster to her.
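
As a rough picture of that behaviour, here is a toy model in Python.  
It is only a sketch: the class, the node names and the tuple it returns 
are invented for illustration, and it assumes a healthy cluster in which 
every node can reach the queue's home node.

```python
from collections import deque


class ToyCluster:
    """Hypothetical model: a queue lives on one node but is reachable from all."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.home = {}      # queue name -> node hosting the queue
        self.pending = {}   # queue name -> messages waiting for a consumer

    def declare_queue(self, name, node):
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        # The first declaration fixes the queue's home node.
        self.home.setdefault(name, node)
        self.pending.setdefault(name, deque())

    def publish(self, queue, body):
        self.pending[queue].append(body)

    def consume(self, queue, connected_node):
        """Drain the queue for a consumer attached to connected_node."""
        if connected_node not in self.nodes:
            raise ValueError(f"unknown node: {connected_node}")
        home = self.home[queue]
        delivered = []
        while self.pending[queue]:
            body = self.pending[queue].popleft()
            # Record where the message was stored and where it was handed over.
            delivered.append((body, home, connected_node))
        return delivered


cluster = ToyCluster(["node2", "node_n"])
cluster.declare_queue("Alice", "node2")
cluster.publish("Alice", "m1")          # stored on node2 while Alice is away
delivered = cluster.consume("Alice", "node_n")
print(delivered)
```
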

> 4) What is the multi-cast technology used in RabbitMQ to deliver the
> message to queues located on different locations spreading around
> different countries?

There isn't any right now.  Clustering is really for nodes that are 
co-located and have reliable connections.  It uses Erlang's distribution 
mechanism, which essentially forms a fully-connected graph of nodes.  It 
doesn't really scale beyond a handful of nodes.
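
To put a number on "fully-connected": n nodes need n(n-1)/2 pairwise 
links, so the link count grows quadratically.  A small Python sketch 
(the function name is mine, and it counts links only; it says nothing 
about traffic on them):

```python
def full_mesh_links(nodes: int) -> int:
    """Number of pairwise connections in a fully-connected graph of `nodes` nodes."""
    if nodes < 0:
        raise ValueError("node count must be non-negative")
    return nodes * (nodes - 1) // 2


for n in (3, 10, 100):
    print(n, full_mesh_links(n))
# 3 nodes need 3 links, 10 need 45, 100 need 4950
```
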

There /is/ a plugin called the "shovel", which will relay messages from 
one broker to another.  However, it is statically configured, and 
because it does the relaying over AMQP it is bound by AMQP's routing 
rules: you cannot tell it to relay all messages from a direct exchange, 
only (for example) the messages with a particular routing key.
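
The routing-key constraint follows from how a direct exchange matches 
bindings.  Here is a toy in-memory model in Python; it is not the shovel 
plugin or any real client API, and the names (ToyDirectExchange, 
shovel_queue, the "tour" and "album" keys) are made up for illustration.

```python
from collections import defaultdict


class ToyDirectExchange:
    """Hypothetical model: a message reaches only queues bound with its exact key."""

    def __init__(self):
        self.bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, queue, routing_key):
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, body):
        # Exact match only: there is no "all keys" binding on a direct exchange.
        for queue in self.bindings.get(routing_key, []):
            queue.append((routing_key, body))


source = ToyDirectExchange()

# A relay consuming over AMQP has to bind a queue like any other consumer,
# so it sees only the keys it was configured with.
shovel_queue = []
source.bind(shovel_queue, "tour")

source.publish("tour", "m1")
source.publish("album", "m2")   # no binding for "album", so this is not relayed

relayed_to_remote = list(shovel_queue)
print(relayed_to_remote)
```
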

Regards,
Michael