[rabbitmq-discuss] request for help!

Rafael Schloming rafaels at redhat.com
Mon May 10 16:18:25 BST 2010


Tim Fox wrote:
> I've spent this morning going through the 1.0 PR3 spec. Firstly, it's 
> considerably simpler than 0.10, which is great news :)
> 
> Here's my 2p:
> 
> One thing I find quite strange is that the core spec doesn't actually 
> seem to mandate any queueing semantics anywhere. I've nothing 
> particularly against that - in fact, the idea that a node can do 
> different types of ordering is actually quite nice, however it's not a 
> queueing protocol. Shouldn't AMQP therefore be renamed to AMTP (Advanced 
> Message Transfer Protocol) ? ;)

The PDF posted actually contains more than one specification. The 
Transport specification could appropriately be named AMTP if we felt 
like giving it a name. The Messaging specification starts to introduce 
the basics of store-and-forward style intermediary nodes. This probably 
needs to be clarified a little, as mentioned in some of the other 
threads here, but the intention is to be able to support simple queuing 
scenarios based solely on the Messaging specification (and its 
dependents).

We do intend to define more specialized node behaviors in layered 
specifications, as well as the specific node types that an "AMQP Broker" 
is required to support. However, these layered specifications are still 
in progress.

> On a more serious note, my main concerns are mainly around complexity, 
> and verbosity of the wire format. The latter I suppose is not completely 
> independent from the former.
> 
> Regarding complexity. IMO a large part of the complexity in the spec. 
> seems to come from the way it tries to provide a once and only once 
> delivery guarantee. AIUI the way the spec. implements this guarantee is 
> something like the following when transferring a message from A to B:
> 
> a) message to be sent from A-->B
> b) ack sent back from B-->A
> c) "ack of ack" sent from A-->B - now the delivery tag can be removed 
> from the sender's cache
> 
> This results in a complex set of message states, and puts the burden on 
> both sides of the link to maintain a map of delivery tags, which would 
> also have to be persisted in order to provide a once and only once 
> delivery guarantee in the event of failures of node(s). This will also 
> require several syncs to storage at each transition (for durable 
> messaging). I.e. slow

This interaction is only slow if the Sender or Receiver waits for one 
message to be settled before moving on to the next. While the spec 
permits this, it does not mandate it. In fact, it is really an 
application choice how large a window of unsettled deliveries to permit, 
akin to a JMS producer choosing sync vs async publish, or a JMS consumer 
choosing sync vs async acknowledgment.

Likewise, the choice to persist delivery state is really up to the 
application, i.e. based on whether the message is persistent or not.
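
To make the windowing point concrete, here is a minimal Python sketch. 
It is not taken from the spec: the WindowedSender class, its method 
names, and the batch-settlement behavior are all assumptions made for 
illustration. It only counts how often a sender has to stop and wait 
for settlement at different window sizes.

```python
from collections import OrderedDict


class WindowedSender:
    """Toy sender that keeps at most `window` deliveries unsettled.

    Hypothetical API for illustration only; not an AMQP client.
    """

    def __init__(self, window):
        self.window = window
        self.unsettled = OrderedDict()  # delivery tag -> message awaiting settlement
        self.wire = []                  # stands in for the network connection

    def can_send(self):
        return len(self.unsettled) < self.window

    def send(self, tag, message):
        if not self.can_send():
            raise RuntimeError("window full: wait for a settlement first")
        self.unsettled[tag] = message
        self.wire.append((tag, message))

    def on_settled(self, tag):
        # The peer has settled `tag`; forget it and free a slot in the window.
        self.unsettled.pop(tag, None)


def stalls(window, count):
    """Count how often the sender must stop and wait while sending `count` messages."""
    sender = WindowedSender(window)
    waits = 0
    for i in range(count):
        if not sender.can_send():
            waits += 1
            # Assumption: the peer settles everything outstanding in one batch.
            for tag in list(sender.unsettled):
                sender.on_settled(tag)
        sender.send(f"tag-{i}", b"payload")
    return waits
```

With a window of 1 the sender waits before every message after the 
first (the "sync publish" case); with a window as large as the batch it 
never waits. Real implementations would settle incrementally rather 
than in one batch, so treat the counts as a rough comparison only.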

> Perhaps a simpler way of getting the once and only once guarantee is to 
> forget the delivery tag altogether and allow the sender to specify a 
> de-duplication-id - this is just a user generated id - e.g. a String or 
> a byte[], (can be generated from user application domain concepts - e.g. 
> order number).
> 
> When sending a message this id can be specified on the transfer. The 
> receiving end can then maintain a de-duplication cache. The 
> de-duplication cache can be implemented as a circular buffer which just 
> overwrites itself when full (this is what we do in HornetQ for reliable 
> bridging between nodes), this means the interaction c) is not necessary 
> or can just be sent intermittently to allow the cache to be cleared. The 
> de-dup cache still requires syncing to non-volatile storage to give the 
> once and only once (for durable messages), however it requires fewer 
> writes than the method described in the spec, and it has one less 
> interaction (you can get rid of the "ack of ack")

What you're describing above is exactly one of the intended usages of 
the delivery-tag. It is explicitly intended to be open ended so that it 
can be generated from application domain concepts (order number, 
filename, uuid, whatever), and the interaction involving settling the 
transfer state is intended to permit exactly the pattern you describe.

One thing that might be confusing here is that all the examples tend to 
focus on a single isolated transfer which does not convey the 
potentially asynchronous nature of the conversation.
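
The de-duplication pattern discussed above (an application-chosen tag 
plus a bounded cache on the receiving side) can be sketched roughly as 
follows. This is an in-memory toy under stated assumptions: the 
DedupCache name and its accept() method are invented here, and a 
durable version would also have to sync the cache to storage.

```python
class DedupCache:
    """Fixed-size circular buffer of recently seen tags (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.ring = [None] * capacity  # slots are overwritten in order when full
        self.seen = set()              # fast membership test over the ring contents
        self.next_slot = 0

    def accept(self, tag):
        """Return True the first time a tag is seen, False for a duplicate."""
        if tag in self.seen:
            return False
        evicted = self.ring[self.next_slot]
        if evicted is not None:
            # The oldest tag is overwritten and can no longer be detected.
            self.seen.discard(evicted)
        self.ring[self.next_slot] = tag
        self.seen.add(tag)
        self.next_slot = (self.next_slot + 1) % self.capacity
        return True
```

The trade-off is visible in the eviction branch: once a tag has been 
overwritten, a late retransmit of it is accepted again, so the cache 
must be sized for the largest expected gap between a send and its 
retry.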

> On recovery after system failure, the sender just blindly sends the 
> messages again, on receipt at the server any messages seen before will 
> just be rejected. No need for reattaching, sending maps of unsettled 
> transfers or other complex stuff like that.
> 
> Removing all this delivery tag book-keeping and session re-attachment 
> stuff, which seems unnecessary to me, would result in a dramatic 
> simplification.

What you're describing here would be a valid implementation. However, 
exchanging the unsettled state permits you to avoid unnecessary 
retransmits, and to resume large messages part way through. These are 
behaviors that the spec wants to admit.
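
As a rough illustration of why exchanging unsettled state helps, the 
sketch below compares what each side knows after a reconnect and works 
out what still needs to be sent. The data shapes are assumptions made 
for this example (a map of tag to total size on the sender, and tag to 
bytes received on the receiver); the spec's actual unsettled maps carry 
delivery state, not byte counts.

```python
def plan_resend(sender_unsettled, receiver_unsettled):
    """Decide where to resume each unsettled transfer after reattaching.

    sender_unsettled:   dict of tag -> total message size in bytes (assumed shape)
    receiver_unsettled: dict of tag -> bytes already received (assumed shape)
    Returns a dict of tag -> byte offset to resume from. Tags the receiver
    already holds in full are left out, so they are not retransmitted.
    """
    plan = {}
    for tag, size in sender_unsettled.items():
        received = receiver_unsettled.get(tag, 0)
        if received < size:
            plan[tag] = received
    return plan


def bytes_to_send(sender_unsettled, plan):
    """Total bytes still to transmit under a given resume plan."""
    return sum(sender_unsettled[tag] - offset for tag, offset in plan.items())
```

For three 100-byte messages where the receiver has all of the first 
and 40 bytes of the second, the plan resumes the second at offset 40 
and sends the third from the start: 160 bytes instead of the 300 a 
blind resend would cost.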

> Regarding verbosity of the wire format for message transfer; if you're 
> just passing a 12 byte message (e.g. stock price - 4 byte identifier + 8 
> byte price) then the overall encoded size is much higher than 12 bytes.
> 
> This will kill performance for small messages, making any AMQP compliant 
> implementation unable to compete in the world of lightweight 
> publish/subscribe messaging with other, non AMQP implementations which 
> don't have to conform to the AMQP wire format and can produce much more 
> lightweight encodings. The key to perf with lightweight pub/sub is to 
> make the encoded message size as small as possible and cram as many 
> messages as you can into single socket writes.
> 
> Now, lightweight pub/sub may not be the target domain for AMQP, in which 
> case it does not need to worry about it, however if a particular 
> messaging system supports multiple protocols including AMQP, it will not 
> do much for the adoption of AMQP if the best performance is not 
> achievable using the AMQP protocol - users will fall back to using the 
> proprietary protocol offered by the vendor.

The type system explicitly distinguishes between "types" and "encodings" 
in order to be able to allow more efficient encodings of a given type in 
the future.

Right now we've chosen a very flexible encoding because it is invaluable 
to be able to make certain changes (e.g. adding an optional field) and 
still preserve wire compatibility with any existing implementations.
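
The compatibility property can be illustrated with a toy encoding. To 
be clear, this is not the AMQP 1.0 wire format: the layout (a one-byte 
field count followed by length-prefixed fields) and the function names 
are assumptions chosen only to show how a self-describing layout lets 
old and new peers interoperate when an optional trailing field is 
added.

```python
import struct


def encode_fields(values):
    """Encode byte strings as: field count (1 byte), then 2-byte length + data each."""
    out = bytearray([len(values)])
    for value in values:
        out += struct.pack(">H", len(value)) + value
    return bytes(out)


def decode_fields(data, names):
    """Decode into a dict keyed by the field names this decoder knows about.

    Fields missing from the wire decode as None; extra trailing fields
    that this decoder does not know about are skipped.
    """
    count = data[0]
    offset = 1
    fields = []
    for _ in range(count):
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        fields.append(data[offset:offset + length])
        offset += length
    result = {name: None for name in names}
    for name, value in zip(names, fields):
        result[name] = value
    return result
```

A decoder that knows three fields reads a two-field message with the 
third defaulting to None, and a decoder that knows only two fields 
reads a three-field message by ignoring the extra one. The price is 
the framing: in this toy, 5 bytes of data become 10 bytes on the wire, 
which is the kind of overhead a more compact encoding could later 
trim.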

Once the protocol is implemented and we have the benefit of profiling, 
deployment experience, etc, we can, if we choose to, make allowances for 
very small messages in a number of ways, e.g. we could reduce overhead 
by defining a special frame body or even a special frame type for small 
transfers.

> A short comment on transactions. I have to be honest here, I spent about 
> 30 mins reading the chapter on transactions several times. I have to say 
> at the end of it I am not much further understanding it. :(
> 
> However maybe that is moot - a part of me is thinking that transactions 
> don't really belong in the core spec. Perhaps the core spec should be 
> concerned with allowing the reliable movement of messages between nodes. 
> With that in place, transactions could be layered on top in another spec (?)

The transaction portion is actually already a separate specification 
(you'll notice that despite being in a single PDF, Types, Transport, 
Messaging, Transactions, and Security are all labeled "Books" not 
"Sections" or "Chapters").

--Rafael



