[rabbitmq-discuss] GPB and AMQP/RabbitMQ

Wed Mar 25 22:39:39 GMT 2009

On Wed, Mar 25, 2009 at 9:32 PM, Elsner, Robert
<Robert.Elsner at echostar.com> wrote:
>> Interestingly, when GPB came out, ActiveMQ did this precise
>> thing (to have a parallel wire protocol beyond the existing
>> ones that was based on using GPB as the wire representation),
>> and they said that it worked quite well. That being said, it
>> definitely wouldn't work for AMQP given that it is designed
>> as a completely neutral wire representation, but it might be
>> useful for members of the WG to know that many people find
>> the current wire encoding a bit too high-overhead.
>
> I did not know ActiveMQ had that, very interesting.

Bear in mind that I don't know where James went with it. I know that
he had the transport in an experimental branch, but I'm not sure if it
was anything other than a thought experiment for him. If nothing else,
the new transport was faster and had lower CPU overhead than the
existing one, so he probably should have kept it. :-)

>  The part I have
> enjoyed the most about GPB is that the languages we use (C#, Java,
> Erlang, C) have generated source, operate very similarly, and work
> incredibly well.  GPB versus our current packet format adds a little
> overhead, but the flexibility and extensibility far exceeds what we've
> done thus far.  I know there are other libraries out there, do you know
> if anyone has done anything similar with AMQP and the other formats?

There are plenty of people using Thrift in similar use cases, and some
stuff that I've done with other MOM infrastructure was relatively
similar (which I'll talk about in a little bit).

> I've looked over some of the AMQPWG posts and they do seem to have the
> idea that the overhead can be a little bit extreme.  Perhaps a push from
> the RabbitMQ team to help work this out would be beneficial.  Hopefully
> some good things will come out of their next meeting.

And this is the time. AMQP 1.0 is under public review, so if you have
any comments, I'm sure you can pass them along to Alexis or another
Rabbit team member and they can help provide that feedback.

> I think if I was to go to GPB as a wireformat for AMQP, I would probably
> wait until multicast udp/unicast udp support were added, and use the
> datagram concept of UDP as the message delimiter.  This might be
> beneficial for my use, but adds a higher workload for the brokers.

That's quite an interesting concept, but in general, I think using
something like GPB isn't ideal for a fixed protocol like AMQP. At the
wire level, AMQP has to be a fixed, stable, well-known protocol for
interoperability. That means that you're better off defining the
protocol itself in a very compact representation, because you don't
really care about the whole generation-from-arbitrary-specification
side that something like GPB has, and any backwards compatibility has
to be "baked in" to the specification.

It also has to be a protocol which is particularly amenable to hardware
implementation and stream analysis (think Wireshark). That has its own
protocol design requirements that something like GPB (which is a more
general purpose system) can ignore (base-128 varints are probably not
ideal for a small hardware implementation, for example).
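
To make the varint point concrete, here is a minimal sketch in Python
(my own illustration, not taken from the GPB sources) of base-128
varint encoding next to a fixed-width 32-bit field. The function names
are made up for this example. The thing to notice is that the decoder
has to inspect every byte to find where the integer ends, whereas a
fixed-width field is a single load at a known offset.

```python
import struct
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, next_offset); the length is only known after the scan."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, offset
        shift += 7


def encode_fixed32(value: int) -> bytes:
    """Fixed-width alternative: always 4 bytes, big-endian, no scanning needed."""
    return struct.pack(">I", value)
```

Small values are where the varint wins on size (one byte instead of
four); values near the top of the 32-bit range take five bytes, and in
every case the field boundary depends on the data itself.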

So in general, I think smaller framing overhead == Big Win, use of
GPB/Thrift itself == Unnecessary.

> The one factor I don't like about GPB is that it isn't self-describing,
> to really understand a message one needs the .proto or the code
> generated from it.  For something like AMQP it might be more useful to
> the community to pick a binary wireformat that encodes well, which also
> self-describes its message.

This is where some work that I did at my last job may pay off.
Essentially, we rolled our own packed, binary, self-describing,
hierarchical message format. Extremely compact (minimum 10 bytes
overhead), fields were all self-describing with type information,
everything mapped directly onto native CPU types (no varints), and it
was fully hierarchical. Unfortunately that implementation is trapped at
the former employer, but they've given the go-ahead to support a
clean-room implementation of the encoding and tools in the open source
arena (and plan on helping with the effort, since they don't want to
own it long term).
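
To give a rough idea of what "self-describing, typed, fixed-width, and
hierarchical" can look like on the wire, here is a small Python sketch.
The layout is hypothetical and invented purely for illustration; it is
NOT the actual encoding described above. Each field carries a type id,
its own name, and a fixed-width length, and a sub-message is just
another field type, so a decoder needs no .proto or generated code.

```python
import struct

# Hypothetical type ids, invented for this sketch only.
T_INT32, T_FLOAT64, T_STRING, T_MESSAGE = 1, 2, 3, 4

# Assumed field header: type id (1 byte), name length (1 byte),
# payload length (4 bytes), all fixed-width and big-endian.
HEADER = struct.Struct(">BBI")


def encode_field(name, value):
    """Encode one named field, including its type and name, into bytes."""
    name_bytes = name.encode("utf-8")
    if isinstance(value, int):
        type_id, payload = T_INT32, struct.pack(">i", value)
    elif isinstance(value, float):
        type_id, payload = T_FLOAT64, struct.pack(">d", value)
    elif isinstance(value, str):
        type_id, payload = T_STRING, value.encode("utf-8")
    elif isinstance(value, dict):
        type_id, payload = T_MESSAGE, encode_message(value)  # nested message
    else:
        raise TypeError("unsupported type in this sketch: %r" % type(value))
    return HEADER.pack(type_id, len(name_bytes), len(payload)) + name_bytes + payload


def encode_message(fields):
    """A message is simply the concatenation of its encoded fields."""
    return b"".join(encode_field(n, v) for n, v in fields.items())


def decode_message(buf):
    """Rebuild the dict using only the type and name carried in each field."""
    fields = {}
    offset = 0
    while offset < len(buf):
        type_id, name_len, payload_len = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        name = buf[offset:offset + name_len].decode("utf-8")
        offset += name_len
        payload = buf[offset:offset + payload_len]
        offset += payload_len
        if type_id == T_INT32:
            value = struct.unpack(">i", payload)[0]
        elif type_id == T_FLOAT64:
            value = struct.unpack(">d", payload)[0]
        elif type_id == T_STRING:
            value = payload.decode("utf-8")
        elif type_id == T_MESSAGE:
            value = decode_message(payload)
        else:
            raise ValueError("unknown type id %d" % type_id)
        fields[name] = value
    return fields
```

A round trip such as
decode_message(encode_message({"qty": 100, "venue": {"id": 7}}))
gives back the original dict without any schema on the receiving side.
The price is that field names travel in every message, which is the
usual trade-off against a schema-driven format like GPB.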

If you think it's interesting, I'm more than willing to discuss what
this (Fudge) is, since I'm getting down to the clean-room
implementation these days (I've gotten a lot of positive feedback). The
ultimate plan is to support a low-level message format (getField,
setField, that type of thing) as well as mapping to/from higher level
message representations (.proto, .xsd) and full XPath lookups.

Kirk